The emergence of clusters in self-attention dynamics
DECISION AND CONTROL LECTURE | THE GRAINGER COLLEGE OF ENGINEERING
Title | The emergence of clusters in self-attention dynamics
Sponsor | Decision and Control
Date | Thursday, October 5, 2023
Time | 4:00 PM
Location | Coordinated Science Lab, Rm B02
Speaker | Borjan Geshkovski, Postdoctoral Associate, MIT
ABSTRACT
Transformers have had remarkable empirical success, enabling large language models to compute very powerful representations using the self-attention mechanism. We model this mechanism as an interacting particle system and demonstrate the formation of clusters as the number of layers goes to infinity. Based on joint work with Cyril Letrouit (CNRS), Yury Polyanskiy (MIT) and Philippe Rigollet (MIT).
BIO
Borjan Geshkovski is currently a postdoc in the MIT Department of Mathematics, where he works with Philippe Rigollet. He received his PhD from the Autonomous University of Madrid under the supervision of Enrique Zuazua. His research interests are centred around control, learning and PDEs.