
Speaker: Micha Elsner, Department of Linguistics, OSU
Title: The road to LLMs: What computational linguists have learned from four decades of experimental statistics
Abstract: While "large language models" (LLMs) are dominating the news cycle and academic discussions today, the idea of language modeling is not new. Researchers have been building practical LMs from data since at least IBM's research on machine translation in the late 1980s. What (besides cheap computing power and massive datasets) has changed to make the LLMs of today so much more powerful than the LMs of the preceding four decades? I will give a brief historical tour of the area, insofar as I understand it, focusing on statistical issues from a relatively informal and applied point of view. These issues motivate modern modeling constructs like ResNets and attention layers, which form important building blocks within LLM architectures. I will conclude by briefly discussing what we know so far about what LLMs learn and how they represent it.
Note: Seminars are free and open to the public. Reception to follow.