Ma191c Spring 2024: Mathematical Models of Generative Linguistics
Caltech, Linde Hall Room 289, Tuesday-Thursday 1:00-2:30pm
Instructor:
Matilde Marcolli
Brief Course Description
The goal of this class is to present a new mathematical model
of generative linguistics developed by Marcolli-Chomsky-Berwick
during the course of the past year. The class will include some
preliminary background on generative linguistics, with the main focus
on syntax.
Slides of Lectures
Slides of lectures will be posted here as the class progresses
First Part: Some General Linguistics Background and History of Generative Linguistics
- pdf What is linguistics?
- pdf Generative Linguistics
- pdf Mathematics of Formal Languages
- pdf From formal languages to Minimalism
- pdf Probabilities in Computational Linguistics
- pdf Strong Minimalist Thesis
Second Part: Mathematical Structure of Syntactic Merge
- pdf Merge and Hopf algebras
- pdf Externalization
- pdf Comparison with Older Minimalism
- pdf Comparison with Physics
Third Part: Syntax and Semantics
- pdf Syntax-Semantics Interface
- pdf Semantics and Compositionality
- pdf Semiring Parsing
- pdf Adjunctions
Fourth Part: Generative linguistics and LLMs
- pdf Generative Linguistics versus Generative AI
Summary of lectures
- Tuesday April 2: introduction to linguistics: phonology, morphology, syntax, semantics; generative linguistics and its history; recursions in language and the theory of formal languages, Chomsky hierarchy
- Thursday April 4: cross-serial recursions in Dutch and Swiss German, mildly context sensitive grammars; quick overview of transformational grammar, principles and parameters, government and binding, minimalist program, more detailed discussion of formal languages, proof of Chomsky hierarchy and machine recognition
- Tuesday April 9: continuation of proof of Chomsky hierarchy and machine recognition, formal languages and group theory; tree adjoining grammars (TAGs); more details on transformational grammar; multiple context free grammars (mCFGs)
- Thursday April 11: computational minimalism, merge and move, external and internal merge, feature checking, derivations, minimalist grammars (MGs) and mCFGs; entropy and Shannon information, N-grams, probabilities and CFGs, probabilities and TAGs, probabilistic versus deterministic systems
- Tuesday April 16: probabilities and CFGs, probabilities and TAGs, probabilistic versus deterministic systems; New Minimalism: Strong Minimalist Thesis, Merge, overview of the main elements of the theory; magma of syntactic objects, generative process of syntactic objects as combinatorial Dyson-Schwinger equation, workspaces as binary forests
- Thursday April 18: Hopf algebras, product and coproduct and relations, graded connected Hopf algebras and antipode, combinatorial Hopf algebras, commutative Hopf algebras and affine group schemes, Hopf algebra of workspaces, different forms of coproduct and relations, different algebraic properties
- Tuesday April 23: Merge operators on workspaces, forms of Merge, External and Internal Merge, Sideward Merge, Internal Merge and magma unit, Minimal Search and cost function, Minimal Yield constraints, no complexity loss constraint, n-ary Merge undergeneration and overgeneration
- Thursday April 25: Merge as a Hopf algebra Markov chain, countercyclic movement and Late Merge and the insertion Lie algebra, Milnor-Moore theorem, head functions, head function and planar embeddings (Kayne's LCA), head and complement, phase theory, phases algorithm, labeling algorithm, phases and block spin renormalization, FormSet, prelude to Externalization: the Dutch cross-serial example
- Tuesday April 30: Theta theory, theta roles, operads, colored operads, obligatory control, FormCopy, restriction to diagonals; Externalization as correspondences, language-dependent planarization section, projection by syntactic parameters
- Thursday May 2: syntactic parameters, available data (SSWL, LanGeLin), structure in data, dimensionality, evidence of relation (deviation from Markov evolution on phylogenetic trees and coding theory perspective), independent coordinates by Belkin-Niyogi Laplacian embedding, clusters of syntactic parameters
- Tuesday May 7: syntactic parameters and dynamical variable, learning constraints (new minimalism) versus learning grammars (formal languages); comparison with older versions of Minimalism: Loday-Ronco Hopf algebra, relation to Connes-Kreimer Hopf algebra of rooted trees, External/Internal Merge feature checking and domains of definition, Internal Merge feature checking and computational complexity growth, Internal Merge domain and right-module comodule projective system, External Merge as operated algebra
- Thursday May 9: Comparison with physics: the renormalization problem in quantum field theory, Feynman diagrams and generative process of perturbative QFT, Hopf algebras of Feynman graphs, Rota-Baxter algebra of Laurent series, extraction of polar part, Birkhoff factorization of Hopf algebra characters, Connes-Kreimer renormalization as recursive factorization, Connes-Kreimer Hopf algebra of rooted trees, combinatorial Dyson-Schwinger equations, insertion Lie algebra, formal languages and Feynman graphs, graph grammars, context-free graph grammars and insertion Lie algebras, context-sensitive graph grammars and generative processes of Feynman graphs
- Tuesday May 14: Manin's Renormalization and Computation program, computability, partial recursive functions, Hopf algebra of flow charts, characters detecting the presence of noncomputability; Semantic spaces, semantics as a topological structure, thresholds and probes, Rota-Baxter semirings, syntax-semantics interface as Hopf algebra characters, examples: tropical semiring with ReLU threshold, Boolean semiring, Viterbi semiring with threshold, probes in vector space models of semantics, embedding syntactic objects in semantic spaces
- Thursday May 16: Externalization and the syntax-semantics interface: associahedra, associahedra with metric data, BHV moduli space of metric trees, moduli space of real curves of genus zero with marked points and relation to associahedra and BHV moduli spaces; Pietroski's compositional semantics, concatenation on strings and Merge, free symmetric Merge and concatenation, idempotents; Heim-Kratzer semantics, inductive types
- Tuesday May 21: Heim-Kratzer semantics, inductive types, interpretability, Boolean Hopf algebra character, topological Heim-Kratzer semantics, fuzzy Heim-Kratzer semantics, probes and Hopf algebra characters, logic from topology: Boolean and Heyting algebras, Brouwer logic and open sets of topological spaces; semiring parsing, revisiting Minimal Yield, size measures on the workspaces, Laurent series algebra of Merge derivations
- Thursday May 23: semiring parsing: Laurent series of Merge derivation and Minimal Yield constraints, renormalization to EM/IM, no-complexity-loss constraint and multivariable Laurent series renormalization; Hopf algebroids and groupoid schemes, bialgebroids and semigroupoid schemes, algebroids and directed graph schemes, Rota-Baxter algebroids, characters of Hopf algebroids to Rota-Baxter algebroids and inductive Birkhoff factorization, Rota-Baxter semiringoids, characters as weighted graph-templates of derivations with thresholds; Adjunctions, the Pair-Merge problem, the question of the generative power of adjunctions: the Tamari order and Loday's noncommutative sum of trees
- Tuesday May 28: LLMs and Zellig Harris' distributional theory, basic structure of attention modules: keys, queries, values, probabilities; maximizing attention as a Hopf algebra character and Birkhoff factorization; syntactic phenomena in LLMs and human language: persistent dimension comparison, LI-Adger database and comparative LLM performance on syntactic tests
- Thursday May 30: LLMs and the inverse problem of syntax, mechanistic interpretability of transformer architecture, zero-layer digram, one-layer skip trigram, query-key circuit and output-value circuit, composition of attention heads, questions about syntax circuits; prediction vs explanation, a Linear A LLM, Linear B and distributional theory; control theory and LLMs; direct comparison of mathematical models; physics as metaphor
- Tuesday June 4: final presentations
- Thursday June 6: final presentations
Reading Materials
There is no specific textbook for the class, but the following references will be useful
Main Books
- Noam Chomsky et al. "Merge and the Strong Minimalist Thesis", Cambridge 2023, pdf
- Matilde Marcolli, Noam Chomsky, Robert Berwick "Mathematical Structure of Syntactic Merge", MIT Press to appear, pdf
Papers and other reading material
- Generative syntax
- pdf Howard Lasnik, Syntactic Structures Revisited
- New Minimalism
- pdf Noam Chomsky, UCLA Lectures, 2019
- pdf Noam Chomsky, Minimalism: where are we now, and where can we hope to go? 2021
- pdf Noam Chomsky, Some puzzling foundational issues, 2019
- pdf Noam Chomsky, Problems of Projection, 2013
- pdf Noam Chomsky, On Phases, 2008
- pdf Riny Huijbregts, Empirical cases that rule out Ternary Merge, 2021
- Computational Minimalism
- pdf Chris Collins and Edward Stabler, A Formalization of Minimalist Syntax, 2017
- Hopf algebras
- pdf Pawel Blasiak, Combinatorial route to algebra: the art of composition and decomposition, 2010
- pdf Jean-Louis Loday, Maria Ronco, Combinatorial Hopf Algebras, 2008
- pdf M.Aguiar, N.Bergeron, F.Sottile, Combinatorial Hopf algebras and generalized Dehn-Sommerville relations, 2014
- pdf D.Calaque, K.Ebrahimi-Fard, D.Manchon, Two interacting Hopf algebras of trees: A Hopf-algebraic approach to composition and substitution of B-series
- pdf Persi Diaconis, C. Y. Amy Pang, Arun Ram, Hopf algebras and Markov chains: Two examples and a theory
- pdf C. Y. Amy Pang, Markov Chains from Descent Operators on Combinatorial Hopf Algebras
- Syntactic Parameters
- pdf G.Longobardi, A.Treves, Grammatical Parameters from a Gene-like Code to Self-Organizing Attractors: a research program
- pdf Ian Roberts, Parameter hierarchies and Universal Grammar
- pdf G.Longobardi et al, Toward a syntactic phylogeny of modern Indo-European languages
- pdf S. Gakkhar, M. Marcolli, Syntactic Structures and the General Markov Models
- pdf A.Port, T.Karidi, M.Marcolli, Topological analysis of syntactic structures
- pdf A.Ortegaray, R.Berwick, M.Marcolli, Heat Kernel Analysis of Syntactic Structures
- pdf K.Shu, M.Marcolli, Syntactic structures and code parameters
- pdf J.Park et al, Prevalence and recoverability of syntactic parameters in sparse distributed memories
- pdf K.Siva, J.Tao, M.Marcolli, Syntactic Parameters and Spin Glass Models of Language Change
- pdf K.Shu, A.Ortegaray, R.Berwick, M.Marcolli, Phylogenetics of Indo-European Language Families via an Algebro-Geometric Analysis of Their Syntactic Structures
- Comparison with Old Minimalism
- pdf M.Aguiar, F.Sottile, Structure of the Loday-Ronco Hopf algebra of trees
- pdf Y.Zhang, Z.Gao, Hopf algebras of planar binary trees: an operated algebra approach
- Comparison with physics and the theory of computation
- pdf Kurush Ebrahimi-Fard, Dominique Manchon, The combinatorics of Bogoliubov's recursion in renormalization
- pdf L.Foissy, Classification of systems of Dyson-Schwinger equations in the Hopf algebra of decorated rooted trees
- pdf M.Marcolli, A.Port, Graph grammars, insertion Lie algebras, and quantum field theory
- pdf Yu.I.Manin, Renormalization and Computation, I
- pdf Yu.I.Manin, Renormalization and Computation, II
- pdf C.Delaney, M.Marcolli, Dyson-Schwinger equations in the theory of computation
- Rota-Baxter semirings and Birkhoff factorization
- pdf M.Marcolli, N.Tedeschi, Entropy Algebras and Birkhoff factorization
- Externalization and the syntax-semantics interface
- pdf S.L.Devadoss, J. Morava, Navigation in tree spaces
- pdf L. Billera, S. Holmes, K. Vogtmann, Geometry of the space of phylogenetic trees
- pdf S.L.Devadoss, Tessellations of Moduli Spaces and the Mosaic Operad
- Adjunctions
- pdf P.Pietroski, Function and Concatenation
- pdf T.Graf, The Syntactic Algebra of Adjuncts
- pdf T.Hunter, Deconstructing merge and move to make room for adjunction
- Semiring parsing
- pdf Joshua Goodman, Semiring parsing
- LLMs vs generative linguistics
- pdf Zellig Harris, Distributional structure, 1954
- pdf F.Y.Lin, The transformations of transformations, Language and Communication 20 (2000) 197-253
- pdf A.Clark, "Learning context free grammars with the syntactic concept lattice", ICGI 2010, LNAI 6339, pp.38-51, 2010.
- pdf S.Gaubert, Y.Vlassopoulos, Directed metric structures arising in large language models
- pdf Tai-Danae Bradley, John Terilla, Yiannis Vlassopoulos, An enriched category theory of language: from syntax to semantics
- Coda
- pdf Noam Chomsky, "Linguistics then and now: some personal reflections", Ann. Rev. Linguist. 7 (2021) 1-11
Further suggested readings
- Generative syntax
- pdf Noam Chomsky, "Three models for the description of Language"
- pdf S.Ginsburg, B.Partee, Mathematical Model of Transformational Grammar
- pdf N.Chomsky, The Minimalist Program
- Computational Minimalism
- pdf E.P.Stabler, "Computational perspectives on minimalism"
- pdf P.beim Graben, S.Gerth, "Geometric representations for minimalist grammars"
- pdf R.C.Berwick, "Mind the Gap"
- pdf T.Hunter, C.Dyer, "Distributions on Minimalist Grammar Derivations"
- Formal languages
- pdf N.Chomsky, M.Schutzenberger, The Algebraic Theory of Context-free Languages
- pdf P.A.Mellies, N.Zeilberger, Parsing as a lifting problem and the Chomsky-Schutzenberger representation theorem
- pdf P. Flajolet, Analytic models and ambiguity of context-free languages
- pdf S.Giraudo, J.G.Luque, L.Mignot, F.Nicart, "Operads, quasiorders and regular languages"
- pdf L.Sennhauser, R.C.Berwick, "Evaluating the Ability of LSTMs to Learn Context-Free Grammars"
- pdf J.Shallit, "Number Theory and Formal Languages"
- pdf N.Ghani, A.Kurz, Higher dimensional trees algebraically
- pdf J.E.Pin, "Mathematical foundations of automata theory"
- Syntactic Parameters
- pdf R.Clark, "Kolmogorov complexity and the information content of parameters"
- pdf Partha Niyogi, Robert C. Berwick, "A dynamical systems model for language change"
- Semantics in Generative Linguistics
- pdf P.Pietroski, Conjoining Meanings
- pdf I.Heim, A.Kratzer, Semantics in Generative Grammar
- pdf N.Chomsky, Studies on Semantics in Generative Grammar
- LLMs vs generative linguistics
- pdf E. Tulchinskii et al, "Intrinsic Dimension Estimation for Robust Detection of AI-Generated Texts", NeurIPS 2023
- pdf V.Dentella, F.Gunther, E.Leivada, "Systematic testing of three Language Models reveals low language accuracy, absence of response stability, and a yes-response bias", PNAS 2023
- pdf H.J.Vazquez Martinez, "The Acceptability Delta Criterion: Testing Knowledge of Language using the Gradience of Sentence Acceptability", Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, pages 479-495, 2021
- pdf A.Bhargava, C.Witkowski, M.Shah, M.Thomson, What's the magic word? A control theory of LLM prompting, 2024
- pdf C.D.Manning, K.Clark, J.Hewitt, Emergent linguistic structure in artificial neural networks trained by self-supervision, PNAS 2020
Schedule of Final Presentations
IMPORTANT: presentations will take place in ROOM 359 on
Tuesday June 4 and Thursday June 6, from 1 to 3:30 pm
Schedule of presentations
- Tuesday June 4, 1:00-1:30: Dylan, dimension and AI-generated text
- Tuesday June 4, 1:30-2:00: Jonghyeon, dynamical systems for language change
- Tuesday June 4, 2:00-2:30: Holly, Sanskrit computational linguistics
- Tuesday June 4, 2:30-3:00: Elizabeth, Kolmogorov complexity and parameters
- Tuesday June 4, 3:00-3:30: Zhaojun, computational minimalism
- Thursday June 6, 1:00-1:30: Elliott, Hopf algebra renormalization
- Thursday June 6, 1:30-2:00: Arnav, category theory in linguistics
- Thursday June 6, 2:00-2:30: Alan, ambiguity of context-free languages
- Thursday June 6, 2:30-3:00: Vinicius, Renormalization and computation
- Thursday June 6, 3:00-3:30: Edward, Hopf algebras and MZVs