Information, Geometry and Physics Seminar

Wed. April 17th, 2024 — 4pm

Toward a Control Theory of LLMs

Aman Bhargava, Computation and Neural Systems, Caltech and Shi-Zhuo Looi, Department of Mathematics, Caltech

Prompt engineering is crucial for deploying LLMs but is poorly understood mathematically. We formalize LLM systems as a class of discrete stochastic dynamical systems to explore prompt engineering through the lens of control theory. We investigate the reachable set of output token sequences $\mathcal R_y(\mathbf x_0)$ for which there exists a control input sequence $\mathbf u$ for each $\mathbf y \in \mathcal R_y(\mathbf x_0)$ that steers the LLM to output $\mathbf y$ from initial state sequence $\mathbf x_0$. We offer an analytic treatment of the limitations on the controllability of self-attention in terms of the reachable set, where we prove an upper bound on the reachable set of outputs $\mathcal R_y(\mathbf x_0)$ as a function of the singular values of the parameter matrices. We present complementary empirical analysis on the controllability of a panel of LLMs, including Falcon-7b, Llama-7b, and Falcon-40b. Our results demonstrate a lower bound on the reachable set of outputs $\mathcal R_y(\mathbf x_0)$ w.r.t. initial state sequences $\mathbf x_0$ sampled from the Wikitext dataset. We find that the correct next Wikitext token following sequence $\mathbf x_0$ is reachable over 97\% of the time with prompts of $k\leq 10$ tokens. We also establish that the top 75 most likely next tokens, as estimated by the LLM itself, are reachable at least 85\% of the time with prompts of $k\leq 10$ tokens. Intriguingly, short prompt sequences can dramatically alter the likelihood of specific outputs, even making the least likely tokens become the most likely ones. This control-centric analysis of LLMs demonstrates the significant and poorly understood role of input sequences in steering output probabilities, offering a foundational perspective for enhancing language model system capabilities.
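
To make the notion of a reachable set concrete, here is a minimal sketch on a toy system. It is not the speakers' method or code: `toy_next_token` is a hypothetical deterministic stand-in for an LLM's next-token choice, the 5-token vocabulary is invented, and the sketch assumes the control prompt $\mathbf u$ is prepended to $\mathbf x_0$. It enumerates every prompt of at most $k$ tokens by brute force and collects the outputs that can be reached.

```python
from itertools import product

VOCAB = range(5)  # hypothetical 5-token vocabulary (assumption)

def toy_next_token(tokens):
    """Hypothetical deterministic stand-in for an LLM's next-token choice."""
    return sum(t * t for t in tokens) % 5

def reachable_set(x0, k):
    """Outputs y for which some prompt u with len(u) <= k steers the toy model to y.

    Assumes the prompt u is prepended to the initial state sequence x0.
    """
    outputs = set()
    for length in range(k + 1):
        for u in product(VOCAB, repeat=length):
            outputs.add(toy_next_token(list(u) + list(x0)))
    return outputs

x0 = [1, 2]
# Size of the reachable set for prompt budgets k = 0, 1, 2.
sizes = [len(reachable_set(x0, k)) for k in range(3)]
print(sizes)
```

In this toy the reachable set grows with the prompt budget and covers the whole vocabulary at $k = 2$. For a real LLM, exhaustive enumeration is infeasible, so it serves only to illustrate the definition.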

Wed. April 10th, 2024 — 4pm

Predictive coding in neural networks can recover the environment's map

James Gornet, Caltech

Mapping is a general mechanism for generating an internal representation of sensory information. While spatial maps facilitate navigation and planning within an environment, mapping is a ubiquitous function that extends beyond visual-spatial mapping. However, it has been unclear how a single mechanism can generate both spatial and non-spatial maps. In this talk, I discuss how predictive coding, by predicting sensory data from past sensory experiences, provides a basic, general mechanism for charting both spatial and non-spatial maps. In this theoretical framework, an agent traverses some environment; the spatial structure of an agent's path is embedded in the sequential structure of the agent's video observations. I demonstrate that a neural network that performs predictive coding can construct an implicit spatial map of an environment by assembling information from local paths into a global frame within the neural network's latent space. In addition, the neural network's latent variables generate firing patterns that resemble place fields in the rodent hippocampus. Predictive coding can be performed over any sensory modality that has some temporal sequence. Large language models such as GPT-4, for example, train on causal word prediction, a form of predictive coding, and build internal maps that support generalized reasoning. These results all suggest that predictive coding might provide a unified theory for building representations of information, connecting disparate theories including place cell formation in the hippocampus and human language.

Wed. March 27th, 2024 — 4pm -- Zoom link (Caltech account required)

Using Physics as a Microscope to Dissect Transcriptional Dynamics in Development

Hernan Garcia, Department of Molecular & Cell Biology and Department of Physics, UC Berkeley

Over the last few decades we have largely identified the repressors and activators that shape gene expression patterns in developing embryos and that, in turn, dictate cellular fates. Yet, despite amassing this great reservoir of knowledge, we are still incapable of predicting how the number, placement and affinity of binding sites for these transcription factors in regulatory DNA dictate gene expression patterns in space and time. Achieving such predictive understanding calls for going beyond molecular parts lists and for obtaining the in vivo biochemical information necessary for fueling theoretical models of transcriptional regulation in developing animals. In this talk, I will show how we are using physics as a “microscope” to uncover the molecular mechanisms by which activators and repressors dictate transcription in space and time in developing animals. Specifically, using novel quantitative tools that we have developed for precision measurements, I will show that most developmental genes are transcribed in stochastic bursts, and that many transcription factors regulate gene expression by modulating the frequency, duration, and/or amplitude of these bursts. We will then engage in an iterative dialogue between theoretical models and quantitative experiments aimed at revealing the mechanisms underlying this control of transcriptional bursting. Our results challenge the textbook picture of activator and repressor action based on stable protein-protein interactions and call for a description of transcriptional control that acknowledges that the nucleus is not a bag of well-mixed transcription factors. Most importantly, our work sets a path forward for reaching a predictive understanding of cellular decision making and demonstrates how a quantitative dialogue between theory and experiment can shed light on biological mechanisms beyond the reach of even the best super resolution microscopes.

Wed. January 17th, 2024 — 4pm

Lie Groups and Hierarchy for Neural Visual Representations

Christian Shewmake, Redwood Center for Theoretical Neuroscience, UC Berkeley

How do brains represent the world? An emerging set of findings in neuroscience is beginning to illuminate a new paradigm for understanding the neural code. Across sensory and motor regions of the brain, neural structures are found to mirror the geometric structure of the world states they represent—either in their explicit anatomical arrangement or in the underlying low-dimensional manifolds traversed by their dynamics. This phenomenon can be observed in the circuit of neurons representing orientation in the fly, spatial position in place cells and grid cells in the rat, and changes in 3D orientation in human semicircular canals, among others. Such findings suggest that brains across species and sensory modalities have evolved a general computational strategy that leverages geometry, Bayesian inference, and dynamical systems to represent the structure of the world.
Can these geometric ideas be extended beyond representations of low-dimensional spaces, such as self position and orientation, to complex spaces such as visual scenes? Indeed, a basic understanding of visual representations in the primate visual cortical hierarchy remains elusive beyond V1. Likewise, a longstanding goal in unsupervised representation learning has been the discovery of low-dimensional, independent factors of variation latent in visual data. The goal of this talk is to introduce the emerging geometric paradigm in neuroscience and explore its implications for unsupervised learning of visual scene representations in brains and machines.

Wed. January 10th, 2024 — 4pm

From quantum learning theory to dimension-free Remez inequalities

Joseph Slote, CMS, Caltech

Harmonic analysis has long been a powerful tool in theoretical computer science, and we are starting to see applications in the noncommutative world of quantum computing. In this talk we will discuss how to learn quantum operators from very little information, thanks to noncommutative analogues of some classical inequalities in harmonic analysis: Bohnenblust-Hille-type inequalities. To prove these we exploit the geometry of Heisenberg-Weyl eigenspaces to reduce to a commutative question, which in turn is settled by a dimension-free Remez-type inequality, the first of its kind.
We will dive into some proof ideas according to time and interest.
Based on joint works with Lars Becker, Ohad Klein, Alexander Volberg, and Haonan Zhang.

Wed. November 15th, 2023 — 2pm

TBA

Giuseppe di Giulio, University of Würzburg

TBA

Wed. November 8th, 2023 — 2pm

Higher Information: The untold topological secrets of measures and states

Tom Mainiero, St Joseph's University

In the futuristic year 2023, an ex-cohomological cop (Harrison Ford) is pulled out of retirement to track down obstructions to the factorizability of multipartite measures: entangled states and their classical counterparts...joint probability measures. A vast conspiracy is uncovered: associated to every multipartite measure is an emergent space whose topology encodes non-local correlations between various subsystems. Probing this space with tools from homology, homotopy, and category theory reveals evidence that entropy and mutual information are Euler characteristics: puppets in a larger informatic ruse. Join as we slowly uncover pieces of the big picture, unmasking filthy rich measures of shared information higher up on the food chain (complex).

Wed. October 25th, 2023 — 2pm

Rigorous results about entropies in QFT

Feng Xu, UC Riverside

In this talk I will give an introduction to some recent mathematical results about relative entropies in QFT. The talk should be accessible to graduate students.

Wed. October 11th, 2023 — 2pm

Quantifying emergent effects through homological algebra

Johnny Jingze Li, Mathematical Neuroscience Lab, UCSD

Emergent effects of complex systems are commonly understood as novel properties, patterns, or behaviors of systems that are not present in their components, sometimes expressed as “the whole is more than the sum of its parts”. I will discuss a framework based on (Adam, 2017) (link: https://elieadam.com/eadam_PhDThesis.pdf) that gives a measure of emergent effect as the “loss of exactness” computed from local structures, through category theory, homological algebra and quiver representations, and show that the derived functor that encodes emergent effects is related to information loss. I will also discuss potential connections to biological neural networks and renormalization groups.

Wed. October 4th, 2023 — 2pm

Bounding spectral gap for Laplacian and Dirac operator

Yixin Xu, Caltech

Associativity of point-wise multiplication between functions provides constraints on the spectral data of a manifold. In this talk, I will discuss how the method of semi-definite programming allows one to obtain upper bounds for the lowest non-zero eigenvalues of the Laplacian and Dirac operators on 2d hyperbolic surfaces and orbifolds equipped with a spin structure. A numerical algorithm based on the Selberg trace formula shows that the [0;3,3,5] hyperbolic triangle and the Bolza surface nearly saturate the numerical bound at genus 0 and genus 2. This method also produces bounds that are specific to hyperelliptic surfaces. Our approach is inspired by the method of conformal bootstrap, which is a widely adopted technique to extract constraints on conformal field theories from basic self-consistency conditions.

Past Talks (AY 2022-23)

Wed. July 26, 2023 — 2pm

Grothendieck classes of quadric hypersurfaces

Gonçalo Tabuada, University of Warwick

The Grothendieck ring of varieties, introduced in a letter from Alexander Grothendieck to Jean-Pierre Serre (August 16th 1964), plays an important role in algebraic geometry. However, despite the efforts of several mathematicians, the structure of this ring still remains poorly understood. In this talk, in order to better understand the Grothendieck ring of varieties, I will describe some new structural properties of the Grothendieck classes of quadric hypersurfaces. More specifically, by combining the recent theory of noncommutative motives with the classical theory of motives, I will show that if two quadric hypersurfaces have the same Grothendieck class, then they have the same even Clifford algebra and the same signature. As an application, this implies in numerous cases (e.g., when the base field is a local or global field) that two quadric hypersurfaces have the same Grothendieck class if and only if they are isomorphic.

Wed. July 19, 2023 — 2pm

Von Neumann algebras and quantum gravity

Hong Liu, MIT

The classifications of von Neumann algebras can be interpreted as classifying entanglement patterns of general quantum systems. They can thus play significant roles in our explorations of quantum gravity. I discuss some recent examples of using them to understand spacetime structure, including emergence of space and time in the AdS/CFT duality.

Wed. June 28, 2023 — 2pm

Optimal transport in free probability

David Jekel, University of California, San Diego

Free probability is a theory of random variables which do not commute under multiplication, which can be realized as operators on a Hilbert space. Voiculescu showed that free probability describes the large N behavior of independent random N x N matrices in many situations. This talk discusses the analog of optimal transport theory for free probability, as well as the large N behavior of optimal transport for certain random matrix models. In the free setting, there are many obstructions to optimal transport that don't exist in the classical setting. However, there is still an analog of Monge-Kantorovich duality. Moreover, for certain classes of non-commutative random variables (those associated to free Gibbs laws), we have a very good understanding of the optimal transport maps and how they arise from the random matrix models in the large N limit.

Thursday June 8, 2023 — 2pm @ Linde Hall 387

Quantum Symmetry in Classical and Noncommutative Geometry

Debashish Goswami, Indian Statistical Institute Kolkata

Quantum groups are well-known symmetry objects in mathematics and mathematical physics. In this talk, we'll give a brief overview of the theory of quantum isometry groups of classical and noncommutative manifolds a la Connes developed by the speaker and collaborators over the last decade. We'll mainly work in the framework of compact quantum groups a la Woronowicz and study their (co)actions on the underlying C* algebra of a noncommutative manifold (completion of the algebra associated to the spectral triple) commuting with a suitable Dirac operator or Laplacian.
This is based on several joint works with J. Bhowmick, A. Skalski, T. Banica, S. Joardar, P. Etingof, C. Walton, A. Mandal and A. Chirvasitu.

Wed. May 31, 2023 — 2pm

Proper time from correlators in QFT and holography

Allic Sivaramakrishnan, Caltech

The proper time experienced by a classical probe particle has entered into the study of black hole infall, the boundary encoding of bulk observers in AdS/CFT, and interferometry. In principle, quantum effects can correct classical proper time even for free propagation, which naively renders the geodesic treatment inapplicable. We propose a prescription for extracting proper time from QFT correlators. The quantum corrections arising in this proposal inherit certain properties of geodesics, but violate others. We also discuss a candidate CFT dual to this proposal in AdS. Talk is based on work in progress.

Wed. May 24, 2023 — 2pm

The Magnitude of a Metric Space

Mark Meckes, Case Western Reserve University

Magnitude is a numerical isometric invariant of metric spaces defined recently by Tom Leinster, based on considerations from category theory. Despite the subject's youth, it has found connections with other invariants from a large and growing number of fields, including theoretical ecology, graph theory, topological data analysis, potential theory, and integral geometry. I will discuss the basic definitions and motivation and give a broad survey of these connections, ending with some applications in high-dimensional convexity.
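
For a finite metric space, the standard definition can be sketched in a few lines: form the similarity matrix $Z_{ij} = e^{-d(i,j)}$, solve $Zw = \mathbf 1$ for a weighting $w$, and take the magnitude to be the sum of the entries of $w$. The sketch below is illustrative rather than taken from the talk. The helper names `solve` and `magnitude` and the example distance are assumptions, and the solver is a plain Gaussian elimination written out to keep the example dependency-free.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("similarity matrix is singular; no unique weighting")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def magnitude(dist):
    """Magnitude of a finite metric space given its distance matrix."""
    z = [[math.exp(-d) for d in row] for row in dist]  # similarity matrix
    w = solve(z, [1.0] * len(dist))                     # weighting: Z w = 1
    return sum(w)

d = 2.0  # illustrative distance between the two points
two_point = [[0.0, d], [d, 0.0]]
mag = magnitude(two_point)
print(mag, 1 + math.tanh(d / 2))  # the two values should agree
```

For the two-point space the magnitude works out to $2/(1+e^{-d}) = 1 + \tanh(d/2)$, which moves from 1 toward 2 as the points separate. This matches the intuition of magnitude as an "effective number of points".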

Wed. May 3, 2023 — 2pm

Critical JT Gravity

Alicia Castro, Radboud University Nijmegen

In this talk, I will present JT-like gravity, a model of two-dimensional quantum gravity on constant negatively curved spacetimes, as a model of random hyperbolic surfaces. By studying the generating function of volumes of random hyperbolic surfaces with defects, i.e. weighted geodesic boundaries, we explore critical regimes where the surfaces develop macroscopic holes. We analyse the impact of this critical behavior on the density of states of the theory at the boundary, and we present a family of models that interpolate between systems with $\sqrt{E}$ and $E^{3/2}$, which are commonly found in models of JT Gravity coupled to dynamical end-of-the-world and FZZT branes.

Wed. April 19, 2023 — 2pm @ 114 E. Bridge

Typicality for Stratified Measures

Juan Pablo Vigneaux, Caltech

An m-rectifiable measure is, roughly speaking, supported by an m-dimensional manifold, and a stratified measure is a convex combination of rectifiable measures, possibly in different dimensions. Stratified measures generalize discrete-continuous mixtures and have, in general, a nontrivial singular continuous part. We shall study a set of typical realizations of n independent trials of a stratified measure. This set is also stratified and the dimensions of the strata concentrate around the expected value (that either coincides or is conjectured to coincide with Renyi's information dimension of the measure). The entropy of the stratified measure quantifies the exponential rate of growth of the typical set; it also verifies a chain rule whose conditional term bounds the rate of growth of typical realizations in each typical stratum.

Wed. April 12, 2023 — 2pm

On The Origins of the Boltzmann Distribution, or Support Preserving Endomorphisms of Convolution Semigroups, joint with Fedor Sandomirskiy

Omer Tamuz, Caltech

The Boltzmann distribution is used in statistical mechanics to describe the distribution of states in systems with a given temperature. We give a novel characterization of this distribution as the unique one satisfying independence for uncoupled systems. The theorem boils down to a statement about symmetries of the convolution semigroup of finitely supported probability measures on the integers.
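
The independence property can be checked numerically on a small example. This is a sketch of one direction only (Boltzmann distributions do factorize over uncoupled systems), not of the uniqueness theorem discussed in the talk. The energy levels, the inverse temperature `beta`, and the function name `boltzmann` are illustrative assumptions.

```python
import math
from itertools import product

def boltzmann(energies, beta):
    """Return probabilities p_i proportional to exp(-beta * E_i)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)  # partition function
    return [w / z for w in weights]

# Hypothetical energy levels for two uncoupled systems (assumption).
energies_a = [0.0, 1.0, 2.5]
energies_b = [0.0, 0.7]
beta = 1.3  # illustrative inverse temperature

p_a = boltzmann(energies_a, beta)
p_b = boltzmann(energies_b, beta)

# Joint system: states are pairs, and energies add because there is no coupling.
joint_energies = [ea + eb for ea, eb in product(energies_a, energies_b)]
p_joint = boltzmann(joint_energies, beta)

# Product of the two marginals, listed in the same pair ordering.
p_product = [pa * pb for pa, pb in product(p_a, p_b)]

max_gap = max(abs(x - y) for x, y in zip(p_joint, p_product))
print(max_gap)  # expected to be at floating-point rounding level
```

The factorization follows from $e^{-\beta(E_a + E_b)} = e^{-\beta E_a} e^{-\beta E_b}$, so the joint partition function is the product of the two individual ones.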
