This project explores the deep mathematical connections between the renormalization group (RG) concepts of relevant and irrelevant variables in statistical physics and the notion of relevant and irrelevant information in information bottleneck (IB) theory. The candidate will develop a unified framework that transfers analytical and numerical techniques between these fields, using RG methods to solve information bottleneck problems and information-theoretic tools to shed new light on the renormalization group. This bidirectional approach will yield new analytical solutions, perturbative expansions, and computational methods that leverage the shared mathematical structure of coarse-graining in physics and optimal compression in information bottleneck theory.
The renormalization group (RG) and the information bottleneck (IB) method share a fundamental goal: systematically identifying which variables and which pieces of information are relevant at a given scale while discarding irrelevant details. In RG, relevant variables persist under coarse-graining while irrelevant ones vanish; in IB, relevant information is preserved while irrelevant information is compressed away. Recent work has shown that these parallels extend beyond analogy to a precise mathematical correspondence, with direct implications for modern machine learning, where neural networks must learn relevant representations from high-dimensional data.
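For concreteness, the trade-off described above is usually stated as a variational problem. In the standard formulation of Tishby, Pereira, and Bialek, one seeks a stochastic encoder p(t|x) mapping the data X to a compressed representation T that minimizes

```latex
% Standard IB variational problem (Tishby, Pereira & Bialek):
\min_{p(t \mid x)} \; \mathcal{L}_{\mathrm{IB}} \;=\; I(X;T) \;-\; \beta\, I(T;Y)
```

where I(X;T) is the compression cost (how much of the input is retained), I(T;Y) is the relevant information preserved about the target Y, and the Lagrange multiplier β acts much like an inverse temperature setting the compression scale.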
This project will develop a comprehensive theoretical framework that exploits this correspondence in both directions. From RG to IB, we will adapt Wilson's momentum-shell integration, decimation procedures, and the analysis of critical phenomena to derive new solutions for optimal data compression in machine learning. From IB to RG, we will use information-theoretic measures to define novel RG flows and to gain insight into universal behavior. The research will also investigate how deep neural networks implicitly perform information bottleneck optimization during training, connecting information compression to RG-like hierarchical feature extraction.
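To make "solving an information bottleneck problem" concrete, the sketch below implements the classical self-consistent iteration for the IB equations on a discrete joint distribution. This is the textbook iterative scheme rather than anything specific to this project; the function name and default parameters are illustrative.

```python
import numpy as np

def ib_iterate(p_xy, n_t=4, beta=5.0, n_iter=300, seed=0):
    """Self-consistent iterative IB solver on a discrete joint p(x, y).

    p_xy : array of shape (n_x, n_y), the joint distribution of X and Y
           (assumed to have p(x) > 0 for every x).
    Returns the soft encoder p(t|x) as an array of shape (n_x, n_t).
    """
    rng = np.random.default_rng(seed)
    n_x, n_y = p_xy.shape
    p_x = p_xy.sum(axis=1)                       # marginal p(x)
    p_y_given_x = p_xy / p_x[:, None]            # conditional p(y|x)

    # Random soft initialization of the encoder p(t|x).
    q_t_given_x = rng.random((n_x, n_t))
    q_t_given_x /= q_t_given_x.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        q_t = q_t_given_x.T @ p_x                # marginal p(t)
        # Decoder p(y|t) via Bayes: sum_x p(x) p(t|x) p(y|x) / p(t).
        q_y_given_t = (q_t_given_x * p_x[:, None]).T @ p_y_given_x
        q_y_given_t /= q_t[:, None] + 1e-12
        # KL divergence D[p(y|x) || p(y|t)] for every (x, t) pair.
        log_ratio = (np.log(p_y_given_x[:, None, :] + 1e-12)
                     - np.log(q_y_given_t[None, :, :] + 1e-12))
        kl = (p_y_given_x[:, None, :] * log_ratio).sum(axis=2)
        # Self-consistent update: p(t|x) proportional to p(t) exp(-beta * KL).
        q_t_given_x = q_t[None, :] * np.exp(-beta * kl)
        q_t_given_x /= q_t_given_x.sum(axis=1, keepdims=True)
    return q_t_given_x
```

For a toy joint distribution, repeated updates converge to a locally optimal soft clustering of X whose granularity is controlled by beta, directly mirroring the choice of coarse-graining scale in RG.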
Key technical objectives include: (1) establishing mappings between RG flows and information bottleneck optimization trajectories in representation learning, (2) developing perturbative expansions using both analytical and numerical methods, (3) exploring normalizing flows as a method for solving the IB optimization problem, and (4) investigating how formulating IB and RG in terms of alternative information measures could yield new insights into statistical physics and neural networks.
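As a minimal illustration of the RG side of objective (1), the sketch below iterates the exact decimation map for the one-dimensional Ising chain. Tracing out every other spin maps the coupling K to arctanh(tanh^2 K), so the flow runs to the trivial fixed point K = 0: the simplest example of a coupling that is irrelevant under coarse-graining.

```python
import numpy as np

def decimate_1d_ising(K, steps=10):
    """Exact decimation RG for the zero-field 1D Ising chain.

    Tracing out every other spin maps the dimensionless coupling
    K = J / (k_B T) to K' = arctanh(tanh(K)**2). Repeated
    coarse-graining drives any finite K toward the trivial fixed
    point K = 0, which is why the 1D chain has no
    finite-temperature phase transition.
    """
    flow = [K]
    for _ in range(steps):
        K = np.arctanh(np.tanh(K) ** 2)
        flow.append(K)
    return flow

print(decimate_1d_ising(1.0))  # coupling decays monotonically toward 0
```

An IB analogue would track the trajectory of (I(X;T), I(T;Y)) under successive compressions; making the dictionary between the two flows precise is the content of objective (1).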
The project combines rigorous mathematical analysis with practical algorithmic development for modern AI systems. Applications include understanding how transformer models compress information across layers, designing principled pruning methods based on RG- or IB-inspired relevance measures, and developing new representation-learning algorithms that explicitly optimize for relevant information at multiple scales. Expected outcomes include new analytical solutions for information compression, insights into phase transitions during neural network training, and computational methods that leverage physics-inspired coarse-graining for efficient representation learning. The candidate will work at the intersection of theoretical physics, information theory, and machine learning, developing expertise valuable for both fundamental research and practical applications in AI systems that must identify and preserve relevant information across levels of abstraction.
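As one illustration of what an IB-inspired relevance measure for pruning could look like, the sketch below scores each hidden unit by a crude plug-in estimate of the mutual information between its activation and the class label. The binning estimator and function names are placeholder choices for illustration, not the methods the project would necessarily develop.

```python
import numpy as np

def mi_histogram(a, y, n_bins=16):
    """Crude plug-in estimate of the mutual information I(A; Y) between
    a scalar activation A (binned into histogram cells) and an integer
    class label Y. Result is in nats."""
    edges = np.histogram_bin_edges(a, bins=n_bins)
    a_disc = np.digitize(a, edges[1:-1])          # bin index in 0..n_bins-1
    joint = np.zeros((n_bins, int(y.max()) + 1))
    for ai, yi in zip(a_disc, y):
        joint[ai, yi] += 1.0
    joint /= joint.sum()
    p_a = joint.sum(axis=1, keepdims=True)
    p_y = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (p_a @ p_y)[nz])).sum())

def ib_relevance_scores(activations, labels):
    """Score each hidden unit by the label information its activation
    carries, given activations of shape (n_samples, n_units)."""
    return np.array([mi_histogram(activations[:, j], labels)
                     for j in range(activations.shape[1])])
```

Units with the lowest scores carry the least label-relevant information and are natural pruning candidates, in the same spirit as discarding irrelevant couplings along an RG flow.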
The ID for this research opportunity is 3651