NeurIPS 2024 Workshop M3L Submissions
Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets
Yuandong Tian
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Implicit Bias of Adam versus Gradient Descent in One-Hidden-Layer Neural Networks
Bhavya Vasudeva, Vatsal Sharan, Mahdi Soltanolkotabi
Published: 11 Oct 2024, Last Modified: 19 Nov 2024
M3L Poster
Readers: Everyone
Understanding Factual Recall in Transformers via Associative Memories
Eshaan Nichani, Jason D. Lee, Alberto Bietti
Published: 11 Oct 2024, Last Modified: 02 Dec 2024
M3L Oral
Readers: Everyone
From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
Kaiyue Wen, Huaqing Zhang, Hongzhou Lin, Jingzhao Zhang
Published: 11 Oct 2024, Last Modified: 04 Dec 2024
M3L Poster
Readers: Everyone
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
Kairong Luo, Haodong Wen, Shengding Hu, Zhenbo Sun, Zhiyuan Liu, Maosong Sun, Kaifeng Lyu, Wenguang Chen
Published: 11 Oct 2024, Last Modified: 11 Dec 2024
M3L Poster
Readers: Everyone
Provable unlearning in topic modeling and downstream tasks
Stanley Wei, Sadhika Malladi, Sanjeev Arora, Amartya Sanyal
Published: 11 Oct 2024, Last Modified: 13 Dec 2024
M3L Poster
Readers: Everyone
Progressive distillation induces an implicit curriculum
Abhishek Panigrahi, Bingbin Liu, Sadhika Malladi, Andrej Risteski, Surbhi Goel
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Increasing Fairness via Combination with Learning Guarantees
Yijun Bian, Kun Zhang
Published: 11 Oct 2024, Last Modified: 22 Nov 2024
M3L Poster
Readers: Everyone
Optimizing Fine-Tuning Efficiency: Gradient Subspace Tracking on Grassmann Manifolds for Large Language Models
Sahar Rajabi, Sirisha Rambhatla
Published: 11 Oct 2024, Last Modified: 03 Dec 2024
M3L Poster
Readers: Everyone
Transformers Provably Solve Parity Efficiently with Chain of Thought
Juno Kim, Taiji Suzuki
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Declarative characterizations of direct preference alignment algorithms
Kyle Richardson, Vivek Srikumar, Ashish Sabharwal
Published: 11 Oct 2024, Last Modified: 11 Dec 2024
M3L Poster
Readers: Everyone
Commute Your Domains: Trajectory Optimality Criterion for Multi-Domain Learning
Alexey Rukhovich, Alexander Podolskiy, Irina Piontkovskaya
Published: 11 Oct 2024, Last Modified: 04 Dec 2024
M3L Poster
Readers: Everyone
Benign Overfitting in Single-Head Attention
Roey Magen, Shuning Shang, Zhiwei Xu, Spencer Frei, Wei Hu, Gal Vardi
Published: 11 Oct 2024, Last Modified: 03 Dec 2024
M3L Poster
Readers: Everyone
Benign Overfitting in Out-of-Distribution Generalization of Linear Models
Shange Tang, Jiayun Wu, Jianqing Fan, Chi Jin
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models
Frederik Kunstner, Alan Milligan, Robin Yadav, Mark Schmidt, Alberto Bietti
Published: 11 Oct 2024, Last Modified: 07 Dec 2024
M3L Poster
Readers: Everyone
Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks
Nikolaos Tsilivis, Gal Vardi, Julia Kempe
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
An empirical study of the $(L_0, L_1)$-smoothness condition
Y Cooper
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Transformers are Efficient Compilers, Provably
Xiyu Zhai, Runlong Zhou, Liao Zhang, Simon Shaolei Du
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Comparing Implicit and Denoising Score-Matching Objectives
Artem Artemev, Ayan Das, Farhang Nabiei, Alberto Bernacchia
Published: 11 Oct 2024, Last Modified: 27 Nov 2024
M3L Poster
Readers: Everyone
A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers
William Merrill, Ashish Sabharwal
Published: 11 Oct 2024, Last Modified: 10 Dec 2024
M3L Poster
Readers: Everyone
Misspecified $Q$-Learning with Sparse Linear Function Approximation: Tight Bounds on Approximation Error
Ally Yalei Du, Lin Yang, Ruosong Wang
Published: 11 Oct 2024, Last Modified: 13 Dec 2024
M3L Poster
Readers: Everyone
Dynamics of Concept Learning and Compositional Generalization
Yongyi Yang, Core Francisco Park, Ekdeep Singh Lubana, Maya Okawa, Wei Hu, Hidenori Tanaka
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data
Binghui Li, Yuanzhi Li
Published: 11 Oct 2024, Last Modified: 03 Dec 2024
M3L Poster
Readers: Everyone
Provable weak-to-strong generalization via benign overfitting
David Xing Wu, Anant Sahai
Published: 11 Oct 2024, Last Modified: 03 Dec 2024
M3L Poster
Readers: Everyone
Self-Improvement in Language Models: The Sharpening Mechanism
Audrey Huang, Adam Block, Dylan J Foster, Dhruv Rohatgi, Cyril Zhang, Max Simchowitz, Jordan T. Ash, Akshay Krishnamurthy
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone