NeurIPS 2024 Workshop MINT Submissions
Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models
Xinyu Zhou, Delong Chen, Samuel Cahyawijaya, Xufeng Duan, Zhenguang Cai
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Steering Clear: A Systematic Study of Activation Steering in a Toy Setup
Dmitrii Krasheninnikov, David Krueger
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Dipper: Diversity in Prompts for Producing Large Language Model Ensembles in Reasoning tasks
Gregory Kang Ruey Lau, Wenyang Hu, Liu Diwen, Chen Jizhuo, See-Kiong Ng, Bryan Kian Hsiang Low
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Analysing the Residual Stream of Language Models Under Knowledge Conflicts
Yu Zhao, Xiaotang Du, Giwon Hong, Aryo Pradipta Gema, Alessio Devoto, Hongru WANG, Xuanli He, Kam-Fai Wong, Pasquale Minervini
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Zero-to-Hero: Enhancing Zero-Shot Novel View Synthesis via Attention Map Filtering
Ido Sobol, Chenfeng Xu, Or Litany
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Ablation is Not Enough to Emulate DPO: How Neuron Dynamics Drive Toxicity Reduction
Yushi Yang, Filip Sondej, Harry Mayne, Adam Mahdi
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Probing the Decision Boundaries of In-context Learning in Large Language Models
Siyan Zhao, Tung Nguyen, Aditya Grover
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Towards Reliable Evaluation of Behavior Steering Interventions in LLMs
Itamar Pres, Laura Ruis, Ekdeep Singh Lubana, David Krueger
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Comparing Bottom-Up and Top-Down Steering Approaches on In-Context Learning Tasks
Madeline Brumley, Joe Kwon, David Krueger, Dmitrii Krasheninnikov, Usman Anwar
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Linearly Controlled Language Generation with Performative Guarantees
Emily Cheng, Marco Baroni, Carmen Amo Alonso
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Steering Large Language Models using Conceptors: Improving Addition-Based Activation Engineering
Joris Postmus, Steven Abreu
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
SCIURus: Shared Circuits for Interpretable Uncertainty Representations in Language Models
Carter Teplica, Yixin Liu, Arman Cohan, Tim G. J. Rudner
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Unveiling and Manipulating Concepts in Time Series Foundation Models
Michał Wiliński, Mononito Goswami, Nina Żukowska, Willa Potosnak, Artur Dubrawski
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Uncovering Uncertainty in Transformer Inference
Greyson Brothers, Willa M. Mannering, John Winder, Amber Tien
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Can sparse autoencoders be used to decompose and interpret steering vectors?
Harry Mayne, Yushi Yang, Adam Mahdi
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Understanding Visual Concepts Across Models
Brandon Trabucco, Max A Gurinas, Kyle Doherty, Russ Salakhutdinov
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Decomposing and Editing Predictions by Modeling Model Computation
Harshay Shah, Andrew Ilyas, Aleksander Madry
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Analyzing (In)Abilities of SAEs via Formal Languages
Abhinav Menon, Manish Shrivastava, Ekdeep Singh Lubana, David Krueger
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models
Peng Wang, Zexi Li, Ningyu Zhang, Ziwen Xu, Yunzhi Yao, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Is Free Self-Alignment Possible?
Dyah Adila, Changho Shin, Yijing Zhang, Frederic Sala
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Representation Tuning
Christopher Ackerman
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
GPT-2 Small Fine-Tuned on Logical Reasoning Summarizes Information on Punctuation Tokens
Sonakshi Chauhan, Atticus Geiger
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Do LLMs internally "know" when they follow instructions?
Juyeon Heo, Christina Heinze-Deml, Oussama Elachqar, Shirley You Ren, Kwan Ho Ryan Chan, Udhyakumar Nallasamy, Andrew Miller, Jaya Narain
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
LoFiT: Localized Fine-tuning on LLM Representations
Fangcong Yin, Xi Ye, Greg Durrett
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone
Entropy-Based Decoding for Retrieval-Augmented Large Language Models
Zexuan Qiu, Zijing Ou, Bin Wu, Jingjing Li, Aiwei Liu, Irwin King
Published: 09 Oct 2024, Last Modified: 15 Dec 2024
MINT@NeurIPS2024
Readers: Everyone