Research
I work on a number of topics in large language models (LLMs).
Right now I'm trying to understand how reasoning improves LLMs.
Another big direction is identifying which parts of the LLM pipeline bottleneck further scaling.
Much of my research is conducted through the lens of sparsity; I was a lead organizer of the ICLR 2025 Workshop on Sparsity in LLMs.
|
|
Safety Alignment Should Be Made More Than Just a Few Tokens Deep
Xiangyu Qi,
Ashwinee Panda,
Kaifeng Lyu,
Xiao Ma,
Subhrajit Roy,
Ahmad Beirami,
Prateek Mittal,
Peter Henderson
At ICLR 2025 (Outstanding Paper Award)
paper /
code
We analyze safety alignment and show that it is largely shallow, concentrated in the first few output tokens, and we propose methods for making it deeper.
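For a flavor of the token-wise analysis, here is a minimal sketch that measures where an aligned model's next-token distribution diverges from its base model's; if alignment is shallow, the per-position KL should spike on the first few response tokens and then fall off. Model names and the prompt/response strings are placeholders.

```python
# Sketch: per-position KL between an aligned model and its base model.
# Shallow alignment predicts the divergence concentrates at the start
# of the response. Model names and texts below are placeholders.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("base-model")        # placeholder
aligned = AutoModelForCausalLM.from_pretrained("aligned-model")  # placeholder
tok = AutoTokenizer.from_pretrained("aligned-model")

prompt = "..."    # a harmful prompt (placeholder)
response = "..."  # a continuation to score (placeholder)

ids = tok(prompt + response, return_tensors="pt").input_ids
start = tok(prompt, return_tensors="pt").input_ids.shape[1]

with torch.no_grad():
    logp_base = F.log_softmax(base(ids).logits, dim=-1)
    logp_aligned = F.log_softmax(aligned(ids).logits, dim=-1)

# KL(aligned || base) at every position; logits at index t predict token t+1.
kl = (logp_aligned.exp() * (logp_aligned - logp_base)).sum(-1)[0]
for pos in range(start, ids.shape[1]):
    print(f"response position {pos - start}: KL = {kl[pos - 1].item():.3f}")
```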
|
|
Dense Backpropagation Improves Training for Sparse Mixture-of-Experts
Ashwinee Panda,
Vatsal Baherwani,
Zain Sarwar,
Benjamin Therien,
Supriyo Chakraborty,
Tom Goldstein
NeurIPS 2024 Workshops
paper /
code /
thread
We present a lightweight approximation method that gives the MoE router a dense gradient update while continuing to sparsely activate its parameters.
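A minimal sketch of one way to realize this, assuming (my reading of the abstract; the paper's details may differ) that non-selected experts contribute a cached running-average "default" output, so the router receives a gradient for every expert while expert compute stays sparse:

```python
# Sketch: top-k MoE forward pass where experts that are not selected
# contribute a detached EMA of their past outputs. The router's softmax
# weights multiply all experts, so its gradient is dense; only selected
# experts are actually computed. The EMA default output is an assumption.
import torch
import torch.nn as nn

class DenseGradMoE(nn.Module):
    def __init__(self, dim, n_experts, k=2, ema=0.99):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.k, self.ema = k, ema
        # Running estimate of each expert's mean output (no gradient).
        self.register_buffer("default_out", torch.zeros(n_experts, dim))

    def forward(self, x):                       # x: (tokens, dim)
        probs = self.router(x).softmax(-1)      # dense routing probabilities
        topk = probs.topk(self.k, dim=-1).indices
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            hit = (topk == e).any(-1)
            y = torch.zeros_like(x)
            if hit.any():                       # compute only selected tokens
                y[hit] = expert(x[hit])
                with torch.no_grad():           # refresh this expert's default
                    mean = y[hit].mean(0)
                    self.default_out[e] = self.ema * self.default_out[e] + (1 - self.ema) * mean
            y[~hit] = self.default_out[e]       # cheap stand-in, no expert compute
            out = out + probs[:, e:e+1] * y     # router sees a dense gradient
        return out
```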
|
|
Analysis of Attention in Video Diffusion Transformers
Yuxin Wen,
Jim Wu,
Ajay Jain,
Tom Goldstein,
Ashwinee Panda
arXiv 2025
paper /
website /
thread
We conduct an in-depth analysis of attention in video diffusion transformers (VDiTs) and report several novel findings.
|
|
Privacy Auditing of Large Language Models
Ashwinee Panda*,
Xinyu Tang*,
Milad Nasr,
Christopher A. Choquette-Choo,
Prateek Mittal
At ICLR 2025
paper /
thread
We present the first method for privacy auditing of large language models.
|
|
Gemstones 💎: A Model Suite for Multi-Faceted Scaling Laws
Sean McLeish,
John Kirchenbauer,
David Yu Miller,
Siddharth Singh,
Abhinav Bhatele,
Micah Goldblum,
Ashwinee Panda♠️,
Tom Goldstein
At ICLR 2025 Sci-FM
paper /
code /
models /
website /
thread
We release the Gemstones, an open-source model suite for studying multi-faceted scaling laws.
|
|
Using Attention Sinks to Identify and Evaluate Dormant Heads in Pretrained LLMs
Pedro Sandoval-Segura,
Xijun Wang,
Ashwinee Panda,
Micah Goldblum,
Ronen Basri,
Tom Goldstein,
David Jacobs
arXiv 2025
paper /
thread
We propose a new definition for attention heads dominated by attention sinks, which we call dormant attention heads.
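A toy illustration of the spirit of the definition: flag heads whose attention mass concentrates on the sink (first) token. The 0.9 threshold is a placeholder, not the paper's exact criterion.

```python
# Sketch: call a head "dormant" when its average attention weight on the
# first key (the attention sink) exceeds a threshold. Threshold is a
# placeholder; the paper's definition may differ in detail.
import torch

def dormant_heads(attn, threshold=0.9):
    """attn: (heads, query_len, key_len) post-softmax attention weights."""
    sink_mass = attn[:, :, 0].mean(dim=-1)   # avg weight on the sink per head
    return (sink_mass > threshold).nonzero(as_tuple=True)[0]

# Toy usage: head 0 attends almost entirely to the sink token.
attn = torch.full((2, 4, 4), 0.25)
attn[0] = torch.tensor([[1.0, 0.0, 0.0, 0.0]] * 4)
print(dormant_heads(attn))   # tensor([0])
```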
|
|
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
Ashwinee Panda,
Berivan Isik,
Xiangyu Qi,
Sanmi Koyejo,
Tsachy Weissman,
Prateek Mittal
At ICML 2024 WANT (Best Paper) / ES-FoMO (Oral)
paper /
code /
thread
Lottery Ticket Adaptation (LoTA) is a new adaptation method that handles challenging tasks, mitigates catastrophic forgetting, and enables model merging across different tasks.
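A simplified sketch of the lottery-ticket-style recipe as I read it: fine-tune briefly to find which weights move the most, rewind, then fine-tune again while updating only that sparse mask. The real procedure differs in details; see the paper and code.

```python
# Sketch of lottery ticket adaptation: (1) a short calibration fine-tune,
# (2) build a sparse mask from the largest-magnitude weight deltas,
# (3) rewind the weights, (4) fine-tune with gradients masked to the ticket.
import torch

def calibrate_mask(model, finetune_fn, sparsity=0.9):
    before = {n: p.detach().clone() for n, p in model.named_parameters()}
    finetune_fn(model)                               # placeholder calibration run
    masks = {}
    for n, p in model.named_parameters():
        delta = (p.detach() - before[n]).abs()
        k = max(1, int((1 - sparsity) * delta.numel()))
        thresh = delta.flatten().topk(k).values.min()
        masks[n] = (delta >= thresh).float()
        p.data.copy_(before[n])                      # rewind to original weights
    return masks

def masked_step(model, masks, loss):
    loss.backward()
    with torch.no_grad():
        for n, p in model.named_parameters():
            if p.grad is not None:
                p.grad.mul_(masks[n])                # update only the ticket
    # ...then call optimizer.step() as usual
```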
|
|
Private Fine-tuning of Large Language Models with Zeroth-order Optimization
Xinyu Tang*,
Ashwinee Panda*,
Milad Nasr*,
Saeed Mahloujifar,
Prateek Mittal
At TMLR 2025, TPDP 2024 (Oral)
paper
We propose the first method for performing differentially private fine-tuning of large language models without backpropagation. Our method is the first to provide a nontrivial privacy-utility tradeoff under pure differential privacy.
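A minimal sketch of a differentially private zeroth-order step: a two-point finite difference along a shared random direction yields a scalar gradient estimate, which is clipped and noised before stepping. Hyperparameters are placeholders, and clipping is shown for a single example for brevity (the real method clips per-example contributions).

```python
# Sketch: DP zeroth-order optimization. The privatized quantity is a single
# scalar (the directional derivative estimate), so no backprop is needed.
import torch

def dp_zo_step(params, loss_fn, lr=1e-4, eps=1e-3, clip=1.0, sigma=1.0):
    zs = [torch.randn_like(p) for p in params]       # shared random direction
    with torch.no_grad():
        for p, z in zip(params, zs): p.add_(eps * z)
        loss_plus = loss_fn()
        for p, z in zip(params, zs): p.sub_(2 * eps * z)
        loss_minus = loss_fn()
        for p, z in zip(params, zs): p.add_(eps * z)  # restore original params
        g = float(loss_plus - loss_minus) / (2 * eps)  # scalar "gradient"
        g = max(min(g, clip), -clip)                   # clip for sensitivity
        g = g + sigma * clip * torch.randn(()).item()  # Gaussian noise
        for p, z in zip(params, zs): p.sub_(lr * g * z)
```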
|
|
A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
Ashwinee Panda*,
Xinyu Tang*,
Vikash Sehwag,
Saeed Mahloujifar,
Prateek Mittal
At ICML 2024
talk /
paper /
code
We find that using a linear scaling rule for differentially private hyperparameter optimization significantly outperforms prior work in both privacy cost and compute cost.
|
|
Teach LLMs to Phish: Stealing Private Information from Language Models
Ashwinee Panda,
Christopher A. Choquette-Choo,
Zhengming Zhang,
Yaoqing Yang,
Prateek Mittal
At ICLR 2024
talk /
paper /
thread
We propose a new practical data extraction attack that we call "neural phishing". This attack enables an adversary to target and extract sensitive or personally identifiable information (PII), e.g., credit card numbers, from a model trained on user data.
|
|
Privacy-Preserving In-Context Learning for Large Language Models
Tong Wu*,
Ashwinee Panda*,
Tianhao Wang*,
Prateek Mittal
At ICLR 2024
talk /
paper /
code /
thread
We propose the first method for performing differentially private in-context learning. Our method generates text via in-context learning while keeping the in-context exemplars differentially private, and it can be applied to black-box APIs (e.g., RAG).
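A simplified sketch of the private-aggregation idea for classification-style outputs: split the exemplars into disjoint subsets, query the black-box model once per subset, and release a noisy-majority answer. Here `query_model` stands in for any black-box API, and the paper's aggregation mechanisms (especially for open-ended generation) are more involved.

```python
# Sketch: DP in-context learning via subsample-and-aggregate. Each exemplar
# lands in exactly one subset, so it shifts at most one vote; Laplace noise
# on the vote counts then yields differential privacy (report-noisy-max).
import random
from collections import Counter

def dp_icl(exemplars, question, query_model, n_subsets=10, noise_scale=1.0):
    random.shuffle(exemplars)
    subsets = [exemplars[i::n_subsets] for i in range(n_subsets)]
    votes = Counter(query_model(subset, question) for subset in subsets)
    # Laplace(b) noise as the difference of two Exp(1/b) draws.
    noisy = {ans: c + random.expovariate(1 / noise_scale)
                    - random.expovariate(1 / noise_scale)
             for ans, c in votes.items()}
    return max(noisy, key=noisy.get)
```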
|
|
Visual Adversarial Examples Jailbreak Aligned Large Language Models
Xiangyu Qi*,
Kaixuan Huang*,
Ashwinee Panda,
Peter Henderson,
Mengdi Wang,
Prateek Mittal
At AAAI 2024 (Oral)
paper /
code
We propose the first method for generating visual adversarial examples that serve as transferable universal jailbreaks against aligned large language models.
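At its core the attack is projected gradient descent on the input image. A sketch, where `vlm_loss` is a placeholder for a function returning -log p(target continuation | image, prompt) under the vision-language model:

```python
# Sketch: PGD on the image to maximize the likelihood of a target (harmful)
# continuation, with an L-infinity perturbation budget. `vlm_loss` is a
# placeholder; step size, budget, and iteration count are illustrative.
import torch

def pgd_jailbreak(image, vlm_loss, steps=500, alpha=1/255, eps=16/255):
    orig = image.clone()
    adv = image.clone().requires_grad_(True)
    for _ in range(steps):
        loss = vlm_loss(adv)                  # -log p(target | adv image, prompt)
        loss.backward()
        with torch.no_grad():
            adv -= alpha * adv.grad.sign()    # descend on the negative log-lik
            adv.clamp_(orig - eps, orig + eps).clamp_(0, 1)  # project to budget
            adv.grad.zero_()
    return adv.detach()
```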
|
|
Differentially Private Image Classification by Learning Priors from Random Processes
Xinyu Tang*,
Ashwinee Panda*,
Vikash Sehwag,
Prateek Mittal
At NeurIPS 2023 (Spotlight)
paper /
code
We pretrain networks on synthetic images from random processes and achieve strong performance on downstream private computer vision tasks.
|
|
Differentially Private Generation of High Fidelity Samples From Diffusion Models
Vikash Sehwag*,
Ashwinee Panda*,
Ashwini Pokle,
Xinyu Tang,
Saeed Mahloujifar,
Mung Chiang,
J Zico Kolter,
Prateek Mittal
At ICML 2023 GenAI Workshop
paper /
poster
We generate differentially private images from non-privately trained diffusion models by analyzing the inherent privacy of stochastic sampling.
|
|
Neurotoxin: Durable Backdoors in Federated Learning
Zhengming Zhang*,
Ashwinee Panda*,
Linyue Song,
Yaoqing Yang,
Prateek Mittal,
Joseph Gonzalez,
Kannan Ramchandran,
Michael Mahoney
In ICML 2022 (Spotlight)
paper /
poster /
code
Neurotoxin is a novel model poisoning attack for federated learning whose backdoor persists in the system up to 5× longer than the baseline attack's.
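A sketch of the core projection step, simplified to a flat parameter vector: the attacker constrains its update to the coordinates that benign updates touch least, so later benign training is less likely to overwrite the backdoor.

```python
# Sketch: Neurotoxin-style projection. Keep the malicious update only on
# the bottom-k magnitude coordinates of the observed benign gradient
# (the "quiet" parameters that honest clients rarely move).
import torch

def neurotoxin_project(malicious_update, benign_gradient, frac=0.9):
    k = int(frac * benign_gradient.numel())
    idx = benign_gradient.abs().topk(k, largest=False).indices
    mask = torch.zeros_like(benign_gradient)
    mask[idx] = 1.0
    return malicious_update * mask     # attack only rarely-updated coordinates
```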
|
|
SparseFed: Mitigating Model Poisoning Attacks in Federated Learning via Sparsification
Ashwinee Panda,
Saeed Mahloujifar,
Arjun Bhagoji,
Supriyo Chakraborty,
Prateek Mittal
In AISTATS 2022
paper /
code
SparseFed is a provably robust defense against model poisoning attacks in federated learning that uses server-side sparsification to avoid updating malicious neurons.
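A minimal sketch of the server-side mechanism: average the client updates, apply only the top-k coordinates each round, and carry the rest forward as error feedback. Shapes are flattened to a single vector for brevity.

```python
# Sketch: SparseFed-style server aggregation. Applying only the top-k
# coordinates of the averaged update limits how much a poisoned client
# can move parameters outside the dominant update directions.
import torch

class SparseFedServer:
    def __init__(self, dim, k):
        self.k = k
        self.residual = torch.zeros(dim)   # error-feedback accumulator

    def aggregate(self, client_updates):
        update = torch.stack(client_updates).mean(0) + self.residual
        idx = update.abs().topk(self.k).indices
        sparse = torch.zeros_like(update)
        sparse[idx] = update[idx]          # keep only the top-k coordinates
        self.residual = update - sparse    # remember what was dropped
        return sparse                      # applied to the global model
```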
|
|
FetchSGD: Communication-Efficient Federated Learning with Sketching
Daniel Rothchild*,
Ashwinee Panda*,
Enayat Ullah,
Nikita Ivkin,
Ion Stoica,
Vladimir Braverman,
Joseph Gonzalez,
Raman Arora
In ICML 2020
paper /
code
FetchSGD is a communication-efficient federated learning algorithm that compresses gradient updates with sketches.
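A minimal sketch of the sketching pipeline: clients compress gradients into a Count Sketch, which is linear, so the server can merge client sketches by summation and then recover the heavy coordinates. Here a single shared sketch stands in for summing the clients' tables, and the real algorithm also keeps momentum and error feedback in sketch space.

```python
# Sketch: Count Sketch compression for federated gradients. All parties
# share the same hash/sign functions (same seed), so sketches add linearly.
import numpy as np

class CountSketch:
    def __init__(self, dim, width, rows=5, seed=0):
        rng = np.random.default_rng(seed)            # shared across clients
        self.buckets = rng.integers(0, width, size=(rows, dim))
        self.signs = rng.choice([-1.0, 1.0], size=(rows, dim))
        self.table = np.zeros((rows, width))

    def add(self, vec):                              # O(rows * dim) insert
        for r in range(self.table.shape[0]):
            np.add.at(self.table[r], self.buckets[r], self.signs[r] * vec)

    def estimate(self, i):                           # median-of-rows estimate
        rows = np.arange(self.table.shape[0])
        return np.median(self.table[rows, self.buckets[:, i]] * self.signs[:, i])

# Server side: merged sketch stands in for the sum of client sketches.
dim, width = 1000, 64
grads = [np.random.randn(dim) for _ in range(3)]     # toy client gradients
merged = CountSketch(dim, width)
for g in grads:
    merged.add(g)
est = np.array([merged.estimate(i) for i in range(dim)])
topk = np.argsort(-np.abs(est))[:10]                 # heavy-hitter coordinates
print(topk)
```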
|
|
SoftPBT: Leveraging Experience Replay for Efficient Hyperparameter Schedule Search
Ashwinee Panda,
Eric Liang,
Richard Liaw,
Joey Gonzalez
paper /
code
SoftPBT leverages experience replay to search hyperparameter schedules more efficiently than population based training.
|
|