Research
I am currently working on LLM pretraining, with a particular interest in MoEs, scaling laws, and scalable low-resource decentralized training. I previously worked primarily on trustworthy ML, mostly privacy and AI safety.
|
|
Dense Backpropagation Improves Routing for Sparsely-Gated Mixture-of-Experts
Ashwinee Panda*,
Vatsal Baherwani*,
Zain Sarwar,
Benjamin Therien,
Supriyo Chakraborty,
Tom Goldstein
At NeurIPS 2024 OPT / ENLSP / Compression
We propose a new method for computing a dense gradient update for MoEs while keeping the forward pass sparse.
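A rough sketch of the general idea (not the paper's exact algorithm, and the module name and EMA buffer here are assumptions for illustration): cache a cheap running estimate of each expert's output and let unselected experts contribute that estimate, so every routing probability receives a gradient even though only the selected expert actually runs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseGradMoE(nn.Module):
    """Top-1 MoE layer whose router receives a dense gradient (illustrative sketch)."""
    def __init__(self, d_model, n_experts, momentum=0.99):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_experts))
        # Cheap running estimate of each expert's mean output (assumption for this sketch).
        self.register_buffer("ema_out", torch.zeros(n_experts, d_model))
        self.momentum = momentum

    def forward(self, x):                          # x: [batch, d_model]
        probs = F.softmax(self.router(x), dim=-1)  # dense routing probabilities
        top1 = probs.argmax(dim=-1)                # sparse selection: one expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top1 == e
            if mask.any():
                y = expert(x[mask])                # only selected tokens run this expert
                out[mask] = probs[mask, e].unsqueeze(-1) * y
                with torch.no_grad():              # refresh the cheap estimate
                    self.ema_out[e] = self.momentum * self.ema_out[e] + (1 - self.momentum) * y.mean(0)
        # Add every expert's *estimated* contribution weighted by its routing probability,
        # then subtract the estimate for the selected expert (its true output is already in `out`).
        # All columns of `probs` now reach the loss, so the router gets a dense gradient,
        # yet no unselected expert was ever evaluated.
        est_all = probs @ self.ema_out                                     # [batch, d_model]
        est_sel = probs.gather(-1, top1.unsqueeze(-1)) * self.ema_out[top1]
        return out + est_all - est_sel
```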
|
|
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
Ashwinee Panda,
Berivan Isik,
Xiangyu Qi,
Sanmi Koyejo,
Tsachy Weissman,
Prateek Mittal
At ICML 2024 WANT (Best Paper) / ES-FoMO (Oral)
paper /
code /
thread
Lottery Ticket Adaptation (LoTA) is a new adaptation method that handles challenging tasks, mitigates catastrophic forgetting, and enables model merging across different tasks.
|
|
Private Auditing of Large Language Models
Ashwinee Panda*,
Xinyu Tang*,
Milad Nasr,
Christopher A. Choquette-Choo,
Prateek Mittal
At ICML 2024 NextGenAISafety (Oral)
paper
We present the first method for privacy auditing of LLMs.
|
|
Private Fine-tuning of Large Language Models with Zeroth-order Optimization
Xinyu Tang*,
Ashwinee Panda*,
Milad Nasr*,
Saeed Mahloujifar,
Prateek Mittal
At TPDP 2024 (Oral)
paper
We propose the first method for performing differentially private fine-tuning of large language models without backpropagation. Our method is the first to provide a nontrivial privacy-utility tradeoff under pure differential privacy.
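A rough sketch of the zeroth-order idea, under assumed hyperparameters and a hypothetical `per_example_loss` helper: perturb the parameters along a random direction, measure per-example loss differences, clip and noise that scalar estimate, and update along the same direction, so no backpropagation is needed. Gaussian noise is shown; a Laplace mechanism on the same clipped scalars is the route to pure DP.

```python
import torch

def dp_zo_step(model, per_example_loss, batch, lr=1e-5, mu=1e-3, clip=1.0, sigma=1.0):
    """One differentially private zeroth-order step (illustrative sketch)."""
    params = [p for p in model.parameters() if p.requires_grad]
    zs = [torch.randn_like(p) for p in params]            # shared random direction

    def perturb(scale):
        for p, z in zip(params, zs):
            p.add_(scale * mu * z)

    with torch.no_grad():
        perturb(+1); loss_plus = per_example_loss(model, batch)   # shape: [batch]
        perturb(-2); loss_minus = per_example_loss(model, batch)
        perturb(+1)                                               # restore the parameters
        # Per-example finite-difference estimate of the directional derivative.
        d = (loss_plus - loss_minus) / (2 * mu)
        d = d.clamp(-clip, clip)                          # bound each example's contribution
        noisy = d.sum() + sigma * clip * torch.randn(())  # Gaussian noise (Laplace would give pure DP)
        step = lr * noisy / d.numel()
        for p, z in zip(params, zs):
            p.add_(-step * z)                             # move along the shared direction
```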
|
|
A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
Ashwinee Panda*,
Xinyu Tang*,
Vikash Sehwag,
Saeed Mahloujifar,
Prateek Mittal
At ICML 2024
talk /
paper /
code
We find that using scaling laws for differentially private hyperparameter optimization significantly outperforms prior work in both privacy and compute cost.
|
|
Teach LLMs to Phish: Stealing Private Information from Language Models
Ashwinee Panda,
Christopher A. Choquette-Choo,
Zhengming Zhang,
Yaoqing Yang,
Prateek Mittal
At ICLR 2024
talk /
paper /
thread
We propose a new practical data extraction attack that we call "neural phishing". This attack enables an adversary to target and extract sensitive or personally identifiable information (PII), e.g., credit card numbers, from a model trained on user data.
|
|
Privacy-Preserving In-Context Learning for Large Language Models
Tong Wu*,
Ashwinee Panda*,
Tianhao Wang*,
Prateek Mittal
At ICLR 2024
talk /
paper /
code /
thread
We propose the first method for performing differentially private in-context learning. Our method generates text via in-context learning while keeping the in-context exemplars differentially private, and it can be applied to black-box APIs (e.g., RAG).
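A toy sketch of one way to keep exemplars private via private ensembling, in the spirit of the paper but not its exact algorithm (`query_next_token` is a hypothetical black-box API call): partition the exemplars into disjoint subsets, query the model once per subset, and release each next token by a noisy vote.

```python
import random
from collections import Counter

def private_next_token(query_next_token, prompt, exemplars, n_subsets=10, eps=1.0):
    """Pick the next token by a noisy vote over disjoint exemplar subsets (toy sketch)."""
    random.shuffle(exemplars)
    subsets = [exemplars[i::n_subsets] for i in range(n_subsets)]
    votes = Counter(query_next_token(subset, prompt) for subset in subsets)
    # Report-noisy-max: each exemplar lands in exactly one subset, so it affects one vote.
    # Difference of two exponentials = Laplace(0, 1/eps) noise on each count.
    noisy = {tok: c + random.expovariate(eps) - random.expovariate(eps) for tok, c in votes.items()}
    return max(noisy, key=noisy.get)
```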
|
|
Visual Adversarial Examples Jailbreak Aligned Large Language Models
Xiangyu Qi*,
Kaixuan Huang*,
Ashwinee Panda,
Peter Henderson,
Mengdi Wang,
Prateek Mittal
At AAAI 2024 (Oral)
paper /
code
We propose the first method for generating visual adversarial examples that serve as transferable universal jailbreaks against aligned large language models.
|
|
Differentially Private Image Classification by Learning Priors from Random Processes
Xinyu Tang*,
Ashwinee Panda*,
Vikash Sehwag,
Prateek Mittal
At NeurIPS 2023 (Spotlight)
paper /
code
We pretrain networks on synthetic images generated by random processes and show strong performance on downstream private computer vision tasks.
|
|
Differentially Private Generation of High Fidelity Samples From Diffusion Models
Vikash Sehwag*,
Ashwinee Panda*,
Ashwini Pokle,
Xinyu Tang,
Saeed Mahloujifar,
Mung Chiang,
J Zico Kolter,
Prateek Mittal
At ICML 2023 GenAI Workshop
paper /
poster
We generate differentially private images from non-privately trained diffusion models by analyzing the inherent privacy of stochastic sampling.
|
|
Neurotoxin: Durable Backdoors in Federated Learning
Zhengming Zhang*,
Ashwinee Panda*,
Linyue Song,
Yaoqing Yang,
Prateek Mittal,
Joseph Gonzalez,
Kannan Ramchandran,
Michael Mahoney
At ICML 2022 (Spotlight)
paper /
poster /
code
Neurotoxin is a novel model poisoning attack for federated learning whose backdoor persists in the system for up to 5x longer than the baseline attack.
|
|
SparseFed: Mitigating Model Poisoning Attacks in Federated Learning via Sparsification
Ashwinee Panda,
Saeed Mahloujifar,
Arjun Bhagoji,
Supriyo Chakraborty,
Prateek Mittal
At AISTATS 2022
paper /
code
SparseFed is a provably robust defense against model poisoning attacks in federated learning that uses server-side sparsification to avoid updating malicious neurons.
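A minimal sketch of the server-side step under assumed notation (the full algorithm also uses client-side clipping and server-side error feedback, omitted here): the server averages client updates and applies only the top-k coordinates, so a poisoned update only survives if it places its mass on coordinates that benign clients also rank highly.

```python
import torch

def aggregate_topk(client_updates, k):
    """Average client updates and keep only the k largest-magnitude coordinates (sketch)."""
    mean_update = torch.stack(client_updates).mean(dim=0)
    flat = mean_update.flatten()
    idx = torch.topk(flat.abs(), k).indices   # coordinates benign clients agree on most strongly
    sparse = torch.zeros_like(flat)
    sparse[idx] = flat[idx]
    return sparse.view_as(mean_update)
```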
|
|
FetchSGD: Communication-Efficient Federated Learning with Sketching
Daniel Rothchild*,
Ashwinee Panda*,
Enayat Ullah,
Nikita Ivkin,
Ion Stoica,
Vladimir Braverman,
Joseph Gonzalez,
Raman Arora
At ICML 2020
paper /
code
FetchSGD is a communication-efficient federated learning algorithm that compresses gradient updates with sketches.
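A small sketch of the underlying compression primitive, a Count Sketch (sizes and names here are assumptions; FetchSGD additionally keeps momentum and error accumulation on the server): because the sketch is linear, the server can simply sum the clients' sketches and then recover the heavy gradient coordinates from the merged table.

```python
import numpy as np

def count_sketch(grad, rows=5, cols=10_000, seed=0):
    """Compress a flat gradient into a small rows x cols table (Count Sketch)."""
    rng = np.random.default_rng(seed)                     # shared seed: all clients hash identically
    buckets = rng.integers(0, cols, size=(rows, grad.size))
    signs = rng.choice([-1.0, 1.0], size=(rows, grad.size))
    table = np.zeros((rows, cols))
    for r in range(rows):
        np.add.at(table[r], buckets[r], signs[r] * grad)  # scatter-add signed coordinates
    return table, buckets, signs

def estimate(table, buckets, signs, i):
    """Estimate coordinate i as the median of the per-row estimates."""
    return np.median([signs[r, i] * table[r, buckets[r, i]] for r in range(table.shape[0])])

# Linearity is what makes this federated-friendly: the server sums client tables
# elementwise and queries the heavy coordinates from the merged sketch.
```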
|
|
SoftPBT: Leveraging Experience Replay for Efficient Hyperparameter Schedule Search
Ashwinee Panda,
Eric Liang,
Richard Liaw,
Joey Gonzalez
paper /
code
|
|