Hello! I recently graduated summa cum laude from the University of Maryland, College Park with degrees in Computer Science (Honors) and Applied Mathematics. My research focuses on developing robust agents and agentic systems. I'm currently working on scaling robust reinforcement learning to foundation models and on developing theoretical guarantees for such algorithms. I will soon be joining Qlabs.
I've been fortunate to work with and be advised by the following professors: Furong Huang on LLM safety and reasoning, Radu Balan on physics-informed machine learning, Hua Wei on multi-agent reinforcement learning, and Vaneet Aggarwal on robust reinforcement learning.
My most recent publications are listed on Google Scholar.
‡ indicates equal contribution.
Compositional Adversarial Training for Robust Visual Watermarking
Anirudh Satheesh, Michael-Andrei Panaitescu-Liess, Andrew Xu, Georgios Milis, Heng Huang, Zikui Cai, Furong Huang
CompLearn ICML 2026 Workshop
Under Submission to NeurIPS 2026
Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities
Zora Che, Stephen Casper, Robert Kirk, Anirudh Satheesh, Stewart Slocum, Lev E McKinney, Rohit Gandikota, Aidan Ewart, Domenic Rosati, Zichu Wu, Zikui Cai, Bilal Chughtai, Yarin Gal, Furong Huang, Dylan Hadfield-Menell
TMLR 2025
A Technical Report on 'Erasing the Invisible': The 2024 NeurIPS Competition on Stress Testing Image Watermarks
Mucong Ding, Bang An, Tahseen Rabbani, Chenghao Deng, Anirudh Satheesh, Souradip Chakraborty, Mehrdad Saberi, Yuxin Wen, Kyle Rui Sang, Aakriti Agrawal, Xuandong Zhao, Mo Zhou, Mary-Anne Hartley, Lei Li, Yu-Xiang Wang, Vishal M. Patel, Soheil Feizi, Tom Goldstein, Furong Huang
NeurIPS D&B 2025
MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning
Zikui Cai, Andrew Wang, Anirudh Satheesh, Ankit Nakhawa, Hyunwoo Jae, Keenan Powell, Minghui Liu, Neel Jay, Sungbin Oh, Xiyao Wang, Yongyuan Liang, Tom Goldstein, Furong Huang
Under Review at ECCV 2026
EnsemW2S: Can an Ensemble of LLMs be Leveraged to Obtain a Stronger LLM?
Aakriti Agrawal, Mucong Ding, Zora Che, Chenghao Deng, Anirudh Satheesh, John Langford, Furong Huang
NeurIPS 2024 Safe Generative AI Workshop
ACL 2026 Findings
Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM Systems
Aakriti Agrawal, Rohith Aralikatti, Anirudh Satheesh, Souradip Chakraborty, Amrit Singh Bedi, Furong Huang
EMNLP 2025 Findings
Regret Analysis of Unichain Average Reward Constrained MDPs with General Parameterization
Anirudh Satheesh, Vaneet Aggarwal
Under Submission to NeurIPS 2026
Provably Efficient Algorithms for S- and Non-Rectangular Robust MDPs with General Parameterization
Anirudh Satheesh, Ziyi Chen, Furong Huang, Heng Huang
Under Submission to NeurIPS 2026
Global Convergence of Average Reward Constrained MDPs with Neural Critic and General Policy Parameterization
Anirudh Satheesh, Pankaj Kumar Barman, Washim Uddin Mondal, Vaneet Aggarwal
Under Submission to UAI 2026
Distributionally Robust Self Paced Curriculum Reinforcement Learning
Anirudh Satheesh, Keenan Powell, Vaneet Aggarwal
RLC 2026
Primal-Only Actor Critic Algorithm for Robust Constrained Average Cost MDPs
Anirudh Satheesh‡, Sooraj Sathish‡, Swetha Ganesh, Keenan Powell, Vaneet Aggarwal
arXiv
cMALC-D: Contextual Multi-Agent LLM-Guided Curriculum Learning with Diversity-Based Context Blending
Anirudh Satheesh, Keenan Powell, Hua Wei
CIKM 2025
A Constrained Multi-Agent Reinforcement Learning Approach to Autonomous Traffic Signal Control
Anirudh Satheesh, Keenan Powell
ACM Journal of Autonomous Transportation Systems 2025
PICore: Physics-Informed Unsupervised Coreset Selection for Data Efficient Neural Operator Training
Anirudh Satheesh, Anant Khandelwal, Mucong Ding, Radu Balan
TMLR 2025
SAFLEX: Self-Adaptive Augmentation via Feature Label Extrapolation
Mucong Ding, Bang An, Yuancheng Xu, Anirudh Satheesh, Furong Huang
ICLR 2024
Full Resume in PDF.