About Me

I am a Research Lead and Sr. Manager for Agentic Defenses and Harnesses at ServiceNow AI Research, and I am also an Adjunct Faculty member at the School of Computer Science, McGill University. My research focuses on building robust and safe AI agents — spanning agentic harnesses, defenses against adversarial and unsafe behavior, reliable agent evaluation, and multi-agent security and privacy.

I received my Ph.D. in Computer Science from the University of Edinburgh, UK, where I was advised by Prof. Mirella Lapata and Prof. Frank Keller. Prior to joining ServiceNow, I worked at Amazon AI and Alexa AI for six years. I have also worked as a visiting researcher at Meta AI Research and Microsoft Research Redmond and India.

I am an active member of the academic community: I have organized workshops (RepL4NLP 2018-2022) and shared tasks, and I have served as Area Chair and Program Committee member at major NLP/ML conferences.

Research Interests

Robust & Safe Agents: Building reliable autonomous agents; defenses against adversarial, unsafe, and out-of-distribution behavior
Agentic Harnesses & Evaluation: Benchmarks, tooling, and guardrails for evaluating and deploying agents safely
Multi-agent Security & Privacy: Security and privacy in multi-agent systems

News

2026: MosaicLeaks: our paper on how deep research agents leak private information through their open web queries is out
2026: Organizing the Lifelong Agents workshop at COLM 2026
2026: Presenting a tutorial on Multimodal Agents at EACL 2026 — slides
2026: CUA-Suite released — a suite of benchmarks and tools for evaluating computer use agents
2026: GroundCUA and DRBench accepted at ICLR 2026
2026: StarFlow: Generating Structured Workflow Outputs From Sketch Images accepted at EACL 2026 as a main conference paper
2025: AlignVLM accepted at NeurIPS 2025
2025: SafeArena benchmark for evaluating web agent safety accepted at ICML 2025

Selected Publications

See my Google Scholar profile for a complete list.

Grounding Computer Use Agents on Human Demonstrations
Aarash Feizi, Shravan Nayak, Xiangru Jian, Kevin Qinghong Lin, Kaixin Li, Rabiul Awal, Xing Han Lù, Johan Obando-Ceron, Juan A. Rodriguez, Nicolas Chapados, David Vazquez, Adriana Romero-Soriano, Reihaneh Rabbany, Perouz Taslakian, Christopher Pal, Sai Rajeswar, Spandana Gella
ICLR 2026 [paper] [website] [code] [dataset]

DRBench: A Realistic Benchmark for Enterprise Deep Research
Amirhossein Abaskohi, Tianyi Chen, Miguel Muñoz-Mármol, Curtis Fox, Amrutha Varshini Ramesh, Étienne Marcotte, Xing Han Lù, Nicolas Chapados, Spandana Gella, Christopher Pal, Alexandre Drouin, Issam H. Laradji
ICLR 2026 [paper] [dataset]

StarFlow: Generating Structured Workflow Outputs From Sketch Images
Patrice Bechard, Chao Wang, Amirhossein Abaskohi, Juan Rodriguez, Christopher Pal, David Vazquez, Spandana Gella, Sai Rajeswar, Perouz Taslakian
EACL 2026 [paper] [website]

SafeArena: Evaluating the Safety of Autonomous Web Agents
Ada Tur, Nicholas Meade, ..., Spandana Gella, Karolina Stanczak, Siva Reddy
ICML 2025 [paper] [website]

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction
Shravan Nayak, ..., Spandana Gella, Sai Rajeswar Mudumba
ICML 2025 [paper] [website]

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models
Juan A. Rodriguez, ..., Spandana Gella, et al.
ICLR 2025 [paper] [website]

BigCharts-R1: Enhanced Chart Reasoning with Visual Reinforcement Finetuning
Ahmed Masry, Abhay Puri, Masoud Hashemi, Juan A. Rodriguez, ..., Perouz Taslakian, Sai Rajeswar, Spandana Gella
COLM 2025 [paper] [website] [code]

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding
Ahmed Masry, ..., Spandana Gella, Sai Rajeswar Mudumba
NeurIPS 2025 [paper] [website]

Contact

Feel free to reach out via email or connect on LinkedIn!