About Me

I am a Research Scientist and Senior Research Manager in the Enterprise Reliability group at ServiceNow AI Research, and I am also an Adjunct Faculty member at the School of Computer Science, McGill University. My research focuses on building robust, safe, and secure foundation models, with interests spanning multimodal understanding, AI safety, multi-agent security and privacy, and autonomous agents.

I received my Ph.D. in Computer Science from the University of Edinburgh, UK, where I was advised by Prof. Mirella Lapata and Prof. Frank Keller. Prior to joining ServiceNow, I worked at Amazon AI and Alexa AI for six years. I have also worked as a visiting researcher at Meta AI Research and Microsoft Research Redmond and India.

I am an active member of the academic community: I have organized workshops (RepL4NLP 2018-2022) and shared tasks, and have served as Area Chair and Program Committee member at major NLP/ML conferences.


Research Interests

  • Multimodal Foundation Models: Vision-language models, document understanding, chart reasoning
  • AI Agent Reliability: Safe deployment of autonomous agents, evaluating agent safety
  • Multi-agent Security and Privacy: Security and privacy in multi-agent systems

News

  • 2026: Presenting a tutorial on Multimodal Agents at EACL 2026 [slides]
  • 2026: CUA-Suite released — a suite of benchmarks and tools for evaluating computer use agents
  • 2026: GroundCUA and DRBench accepted at ICLR 2026
  • 2026: StarFlow: Generating Structured Workflow Outputs From Sketch Images accepted at EACL 2026 as a main conference paper
  • 2025: AlignVLM accepted at NeurIPS 2025
  • 2025: SafeArena benchmark for evaluating web agent safety accepted at ICML 2025

Selected Publications

See my Google Scholar for a complete list.

GroundCUA
Grounding Computer Use Agents on Human Demonstrations
Aarash Feizi, Shravan Nayak, Xiangru Jian, Kevin Qinghong Lin, Kaixin Li, Rabiul Awal, Xing Han Lù, Johan Obando-Ceron, Juan A. Rodriguez, Nicolas Chapados, David Vazquez, Adriana Romero-Soriano, Reihaneh Rabbany, Perouz Taslakian, Christopher Pal, Sai Rajeswar, Spandana Gella
ICLR 2026 [paper] [website] [code] [dataset]
DRBench
DRBench: A Realistic Benchmark for Enterprise Deep Research
Amirhossein Abaskohi, Tianyi Chen, Miguel Muñoz-Mármol, Curtis Fox, Amrutha Varshini Ramesh, Étienne Marcotte, Xing Han Lù, Nicolas Chapados, Spandana Gella, Christopher Pal, Alexandre Drouin, Issam H. Laradji
ICLR 2026 [paper] [dataset]
StarFlow
StarFlow: Generating Structured Workflow Outputs From Sketch Images
Patrice Bechard, Chao Wang, Amirhossein Abaskohi, Juan Rodriguez, Christopher Pal, David Vazquez, Spandana Gella, Sai Rajeswar, Perouz Taslakian
EACL 2026 [paper] [website]
SafeArena
SafeArena: Evaluating the Safety of Autonomous Web Agents
Ada Tur, Nicholas Meade, ..., Spandana Gella, Karolina Stanczak, Siva Reddy
ICML 2025 [paper] [website]
UI-Vision
UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction
Shravan Nayak, ..., Spandana Gella, Sai Rajeswar Mudumba
ICML 2025 [paper] [website]
BigDocs
BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models
Juan A. Rodriguez, ..., Spandana Gella, et al.
ICLR 2025 [paper] [website]
BigCharts-R1
BigCharts-R1: Enhanced Chart Reasoning with Visual Reinforcement Finetuning
Ahmed Masry, Abhay Puri, Masoud Hashemi, Juan A. Rodriguez, ..., Perouz Taslakian, Sai Rajeswar, Spandana Gella
COLM 2025 [paper] [website] [code]
AlignVLM
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding
Ahmed Masry, ..., Spandana Gella, Sai Rajeswar Mudumba
NeurIPS 2025 [paper] [website]

Contact

Feel free to reach out via email or connect on LinkedIn!