@roopalgarg

Staff Software Engineer @ Google DeepMind

Roopal Garg is a Staff Software Engineer at Google DeepMind, where he works on improving multimodal content understanding. His work centers on building high-quality datasets efficiently and using them to develop autoraters and metrics that advance and evaluate modern Vision-Language Models (VLMs). Recently, he has focused on hyper-detailed image descriptions, their implications for text-to-image models, and extending these capabilities to global geographical and cultural understanding. He received his MS in Computer Science, with a focus on Natural Language Processing, from the University of Southern California in 2013.

ImageInWords: Unlocking Hyper-Detailed Image Descriptions

Cohere For AI - Community Talks 2024

Deep Learning for Natural Language Processing

DataCon LA 2017