Om Khangaonkar

I'm a fourth-year undergraduate at UC Davis, advised by Hamed Pirsiavash. My research focuses on computer vision and machine learning.

Specifically, I am interested in the intersection of representation learning, generative modeling, and scene understanding. Large generative models (e.g., Stable Diffusion, FLUX.1) have learned to model much of our visual world. How can we leverage the rich representations learned by these models to build generalizable models of perception from limited supervision, as humans do?

Email  /  Twitter  /  Google Scholar

gen2seg: Generative Models Enable Generalizable Instance Segmentation
Om Khangaonkar and Hamed Pirsiavash
arXiv, 2025
project page / arXiv

We finetune generative models (e.g., Stable Diffusion, MAE) to segment object instances for a narrow set of object types. Many interesting properties emerge, including 1) zero-shot generalization to objects unlike anything in the finetuning data, 2) excellent performance at segmenting fine structures, and 3) very precise object edges.


Thanks to Jon Barron for this website template.