Avik Pal
Thesis project title: Hyperbolic embedding of multiple hierarchies
Supervisors:
Research Summary:
Recent trends in AI lean towards large-scale multimodal learning (e.g., CLIP), which requires an immense amount of data. This study aims to explore the potential of creating a semantic multimodal representation space by leveraging interpretable hyperbolic embeddings as an inductive prior. The initial focus is on the vision and language modalities, with the goal of accelerating training while reducing data requirements. The proposed methodology involves exploring hyperbolic representation learning methods and alignment mechanisms, incorporating insights from existing works.
MSc Honours Opportunity:
I am currently collaborating with Ph.D. students from PINLab at Sapienza University on my research topic. A research visit facilitated by the MSc Honours Programme would enable closer collaboration with them. Further, I look forward to presenting my work and findings, and to the discussions and insights I’ll gain from Fabio and his team.
Jona Ruthardt
Thesis project title: Large Language Models as Visual Experts
Supervisors:
Research Summary:
The recent surge in popularity of large language models (LLMs) can be attributed to their reasoning abilities, capacity to generate meaningful text, and the vast repository of embedded encyclopedic knowledge. In particular, their comprehensive understanding of real-world concepts can be exploited in the vision domain to generalize to previously unseen objects, even when annotated data is not readily available.
The project aims to leverage LLMs for computer vision tasks that traditionally relied heavily on expert annotations and humans-in-the-loop for competitive performance. By tapping into the wealth of class-specific information within LLMs, Jona and his supervisors explore ways to diminish the reliance on expensive manual annotations without compromising performance on downstream tasks. Owing to this cross-modal knowledge transfer, the resulting approaches promise to scale to more diverse datasets and to apply to low-resource computer vision problems.
MSc Honours Opportunity:
Being part of the MSc Honours Programme provides me with the exciting opportunity to collaborate with Prof. Serge Belongie. His profound expertise in the broader field of Computer Vision, particularly within the targeted research direction, will serve as a significant asset to our project. By facilitating and funding a multi-week research visit, the MSc Honours Programme will give me a firsthand understanding of the Pioneer Centre for AI and foster connections with fellow researchers. The inclusion of three distinguished supervisors with complementary backgrounds and mentorship further enhances my learning opportunities and elevates the project’s overall quality. These invaluable experiences will equip me with key insights that will inform and guide my future academic and professional aspirations.
Robin Sasse
Thesis project title: Hierarchical Auto-Vocabulary Segmentation
Supervisors:
Research Summary:
While humans have the ability to recognize a near-infinite number of objects, automated segmentation systems usually rely on a fixed set of object classes, corresponding to the dataset used during training. Open-vocabulary segmentation methods address this issue by leveraging user input in the form of a list of objects presented alongside the image. Nevertheless, these methods still rely on user interaction to provide results. Recently, researchers have proposed new methods that remove the need for human interaction and instead rely on large vision-language models to generate labels from the image itself (we call these methods auto-vocabulary segmentation methods). In hierarchical segmentation, multiple levels of class abstraction are identified at once (e.g., a tire is part of a wheel, which is part of a car). Hierarchical auto-vocabulary segmentation yields a pixel-level classification with automatic labelling and hierarchical structuring, all in one model pipeline.
MSc Honours Opportunity:
With ELLIS, I have the chance to spend a month at ETH Zurich, arguably Europe’s leading research institution in AI and especially Computer Vision. While I will lay the groundwork of my research in Amsterdam, I plan to refine my approach and broaden its use cases towards the end in Zurich. Dr. Francis Engelmann, a leading researcher in 3D scene segmentation, joins the project, and I hope to receive valuable input from his experience in 3D, which I can either apply to my 2D segmentation method or use to extend my method into the 3D world. Spanning my network across the UvA and ETHZ, I hope to establish valuable connections across two of Europe’s most prestigious AI research institutions.
Egoitz Gonzalez
Thesis project title: SLAM Expert System for Learning Optimal Tracking Strategies
Supervisors:
Research Summary & MSc Honours Opportunity:
Simultaneous Localization and Mapping (SLAM) is a well-studied approach in Computer Vision for 3D scene representation from video captured by a moving camera. My research will focus on improving the most recent Dense Neural SLAM approaches by developing new techniques that improve the robustness, tracking performance and fidelity of the 3D scene model. This project not only allows me to employ diverse methodologies and acquire knowledge about 3D scene representation and SLAM but also helps me better understand how actual research projects work and how collaboration can lead to innovative ideas. Working with people who inspire and motivate me plays a key role, as does learning from them and cultivating new skills that I can apply in my future projects.
Thanks to the ELLIS Honours Programme, I will have the opportunity to spend some time at ETH Zürich, collaborating with both supervisors and connecting with additional specialists in the field. This is an excellent opportunity that I likely wouldn’t have had access to without ELLIS, and it promises to greatly benefit both my personal and professional development. Engaging with other research groups, expanding my professional network, and learning from experts are undoubtedly the most effective ways of enriching my research experience and fostering personal growth.