Second Workshop on Foundation Models

Keynote Speakers 

Rongrong Ji is currently a Professor and the Director of the Intelligent Multimedia Technology Laboratory and the Dean
Assistant with the School of Information Science and Engineering, Xiamen University. His research falls in the field of computer 
vision and machine learning. He has served as an area chair of CVPR 2021; as an associate editor of Neurocomputing, Multimedia
Tools and Applications, The Visual Computer, PLOS ONE, Frontiers of Computer Science, etc.; as a guest editor of ACM Transactions
on Intelligent Systems and Technology, IEEE Multimedia Magazine, Signal Processing, Neurocomputing, etc.; as General Chair of
VALSE (Vision And Learning SEminar) 2017; as Local/Session/Area Chair of IEEE MMSP 2015, ACM ICMR 2014, IEEE VCIP 2014,
ACM MMM 2015, IEEE ISM 2015, etc.; and as a TPC member of AAAI 2015, CVPR 2013, ICCV 2013, ACM Multimedia 2010-2015, etc.
He has been a Senior Member of IEEE (2014-now), Senior Member of ACM (2015-now), Chair of VAIG Group for IEEE Multimedia
Communication Technical Committee (MMTC) (2014-2016), Member of ACM, Chair of CCF YOCSEF Xiamen (2016-2017), and 
Executive Member of Fujian Association of Artificial Intelligence. He was a recipient of the ACM Multimedia Best Paper Award and
the Best Thesis Award of the Harbin Institute of Technology. He has published 100+ papers in tier-1 journals and conferences like

Ani Kembhavi is the Senior Director of Computer Vision at the Allen Institute for Artificial Intelligence 
(AI2) in Seattle. He is also an Affiliate Associate Professor at the Computer Science & Engineering 
department at the University of Washington. He obtained his PhD at the University of Maryland, 
College Park and spent 5 years at Microsoft. His research interests lie at the intersection of 
computer vision, natural language processing and embodiment. His work has been awarded 
a Best Paper Award at CVPR 2023, an Outstanding Paper Award at NeurIPS 2022, an AI2 Test 
of Time award in 2020 and an NVIDIA Pioneer Award in 2018.                                                                       

Yuliang Liu is currently serving as a professor at the Huazhong University of Science and Technology 
(HUST), where his research primarily focuses on vision and language processing, with a special 
emphasis on document analysis. He received his PhD from the South China University of Technology, 
followed by post-doctoral fellowships at The University of Adelaide and The Chinese University of 
Hong Kong (CUHK). He has authored more than 60 articles for prestigious international journals and 
conference proceedings, such as TPAMI and CVPR.

Ivan Laptev is a head of research at VisionLabs and a visiting professor at MBZUAI, on leave from INRIA
Paris. He has published over 150 technical papers, most of which appeared in international journals and
major peer-reviewed conferences of the field. He has served as an associate editor of IJCV and TPAMI and
as a program chair for ICCV’23 and CVPR’18; he will serve as a program chair for ACCV’24
and is a regular area chair for CVPR, ICCV and ECCV. He has co-organized several tutorials, workshops and
challenges at major computer vision conferences. He has also co-organized a series of INRIA summer schools
on computer vision and machine learning (2010-2013) and Machines Can See summits (2017-2023).
He received an ERC Starting Grant in 2012 and was awarded a Helmholtz prize in 2017. Ivan’s main research
interests include visual recognition of human actions, objects and interactions, and more recently robotics.                                                        


Li Yuan is an Assistant Professor and Doctoral Supervisor at the School of ECE, Peking University, 
Shenzhen Graduate School. He has received the National Outstanding Overseas 
Student Award and was named to the Forbes Asia 30 Under 30 list for 2023. His research primarily focuses on 
multi-modal machine learning. Notable academic contributions include the development of deep 
neural networks, such as T2T-ViT and VOLO. In terms of application, his work has led to the 
creation of vertical domain models, including ChatExcel and ChatLaw. His research group has 
also made contributions by open-sourcing important models and applications, such as Video-LLaVA 
and the Open-Sora Plan, available at:

Alex Schwing is an Associate Professor in the Department of Electrical and Computer Engineering at the
University of Illinois in Urbana-Champaign and affiliated with the Coordinated Science Laboratory and the
Computer Science Department. Prior to that he was a postdoctoral fellow in the Machine Learning Group
at the University of Toronto. He completed his PhD in computer science in the Computer Vision and Geometry
Group at ETH Zurich working with Marc Pollefeys, Tamir Hazan and Raquel Urtasun, and graduated from
Technical University of Munich (TUM) with a diploma in Electrical Engineering and Information Technology.
Alex’s research is centered around machine learning and computer vision. He is interested in algorithms
for prediction with and learning of non-linear (deep nets), multivariate and structured distributions, and their
application in numerous tasks, e.g., for 3D scene understanding from a single image.                                       
Link to homepage:

Hao Su is an Associate Professor of Computer Science at the University of California, San Diego. He is the
Director of the Embodied AI Lab at UCSD, a founding member of the Data Science Institute, and a member
of the Center for Visual Computing and the Contextual Robotics Institute. He works on algorithms to model,
understand, and interact with the physical world. His interests span computer vision, machine learning,
computer graphics, and robotics – all areas in which he has published and lectured extensively. Hao Su
obtained his Ph.D. in Computer Science from Stanford. He has served as an Area Chair or Associate Editor for
top conferences and journals in computer vision (ICCV/ECCV/CVPR), computer graphics (SIGGRAPH/ToG),
robotics (IROS/ICRA), and machine learning (NeurIPS/ICLR). He received the SIGGRAPH Best Ph.D. Thesis
Award Honorable Mention and the NSF CAREER Award.                                                        

Zeynep Akata is a professor of Computer Science within the Cluster of Excellence Machine Learning at the
University of Tübingen. After completing her PhD at the INRIA Rhone Alpes with Prof Cordelia Schmid (2014),
she worked as a post-doctoral researcher at the Max Planck Institute for Informatics with Prof Bernt Schiele
(2014-17) and at the University of California Berkeley with Prof Trevor Darrell (2016-17). Before moving to
Tübingen in October 2019, she was an assistant professor at the University of Amsterdam with Prof Max
Welling (2017-19).

Georgia Gkioxari is an Assistant Professor of Computing + Mathematical Sciences at Caltech and a
William H. Hurt Scholar. From 2016 to 2022, she was a research scientist at Meta, within the
Facebook AI Research (FAIR) division. She completed her doctoral studies at UC Berkeley under the
guidance of Jitendra Malik.
Georgia’s contributions to the field of computer science and AI have earned her several notable awards and