Overview
Seeking technically driven research leaders to spearhead the design, development, optimization, and exploration of multimodal large models. This role focuses on building next-generation intelligent, service-aware systems and edge-cloud integrated AI architectures across devices, automotive, and cloud platforms, with the goal of developing industry-leading multimodal large models.
Responsibilities:
- Conduct technical research on multimodal foundation models, including but not limited to pre-training, post-training, theoretical modeling, and capability evaluation, targeting applications in Huawei's consumer devices, automotive systems, and cloud platforms.
- Design and optimize the architecture of multimodal foundation models.
- Develop and optimize engineering solutions and algorithms for efficient training and inference of multimodal models.
- Continuously track the latest research developments in the field of multimodal foundation models; lead exploration and validation of cutting-edge technologies.
Requirements:
- Strong self-motivation, intellectual curiosity, and a passion for exploring new knowledge and emerging domains. Solid logical thinking and analytical skills.
- Proficient in deep learning; candidates with hands-on experience in training and inference of large models (LLMs/VLMs/VLAs at 7B+ parameters) are highly preferred.
- Publications in top-tier conferences or journals (e.g., NeurIPS, ICML, ICLR, CVPR, T-RO) are strongly valued.
- Familiarity with techniques such as prompt engineering and post-training; hands-on experience with these is a plus.
- Proficient in programming languages such as C++ and Python, with strong coding and software development skills.