Huawei Canada has an immediate permanent opening for a Researcher.
About the team:
The Advanced Computing and Storage Lab, part of the Vancouver Research Centre,
explores adaptive computing system architectures to address the challenges
posed by flexible and variable application workloads. The lab helps ensure the
stability and quality of training clusters, builds dynamic cluster
configuration strategy solvers, and establishes precision control systems to
deliver stable, efficient compute clusters. One of the lab's goals is to focus
on key industry AI application scenarios such as large-model training and
inference, applying key technologies such as low-precision training,
multi-modal training, and reinforcement learning; the lab is responsible for
bottleneck analysis and for the design and development of optimization
solutions that improve training and inference performance and usability.
About the job:
- Aiming at key industry AI application scenarios such as large-model training
  and inference, this role focuses on advancing the performance, efficiency,
  and usability of AI systems on the Ascend platform. The work involves
  low-precision training, multimodal optimization, reinforcement learning, and
  training resource optimization to address system bottlenecks and deliver
  next-generation AI capabilities.
- Design and develop optimization solutions for AI training and inference
  systems, with a focus on FP8 optimization, RL-driven training agents,
  multimodal reinforcement learning, or next-generation multi-modal
  understanding and generation.
- Combine AI algorithm requirements with system-level architectural
  optimization in computing, I/O, scheduling, and precision control to improve
  performance.
- Build stable, efficient AI training clusters, leveraging dynamic cluster
  configuration and precision control to ensure scalability and reliability.
- Develop software frameworks, operator libraries, acceleration libraries, and
  system-level optimizations for NPU platforms to accelerate large-model AI
  training.
- Drive innovation in optimizing large-model training and inference through
  low-precision training, parallel-strategy tuning, and reinforcement learning.
- Track the latest research progress and technological trends in AI computing
  cluster architecture design, training acceleration, and inference
  acceleration across academia and industry to strengthen the competitiveness
  of AI computing cluster systems.
The base salary for this position ranges from $100,000 to $170,000, depending
on education, experience, and demonstrated expertise.
About the ideal candidate:
- Ph.D. or Master's degree in Computer Science, Computer Engineering, or a
  related major such as artificial intelligence, software engineering,
  automation, electronics, communications, or robotics.
- Familiarity with the common architectures of large models such as DeepSeek
  and Llama, and a foundational technical background in large-model training
  and inference optimization in areas such as LLMs, MoE, and multimodality.
- Familiarity with the hardware architecture and programming systems of AI
  accelerators such as GPUs/NPUs, and experience optimizing AI systems through
  coordinated software-hardware co-design.
- Any of the following experience is an asset:
  1) A solid programming foundation, familiarity with Python/C/C++, and good
  architecture design and programming habits.
  2) The ability to work independently and solve problems; strong
  communication skills and a willingness to collaborate; enthusiasm for new
  technologies; a habit of summarizing and sharing knowledge; and a preference
  for hands-on practice.
  3) Experience developing AI training frameworks or AI inference engines, or
  related algorithm-hardware experience.
  4) Strong research capabilities in new technologies and new architectures,
  with the ability to quickly track and gain insight into the most
  cutting-edge AI technologies in the industry and to drive continuous
  innovation in system architecture.