Huawei Canada has an immediate Co-op opening for an Engineer.
About the team:
The Intelligent Cloud Infrastructure Lab develops innovative technologies,
algorithms, systems, and platforms for next-generation cloud infrastructure. The
lab addresses scalability, performance, and resource utilization challenges in
existing cloud services while preparing for future challenges with appropriate
technologies and architectures. Additionally, the lab aims to understand
industry dynamics and technology trends to create a robust ecosystem.
About the job:
- Understand the AI system and infrastructure technology landscape, and
  identify scalability and performance challenges in current LLM and
  multi-modal LLM systems.
- Initiate and charter innovation projects to build or re-architect the AI
  infrastructure platform, and plan milestones accordingly.
- Provide or contribute to scalable, high-performance architecture designs or
  redesigns for infrastructure systems optimized for AI training and
  inference, including but not limited to cluster management and scheduling,
  LLM deployment, elastic LLM, and AI container cold/warm start-up
  optimization.
- Collaborate with internal and external teams to deliver projects or project
  features that improve overall system scalability and performance.
The target annual compensation (based on 2080 hours per year) ranges from
$60,000 to $100,000 depending on education, experience and demonstrated
expertise.
About the ideal candidate:
- Bachelor's, Master's, or PhD degree in Computer Science or Computer
  Engineering
- Experience building large-scale, high-performance distributed systems
- Experience with NVIDIA TensorRT and/or Triton Inference Server, and with
  container virtualization technologies
- Knowledge of and experience in distributed system design and development,
  including serverless technologies
- Work experience with one or more of the following technologies: vLLM, Ray,
  SGLang, Kubernetes, TensorRT-LLM, PyTorch, CUDA libraries, GPU technologies
- Work experience with one or more of the following programming languages:
  C/C++, Go, Java, Rust, Python, C#
- Excellent interpersonal and communication skills to collaborate with
  multiple teams and build strong partnerships
- Demonstrated success working on software engineering problems that span
  multiple products