Job Description

KEY RESPONSIBILITIES:
- Optimize open-source deep learning training libraries, such as Megatron and Transformer Engine, for enhanced performance on AMD GPUs.
- Analyze and optimize key deep learning models for performance on AMD GPUs in a distributed computing environment, targeting both scale-up (multi-GPU) and scale-out (multi-node) architectures.
- Apply software engineering best practices while staying informed of trends and innovations in software, hardware, algorithms and architecture.
- Contribute to the development and bring-up of new ASIC and hardware.
- Apply a data-driven approach to optimization efforts and design groundbreaking AMD technologies.
- Debug and resolve existing issues while researching more efficient alternatives to achieve the same objectives.
- Collaborate with internal GPU library teams and develop technical relationships with peers and partners to optimize deep learning training.
PREFERRED EXPERIENCE:
Programming & Development:
- Expertise in C/C++ and Python, with strong skills in object-oriented programming, debugging, performance optimization, and concurrent programming.
- Familiarity with source control (GitHub), CI/CD, and Linux debugging/profiling tools.
GPU Kernel Development:
- Experienced in GPU kernel optimization for deep learning using HIP and CUDA on AMD GPUs (GCN, RDNA).
- Skilled in programming and performance optimization with tools like Composable Kernel (CK), CUTLASS, Triton, and assembly (ASM).
Deep Learning & Optimization:
- Expertise in integrating GPU performance optimizations into TensorFlow and PyTorch for model training and inference.
- Experience analyzing and optimizing deep learning workloads with a focus on scaling and throughput.
Collaboration & Communication:
- Strong problem-solving and communication skills, with proven success in team collaboration.
ACADEMIC CREDENTIALS:
- Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent.
Advanced Micro Devices