KEY RESPONSIBILITIES:
- Enable and optimize key AI models (LLM, Vision, MultiModal, etc.) on AMD GPUs
- Optimize AI frameworks like PyTorch, TensorFlow, etc. on AMD GPUs in upstream open-source repositories
- Collaborate and interact with internal GPU library teams to analyze and optimize training and inference for AI
- Work with open-source framework maintainers to understand their requirements and get your code changes integrated upstream
- Optimize GPU kernels, understand and drive AI operator performance (GEMM, Attention, etc.) with specialized teams
- Work in a distributed computing setting to optimize for both scale-up (multi-GPU) and scale-out (multi-node) systems
- Apply your knowledge of software engineering best practices
PREFERRED EXPERIENCE:
- Knowledge of GPU computing (HIP, CUDA, OpenCL)
- Experience with or knowledge of AI models in Natural Language Processing, Vision, Audio, or Recommendation systems
- Excellent C/C++/Python programming and software design skills, including debugging, performance analysis, and test design
- Experience running workloads on large-scale heterogeneous clusters is a plus
- Experience optimizing GPU kernels for performance is a plus
ACADEMIC CREDENTIALS:
- Master's degree, PhD, or equivalent experience in Computer Science, Computer Engineering, or a related field