AI Computing Architect: NVIDIA

Aug 5, 2025 | Location: Shanghai, China; Beijing, China | Deadline: Not specified

NVIDIA, known as "the AI computing company," is a pioneer in developing intelligent machines. Their GPUs serve as the brains for computers, robots, and self-driving cars, enabling them to learn, perceive, and solve problems using AI algorithms. They are looking for talented individuals to join their AI Computing Architecture team to build real-time, cost-effective computing platforms for this rapidly growing field.

About the Role
NVIDIA is seeking an outstanding Performance Analysis Architect with a background in computer architecture to help analyze and develop the next generation of architectures that accelerate AI and high-performance computing applications. This role is crucial for advancing the state of the art in deep learning performance and efficiency.

Responsibilities
In this role, you will:

Develop innovative architectures to enhance deep learning performance and efficiency.

Analyze performance, cost, and power trade-offs by creating analytical models, simulators, and test suites.

Understand and analyze how hardware and software architectures interact, and how that interplay affects future algorithms, programming models, and applications.

Prototype key deep learning and data analytics algorithms and applications.

Actively collaborate with software, product, and research teams to guide the direction of deep learning.

Requirements
Education: BS or higher degree in a relevant technical field (CS, EE, CE, Math, etc.).

Experience: 3+ years of work experience.

Skills: Strong programming skills in Python, C, and C++.

Background: A strong background in computer architecture.

Experience: Performance modeling, architecture simulation, profiling, and analysis.

Foundation: A strong foundation in machine learning and deep learning.

Ways to Stand Out from the Crowd (Preferred Qualifications)
Experience with GPU computing and parallel programming models such as CUDA and OpenCL.

Experience with the architecture of, or workload analysis on, other deep learning accelerators.

Experience with deep neural network training, inference, and optimization in leading frameworks (e.g., PyTorch, TensorFlow, TensorRT).

Experience with open-source AI compilers (OpenAI Triton, MLIR, TVM, XLA, etc.).
