Engineering Manager, ML Acceleration: Anthropic
Jan 27, 2026
Location: San Francisco, Seattle, or New York City (Hybrid - 25% in-office)
Deadline: Not specified
Experience: Senior
Continent: North America
Salary: $500,000 - $850,000 USD (Annual Salary)
This is a Systems Engineering role disguised as an ML role. While "ML Engineers" usually focus on model architecture (changing layers, weights, and activation functions), the ML Acceleration team focuses on the infrastructure and hardware efficiency needed to run those models at scale.
You are managing the team responsible for the "speed of thought" of the AI. You ensure that Anthropic's massive cluster of GPUs (H100s/B200s) runs at 100% utilization rather than 60%.
Key Responsibilities
The "Player-Coach": The JD explicitly states you must make "targeted contributions as an individual contributor." You cannot manage this team via spreadsheets; you need to be able to read a flame graph or a CUDA kernel profile and understand why the GPU is idling.
Bottleneck Destruction: Your primary KPI is Throughput and Latency. You are looking for inefficiencies in:
Network: InfiniBand/Ethernet constraints between nodes.
Memory: HBM bandwidth limits.
Compute: Matrix multiplication efficiency.
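A quick way to reason about which of these is binding is a back-of-envelope roofline check: compare an operation's arithmetic intensity (FLOPs per byte of HBM traffic) against the hardware's compute-to-bandwidth ratio. The numbers below are approximate public H100 SXM figures used purely for illustration; network limits need the same comparison against interconnect bandwidth.

```python
# Back-of-envelope roofline check for a GEMM of shape (m, k) x (k, n) in bf16.
PEAK_FLOPS = 989e12      # approx. H100 SXM dense BF16 throughput, FLOP/s
PEAK_BW = 3.35e12        # approx. H100 SXM HBM3 bandwidth, bytes/s
BYTES_PER_ELEM = 2       # bf16

def arithmetic_intensity(m, k, n):
    flops = 2 * m * k * n                                   # multiply-accumulates
    bytes_moved = BYTES_PER_ELEM * (m * k + k * n + m * n)  # read A, B; write C
    return flops / bytes_moved

ridge = PEAK_FLOPS / PEAK_BW  # ~295 FLOPs/byte: below this you are bandwidth-bound

# A big training GEMM vs. a skinny decode-time GEMM.
for shape in [(8192, 8192, 8192), (16, 8192, 8192)]:
    ai = arithmetic_intensity(*shape)
    bound = "compute-bound" if ai > ridge else "memory-bound"
    print(f"{shape}: intensity ~{ai:.0f} FLOPs/byte -> {bound}")
```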
Inference vs. Training: You cover both.
Training: Making training runs finish faster (reducing time-to-market).
Inference: Making Claude cheaper and faster to serve to users (improving margins).
Strategic Analysis: Why the salary reaches $850k
Compute Economics: Anthropic spends hundreds of millions (potentially billions) on compute. If your team improves training efficiency by just 5%, you effectively save the company tens of millions of dollars in hardware costs and energy. Your salary pays for itself within a week.
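The arithmetic, with deliberately made-up numbers (illustrative assumptions, not Anthropic's actual spend or headcount):

```python
# Illustrative only: assumed annual compute spend, efficiency gain, and team cost.
annual_compute_spend = 1_000_000_000   # assume $1B/year on training compute
efficiency_gain = 0.05                 # 5% faster training for the same work
team_cost = 10 * 850_000               # assume a ten-person team at top-of-band comp

annual_savings = annual_compute_spend * efficiency_gain
print(f"Savings: ${annual_savings:,.0f}/yr vs. team cost ${team_cost:,.0f}/yr")
# -> Savings: $50,000,000/yr vs. team cost $8,500,000/yr
```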
The Talent Gap: There are very few people in the world who understand Distributed Systems (getting 10,000 computers to talk to each other) AND Deep Learning Internals (how a Transformer works).
The "1+ Years" Anomaly: Notice they only require "1+ years of management experience." This is incredibly low for an $850k role. This signals they prioritize Technical Depth over "Management Experience." They would rather hire a world-class kernel hacker who is new to management than a seasoned VP who hasn't written code in 5 years.
Candidate Profile
The HPC Expert: You likely come from a background in High Performance Computing, Game Engine design, or Low-Latency Trading.
The Stack: You are fluent in Python (for the model) but dangerous in C++/CUDA (for the hardware). You understand MPI, NCCL, and how PyTorch distributes tensors across devices.
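As a concrete (and heavily simplified) example of that stack, here is the primitive underneath data parallelism: an NCCL all-reduce of gradients via torch.distributed, launched with torchrun. In practice DDP/FSDP do this for you and overlap it with the backward pass; the model here is a placeholder.

```python
import os
import torch
import torch.distributed as dist

def main():
    # torchrun sets RANK/WORLD_SIZE/LOCAL_RANK; NCCL handles the GPU-to-GPU transport.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(4096, 4096).cuda()
    loss = model(torch.randn(32, 4096, device="cuda")).square().mean()
    loss.backward()

    # The core collective behind data parallelism: average gradients across ranks.
    for p in model.parameters():
        dist.all_reduce(p.grad, op=dist.ReduceOp.AVG)

    dist.destroy_process_group()

if __name__ == "__main__":
    main()  # launch with: torchrun --nproc_per_node=8 this_script.py
```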
The Mindset: You are obsessed with "Utilization." Seeing a GPU at 0% usage for even a millisecond bothers you.
Key "Nice to Haves" decoded
"GPU/Accelerator programming": You know how to write or optimize Triton or CUDA kernels.
"OS Internals": You understand how Linux handles memory paging and thread scheduling, because at this scale, the OS itself can become the bottleneck.