Software Engineer, Benchmarking: Epoch AI
Jan 2, 2026
Location: Remote
Deadline: Jan 11, 2026
Experience: Mid
Salary: $150,000 – $225,000 USD
Role Overview
Epoch AI is a research institute dedicated to investigating machine learning trends and their economic impacts. As a Software Engineer in the Benchmarking Hub, you will be responsible for building the infrastructure that evaluates frontier AI models. Your work will directly inform researchers, developers, and policymakers by providing rigorous, independent data on AI development.
Key Responsibilities
Infrastructure Maintenance: Run and maintain the AI Benchmarking Hub, ensuring evaluations are reliable and scalable.
Implementation: Integrate existing and new AI benchmarks into the evaluation infrastructure, primarily using the Inspect library.
Strategic Development: Collaborate with the benchmarking team to design new metrics and evaluation methods to track emerging model capabilities.
Provider Integration: Facilitate smooth integration with various AI providers (OpenAI, Anthropic, Google, etc.) to evaluate new model releases as they occur.
Internal Experiments: Support Epoch AI researchers by setting up and running internal experiments to validate new hypotheses about AI trends.
Qualifications & Skills
Engineering Excellence: Several years of professional experience building and maintaining complex, robust software systems.
Creativity: The ability to move beyond execution and pitch original ideas for new benchmarks and experiments.
Mission Alignment: A strong desire to provide public, trustworthy insights into AI trends for the benefit of society.
Preferred (But Not Required):
Hands-on experience with LLM evaluations and frameworks like Inspect.
Familiarity with current AI scaling laws and algorithmic progress.
Proficiency in Python and handling large-scale datasets.
Application Instructions
Formatting: Submit materials in English only.
Privacy: Do not include cover letters, photos, or personal information (age, marital status, etc.) that is not relevant to the technical role.
Process: Epoch AI uses AI tools to assist in resume screening, but final hiring decisions are made by human staff.