Research Scientist, Agentic Learning (Horizons): Anthropic
Aug 23, 2025
Location: San Francisco, CA (Hybrid policy requiring at least 25% of time in the office).
Deadline: Not specified
Experience: Senior
Continent: North America
Salary: $300,000 - $405,000 USD per year.
Anthropic is a public benefit corporation dedicated to creating reliable, interpretable, and steerable AI systems that are safe and beneficial for society. Its research culture treats AI as a large-scale empirical science, and the company works as a single cohesive team on a few high-impact research efforts. The "Horizons" team leads its reinforcement learning (RL) research and development, playing a critical role in advancing AI systems like the Claude models.
Responsibilities:
As a Research Scientist/Engineer on the Horizons team, you will blend research and engineering to advance the capabilities and safety of large language models. Key responsibilities include:
Collaborate with researchers and engineers on fundamental RL research, new model architectures, and 'agentic' models that use tools for open-ended tasks.
Architect and optimize core RL infrastructure and scale systems for complex research workflows.
Design, implement, and test novel model architectures, training environments, and evaluation methodologies to push the state of the art.
Drive performance improvements across the stack through profiling, optimization, and debugging distributed systems.
Develop automated testing frameworks and build scalable infrastructure to accelerate AI research.
Requirements:
Required Qualifications:
Proficient in Python.
Experience with both JAX and PyTorch.
Experience designing, implementing, and iterating on model architecture improvements.
Industry experience in training and conducting machine learning research on production-scale LLMs.
Ability to balance research exploration with engineering implementation.
Strong systems design and communication skills.
A passion for the potential impact of AI and a commitment to developing safe systems.
Bachelor's degree in a related field or equivalent experience.
Preferred Qualifications (Strong candidates may have):
Experience with reinforcement learning techniques, continuous learning, or long-range LLM agent designs.
Experience with distributed systems, HPC, Kubernetes, or sandboxed code execution environments.
Experience with TensorFlow, Rust, and/or C++.
A history of research publications.
Additional Information:
Visa Sponsorship: The company sponsors visas and will make every reasonable effort to secure one for a successful candidate.
Work Arrangement: This is a hybrid role, requiring employees to be in the San Francisco office at least 25% of the time.