AI Safety Jobs
AI safety became a real job category over 2023 and 2024, when the UK and US stood up their AI safety institutes. Before that, "AI safety" was a research field with a handful of paid roles at Anthropic, DeepMind, and OpenAI. Today there are several hundred dedicated AI safety positions across industry, government, and a small set of independent research labs. Hiring is competitive at every level: the roles are interesting and well paid, and the supply of qualified candidates is genuinely small.
How AI safety jobs differ from AI security jobs
AI safety asks how to make AI systems behave as intended in normal use, including in edge cases and as capabilities scale. AI security asks how to keep AI systems from being attacked or misused by adversaries. The two overlap heavily at the frontier labs, where the same red team often does both and the same hiring pipelines feed both teams. The split is sharper at enterprises, where AI security is its own role and AI safety is often part of responsible-AI or AI governance teams. See "What is an AI security engineer" for the security-specific role.
AI safety jobs at Anthropic
Anthropic runs the largest dedicated AI safety hiring effort of any private company. Open categories in 2026 include alignment research (PhD-track, focused on RLHF, Constitutional AI, and scalable oversight), interpretability research (the team behind Anthropic's mechanistic interpretability work), policy and societal impacts, frontier red team, and safety engineering (applied ML with a safety angle, less PhD-heavy). Total comp ranges from $280,000 to $620,000 depending on level. The hiring bar is high: most roles want either a PhD with relevant ML safety publications or 5+ years of relevant engineering experience.
AI safety jobs at OpenAI
OpenAI's safety hiring spans the safety systems team (real-time safety in production), preparedness (capability evaluations for catastrophic risk), superalignment (long-horizon alignment research, restructured in 2024), and policy. Total comp ranges from $310,000 to $720,000 depending on level, with senior research roles at the top of the range. The hiring process is faster than Anthropic's and weighted more toward demonstrated work than credentials. A strong portfolio (publications, open-source safety tooling, prior frontier-lab experience) can substitute for a PhD.
AI safety jobs at Google DeepMind
DeepMind's AGI Safety, Frontier Safety, and Responsibility teams hire research scientists, research engineers, and policy specialists. Compensation follows Google's standard ladder, with research-track roles paid toward the upper half of each band. L5 research engineer total comp is $400,000 to $530,000 in 2026. The hiring process is the slowest of the three labs (multiple onsite loops, multi-month timelines), but the equity is liquid and role stability is the highest of the three.
AI safety jobs at UK AISI
The UK AI Security Institute became one of the largest single employers of AI safety talent globally after its launch in November 2023 as the AI Safety Institute (it was renamed in February 2025). Roles include research engineer (the largest category), research scientist, AI evaluations engineer, policy adviser, and operations. Pay sits in UK civil service bands plus a "specialist allowance" for technical roles. Total comp in 2026: research engineer £100,000 to £165,000 ($125,000 to $205,000), senior research engineer £140,000 to £200,000 ($175,000 to $250,000), research scientist £145,000 to £215,000 ($180,000 to $270,000). That is below frontier-lab pay, but it comes with government stability and high mission alignment. See the UK AISI careers page for the application process.
AI safety jobs at US AISI
The US AI Safety Institute (inside NIST, Department of Commerce; renamed the Center for AI Standards and Innovation in 2025) launched in 2024 and has been hiring through the federal civil service pay scale plus excepted-service authorities that let it pay above the standard GS scale for technical talent. Roles are posted on USAJobs and through NIST's direct hire authority. Pay ranges roughly from $120,000 to $220,000 for technical roles in 2026, which is below industry but competitive with other federal AI roles. US citizenship is required for most positions. Clearance requirements vary by role.
AI safety jobs at independent research labs
METR, Apollo Research, Redwood Research, FAR AI, and Conjecture are the most established independent AI safety research organizations. They are small (10 to 50 staff each as of 2026) and hire selectively. Compensation is below the frontier labs (typical research scientist comp $150,000 to $280,000) but the work is technically deep and the orgs have outsized influence on the field. METR specifically has become a key contractor for frontier model evaluations, and that work has driven hiring growth there since 2024.
AI safety jobs at cloud providers and enterprises
Microsoft (Office of Responsible AI, AI Red Team), Google (Responsible AI, Cloud Trust and Safety), AWS (Bedrock safety team), Meta (Responsible AI), and NVIDIA (NeMo Guardrails team) all hire for AI safety roles. Compensation tracks each company's general engineering ladder. Enterprise hiring for AI safety (banks, healthcare, defense contractors) is real but smaller in volume and more compliance-flavored than the cloud-provider roles. These are good fallback options if frontier-lab and AISI processes do not work out.
How to get hired in AI safety
Three tracks work in 2026. The research track: a PhD in ML or a closely related field, with publications in alignment, interpretability, or evaluation. Aim for the major venues: NeurIPS, ICML, and ICLR (including their safety workshops), plus technical posts on the Alignment Forum. The engineering track: 5+ years of strong ML engineering with a portfolio of safety-relevant work, such as open-source contributions to safety tooling (Inspect, the evaluation tools from METR, formerly ARC Evals), reproductions of safety research, or safety-relevant work at a previous employer. The security track: AI red team or model security experience that transfers cleanly to safety evaluation. This route grew significantly in 2024-2025 as the AISIs absorbed security engineers. See the AI red team engineer guide for the security-to-safety bridge.
The hiring filter that actually matters
Across every org above, the deciding signal is the same: have you done relevant safety or alignment work that someone in the field would recognize? A published paper. A widely-read Alignment Forum post. A reproduction of a known result with public code. A pull request to a major safety tool. The credential matters less than the artifact. Plan to spend 6 to 18 months building an artifact before applying if you are coming from an adjacent field, and use that time to also network with people inside the orgs you want to join. Most AI safety hires in 2026 came through warm intros, not cold applications.