Senior AI Quality Engineer

Posted 4 days ago

About the role

  • As Senior AI Quality Engineer at Roofr, you will ensure AI integrations work effectively while building testing standards and frameworks, and collaborate across teams to improve AI product quality and performance.

Responsibilities

  • Define the testing standards and patterns for AI at Roofr — establishing how product teams validate AI behaviour when building on top of the application foundation
  • Build and own Roofr's LLM eval framework — selecting and extending the right tooling (e.g. Promptfoo, DeepEval, Braintrust) and designing the methodology that measures whether our AI integrations and agent outputs are performing correctly, consistently, and safely
  • Integrate quality gates into CI/CD pipelines so that regressions in AI behaviour are caught before they reach production
  • Design and implement human-in-the-loop review processes for AI outputs where automated evaluation isn't sufficient
  • Embed with the AI Platform team, ensuring quality is designed into the integration architecture from day one, not bolted on after the fact
  • Work horizontally across the testing organization — coaching QA engineers and developers on AI eval patterns, embedding best practices into team workflows, and actively raising the quality bar across engineering
  • Stay close to the evolving AI quality landscape — new eval techniques, benchmarking approaches, and tooling like Ragas, Arize Phoenix, or LangSmith — and bring the best of it to Roofr

Requirements

  • 5–8 years of software engineering or quality assurance experience
  • Hands-on experience building eval frameworks for LLM-powered features — you've thought seriously about how to measure output quality, consistency, and regression, and you've worked with tools like Promptfoo, DeepEval, Braintrust, or similar
  • Strong engineering fundamentals — you write real code, build real tooling, and aren't reliant on manual testing processes
  • Experience integrating automated quality checks into CI/CD pipelines
  • Familiarity with LLM APIs and agent frameworks (e.g. Anthropic Claude, OpenAI, or similar) and the specific quality challenges they introduce
  • Experience designing human review workflows to complement automated evaluation
  • Strong collaboration skills — you'll be working across many teams, and the standards you set only work if engineers actually adopt them
  • Comfort operating in an early-stage environment where the right approach isn't always obvious and you'll need to figure it out
  • Genuine ownership mentality — you care about whether AI at Roofr works well, not just whether the tests pass

Benefits

  • 1st week of employment is mandatory PTO! Start your journey with Roofr by decompressing and recharging - we will see you in week 2!
  • 1 Friday off per month (we call those our laundry days!)
  • Company-wide paid shutdown for the week between Christmas and New Year's
  • Flexible time off
  • 80% employer-paid benefits in the U.S. and 100% employer-paid premiums for Extended Healthcare and Dental in Canada
  • RRSP/401k match
  • Generous Parental Leave policy

Job type

Full Time

Experience level

Senior

Salary

Not specified

Degree requirement

No Education Requirement

Location requirements

Remote, Canada
