Research Lead – Principal Scientist, Manager Post-Training, Alignment, Reinforcement Learning

Posted 6 days ago

About the role

  • Lead research in post-training alignment and reinforcement learning at Autodesk AI Lab, managing a team of AI scientists to develop reliable foundation models for various industries.

Responsibilities

  • Own post-training strategy for model development — from RLHF and preference optimization to agentic systems and long-horizon reasoning
  • Develop novel algorithms that improve model reliability, controllability, and alignment
  • Make principled architectural decisions about when to address challenges at the pre-training, post-training, or system level
  • Design and run experiments that shape model behavior, robustness, and reasoning quality
  • Partner with infrastructure teams to build scalable, reproducible post-training workflows
  • Contribute to publications, patents, and Autodesk's external research visibility
  • Design evaluation frameworks for long-horizon reasoning, tool use, agentic behavior, safety, and real-world workflow completion
  • Lead rigorous model analysis and interpretability efforts
  • Drive human-in-the-loop evaluation with high annotation quality and sound scientific methodology
  • Establish model readiness criteria and provide go/no-go recommendations for releases
  • Manage, mentor, and grow a team of AI scientists
  • Set technical direction and research priorities across post-training and alignment initiatives
  • Foster a research culture grounded in scientific rigor, reproducibility, and fast iteration
  • Help recruit world-class talent across ML, RL, alignment, and foundation models
  • Partner closely with pre-training teams, infrastructure, product organizations, and other stakeholders
  • Translate research trade-offs into clear, decision-ready guidance for leadership

Requirements

  • Deep hands-on expertise in reinforcement learning for foundation models, and fluency with post-training methods (RLHF, RLAIF, DPO, PPO, or adjacent approaches)
  • Proven experience leading or mentoring technical research teams — whether in an academic lab, AI research organization, or industry setting
  • Strong intuition for model behavior, alignment challenges, and post-training trade-offs
  • Experience designing evaluation systems and thinking rigorously about what it means for a model to be ready
  • Ability to communicate complex technical trade-offs clearly to both technical and non-technical audiences
  • A PhD or equivalent depth of industry research experience in ML, RL, AI, or a related field

Benefits

  • health insurance
  • retirement plans
  • paid time off
  • flexible work arrangements
  • professional development
  • bonuses
  • stock options
  • equipment allowances
  • wellness programs

Job type

Full Time

Experience level

Senior

Salary

Not specified

Degree requirement

Postgraduate Degree

Location requirements

Remote, Canada
