Machine Learning Engineer designing and operating AI platform components for Mozilla, collaborating with teams to ensure efficient and reliable deployment of machine learning models.
Responsibilities
Design, build, and operate core AI platform components used to train, deploy, and serve machine learning models in production environments.
Own model serving and inference workflows end-to-end, driving improvements in reliability, scalability, performance, and operational excellence.
Lead efforts to optimize inference systems for throughput, latency, and cost efficiency across CPU and GPU workloads.
Design and manage GPU-based inference and training workloads, including performance tuning, capacity planning, and resource utilization optimization.
Own and improve critical parts of the model lifecycle, including packaging, versioning, testing strategies, validation, and deployment automation.
Implement and evolve observability practices (metrics, logging, tracing, alerting) to improve visibility and operational resilience of ML services and pipelines.
Partner closely with product, infrastructure, security, and data teams to design scalable platform capabilities that enable AI-powered features.
Contribute to technical design discussions, propose architectural improvements, and mentor junior engineers through code reviews and knowledge sharing.
Participate in and help improve operational processes, including incident response, on-call rotations, and post-incident reviews.
Requirements
Bachelor’s degree with 4–6 years of relevant industry experience, or Master’s degree with significant hands-on experience building and operating production ML systems, or equivalent work experience.
Strong experience developing in Python for machine learning systems, backend services, or distributed data processing.
Proven experience deploying and operating ML workloads in cloud environments, including production-grade infrastructure.
Solid understanding of model serving architectures, inference pipelines, and performance tradeoffs (latency, throughput, cost, scaling strategies).
Hands-on experience working with GPU-based workloads and accelerated computing in production settings.
Experience designing CI/CD pipelines and development workflows that support reliable ML system deployment.
Ability to independently scope and drive technical initiatives while balancing product and operational priorities.
Strong problem-solving skills and the ability to debug performance and reliability issues in distributed systems.
Clear and effective communication skills, with experience collaborating across engineering, product, and infrastructure teams.
Benefits
Generous performance-based bonus plans for all eligible employees; we share in our success as one team
Rich medical, dental, and vision coverage
Generous retirement contributions with 100% immediate vesting (regardless of whether you contribute)
Quarterly all-company wellness days where everyone takes a pause together
Country-specific holidays plus a day off for your birthday
One-time home office stipend
Annual professional development budget
Quarterly well-being stipend
Considerable paid parental leave
Employee referral bonus program
Other benefits (life/AD&D, disability, EAP, etc.; varies by country)