Senior ML Ops Engineer at Sprout Social building and maintaining AI/ML infrastructure using AWS and Kubernetes. Overseeing machine learning model lifecycles and supporting AI/ML Scientists.
Responsibilities
Build and maintain infrastructure using AWS, Terraform, and Kubernetes to support AI/ML at scale, including Generative AI applications.
Manage the end-to-end lifecycle of machine learning models, ensuring observability and tooling support both scale and speed.
Execute at scale while staying nimble enough to keep up with new capabilities being offered by social network APIs.
Improve processes and champion ideas that matter while holding the team accountable to high code quality and engineering standards.
Support our AI/ML Scientists by developing tooling to streamline model development and deployment.
Requirements
5+ years of experience developing and supporting AI/ML software in a production environment.
5+ years of experience programming in object-oriented languages such as Java, Python, or C++.
Impact-oriented mindset with an interest in stability at scale and a willingness to engage in feature development.
3+ years of experience developing and supporting scalable, distributed backend services (preferred).
3+ years of experience building and supporting GPU-heavy services (preferred).
1+ years of experience with LLMs / Generative AI, including managing their unique costs, constraints, and observability challenges (preferred).
1+ years of experience with Infrastructure-as-Code (Terraform) and container orchestration (Kubernetes) within AWS environments (preferred).
Benefits
100% Employer-Paid Health Benefits: We cover 100% of the premiums for you and your eligible dependents, including comprehensive medical, dental (basic & major), vision, life insurance, and disability.
Generous Paid Time Off: 25 days of vacation annually, plus 5 paid sick days, all public holidays, and additional company-wide Rest & Recharge days.
Premium Mental Health Support: Full, free access to Modern Health for you and your dependents, including coaching, therapy sessions, and digital wellness resources.
Annual Lifestyle Stipend: A $950 CAD annual Lifestyle Spending Account to spend on your physical, mental, and financial well-being.
Remote Work Support: A one-time $550 USD (equivalent) stipend to set up your home office, plus a monthly $50 USD (equivalent) stipend for internet.
Personalized Financial Wellness: No-cost, confidential access to financial experts through Your Money Line to support your personal financial goals.
Family & Care Support: Access to subsidized child and eldercare options through Care.com.
Charitable Giving: A company match for your donations to eligible organizations.
Lead AI/ML & MLOps Engineer executing projects from data foundations to model deployment. Collaborating with sales to drive AI/ML engagements for our clients.
Applied ML Engineer working on AI-driven insights at Kaseya. Collaborating with product teams to enhance features with machine learning and data analysis.
Adversarial Machine Learning Engineer conducting adversarial testing and simulations on LLM-driven AI systems for enterprise security. Collaborating with teams to validate and document findings.
MLOps Engineer managing infrastructure for large 2D and 3D media datasets at NBCUniversal. Responsible for automation, reproducibility, and performance of machine learning lifecycles.
Senior ML Engineer leading the strategic direction of machine learning infrastructure for a global food delivery platform. Collaborating with the Data Science team for seamless model deployment and innovation.
Machine Learning Intern/Co-op at Cohere working on developing and training models for AI applications. Join a team focused on advancing AI technology in an inclusive environment.
Machine Learning Engineer designing and deploying detection ML systems for a social engineering defense platform at Doppel. Collaborating to mitigate evolving digital threats using AI.
Senior Software Developer responsible for designing and developing solutions in data engineering and machine learning. Collaborating with teams to deliver scalable software solutions with agile methodologies.
Senior ML Engineer responsible for designing and building ML pipelines for a Trust Scoring platform. Involves productionizing models and implementing MLOps best practices.
Principal Machine Learning Engineer designing the core ML systems for AI agents at Workday. Collaborating in cross-functional teams to integrate ML solutions into the platform.