Infrastructure Engineer/SRE responsible for core infrastructure design and building tools for AI-driven contact center solutions. Join a leading AI company impacting the future of work.
Responsibilities
As a member of the infrastructure team, you are responsible for designing, building, and advancing the core infrastructure that allows the engineering team to execute quickly, productively, and securely.
Partner with engineers to build dev tools that empower developer workflows and deployment infrastructure.
Ensure reliability of multi-cloud Kubernetes clusters and pipelines.
Build and maintain metrics, logging, analytics, and alerting for performance and security across all endpoints and applications.
Build infrastructure-as-code deployment tooling and supporting services on multiple cloud providers.
Automate operations and engineering. Focus on automation so we can spend energy where it matters.
Build machine learning infrastructure that enables AI teams to train, test, and deploy on large-scale datasets.
Requirements
5+ years of experience in DevOps, Site Reliability Engineering, Production Engineering, or an equivalent field.
Deep proficiency with programming languages such as Go or Python.
Deep familiarity with container-related security best practices.
Production experience working with Kubernetes, and a deep understanding of the Kubernetes ecosystem, including popular open-source tooling such as cert-manager or external-dns. Experience with GPU-enabled clusters is a bonus.
Production experience with Kubernetes templating tools such as Helm or Kustomize.
Production experience with IaC tools such as Terraform or CloudFormation.
Production experience working with AWS and services such as IAM, S3, EC2, and EKS.
Production experience with other cloud providers such as Google Cloud or Azure is a bonus.
Production experience with database software such as PostgreSQL.
Experience with GitOps tooling such as Flux or Argo.
Experience with CI/CD tooling such as GitHub Actions.
Benefits
We offer Cresta employees a variety of medical, dental, and vision plans designed to fit your and your family’s needs
Paid parental leave to support you and your family
Monthly Health & Wellness allowance
Work from home office stipend to help you succeed in a remote environment