Infrastructure Engineer focusing on high availability and systems reliability in high-throughput data services. Collaborate with infrastructure and product teams while operating Kubernetes systems.
Responsibilities
Collaborate deeply with our infrastructure and product teams to enforce org-wide practices for emitting and collecting telemetry across a wide range of services, both internal and external-facing.
Own and operate the Kubernetes infrastructure of the observability team.
Work within the Observability team to ensure industry-standard deployment and reliability practices are used.
Orchestrate and scale systems such as VictoriaMetrics, OpenTelemetry Collector, and Vector.
Requirements
5+ years of experience in a Site Reliability Engineering role
Experience operating and supporting clustered applications in production environments
Hands-on experience deploying and managing applications in Kubernetes (k8s) environments
Working knowledge of PostgreSQL, including administration, performance tuning, and troubleshooting
Proficiency with at least one Infrastructure as Code (IaC) tool (e.g., Terraform, Pulumi, OpenTofu, or equivalent)
Experience with telemetry tooling such as OpenTelemetry, VictoriaMetrics, Grafana, and Prometheus
Experience with AWS services is a plus
Strong documentation and communication skills are a plus
Manager of Delivery Infrastructure Engineering at Mechanical Orchard responsible for end-to-end deployment and team development. Collaborating across Sales, Product, and Delivery to ensure infrastructure delivery.
Azure Infrastructure Architect designing Microsoft Azure solutions for clients at Optimus. Collaborating across teams to implement cloud infrastructure and ensure compliance with best practices.
Senior Operations Infrastructure Architect for University of Toronto Libraries. Responsible for architecting, implementing, and maintaining the infrastructure supporting Scholars Portal applications.
Lead large-scale, enterprise-level I&IT infrastructure initiatives on a 6-month contract. Manage complex IT infrastructure projects, enhancements, lifecycle management, and IT operations.
Senior Infrastructure Engineer architecting and shipping infrastructure solutions for EvenUp. Focusing on scaling systems that support growth in a mission-driven legal tech platform.
AI Infrastructure Engineer at Xsolla designing AI/ML solutions for multi-cloud infrastructure. Collaborating on automation workflows and observability systems for improved infrastructure management.
Lead a team of infrastructure engineers while staying hands-on with servers, networking, virtualization, cloud, and production operations in a hybrid Toronto role.
Lead Platform & Data Infrastructure Engineer overseeing systems for Minga's Student Behavior Platform. Focus on infrastructure, data pipelines, and analytics for enhanced school life experience.
Infrastructure Solutions Architect role focused on resilience and DR automation. Requires experience with DR strategy, Terraform, Ansible, networking, and automated testing.
Staff Infrastructure Engineer leading complex infrastructure initiatives, mentoring team members and shaping cloud architecture for regulated environments.