Senior Software Engineer developing monitoring and observability tools at Waabi, a transportation technology company. Leading architecture and cross-team collaboration while optimizing performance across cloud and on-prem environments.
Responsibilities
Design and lead the architecture and development of Waabi’s monitoring and observability stack, used to monitor the health and performance of cloud and on-prem environments.
Develop and extend workloads and benchmarks (compute, storage, network, ML/AI) and integrate stress, chaos, and regression tests to validate hardware and platform choices.
Analyze and optimize end-to-end performance across hardware, firmware, Linux kernel, runtimes, and distributed services using advanced profiling tools (perf, eBPF, flamegraphs, tracing frameworks).
Build automation and observability tooling (Go/Python/Java, Kubernetes/Docker) for CI/CD-based performance regression detection, telemetry, alerting, and anomaly detection.
Work with client teams to support their applications’ observability requirements.
Influence system architecture and tooling decisions that improve how Waabi builds, monitors, and scales its infrastructure.
Drive execution and quality by writing design docs, setting milestones, mentoring ICs, and communicating insights and results to stakeholders and leadership.
Requirements
5+ years of software engineering or systems/performance engineering experience (BS in CS/EE or a related field), with demonstrated end-to-end ownership of complex projects.
Proficient in at least one of: Python, Rust, C/C++; strong CS fundamentals and system design skills.
Hands-on experience with Linux internals (CPU scheduling, memory, I/O, networking) and performance tooling (perf, eBPF, flamegraphs, tracing frameworks).
Experience with Kubernetes, microservices, and distributed systems; comfort building production services and pipelines.
Proven track record of clear communication, writing design docs, and leading cross-functional efforts.
Benefits
Competitive compensation and equity awards.
Health and Wellness benefits encompassing Medical, Dental and Vision coverage (for full-time employees only).
Unlimited Vacation.
Flexible hours and Work from Home support.
Daily drinks, snacks and catered meals (when in office).
Regularly scheduled team-building activities and social events, held on-site, off-site, and virtually.