Senior Sustaining Engineer maintaining the stability and reliability of a fintech platform. Collaborating globally to resolve incidents and enhance system performance.
Responsibilities
Act as a primary responder in a 24x7 on-call rotation for high-priority incidents, ensuring fast acknowledgment (within MTTA targets) and resolution to minimize customer impact on our event-driven fintech platform.
Conduct root-cause analysis (RCA) for complex issues, collaborating closely with development teams to implement robust solutions and deliver RCAs within 5 business days for Sev1/Sev2 incidents.
Lead the development and deployment of small, customer-facing features and improvements, ensuring alignment with business needs and system requirements while maintaining a change success rate of ≥99%.
Work with mid- and junior-level engineers, providing guidance in incident response, troubleshooting best practices, and coding standards within a global rota, including handovers and knowledge sharing via tools like Rootly.
Take ownership of software maintainability initiatives, identifying and implementing optimizations, and enhancing system performance to achieve availability ≥99.99% (four nines).
Participate in regular post-incident reviews (blameless retros), documenting lessons learned and suggesting improvements to incident response processes and runbooks for our technology stack.
Collaborate with the infrastructure team to monitor system health and proactively identify areas for improvement in stability and efficiency using tools like Datadog, Rootly, and CloudWatch/AppDynamics.
Requirements
Bachelor's degree in Computer Science, Engineering, or a related field.
At least 5 years of experience in sustaining engineering, DevOps, or software engineering with a focus on incident response and system reliability in fintech or regulated environments.
Advanced troubleshooting skills and experience with Golang (preferred), Java, or similar languages, plus familiarity with event-driven architectures (e.g., NATS/JetStream, Redis clustering).
Strong familiarity with monitoring and incident response tools (e.g., Datadog, Rootly) and experience implementing improvements in similar systems to meet SLAs like MTTA/MTTR.
Proven ability to conduct in-depth root-cause analysis and implement long-term fixes in compliance-aware settings (e.g., GDPR/FCA-aligned).
Experience mentoring or guiding mid-level engineers, with a focus on knowledge sharing and process improvements in geo-distributed teams.
Awareness of ITIL 4 principles (e.g., incident/change management) and tools like Rootly for unified workflows.
Strong communication skills and the ability to work collaboratively with both technical and non-technical teams across time zones.
Benefits
Out‑of‑hours on‑call rotation with additional compensation