Staff Software Engineer focused on infrastructure to enhance Docker's platform reliability. Leading technical direction and collaboration across teams for self-service applications and deployment solutions.
Responsibilities
Take ambiguous infrastructure problems and turn them into proposals the org can rally around, then drive them through RFCs and architecture reviews across teams.
Design self-service capabilities and platform APIs (primarily in Go) for onboarding, provisioning, deployment, observability defaults, and day-2 operations, with contracts and docs teams actually use.
Set delivery standards using Terraform, GitOps with Argo CD, progressive rollout, and good testing, including building the continuous-deployment flow we're missing today.
Evolve the multi-tenant EKS foundations toward better reliability, security, scale, and cost: Envoy Gateway ingress, traffic routing, and the multi-region, cross-account connectivity we need.
Improve SLOs, alerting, and incident follow-up on Grafana Cloud so production gets safer and less dependent on heroics.
Assist in shaping AI-assisted and agentic workflows to cut operational toil while ensuring safety, auditability, and human oversight.
Requirements
8+ years of professional, hands-on, full-time software engineering experience in backend, infrastructure, or platform engineering.
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
Strong software engineering in Go or a similar language: design, testing, debugging, review, long-term maintainability.
A track record designing, shipping, and operating cloud services or infrastructure platforms in production. We hire for skill and impact, not years.
Deep expertise in at least one of: Kubernetes, networking, cloud platforms, reliability engineering, or developer platforms, plus solid Linux, networking, and production-ops fundamentals.
Experience setting technical direction and leading work that needs cross-team alignment.
Clear written and verbal communication in a remote environment (RFCs, design docs, incident writeups).
Nice to have: EKS and ingress/CNI/service-mesh experience; observability with OpenTelemetry/Prometheus/Grafana; CI/CD and progressive delivery (GitHub Actions, Argo CD, canaries); experience leading migrations or adoption programs across teams.
Benefits
Freedom & flexibility; fit your work around your life
Designated quarterly Whaleness Days plus an end-of-year Whaleness break
Home office setup; we want you comfortable while you work
16 weeks of paid parental leave (after 6 months of employment)
Technology stipend equivalent to $100 USD net/month
PTO plan that encourages you to take time to do the things you enjoy
Training stipend for conferences, courses and classes
Equity; we are a growing start-up and want every employee to share in the company's success
Docker Swag
Medical benefits, retirement and holidays vary by country
Remote-first culture, with offices in Seattle and Paris