Senior Site Reliability Engineer for Hopper's Cloud FinOps team managing cloud infrastructure. Focused on optimizing cost efficiency and system reliability for fintech solutions.
Responsibilities
Work on projects that drive higher cost efficiency, such as:
Reduce our network egress costs by removing unnecessary headers.
Ensure that our warehouse data is in use and select the most efficient storage for it, e.g., cold storage for buckets with infrequent retrieval.
Ensure that autoscaling for both databases and compute is well optimized.
Work on improving the current cost attribution to ensure all teams have clear visibility into their costs.
You will also help support incidents and join the on-call rotation for platform incidents only, since each engineering team has its own on-call rotation. (The team is spread across America and Europe, so you can sleep at night!) You will also help engineers with questions and problems they run into with our infrastructure, and approve PRs that require Platform supervision.
You will be part of a small and highly efficient team of SREs.
Requirements
Strong background in SRE, DevOps, Software Engineering or Systems engineering
Troubleshooting skills
System design with good analytical capabilities
Good communication skills
Knowledge of major cloud providers, preferably Google Cloud
SQL knowledge
Containers, Kubernetes, and related tooling like Kustomize and Helm
Service Mesh, preferably with Istio
Networking knowledge: DNS, TLS, certificates, ingresses, etc.
Observability: log collection, metrics, APM, etc., preferably with Datadog
Security knowledge: IAM, RBAC, network security, etc.
Knowledge of authentication and authorization technologies
CI/CD
Database technologies
Competent in scripting with Bash and Python or other scripting languages
Benefits
Well-funded and proven startup with large ambitions, competitive salary and upsides of pre-IPO equity packages.
Hopper covers 100% of the premiums for the group insurance plan.
Hopper offers life, short-term, and long-term disability coverage.
HSA that covers eligible medical and dental expenses.
All employees and dependents have access to Dialogue’s telemedicine services, anytime, anywhere.
All employees have access to an RRSP plan with automatic pre-tax withdrawals per pay.
Please ask us about our very generous parental leave, well above industry standards!
Unlimited PTO.
Carrot Cash travel stipend.
Access to co-working space on demand through FlexDesk and a work-from-home stipend.
Entrepreneurial culture where pushing limits and taking risks is everyday business.
Open communication with management and company leadership.