Senior Cloud Engineer at Sleep Country maintaining multi-cloud infrastructure. Designing, building, and optimizing cloud systems for reliability, performance, and security.
Responsibilities
Design, build, and maintain cloud infrastructure across multiple platforms (e.g., AWS, Azure, GCP) to support business applications and services.
Implement configurations for compute, storage, networking, and cloud services following best practices for scalability and resilience. Perform regular maintenance, patching, and environment updates to keep systems current and performant.
Monitor cloud systems and services using enterprise monitoring and alerting tools to ensure uptime and performance. Investigate and troubleshoot incidents and problems, performing root-cause analysis and timely remediation of cloud infrastructure issues.
Participate in a rotating on-call schedule for after-hours support, responding to critical incidents to restore services and minimize downtime in line with service level objectives.
Develop and manage automation scripts and CI/CD pipelines to streamline cloud deployments and operations, using infrastructure-as-code (e.g., Terraform, CloudFormation) to provision and configure cloud resources reliably and repeatably.
Implement automated build, test, and deployment processes in collaboration with DevOps and development teams, reducing manual effort and improving consistency across environments.
Analyze system performance metrics and identify opportunities to improve the efficiency, reliability, and cost-effectiveness of cloud operations. Optimize resource usage and application performance through tuning and capacity planning.
Evaluate and adopt new tools and practices (including emerging AIOps platforms, intelligent monitoring, and auto-remediation technologies) to enhance operational capabilities.
Contribute to improving operational playbooks, runbooks, and knowledge bases for the cloud operations function.
Implement and adhere to security and compliance controls within cloud environments, following corporate IT security policies, standards, and regulatory requirements (e.g., data protection, privacy) in all cloud configuration and deployment activities.
Work closely with Security and Compliance teams to remediate vulnerabilities, manage cloud access and encryption keys, and ensure that cloud infrastructure and processes pass audits and meet governance standards.
Maintain accurate documentation for configurations and changes as part of compliance and change management processes.
Collaborate effectively with cross-functional teams (Cloud Ops, DevOps, Developers, Security, Vendors) to implement solutions and resolve issues, documenting procedures and clearly conveying technical information to teammates and stakeholders.
Requirements
Bachelor’s degree in Computer Science, Information Systems, or a related field.
5+ years of progressive experience in IT infrastructure or cloud engineering, with a focus on deploying and managing cloud-based environments.
Demonstrated success in supporting complex, distributed systems and services in a production (24/7) environment.
Deep hands-on knowledge of public cloud platforms (such as AWS, Microsoft Azure, or Google Cloud Platform), including experience with compute, storage, networking, and managed services.
Proficiency in designing highly available, scalable cloud architectures and implementing cloud networking concepts (VPC/VNet configuration, security groups, load balancers, etc.). Strong understanding of virtualization and containerization technologies; experience with container orchestrators (e.g., Kubernetes) is an asset.
Strong experience with infrastructure automation and DevOps practices.
Proficiency in writing scripts (e.g., PowerShell, Python, Bash) to automate system tasks and deployments.
Experience building and maintaining CI/CD pipelines (using tools such as Jenkins, GitHub Actions, or Azure DevOps) to automate build, test, and deployment processes.
Familiarity with infrastructure-as-code tools (e.g., Terraform, CloudFormation) for provisioning and managing cloud resources.
Proven ability in monitoring, incident management, and performance tuning for cloud-based systems. Experience using modern observability tools (such as Dynatrace, Datadog, CloudWatch, etc.) to collect metrics, logs, and traces, and to troubleshoot complex issues.
Strong analytical and problem-solving skills, with the capacity to perform root cause analysis and implement effective fixes under pressure. Comfortable working in an on-call rotation, and capable of making sound decisions quickly during critical incidents to restore service.
Excellent communication skills, both written and verbal, with the ability to document procedures and clearly convey technical information to teammates and stakeholders.
Demonstrated curiosity and adaptability in keeping pace with evolving technologies and practices in cloud computing, including willingness and aptitude to learn and utilize new tools (including AI-driven operations tools, automated monitoring/alerting systems) to improve efficiency and reliability.
Experience with BigCommerce, Shopify, or similar cloud-based eCommerce platforms.
Familiarity with Oracle Cloud Infrastructure (OCI) and/or Oracle Fusion Cloud applications in an operational setting is considered an asset.
Benefits
This is not a job but a CAREER with opportunities for growth and advancement
Diverse and inclusive work environment
We will invest in you and provide extensive training, mentoring and continuous development
Access to training and development platforms
Associate Discount Program where you will be able to enjoy some of the world’s best sleep products
Recognized as one of Canada’s Most Admired Corporate Cultures in 2023 by Waterstone Human Capital