Senior Software Engineer II developing observability tools for enhancing small business operations at Wave. Collaborating with teams to improve production visibility and analytics capabilities.
Responsibilities
Build and Scale Observability-as-Code
Design and maintain Python tooling and Terraform modules that standardize Datadog configuration across services.
Eliminate manual setup by codifying monitors, dashboards, SLOs, and alerting patterns.
Improve consistency, repeatability, and reliability of observability across the organization.
Establish Reliable & Standardized Instrumentation
Define and implement observability blueprints that integrate high‑fidelity metrics, logs, and traces into the development lifecycle.
Codify best practices so teams get out-of-the-box visibility without needing deep observability expertise.
Raise the baseline for service health, debuggability, and operational readiness.
Optimize Datadog Usage and Cost
Own critical parts of the Datadog platform configuration.
Improve data quality, signal-to-noise ratio, and alert reliability.
Partner with teams to adopt telemetry effectively while managing ingestion and alerting costs.
Maintain and Evolve Platform Components
Upgrade and maintain tracers, agents, and shared observability libraries.
Ensure upgrades are automated, backwards-compatible, and minimally disruptive to product teams.
Reduce operational risk by improving rollout and validation processes.
Integrate Observability Across Infrastructure
Collaborate with Platform and Infrastructure teams to embed monitoring into systems such as Kafka, gRPC services, Kubernetes, and AWS-managed services.
Improve production visibility and reduce mean time to detect (MTTD) and mean time to resolve (MTTR) for incidents.
Deliver High-Quality, Production-Ready Code
Write clean, well-tested, and maintainable Python code and Terraform modules.
Participate in architecture and design reviews; provide thoughtful feedback in code reviews.
Take ownership of projects end-to-end, from design and implementation through production rollout and support.
Mentorship & Collaboration
Help team members solve problems and develop their own skills.
Foster a collaborative mindset within the team.
Requirements
Degree in Computer Science or a related field.
7+ years of experience in application development, platform engineering, or developer tooling.
High proficiency in Python; solid experience with Terraform.
Hands-on experience using Datadog for metrics, logging, tracing, dashboards, monitors, and alerts.
Experience with containerized and cloud-native environments (e.g., Kubernetes, Kafka, AWS, gRPC, Lambda).
Proven ability to independently drive medium-to-large initiatives from design to delivery.
Comfortable making pragmatic tradeoffs to deliver reliable, scalable solutions.
A strong product mindset for internal tools.
Passion for reducing cognitive load, eliminating toil, and making observability easy to adopt by default.
Solid understanding of modern web applications and distributed systems.
Knowledge of how observability applies to high-throughput, highly available systems.
Clear written and verbal communication skills.
Ability to influence technical direction through design discussions, documentation, and hands-on implementation.
Comfortable partnering with product, platform, and infrastructure teams.