Develop and maintain hardware abstraction layers and runtime interfaces for NVIDIA’s computing platforms. Collaborate with cross-functional teams to enhance reliability and performance.
Responsibilities
Extend and maintain hardware abstraction layers and core system libraries used across the platform.
Design and implement drivers, runtimes, and data movement/aggregation pipelines supporting workload execution.
Build and maintain runtime interfaces for launching, monitoring, and managing workloads.
Improve platform reliability through automation, error reporting, diagnostics, and operational tooling.
Debug and resolve complex sequencing, initialization, and runtime issues across multi-component systems.
Partner cross-functionally with hardware engineering, compiler teams, and data center operations to bring features from prototype to production.
Support new platform bring-up and NPI (New Product Introduction) efforts for new boards and silicon.
Contribute to engineering excellence through documentation, tooling improvements, code reviews, and knowledge sharing.
Requirements
A Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related STEM field, or equivalent experience.
5+ years of relevant work experience.
Strong proficiency in modern C++ (design, implementation, debugging, and performance considerations).
Experience designing, maintaining, and refactoring software libraries and APIs with long-term support in mind.
Comfort working in large, multi-repository or multi-component codebases with layered dependencies.
Demonstrated ability to lead or drive triage of difficult reliability issues and produce clear root-cause analysis.
Ability to clearly communicate software architecture and design tradeoffs, including using diagrams and written design docs.
Low-level platform software experience (e.g., firmware/boot flows, RTOS, BMCs/MCUs, RISC-V, or closely related system software).
Linux systems experience that includes driver or kernel-adjacent interfaces (e.g., VFIO or similar subsystems).
Hardware bring-up and/or system triage experience (fault analysis, system diagnostics, or validation support in lab environments).