Senior Engineer optimizing deep learning inference on edge hardware for autonomous vehicles and robotics at NVIDIA. Collaborating with automotive OEMs and addressing complex optimization challenges.
Responsibilities
Address customer and partner optimization challenges by engaging directly with automotive OEMs and robotics partners to analyze, debug, and improve deep learning models on NVIDIA platforms
Own performance benchmarking by driving efforts to achieve leading results on MLPerf Edge and industry benchmarks, defining methodology and ensuring reproducibility
Evaluate emerging model architectures by analyzing DL architectures, including vision encoders and multi-modal VLMs, for compilation feasibility, memory footprint, and latency on target SoCs
Collaborate across teams by partnering with compiler, runtime, and hardware teams to connect model-level insight with platform capabilities
Deliver TensorRT and compiler-stack solutions for edge by creating and deploying inference solutions on Jetson, DRIVE, and GPU + ARM platforms for AV and robotics workloads.
Develop Proofs of Readiness (PORs) and work closely with the compiler team on Torch-TRT, MLIR-TRT, and related frameworks to bridge performance gaps.
Requirements
Master’s degree or equivalent experience in Computer Science, Electrical Engineering, or a related field
12+ years of industry experience, with over 8 years in deep learning model optimization, inference engineering, or neural network compilation
Adept at interpreting and reasoning about model architectures at the operator/kernel level
Over 5 years of proven expertise in embedded/edge software, with experience delivering production inference solutions in power-limited, latency-sensitive deployment environments
Deep knowledge of current DL architectures: transformers, attention variants, vision encoders (ViT), multi-modal/vision-language model frameworks, and experience with diffusion models and/or state space models
Expert knowledge of GPU architecture fundamentals, CUDA, and low-level performance optimization using heterogeneous computing
Experience with TensorRT, compiler IRs, or equivalent inference optimization toolchains
Solid understanding of embedded operating system internals (QNX/Linux), memory management, C/C++, and embedded/system software concepts
Background in parallel programming (e.g., CUDA, OpenMP) and experience reasoning about memory hierarchies, data movement, and compute utilization
Demonstrated capability to collaborate directly with external partners and customers in a deep technical role, solving their workload issues, identifying performance problems, and providing solutions within production limitations.
Software Engineering Intern contributing to Tonal’s product roadmap while developing AI-assisted automation solutions. Collaborating with engineering teams to leverage new technologies and boost productivity.
Technical Lead specializing in mentorship and code quality at CanadaHelps, a leading charity platform. Driving team collaboration and delivering scalable software solutions for charitable donations.
Full Stack Developer for Signal49 Research, creating interactive dashboards and reporting tools. Working collaboratively with internal clients and data teams in a remote setting.
Renewables Lead Electrical Engineer driving growth and success in Ulteig’s electrical engineering offerings. Conducting system studies, mentoring, and leading projects in the renewable energy sector.
Staff Software Engineer specializing in data infrastructure for Instacart's data governance and compute systems. Collaborating with engineering teams to enhance the platform's reliability and performance.
Principal Engineer designing mixed-signal IPs for Microchip Technology. Collaborating with SoC architects and managing IP intake processes for advanced analog solutions.
Principal Software Architecture Director overseeing software architecture and technology strategy at SGI. Providing guidance and mentorship while aligning with business goals in the insurance sector.
Senior Engineer leading design and implementation of protective relaying systems for the BWRX-300 nuclear reactor. Engaging in grid interface projects and customer technical assessments.