About the role

The Lead Inference Platform Engineer at Thomson Reuters optimizes ML models for high-performance inference, collaborates with engineering teams, and deploys AI workloads efficiently.

Responsibilities

Optimize LLMs and ML models for high-performance inference using techniques such as quantization, pruning, distillation, and hardware-specific tuning
Deploy and scale inference workloads on GPUs across AWS, Azure, GCP, and internal Kubernetes clusters, ensuring predictable performance during peak traffic, especially business hours
Implement routing and failover strategies for OpenAI/Anthropic/Vertex AI traffic
Integrate models into production-grade APIs supporting TR products and enterprise workflows
Develop highly optimized environments and eliminate performance bottlenecks to reduce latency
Collaborate with Platform Engineering teams (Landing Zones, Network, Storage, Compute, AI) to ensure inference workloads align with TR’s cloud-native patterns (AWS, Azure, GCP, OCI)
Build and optimize containerized inference pipelines using Kubernetes for large‑scale distributed workloads
Ensure compliance with TR’s AI standards for deployment, monitoring, governance, and drift detection
Profile inference performance, identify GPU/CPU bottlenecks, and optimize compute utilization across heterogeneous hardware
Implement observability and health monitoring for inference pipelines, ensuring reliability of enterprise AI services
Collaborate with platform teams to enhance capacity forecasting for AI workloads
Work with Product, Data Science, Architecture, and Enterprise AI teams to onboard new research models into production
Collaborate closely with AI engineers to invent new quantization techniques, improve numerical precision, and explore non‑standard architectures
Partner with Cloud Engineers (Azure, AWS, GCP) to develop guardrails and automation that support inference workloads
Support the scale-out of AI infrastructure during critical releases and global product rollouts

Requirements

Strong understanding of ML/LLM fundamentals and inference optimization techniques
Hands-on experience with GPU programming (CUDA preferred), inference runtimes (TensorRT, ONNX Runtime), and deep learning frameworks (PyTorch/TensorFlow)
Proficiency in Python and at least one systems language (C++ strongly preferred for performance-critical inference paths)
Experience deploying AI workloads to AWS/GCP/Azure and Kubernetes
Familiarity with vector search systems (OpenSearch vectors) and retrieval-augmented generation pipelines
Knowledge of distributed systems, microservices, CI/CD, and cloud-native architecture
Experience with neural network architectures such as CNNs, transformers, and diffusion models, and their performance characteristics
Understanding of GPUs, multithreading, and/or other accelerators with vectorized instructions
Specialized experience in one or more of the following machine learning/deep learning domains: model compression, hardware-aware model optimization, hardware accelerator architecture, GPU/ASIC architecture, machine learning compilers, high-performance computing, performance optimization, numerics, and SW/HW co-design

Benefits

Flexible vacation
Two company-wide Mental Health Days off
Access to the Headspace app
Retirement savings
Tuition reimbursement
Employee incentive programs
Resources for mental, physical, and financial wellbeing

Lead Inference Platform Support Engineer – AI

at Thomson Reuters
