Senior Software Engineer developing AI evaluation frameworks and systems at Lattice. Building robust AI infrastructure and ensuring the performance and reliability of AI products.
Responsibilities
Design and ship a robust, end-to-end AI evaluation framework, covering offline evals, production tracing, and human-in-the-loop feedback loops, connected across all of Lattice’s AI use cases.
Define and instrument the metrics that actually matter: agent task completion, hallucination rates, response quality, user engagement, and downstream business outcomes.
Build and maintain evaluation datasets, test harnesses, and automated scoring pipelines to catch regressions before they ship.
Identify and surface the drivers of agent quality improvement, giving the team clear signals on where to invest.
Architect and implement reusable agent infrastructure: multi-turn conversation workflows, recommendation services, LLM DAGs, and standardized agent topology patterns using LangGraph.
Build and scale RAG pipelines and retrieval infrastructure, including vector store management and retrieval quality optimization.
Make principled build vs. buy decisions across LLM providers, agent frameworks, and evaluation tooling, balancing capability, cost, latency, and vendor risk.
Contribute to production AI systems with a strong focus on reliability, observability, and performance, not just prototypes.
Own projects end-to-end: scope them, drive them to completion, and bring in the right people at the right time.
Partner with engineering leads and managers to inform technical direction on agent quality and evaluation strategy; you’ll be expected to hold intelligent, substantive conversations about methodology, not just implementation.
Raise the AI engineering bar across the broader team through code review, documentation, and thoughtful technical debate.
Requirements
5+ years of professional software engineering experience with significant time spent on production AI/ML systems.
Deep hands-on experience with LLM-based systems: prompt engineering, RAG pipelines, agent orchestration, evaluation metrics, and model fine-tuning.
Proven ability to work with data and apply statistics, especially in designing and analyzing experiments.
Proven ability to build and operate agentic AI systems in production: multi-step workflows, multi-agent topologies, and the failure modes that come with them.
Strong command of AI evaluation: you’ve built eval frameworks before, you know the difference between a good eval and a vanity metric, and you have opinions about it.
Software Engineering Intern contributing to Tonal’s product roadmap while developing AI-assisted automation solutions. Collaborating with engineering teams to leverage new technologies and boost productivity.
Technical Lead specializing in mentorship and code quality at CanadaHelps, a leading charity platform. Driving team collaboration and delivering scalable software solutions for charitable donations.
Full Stack Developer for Signal49 Research, creating interactive dashboards and reporting tools. Working collaboratively with internal clients and data teams in a remote setting.
Renewables Lead Electrical Engineer driving growth and success in Ulteig’s electrical engineering offerings. Conducting system studies, mentoring, and leading projects in the renewable energy sector.
Staff Software Engineer specializing in data infrastructure for Instacart's data governance and compute systems. Collaborating with engineering teams to enhance the platform's reliability and performance.
Principal Engineer designing mixed-signal IPs for Microchip Technology. Collaborating with SoC architects and managing IP intake processes for advanced analog solutions.
Principal Software Architecture Director overseeing software architecture and technology strategy at SGI. Providing guidance and mentorship while aligning with business goals in the insurance sector.
Senior Engineer leading design and implementation of protective relaying systems for the BWRX-300 Nuclear Reactor. Engaging in grid interface projects and customer technical assessments.