About the role

You will evaluate AI models through audio assessments as a contractor on an AI benchmark evaluation project. The work is remote and involves designing training data and auditing conversational AI outputs.

Responsibilities

Operate autonomously to design complex evaluation frameworks and provide structured training data.
Role-Play Scenario Execution: Creating and executing complex, role-play-based evaluation scenarios that simulate realistic customer service interactions across travel, finance, and technical support domains.
Model Performance Auditing: Evaluating AI model performance across standardized qualitative and quantitative metrics, focusing strictly on task completion accuracy, conversational naturalness, and audio comprehension.
Technical Metric Evaluation: Assessing the model's basic computer programming literacy, including its understanding of JSON structures, functions, methods, and ability to reason about structured data within a support context.
Representative Dataset Generation: Contributing to the development of diverse, high-quality audio datasets that accurately reflect real customer expectations for clarity, efficiency, and natural conversational flow.

Requirements

Demonstrable professional expertise in complex customer support, technical troubleshooting, or conversational AI evaluation.
Native or bilingual proficiency in the target language, including fluency across all language skills (reading, listening, writing, and speaking), alongside strong analytical and verbal communication skills to confidently conduct simulated customer support role-plays.
Basic computer programming literacy, specifically a comfortable understanding of JSON structures, functions, methods, and simple logic.
A meticulous, detail-oriented approach to working with structured prompts, complex evaluation rubrics, and technical guidelines.
Required Equipment: Access to a high-quality microphone to ensure clean, reliable audio input during voice evaluations.

As a contractor, you’ll supply a secure computer and high-speed internet; company-sponsored benefits such as health insurance and PTO do not apply.