Data Scientist
We're looking for a Data Scientist to help establish the quantitative foundation of a trust and validation framework for autonomous systems. In this role, you'll design rigorous statistical methodologies to evaluate system performance, develop confidence and reliability metrics, and build robust measurement systems that support high-scale deployment. Your work will be critical to validating performance in high-stakes domains and enabling data-driven decisions as the platform scales from early users to millions of interactions per month.
Responsibilities
Design statistical frameworks to validate autonomous system performance with academic rigor
Develop mathematical models to quantify trust, reliability, and performance in complex domains
Build autoscaling algorithms for compute resource optimization at scale
Create projection models for quota growth and capacity planning across multi-region deployments
Establish methodologies to measure system composition, including dynamic and contextual behavior
Design systems for context traceability and statistical validation of reasoning pathways
Develop confidence calculation methods across simulation runs and deployment conditions
Create judge coverage frameworks for comprehensive performance evaluation
Define metrics tied to interpretability, safety, and business outcomes
Design attribution systems that identify key components contributing to system performance
Model capability expansion to measure growth while maintaining reliability
Collaborate with verification and simulation teams to define evaluation standards
Contribute to academic publications and technical content showcasing scientific rigor
Work with engineering teams to implement statistical measurement systems in production
Qualifications
Advanced degree in statistics, data science, applied mathematics, or related field
Strong foundation in statistical methods, experimental design, and measurement frameworks
Experience applying quantitative approaches to complex system evaluation
Background in building performance metrics for AI or software systems
Proficiency with confidence intervals, variance analysis, and statistical validation
Experience designing experiments to quantify behavior across variable conditions
Proficiency in Python, statistical tools, and data analysis libraries
Ability to connect metrics to business impact and technical performance
Experience with data visualization for communicating complex concepts
Academic or industry publication experience is a plus
Passion for scientific rigor and trustworthy evaluation in AI systems
