Astraea attacks one of the slowest, most expensive bottlenecks in drug development: the biometrics pipeline that turns raw clinical trial data into a submission-ready package for the FDA. Today, a Phase II or III sponsor typically has 5 to 10 statistical programmers and data managers spend roughly nine months converting raw study data into SDTM (Study Data Tabulation Model) datasets, ADaM (Analysis Data Model) datasets, TFLs (tables, figures, and listings), and the QC artifacts regulators expect. Astraea automates that pipeline end-to-end, taking only the protocol and the raw data as inputs and producing CDISC-compliant outputs in days.

The platform stitches together what are normally seven separate manual workstreams (data cleaning, SDTM mapping, ADaM creation, TFL generation, statistical programming, QC, and regulatory formatting) and runs them through agents in one auditable system. It also handles spec creation, Pinnacle 21 checks, and revision workflows. That breadth is unusual: most existing tools automate one slice of the pipeline rather than the whole chain. The founders are careful to position this as automating manual programming and data wrangling, not replacing the scientific judgment of statisticians.
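
To make "seven workstreams in one auditable system" concrete, here is a minimal Python sketch of stages run in sequence with a hash-chained audit log. It is purely illustrative and not Astraea's implementation: the names `STAGES`, `AuditLog`, and `run_pipeline`, the placeholder handlers, and the assumption that stages run in the order listed above are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field

# Hypothetical stage names mirroring the seven workstreams named in the text.
# Running them in exactly this order is an assumption made for illustration.
STAGES = [
    "data_cleaning", "sdtm_mapping", "adam_creation", "tfl_generation",
    "statistical_programming", "qc", "regulatory_formatting",
]


def _digest(stage: str, payload: dict, prev: str) -> str:
    """Hash one log entry together with the hash of the entry before it."""
    body = json.dumps({"stage": stage, "payload": payload, "prev": prev},
                      sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


@dataclass
class AuditLog:
    """Append-only log; each entry commits to the previous entry's hash."""
    entries: list = field(default_factory=list)

    def record(self, stage: str, payload: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else ""
        self.entries.append({"stage": stage, "payload": payload, "prev": prev,
                             "hash": _digest(stage, payload, prev)})

    def verify(self) -> bool:
        """Recompute the chain; any edited entry breaks it."""
        prev = ""
        for e in self.entries:
            if e["prev"] != prev or e["hash"] != _digest(e["stage"], e["payload"], prev):
                return False
            prev = e["hash"]
        return True


def run_pipeline(inputs: dict, handlers: dict, log: AuditLog) -> dict:
    """Run every stage in order, logging which artifacts exist after each."""
    state = dict(inputs)
    for stage in STAGES:
        state = handlers[stage](state)
        log.record(stage, {"artifacts": sorted(state)})
    return state


# Placeholder handlers: each one just marks its stage as done.
handlers = {name: (lambda s, n=name: {**s, n: "done"}) for name in STAGES}
log = AuditLog()
result = run_pipeline({"protocol": "<protocol>", "raw_data": "<raw data>"},
                      handlers, log)
```

Because each entry includes the previous entry's hash, `log.verify()` returns `False` if any recorded step is altered after the fact. That is one simple way to get the tamper-evident trail a regulated workflow would need, though a real system would also have to capture who ran each step, when, and with which inputs.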

Astraea, co-founded by Joshua Wang (CEO) and Sanmay Sarada (CTO), is part of Y Combinator's Spring 2026 (P26) batch. The team is based in San Francisco and has publicly demonstrated compressing ongoing-study workloads from months into days for early pharma design partners. Because this is a regulated workflow that ultimately lands in front of the FDA, expect adoption to move through validation, audit trails, and 21 CFR Part 11 conversations rather than self-serve signup.