Overview
Datacurve provides premium, human-curated coding data used to train and evaluate large language models. Through its bounty-based contributor platform, Shipd, it engages top engineers to submit research-grade coding challenges, debugging tasks, and private-repository benchmarks such as the long-horizon DeepSWE. Leading AI labs use Datacurve's datasets to improve model reasoning and code generation.
Company
Datacurve went through Y Combinator's W24 batch and is based in San Francisco. It was founded by Serena Ge and Charley Lee, both University of Waterloo computer science alumni. The company reached roughly $9M in annual recurring revenue (ARR) within 13 months. In October 2025 it raised a $15M Series A led by Chemistry, with participation from engineers at DeepMind, Vercel, Anthropic, and OpenAI, bringing total funding to about $17.7M. Datacurve positions itself as a higher-quality, code-focused alternative to large data-labeling incumbents.