Datafold is a data-engineering automation platform focused on data quality, reliability, and migrations. Founded in 2020 by Gleb Mezhanskiy and Alex Morozov and based in San Francisco, the company originally became known for proactive data-quality testing and has since repositioned around AI-powered automation for data teams, spanning migrations, code optimization, and AI-assisted code review.
Its foundational product, Data Diff, compares datasets at the value level within or across databases, showing engineers exactly how a code change will affect the resulting data and downstream products such as BI dashboards and ML models before it ships. This regression-testing capability integrates into CI/CD and dbt workflows so teams can catch breaking changes during development rather than in production.
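As an illustration of what a value-level comparison involves, the following is a minimal sketch and not Datafold's implementation. It assumes both versions of a table fit in memory as lists of dictionaries and share a primary-key column. The function name `diff_rows`, the sample tables `prod` and `dev`, and their columns are all hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def diff_rows(before: List[Row], after: List[Row], key: str) -> Dict[str, Any]:
    """Compare two row sets value by value, matching rows on `key`.

    Hypothetical helper for illustration only; assumes `key` is unique
    within each row set.
    """
    before_by_key = {row[key]: row for row in before}
    after_by_key = {row[key]: row for row in after}

    # Keys present on only one side are whole-row additions or removals.
    removed = sorted(before_by_key.keys() - after_by_key.keys())
    added = sorted(after_by_key.keys() - before_by_key.keys())

    # For keys present on both sides, compare every column value.
    changed: List[Tuple[Any, str, Any, Any]] = []
    for k in sorted(before_by_key.keys() & after_by_key.keys()):
        old, new = before_by_key[k], after_by_key[k]
        for column in sorted(old.keys() | new.keys()):
            if old.get(column) != new.get(column):
                changed.append((k, column, old.get(column), new.get(column)))

    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical sample data: the table as built by production code and by a dev branch.
prod = [
    {"id": 1, "amount": 100, "status": "paid"},
    {"id": 2, "amount": 250, "status": "paid"},
    {"id": 3, "amount": 75, "status": "refunded"},
]
dev = [
    {"id": 1, "amount": 100, "status": "paid"},
    {"id": 2, "amount": 255, "status": "paid"},
    {"id": 4, "amount": 40, "status": "paid"},
]

result = diff_rows(prod, dev, key="id")
print(result)
```

In a CI setting, a report like this would be produced for each table affected by a proposed change, so reviewers can see added, removed, and altered rows before merging. A production tool would need to run the comparison inside or across the databases themselves rather than in memory, which this sketch does not attempt.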
More recently, Datafold has built a Data Migration Agent that automates code translation and validation when moving between data platforms, with the aim of compressing migrations from years to weeks. Other additions include a Data Knowledge Graph for lineage and context, ML-based anomaly detection, and specialized AI agents for code optimization and code review. The platform integrates with major warehouses and tools and offers single-tenant VPC deployment with SOC 2, HIPAA, and GDPR support.
Datafold raised a $20 million Series A in 2021, led by NEA with Amplify Partners, for a total of roughly $26 million across its rounds. Publicly cited customers include Thumbtack, Patreon, Faire, and Dutchie, and a partnership with dbt Labs reinforces its place in the modern data stack.