dltHub was founded in 2022 by Matthaus Krzykowski and team to make building data pipelines as simple as writing a few lines of Python. Its flagship open-source library, dlt (data load tool), lets developers extract data from APIs, databases, and files and load it into warehouses and lakes without standing up heavy ETL infrastructure. Schema inference, normalization, and incremental loading are handled automatically. Released under the permissive Apache 2.0 license, dlt has become one of the most widely adopted open-source ingestion tools.
The library handles the unglamorous but critical parts of data ingestion: it infers and evolves schemas automatically, manages state for incremental loads, handles retries and pagination, and writes to destinations like BigQuery, Snowflake, Redshift, DuckDB, and Postgres. Because it is just a Python library, it slots naturally into developers' existing code, notebooks, and orchestration tools rather than requiring a separate platform.
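To make two of those responsibilities concrete, here is a minimal standard-library sketch of schema evolution and cursor-based incremental loading. It is not dlt's API or implementation: the function names (`infer_type`, `evolve_schema`, `incremental_batch`), the type labels, and the sample rows are all hypothetical, chosen only to illustrate the idea.

```python
from typing import Any, Dict, Iterable, List

# Illustrative only: hypothetical helpers, not part of dlt.


def infer_type(value: Any) -> str:
    """Map a Python value to a coarse column type label."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    return "text"


def evolve_schema(schema: Dict[str, str], rows: Iterable[Dict[str, Any]]) -> Dict[str, str]:
    """Return a copy of `schema` extended with columns first seen in `rows`."""
    evolved = dict(schema)
    for row in rows:
        for column, value in row.items():
            if value is not None and column not in evolved:
                evolved[column] = infer_type(value)
    return evolved


def incremental_batch(
    rows: List[Dict[str, Any]], cursor_field: str, state: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Keep only rows newer than the stored cursor, then advance the cursor."""
    last_seen = state.get("last_value")
    fresh = [r for r in rows if last_seen is None or r[cursor_field] > last_seen]
    if fresh:
        state["last_value"] = max(r[cursor_field] for r in fresh)
    return fresh


# Two simulated pipeline runs sharing the same state and schema.
state: Dict[str, Any] = {}
schema: Dict[str, str] = {}

run_1 = [
    {"id": 1, "updated_at": "2024-01-01"},
    {"id": 2, "updated_at": "2024-01-02"},
]
batch_1 = incremental_batch(run_1, "updated_at", state)
schema = evolve_schema(schema, batch_1)

# The second run sees the old rows again plus one new row with a new column.
run_2 = run_1 + [{"id": 3, "updated_at": "2024-01-03", "plan": "pro"}]
batch_2 = incremental_batch(run_2, "updated_at", state)
schema = evolve_schema(schema, batch_2)
```

In this sketch the second run loads only the row with `id` 3, and the schema gains a `plan` column without any manual migration. A real ingestion library also has to persist the state between runs, reconcile type changes, and flatten nested data, all of which this sketch leaves out.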
On top of the open-source core, dltHub offers dltHub Pro, a commercial managed platform that adds deployment, scheduling, alerting, observability, and agentic workflows for teams that want a fully supported experience. A standout direction is AI-driven pipeline creation: dltHub Context provides a knowledge base spanning thousands of data sources so that AI agents can generate working connectors automatically. The company reports that agent-created pipelines have grown rapidly and now account for the large majority of new pipelines created on its platform.
dltHub raised an $8 million seed round in August 2025 led by Bessemer Venture Partners, with participation from Dig Ventures and Firestreak Ventures, bringing total funding to roughly $14 million. With millions of monthly PyPI downloads and thousands of production users, dltHub competes with managed connector platforms such as Fivetran and Airbyte by being code-first, open-source, and increasingly agent-native.