Tobias Müller, DuckDB's read_csv does not fix the column order, right? So if the columns in agency.txt were ordered agency_name, agency_id, this would fail, and likewise when the source GTFS file contains an extra column that is not specified in the table. For us, GTFS is one of the file formats we handle (import and export), but since NeTEx is the standard within Europe for data exchange between the national access points, it makes sense to be able to process NeTEx as well. I have done so in different forms within DuckDB: creating a relational database based on the XML Schema, and using DuckDB as an 'advanced' key-value store with extra attributes per key, in various incarnations. For now, my conclusion is that DuckDB has a set of significant, known issues when going beyond main memory: https://github.com/duckdb/duckdb/issues/?q=is%3Aissue%20author%3Askinkie
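To make the column-order concern concrete, here is a minimal sketch in plain Python (standard library only, deliberately not using DuckDB) of matching CSV columns to a table by header name instead of by position. The function name `rows_by_header`, the `TABLE_COLUMNS` list, and the sample data are all made up for illustration; only agency_id and agency_name come from the example above.

```python
import csv
import io

# Hypothetical target table layout, based on the agency_id / agency_name example.
TABLE_COLUMNS = ["agency_id", "agency_name"]


def rows_by_header(text, table_columns):
    """Yield tuples in table order, matching CSV columns by header name."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [c for c in table_columns if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for record in reader:
        # Extra columns in the file (here agency_url) are simply ignored.
        yield tuple(record[c] for c in table_columns)


# Made-up sample: columns are swapped and one extra column is present.
SAMPLE = (
    "agency_name,agency_id,agency_url\n"
    "Example Transit,EX,https://example.org\n"
)

rows = list(rows_by_header(SAMPLE, TABLE_COLUMNS))
print(rows)
```

A purely positional import of the same file would put the agency name into agency_id, which is the failure mode I am asking about.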