Reading files
There are a few functions mainly designed for DuckDB to read from files:
PRQL
prql target:sql.duckdb
from a = (read_parquet "artists.parquet")
join b = (read_csv "albums.csv") (a.artist_id == b.artist_id)
join c = (read_json "metadata.json") (a.artist_id == c.artist_id)
SQL
WITH table_0 AS (
SELECT
*
FROM
read_parquet(
'artists.parquet',
binary_as_string = false,
file_row_number = false,
hive_partitioning = NULL,
union_by_name = false
)
),
table_1 AS (
SELECT
*
FROM
read_csv_auto('albums.csv')
),
table_2 AS (
SELECT
*
FROM
read_json_auto('metadata.json')
)
SELECT
table_0.*,
table_1.*,
table_2.*
FROM
table_0
INNER JOIN table_1 ON table_0.artist_id = table_1.artist_id
INNER JOIN table_2 ON table_0.artist_id = table_2.artist_id
[!NOTE] These don’t currently have all the DuckDB options. If those would be helpful, please log an issue and it’s a fairly easy addition.
[!NOTE] We may be able to reduce the boilerplate
WITH table_x AS SELECT * FROM...in future versions.
When specifying file names directly in the FROM clause without using
functions, which is allowed in DuckDB, enclose the file names in backticks
`` as follows:
PRQL
from `artists.parquet`
SQL
SELECT
*
FROM
"artists.parquet"