Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Reading files

There are a few functions mainly designed for DuckDB to read from files:

PRQL

prql target:sql.duckdb

from a = (read_parquet "artists.parquet")
join b = (read_csv "albums.csv") (a.artist_id == b.artist_id)
join c = (read_json "metadata.json") (a.artist_id == c.artist_id)

SQL

WITH table_0 AS (
  SELECT
    *
  FROM
    read_parquet(
      'artists.parquet',
      binary_as_string = false,
      file_row_number = false,
      hive_partitioning = NULL,
      union_by_name = false
    )
),
table_1 AS (
  SELECT
    *
  FROM
    read_csv_auto('albums.csv')
),
table_2 AS (
  SELECT
    *
  FROM
    read_json_auto('metadata.json')
)
SELECT
  table_0.*,
  table_1.*,
  table_2.*
FROM
  table_0
  INNER JOIN table_1 ON table_0.artist_id = table_1.artist_id
  INNER JOIN table_2 ON table_0.artist_id = table_2.artist_id

[!NOTE] These don’t currently have all the DuckDB options. If those would be helpful, please log an issue and it’s a fairly easy addition.

[!NOTE] We may be able to reduce the boilerplate WITH table_x AS SELECT * FROM... in future versions.

When specifying file names directly in the FROM clause without using functions, which is allowed in DuckDB, enclose the file names in backticks `` as follows:

PRQL

from `artists.parquet`

SQL

SELECT
  *
FROM
  "artists.parquet"

See also