Why PRQL?
For data engineers
- PRQL is concise, with abstractions such as variables & functions
- PRQL is database agnostic, compiling to many dialects of SQL
- PRQL isn’t limiting — it can contain embedded SQL where necessary
- PRQL has bindings to most major languages (and more are in progress)
- PRQL allows for column lineage and type inspection (in progress)
For analysts
- PRQL is ergonomic for data exploration — for example, commenting out a filter, or a column in a list, maintains a valid query
- PRQL is simple, and easy to understand, with a small number of powerful concepts
- PRQL allows for powerful autocomplete, type-checking, and helpful error messages (in progress)
For tools
- PRQL’s vision is a foundation to build on; we’re open-source and will never have a commercial product
- PRQL is growing into a single secular standard which tools can target
- PRQL is easy for machines to read & write
For HackerNews enthusiasts
- The PRQL compiler is written in Rust
- We talk about “orthogonal language features” a lot
Showcase
PRQL consists of a curated set of orthogonal transformations, which are combined together to form a pipeline. That makes it easy to compose and extend queries. The language also benefits from modern features, such syntax for dates, ranges and f-strings as well as functions, type checking and better null handling.
Playground Bookfrom employees
select {id, first_name, age}
sort age
take 10
SELECT
id,
first_name,
age
FROM
employees
ORDER BY
age
LIMIT
10
from track_plays
filter plays > 10_000 # Readable numbers
filter (length | in 60..240) # Ranges with `..`
filter recorded > @2008-01-01 # Simple date literals
filter released - recorded < 180days # Nice interval literals
sort {-length} # Concise order direction
SELECT
*
FROM
track_plays
WHERE
plays > 10000
AND length BETWEEN 60 AND 240
AND recorded > DATE '2008-01-01'
AND released - recorded < INTERVAL 180 DAY
ORDER BY
length DESC
from employees
# `filter` before aggregations...
filter start_date > @2021-01-01
group country (
aggregate {max_salary = max salary}
)
# ...and `filter` after aggregations!
filter max_salary > 100_000
SELECT
country,
MAX(salary) AS max_salary
FROM
employees
WHERE
start_date > DATE '2021-01-01'
GROUP BY
country
HAVING
MAX(salary) > 100000
from track_plays
derive {
finished = started + unfinished,
fin_share = finished / started, # Use previous definitions
fin_ratio = fin_share / (1-fin_share), # BTW, hanging commas are optional!
}
SELECT
*,
started + unfinished AS finished,
(started + unfinished) / started AS fin_share,
(started + unfinished) / started / (1 - (started + unfinished) / started)
AS fin_ratio
FROM
track_plays
from web
# Just like Python
select url = f"https://www.{domain}.{tld}/{page}"
SELECT
CONCAT('https://www.', domain, '.', tld, '/', page) AS url
FROM
web
from employees
group employee_id (
sort month
window rolling:12 (
derive {trail_12_m_comp = sum paycheck}
)
)
SELECT
*,
SUM(paycheck) OVER (
PARTITION BY employee_id
ORDER BY
month ROWS BETWEEN 11 PRECEDING AND CURRENT ROW
) AS trail_12_m_comp
FROM
employees
let fahrenheit_from_celsius = temp -> temp * 9/5 + 32
from weather
select temp_f = (fahrenheit_from_celsius temp_c)
SELECT
temp_c * 9 / 5 + 32 AS temp_f
FROM
weather
# Most recent employee in each role
# Quite difficult in SQL...
from employees
group role (
sort join_date
take 1
)
WITH table_0 AS (
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY role
ORDER BY
join_date
) AS _expr_0
FROM
employees
)
SELECT
*
FROM
table_0
WHERE
_expr_0 <= 1
# There's no `version` in PRQL, but s-strings
# let us embed SQL as an escape hatch:
from x
derive db_version = s"version()"
SELECT
*,
version() AS db_version
FROM x
from employees
join b=benefits (==employee_id)
join side:left p=positions (p.id==employees.employee_id)
select {employees.employee_id, p.role, b.vision_coverage}
SELECT
employees.employee_id,
p.role,
b.vision_coverage
FROM
employees
JOIN benefits AS b ON employees.employee_id = b.employee_id
LEFT JOIN positions AS p ON p.id = employees.employee_id
from users
filter last_login != null
filter deleted_at == null
derive channel = channel ?? "unknown"
SELECT
*,
COALESCE(channel, 'unknown') AS channel
FROM
users
WHERE
last_login IS NOT NULL
AND deleted_at IS NULL
prql target:sql.mssql # Will generate TOP rather than LIMIT
from employees
sort age
take 10
SELECT
TOP (10) *
FROM
employees
ORDER BY
age
Principles
Pipelined
A PRQL query is a linear pipeline of transformations
Each line of the query is a transformation of the previous line’s result. This makes it easy to read, and simple to write.
Simple
PRQL serves both sophisticated engineers and analysts without coding experience.
By providing a small number of powerful & orthogonal primitives, queries are simple and composable — there’s only one way of expressing each operation. We can eschew the debt that SQL has built up.
Open
PRQL is open-source, with an open community
PRQL will always be fully open-source and will never have a commercial product. By compiling to SQL, PRQL is compatible with most databases, existing tools, and programming languages that manage SQL. We’re a welcoming community for users, contributors, and other projects.
Extensible
PRQL is designed to be extended, from functions to language bindings
PRQL has abstractions which make it a great platform to build on. Its explicit versioning allows changes without breaking backward-compatibility. And in the cases where PRQL doesn’t yet have an implementation, it allows embedding SQL with s-strings.
Analytical
PRQL’s focus is analytical queries
PRQL was originally designed to serve the growing need of writing analytical queries, emphasizing data transformations, development speed, and readability. We de-emphasize other SQL features such as inserting data or transactions.
Pipelines in action
Integrations
Jupyter/IPython
pyprql contains a Jupyter extension, which executes a PRQL cell against a database. It can also set up an in-memory DuckDB instance, populated with a pandas DataFrame.Visual Studio Code
Extension with syntax highlighting and live SQL compilation.Prefect
Add PRQL models to your Prefect workflows with a single function.DuckDB
A DuckDB extension to execute PRQLTools
Playground
Online in-browser playground that compiles PRQL to SQL as you type.prqlc
A CLI for PRQL compiler, written in Rust.
cargo install prqlc
brew install prqlc
winget install prqlc
Bindings
prql-python
Python bindings for prql-compiler.prql-js
JavaScript bindings for prql-compiler.prqlr
R bindings for prql-compiler.prql-compiler
Compiler implementation, written in Rust. Compile, format & annotate PRQL queries.Others
Java, C, C++, Elixir, .NET, and PHP have unsupported or nascent bindings.What people are saying
It starts with FROM, it fixes trailing commas, and it's called PRQL?? If this is a dream, don't wake me up. — Jeremiah Lowin, Founder & CEO, Prefect.