Why PRQL?
For data engineers
- PRQL is concise, with abstractions such as variables & functions
- PRQL is database agnostic, compiling to many dialects of SQL
- PRQL isn’t limiting — it can contain embedded SQL where necessary
- PRQL has bindings to most major languages (and more are in progress)
- PRQL allows for column lineage and type inspection (in progress)
For analysts
- PRQL is ergonomic for data exploration — for example, commenting out a filter, or a column in a list, maintains a valid query
- PRQL is simple, and easy to understand, with a small number of powerful concepts
- PRQL allows for powerful autocomplete, type-checking, and helpful error messages (in progress)
For tools
- PRQL is a stable foundation to build on; we’re open-source and will never have a commercial product
- PRQL is a single secular standard which tools can target
- PRQL is easy for machines to read & write
For HackerNews enthusiasts
- The PRQL compiler is written in rust
- We talk about “orthogonal language features” a lot
Showcase
PRQL consists of a curated set of orthogonal transformations, which are combined together to form a pipeline. That makes it easy to compose and extend queries. The language also benefits from modern features, such syntax for dates, ranges and f-strings as well as functions, type checking and better null handling.
Playground Bookfrom employees
select [id, first_name, age]
sort age
take 10
SELECT id, first_name, age
FROM employees
ORDER BY age
LIMIT 10
from order # This is a comment
filter status == "done"
sort [-amount] # sort order
SELECT
order.*
FROM
order
WHERE
status = 'done'
ORDER BY
amount DESC
from employees
derive [
age_at_year_end = (@2022-12-31 - dob),
first_check_in = start + 10days,
]
SELECT
*,
DATE '2022-12-31' - dob AS age_at_year_end,
start + INTERVAL 10 DAY AS first_check_in
FROM
employees
from employees
# Filter before aggregations
filter start_date > @2021-01-01
group country (
aggregate [max_salary = max salary]
)
# And filter after aggregations!
filter max_salary > 100_000
SELECT
country,
MAX(salary) AS max_salary
FROM
employees
WHERE
start_date > DATE '2021-01-01'
GROUP BY
country
HAVING
MAX(salary) > 100_000
from web
# Just like Python
select url = f"http://www.{domain}.{tld}/{page}"
SELECT CONCAT('http://www.', domain, '.', tld,
'/', page) AS url
FROM web
from employees
group employee_id (
sort month
window rolling:12 (
derive [trail_12_m_comp = sum paycheck]
)
)
SELECT
employees.*,
SUM(paycheck) OVER (
PARTITION BY employee_id
ORDER BY
month ROWS BETWEEN 11 PRECEDING
AND CURRENT ROW
) AS trail_12_m_comp
FROM
employees
func fahrenheit_from_celsius temp -> temp * 9/5 + 32
from weather
select temp_f = (fahrenheit_from_celsius temp_c)
SELECT
temp_c * 9/5 + 32 AS temp_f
FROM
weather
# Most recent employee in each role
# Quite difficult in SQL...
from employees
group role (
sort join_date
take 1
)
WITH table_0 AS (
SELECT
employees.*,
ROW_NUMBER() OVER (
PARTITION BY role
ORDER BY
join_date
) AS _rn
FROM
employees
)
SELECT
table_0.*
FROM
table_0
WHERE
_rn <= 1
# There's no `version` in PRQL, but
# we have an escape hatch:
derive db_version = s"version()"
SELECT
version() AS db_version
from employees
join b=benefits [==employee_id]
join side:left p=positions [p.id==employees.employee_id]
select [employees.employee_id, p.role, b.vision_coverage]
SELECT
employees.employee_id,
p.role,
b.vision_coverage
FROM
employees
JOIN benefits AS b ON employees.employee_id = b.employee_id
LEFT JOIN positions AS p ON p.id = employees.employee_id
from users
filter last_login != null
filter deleted_at == null
derive channel = channel ?? "unknown"
SELECT
users.*,
COALESCE(channel, 'unknown') AS channel
FROM
users
WHERE
last_login IS NOT NULL
AND deleted_at IS NULL
prql target:sql.mssql # Will generate TOP rather than LIMIT
from employees
sort age
take 10
SELECT
TOP (10) employees.*
FROM
employees
ORDER BY
age
Principles
Pipelined
A PRQL query is a linear pipeline of transformations
Each line of the query is a transformation of the previous line’s result. This makes it easy to read, and simple to write.
Simple
PRQL serves both sophisticated engineers and analysts without coding experience.
By providing a small number of powerful & orthogonal primitives, queries are simple and composable — there’s only one way of expressing each operation. We can eschew the debt that SQL has built up.
Open
PRQL is open-source, with an open community
PRQL will always be fully open-source and will never have a commercial product. By compiling to SQL, PRQL is compatible with most databases, existing tools, and programming languages that manage SQL. We’re a welcoming community for users, contributors, and other projects.
Extensible
PRQL is designed to be extended, from functions to language bindings
PRQL has abstractions which make it a great platform to build on. Its explicit versioning allows changes without breaking backward-compatibility. And in the cases where PRQL doesn’t yet have an implementation, it allows embedding SQL with S-Strings.
Analytical
PRQL’s focus is analytical queries
PRQL was originally designed to serve the growing need of writing analytical queries, emphasizing data transformations, development speed, and readability. We de-emphasize other SQL features such as inserting data or transactions.
Pipelines in action
Integrations
dbt
Allows writing PRQL in dbt models. This combines the benefits of PRQL’s power & simplicity within queries; with dbt’s version control, lineage & testing across queries.Jupyter/IPython
PyPrql contains a Jupyter extension, which executes a PRQL cell against a database. It can also set up an in-memory DuckDB instance, populated with a pandas DataFrame.Visual Studio Code
Extension with syntax highlighting and an upcoming language server.Prefect
Add PRQL models to your Prefect workflows with a single function.Tools
Playground
Online in-browser playground that compiles PRQL to SQL as you type.PyPrql
Python TUI for connecting to databases. Provides a native interactive console with auto-complete for column names and Jupyter/IPython cell magic.
pip install pyprql
prql-compiler
Reference compiler implementation. Has a CLI utility that can transpile, format and annotate PRQL queries.
brew install prql/PRQL/prql-compiler
What people are saying
It starts with FROM, it fixes trailing commas, and it's called PRQL?? If this is a dream, don't wake me up. — Jeremiah Lowin, Founder & CEO, Prefect.
Hamilton Ulmer
@hamiltonulmerSwanand.
@_swanandRishabh Software
@RishabhSoftBurak Emir
@burakemirMichael Sumner
@mdsumner