Standard Library

The standard library currently contains commonly used functions that are used in SQL. It’s not yet as broad as we’d like, and we’re very open to expanding it.

Currently s-strings are an escape-hatch for any function that isn’t in our standard library. If we find ourselves using them for something frequently, raise an issue and we’ll add it to the stdlib.

Note

Currently the stdlib implementation doesn’t support different DB implementations itself; those need to be built deeper into the compiler. We’ll resolve this at some point. Until then, we’ll only add functions here that are broadly supported by most DBs.

Here’s the source of the current PRQL std:

# Aggregate Functions
func min <scalar|column> column -> null
func max <scalar|column> column -> null
func sum <scalar|column> column -> null
func avg <scalar|column> column -> null
func stddev <scalar|column> column -> null
func average <scalar|column> column -> null
func count <scalar|column> non_null:s"*" -> null
# TODO: Possibly make this into `count distinct:true` (or like `distinct:` as an
# abbreviation of that?)
func count_distinct <scalar|column> column -> null

# Window functions
func lag<column> offset column -> null
func lead<column> offset column -> null
func first<column> offset column -> null
func last<column> offset column -> null
func rank<column> -> null
func rank_dense<column> -> null
func row_number<column> -> null

# Other functions
func round<scalar> n_digits column -> null
func as<scalar> `noresolve.type` column -> null
func in<bool> pattern value -> null

# Transform type definitions
func from<table> `default_db.source`<table> -> null
func select<table> columns<column> tbl<table> -> null
func filter<table> condition<bool> tbl<table> -> null
func derive<table> columns<column> tbl<table> -> null
func aggregate<table> a<column> tbl<table> -> null
func sort<table> by tbl<table> -> null
func take<table> expr tbl<table> -> null
func join<table> `default_db.with`<table> filter `noresolve.side`:inner tbl<table> -> null
func group<table> by pipeline tbl<table> -> null
func window<table> rows:0..0 range:0..0 expanding:false rolling:0 pipeline tbl<table> -> null

func append<table> `default_db.bottom`<table> top<table> -> null
func intersect<table> `default_db.bottom`<table> top<table> -> (
    from t = _param.top
    join b = _param.bottom (all (map _eq (zip t.* b.*)))
    select t.*
)
func remove<table> `default_db.bottom`<table> top<table> -> (
    from t = _param.top
    join side:left b = _param.bottom (all (map _eq (zip t.* b.*)))
    filter (all (map _is_null b.*))
    select t.*
)

# List functions
func all<bool> list<list> -> null
func map<list> fn list<list> -> null
func zip<list> a<list> b<list> -> null
func _eq<list> a<list> -> null
func _is_null a -> _param.a == null

# Misc
func from_text<table> text<string> `noresolve.format`:csv -> null

And a couple of examples:

PRQL

from employees
derive [
  gross_salary = (salary + payroll_tax | as int),
  gross_salary_rounded = (gross_salary | round 0),
  time = s"NOW()",  # an s-string, given no `now` function exists in PRQL
]

SQL

SELECT
  *,
  CAST(salary + payroll_tax AS int) AS gross_salary,
  ROUND(CAST(salary + payroll_tax AS int), 0) AS gross_salary_rounded,
  NOW() AS time
FROM
  employees