S-strings

An s-string inserts SQL directly, as an escape hatch when there’s something that PRQL doesn’t yet implement. For example, there’s no version() function in SQL that returns the Postgres version, so if we want to use that, we use an s-string:

PRQL

from my_table
select db_version = s"version()"

SQL

SELECT
  version() AS db_version
FROM
  my_table

Embed a column name in an s-string using braces. For example, PRQL’s standard library defines the average function as:

func average column -> s"AVG({column})"

So this compiles using the function:

PRQL

from employees
aggregate [average salary]

SQL

SELECT
  AVG(salary)
FROM
  employees

Here’s an example of a more involved use of an s-string:

PRQL

from de=dept_emp
join s=salaries side:left [
  (s.emp_no == de.emp_no),
  s"""({s.from_date}, {s.to_date})
  OVERLAPS
  ({de.from_date}, {de.to_date})"""
]

SQL

SELECT
  de.*,
  s.*
FROM
  dept_emp AS de
  LEFT JOIN salaries AS s ON s.emp_no = de.emp_no
  AND (s.from_date, s.to_date) OVERLAPS (de.from_date, de.to_date)

For those who have used Python, s-strings are similar to Python’s f-strings, but the result is SQL code, rather than a string literal. For example, a Python f-string of f"average({col})" would produce "average(salary)", with quotes; while in PRQL, s"average({col})" produces average(salary), without quotes.

Note that interpolations can only contain plain variable names and not whole expression like Python.

We can also use s-strings to produce a full table:

PRQL

from s"SELECT DISTINCT ON first_name, id, age FROM employees ORDER BY age ASC"
join s = s"SELECT * FROM salaries" [==id]

SQL

WITH table_0 AS (
  SELECT
    DISTINCT ON first_name,
    id,
    age
  FROM
    employees
  ORDER BY
    age ASC
),
table_1 AS (
  SELECT
    *
  FROM
    salaries
)
SELECT
  table_2.*,
  table_3.*
FROM
  table_0 AS table_2
  JOIN table_1 AS table_3 ON table_2.id = table_3.id

Note

S-strings in user code are intended as an escape-hatch for an unimplemented feature. If we often need s-strings to express something, that’s a sign we should implement it in PRQL or PRQL’s stdlib.

Braces

To output braces from an s-string, use double braces:

PRQL

from employees
derive [
  has_valid_title = s"regexp_contains(title, '([a-z0-9]*-){{2,}}')"
]

SQL

SELECT
  *,
  regexp_contains(title, '([a-z0-9]*-){2,}') AS has_valid_title
FROM
  employees

Precedence

The PRQL compiler simply places a literal copy of each variable into the resulting string, which means we may get surprising behavior when the variable is has multiple terms and the s-string isn’t parenthesized.

In this toy example, the salary + benefits / 365 gets precedence wrong:

PRQL

from employees
derive [
  gross_salary = salary + benefits,
  daily_rate = s"{gross_salary} / 365"
]

SQL

SELECT
  *,
  salary + benefits AS gross_salary,
  salary + benefits / 365 AS daily_rate
FROM
  employees

Instead, put the denominator {gross_salary} in parentheses:

PRQL

from employees
derive [
  gross_salary = salary + benefits,
  daily_rate = s"({gross_salary}) / 365"
]

SQL

SELECT
  *,
  salary + benefits AS gross_salary,
  (salary + benefits) / 365 AS daily_rate
FROM
  employees