Optimized-plan inspection and cross-system SQL transpilation
Installing and Loading
INSTALL lpts FROM community;
LOAD lpts;
Example
-- Set both input and output dialects to DuckDB for a first round trip.
D SET lpts_input_dialect = 'duckdb';
D SET lpts_dialect = 'duckdb';
D CREATE TABLE users (id INTEGER, name VARCHAR, age INTEGER);
D INSERT INTO users VALUES (1, 'Alice', 30), (2, 'Bob', 22), (3, 'Carol', 28);
-- Return the optimized plan as readable CTE SQL.
D PRAGMA lpts('SELECT name FROM users WHERE age > 25');
┌──────────────────────────────────────────────────────────────────────────────┐
│ sql │
│ varchar │
├──────────────────────────────────────────────────────────────────────────────┤
│ WITH scan_0 (t0_name) AS (SELECT name FROM memory.main.users WHERE age>25), │
│ projection_1 (t1_name) AS (SELECT t0_name FROM scan_0) │
│ SELECT t1_name AS "name" FROM projection_1; │
└──────────────────────────────────────────────────────────────────────────────┘
-- Check that the generated SQL returns the same bag of rows as the input query.
D PRAGMA lpts_check('SELECT name FROM users WHERE age > 25');
┌─────────┐
│ match │
│ boolean │
├─────────┤
│ true │
└─────────┘
About lpts
LPTS is a DuckDB extension for optimized-plan inspection and cross-system SQL transpilation. LPTS takes DuckDB's post-optimizer logical plan and reconstructs equivalent SQL as a sequence of named CTEs.
PRAGMA Syntax
PRAGMA lpts('<query>');
Example:
-- Set both input and output dialects to DuckDB for a first round trip.
D SET lpts_input_dialect = 'duckdb';
D SET lpts_dialect = 'duckdb';
D CREATE TABLE users (id INTEGER, name VARCHAR, age INTEGER);
D INSERT INTO users VALUES (1, 'Alice', 30), (2, 'Bob', 22), (3, 'Carol', 28);
-- Return the optimized plan as readable CTE SQL.
D PRAGMA lpts('SELECT name FROM users WHERE age > 25');
┌──────────────────────────────────────────────────────────────────────────────┐
│ sql │
│ varchar │
├──────────────────────────────────────────────────────────────────────────────┤
│ WITH scan_0 (t0_name) AS (SELECT name FROM memory.main.users WHERE age>25), │
│ projection_1 (t1_name) AS (SELECT t0_name FROM scan_0) │
│ SELECT t1_name AS "name" FROM projection_1; │
└──────────────────────────────────────────────────────────────────────────────┘
LPTS plans the query through DuckDB, optimizes it, then serializes the optimized logical plan.
Supported Dialects
The dialect settings accept these values:
| Dialect | Accepted values |
|---|---|
| DuckDB | duckdb |
| PostgreSQL | postgres, postgresql |
| Spark SQL | spark |
| Hive | hive |
| Trino / Presto | trino, presto |
| Snowflake | snowflake |
| BigQuery | bigquery, bq |
| Redshift | redshift |
| MySQL / MariaDB | mysql, mariadb |
Use Cases
- Inspect optimized DuckDB plans as SQL.
- Debug optimizer rewrites such as filter pushdown, join reordering, top-N, materialized CTEs, and subquery decorrelation.
- Generate a CTE program that communicates the optimized execution shape.
- Emit SQL for another engine with
lpts_dialect. - Convert other SQL dialect syntax to DuckDB SQL with
lpts_input_dialect, then execute or inspect it.
Supported Operators
LPTS is intended to cover all logical operators produced by optimized DuckDB SELECT plans. The current regression suite round-trips all 22 TPC-H queries and exercises joins, aggregates, windows, set operations, CTEs, recursive CTEs, table functions, DuckLake scans, and inserts.
Unsupported optimizer edge cases fail explicitly with NotImplementedException.
Examples
D CREATE TABLE events (id INTEGER, ts TIMESTAMP, name VARCHAR, "order" INTEGER);
D INSERT INTO events VALUES
(1, TIMESTAMP '2024-01-15 08:09:10', 'alpha', 10),
(11, TIMESTAMP '2024-01-16 11:12:13', 'beta', 20);
-- Render generated SQL for PostgreSQL.
D SET lpts_dialect = 'postgres';
-- Return generated CTE SQL directly in the shell.
D PRAGMA lpts(
'SELECT strftime(ts, ''%Y-%m-%d'') AS day
FROM events
WHERE id > 10
ORDER BY day'
);
-- Return generated CTE SQL as a table row, useful in scripts and tests.
D SELECT sql
FROM lpts_query(
'SELECT strftime(ts, ''%Y-%m-%d'') AS day
FROM events
WHERE id > 10
ORDER BY day'
);
┌────────────────────────────────────────────────────────────────────────────────────────┐
│ sql │
│ varchar │
├────────────────────────────────────────────────────────────────────────────────────────┤
│ WITH scan_0 (t0_ts) AS (SELECT ts FROM events WHERE id>10), │
│ projection_1 (t1_day) AS (SELECT to_char(t0_ts, 'YYYY-MM-DD') FROM scan_0), │
│ order_2 (t1_day) AS (SELECT t1_day FROM projection_1 ORDER BY t1_day ASC NULLS LAST) │
│ SELECT t1_day AS "day" FROM order_2; │
└────────────────────────────────────────────────────────────────────────────────────────┘
-- Switch back to DuckDB rendering.
D SET lpts_dialect = 'duckdb';
-- Execute the generated SQL and return the query result.
D PRAGMA lpts_exec('SELECT name FROM events WHERE id > 10 ORDER BY name');
-- Compare original and generated SQL using bag equality.
D PRAGMA lpts_check('SELECT name FROM events WHERE id > 10 ORDER BY name');
┌─────────┐
│ name │
│ varchar │
├─────────┤
│ beta │
└─────────┘
┌─────────┐
│ match │
│ boolean │
├─────────┤
│ true │
└─────────┘
-- Print the AST tree to stdout for interactive debugging.
D PRAGMA print_ast('SELECT name FROM events WHERE id > 10 ORDER BY name');
-- Return the AST tree as a table row, useful for tools and regression tests.
D SELECT ast
FROM print_ast_query('SELECT name FROM events WHERE id > 10 ORDER BY name');
-- Normalize MySQL syntax before DuckDB parses and plans the query.
D SET lpts_input_dialect = 'mysql';
-- Return source-dialect SQL normalized to DuckDB SQL.
D SELECT sql
FROM lpts_normalize_query(
'SELECT `order`, DATE_FORMAT(ts, ''%Y-%m-%d %H:%i:%s'') AS formatted FROM events LIMIT 5, 10'
);
SELECT "order", strftime(ts, '%Y-%m-%d %H:%M:%S') AS formatted FROM events LIMIT 10 OFFSET 5
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| lpts | pragma | Return the optimized logical plan for a query as equivalent CTE SQL. | NULL | [PRAGMA lpts('SELECT name FROM users WHERE age > 25')] |
| lpts_check | pragma | Compare the original query and the LPTS-generated query using EXCEPT ALL in both directions. | NULL | [PRAGMA lpts_check('SELECT name FROM users WHERE age > 25')] |
| lpts_exec | pragma | Execute the SQL generated by LPTS and return its result rows. | NULL | [PRAGMA lpts_exec('SELECT name FROM users WHERE age > 25')] |
| lpts_normalize_query | table | Normalize SQL from lpts_input_dialect into DuckDB SQL without planning it. | NULL | [SET lpts_input_dialect = 'mysql'; SELECT sql FROM lpts_normalize_query('SELECT order FROM users LIMIT 1, 2')] |
| lpts_query | table | Table-function form of PRAGMA lpts. Returns the generated CTE SQL as a single sql column. | NULL | [SELECT sql FROM lpts_query('SELECT name FROM users WHERE age > 25')] |
| print_ast | pragma | Print the LPTS AST tree for a query to stdout. | NULL | [PRAGMA print_ast('SELECT name FROM users WHERE age > 25')] |
| print_ast_query | table | Table-function form of PRAGMA print_ast. Returns the rendered AST tree as a single ast column. | NULL | [SELECT ast FROM print_ast_query('SELECT name FROM users WHERE age > 25')] |
Overloaded Functions
This extension does not add any function overloads.
Added Types
This extension does not add any types.
Added Settings
| name | description | input_type | scope | aliases |
|---|---|---|---|---|
| lpts_dialect | SQL dialect for lpts output. Valid values: 'duckdb' (default), 'postgres', 'spark', 'hive', 'trino', 'presto', 'snowflake', 'bigquery', 'redshift', 'mysql', 'mariadb' | VARCHAR | GLOBAL | [] |
| lpts_enable_data_dependent_optimizers | Enable LPTS planning optimizers that depend on current data, statistics, cardinality estimates, row groups, or runtime dynamic filters. | BOOLEAN | GLOBAL | [] |
| lpts_input_dialect | SQL dialect for lpts input normalization. Valid values: 'duckdb' (default), 'postgres', 'spark', 'hive', 'trino', 'presto', 'snowflake', 'bigquery', 'redshift', 'mysql', 'mariadb' | VARCHAR | GLOBAL | [] |