This extension allows you to connect DuckDB to Microsoft Fabric OneLake workspaces and lakehouses, enabling you to query data stored in OneLake directly from DuckDB.
Maintainer(s):
achrafcei
Installing and Loading
INSTALL onelake FROM community;
LOAD onelake;
Example
-- First, create a secret to authenticate with OneLake.
-- You can use either a service principal or the credential chain (e.g., Azure CLI).
-- Uncomment the desired method and provide the necessary credentials.
set azure_transport_option_type = 'curl';
CREATE SECRET (
TYPE azure,
PROVIDER service_principal,
TENANT_ID '<your_tenant_id>',
CLIENT_ID '<your_client_id>',
CLIENT_SECRET '<your_client_secret>'
);
-- CREATE SECRET (
-- TYPE azure,
-- PROVIDER credential_chain,
-- CHAIN 'cli'
-- );
CREATE SECRET onelake (
TYPE ONELAKE,
TENANT_ID '<your_tenant_id>',
CLIENT_ID '<your_client_id>',
CLIENT_SECRET '<your_client_secret>'
);
-- CREATE SECRET onelake(
-- TYPE ONELAKE,
-- PROVIDER credential_chain,
-- CHAIN 'cli'
-- );
-- Optional: use preissued tokens stored in env variables (defaults shown)
--SET onelake_env_fabric_token_variable = 'FABRIC_API_TOKEN';
--SET onelake_env_storage_token_variable = 'AZURE_STORAGE_TOKEN';
--CREATE SECRET onelake_env (
-- TYPE ONELAKE,
-- PROVIDER credential_chain,
-- CHAIN 'env'
--);
-- Attach to your OneLake workspace and lakehouse
ATTACH '<your_workspace_name>/<your_lakehouse_name>.Lakehouse'
AS <your_connection_name>
(TYPE ONELAKE);
USE <your_connection_name>.<your_schema_name>; -- e.g., dbo
SHOW TABLES;
-- Query tables (Delta or Iceberg)
SELECT * FROM <your_table_name> LIMIT 10;
-- Create a new Delta table
CREATE TABLE new_table (
id INT,
name VARCHAR
);
-- Create a new Delta table with partitioning
CREATE TABLE new_table_partitioned (
id INT,
name VARCHAR,
region VARCHAR
) PARTITION BY (region);
-- Write to Delta tables (INSERT)
INSERT INTO new_table VALUES (1, 'New Data');
-- SET write mode to overwrite (default is append)
SET onelake_delta_write_mode = 'overwrite';
INSERT INTO new_table VALUES (2, 'Overwritten Data');
-- Perform destructive operations (UPDATE, DELETE, DROP)
-- These require explicit opt-in per session
SET onelake_allow_destructive_operations = true;
UPDATE new_table
SET name = 'Updated Value'
WHERE id = 1;
DELETE FROM new_table WHERE id = 1;
DROP TABLE new_table;
About onelake
This extension enables DuckDB to connect to Microsoft Fabric OneLake workspaces and lakehouses, allowing users to query and modify data stored in OneLake directly from DuckDB. It supports authentication via service principals, credential chains (e.g., Azure CLI), or environment tokens and provides seamless integration with OneLake's data storage capabilities.
Key Features:
- Read support for both Delta and Iceberg tables.
- Full write support for Delta tables including CREATE TABLE, INSERT, UPDATE, DELETE, and DROP.
- Configurable write modes (append, overwrite) and schema evolution.
- ALTER TABLE support is planned for future releases.
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| copy_dir | table_macro | NULL | NULL | |
| delta_filter_pushdown_log | table_macro | NULL | NULL | |
| delta_filter_pushdown_log_tpcds | table_macro | NULL | NULL | |
| delta_list_files | table | NULL | NULL | |
| delta_scan | table | NULL | NULL | |
| get_delta_test_expression | scalar | NULL | NULL | |
| parse_delta_filter_logline | macro | NULL | NULL | |
| write_blob | scalar | NULL | NULL |
Added Settings
| name | description | input_type | scope | aliases | |||
|---|---|---|---|---|---|---|---|
| auto_fallback_to_full_download | Allows automatically falling back to full file downloads when possible. | BOOLEAN | GLOBAL | [] | |||
| ca_cert_file | Path to a custom certificate file for self-signed certificates. | VARCHAR | GLOBAL | [] | |||
| delta_kernel_logging | Forwards the internal logging of the Delta Kernel to the duckdb logger. Warning: this may impact performance even with DuckDB logging disabled. | BOOLEAN | GLOBAL | [] | |||
| delta_scan_explain_files_filtered | Adds the filtered files to the explain output. Warning: this may impact performance of delta scan during explain analyze queries. | BOOLEAN | GLOBAL | [] | |||
| enable_curl_server_cert_verification | Enable server side certificate verification for CURL backend. | BOOLEAN | GLOBAL | [] | |||
| enable_server_cert_verification | Enable server side certificate verification. | BOOLEAN | GLOBAL | [] | |||
| force_download | Forces upfront download of file | BOOLEAN | GLOBAL | [] | |||
| hf_max_per_page | Debug option to limit number of items returned in list requests | UBIGINT | GLOBAL | [] | |||
| http_keep_alive | Keep alive connections. Setting this to false can help when running into connection failures | BOOLEAN | GLOBAL | [] | |||
| http_retries | HTTP retries on I/O error | UBIGINT | GLOBAL | [] | |||
| http_retry_backoff | Backoff factor for exponentially increasing retry wait time | FLOAT | GLOBAL | [] | |||
| http_retry_wait_ms | Time between retries | UBIGINT | GLOBAL | [] | |||
| http_timeout | HTTP timeout read/write/connection/retry (in seconds) | UBIGINT | GLOBAL | [] | |||
| httpfs_client_implementation | Select which is the HTTPUtil implementation to be used | VARCHAR | GLOBAL | [] | |||
| onelake_allow_destructive_operations | Allow destructive operations like DROP TABLE (default: false) | BOOLEAN | GLOBAL | [] | |||
| onelake_env_fabric_token_variable | Environment variable name that stores the Fabric API access token | VARCHAR | GLOBAL | [] | |||
| onelake_env_storage_token_variable | Environment variable name that stores the OneLake storage access token | VARCHAR | GLOBAL | [] | |||
| onelake_insert_mode | Default INSERT mode (append | overwrite | error_if_exists | ignore) | VARCHAR | GLOBAL | [] |
| onelake_partition_columns | Comma-separated list of partition columns for the next CREATE TABLE | VARCHAR | GLOBAL | [] | |||
| onelake_safe_cast | Enable safe type casting during writes | BOOLEAN | GLOBAL | [] | |||
| onelake_schema_mode | Schema evolution mode (empty | merge | overwrite) | VARCHAR | GLOBAL | [] | |
| onelake_target_file_size | Target Parquet file size in bytes (0 = default) | UBIGINT | GLOBAL | [] | |||
| onelake_write_batch_size | Number of rows per write batch (0 = default) | UBIGINT | GLOBAL | [] | |||
| s3_access_key_id | S3 Access Key ID | VARCHAR | GLOBAL | [] | |||
| s3_endpoint | S3 Endpoint | VARCHAR | GLOBAL | [] | |||
| s3_kms_key_id | S3 KMS Key ID | VARCHAR | GLOBAL | [] | |||
| s3_region | S3 Region | VARCHAR | GLOBAL | [] | |||
| s3_requester_pays | S3 use requester pays mode | BOOLEAN | GLOBAL | [] | |||
| s3_secret_access_key | S3 Access Key | VARCHAR | GLOBAL | [] | |||
| s3_session_token | S3 Session Token | VARCHAR | GLOBAL | [] | |||
| s3_uploader_max_filesize | S3 Uploader max filesize (between 50GB and 5TB) | VARCHAR | GLOBAL | [] | |||
| s3_uploader_max_parts_per_file | S3 Uploader max parts per file (between 1 and 10000) | UBIGINT | GLOBAL | [] | |||
| s3_uploader_thread_limit | S3 Uploader global thread limit | UBIGINT | GLOBAL | [] | |||
| s3_url_compatibility_mode | Disable Globs and Query Parameters on S3 URLs | BOOLEAN | GLOBAL | [] | |||
| s3_url_style | S3 URL style | VARCHAR | GLOBAL | [] | |||
| s3_use_ssl | S3 use SSL | BOOLEAN | GLOBAL | [] | |||
| unsafe_disable_etag_checks | Disable checks on ETag consistency | BOOLEAN | GLOBAL | [] |