Search Shortcut cmd + k | ctrl + k
salesforce

Read-only access to Salesforce orgs as DuckDB SQL tables over the official REST and Bulk APIs — OAuth refresh-token or JWT-bearer auth (credentials from inline options, environment variables, or an SFDX auth URL), native ATTACH, projection + predicate pushdown, COUNT pushdown, explicit server-side aggregates with GROUP BY, lazy/auto/Bulk transports with PK chunking (Bulk blob/base64 compatibility guard), a per-job API quota governor, parent + grandparent relationship STRUCT columns with diagnostics, queryAll (archived + deleted), Tooling-API fast schema, and metadata helpers (manual cache refresh, picklist values, record types).

Maintainer(s): flozer

Installing and Loading

INSTALL salesforce FROM community;
LOAD salesforce;

Example

-- Published build is signed: INSTALL salesforce FROM community; LOAD salesforce;
-- Connect with an OAuth refresh-token Connected App.
-- Set SF_CLIENT_ID, SF_CLIENT_SECRET, SF_REFRESH_TOKEN, and optionally SF_LOGIN_URL.
ATTACH 'salesforce://myorg' AS sf (TYPE salesforce, auth_source 'env');
SELECT Id, Name FROM sf.Account WHERE Name = 'Acme' LIMIT 10;

About salesforce

duckdb-salesforce attaches a Salesforce org as a read-only DuckDB catalog. Tables map to sObjects; SELECT runs over REST /query (or /queryAll for archived + soft-deleted rows), Bulk API 2.0 (lazy-streamed, optional parallel PK chunking), or an auto-selected transport, with SOQL projection + predicate pushdown and COUNT pushdown. salesforce_aggregate() runs explicit server-side SOQL aggregates (MIN/MAX/SUM/AVG/COUNT, optional filter and GROUP BY) without dragging rows down, and salesforce_relationships() reports parent/grandparent expansion. Read-only metadata helpers are available too: salesforce_refresh_metadata() (manual per-ATTACH cache refresh), salesforce_picklist_values(), and salesforce_record_types(). A Bulk blob/base64 compatibility guard keeps incompatible scans off Bulk, and blob bodies / epoch datetimes have clear documented limitations. Opt-in parent- and grandparent-relationship STRUCT columns, a per-job API quota governor, and Tooling-API fast schema discovery round it out. Authentication is OAuth 2.0 refresh-token or JWT bearer, with credentials from inline options, environment variables, or an SFDX auth URL; credentials stay in memory and are never logged; TLS server-certificate verification is always on. Read-only: all mutating catalog operations throw.

Added Functions

function_name function_type description comment examples
salesforce_aggregate table NULL NULL  
salesforce_decode table NULL NULL  
salesforce_describe table NULL NULL  
salesforce_describe_calls table NULL NULL  
salesforce_global_describe_calls table NULL NULL  
salesforce_last_bulk_create_body table NULL NULL  
salesforce_last_quota table NULL NULL  
salesforce_last_scan_pages table NULL NULL  
salesforce_last_soql table NULL NULL  
salesforce_last_transport table NULL NULL  
salesforce_picklist_values table NULL NULL  
salesforce_query table NULL NULL  
salesforce_query_cost table NULL NULL  
salesforce_record_types table NULL NULL  
salesforce_refresh_metadata table NULL NULL  
salesforce_relationships table NULL NULL  
salesforce_tooling_calls table NULL NULL  
sf_url_encode scalar NULL NULL  

Overloaded Functions

This extension does not add any function overloads.

Added Types

This extension does not add any types.

Added Settings

name description input_type scope aliases
sf_auto_bulk_threshold For sf_force_transport='auto': estimated row count above which Bulk is chosen over REST (default 50000). BIGINT GLOBAL []
sf_auto_probe For sf_force_transport='auto': run the COUNT() row-count probe (default true). When false, 'auto' always resolves to REST. BOOLEAN GLOBAL []
sf_bulk_chunks Bulk PK chunking: split a Bulk scan into N disjoint Id ranges (1 = off, default; capped at 8). Sequential in this cut. Bulk transport only. BIGINT GLOBAL []
sf_force_transport Scan transport: 'rest' (default), 'bulk' (Bulk API 2.0), or 'auto' (choose by row-count probe). 'bulk' is for large extractions / CREATE TABLE AS / COPY; same SOQL (projection + predicate pushdown) either way. VARCHAR GLOBAL []
sf_mock_bulk_create_body TEST ONLY. Bulk job-create response body. VARCHAR GLOBAL []
sf_mock_bulk_create_status TEST ONLY. Bulk job-create HTTP status. BIGINT GLOBAL []
sf_mock_bulk_results_body TEST ONLY. Bulk results CSV page(s) ('|~|' per page). VARCHAR GLOBAL []
sf_mock_bulk_results_locator TEST ONLY. Sforce-Locator per results page (comma-separated; empty = last). VARCHAR GLOBAL []
sf_mock_bulk_results_status TEST ONLY. Bulk results HTTP status(es), CSV. VARCHAR GLOBAL []
sf_mock_bulk_status_body TEST ONLY. Bulk status body/bodies ('|~|' per poll). VARCHAR GLOBAL []
sf_mock_bulk_status_code TEST ONLY. Bulk status HTTP status(es), CSV. VARCHAR GLOBAL []
sf_mock_count_body TEST ONLY. Body for the mocked COUNT() probe (reads totalSize). VARCHAR GLOBAL []
sf_mock_count_status TEST ONLY. Statuses for the mocked COUNT() probe GET. VARCHAR GLOBAL []
sf_mock_describe_body TEST ONLY. Bodies for mocked describe GETs ('|~|'-separated). VARCHAR GLOBAL []
sf_mock_describe_status TEST ONLY. Statuses for mocked describe GETs (e.g. '200', '401,200'). VARCHAR GLOBAL []
sf_mock_env TEST ONLY. Override environment-variable lookup for auth_source env/sfdx_url ('NAME=value;…'). Empty uses the real OS environment. VARCHAR GLOBAL []
sf_mock_limits_body TEST ONLY. Body for the mocked GET /limits. VARCHAR GLOBAL []
sf_mock_limits_status TEST ONLY. Statuses for the mocked GET /limits. VARCHAR GLOBAL []
sf_mock_query_body TEST ONLY. Bodies for mocked query GET pages ('|~|'-separated). VARCHAR GLOBAL []
sf_mock_query_status TEST ONLY. Statuses for mocked query GETs (e.g. '200,200'). VARCHAR GLOBAL []
sf_mock_queryall_body TEST ONLY. Body/pages for the mocked GET /queryAll ('|~|'). VARCHAR GLOBAL []
sf_mock_queryall_status TEST ONLY. Statuses for the mocked GET /queryAll. VARCHAR GLOBAL []
sf_mock_sobjects_body TEST ONLY. Body for mocked global describe (GET /sobjects). VARCHAR GLOBAL []
sf_mock_sobjects_status TEST ONLY. Statuses for mocked global describe (GET /sobjects). VARCHAR GLOBAL []
sf_mock_token_body TEST ONLY. Response body paired with sf_mock_token_status. VARCHAR GLOBAL []
sf_mock_token_status TEST ONLY. HTTP status for a mocked Salesforce token-endpoint response. 0 disables the mock and uses the live transport (default). BIGINT GLOBAL []
sf_mock_tooling_body TEST ONLY. Body/pages for the mocked GET /tooling/query ('|~|'). VARCHAR GLOBAL []
sf_mock_tooling_status TEST ONLY. Statuses for the mocked GET /tooling/query. VARCHAR GLOBAL []
sf_query_mode Read mode: 'query' (default) or 'queryAll' (also returns archived + soft-deleted records). Affects the scan (REST + Bulk) and its probes. VARCHAR GLOBAL []
sf_quota_cache_seconds Quota governor: in-memory TTL for a cached /limits snapshot, per instance_url (default 60; 0 disables caching). BIGINT GLOBAL []
sf_quota_enabled Quota governor: gate Bulk job starts on the org's API quota (default true). false skips /limits and never blocks. BOOLEAN GLOBAL []
sf_quota_enforce Quota governor: block when below reserve (default true). false = consult /limits and report, but proceed (warn-only). BOOLEAN GLOBAL []
sf_quota_fail_open Quota governor: when /limits is unavailable, allow the Bulk job (default true). false blocks with a clear error. BOOLEAN GLOBAL []
sf_quota_min_remaining Quota governor: absolute floor of remaining DailyApiRequests below which Bulk is refused (default 1000). BIGINT GLOBAL []
sf_quota_reserve_pct Quota governor: keep this %% of DailyApiRequests.Max in reserve (default 10). BIGINT GLOBAL []
sf_relationship_depth Parent traversal depth when sf_relationships='parent': 1 (default, parent only) or 2 (also grandparent, nested STRUCT). Capped at 2. BIGINT GLOBAL []
sf_relationships Parent relationship traversal: 'off' (default) or 'parent' (expose each single-target parent as a STRUCT column, e.g. SELECT Account.Name FROM sf.Contact). Polymorphic/child relationships not expanded. VARCHAR GLOBAL []
sf_schema_source Schema discovery: 'describe' (default, REST, authoritative) or 'tooling' (fast batched Tooling API FieldDefinition; falls back to REST describe per object on error/absent/ambiguous type; coarser types; fields default non-filterable unless Tooling marks them filterable). VARCHAR GLOBAL []