Search Shortcut cmd + k | ctrl + k
raster

DuckDB Extension for reading and writing raster files using SQL.

Maintainer(s): ahuarte47

Installing and Loading

INSTALL raster FROM community;
LOAD raster;

Example

SELECT short_name, long_name, help_url FROM RT_Drivers();

┌────────────────┬──────────────────────────────────────────────────────────┬─────────────────────────────────────────────────────┐
   short_name                           long_name                                               help_url                       
    varchar                              varchar                                                 varchar                       
├────────────────┼──────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────┤
 VRT             Virtual Raster                                            https://gdal.org/drivers/raster/vrt.html            
 DERIVED         Derived datasets using VRT pixel functions                https://gdal.org/drivers/raster/derived.html        
 GTiff           GeoTIFF                                                   https://gdal.org/drivers/raster/gtiff.html          
 COG             Cloud optimized GeoTIFF generator                         https://gdal.org/drivers/raster/cog.html            
  ·                       ·                                                                   ·                                
  ·                       ·                                                                   ·                                
  ·                       ·                                                                   ·                                
 EHdr            ESRI .hdr Labelled                                        https://gdal.org/drivers/raster/ehdr.html           
 Zarr            Zarr                                                      NULL                                                
└────────────────┴──────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────┘

SELECT * FROM RT_Read('path/to/raster/file.tif');

┌───────┬───────────┬────────────┬────────────────────────────────┬─────────────────────────┬───────┬────────┬────────┬───────┬───────┬────────────┬────────────┐
  id        x          y                  bbox                       geometry           level  tile_x  tile_y  cols   rows    metadata   databand_1 
 int64   double      double    struct(xmin, ymin, xmax, ymax)   geometry('epsg:25830')  int32  int32   int32   int32  int32     JSON        BLOB    
├───────┼───────────┼────────────┼────────────────────────────────┼─────────────────────────┼───────┼────────┼────────┼───────┼───────┤────────────┤────────────┤
     0  545619.75  4724508.25  {                                    POLYGON ((...))         0       0       0    320      8  {...}       ...        
                                 'xmin': 545539.75,                                                                                                 
                                 'ymin': 4724506.25,                                                                                                
                                 'xmax': 545699.75,                                                                                                 
                                 'ymax': 4724510.25                                                                                                 
                               }                                                                                                                    
├───────┼───────────┼────────────┼────────────────────────────────┼─────────────────────────┼───────┼────────┼────────┼───────┼───────┤────────────┤────────────┤
     1  545619.75  4724504.25  {                                    POLYGON ((...))         0       0       1    320      8  {...}       ...        
                                 'xmin': 545539.75,                                                                                                 
                                 'ymin': 4724502.25,                                                                                                
                                 'xmax': 545699.75,                                                                                                 
                                 'ymax': 4724506.25                                                                                                 
                               }                                                                                                                    
└───────┴───────────┴────────────┴────────────────────────────────┴─────────────────────────┴───────┴────────┴────────┴───────┴───────┴────────────┴────────────┘

-- Example transforming the BLOB data of the bands into arrays of integers, to allow analyzing the raster data:

SELECT
    RT_Cube2ArrayInt32(databand_1, true) AS r,
    RT_Cube2ArrayInt32(databand_2, true) AS g,
    RT_Cube2ArrayInt32(databand_3, true) AS b
FROM
    RT_Read('path/to/raster/file.tif')
;

-- Example to back data from the arrays to BLOBs, to be able to write the data back to a raster file:

SELECT
    RT_Array2Cube(r.values, 'RAW', r.bands, r.cols, r.rows, r.no_data) AS r_cube,
    RT_Array2Cube(g.values, 'RAW', g.bands, g.cols, g.rows, g.no_data) AS g_cube,
    RT_Array2Cube(b.values, 'RAW', b.bands, b.cols, b.rows, b.no_data) AS b_cube,
FROM
    ...
;

-- Example of using COPY to export the data table to a new raster file:

COPY (
    SELECT geometry, databand_1, databand_2, databand_3 FROM RT_Read('./path/to/raster/file.tif')
)
TO './path/to/raster/newfile.tif'
WITH (
    FORMAT RASTER,
    DRIVER 'COG',
    CREATION_OPTIONS ('COMPRESS=LZW','BIGTIFF=YES'),
    RESAMPLING 'nearest',
    ENVELOPE [545539.750, 4724420.250, 545699.750, 4724510.250],
    SRS 'EPSG:25830',
    GEOMETRY_COLUMN 'geometry',
    DATABAND_COLUMNS ['databand_3', 'databand_2', 'databand_1']
);

-- You can combine data band using algebraic operations, set CAST to ARRAY...:

WITH __input AS (
    SELECT
        databand_1 AS red,
        databand_3 AS nir
    FROM
        RT_Read('path/to/raster/file.tif', blocksize_x := 512, blocksize_y := 512)
)
SELECT
    (nir - red) / (nir + red) AS ndvi
FROM
    __input
;

About raster

Extension for DuckDB for reading and writing raster files using SQL.

The main purpose with this extension is to be able to read raster files as tables in DuckDB, and then perform band algebra operations on the raster data using SQL. A raster is just a datacube, and the idea is to be able to manipulate it as such.

The extension uses the GDAL library to read raster files, and it supports a wide range of raster formats.

The RT_Read function is the main function to read raster files, extract metadata, and perform basic operations on the raster data before returning a geoparquet-like DuckDB table. The function takes the path to the raster file as an argument, and other optional parameters, and returns a table with the following columns:

  • id: A unique identifier for each tile.
  • x: The center x coordinate of the tile.
  • y: The center y coordinate of the tile.
  • bbox: The bounding box of the tile, represented as a struct with the following fields: xmin, ymin, xmax, ymax.
  • geometry: The geometry of the tile, represented as a polygon.
  • level: The zoom level of the tile.
  • tile_x: The x coordinate of the tile in the zoom level.
  • tile_y: The y coordinate of the tile in the zoom level.
  • cols: The number of columns in the tile.
  • rows: The number of rows in the tile.
  • metadata: A JSON object with the metadata of the tile.
  • databand_x: A BLOB with the data of the x-th band of the tile. The number of bands and their names can be obtained from the metadata column.

The RT_Read function provides a datacube option that allows to return a single datacube BLOB column with the N-dimensional array with the data of all the bands of the tile, instead of one BLOB per band.

RT_Read supports filter pushdown on the non-BLOB columns, which allows you to prefilter the tiles that are loaded based on their metadata or spatial location. For example, you can filter the tiles that intersect with a certain geometry or that have a certain value in the metadata.

COPY TO with FORMAT RASTER allows to export a table with the same structure as the one returned by RT_Read to a new raster file. You can specify the driver and several creation options:

  • DRIVER: The GDAL driver to use for writing the raster file. For example, GTiff or COG.
  • CREATION_OPTIONS: A list of creation options to pass to the GDAL driver, specified as strings in the format KEY=VALUE.
  • RESAMPLING: The resampling method to use when writing the raster file. For example, nearest, bilinear, cubic, etc.
  • ENVELOPE: The spatial extent of the raster file, specified as a list of four values: [xmin, ymin, xmax, ymax].
  • SRS: The spatial reference system of the raster file, specified as an EPSG code (e.g. EPSG:4326) or a WKT string.
  • GEOMETRY_COLUMN: The name of the column that contains the geometry of the tiles. This column will be used to determine the spatial location and the resolution of the tiles in the output raster file.
  • DATABAND_COLUMNS: A list with the names of the columns that contain the data of the bands. The order of the columns in the list will determine the order of the bands in the output raster file.

Added Functions

function_name function_type description comment examples
RT_Drivers table Returns the list of supported GDAL raster drivers. NULL [SELECT RT_Drivers();]
RT_Read table Reads a raster file and returns a table with the raster data. NULL [SELECT * FROM RT_Read('path/to/raster/file.tif');]
RT_Array2Cube scalar Transforms an array of numeric values into a databand column (BLOB). NULL [SELECT RT_Array2Cube(r.values, 'RAW', r.bands, r.cols, r.rows, r.no_data) FROM …;]
RT_CubeNeg scalar Negate the values in the datacube element-wise. NULL [SELECT RT_CubeNeg(databand_1) FROM …;]
RT_CubeAbs scalar Takes the absolute value of the values in the datacube element-wise. NULL [SELECT RT_CubeAbs(databand_1) FROM …;]
RT_CubeSqrt scalar Takes the square root of the values in the datacube element-wise. NULL [SELECT RT_CubeSqrt(databand_1) FROM …;]
RT_CubeLog scalar Takes the logarithm of the values in the datacube element-wise. NULL [SELECT RT_CubeLog(databand_1) FROM …;]
RT_CubeExp scalar Takes the exponential of the values in the datacube element-wise. NULL [SELECT RT_CubeExp(databand_1) FROM …;]
RT_CubeEqual scalar Return 1 where values in datacube_a are equal to datacube_b or a scalar value, 0 otherwise. NULL [SELECT RT_CubeEqual(databand_1, 10) FROM …;]
RT_CubeNotEqual scalar Return 1 where values in datacube_a are not equal to datacube_b or a scalar value, 0 otherwise. NULL [SELECT RT_CubeNotEqual(databand_1, 10) FROM …;]
RT_CubeGreater scalar Return 1 where values in datacube_a are greater than datacube_b or a scalar value, 0 otherwise. NULL [SELECT RT_CubeGreater(databand_1, 10) FROM …;]
RT_CubeGreaterEqual scalar Return 1 where values in datacube_a are greater than or equal to datacube_b or a scalar value, 0 otherwise. NULL [SELECT RT_CubeGreaterEqual(databand_1, 10) FROM …;]
RT_CubeLess scalar Return 1 where values in datacube_a are less than datacube_b or a scalar value, 0 otherwise. NULL [SELECT RT_CubeLess(databand_1, 10) FROM …;]
RT_CubeLessEqual scalar Return 1 where values in datacube_a are less than or equal to datacube_b or a scalar value, 0 otherwise. NULL [SELECT RT_CubeLessEqual(databand_1, 10) FROM …;]
RT_CubeAdd scalar Add the values in datacube_a to datacube_b or a scalar value element-wise. NULL [SELECT RT_CubeAdd(databand_1, 10) FROM …;]
RT_CubeSubtract scalar Subtract datacube_b or a scalar value from the values in datacube_a element-wise. NULL [SELECT RT_CubeSubtract(databand_1, 10) FROM …;]
RT_CubeMultiply scalar Multiply the values in datacube_a by datacube_b or a scalar value element-wise. NULL [SELECT RT_CubeMultiply(databand_1, 10) FROM …;]
RT_CubeDivide scalar Divide the values in datacube_a by datacube_b or a scalar value element-wise. NULL [SELECT RT_CubeDivide(databand_1, 10) FROM …;]
RT_CubePow scalar Take the power of the values in datacube_a to datacube_b or a scalar value element-wise. NULL [SELECT RT_CubePow(databand_1, 2) FROM …;]
RT_Cube2ArrayInt32 scalar NULL NULL NULL
RT_Cube2TypeUInt64 scalar NULL NULL NULL
RT_Cube2TypeInt8 scalar NULL NULL NULL
RT_Cube2ArrayFloat scalar NULL NULL NULL
RT_Cube2ArrayUInt32 scalar NULL NULL NULL
RT_Cube2ArrayDouble scalar NULL NULL NULL
RT_Cube2TypeUInt8 scalar NULL NULL NULL
RT_Cube2ArrayInt64 scalar NULL NULL NULL
RT_Cube2TypeUInt16 scalar NULL NULL NULL
RT_Cube2TypeDouble scalar NULL NULL NULL
RT_Cube2TypeInt32 scalar NULL NULL NULL
RT_Cube2TypeInt64 scalar NULL NULL NULL
RT_Cube2ArrayInt16 scalar NULL NULL NULL
RT_Cube2ArrayInt8 scalar NULL NULL NULL
RT_Cube2ArrayUInt8 scalar NULL NULL NULL
RT_CubeMod scalar NULL NULL NULL
RT_Cube2TypeFloat scalar NULL NULL NULL
RT_Cube2TypeInt16 scalar NULL NULL NULL
RT_Cube2ArrayUInt16 scalar NULL NULL NULL
RT_Cube2TypeUInt32 scalar NULL NULL NULL
RT_Cube2ArrayUInt64 scalar NULL NULL NULL

Overloaded Functions

function_name function_type description comment examples
- scalar NULL NULL NULL
/ scalar NULL NULL NULL
+ scalar NULL NULL NULL
* scalar NULL NULL NULL
% scalar NULL NULL NULL
^ scalar NULL NULL NULL

Added Types

type_name type_size logical_type type_category internal
RT_BBOX 0 STRUCT COMPOSITE true
RT_DATACUBE 16 BLOB NULL true

Added Settings

This extension does not add any settings.