DuckDB Extension for reading and writing raster files using SQL.
Installing and Loading
INSTALL raster FROM community;
LOAD raster;
Example
SELECT short_name, long_name, help_url FROM RT_Drivers();
┌────────────────┬──────────────────────────────────────────────────────────┬─────────────────────────────────────────────────────┐
│ short_name │ long_name │ help_url │
│ varchar │ varchar │ varchar │
├────────────────┼──────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────┤
│ VRT │ Virtual Raster │ https://gdal.org/drivers/raster/vrt.html │
│ DERIVED │ Derived datasets using VRT pixel functions │ https://gdal.org/drivers/raster/derived.html │
│ GTiff │ GeoTIFF │ https://gdal.org/drivers/raster/gtiff.html │
│ COG │ Cloud optimized GeoTIFF generator │ https://gdal.org/drivers/raster/cog.html │
│ · │ · │ · │
│ · │ · │ · │
│ · │ · │ · │
│ EHdr │ ESRI .hdr Labelled │ https://gdal.org/drivers/raster/ehdr.html │
│ Zarr │ Zarr │ NULL │
└────────────────┴──────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────┘
SELECT * FROM RT_Read('path/to/raster/file.tif');
┌───────┬───────────┬────────────┬────────────────────────────────┬─────────────────────────┬───────┬────────┬────────┬───────┬───────┬────────────┬────────────┐
│ id │ x │ y │ bbox │ geometry │ level │ tile_x │ tile_y │ cols │ rows │ metadata │ databand_1 │
│ int64 │ double │ double │ struct(xmin, ymin, xmax, ymax) │ geometry('epsg:25830') │ int32 │ int32 │ int32 │ int32 │ int32 │ JSON │ BLOB │
├───────┼───────────┼────────────┼────────────────────────────────┼─────────────────────────┼───────┼────────┼────────┼───────┼───────┤────────────┤────────────┤
│ 0 │ 545619.75 │ 4724508.25 │ { │ POLYGON ((...)) │ 0 │ 0 │ 0 │ 320 │ 8 │ {...} │ ... │
│ │ │ │ 'xmin': 545539.75, │ │ │ │ │ │ │ │ │
│ │ │ │ 'ymin': 4724506.25, │ │ │ │ │ │ │ │ │
│ │ │ │ 'xmax': 545699.75, │ │ │ │ │ │ │ │ │
│ │ │ │ 'ymax': 4724510.25 │ │ │ │ │ │ │ │ │
│ │ │ │ } │ │ │ │ │ │ │ │ │
├───────┼───────────┼────────────┼────────────────────────────────┼─────────────────────────┼───────┼────────┼────────┼───────┼───────┤────────────┤────────────┤
│ 1 │ 545619.75 │ 4724504.25 │ { │ POLYGON ((...)) │ 0 │ 0 │ 1 │ 320 │ 8 │ {...} │ ... │
│ │ │ │ 'xmin': 545539.75, │ │ │ │ │ │ │ │ │
│ │ │ │ 'ymin': 4724502.25, │ │ │ │ │ │ │ │ │
│ │ │ │ 'xmax': 545699.75, │ │ │ │ │ │ │ │ │
│ │ │ │ 'ymax': 4724506.25 │ │ │ │ │ │ │ │ │
│ │ │ │ } │ │ │ │ │ │ │ │ │
└───────┴───────────┴────────────┴────────────────────────────────┴─────────────────────────┴───────┴────────┴────────┴───────┴───────┴────────────┴────────────┘
-- Example transforming the BLOB data of the bands into arrays of integers, to allow analyzing the raster data:
SELECT
RT_Cube2ArrayInt32(databand_1, true) AS r,
RT_Cube2ArrayInt32(databand_2, true) AS g,
RT_Cube2ArrayInt32(databand_3, true) AS b
FROM
RT_Read('path/to/raster/file.tif')
;
-- Example to back data from the arrays to BLOBs, to be able to write the data back to a raster file:
SELECT
RT_Array2Cube(r.values, 'RAW', r.bands, r.cols, r.rows, r.no_data) AS r_cube,
RT_Array2Cube(g.values, 'RAW', g.bands, g.cols, g.rows, g.no_data) AS g_cube,
RT_Array2Cube(b.values, 'RAW', b.bands, b.cols, b.rows, b.no_data) AS b_cube,
FROM
...
;
-- Example of using COPY to export the data table to a new raster file:
COPY (
SELECT geometry, databand_1, databand_2, databand_3 FROM RT_Read('./path/to/raster/file.tif')
)
TO './path/to/raster/newfile.tif'
WITH (
FORMAT RASTER,
DRIVER 'COG',
CREATION_OPTIONS ('COMPRESS=LZW','BIGTIFF=YES'),
RESAMPLING 'nearest',
ENVELOPE [545539.750, 4724420.250, 545699.750, 4724510.250],
SRS 'EPSG:25830',
GEOMETRY_COLUMN 'geometry',
DATABAND_COLUMNS ['databand_3', 'databand_2', 'databand_1']
);
-- You can combine data band using algebraic operations, set CAST to ARRAY...:
WITH __input AS (
SELECT
databand_1 AS red,
databand_3 AS nir
FROM
RT_Read('path/to/raster/file.tif', blocksize_x := 512, blocksize_y := 512)
)
SELECT
(nir - red) / (nir + red) AS ndvi
FROM
__input
;
About raster
Extension for DuckDB for reading and writing raster files using SQL.
The main purpose with this extension is to be able to read raster files as tables in DuckDB, and then perform band algebra operations on the raster data using SQL. A raster is just a datacube, and the idea is to be able to manipulate it as such.
The extension uses the GDAL library to read raster files, and it supports a wide range of raster formats.
The RT_Read function is the main function to read raster files, extract metadata, and perform basic operations on the raster data before returning a geoparquet-like
DuckDB table. The function takes the path to the raster file as an argument, and other optional parameters, and returns a table with the following columns:
id: A unique identifier for each tile.x: The center x coordinate of the tile.y: The center y coordinate of the tile.bbox: The bounding box of the tile, represented as a struct with the following fields:xmin,ymin,xmax,ymax.geometry: The geometry of the tile, represented as a polygon.level: The zoom level of the tile.tile_x: The x coordinate of the tile in the zoom level.tile_y: The y coordinate of the tile in the zoom level.cols: The number of columns in the tile.rows: The number of rows in the tile.metadata: A JSON object with the metadata of the tile.databand_x: A BLOB with the data of the x-th band of the tile. The number of bands and their names can be obtained from themetadatacolumn.
The RT_Read function provides a datacube option that allows to return a single datacube BLOB column with the N-dimensional array with the
data of all the bands of the tile, instead of one BLOB per band.
RT_Read supports filter pushdown on the non-BLOB columns, which allows you to prefilter the tiles that are loaded based on their metadata or
spatial location. For example, you can filter the tiles that intersect with a certain geometry or that have a certain value in the metadata.
COPY TO with FORMAT RASTER allows to export a table with the same structure as the one returned by RT_Read to a new raster file.
You can specify the driver and several creation options:
DRIVER: The GDAL driver to use for writing the raster file. For example,GTifforCOG.CREATION_OPTIONS: A list of creation options to pass to the GDAL driver, specified as strings in the formatKEY=VALUE.RESAMPLING: The resampling method to use when writing the raster file. For example,nearest,bilinear,cubic, etc.ENVELOPE: The spatial extent of the raster file, specified as a list of four values:[xmin, ymin, xmax, ymax].SRS: The spatial reference system of the raster file, specified as an EPSG code (e.g.EPSG:4326) or a WKT string.GEOMETRY_COLUMN: The name of the column that contains the geometry of the tiles. This column will be used to determine the spatial location and the resolution of the tiles in the output raster file.DATABAND_COLUMNS: A list with the names of the columns that contain the data of the bands. The order of the columns in the list will determine the order of the bands in the output raster file.
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| RT_Drivers | table | Returns the list of supported GDAL raster drivers. | NULL | [SELECT RT_Drivers();] |
| RT_Read | table | Reads a raster file and returns a table with the raster data. | NULL | [SELECT * FROM RT_Read('path/to/raster/file.tif');] |
| RT_Array2Cube | scalar | Transforms an array of numeric values into a databand column (BLOB). | NULL | [SELECT RT_Array2Cube(r.values, 'RAW', r.bands, r.cols, r.rows, r.no_data) FROM …;] |
| RT_CubeNeg | scalar | Negate the values in the datacube element-wise. | NULL | [SELECT RT_CubeNeg(databand_1) FROM …;] |
| RT_CubeAbs | scalar | Takes the absolute value of the values in the datacube element-wise. | NULL | [SELECT RT_CubeAbs(databand_1) FROM …;] |
| RT_CubeSqrt | scalar | Takes the square root of the values in the datacube element-wise. | NULL | [SELECT RT_CubeSqrt(databand_1) FROM …;] |
| RT_CubeLog | scalar | Takes the logarithm of the values in the datacube element-wise. | NULL | [SELECT RT_CubeLog(databand_1) FROM …;] |
| RT_CubeExp | scalar | Takes the exponential of the values in the datacube element-wise. | NULL | [SELECT RT_CubeExp(databand_1) FROM …;] |
| RT_CubeEqual | scalar | Return 1 where values in datacube_a are equal to datacube_b or a scalar value, 0 otherwise. | NULL | [SELECT RT_CubeEqual(databand_1, 10) FROM …;] |
| RT_CubeNotEqual | scalar | Return 1 where values in datacube_a are not equal to datacube_b or a scalar value, 0 otherwise. | NULL | [SELECT RT_CubeNotEqual(databand_1, 10) FROM …;] |
| RT_CubeGreater | scalar | Return 1 where values in datacube_a are greater than datacube_b or a scalar value, 0 otherwise. | NULL | [SELECT RT_CubeGreater(databand_1, 10) FROM …;] |
| RT_CubeGreaterEqual | scalar | Return 1 where values in datacube_a are greater than or equal to datacube_b or a scalar value, 0 otherwise. | NULL | [SELECT RT_CubeGreaterEqual(databand_1, 10) FROM …;] |
| RT_CubeLess | scalar | Return 1 where values in datacube_a are less than datacube_b or a scalar value, 0 otherwise. | NULL | [SELECT RT_CubeLess(databand_1, 10) FROM …;] |
| RT_CubeLessEqual | scalar | Return 1 where values in datacube_a are less than or equal to datacube_b or a scalar value, 0 otherwise. | NULL | [SELECT RT_CubeLessEqual(databand_1, 10) FROM …;] |
| RT_CubeAdd | scalar | Add the values in datacube_a to datacube_b or a scalar value element-wise. | NULL | [SELECT RT_CubeAdd(databand_1, 10) FROM …;] |
| RT_CubeSubtract | scalar | Subtract datacube_b or a scalar value from the values in datacube_a element-wise. | NULL | [SELECT RT_CubeSubtract(databand_1, 10) FROM …;] |
| RT_CubeMultiply | scalar | Multiply the values in datacube_a by datacube_b or a scalar value element-wise. | NULL | [SELECT RT_CubeMultiply(databand_1, 10) FROM …;] |
| RT_CubeDivide | scalar | Divide the values in datacube_a by datacube_b or a scalar value element-wise. | NULL | [SELECT RT_CubeDivide(databand_1, 10) FROM …;] |
| RT_CubePow | scalar | Take the power of the values in datacube_a to datacube_b or a scalar value element-wise. | NULL | [SELECT RT_CubePow(databand_1, 2) FROM …;] |
| RT_Cube2ArrayInt32 | scalar | NULL | NULL | NULL |
| RT_Cube2TypeUInt64 | scalar | NULL | NULL | NULL |
| RT_Cube2TypeInt8 | scalar | NULL | NULL | NULL |
| RT_Cube2ArrayFloat | scalar | NULL | NULL | NULL |
| RT_Cube2ArrayUInt32 | scalar | NULL | NULL | NULL |
| RT_Cube2ArrayDouble | scalar | NULL | NULL | NULL |
| RT_Cube2TypeUInt8 | scalar | NULL | NULL | NULL |
| RT_Cube2ArrayInt64 | scalar | NULL | NULL | NULL |
| RT_Cube2TypeUInt16 | scalar | NULL | NULL | NULL |
| RT_Cube2TypeDouble | scalar | NULL | NULL | NULL |
| RT_Cube2TypeInt32 | scalar | NULL | NULL | NULL |
| RT_Cube2TypeInt64 | scalar | NULL | NULL | NULL |
| RT_Cube2ArrayInt16 | scalar | NULL | NULL | NULL |
| RT_Cube2ArrayInt8 | scalar | NULL | NULL | NULL |
| RT_Cube2ArrayUInt8 | scalar | NULL | NULL | NULL |
| RT_CubeMod | scalar | NULL | NULL | NULL |
| RT_Cube2TypeFloat | scalar | NULL | NULL | NULL |
| RT_Cube2TypeInt16 | scalar | NULL | NULL | NULL |
| RT_Cube2ArrayUInt16 | scalar | NULL | NULL | NULL |
| RT_Cube2TypeUInt32 | scalar | NULL | NULL | NULL |
| RT_Cube2ArrayUInt64 | scalar | NULL | NULL | NULL |
Overloaded Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| - | scalar | NULL | NULL | NULL |
| / | scalar | NULL | NULL | NULL |
| + | scalar | NULL | NULL | NULL |
| * | scalar | NULL | NULL | NULL |
| % | scalar | NULL | NULL | NULL |
| ^ | scalar | NULL | NULL | NULL |
Added Types
| type_name | type_size | logical_type | type_category | internal |
|---|---|---|---|---|
| RT_BBOX | 0 | STRUCT | COMPOSITE | true |
| RT_DATACUBE | 16 | BLOB | NULL | true |
Added Settings
This extension does not add any settings.