Vector Read Benchmarks
Source: vignettes/articles/vector-read-benchmarks.Rmd
Vector read benchmarks for R package gdalraster
(last updated 2025-12-14)
The following benchmarks follow the format of those described in GDAL RFC 86: Column-oriented read API for vector layers, and use the same dataset. The timings here cannot be compared directly with the GDAL timings because the hardware differs (the hardware used for the GDAL benchmarks is not specified). The benchmarks are intended as a sanity check on read performance, in the context of the ranges seen for two different I/O methods: traditional GetNextFeature() iteration and the column-oriented ArrowArrayStream. The GDAL benchmarks cover multiple implementations in plain C++ and with Python libraries; the benchmarks here add implementations using the R packages described below.
The C++ benchmark programs used in RFC 86 are also included here to measure baseline raw performance under the same hardware and software environment. When comparing timings between the .cpp and .R programs in their current forms, note that the two C++ programs read all 3.3 million features in the test dataset but do not populate a data structure, whereas the R programs populate a data frame object and return it.
Hardware
Dell XPS 13 laptop 9350 (2024 edition), Intel® Core™ Ultra 7 258V x 8 cores up to 4.8 GHz Lunar Lake CPU, 32 GB LPDDR5X RAM 8533 MT/s integrated, 1 TB SSD
Software environment
Ubuntu Linux 24.04.3 LTS, R 4.5.2 (2025-10-31), GDAL 3.12.0 (released 2025-11-03) local build, gdalraster 2.3.0.9007, sf 1.0.23
Vector data
NZ Building Outlines, https://data.linz.govt.nz/layer/101290-nz-building-outlines/, from Land Information New Zealand: “This dataset provides current outlines of buildings within mainland New Zealand captured from the latest aerial imagery.”
Tests used the GeoPackage file nz-building-outlines.gpkg (1.5 GB). The layer contains 3.3 million features, each with 13 attribute fields (2 fields of type Integer, 8 of type String, 3 of type DateTime) and polygon geometries.
Benchmark programs
Each program reads all features from the layer. The R programs also
populate a data frame, while the two C++ programs do not populate a
data structure. Code for the programs, along with output generated by
reprex::reprex() for the R programs, is given in a separate section
further below.
bench_ogr.cpp
Benchmark C++ program used in RFC 86, re-run here for comparisons
within the same hardware/software environment. Uses the traditional
OGRLayer::GetNextFeature() and related API, called through the GDAL
C API. Reads each feature but does not populate a data structure,
representing baseline raw performance.
bench_gdalraster_fetch.R
Uses the class method GDALVector$fetch() in
gdalraster for traditional row-level reading, implemented in
C++ by iterating over features with OGRLayer::GetNextFeature()
in the GDAL API. The method is an analog of the function
DBI::dbFetch() in the DBI R package.
bench_gdalraster_fetch_conv_to_sf.R
The same as bench_gdalraster_fetch.R (traditional row-level access)
but with conversion to a classed sf data frame via
sf::st_sf() included in the timing.
bench_sf_read_sf.R
Traditional row-level read using the function
sf::read_sf() in package sf. Populates a classed data frame,
with geometries contained in a classed list column.
bench_ogr_batch.cpp
Benchmark C++ program used in RFC 86, re-run here for comparisons
within the same hardware/software environment. Uses the
GetNextRecordBatch() API from C++ with GDAL >= 3.6 (the program
listed below calls OGR_L_GetArrowStream() and the stream's get_next()).
Reads each feature but does not populate a data structure, representing
baseline raw performance with the Array Stream interface.
bench_gdalraster_arrow_stream.R (requires GDAL >= 3.6)
Uses the class method GDALVector$getArrowStream() in
gdalraster to expose an Arrow C stream on the layer as
a nanoarrow_array_stream object (external pointer to an
ArrowArrayStream). Provides direct access to the stream object and
retrieves features in a column-oriented memory layout. The required
package nanoarrow provides S3 methods for
as.data.frame() to import a nanoarrow_array
(one batch at a time), or the nanoarrow_array_stream itself
(pulling all batches in the stream).
bench_sf_read_sf_use_stream.R (requires GDAL >= 3.6)
Uses sf::read_sf() with argument
use_stream = TRUE: “use the experimental columnar interface
introduced in GDAL 3.6”.
Timings
| Bench program | Time (s) | Data frame class | Geom list column |
|---|---|---|---|
| bench_ogr.cpp | 2.63 | none | n/a |
| bench_gdalraster_fetch.R | 11.20 | OGRFeatureSet | WKB raw vectors |
| bench_gdalraster_fetch_conv_to_sf.R | 25.01 | sf | classed sfc |
| bench_sf_read_sf.R | 77.24 | sf | classed sfc |
| bench_ogr_batch.cpp | 0.45 | none | n/a |
| bench_gdalraster_arrow_stream.R | 2.87 | base data.frame | WKB raw vectors |
| bench_sf_read_sf_use_stream.R | 11.17 | sf | classed sfc |
Code
bench_ogr.cpp (RFC 86 benchmark)
// https://gdal.org/en/stable/development/rfc/rfc86_column_oriented_api.html#benchmarks
#include "gdal_priv.h"
#include "ogr_api.h"
#include "ogrsf_frmts.h"
int main(int argc, char* argv[])
{
    GDALAllRegister();
    GDALDataset* poDS = GDALDataset::Open(argv[1]);
    OGRLayer* poLayer = poDS->GetLayer(0);
    OGRLayerH hLayer = OGRLayer::ToHandle(poLayer);
    OGRFeatureDefnH hFDefn = OGR_L_GetLayerDefn(hLayer);
    int nFields = OGR_FD_GetFieldCount(hFDefn);
    std::vector<OGRFieldType> aeTypes;
    for( int i = 0; i < nFields; i++ )
        aeTypes.push_back(OGR_Fld_GetType(OGR_FD_GetFieldDefn(hFDefn, i)));
    int nYear, nMonth, nDay, nHour, nMin, nSecond, nTZ;
    while( true )
    {
        OGRFeatureH hFeat = OGR_L_GetNextFeature(hLayer);
        if( hFeat == nullptr )
            break;
        OGR_F_GetFID(hFeat);
        for( int i = 0; i < nFields; i++ )
        {
            if( aeTypes[i] == OFTInteger )
                OGR_F_GetFieldAsInteger(hFeat, i);
            else if( aeTypes[i] == OFTInteger64 )
                OGR_F_GetFieldAsInteger64(hFeat, i);
            else if( aeTypes[i] == OFTReal )
                OGR_F_GetFieldAsDouble(hFeat, i);
            else if( aeTypes[i] == OFTString )
                OGR_F_GetFieldAsString(hFeat, i);
            else if( aeTypes[i] == OFTDateTime )
                OGR_F_GetFieldAsDateTime(hFeat, i, &nYear, &nMonth, &nDay, &nHour, &nMin, &nSecond, &nTZ);
        }
        OGRGeometryH hGeom = OGR_F_GetGeometryRef(hFeat);
        if( hGeom )
        {
            int size = OGR_G_WkbSize(hGeom);
            GByte* pabyWKB = static_cast<GByte*>(malloc(size));
            OGR_G_ExportToIsoWkb( hGeom, wkbNDR, pabyWKB);
            CPLFree(pabyWKB);
        }
        OGR_F_Destroy(hFeat);
    }
    delete poDS;
    return 0;
}
bench_gdalraster_fetch.R
library(gdalraster)
#> GDAL 3.10.3 (released 2025-04-01), GEOS 3.12.2, PROJ 9.4.1
f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
(lyr <- new(GDALVector, f))
#> C++ object of class GDALVector
#> Driver : GeoPackage (GPKG)
#> DSN : /home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg
#> Layer : nz_building_outlines
#> CRS : NZGD2000 / New Zealand Transverse Mercator 2000 (EPSG:2193)
#> Geom : MULTIPOLYGON
lyr$getFeatureCount()
#> [1] 3289574
system.time(d <- lyr$fetch(-1))
#> user system elapsed
#> 10.383 0.702 11.200
(nrow(d) == lyr$getFeatureCount())
#> [1] TRUE
head(d)
#> OGR feature set
#> FID building_id name use suburb_locality town_city territorial_authority
#> 1 1 2292051 Unknown Marton Marton Rangitikei District
#> 2 2 2292353 Unknown Durie Hill Whanganui Whanganui District
#> 3 3 2292407 Unknown Durie Hill Whanganui Whanganui District
#> 4 4 2292675 Unknown Feilding Feilding Manawatu District
#> 5 5 2292771 Unknown Feilding Feilding Manawatu District
#> 6 6 2292825 Unknown Feilding Feilding Manawatu District
#> capture_method capture_source_group capture_source_id
#> 1 Feature Extraction NZ Aerial Imagery 1042
#> 2 Feature Extraction NZ Aerial Imagery 1042
#> 3 Feature Extraction NZ Aerial Imagery 1042
#> 4 Feature Extraction NZ Aerial Imagery 1042
#> 5 Feature Extraction NZ Aerial Imagery 1042
#> 6 Feature Extraction NZ Aerial Imagery 1042
#> capture_source_name capture_source_from
#> 1 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 2 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 3 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 4 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 5 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 6 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> capture_source_to last_modified geom
#> 1 2016-04-21 2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 2 2016-04-21 2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 3 2016-04-21 2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 4 2016-04-21 2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 5 2016-04-21 2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
#> 6 2016-04-21 2019-01-04 WKB MULTIPOLYGON: raw 01 06 00 00 ...
lyr$close()
Created on 2025-12-13 with reprex v2.1.1
bench_gdalraster_fetch_conv_to_sf.R
library(gdalraster)
#> GDAL 3.10.3 (released 2025-04-01), GEOS 3.12.2, PROJ 9.4.1
f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
(lyr <- new(GDALVector, f))
#> C++ object of class GDALVector
#> Driver : GeoPackage (GPKG)
#> DSN : /home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg
#> Layer : nz_building_outlines
#> CRS : NZGD2000 / New Zealand Transverse Mercator 2000 (EPSG:2193)
#> Geom : MULTIPOLYGON
lyr$getFeatureCount()
#> [1] 3289574
system.time({
  d <- lyr$fetch(-1)
  d <- sf::st_sf(d, crs = lyr$getSpatialRef())
})
#> user system elapsed
#> 23.533 1.395 25.010
(nrow(d) == lyr$getFeatureCount())
#> [1] TRUE
head(d)
#> Simple feature collection with 6 features and 14 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: 1776318 ymin: 5544066 xmax: 1818438 ymax: 5576891
#> Projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
#> FID building_id name use suburb_locality town_city territorial_authority
#> 1 1 2292051 Unknown Marton Marton Rangitikei District
#> 2 2 2292353 Unknown Durie Hill Whanganui Whanganui District
#> 3 3 2292407 Unknown Durie Hill Whanganui Whanganui District
#> 4 4 2292675 Unknown Feilding Feilding Manawatu District
#> 5 5 2292771 Unknown Feilding Feilding Manawatu District
#> 6 6 2292825 Unknown Feilding Feilding Manawatu District
#> capture_method capture_source_group capture_source_id
#> 1 Feature Extraction NZ Aerial Imagery 1042
#> 2 Feature Extraction NZ Aerial Imagery 1042
#> 3 Feature Extraction NZ Aerial Imagery 1042
#> 4 Feature Extraction NZ Aerial Imagery 1042
#> 5 Feature Extraction NZ Aerial Imagery 1042
#> 6 Feature Extraction NZ Aerial Imagery 1042
#> capture_source_name capture_source_from
#> 1 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 2 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 3 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 4 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 5 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 6 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> capture_source_to last_modified geom
#> 1 2016-04-21 2019-01-04 MULTIPOLYGON (((1796394 556...
#> 2 2016-04-21 2019-01-04 MULTIPOLYGON (((1776394 557...
#> 3 2016-04-21 2019-01-04 MULTIPOLYGON (((1776322 557...
#> 4 2016-04-21 2019-01-04 MULTIPOLYGON (((1818268 554...
#> 5 2016-04-21 2019-01-04 MULTIPOLYGON (((1818172 554...
#> 6 2016-04-21 2019-01-04 MULTIPOLYGON (((1818436 554...
lyr$close()
Created on 2025-12-13 with reprex v2.1.1
bench_sf_read_sf.R
library(sf)
#> Linking to GEOS 3.12.2, GDAL 3.10.3, PROJ 9.4.1; sf_use_s2() is TRUE
f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
system.time(d <- read_sf(f, "nz_building_outlines"))
#> user system elapsed
#> 74.969 2.079 77.238
nrow(d)
#> [1] 3289574
head(d)
#> Simple feature collection with 6 features and 13 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: 1776318 ymin: 5544066 xmax: 1818438 ymax: 5576891
#> Projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
#> # A tibble: 6 × 14
#> building_id name use suburb_locality town_city territorial_authority
#> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 2292051 "" Unknown Marton Marton Rangitikei District
#> 2 2292353 "" Unknown Durie Hill Whanganui Whanganui District
#> 3 2292407 "" Unknown Durie Hill Whanganui Whanganui District
#> 4 2292675 "" Unknown Feilding Feilding Manawatu District
#> 5 2292771 "" Unknown Feilding Feilding Manawatu District
#> 6 2292825 "" Unknown Feilding Feilding Manawatu District
#> # ℹ 8 more variables: capture_method <chr>, capture_source_group <chr>,
#> # capture_source_id <int>, capture_source_name <chr>,
#> # capture_source_from <date>, capture_source_to <date>, last_modified <date>,
#> # geom <MULTIPOLYGON [m]>
Created on 2025-12-13 with reprex v2.1.1
bench_ogr_batch.cpp (RFC 86 benchmark)
// https://gdal.org/en/stable/development/rfc/rfc86_column_oriented_api.html#benchmarks
#include "gdal_priv.h"
#include "ogr_api.h"
#include "ogrsf_frmts.h"
#include "ogr_recordbatch.h"
int main(int argc, char* argv[])
{
    GDALAllRegister();
    GDALDataset* poDS = GDALDataset::Open(argv[1]);
    OGRLayer* poLayer = poDS->GetLayer(0);
    OGRLayerH hLayer = OGRLayer::ToHandle(poLayer);
    struct ArrowArrayStream stream;
    if( !OGR_L_GetArrowStream(hLayer, &stream, nullptr))
    {
        CPLError(CE_Failure, CPLE_AppDefined, "OGR_L_GetArrowStream() failed\n");
        exit(1);
    }
    while( true )
    {
        struct ArrowArray array;
        if( stream.get_next(&stream, &array) != 0 ||
            array.release == nullptr )
        {
            break;
        }
        array.release(&array);
    }
    stream.release(&stream);
    delete poDS;
    return 0;
}
bench_gdalraster_arrow_stream.R
library(gdalraster)
#> GDAL 3.10.3 (released 2025-04-01), GEOS 3.12.2, PROJ 9.4.1
f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
(lyr <- new(GDALVector, f))
#> C++ object of class GDALVector
#> Driver : GeoPackage (GPKG)
#> DSN : /home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg
#> Layer : nz_building_outlines
#> CRS : NZGD2000 / New Zealand Transverse Mercator 2000 (EPSG:2193)
#> Geom : MULTIPOLYGON
lyr$getFeatureCount()
#> [1] 3289574
lyr$testCapability()$FastGetArrowStream
#> [1] TRUE
options(nanoarrow.warn_unregistered_extension = FALSE)
(stream <- lyr$getArrowStream())
#> <nanoarrow_array_stream struct<fid: int64, building_id: int32, name: string, use: string, suburb_locality: string, town_city: string, territorial_authority: string, capture_method: string, capture_source_group: string, capture_source_id: int32, capture_source_name: string, capture_source_from: date32, capture_source_to: date32, last_modified: date32, geom: ogc.wkb{binary}>>
#> $ get_schema:function ()
#> $ get_next :function (schema = x$get_schema(), validate = TRUE)
#> $ release :function ()
system.time(d <- as.data.frame(stream))
#> user system elapsed
#> 3.223 0.856 2.870
stream$release()
(nrow(d) == lyr$getFeatureCount())
#> [1] TRUE
head(d)
#> fid building_id name use suburb_locality town_city territorial_authority
#> 1 1 2292051 Unknown Marton Marton Rangitikei District
#> 2 2 2292353 Unknown Durie Hill Whanganui Whanganui District
#> 3 3 2292407 Unknown Durie Hill Whanganui Whanganui District
#> 4 4 2292675 Unknown Feilding Feilding Manawatu District
#> 5 5 2292771 Unknown Feilding Feilding Manawatu District
#> 6 6 2292825 Unknown Feilding Feilding Manawatu District
#> capture_method capture_source_group capture_source_id
#> 1 Feature Extraction NZ Aerial Imagery 1042
#> 2 Feature Extraction NZ Aerial Imagery 1042
#> 3 Feature Extraction NZ Aerial Imagery 1042
#> 4 Feature Extraction NZ Aerial Imagery 1042
#> 5 Feature Extraction NZ Aerial Imagery 1042
#> 6 Feature Extraction NZ Aerial Imagery 1042
#> capture_source_name capture_source_from
#> 1 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 2 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 3 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 4 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 5 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> 6 Manawatu Whanganui 0.3m Rural Aerial Photos (2015-2016) 2015-12-27
#> capture_source_to last_modified geom
#> 1 2016-04-21 2019-01-04 blob[102 B]
#> 2 2016-04-21 2019-01-04 blob[102 B]
#> 3 2016-04-21 2019-01-04 blob[230 B]
#> 4 2016-04-21 2019-01-04 blob[102 B]
#> 5 2016-04-21 2019-01-04 blob[118 B]
#> 6 2016-04-21 2019-01-04 blob[102 B]
lyr$close()
Created on 2025-12-13 with reprex v2.1.1
bench_sf_read_sf_use_stream.R
library(sf)
#> Linking to GEOS 3.12.2, GDAL 3.10.3, PROJ 9.4.1; sf_use_s2() is TRUE
f <- '/home/ctoney/data/gis/nz-building-outlines/nz-building-outlines.gpkg'
system.time(d <- read_sf(f, "nz_building_outlines", use_stream = TRUE))
#> user system elapsed
#> 10.876 1.453 11.168
nrow(d)
#> [1] 3289574
head(d)
#> Simple feature collection with 6 features and 13 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: 1776318 ymin: 5544066 xmax: 1818438 ymax: 5576891
#> Projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
#> # A tibble: 6 × 14
#> building_id name use suburb_locality town_city territorial_authority
#> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 2292051 "" Unknown Marton Marton Rangitikei District
#> 2 2292353 "" Unknown Durie Hill Whanganui Whanganui District
#> 3 2292407 "" Unknown Durie Hill Whanganui Whanganui District
#> 4 2292675 "" Unknown Feilding Feilding Manawatu District
#> 5 2292771 "" Unknown Feilding Feilding Manawatu District
#> 6 2292825 "" Unknown Feilding Feilding Manawatu District
#> # ℹ 8 more variables: capture_method <chr>, capture_source_group <chr>,
#> # capture_source_id <int>, capture_source_name <chr>,
#> # capture_source_from <date>, capture_source_to <date>, last_modified <date>,
#> # geom <MULTIPOLYGON [m]>
Created on 2025-12-13 with reprex v2.1.1