
Hugh

@hughagraham.bsky.social

42 Followers  |  153 Following  |  15 Posts  |  Joined: 17.08.2025

Posts by Hugh (@hughagraham.bsky.social)

panel figure showing plots of a5 traversal functions coloured by distance from origin cell.


Getting very into a5 right now, it's such a clever geospatial index when you get into it! A bunch of new functions have landed in {a5R}, including new traversal functions and paired cell distance calculation. belian-earth.github.io/a5R/articles... #rspatial #rstats

05.03.2026 15:11 - 👍 2    🔁 0    💬 0    📌 0
HTTP streaming and Server-Sent Events in R with nanonext - sse.R

HTTP streaming and Server-Sent Events in R with nanonext. Here's an example: gist.github.com/jrosell/178e...

02.03.2026 20:43 - 👍 8    🔁 4    💬 1    📌 0

πŸŒπŸ€– New update on the PRISM project: spatial ML validation, model errors, improving R packages (incl. supercells), and contributing to open geocomputation resources.

jakubnowosad.com/posts/2026-0...

#MachineLearning #RStats #RSpatial #OpenScience

01.03.2026 15:01 - 👍 10    🔁 2    💬 0    📌 0

This is certainly true!

23.02.2026 20:08 - 👍 0    🔁 0    💬 0    📌 0

That's awesome, how did my googling not find this! Thanks for sharing 🙏

23.02.2026 19:52 - 👍 1    🔁 0    💬 1    📌 0
GitHub - euctrl-pru/a5r Contribute to euctrl-pru/a5r development by creating an account on GitHub.

Done the same last month
github.com/euctrl-pru/a5r

23.02.2026 18:34 - 👍 1    🔁 1    💬 1    📌 0
SMT (@bioinfhotep@genomic.social) Attached: 4 images #Genomics #Bioinformatics Release of duckhts: #htslib based #Duckdb Extension for High Throughput Sequencing File Formats https://duckdb.org/community_extensions/extensions/duckh...

#Duckdb #htslib #Genomics #Bioinformatics #RStats

duckhts: Read HTS (VCF/BCF/BAM/CRAM/FASTA/FASTQ/GTF/GFF) files in DuckDB via htslib

Rduckhts: 'DuckDB' High Throughput Sequencing File Formats Reader Extension
genomic.social/@bioinfhotep...

21.02.2026 23:04 - 👍 2    🔁 1    💬 0    📌 0

{controlledburn} fulfils a wish for fasterize polygon rasterization in "abstract mode". A unicorn combination of exactextract+fasterize, and a wildly compact rep. of run lengths for fill, weights for individual partial cover, and dense storage ...

20.02.2026 09:48 - 👍 3    🔁 2    💬 1    📌 0
abstract procedural art in a dark blue and white palette, comprised of thousands of flowing ribbon shapes in the middle of the field against a dark blue background. it feels vaguely organic


rising from the depths #rstats

20.02.2026 21:26 - 👍 25    🔁 5    💬 0    📌 0

My first actual stats #rstats package... Another #rust wrapper - this time for the insanely performant petal-clustering crate. So far it seems super fast but would love feedback if you have any! ✌️ github.com/belian-earth...
Includes DBSCAN, HDBSCAN, and OPTICS algorithms.

20.02.2026 17:02 - 👍 5    🔁 1    💬 0    📌 0
GitHub - belian-earth/petalcluster: R bindings to the petal-clustering rust crate. Contribute to belian-earth/petalcluster development by creating an account on GitHub.

github.com/belian-earth...

20.02.2026 17:01 - 👍 1    🔁 0    💬 0    📌 0

#RStats When i was looking at some of the perf in quickr readme, i was thinking oh these tails are R's garbage collection pauses, the R-to-C (LLM enabled) rabbit hole i went in makes the pauses worse because it is really R to R's C API and sometimes calling into R via Rf_eval :D

19.02.2026 21:22 - 👍 0    🔁 2    💬 0    📌 0

Present at @cascadiarconf.bsky.social !!!! I can't wait to meet y'all there!

#rstats

19.02.2026 21:05 - 👍 5    🔁 1    💬 0    📌 0

Mate, this is awesome!

19.02.2026 19:59 - 👍 1    🔁 1    💬 0    📌 0

{lazysf} gets a massive update: github.com/hypertidy/la... #RStats a *simple features* dataframe with a #GDAL vector backend: {dbplyr} SQL generation, with execution via {gdalraster}, {wk} for geometry interchange+crs. R's modern dataframe and query push-down support #RStats

18.02.2026 13:42 - 👍 7    🔁 2    💬 1    📌 0
Logo of the R package valh.


A new version of {valh} has hit CRAN.
{valh} is an interface between R and the Valhalla API.
Valhalla is a routing service based on OpenStreetMap data.

https://github.com/riatelab/valh

This is mostly a maintenance release, no new features, but the package […]

[Original post on fosstodon.org]

19.02.2026 15:04 - 👍 3    🔁 3    💬 0    📌 0
GitHub - belian-earth/a5R: An R wrapper of the a5 rust crate. Contribute to belian-earth/a5R development by creating an account on GitHub.

#rstats bindings for the awesome #a5 DGGS. My first rust based package with help from the 🤖. #rspatial github.com/belian-earth...

18.02.2026 17:05 - 👍 8    🔁 1    💬 1    📌 1
GitHub - r-xla/tengen Contribute to r-xla/tengen development by creating an account on GitHub.

#Rstats #HPC gurus
Is there a package for mixed precision arithmetic that covers most of the exotic new floating point formats, or does one need a specialized ML package from the r-xla organization, like tengen? github.com/r-xla/tengen

16.02.2026 19:39 - 👍 1    🔁 3    💬 0    📌 0

R Coding for Ecology chapter on the cartogram package explores mapping ecological patterns with cartograms -- visualizing sampling bias by resizing regions based on data values.

Chapter: doi.org/10.1007/978-...
Code: github.com/RCodingForEc...

#RStats #GIS #DataViz #LandscapeEcology

15.02.2026 16:00 - 👍 17    🔁 7    💬 0    📌 0

#RStats #Statsky
Any references and packages out there on statistical inference for binary outcomes where one outcome has no measurement error but the other does? Should I just use linear models and get over it?

14.02.2026 22:54 - 👍 0    🔁 3    💬 0    📌 0

sometimes I look at all the zarr variants apis and just think "this is a whole new round of horseshit just like opendap and netcdf was", declared utopia by the matlab community 20 years ago and we've come close to an actual rich ecosystem of tools but it's all being wrapped up in marketing again.

15.02.2026 11:52 - 👍 0    🔁 1    💬 1    📌 0

still insanely out of date [ OK] mur-sst-aws
MUR SST L4 Global (JPL, NASA)
NASA JPL / AWS Open Data | V2 | s3
s3://mur-sst/zarr-v1
dims={'time': 6443, 'lat': 17999, 'lon': 36000} (27.57s)

15.02.2026 14:53 - 👍 0    🔁 1    💬 1    📌 0
tissot: Tissot Indicatrix for Map Projection Distortion

A very long lead time... but {tissot} hit CRAN today, no bounce 😁

Tissot Indicatrix for Map Projection Distortion (math and explainer and code originally outlined by whuber direct from Snyder manual 👌)
#RStats

hypertidy.r-universe.dev/tissot

12.02.2026 09:38 - 👍 4    🔁 4    💬 0    📌 0
Rduckhts: 'DuckDB' 'HTS' File Reader Extension for 'R'

Bundles the 'duckhts' 'DuckDB' extension for reading 'HTS' file formats (VCF/BCF, SAM/BAM/CRAM, FASTA, FASTQ, GFF, GTF, tabix) from 'R' via 'DuckDB'. The extension and its 'htslib' dependency are compiled from vendored sources during package installation.

Authors: Sounkou Mahamane Toure [aut, cre], htslib authors [ctb], DuckDB C Extension API authors [ctb]

Rduckhts_0.1.1-0.0.1.tar.gz
Rduckhts_0.1.1-0.0.1.zip (r-4.6) | Rduckhts_0.1.1-0.0.1.zip (r-4.5) | Rduckhts_0.1.1-0.0.1.zip (r-4.4)
Rduckhts_0.1.1-0.0.1.tgz (r-4.6-any) | Rduckhts_0.1.1-0.0.1.tgz (r-4.5-any)
Rduckhts_0.1.1-0.0.1.tar.gz (r-4.6-any) | Rduckhts_0.1.1-0.0.1.tar.gz (r-4.5-any)
Rduckhts.pdf | Rduckhts.html ✨
Rduckhts/json (API)
NEWS
# Install 'Rduckhts' in R:
install.packages('Rduckhts', repos = c('https://rgenomicsetl.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/rgenomicsetl/duckhts/issues (0 issues)

On CRAN: no

3.75 score 12 exports 2 dependencies

Last updated 0 hours ago from: e99a28a305. Checks: 7 OK, 1 NOTE, 1 FAIL. Indexed: yes.
Citation

To cite package ‘Rduckhts’ in publications use:

    Toure S (2026). Rduckhts: 'DuckDB' 'HTS' File Reader Extension for 'R'. R package version 0.1.1-0.0.1, https://github.com/rgenomicsetl/duckhts.

Corresponding BibTeX entry:

  @Manual{,
    title = {Rduckhts: 'DuckDB' 'HTS' File Reader Extension for 'R'},
    author = {Sounkou Mahamane Toure},
    year = {2026},
    note = {R package version 0.1.1-0.0.1},
    url = {https://github.com/rgenomicsetl/duckhts},
  }


Rduckhts: DuckDB HTS File Reader Extension for R

Rduckhts provides an R interface to a DuckDB HTS (High Throughput Sequencing) file reader extension. This enables reading common bioinformatics file formats such as VCF/BCF, SAM/BAM/CRAM, FASTA, FASTQ, GFF, GTF, and tabix-indexed files directly from R using SQL queries via duckhts.
How it works

Following RBCFTools, tables are created and returned instead of data frames. VCF/BCF, SAM/BAM/CRAM, FASTA, FASTQ, GFF, GTF, and tabix formats can be queried. We support region queries for indexed files, and we target Linux, macOS, and RTools. htslib 1.23 is bundled so build dependencies stay minimal. The extension is built by adapting the generic extension infrastructure to use only makefiles, unlike the submitted community extension duckhts.
Installation

The package can be installed from GitHub:

remotes::install_github(
    "RGenomicsETL/duckhts", subdir = "r/Rduckhts")

System Requirements

Installation requires htslib dependencies such as zlib and libbz2, and optionally, for full functionality, liblzma, libcurl, and openssl. The package requires GNU make. On Windows with Rtools, htslib plugins are not enabled.


Quick Start

The extension is loaded with rduckhts_load(con, extension_path = NULL). We can create tables with rduckhts_bcf, rduckhts_bam, rduckhts_fasta, rduckhts_fastq, rduckhts_gff, rduckhts_gtf, and rduckhts_tabix using the parameters documented in their help pages:

library(DBI)
library(duckdb)
library(Rduckhts)


ext_path <- system.file("extdata", "duckhts.duckdb_extension", package = "Rduckhts")
fasta_path <- system.file("extdata", "ce.fa", package = "Rduckhts")
fastq_r1 <- system.file("extdata", "r1.fq", package = "Rduckhts")
fastq_r2 <- system.file("extdata", "r2.fq", package = "Rduckhts")
con <- dbConnect(duckdb::duckdb(config = list(allow_unsigned_extensions = "true")))
rduckhts_load(con, extension_path = ext_path)
#> [1] TRUE

rduckhts_fasta(con, "sequences", fasta_path, overwrite = TRUE)
rduckhts_fastq(con, "reads", fastq_r1, mate_path = fastq_r2, overwrite = TRUE)

dbGetQuery(con, "SELECT COUNT(*) AS n FROM sequences")
#>   n
#> 1 7
dbGetQuery(con, "SELECT COUNT(*) AS n FROM reads")
#>    n
#> 1 10


FASTA, BAM, FASTQ, READER


#RStats
Rduckhts: 'DuckDB' 'HTS' File Reader Extension for 'R'

Sitting on the shoulders of the great #htslib API and the duckdb C API

Package : rgenomicsetl.r-universe.dev/Rduckhts

09.02.2026 21:00 - 👍 1    🔁 3    💬 0    📌 0
Conference logo


Honored and excited to be part of useR! 2026

I will give a keynote on geocomputation and spatial data science in R.

📅 6–9 Jul 2026, Warsaw, Poland
🔗 user2026.r-project.org

Let’s make this event spatial!

#useR2026 #RStats #RSpatial #Geocompr

08.02.2026 16:02 - 👍 22    🔁 5    💬 0    📌 0
We mourn our craft: I didn’t ask for this and neither did you. I didn’t ask for a robot to consume every blog post and piece of code I ever wrote and parrot it back so that some hack could make money off o…

nolanlawson.com/2026/02/07/w...

07.02.2026 23:29 - 👍 8    🔁 2    💬 0    📌 1

Bit of love for #rstats 📦 {paint} today as I had to swap out a deprecated dependency.

The interactive scrolling ipaint() got a speed up, and some additional controls to advance pages, or to start / end. I find myself using this one a bit lately.

08.02.2026 06:38 - 👍 22    🔁 5    💬 2    📌 0
GitHub - brendensm/sqlfluffr: R Wrapper to the SQL Linter and Formatter 'sqlfluff'

Any #rstats folks interested in testing out a package for SQL linting in R workflows?

I made a package that wraps SQLFluff into R with reticulate. I've tried it on a few machines I own but would appreciate some experimentation before I try to send it to CRAN.

github.com/brendensm/sq...

06.02.2026 21:12 - 👍 5    🔁 3    💬 0    📌 0

Want to make your research more transparent, reusable, and impactful? I’ve written a practical Open Research primer with clear guidance, especially for geospatial ecology & geography, but useful across disciplines.
tess-lab.org/resources/op...

I'm passionate about open research throughout my work

05.02.2026 22:24 - 👍 2    🔁 1    💬 1    📌 0
crschmidt: 

When running gdalinfo -stats on a large GeoTIFF, it uses only one core to compute stats. It would be nice if (for thread-safe datasets) it could run with more than one core.

(Seems like this might be more practical now that https://gdal.org/en/stable/development/rfc/rfc101_raster_dataset_threadsafety.html is in place?)

In particular, running gdalinfo -stats on CAFA_2020 from https://drive.google.com/corp/drive/folders/1exF1KpGr0EJqZ10yqPIkzqLfh6gGj3Sa took 8 minutes on my Macbook with an M3 chip, which seems like it could have gone faster with multi-threading.

<snip>

rouault

@crschmidt Wouldn't https://gdal.org/en/latest/programs/gdal_dataset_check.html added in master a couple days ago do the job as far as your goal is concerned? (i.e. it reads all blocks ensuring they don't result in errors). It is not multi-threaded per se. That said if your GeoTIFF is compressed and you specify "--config GDAL_NUM_THREADS=ALL_CPUS" then you'll get multi-threaded decompression, which should reach close to optimal performance.


TIL more than just gdal_translate can use all the CPU cores for decompression. So, pass `--config GDAL_NUM_THREADS=ALL_CPUS` to gdalinfo for massive compressed files! 🚀

github.com/OSGeo/gdal/i...

09.01.2026 17:39 - 👍 2    🔁 3    💬 0    📌 0
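
For anyone wanting to try the tip above, here is a minimal sketch of what the invocation might look like. The file name `large_compressed.tif` and the `INPUT` / `THREADS` variables are placeholders of mine, not from the post or the GitHub issue. The sketch uses the two-argument `--config KEY VALUE` spelling that the classic GDAL utilities accept, and assumes a GDAL build recent enough to decode compressed GeoTIFF blocks on several threads.

```shell
#!/bin/sh
# Sketch: compute raster statistics while letting GDAL use all CPU cores
# for block decompression. INPUT is a hypothetical file name; point it at
# your own compressed GeoTIFF.
INPUT="${INPUT:-large_compressed.tif}"
THREADS="${THREADS:-ALL_CPUS}"

if command -v gdalinfo >/dev/null 2>&1 && [ -f "$INPUT" ]; then
  # --config sets a GDAL configuration option for this run only.
  gdalinfo -stats --config GDAL_NUM_THREADS "$THREADS" "$INPUT"
else
  echo "gdalinfo or $INPUT not found; nothing to do" >&2
fi
```

As the quoted reply notes, the statistics computation itself is not multi-threaded, so any speed-up should come from decompression and will likely be small for uncompressed files.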