Huge thanks to @gsimpson.bsky.social and everyone who joined our GAMs in R course!
Happy modeling, everyone!
@gsimpson.bsky.social
(Palaeo)[ecologist | limnologist] & #fakeStatistican, #rstats user, wielder of #GAMs. He/him/his. Opinions mine…
A palaeoecological plant find from a Breckland ghost ice age pond core (East Harling, E. England). Puzzled over by @hayleymcmechan.bsky.social & myself. Help with ID please team!! Seems obvious but stuck. @jo-the-botanist.bsky.social @bramblebotanist.bsky.social @timholtwilson.bsky.social
07.12.2025 18:58
Figure with two panels. Left panel: visualisation of a 3D movement track. Right panel: visualisation of the 3D direction of movement as two angles (one horizontal angle and one vertical angle).
We have a preprint about modelling three-dimensional movement tracks, led by @njklappstein.bsky.social.
The model takes the form of a step selection function and, just like in 2D, it can include directional persistence, attraction to targets, and habitat selection.
doi.org/10.1101/2025...
If you want to see me get flustered, but recover to give a banger talk about RSV prophylaxis, the magic of GAMs, and the vital importance of surveillance data, then this is the talk for you. With shout-outs for @gsimpson.bsky.social, @vincentab.bsky.social, and @hpscireland.bsky.social
03.12.2025 08:31
EFS is the Extended Fellner Schall smoothness selection method that Simon & Matteo Fasiolo developed, & which was initially used for the twlss() family
EFS is neat bc it doesn't require all the fancy higher order derivatives of model quantities to fit the model, even for location scale families
Why is scasm() such a big deal? Natalya Pya & Simon did some work leading to Natalya's *scam* 📦 with a load of different shape constraints. But the algorithm was GCV-based and only worked for the standard families.
scasm() works for *any* family in mgcv thanks to EFS & just slots into models
The full change log with all the changes is here: cran.r-project.org/web/packages...
I've already started the process of getting these new features supported in the gratia 📦. Shape constraint smooths mostly just work; small fix was needed bc structure of the smooths lacked something gam() produces
Example plot showing the partial effects of the estimated smooths in a GAM with terms y ~ x0 + s(x1) + s(x2) + s(x3). Three panels are shown, one per smooth. The first panel is for s(x1) and shows an increasing, slightly curved function that increases slightly more as x1 increases. The second panel shows the partial effects for s(x2), which shows much more variation, being initially negative at x2=0, rising to a peak around x2=0.35, then falling to ~-1 at x2=0.5, then remaining roughly flat followed by a gradual decline thereafter. The final panel shows the partial effect of s(x3), which is ~0 for all x3, with uncertainty band covering 0 too. This predictor is known to have no effect in the simulated data. The uncertainty in the partial effects is shown by two credible interval bands; a dark blue central band is a 68% Bayesian credible interval, while the lighter blue outer interval is a 95% Bayesian credible interval. The background of each panel is light grey with white grid lines, in a similar style to ggplot2's default theme.
plot.gam() also has a new theme: scheme=2
This draws plots with 68 & 95% intervals (by default) and has a ggplot2-like grey plot background
Example plot showing the derivative of the estimated smooths in a GAM with terms y ~ x0 + s(x1) + s(x2) + s(x3). Three panels are shown, one per smooth. The first panel is for s(x1) and shows a positive derivative that increases slightly as x1 increases. The second panel shows the derivative for s(x2), which shows much more variation, being initially strongly positive from x2=0, moving gradually to strongly negative by x2=0.35, then back to 0 at x2=0.5, then remaining roughly flat thereafter. This reflects the strongly peaked nature of the fitted smooth s(x2). The final panel shows the derivative of s(x3), which is ~0 for all x3, with uncertainty band covering 0 too. This predictor is known to have no effect in the simulated data. The uncertainty in the derivatives is shown by two credible interval bands; a dark blue central band is a 68% Bayesian credible interval, while the lighter blue outer interval is a 95% Bayesian credible interval. The background of each panel is light grey with white grid lines, in a similar style to ggplot2's default theme.
plot.gam() gains a deriv argument, which if TRUE plots derivatives of univariate smooths instead of the usual partial effect plots
Partial effect plots can be confusing. With deriv=TRUE you see the change in Y (η) for a small change in X, which is comparable with usual interpretations of model β
New family, bcg(), for (censored) Box-Cox Gaussian responses (basically anything that is conditionally Gaussian *after* a Box-Cox transform of Y_i)
12.11.2025 11:28
Model is: `b3 <- scasm(y ~ s(x0, bs = "bs", k = k) + s(x1, bs = "sc", xt = "m+", k = k) + s(x2, bs = "bs", k = k) + s(x3, bs = "bs", k = k), family = poisson, bs = 200)`. The second smooth, `s(x1)`, is a shape constrained smooth with a positive monotonicity constraint (`xt = "m+"`). The `bs = 200` argument uses 200 bootstrap samples, which generate bootstrap distributions for each coefficient in the model. These bootstrap samples respect the shape constraints, while the usual +/- 2 SE credible intervals may not. The uncertainty in the partial effects is shown by two credible interval bands; a dark blue central band is a 68% Bayesian credible interval, while the lighter blue outer interval is a 95% Bayesian credible interval. The background of each panel is light grey with white grid lines, in a similar style to ggplot2's default theme.
A new release of the mgcv #RStats 📦 is out on CRAN and Simon Wood (U Edinburgh) has added some significant new features despite the small bump in version number:
scasm() for estimating GAMs with shape constrained smooths. Can be used with any family & smoothness selection is via the EFS method
Opinion: Chaos is coming for scholarly publishing.
Buckling of commercial models alongside maturing of community-led efforts promises major shifts, says Caroline Edwards (@theblochian.bsky.social).
www.researchprofessionalnews.com/rr-news-uk-v...
Feel free to tag me on any questions if you post them here. Lots of answers on CrossValidated in the generalized additive model tag cover HGAMs. If you have a stats question you could also ask it there
12.11.2025 06:40
filter_out() = yeet()
07.11.2025 16:34
Results of model fitting to the average daily fat content data from @Henderson1990-bd. a) observed average daily fat content (points) and estimated lactation curves from Wood's [-@Wood1967-re] model, a Tweedie GLM, and a Tweedie GAM (lines) with associated 95% confidence (Wood's model) or 95% credible intervals (GLM and GAM). Response residuals for Wood's model (b), Tweedie GLM (c), and Tweedie GAM (d), plus scatter plot smoothers (lines) and 95% credible intervals (shaded ribbons). The fitted lactation curves are like an inverted U, with an extended longer tail to the right (later in lactation). The GAM curve fits the data well, but the fitted curves from Wood's model and the GLM equivalent do not provide good fits to the data, and overpredict the amount of fat produced at the peak of lactation, and only grossly capture the decline in fat production later in lactation. The remaining panels show the raw response residuals for the three models, drawing attention to the poor fit; for Wood's model and the GLM there is significant pattern in the residuals, while for the GAM no residual pattern is observed.
Quantities of interest derived from Wood's model, a Tweedie GLM, and a Tweedie GAM fitted to the lactation data example: a) the estimated week of peak average daily fat content, b) the estimated average daily fat content at the peak, and c) the rate of change (first derivative) of the lactation curve estimated at a point that is midway between the peak fat content and the end of lactation. The points are the estimated values and the lines are a 95% uncertainty interval. The uncertainty interval is based on the 0.025 and 0.975 percentiles of the bootstrap distribution of model coefficient estimates (Wood's model) or of the posterior distribution (GLM and GAM). Each panel shows three point estimates and an uncertainty range. The three points are the estimates from a GAM, a GLM, and Wood's lactation model. The first panel shows the estimated timing of the peak of lactation, with the GAM capturing the fact that the peak in the data occurs much later in lactation (~ week 11) while the other two models confidently estimate that the peak is in week ~8-9. The GAM estimate has a much wider credible interval, which does include the estimates of Wood's model & the GLM at the extreme end. This reflects the uncertainty in the estimation of the peak timing arising from the data having a wide flat peak. The other panels show the estimates of fat content at the peak, which are broadly similar at ~ 0.7 kg fat per day. The final panel showing the persistency estimate shows the GAM estimate diverging from those of the GLM and Wood's model. Again, the latter two models are overly confident in their estimation of this biologically relevant parameter, despite the fitted lactation curve not really following the lactation data.
a) Estimated daily growth rate on November 15^th^, 2021 and 95% Bayesian credible interval for the 18 pigs in the pig growth example. b) Posterior distribution of daily growth rate on November 15^th^, 2021, for three pigs (numbers 2, 13, and 17), for whom weight observations ceased before November 1^st^, 2021. In b), the shaded region is the posterior distribution, the point, and thick and thin bars are the posterior median, and 66% and 95% posterior intervals respectively. With the fitted growth curves, we can estimate for any day what the growth rate of each pig was. In this figure I'm showing the estimated growth rate of each pig in the example on November 21st. This growth rate is the first derivative of the fitted growth curve (smooth function). I used posterior sampling to produce the posterior distribution of the growth rate for each pig. These are summarised as a point estimate (median) and credible interval in the first panel with most pigs growing at ~1-1.5 kg per day by November 21st, with uncertainties on the order of +/- 0.5 kg per day. The second panel shows the entire posterior distribution of the estimated growth rate for three pigs (2, 13, and 17) for whom there were no weight estimates after November 1st. Here, the model is drawing power from the other pigs to help extrapolate the growth curves for these three pigs, but pig-specific details remain, with the posterior distribution for pig 17 being much more diffuse (wider) than for either pigs 2 or 13, reflecting greater uncertainty for the former animal.
Just updated my manuscript on using #GAMs in #AnimalScience, now on arXiv: doi.org/10.48550/arX...
Extended examples now show how GAMs go beyond prediction, helping estimate biologically meaningful traits from data.
Code: github.com/gavinsimpson...
🧪 #RStats #mgcv #Statistics #OpenScience
"I don't care for the UK tonight
So stay
Stay"
Fuck you, Katie Lam, absolutely fuck off.
www.youtube.com/watch?v=W38v...
I'm seeing some misinformation about pseudo-random number generator best practices going around the internets. Let's talk about why the pseudo-random number generator seed you use shouldn't actually have any impact on your results and, consequently, you can choose whatever seed you damn well please.
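A rough illustration of the point about seeds, not taken from the original thread: the sketch below runs the same Monte Carlo estimate of pi under several arbitrary seeds. The function name `estimate_pi`, the seed values, and the number of draws are all assumptions made for the example. If the simulation is large enough, the estimates should differ only by Monte Carlo noise, whichever seed is used.

```python
import random


def estimate_pi(seed: int, n_draws: int = 200_000) -> float:
    """Monte Carlo estimate of pi from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, so no global state is touched
    inside = 0
    for _ in range(n_draws):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # point falls inside the quarter circle
            inside += 1
    return 4.0 * inside / n_draws


# Arbitrary seeds, chosen purely for illustration.
SEEDS = [1, 42, 2025, 8675309, 123456789]
estimates = {seed: estimate_pi(seed) for seed in SEEDS}
spread = max(estimates.values()) - min(estimates.values())

for seed, value in estimates.items():
    print(f"seed={seed:>10d}  estimate={value:.4f}")
print(f"spread across seeds: {spread:.4f}")
```

If the spread across seeds were large relative to the precision you need, that would suggest the simulation is too small, rather than that one seed is better than another.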
22.10.2025 19:06
So Framework is supporting projects in the Linux ecosphere that are led by vile, racist, homophobic, transphobic people and sees nothing wrong with that. And Shopify supports and platforms the racist idiot that built Ruby on Rails, all because their money stream is entirely dependent on Rails
23.10.2025 06:28
Fuck Kramnik
22.10.2025 16:43
I have very fond memories of feeding the giraffe there; have fun enjoying Nairobi (taking breakfast at the national park at dawn, in one of the picnic / viewpoints on the high ground overlooking the rest of the park was a real treat, as was the elephant orphanage)
06.10.2025 17:39
If you want to check out the examples from the Physalia course, the materials from the last running are available on GitHub
github.com/gavinsimpson...
I'd be happy to add/use a psych example or two if you have suggestions for papers or analyses where the data are open?
29.09.2025 08:23
As I'm a geographer by PhD working in ecology, environmental science, and now animal science, the examples tend towards the natural and life sciences, but they are quite varied so attendees from a wide array of backgrounds usually find them relatable.
29.09.2025 08:23
I'll be running a 3-day one at AU Viborg (about an hour from Aarhus, Denmark) June 9th through 11th that is in person. Dates are TBC but registration should be open in the next couple of weeks
I'm also running an online one with @physaliacourses.bsky.social in December this year.
The Pink Book of #MarginalEffects (aka Model to Meaning) ships next week and I've got a backlog of Zoolander memes.
Hope you're hungry for some spam in your timeline.
#RStats #PyData
The new {marginaleffects} release for #RStats (0.30.0) comes with two new vignettes:
1. Speed up computation with automatic differentiation (often 10x gains) marginaleffects.com/bonus/perfor...
2. Power analyses with {marginaleffects} and {DeclareDesign}. marginaleffects.com/bonus/power....
We're glad to finally bring you this update!
11.09.2025 12:02
Congratulations Thomas; I know this release hasn't been an easy one
11.09.2025 14:36
I am beyond excited to announce that ggplot2 4.0.0 has just landed on CRAN.
It's not every day we have a new major #ggplot2 release but it is a fitting 18 year birthday present for the package.
Get an overview of the release in this blog post and be on the lookout for more in-depth posts #rstats
10.09.2025 18:17