
Jem Arnold

@jemarnold.bsky.social

PhD candidate & physiotherapist | 🩸Iliac artery endofibrosis / FLIA | Endurance testing & NIRS. Treat declarative statements as questions?

647 Followers  |  369 Following  |  381 Posts  |  Joined: 23.09.2023

Posts by Jem Arnold (@jemarnold.bsky.social)

For the same reason Methods sections in papers need to be written clearly. Code in data analysis is not only the "doing" but is also a detailed account of the "how/why-ing".

08.03.2026 05:02 — 👍 9    🔁 2    💬 2    📌 0

I seem to need to relearn it every time I want to use it properly 😅

07.03.2026 23:57 — 👍 1    🔁 0    💬 0    📌 0

Interesting. Thanks. It's occurring with Sonnet 4.6 via Copilot via the Positron assistant, but not so far with Claude Code in VS Code. Can't recall whether it's happened with Opus 4.6. Possibly something about the model access layers?

07.03.2026 23:27 — 👍 1    🔁 0    💬 0    📌 0
Now I have the full picture. The problem was I was trying to use positron_editFile_internal with // ...existing code... markers, which the tool was applying at the top of the file. The correct approach is to use copilot_replaceString with exact literal context. Let me also check create_mock_interval to know what attributes it sets, then write the tests precisely.

test-extract_intervals.R

Now I have a precise picture. The problem with previous attempts was that the tool was trying to match oldString on content that appears at the top of the file, not at the actual target lines. Now I understand exactly:


hmm, this is what it reveals about its own inner workings

But as insightful as that sounds, it proceeded to try the same thing again, so... 🙄

One of us is having a skill issue

07.03.2026 05:39 — 👍 5    🔁 0    💬 0    📌 0
Screenshot of .R script in Positron IDE with Positron assistant copilot edits suggested at the very top of the script, corrupting the true structure of the script, rather than at the right line location.


Positron Assistant keeps trying to throw code edits inappropriately at the top of my scripts, rather than at the intended lines. Very annoying. Any ideas why or fixes to try? 🤔 #rstats

I have to reject and ask it to repeat the operation. Usually gets it right on the second attempt

07.03.2026 05:18 — 👍 4    🔁 1    💬 2    📌 0
The Cyclists' Alliance needs your help - Canadian Cycling Magazine The organization that has quietly supported hundreds of women in pro cycling is running out of money

"It is so important to have an independent advocacy group for women in professional cycling," Alison Jackson says

cyclingmagazine.ca/sections/new...

05.03.2026 11:26 — 👍 12    🔁 5    💬 0    📌 1

imo -- making sure folks understand tibbles are just basically lists, and lists are basically vectors, and then showing how to subset vectors / lists, goes a long way to unifying base R vs. tidyverse syntax

05.03.2026 03:29 — 👍 4    🔁 1    💬 0    📌 0
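That unification can be seen in a few lines of base R. A sketch, using a plain `data.frame` as a stand-in for a tibble so no packages are needed (tibbles follow the same list semantics):

```r
## a data frame (like a tibble) is a list of equal-length column vectors
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

## list-style subsetting works on data frames:
## `[` keeps the container, `[[` / `$` extract the column vector
is.list(df)               ## TRUE: columns are just list elements
df["x"]                   ## one-column data frame (list subset)
df[["x"]]                 ## the integer vector itself (element extract)
identical(df$x, df[[1]])  ## TRUE: $, [[, and position all reach the same vector
```

The same `[` vs `[[` distinction carries over to plain lists and (named) vectors, which is why base and tidyverse subsetting idioms interconvert so directly.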

(Though I made exactly that same comment about cycling analytics back before any of the current big apps existed: they were diving into a single ride's data when the big potential was looking at trends, patterns, and deviations from pattern across rides).

04.03.2026 11:25 — 👍 2    🔁 1    💬 1    📌 0
GitHub - posit-dev/skills: A collection of Claude Skills from Posit A collection of Claude Skills from Posit. Contribute to posit-dev/skills development by creating an account on GitHub.

You add the posit marketplace, install r-lib, and invoke the skill at an opportune moment (or claude code runs it for you with the right trigger words).

The README has some instructions.

github.com/posit-dev/sk...

02.03.2026 21:29 — 👍 2    🔁 1    💬 0    📌 0

Thanks for this!

I admit I'm only using Positron assistant w/ copilot which I don't think can access skills.md. What is your general workflow for Claude code with package development? Or any resources you could point me to?

02.03.2026 21:13 — 👍 1    🔁 0    💬 1    📌 0

Please do and report back!

I admit I'm only using Positron assistant w/ copilot which I don't think can access skills.md. What is your general workflow for Claude code with package development?

02.03.2026 21:12 — 👍 1    🔁 0    💬 0    📌 0

Be extra careful with the Description: -- quote software names, beware of spelling, use proper DOI refs, ...

"Newbies" -- packages, not maintainers are put through a special room in CRAN-Hell πŸ‘Ώ

02.03.2026 03:13 — 👍 0    🔁 1    💬 0    📌 0
skills/r-lib/cran-extrachecks at main Β· posit-dev/skills A collection of Claude Skills from Posit. Contribute to posit-dev/skills development by creating an account on GitHub.

use this skill for some extra help on the arcane: github.com/posit-dev/sk...

01.03.2026 20:44 — 👍 5    🔁 2    💬 1    📌 0
Thinking inside the box Dirk Eddelbuettel, R, C++, Rcpp

"newbies" check (pkgs w/o prior releases on CRAN) have a particularly fussy human-administered set of checks (e.g. do all functions have explicitly documented return values?) Also, things like spell-check false positives (can be fixed via dirk.eddelbuettel.com/blog/2017/08... )

01.03.2026 17:54 — 👍 1    🔁 1    💬 1    📌 0

Thanks. Valuable to set those expectations. Sounds like a manuscript review process 🫠

01.03.2026 15:30 — 👍 0    🔁 0    💬 0    📌 0

Run R CMD check --as-cran, and read the CRAN Repository Policy document and Writing R Extensions carefully (they are long and dense and document updates poorly...).

Don't expect perfection. Iterate. Feedback from the review is normal as we cannot run all their checks (their bug, not ours). #rstats

01.03.2026 14:56 — 👍 9    🔁 1    💬 0    📌 0

Good point. Possible to de-trend with a polynomial and look for deviations from a circle vs a straight line? 🤔

There is information in the rate of change, it just doesn't always happen at discrete breakpoints. Not always an obvious 'corner' on the circle

01.03.2026 14:54 — 👍 0    🔁 0    💬 1    📌 0
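The de-trend-then-inspect-residuals idea can be sketched in base R. A toy illustration, assuming made-up data with a slope change (the variable names and the 9-point smoother are arbitrary choices, not anyone's published method):

```r
## sketch: remove a low-order polynomial trend, then look for
## structure in the residuals (toy data with a slope change at x = 6)
set.seed(1)
x <- seq(0, 10, by = 0.1)
y <- 2 * x + ifelse(x > 6, 3 * (x - 6), 0) + rnorm(length(x), sd = 0.3)

## fit a 2nd-order polynomial trend and take residuals
fit <- lm(y ~ poly(x, 2))
resid_y <- residuals(fit)

## smooth the residuals with a simple moving average, then ask
## where the deviation from the fitted trend is largest
smoothed <- stats::filter(resid_y, rep(1 / 9, 9), sides = 2)
x[which.max(abs(smoothed))]
```

This only flags where the data deviate most from a smooth trend; it says nothing about whether that deviation is a meaningful physiological changepoint, which is where a proper changepoint model would come in.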

Thanks, yeah that's exactly why I think implementing something like this is above my current skill ceiling, but could definitely be done

Use manual selection to inform priors on meaningful changepoints, then a Bayesian model to statistically validate? 🤷‍♂️

01.03.2026 14:51 — 👍 1    🔁 0    💬 1    📌 0

Dear #rstats mentors, what advice do you wish someone would have given you before submitting to CRAN for the first time?

The obvious, and the less so

01.03.2026 14:46 — 👍 8    🔁 2    💬 7    📌 1

Metabolic threshold detection should combine expert manual selection with a locally weighted piecewise fit/changepoint model

Beyond my current time availability and skill limits to develop, but I'd love to see how it works

01.03.2026 03:10 — 👍 3    🔁 0    💬 2    📌 0
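The piecewise-fit half of that idea is less daunting than it sounds. A minimal base-R sketch, on made-up data, of the simplest possible changepoint estimator: grid-search the breakpoint that minimises the combined residual sum of squares of two straight-line segments (real threshold-detection models are far more sophisticated; this just shows the core mechanic):

```r
## minimal piecewise-linear changepoint search:
## pick the breakpoint minimising two-segment RSS
set.seed(42)
x <- 1:100
y <- c(0.5 * x[1:60], 30 + 2 * (x[61:100] - 60)) + rnorm(100, sd = 1)

rss_at <- function(bp) {
    li <- x <= bp
    fit_left  <- lm(y[li] ~ x[li])
    fit_right <- lm(y[!li] ~ x[!li])
    sum(residuals(fit_left)^2) + sum(residuals(fit_right)^2)
}

candidates <- 10:90  ## keep a few points in each segment
best_bp <- candidates[which.min(vapply(candidates, rss_at, numeric(1)))]
best_bp              ## lands near the true slope change at x = 60
```

Manual expert selection could enter exactly here, by restricting `candidates` to a physiologically plausible window before fitting.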
+ )
# A tibble: 3 × 13
  expression       min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
  <bch:expr>  <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>
1 which_apply  50.03ms  116.8ms      8.18    2.49MB     4.03    67    33
2 Find          8.39ms   12.2ms     70.9     9.69KB     2.19    97     3
3 loop          11.7ms     20ms     50.3    26.08KB     2.64    95     5
# ℹ 5 more variables: total_time <bch:tm>, result <list>, memory <list>,
#   time <list>, gc <list>


Find() faster than a loop too

28.02.2026 00:19 — 👍 2    🔁 2    💬 0    📌 0
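For reference, the `loop` entry in the table above presumably looked something like this early-exit for-loop, shown here on toy data rather than the original mnirs data frame (the function name and data are illustrative only):

```r
## early-exit loop: break out on the first matching row,
## instead of testing every row like which(apply(...))
data <- data.frame(a = letters, b = LETTERS)
data$a[17] <- "target1"
data$b[17] <- "target2"
targets <- c("target1", "target2")

find_row_loop <- function(df, values) {
    for (i in seq_len(nrow(df))) {
        ## stop at the first row containing all target values
        if (all(values %in% unlist(df[i, ]))) return(i)
    }
    NA_integer_  ## no matching row found
}

find_row_loop(data, targets)  ## returns 17
```

The loop stops early just like `Find()`, but `Find()` wins in the benchmark above because the interpreter overhead per iteration is lower than the explicit-loop bookkeeping here.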

Oh yeah. I'm coming back to base R from tidyverse (purrr, etc.), and I just about feel comfortable with the *apply() functions. Now starting to really understand the power of these higher order functions

tidyverse vs base R official #rstats debate I think is planned for next Thursday?

28.02.2026 00:14 — 👍 7    🔁 0    💬 2    📌 0
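Base R's higher-order functions map closely onto the purrr verbs; a quick self-contained tour (the purrr names in the comments are the rough equivalents):

```r
xs <- seq(2, 20, by = 2)            ## 2 4 6 ... 20

Filter(\(x) x %% 3 == 0, xs)        ## keep multiples of 3: 6 12 18 (cf. purrr::keep)
Map(\(x, y) x + y, 1:3, 4:6)        ## element-wise combine: list(5, 7, 9) (cf. purrr::map2)
Reduce(`+`, xs)                     ## fold to one value: 110 (cf. purrr::reduce)
Find(\(x) x > 7, xs)                ## first matching element: 8 (cf. purrr::detect)
Position(\(x) x > 7, xs)            ## its index: 4 (cf. purrr::detect_index)
```

All five live in base R's funprog family, so they travel with any R ≥ 4.1 installation (the `\(x)` lambda shorthand needs 4.1+), no dependencies required.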

The EPPL is recruiting for a remote (app-based) pilot study examining whether light, fully self-paced swimming may be a tolerable form of movement for people with ME/CFS and related conditions, due to the distinct physiological effects of water immersion. #ME/CFS #ME #POTS #LongCovid

20.02.2026 19:24 — 👍 1    🔁 3    💬 1    📌 0

We recently went through a similar process for equivalence testing and for standard difference testing. It is a good experience. Although I can't help but feel that it still falls back to 'vibes' when there is incomplete quantitative information to use? πŸ€·β€β™‚οΈ

27.02.2026 19:17 — 👍 2    🔁 0    💬 0    📌 0

😅
That's kinda how I think about potential future users running my code repeatedly. This improvement saved ~30ms... negligible running a single analysis, but adds up on thousands of future runs

27.02.2026 19:08 — 👍 1    🔁 0    💬 1    📌 0

Thanks Philip! Ya, there are strong precautions about premature optimisation, but I stumbled on this while solving a bug, so I think that's justified 😄

Attempting to develop a package has completely changed how I code. Have to be much more flexible. It's been such a fun process!

27.02.2026 18:34 — 👍 1    🔁 0    💬 1    📌 0

I couldn't imagine rawdogging contrasts like that. The big lesson I learned from Russ is that the SEs around the marginal means aren't necessarily the appropriate SEs to compare marginal contrasts

27.02.2026 18:23 — 👍 1    🔁 0    💬 1    📌 0
## pre-allocate row search counter
apply_count <- 0L
find_count <- 0L

which_apply_count <- which(apply(data, 1L, \(.row_vec) {
    apply_count <<- apply_count + 1L
    all(nirs_channels %in% .row_vec)
}))

Find_result <- Find(\(.i) {
    find_count <<- find_count + 1L
    all(nirs_channels %in% data[.i, ])
}, seq_len(nrow(data)))

## compare rows checked
data.frame(
    method = c("which(apply())", "Find()"),
    rows_checked = c(apply_count, find_count),
    result = c(which_apply_count, Find_result)
)

#>           method rows_checked result
#> 1 which(apply())        12042     41
#> 2         Find()           41     41


If I know the row I need is somewhere near the top, it's faster to stop searching after 41 rows than continue over 12042 rows...

who knew! ☺️

27.02.2026 18:08 — 👍 3    🔁 0    💬 1    📌 0
library(mnirs) ## development package
library(bench) ## benchmarking

## read file to a data frame (internal fn)
data <- mnirs:::read_file(example_mnirs("train.red"))

## column name strings to match all
nirs_channels <- c("SmO2 unfiltered", "HBDiff unfiltered")

## return bench::mark results
bench::mark(
    ## previous code searched through all rows
    which_apply = {
        which(apply(data, 1L, \(.row_vec) {
            all(nirs_channels %in% .row_vec)
        }))
    },

    ## TIL about `Find()` which returns the first match and stops
    Find = {
        Find(\(.i) {
                all(nirs_channels %in% data[.i, ])
            }, seq_len(nrow(data)))
    },
    check = TRUE,
    iterations = 100
)
#> # A tibble: 2 × 6
#>   expression       min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>  <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 which_apply   31.5ms  34.89ms      27.1    2.56MB     86.0
#> 2 Find          2.73ms   3.03ms     304.     63.2KB     26.4


TIL `Find()` returns the first match then stops searching.

10x speed and 40x memory improvement in one of my core package functions!

I wonder where else I can implement it? 🤔
#rstats #mnirs

27.02.2026 18:08 — 👍 16    🔁 2    💬 3    📌 0
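A closely related base pairing: `Position()` is the index-returning sibling of `Find()`. Since the search above iterates over row indices anyway, `Position()` expresses the same early-stopping lookup directly (toy stand-in data here, not the mnirs file):

```r
## Find() returns the first matching *element*; Position() returns its *index*.
## Both stop searching at the first hit.
df <- data.frame(ch1 = c("x", "y", "SmO2"), ch2 = c("x", "y", "HBDiff"))
wanted <- c("SmO2", "HBDiff")

Position(\(.i) all(wanted %in% unlist(df[.i, ])), seq_len(nrow(df)))
#> [1] 3
```

When iterating over `seq_len(nrow(df))` the element and its index coincide, so `Find()` and `Position()` return the same number; with any other candidate vector they diverge, and `Position()` is the one that gives a subscript you can reuse.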

OH phew! I interpreted "getting the numbers they want" as something else 🫣

26.02.2026 23:32 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0