Etienne Bacher

@etiennebacher.bsky.social

PhD in economics from LISER, Luxembourg, now looking for research software engineer or data science positions. Mostly here to talk about #rstats https://github.com/etiennebacher

199 Followers 68 Following 128 Posts Joined Dec 2024
1 week ago

Yes, a bit, but some coaching could also have been done earlier (like Dupont/Serin).
But sometimes the opponent just has a great day, and at least we got to see beautiful rugby. And with this BP, the tournament victory depends almost entirely on us.

1 week ago

It's also fair to assume that Scotland would have defended better if they hadn't had a 25+ point margin.

1 week ago
Workshops/BPLIM2025 at master · BPLIM/Workshops Collection of presentations at BPLIM's workshops. Contribute to BPLIM/Workshops development by creating an account on GitHub.

Back in December, I presented at a "Fast Computing" workshop hosted by the Bank of Portugal. The videos + materials are now all available online. github.com/BPLIM/Worksh...

(Also including cool talks by @s3alfisc.bsky.social, @sebkrantz.bsky.social, and others.)

2 weeks ago
A screenshot with three panels. The two top panels show two R files where an R function "expr_uses_col_from_dots" is defined (therefore two functions are given the same name). The bottom panel shows the output of "jarl check . --select duplicated_function_definition", highlighting one of the "expr_uses_col_from_dots" with the message:

`expr_uses_col_from_dots` is defined more than once in this package.
help: Other definition at R/utils-expr.R:885:1

This will be available in the next version of Jarl (0.5.0):

2 weeks ago

Alt+enter

3 weeks ago

Yes it's a complement, not a replacement. They don't do the same thing

3 weeks ago
Jarl – jarl

And a bit of self promotion 😄: if you use R, you could try out my new linter Jarl

jarl.etiennebacher.com

3 weeks ago

A linter often plays well with a formatter, whose job is to automatically format the code to match some rules in terms of spacing, indentation, etc.

3 weeks ago

A linter could also detect correctness issues in the code, for example to find code that can never run because it comes after a return() in a function.

Another example is detecting code that we know will error if it runs.
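
As a rough illustration of the first case, here is a minimal base R sketch (the function name is made up for the example, and this is not output from any linter): everything after `return()` is dead code, so the `stop()` call never fires.

```r
# Hypothetical function: everything after return() is dead code.
double_it <- function(x) {
  return(x * 2)
  stop("this line can never run")
}

# The call succeeds because the stop() above is never reached.
double_it(21)
```

The function behaves correctly, which is exactly why this kind of leftover code is easy to miss by eye.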

3 weeks ago

A linter checks for several patterns of code that could / should be fixed to improve it.

For instance, some code might produce the correct output but could be slow or hard to read because it doesn't use the most appropriate function for the job. A linter could detect that and recommend a fix.
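
A small base R sketch of such a pattern, assuming a made-up vector `x`: `any(is.na(x))` and `anyNA(x)` return the same answer, but base R documents `anyNA()` as a possibly faster implementation of the same check.

```r
# Hypothetical input: a large vector with a single missing value at the start.
x <- c(NA, seq_len(1e6))

# Works, but builds a logical vector as long as `x` before reducing it.
slow <- any(is.na(x))

# Same result, using the dedicated base R function for this check.
fast <- anyNA(x)

identical(slow, fast)
```

This is the kind of rewrite a linter can suggest (and often apply) automatically, since the two forms are interchangeable for plain vectors.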

1 month ago

Thank you, it's always nice to read that :)

1 month ago

Well deserved, congrats!

1 month ago
Changelog

#rstats tidypolars 0.17.0 is available!

tidypolars provides the tidyverse syntax while using polars for better perf.

In this release:

- support new functions from dplyr 1.2.0 (filter_out, when_any...)
- pivot_wider with lazyframe
- bug fixes

and more

News: tidypolars.etiennebacher.com/news/

1 month ago
Enable "Fix on save" in editors · Issue #160 · etiennebacher/jarl Ruff has a code action "Fix all" that can run on save: https://marketplace.visualstudio.com/items?itemName=charliermarsh.ruff

Not for now. This is something that may be implemented, but I have some concerns, see github.com/etiennebache...

1 month ago

You had great timing, I had already planned to release today ^^

1 month ago
This screenshot shows the terminal after running `jarl check . --statistics`:

> jarl check . --statistics
   92 [ ] true_false_symbol
   30 [ ] implicit_assignment
    9 [*] numeric_leading_zero
    8 [ ] duplicated_arguments
    3 [*] seq
    3 [*] lengths
    2 [ ] unreachable_code
    2 [*] outer_negation
    1 [*] any_is_na
    1 [*] class_equals
    1 [*] length_levels

Rules with `[*]` have an automatic fix.

To avoid filling the terminal with tons of diagnostics, there is now a command-line option `--statistics` to quickly show the summary of diagnostics reported by Jarl.

(not trying to throw shade on dplyr of course, just an example ;-) )

1 month ago
This screenshot is in two parts. 

1. the left part shows a code example where some lines should be reported by Jarl but are ignored because they have a `# jarl-ignore` comment

# The comment below only applies to `any(is.na(x1))`.
# jarl-ignore any_is_na: <reason>
any(is.na(x1))
any(is.na(x2))

# The comment below applies to the entire function definition, including the
# two `any(is.na(...))` calls.
# jarl-ignore any_is_na: <reason>
f <- function(x1, x2) {
  any(is.na(x1))
  any(is.na(x2))
}

2. the right part shows the terminal, where Jarl only reports the line that doesn't have this special comment.

0.4.0 brings a new system for suppression comments. Suppression comments allow you to ignore diagnostics on specific pieces of code. Jarl used to have some (brittle) compatibility with `lintr` comments "# nolint".

This is not the case anymore and Jarl only supports "# jarl-ignore" comments.

1 month ago
This screenshot is in two parts:

1. the left part shows a function where a print() statement is unreachable because it comes after an if/else where all branches return early or error.

f <- function(x) {
  if (x > 5) {
    return("greater than five")
  } else if (x < 5) {
    return("lower than five")
  } else {
    stop("x must be greater or lower than five")
  }
  print("end of function")
}

2. the right part shows the output of `jarl check test.R` in the terminal, highlighting that this code is unreachable:

warning: unreachable_code
 --> _posts/2026-02-03-jarl-0.4.0/test.R:9:3
  |
9 |   print("end of function")
  |   ------------------------ This code is unreachable because the preceding if/else
  terminates in all branches.
  |

Found 1 error.

This is very similar to the first image but the code example is different as a line of code is unreachable because it comes after a `next` in a for loop:

f <- function(x) {
  for (i in names(x)) {
    if (i == "foo") {
      next
      print("Found name 'foo', skipping")
    }
    print(toupper(i))
  }
}

Jarl is now able to find unreachable code, meaning code that will never run because it's after a stop(), a return(), or a `next` in a `for` loop for example.

This can also happen if the code comes after an `if` statement where all branches return early or error, and Jarl can reliably detect that.

1 month ago
Etienne Bacher: Jarl 0.4.0 Find unreachable code, ignore diagnostics, show summary statistics of diagnostics, and more.

#rstats I'm very happy to announce Jarl 0.4.0!

Jarl is a very fast R linter, written in Rust. This release brings lots of improvements and fixes.

See the blog post: www.etiennebacher.com/posts/2026-0...

And the full changelog: jarl.etiennebacher.com/changelog

🧵 to highlight some features below

1 month ago
A screenshot with two parts: 

1. the left side shows code with a nested loop:

for (x in names(mtcars)) {
  x <- substr(x, 1, 3)
  for (x in 1:3) {
    print(x)
  }
}

2. the right part shows the output of `jarl check test.R` in the terminal:

warning: for_loop_dup_index
 --> test.R:3:8
  |
3 |   for (x in 1:3) {
  |        -------- This index variable is already used in a parent `for` loop.
  |
  = help: Rename this index variable to avoid unexpected results.

Found 1 error.

It will make it in the update:
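
To make the problem concrete, here is a small base R sketch (the variable names are made up for illustration) of what happens when an inner loop reuses the outer index: after the inner loop finishes, `x` holds the inner loop's last value, not the outer element.

```r
# Hypothetical example: the inner loop overwrites the outer index `x`.
seen <- character()
for (x in c("a", "b")) {
  for (x in 1:3) {
    # the inner loop rebinds `x` on every iteration
  }
  # At this point `x` is 3L, so the outer value ("a" or "b") is lost.
  seen <- c(seen, as.character(x))
}
seen
```

The outer loop still runs twice because R evaluates the sequence `c("a", "b")` once up front, but any code that relies on `x` after the inner loop sees the wrong value.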

1 month ago

That's a good idea, I'll try to include it in Jarl before I release 0.4.0

1 month ago

No idea, I've never used it. Maybe @gmcd.bsky.social can answer that

1 month ago
GitHub - grantmcdermott/dbreg: Fast regressions on database backends Fast regressions on database backends. Contribute to grantmcdermott/dbreg development by creating an account on GitHub.

You might be interested in dbreg, it looks like there's an overlap in functionalities: github.com/grantmcdermo...

1 month ago

You can see some benchmarks here, with the usual caveats that benchmarks never capture all use cases:
duckdblabs.github.io/db-benchmark/

I would just suggest giving both polars and duckdb a try.

1 month ago

Regarding polars vs duckdb, both are super performant. The one that suits your needs best will likely depend on the use case. I like the data frame interface of polars in both R and Python since I'm not super familiar with writing SQL.

1 month ago

Parquet is a file format while polars and duckdb are data processing libraries, so there's no comparison to be made between parquet and polars. Polars is very, very fast at processing parquet files.

1 month ago
Changelog

#rstats tidypolars 0.16.0 is available!

tidypolars provides the tidyverse syntax while using polars for better perf.

This release:
- support for unnest and separate functions (tidyr)
- new interface to export partitioned output
- and more

News: www.tidypolars.etiennebacher.com/news/#tidypo...

1 month ago
Disable cutesy, encouraging messages? · Issue #804 · r-lib/testthat Is there a way to disable the cute messages when tests fail? (a la "No one is perfect" et al.?) It can get to be cumbersome during repeated unit tests. > test_file('myscript.R') ✔ | OK F W S | Cont...

This is the only thing I found in the issue tracker: github.com/r-lib/testth...

1 month ago
Handling large data with R and Python

#rstats I gave a workshop a few days ago on handling large data (think tens to hundreds of millions of rows) with Polars in Python and in R.

Here are my introductory slides: brussels-large-data-r-python.etiennebacher.com

And the associated repo: github.com/etiennebache...

2 months ago
changelog – jarl

#rstats Jarl 0.3.0 is available!

Jarl is a very fast R linter, able to check and fix thousands of lines in milliseconds.

New since 0.2.0:
- 6 new rules
- ignore automatically generated files by default
- bug fixes and perf improvements

All changes: jarl.etiennebacher.com/changelog
