Messaging apps, and software in general, only succeed if many people use them.
Digital isolationism will fail in the long run.
Why not @signal.org, hosted in France?
Meanwhile, my video calls with state agencies run on Teams...
01.08.2025 12:19 — 👍 15 🔁 1 💬 2 📌 0
EurIPS includes a call for both Workshops and Affinity Workshops!
We look forward to making #EurIPS a diverse and inclusive event with you.
The submission deadlines are August 22nd, AoE.
More information at:
eurips.cc/call-for-wor...
eurips.cc/call-for-aff...
28.07.2025 08:51 — 👍 34 🔁 18 💬 0 📌 2
📢 Talk Announcement
"PyPI in the face: running jokes that PyPI download stats can play on you", by Loïc Estève.
📜 Talk info: pretalx.com/pydata-paris-2025/talk/DSHHZK
📅 Schedule: pydata.org/paris2025/schedule
🎟 Tickets: pydata.org/paris2025/tickets
28.07.2025 10:27 — 👍 2 🔁 1 💬 0 📌 0
Excited to have co-contributed the SquashingScaler, which implements the robust numerical preprocessing from RealMLP!
24.07.2025 16:00 — 👍 7 🔁 4 💬 0 📌 0
Huge release, and the first one where I felt like I actually contributed a lot to the final result.
I really think DataOps are a game changer, and I can't wait to see what people come up with with them.
I also ended up rewriting most of the user guide, hopefully improving it along the way 😂
24.07.2025 16:05 — 👍 3 🔁 3 💬 0 📌 0
✨️💥skrub: machine learning with dataframes
New release 💫 0.6
A huge one, with the super powerful new "DataOps", and many improvements all over the library.
Exciting!!
24.07.2025 16:16 — 👍 15 🔁 3 💬 0 📌 0
Configuration:
⚙️ A global Skrub config has been introduced, which allows setting a number of parameters to customize the behavior of Skrub.
skrub-data.org/stable/refer...
24.07.2025 15:55 — 👍 0 🔁 1 💬 1 📌 0
DropUninformative
🗑️ DropUninformative is a transformer that uses various heuristics to remove columns that are unlikely to bring information for training a model.
skrub-data.org/dev/referenc...
24.07.2025 15:55 — 👍 0 🔁 1 💬 1 📌 0
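To make the idea behind the DropUninformative post concrete, here is a minimal sketch of what "heuristics to remove columns" can look like. It is plain Python, not skrub's implementation: the function names, the `max_null_fraction` parameter, and the two heuristics chosen (constant columns, mostly-null columns) are assumptions made for illustration only.

```python
from typing import Any


def uninformative_columns(
    table: dict[str, list[Any]], max_null_fraction: float = 0.9
) -> list[str]:
    """Return the names of columns flagged by two simple heuristics.

    Hypothetical helper for illustration; not part of skrub.
    """
    flagged = []
    for name, values in table.items():
        non_null = [v for v in values if v is not None]
        # Heuristic 1: the column is (almost) entirely missing.
        null_fraction = 1 - len(non_null) / len(values) if values else 1.0
        # Heuristic 2: the column holds at most one distinct non-null value.
        is_constant = len(set(non_null)) <= 1
        if null_fraction > max_null_fraction or is_constant:
            flagged.append(name)
    return flagged


def drop_uninformative(
    table: dict[str, list[Any]], max_null_fraction: float = 0.9
) -> dict[str, list[Any]]:
    """Return a copy of the table without the flagged columns."""
    flagged = set(uninformative_columns(table, max_null_fraction))
    return {name: values for name, values in table.items() if name not in flagged}


table = {
    "age": [31, 45, 27, 52],
    "country": ["FR", "FR", "FR", "FR"],  # constant: carries no signal
    "notes": [None, None, None, None],    # entirely missing
}
cleaned = drop_uninformative(table)
```

Here only the "age" column survives. For the actual transformer and its parameters, the linked skrub reference page is the authority.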
📊 The TableReport has been improved with many new features: series are now supported directly, and plot generation can be skipped when the number of columns in the dataframe exceeds a user-defined threshold. Columns with high cardinality and sorted columns are now highlighted.
24.07.2025 15:55 — 👍 0 🔁 1 💬 1 📌 0
Form complex DataOps plans to train and tune machine learning models, then export the plans as learners, standalone objects that can be used on new data.
Tune hyperparameters where they're defined, and explore the resulting space with a parallel coordinate plot.
24.07.2025 15:55 — 👍 0 🔁 1 💬 1 📌 0
🌟 Major feature! Skrub DataOps are a powerful new way of combining dataframe transformations over multiple tables with machine learning pipelines.
24.07.2025 15:55 — 👍 1 🔁 1 💬 1 📌 0
⚡ Release 0.6.0 is now out! ⚡
🚀 Major update! Skrub DataOps, various improvements for the TableReport, new tools for applying transformers to the columns, and a new robust transformer for numerical features are only some of the features included in this release.
24.07.2025 15:55 — 👍 5 🔁 3 💬 1 📌 3
Nope
The trick is to delete email.
Reading is optional, and quite unproductive.
21.07.2025 20:42 — 👍 3 🔁 0 💬 0 📌 0
📢 Present your NeurIPS paper in Europe!
Join EurIPS 2025 + ELLIS UnConference in Copenhagen for in-person talks, posters, workshops and more. Registration opens soon; save the date:
📅 Dec 2–7, 2025
📍 Copenhagen 🇩🇰
🔗eurips.cc
#EurIPS
@euripsconf.bsky.social
16.07.2025 23:00 — 👍 60 🔁 17 💬 2 📌 3
Yes, life is not fair, humans are not rational (kisses to the neoclassical economists), and sociology is a powerful force.
But we have to accept that and fight on this ground. The balance can be shifted (bravo @valmasdel.bsky.social )
11.07.2025 10:30 — 👍 1 🔁 0 💬 0 📌 0
An open mindset
The commitments required for fully open source machine learning
Fully open machine learning requires not only GPU access but a community commitment to openness. (Some nostalgic lessons from the ImageNet decade.)
10.07.2025 14:28 — 👍 27 🔁 4 💬 1 📌 1
Bluesky is cool
10.07.2025 05:19 — 👍 5 🔁 0 💬 0 📌 0
TabICL: A Tabular Foundation Model for In-Context Learning on Large Data
This work is presented at ICML next week.
• The paper arxiv.org/html/2502.05...
• The Python package: pypistats.org/packages/tab... (try it out 🐍)
• The source code github.com/soda-inria/t... (100% open source, including pre-training 💞)
Longer read (5 min): gael-varoquaux.info/science/tabi...
8/9
09.07.2025 18:41 — 👍 12 🔁 2 💬 0 📌 0
With this architecture, both scalable and flexible, we can do intense pretraining on rich simulated datasets, including large ones, baking in subtle implicit biases.
For instance, in a comparison of classifiers, we see that TabICL has some of the axis-aligned aspect of trees
6/9
09.07.2025 18:41 — 👍 0 🔁 0 💬 1 📌 0
We also add positional encoding at the input using a "fingerprint" of the column distribution computed with a set-transformer.
Implicitly, this representation ends up capturing aspects of the distribution of input columns.
5/9
09.07.2025 18:41 — 👍 0 🔁 0 💬 1 📌 0
In TabICL, we do row-wise encoding, with a transformer across columns, before using a transformer across rows for in-context learning.
As a result, the cost is O(n p² + n²), which is more scalable. Also, the architecture is more amenable to memory offloading and caching.
4/9
09.07.2025 18:41 — 👍 0 🔁 0 💬 1 📌 0
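A quick way to read the O(n p² + n²) cost quoted in the post above: the sketch below evaluates the two terms for a table with n rows and p columns and reports which one dominates. It ignores constant factors, hidden dimensions, and number of layers, so the values are relative orders of magnitude rather than FLOP counts; the function names are hypothetical.

```python
def tabicl_cost_terms(n_rows: int, n_cols: int) -> tuple[int, int]:
    """Return the two terms of O(n p^2 + n^2), constants dropped.

    column_term: attention across columns, done once per row (n * p^2).
    row_term: attention across rows for in-context learning (n^2).
    """
    column_term = n_rows * n_cols**2
    row_term = n_rows**2
    return column_term, row_term


def dominant_term(n_rows: int, n_cols: int) -> str:
    """Say which term is larger; the column term wins when p^2 > n."""
    column_term, row_term = tabicl_cost_terms(n_rows, n_cols)
    return "columns" if column_term > row_term else "rows"


# A tall, narrow table: the across-rows term dominates.
tall = tabicl_cost_terms(10_000, 50)   # (25_000_000, 100_000_000)
# A shorter, wide table: the across-columns term dominates.
wide = tabicl_cost_terms(1_000, 100)   # (10_000_000, 1_000_000)
```

In this simplified model the crossover sits at p² = n: below it, the quadratic cost in the number of rows is what limits scaling.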
PhD, climate scientist @lsce-ipsl.bsky.social @CEAParisSaclay @IPSL_outreach @hc_climat. Was co-chair @ipcc.bsky.social (AR6 WGI). Member of French Academy of Technology and Academy of Sciences.
Professor for Machine Learning, University of Tübingen, Germany
Researcher in machine learning
Official account of the École normale supérieure | PSL - Member of Université PSL
DuckDB is an analytical in-process SQL database management system. "DuckDB" and the DuckDB logo are registered trademarks of the DuckDB Foundation.
Professor at BIFOLD & TU Berlin, research on data engineering for ML. Previously at UvA, NYU, Amazon, Twitter. Opinions are my own.
https://deem.berlin
Workshop on Data Management for End-to-End Machine Learning (DEEM).
Upcoming: 9th edition 27 June 2025, co-located w/ @sigmod2025.bsky.social, Berlin, Germany.
https://deem-workshop.github.io/
Author of “Pourquoi pas le vélo? Envie d’une France cyclable” and “50 bonnes raisons de faire du vélo”. 👉 People should not be prevented from getting around by bike.
Co-founder and CEO, Mistral AI
[Re]Giving meaning to digital technology
Popular science magazine dedicated to the #sciences of the digital world (#numérique), published by Inria with scientists from research organizations and universities. #HelloESR
👉 Visit https://interstices.info
Official account of Université Paris-Saclay: 48,000 students, 9,000 researchers and faculty members, 4,600 doctoral students #Formation #Recherche
French Federation of Bicycle Users (Fédération française des Usagères et des usagers de la Bicyclette) - #FUB #vélo #mobilité #PlanVélo
Strengthening Europe's Leadership in AI through Research Excellence | ellis.eu
ALMAnaCH, the Inria Paris NLP research team.
Developing everyday cycling and local infrastructure. Cycling promotion ✨, cycling school 🚲, rides.
Association; Antony (92) branch of "Mieux se Déplacer à Bicyclette"
https://mdb-idf.org/nos-relais-locaux/hautsdeseine-92/antony
The association that puts bikes in your city! Join: https://mdb-idf.org/adhesion
Updates on community and events, such as the TRL workshop at NeurIPS 2024 and ACL 2025.
Info: https://table-representation-learning.github.io/
Head of Div. Intelligent Medical Systems (IMSY) at DKFZ, director of the National Center of Tumor Diseases (NCT) Heidelberg and @ellis.eu Health board member. Excited about Medical Imaging AI, Surgical Data Science, and Validation of A(G)Is.