
Ron Richman

@ronrich.bsky.social

Avid actuary

229 Followers  |  1,452 Following  |  3 Posts  |  Joined: 03.10.2023

Latest posts by ronrich.bsky.social on Bluesky


Want all NeurIPS/ICML/ICLR papers in one single .bib file? Here you go!

🗞️ short blog post: fabian-sp.github.io/posts/2024/1...

📇 bib files: github.com/fabian-sp/ml-bib

17.12.2024 10:42 — 👍 6    🔁 2    💬 0    📌 0
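A quick sketch of how the merged .bib files could be searched from Python with bibtexparser; the file name neurips.bib is an assumption, so check the ml-bib repository for the actual layout.

import bibtexparser  # pip install bibtexparser

def find_entries(query, bib_paths):
    """Return all entries whose title contains `query` (case-insensitive)."""
    hits = []
    for path in bib_paths:
        with open(path, encoding="utf-8") as f:
            db = bibtexparser.load(f)  # parse one merged .bib file
        hits += [e for e in db.entries if query.lower() in e.get("title", "").lower()]
    return hits

# Example lookup against the (assumed) NeurIPS file from the repo.
for e in find_entries("Attention Is All You Need", ["neurips.bib"]):
    print(e.get("ID"), "-", e.get("title"))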

Survey of different Large Language Model Architectures: Trends, Benchmarks, and Challenges

Presents a survey on LLM architectures that systematically categorizes auto-encoding, auto-regressive and encoder-decoder models.

arxiv.org/abs/2412.03220

05.12.2024 04:15 — 👍 20    🔁 3    💬 0    📌 0
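The survey's three architecture classes map onto familiar model families; a minimal sketch using Hugging Face transformers, with the model names chosen here only as common examples of each class.

# One representative per architecture class in the survey's taxonomy.
from transformers import (AutoModelForMaskedLM, AutoModelForCausalLM,
                          AutoModelForSeq2SeqLM)

encoder_only = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")  # auto-encoding
decoder_only = AutoModelForCausalLM.from_pretrained("gpt2")               # auto-regressive
enc_dec = AutoModelForSeq2SeqLM.from_pretrained("t5-small")               # encoder-decoder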

🚨 New preprint out!

We build **scalar** time series embeddings of temporal networks!

The key enabling insight: the relevant feature of each network snapshot... is just its distance to every other snapshot!

Work w/ FJ Marín, N. Masuda, L. Arola-Fernández

arxiv.org/abs/2412.02715

05.12.2024 08:30 — 👍 19    🔁 4    💬 0    📌 1
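The key insight above can be illustrated generically: compute a distance between every pair of snapshots, then compress each snapshot's row of distances into a single coordinate. The Frobenius-norm distance and the 1-D MDS step below are illustrative stand-ins, not the paper's exact construction (see arxiv.org/abs/2412.02715 for that).

import numpy as np
from sklearn.manifold import MDS

def scalar_embedding(snapshots):
    """snapshots: list of (n, n) adjacency matrices, one per time step."""
    T = len(snapshots)
    D = np.zeros((T, T))
    for i in range(T):
        for j in range(i + 1, T):
            d = np.linalg.norm(snapshots[i] - snapshots[j])  # snapshot-to-snapshot distance
            D[i, j] = D[j, i] = d
    # One coordinate per snapshot, derived only from the distance matrix.
    mds = MDS(n_components=1, dissimilarity="precomputed", random_state=0)
    return mds.fit_transform(D).ravel()  # shape (T,): a scalar time series

rng = np.random.default_rng(0)
snaps = [(rng.random((20, 20)) < 0.1 + 0.02 * t).astype(float) for t in range(30)]
print(scalar_embedding(snaps)[:5])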

Proud to announce our NeurIPS spotlight, which was in the works for over a year now :) We dig into why decomposing aleatoric and epistemic uncertainty is hard, and what this means for the future of uncertainty quantification.

📖 arxiv.org/abs/2402.19460 🧵1/10

03.12.2024 09:45 — 👍 74    🔁 12    💬 3    📌 2
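For context, the decomposition usually under discussion is the entropy-based one: for an ensemble of predictive distributions, total predictive entropy splits into expected entropy (read as aleatoric) plus mutual information (read as epistemic). A minimal numpy sketch of that standard recipe, not of the paper's own proposal:

import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    return -np.sum(p * np.log(p + eps), axis=axis)

def decompose(probs):
    """probs: (M, C) array, one categorical prediction per ensemble member."""
    total = entropy(probs.mean(axis=0))   # entropy of the averaged prediction
    aleatoric = entropy(probs).mean()     # average per-member entropy
    epistemic = total - aleatoric         # disagreement (mutual information)
    return total, aleatoric, epistemic

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.2, 0.7],
                  [0.3, 0.4, 0.3]])
print(decompose(probs))  # members disagree, so the epistemic term is positive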

If you use SHAP, LIME or Data Shapley, you might be interested in our new #neurips2024 paper. We introduce stochastic amortization to speed up feature + data attribution by 10x-100x 🚀 #XML

Surprisingly we can "learn to attribute" cheaply from noisy explanations! arxiv.org/abs/2401.15866

02.12.2024 17:35 — 👍 77    🔁 12    💬 1    📌 0
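The "learn to attribute" idea can be sketched generically: train an explainer network to regress cheap, noisy attribution estimates, so that explaining a new example costs one forward pass. This is an illustration of amortization in general, not the paper's exact objective or architecture; noisy_attributions below is a hypothetical placeholder for a few-sample Shapley or LIME estimator.

import torch
import torch.nn as nn

d = 16  # number of features

explainer = nn.Sequential(nn.Linear(d, 128), nn.ReLU(), nn.Linear(128, d))
opt = torch.optim.Adam(explainer.parameters(), lr=1e-3)

def noisy_attributions(x):
    # Stand-in for a cheap, unbiased but high-variance attribution estimator.
    return 2.0 * x + 0.5 * torch.randn_like(x)

for step in range(1000):
    x = torch.randn(64, d)                        # batch of inputs
    target = noisy_attributions(x)                # noisy labels are enough:
    loss = ((explainer(x) - target) ** 2).mean()  # MSE regresses toward their mean
    opt.zero_grad()
    loss.backward()
    opt.step()

# At test time, attributions are a single forward pass instead of many model calls.
phi = explainer(torch.randn(1, d))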
Samples y | x from Treeffuser vs. true densities, for multiple values of x under three different scenarios. Treeffuser captures arbitrarily complex conditional distributions that vary with x.

I am very excited to share our new NeurIPS 2024 paper + package, Treeffuser! 🌳 We combine gradient-boosted trees with diffusion models for fast, flexible probabilistic predictions and well-calibrated uncertainty.

paper: arxiv.org/abs/2406.07658
repo: github.com/blei-lab/tre...

🧡(1/8)

02.12.2024 21:48 — 👍 156    🔁 23    💬 4    📌 4
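A minimal usage sketch for the package; the fit/sample interface shown follows the repo's README as of this writing, so check the Treeffuser repo linked above if the API has changed.

import numpy as np
from treeffuser import Treeffuser  # pip install treeffuser

rng = np.random.default_rng(0)
X = rng.uniform(0, 2 * np.pi, size=(1000, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.standard_normal(1000)  # noisy toy regression data

model = Treeffuser()
model.fit(X, y)

# Sample from p(y | x) and summarize the predictive distribution.
X_test = np.linspace(0, 2 * np.pi, 50).reshape(-1, 1)
y_samples = model.sample(X_test, n_samples=200)
y_mean = y_samples.mean(axis=0)
y_lo, y_hi = np.quantile(y_samples, [0.05, 0.95], axis=0)  # 90% predictive interval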

TIL from the Hard Fork podcast that the transformer, the core of modern AI including LLMs, was inspired by the aliens in Arrival. That’s wild—and yet another reason to watch Arrival, easily one of the best films of the last decade. Great podcast, great movie!

01.12.2024 06:47 — 👍 31    🔁 3    💬 1    📌 2
Transcript of Hard Fork ep 111: Yeah. And I could talk for an hour about transformers and why they are so important.
But I think it's important to say that they were inspired by the alien language in the film Arrival, which had just recently come out.
And a group of researchers at Google, one researcher in particular, who was part of that original team, was inspired by watching Arrival and seeing that the aliens in the movie had this language which represented entire sentences with a single symbol. And they thought, hey, what if we did that inside of a neural network? So rather than processing all of the inputs that you would give to one of these systems one word at a time, you could have this thing called an attention mechanism, which paid attention to all of it simultaneously.
That would allow you to process much more information much faster. And that insight sparked the creation of the transformer, which led to all the stuff we see in AI today.

Did you know that attention across the whole input span was inspired by the time-negating alien language in Arrival? Crazy anecdote from the latest Hard Fork podcast (by @kevinroose.com and @caseynewton.bsky.social). HT nwbrownboi on Threads for the lead.

01.12.2024 14:50 — 👍 247    🔁 53    💬 19    📌 17
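The "pay attention to all of it simultaneously" mechanism described in the transcript is scaled dot-product attention; a bare-bones numpy sketch of a single self-attention step (no learned projections or multiple heads):

import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Every position looks at every other position in one shot,
    # instead of consuming the sequence one token at a time.
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (seq, seq) pairwise relevance
    return softmax(scores) @ V               # weighted mix of all value vectors

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))  # 5 toy token representations of dimension 8
print(attention(X, X, X).shape)  # (5, 8)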

Did you do bagging in addition to TabM, i.e. fit five of these?

27.11.2024 11:31 — 👍 0    🔁 0    💬 1    📌 0

I did a small test with TabM-mini and 5-fold bagging, using only default parameters with numerical embeddings. It seems roughly comparable with RealMLP. But then maybe RealMLP can benefit more from additional ensembling, or the two could be combined. A fair comparison with ensembling is hard.

26.11.2024 14:13 — 👍 2    🔁 1    💬 1    📌 0
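The 5-fold bagging mentioned above (one model per cross-validation fold, predictions averaged) can be written generically for any scikit-learn-style estimator; plugging in a TabM or RealMLP wrapper is left to the caller.

import numpy as np
from sklearn.base import clone
from sklearn.model_selection import KFold

def fit_bagged(estimator, X, y, n_folds=5, seed=0):
    """Fit one clone of `estimator` per CV fold (each sees (k-1)/k of the data)."""
    models = []
    for train_idx, _ in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        m = clone(estimator)
        m.fit(X[train_idx], y[train_idx])
        models.append(m)
    return models

def predict_bagged(models, X_test):
    """Average the members' predictions (use predict_proba for classification)."""
    return np.mean([m.predict(X_test) for m in models], axis=0)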

PyTabKit 1.1 is out!

- Includes TabM and provides a scikit-learn interface
- Some baseline NN parameter names have been renamed (double underscores removed)
- Other small changes; see the readme.

github.com/dholzmueller...

25.11.2024 10:49 — 👍 13    🔁 4    💬 1    📌 0
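A sketch of the scikit-learn-style interface mentioned above (pip install pytabkit); the estimator name follows the README, but check the pytabkit repo linked above for the exact names in 1.1, including the new TabM wrappers.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from pytabkit import RealMLP_TD_Classifier  # RealMLP with tuned defaults

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RealMLP_TD_Classifier()  # drop-in scikit-learn estimator
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))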

Bluesky really is the new #rstats twitter because we have the first base R vs tidyverse flame war 🤣

14.11.2024 17:18 — 👍 581    🔁 70    💬 29    📌 12

With a message like that I just activate

14.11.2024 09:54 — 👍 2    🔁 0    💬 1    📌 0

Welcome!

07.08.2024 08:26 — 👍 0    🔁 0    💬 0    📌 0

Going to try to start posting more on here given the increasingly suffocating toxicity of the other place 👋

07.08.2024 07:13 — 👍 1834    🔁 160    💬 112    📌 13

When home heating prices are lower, fewer people die each winter, particularly in high-poverty communities. That's the punchline of my paper with Janjala Chirakijja and Pinchuan Ong on heating prices and mortality in the US, just published in the Economic Journal. 📉📈 academic.oup.com/ej/advance-a...

07.12.2023 18:35 — 👍 197    🔁 86    💬 2    📌 7

“Computer Age Statistical Inference” by Efron and Hastie is great for learning the connections among all these (though not much on deep learning specifically). And it’s free! Can’t recommend this book highly enough:
hastie.su.domains/CASI_files/P...

03.10.2023 21:34 — 👍 4    🔁 2    💬 0    📌 0
