
Bradley Love

@profdata.bsky.social

Senior research scientist at Los Alamos National Laboratory. Former UCL, UTexas, Alan Turing Institute, Ellis EU. CogSci, AI, Comp Neuro, AI for scientific discovery https://bradlove.org

4,125 Followers  |  739 Following  |  51 Posts  |  Joined: 05.10.2023

Latest posts by profdata.bsky.social on Bluesky

Preview
Why I left academia and neuroscience. Don't worry, this isn't yet another story of rage-quitting.

Michael X Cohen on why he left academia/neuroscience.
mikexcohen.substack.com/p/why-i-left...

06.10.2025 17:05 - 👍 90    🔁 34    💬 4    📌 14
Preview
Home. Your local police force - online. Report a crime, contact us and other services, plus crime prevention advice, crime news, appeals and statistics.

moderation@blueskyweb.xyz, send to me, or send directly to the Met (London police), who are investigating: www.met.police.uk. I could see this being super distressing for a vulnerable person, so I hope this does not become more common. For me, it's been an exercise in rapidly learning not to care! 2/2

18.07.2025 22:14 - 👍 3    🔁 0    💬 0    📌 0

Some UK dude is trying to extort me, demanding money to not spread made-up stories. I reported to the police after getting flooded with phone messages I never listen to, etc. @bsky.app has been good about deleting his posts and accounts. If contacted, don't interact, but instead report to...1/2

18.07.2025 22:14 - 👍 4    🔁 0    💬 2    📌 0
Preview
Giving LLMs too much RoPE: A limit on Sutton's Bitter Lesson - Bradley C. Love. Introduction: Sutton's Bitter Lesson (Sutton, 2019) argues that machine learning breakthroughs, like AlphaGo, BERT, and large-scale vision models, rely on general, computation-driven methods that prior...

New blog w @ken-lxl.bsky.social, "Giving LLMs too much RoPE: A limit on Sutton's Bitter Lesson". The field has shifted from flexible data-driven position representations to fixed approaches following human intuitions. Here's why, and what it means for model performance: bradlove.org/blog/positio...

13.06.2025 14:09 - 👍 4    🔁 1    💬 3    📌 1
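For context on the post above, here is a minimal NumPy sketch of rotary position embeddings (RoPE), the kind of fixed, human-designed positional scheme the blog discusses. The shapes, names, and parameters are illustrative assumptions, not code from the blog.

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Apply rotary position embeddings (RoPE) to vectors x.

    x: array of shape (seq_len, dim), dim even; positions: integer token positions.
    Each dimension pair (2i, 2i+1) is rotated by angle pos * base**(-2i/dim),
    so dot products between rotated queries and keys depend on relative offsets.
    """
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) * 2.0 / dim)   # per-pair rotation frequencies
    angles = np.outer(positions, freqs)              # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]                  # even / odd dimensions
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Rotating queries and keys before attention encodes relative position in their dot products.
q = rope(np.random.randn(8, 64), np.arange(8))
k = rope(np.random.randn(8, 64), np.arange(8))
scores = q @ k.T
```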
https://bradlove.org/blog/prob-llm-consistency

New blog, "Backwards Compatible: The Strange Math Behind Word Order in AI" w @ken-lxl.bsky.social. It turns out the language learning problem is the same for any word order, but is that true in practice for large language models? paper: arxiv.org/abs/2505.08739 BLOG: bradlove.org/blog/prob-ll...

28.05.2025 14:15 - 👍 4    🔁 1    💬 3    📌 0
Preview
Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies. Can autoregressive large language models (LLMs) learn consistent probability distributions when trained on sequences in different token orders? We prove formally that for any well-defined probability ...

Bonus: I found it counterintuitive that (in theory) the learning problem is the same for any word ordering. Aligning proof and simulation was key. Now, new avenues open for addressing positional biases, improving training, and knowing when to trust LLMs. w @ken-lxl.bsky.social arxiv.org/abs/2505.08739

14.05.2025 15:02 - 👍 1    🔁 0    💬 2    📌 0
Preview
Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies. Can autoregressive large language models (LLMs) learn consistent probability distributions when trained on sequences in different token orders? We prove formally that for any well-defined probability ...

When LLMs diverge from one another because of word order (data factorization), it indicates their probability distributions are inconsistent, which is a red flag (not trustworthy). We trace deviations to self-attention positional and locality biases. 2/2 arxiv.org/abs/2505.08739

14.05.2025 15:02 - 👍 0    🔁 0    💬 3    📌 0
Preview
Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies. Can autoregressive large language models (LLMs) learn consistent probability distributions when trained on sequences in different token orders? We prove formally that for any well-defined probability ...

"Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies"
Oddly, we prove LLMs should be equivalent for any word ordering: forward, backward, scrambled. In practice, LLMs diverge from one another. Why? 1/2 arxiv.org/abs/2505.08739

14.05.2025 15:02 - 👍 4    🔁 0    💬 3    📌 0
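To illustrate the claim in this thread (a sketch, not the paper's proof): by the chain rule, every token ordering factorizes the same joint distribution, so an ideally trained autoregressive model should assign identical joint probabilities whether it reads text forward, backward, or scrambled.

```latex
% Chain-rule factorizations of one joint distribution over tokens w_1..w_n.
% \sigma is any permutation of the positions (forward, backward, scrambled).
\[
P(w_1,\ldots,w_n)
  \;=\; \prod_{t=1}^{n} P\!\left(w_t \mid w_1,\ldots,w_{t-1}\right)
  \;=\; \prod_{t=1}^{n} P\!\left(w_{\sigma(t)} \mid w_{\sigma(1)},\ldots,w_{\sigma(t-1)}\right)
\]
% Two-token example: P(w_1)\,P(w_2 \mid w_1) = P(w_2)\,P(w_1 \mid w_2).
% If forward- and backward-trained models assign different joints, at least
% one of them is probabilistically inconsistent.
```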

with @ken-lxl.bsky.social, @robmok.bsky.social, Brett Roads

17.02.2025 15:23 - 👍 1    🔁 0    💬 2    📌 0
Preview
Coordinating multiple mental faculties during learning - Scientific Reports

"Coordinating multiple mental faculties during learning" There's lots of good work in object recognition and learning, but how do we integrate the two? Here's a proposal and model that is more interactive than perception provides the inputs to cognition. www.nature.com/articles/s41...

17.02.2025 15:23 - 👍 31    🔁 9    💬 3    📌 0

Last year, we funded 250 authors and other contributors to attend #ICLR2024 in Vienna as part of this program. If you or your organization want to directly support contributors this year, please get in touch! Hope to see you in Singapore at #ICLR2025!

21.01.2025 15:52 - 👍 37    🔁 14    💬 1    📌 0

Thanks @hossenfelder.bsky.social for covering our recent paper, doi.org/10.1038/s415... Also, I want to spotlight this excellent podcast (19 minutes long) with Nicky Cartridge covering how AI will impact science and healthcare in the coming years, touchneurology.com/podcast/brai...

13.12.2024 15:44 - 👍 14    🔁 1    💬 1    📌 1

A 7B model is small enough to train efficiently on 4 A100s (thanks Microsoft), and at the time Mistral performed relatively well for its size.

27.11.2024 17:11 - 👍 7    🔁 0    💬 1    📌 0

Yes, the model weights and all materials are openly available. We really want to offer easy-to-use tools people can access through the web without hassle. To do that, we need to do more work (we'll be announcing an open-source effort soon) and need some funding for hosting a model endpoint.

27.11.2024 17:09 - 👍 9    🔁 0    💬 1    📌 0
Preview
BrainGPT. This is the homepage for BrainGPT, a Large Language Model tool to assist neuroscientific research.

While BrainBench focused on neuroscience, our approach is science general, so others can adopt our template. Everything is open weight and open source. Thanks to the entire team and the expert participants. Sign up for news at braingpt.org 8/8

27.11.2024 14:13 - 👍 11    🔁 0    💬 3    📌 0
Post image

Finally, LLMs can be augmented with neuroscience knowledge for better performance. We tuned Mistral on 20 years of the neuroscience literature using LoRA. The tuned model, which we refer to as BrainGPT, performed better on BrainBench. 7/8

27.11.2024 14:13 - 👍 7    🔁 0    💬 1    📌 0
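As a rough illustration of the LoRA step mentioned above, here is a hedged sketch using the Hugging Face PEFT library; the hyperparameters, target modules, and model name are assumptions for illustration, not the BrainGPT training recipe.

```python
# Hedged sketch of LoRA fine-tuning with PEFT; values are illustrative.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
config = LoraConfig(
    r=16,                                  # rank of the low-rank update matrices
    lora_alpha=32,                         # scaling factor for the update
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)   # only the adapter weights are trainable
model.print_trainable_parameters()     # shows the small fraction of tuned parameters
```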
Preview
Confidence-weighted integration of human and machine judgments for superior decision-making. Large language models (LLMs) have emerged as powerful tools in various domains. Recent studies have shown that LLMs can surpass humans in certain tasks, such as predicting the outcomes of neuroscience...

Indeed, follow-up work on teaming finds that joint LLM and human teams outperform either alone, because LLMs and humans make different types of errors. We offer a simple method to combine confidence-weighted judgements.
arxiv.org/abs/2408.08083 6/8

27.11.2024 14:13 - 👍 10    🔁 0    💬 2    📌 0
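A minimal sketch of one way to implement confidence-weighted combination for a two-alternative task; this generic convex-combination rule is an assumption for illustration, not necessarily the method in the linked paper.

```python
# Hypothetical confidence-weighted combination of a human and an LLM judgment
# on a two-alternative choice (not the paper's exact rule).
def combine(p_human, conf_human, p_llm, conf_llm):
    """p_*: probability assigned to option A; conf_*: confidence weights in [0, 1]."""
    w_h = conf_human / (conf_human + conf_llm)
    w_m = conf_llm / (conf_human + conf_llm)
    p_combined = w_h * p_human + w_m * p_llm
    return "A" if p_combined >= 0.5 else "B"

# A confident LLM outweighs an unsure human here.
print(combine(p_human=0.6, conf_human=0.4, p_llm=0.3, conf_llm=0.9))  # -> "B"
```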
Post image

In the Nature HB paper, both human experts and LLMs were well calibrated - when they were more certain of their decisions, they were more likely to be correct. Calibration is beneficial for human-machine teaming. 5/8

27.11.2024 14:13 - 👍 6    🔁 0    💬 1    📌 0
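A small sketch of how calibration of this kind can be checked, assuming a two-choice task with stated confidences; the binning and names are illustrative.

```python
# Bin decisions by stated confidence and compare mean confidence with accuracy per bin.
import numpy as np

def calibration_bins(confidence, correct, n_bins=5):
    confidence, correct = np.asarray(confidence, float), np.asarray(correct, float)
    edges = np.linspace(0.5, 1.0, n_bins + 1)   # two-choice task: confidence >= 0.5
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidence >= lo) & (confidence < hi)
        if mask.any():
            rows.append((confidence[mask].mean(), correct[mask].mean()))
    return rows  # well-calibrated judges give (confidence, accuracy) pairs near the diagonal
```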
Preview
Matching domain experts by training from scratch on domain knowledge. Recently, large language models (LLMs) have outperformed human experts in predicting the results of neuroscience experiments (Luo et al., 2024). What is the basis for this performance? One possibility...

There were no signs of leakage from the training set to the test set. We performed standard checks. In follow-up work, we trained an LLM from scratch to rule out leakage; even this smaller model was superhuman on BrainBench. arxiv.org/abs/2405.09395 4/8

27.11.2024 14:13 - 👍 7    🔁 0    💬 1    📌 0
Post image

All 15 LLMs we considered crushed human experts on BrainBench's predictive task. LLMs correctly predicted neuroscience results (across all subareas) dramatically better than human experts, including those with decades of experience. 3/8

27.11.2024 14:13 - 👍 10    🔁 0    💬 1    📌 0
Post image

To test this, we created BrainBench, a forward-looking benchmark that stresses prediction over retrieval of facts, avoiding LLMs' "hallucination" issue. The task was to predict which version of a Journal of Neuroscience abstract gave the actual result. 2/8

27.11.2024 14:13 - 👍 8    🔁 0    💬 1    📌 0
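A minimal sketch of the decision rule described in this thread: score both versions of an abstract with a causal language model and pick the lower-perplexity option. The model name and helper functions are illustrative assumptions, not the exact evaluation code.

```python
# Choose the abstract version the model finds less surprising (lower perplexity).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

def perplexity(text):
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss   # mean cross-entropy per token
    return torch.exp(loss).item()

def choose(original_abstract, altered_abstract):
    # The benchmark scores the model as correct when the original (real-result)
    # version gets the lower perplexity.
    return "original" if perplexity(original_abstract) < perplexity(altered_abstract) else "altered"
```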
Post image

"Large language models surpass human experts in predicting neuroscience results" w @ken-lxl.bsky.social
and braingpt.org. LLMs integrate a noisy yet interrelated scientific literature to forecast outcomes. nature.com/articles/s41... 1/8

27.11.2024 14:13 - 👍 267    🔁 107    💬 19    📌 19

Thanks Gary! I have no idea because I don't see how we get anyone to learn over more than a billion tokens. Maybe one could bootstrap some estimate from the perplexity difference between forward and backward, assuming we can get a sense of how that affects learning? Just off the top of my head...

20.11.2024 22:27 - 👍 0    🔁 0    💬 0    📌 0

I am not seeing the issue. Every method is the same, but the text is reversed. We even tokenize separately for forward and backward to make them comparable. Perplexity is calculated over the entire option for the benchmark items. The difficulty doesn't have to be the same - it just turned out that way.

19.11.2024 17:44 - 👍 0    🔁 0    💬 1    📌 0

For backward: Everything is reversed at the character level, including the benchmark items. So, the last character of the last word of each passage comes first, and the first character of the first word comes last. On the benchmark, as in the forward case, the option with lower perplexity is chosen.

19.11.2024 16:59 - 👍 0    🔁 0    💬 1    📌 0
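A tiny sketch of the character-level reversal described above; it assumes plain Python strings, with the separate forward/backward tokenization mentioned in the replies happening afterward.

```python
# Character-level reversal for the backward condition: the whole passage
# (and each benchmark option) is reversed, so the last character of the
# last word comes first. A separate tokenizer is then trained on reversed text.
def reverse_chars(text: str) -> str:
    return text[::-1]

print(reverse_chars("Large language models"))  # -> "sledom egaugnal egraL"
```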
Preview
Beyond Human-Like Processing: Large Language Models Perform Equivalently on Forward and Backward Scientific Text. The impressive performance of large language models (LLMs) has led to their consideration as models of human language processing. Instead, we suggest that the success of LLMs arises from the flexibili...

Instead of viewing LLMs as models of humans or stochastic parrots, we view them as general and powerful pattern learners that can master a superset of what people can. arxiv.org/abs/2411.11061 2/2

19.11.2024 13:21 - 👍 7    🔁 1    💬 2    📌 1
Post image

"Beyond Human-Like Processing: Large Language Models Perform Equivalently on Forward and Backward Scientific Text" Our take is that large language models (LLMs) are neither stochastic parrots nor faithful models of human language processing. arxiv.org/abs/2411.11061 1/2

19.11.2024 13:21 - 👍 37    🔁 11    💬 3    📌 0
bsky-follow-back-all/ at main · jiftechnify/bsky-follow-back-all. Contribute to jiftechnify/bsky-follow-back-all development by creating an account on GitHub.

Has anyone tried this tool to follow back all of one's followers? github.com/jiftechnify/... It seems legit but I'm wary of giving a password to a third-party website. So many people here so suddenly!

19.11.2024 13:06 - 👍 2    🔁 0    💬 1    📌 0
The inevitability and superfluousness of cell types in spatial cognition

I fully support the last sentence of this abstract from @profdata.bsky.social:

elifesciences.org/reviewed-pre...

"...the complexity of the brain should be respected and intuitive notions of cell type, which can be misleading and arise in any complex network, should be relegated to history."

🧠📈🧪

28.08.2024 16:47 - 👍 29    🔁 9    💬 11    📌 4

🚨Submissions for #CCN2024 are now open at ccneuro.org 🚨

We welcome submissions for 2-page papers (deadline: 12 April) and Generative Adversarial Collaborations (GACs), Keynote+Tutorials, and (new this year!) Community Events (deadline: 5 April).

Stay tuned: registration will open in early April!

27.03.2024 20:32 - 👍 17    🔁 15    💬 2    📌 1
