
Manuel

@chasinggradients.bsky.social

NLP - mostly representation learning #NLP #NLProc

65 Followers  |  315 Following  |  11 Posts  |  Joined: 29.10.2023

Latest posts by chasinggradients.bsky.social on Bluesky


I just released Sentence Transformers v4.1, featuring ONNX and OpenVINO backends for rerankers (2-3x speedups) and improved hard negative mining, which helps prepare stronger training datasets.

Details in 🧵

15.04.2025 13:54 · 👍 11  🔁 4  💬 1  📌 0
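As a rough illustration of the reranker backends mentioned in the post above, here is a minimal sketch, assuming sentence-transformers >= 4.1 with the ONNX extras installed; the model name is just a commonly used public reranker standing in for whatever model you actually serve, and the `backend` keyword follows the library's documented backend pattern, so treat the details as assumptions if your version differs. The improved hard negative mining lives in `sentence_transformers.util.mine_hard_negatives` and is used separately when building training data.

```python
# Minimal sketch: loading a reranker (CrossEncoder) with the ONNX backend.
# Assumes: pip install "sentence-transformers[onnx]>=4.1"
from sentence_transformers import CrossEncoder

# backend can be "torch" (default), "onnx", or "openvino" in recent releases
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2", backend="onnx")

query = "How many people live in Berlin?"
passages = [
    "Berlin has a population of roughly 3.7 million people.",
    "Berlin is well known for its museums and nightlife.",
]

# Score (query, passage) pairs; higher scores indicate higher relevance
scores = model.predict([(query, p) for p in passages])
print(scores)
```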
ALT: a cat holding a sign that says help

πŸ—£οΈCall for emergency reviewers

I am serving as an AC for #ICML2025, seeking emergency reviewers for two submissions

Are you an expert of Knowledge Distillation or AI4Science?

If so, send me DM with your Google Scholar profile and OpenReview profile

Thank you!

20.03.2025 05:25 · 👍 2  🔁 1  💬 0  📌 1

We've just released MMTEB, our multilingual upgrade to the MTEB Embedding Benchmark!

It's a huge collaboration between 56 universities, labs, and organizations, resulting in a massive benchmark covering 1000+ languages, 500+ tasks, and more than a dozen domains.

Details in 🧵

21.02.2025 15:06 · 👍 23  🔁 4  💬 2  📌 0
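For readers who want to try the benchmark, here is a minimal sketch using the mteb package together with a SentenceTransformer model; the task names and model are placeholders chosen for speed, not part of the announcement, and the exact MMTEB benchmark selection in your mteb version may be named differently.

```python
# Minimal sketch: evaluating an embedding model on a small MTEB task subset.
# Assumes: pip install mteb sentence-transformers
import mteb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Pick a couple of tasks by name; the full multilingual benchmark is much larger.
tasks = mteb.get_tasks(tasks=["Banking77Classification", "STSBenchmark"])
evaluation = mteb.MTEB(tasks=tasks)

# Results (per task and aggregate scores) are written to the output folder.
results = evaluation.run(model, output_folder="results/all-MiniLM-L6-v2")
```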
Can Cross Encoders Produce Useful Sentence Embeddings? Cross encoders (CEs) are trained with sentence pairs to detect relatedness. As CEs require sentence pairs at inference, the prevailing view is that they can only be used as re-rankers in information r...

Can Cross Encoders Produce Useful Sentence Embeddings?

IBM found that early cross-encoder layers can produce effective sentence embeddings, enabling 5.15x faster inference while maintaining accuracy comparable to full dual encoders.

πŸ“ arxiv.org/abs/2502.03552

07.02.2025 03:34 β€” πŸ‘ 3    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0
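The paper's exact recipe is in the link above; purely to illustrate the idea of reading a sentence embedding off an early layer of a cross-encoder checkpoint, here is a hedged toy sketch. The layer index, pooling choice, and model name are my assumptions for illustration, not the paper's setup.

```python
# Toy sketch: mean-pool an early hidden layer of a cross-encoder checkpoint
# to get a single-sentence embedding. The layer choice here is arbitrary.
import torch
from transformers import AutoModel, AutoTokenizer

name = "cross-encoder/ms-marco-MiniLM-L-6-v2"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name, output_hidden_states=True)

def early_layer_embedding(text: str, layer: int = 3) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).hidden_states[layer]   # (1, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)          # masked mean pooling

emb = early_layer_embedding("Cross encoders can double as sentence encoders.")
print(emb.shape)
```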

Same issue for me

30.01.2025 19:53 · 👍 0  🔁 0  💬 0  📌 0
Distilling DeepSeek reasoning to ModernBERT classifiers: How can we use the reasoning ability of DeepSeek to generate synthetic labels for fine-tuning a ModernBERT model?

Why choose between strong #LLM reasoning and efficient models?

Use DeepSeek to generate high-quality training data, then distil that knowledge into ModernBERT for fast, efficient classification.

New blog post: danielvanstrien.xyz/posts/2025/d...

29.01.2025 10:07 · 👍 58  🔁 11  💬 2  📌 4
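The linked blog post has the full walkthrough; below is only a hedged sketch of the final step (fine-tuning a ModernBERT classifier on synthetic, teacher-generated labels), with the tiny in-memory dataset, label set, and hyperparameters invented purely for illustration.

```python
# Sketch: fine-tune a ModernBERT classifier on labels produced by a teacher LLM.
# Assumes: pip install "transformers>=4.48" datasets torch
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Made-up examples; in practice the labels come from the teacher model's outputs.
data = Dataset.from_dict({
    "text": ["Refund not received after two weeks", "Love the new dashboard!"],
    "label": [0, 1],  # 0 = complaint, 1 = praise
})

name = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="modernbert-distilled", num_train_epochs=1),
    train_dataset=data.map(tokenize, batched=True),
    tokenizer=tokenizer,  # enables dynamic padding via the default data collator
)
trainer.train()
```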

People often claim they know when ChatGPT wrote something, but are they as accurate as they think?

Turns out that while the general population is unreliable, those who frequently use ChatGPT for writing tasks can spot even "humanized" AI-generated text with near-perfect accuracy 🎯

28.01.2025 14:55 · 👍 187  🔁 66  💬 10  📌 19
A Test So Hard No AI System Can Pass It – Yet (Gift Article) The creators of a new test called “Humanity’s Last Exam” argue we may soon lose the ability to create tests hard enough for A.I. models.

I wrote about a new AI evaluation called "Humanity's Last Exam," a collection of 3,000 questions submitted by leading academics to try to stump leading AI models, which mostly find today's college-level tests too easy.

www.nytimes.com/2025/01/23/t...

23.01.2025 16:41 · 👍 209  🔁 46  💬 17  📌 15

I just released Sentence Transformers v3.4.0, featuring a memory leak fix (memory not being cleared upon model & trainer deletion), compatibility between the powerful Cached... losses and the Matryoshka loss modifier, and a bunch of fixes & small features.

Details in 🧵

23.01.2025 16:44 · 👍 12  🔁 4  💬 2  📌 0
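For context on the loss compatibility mentioned in that release, here is a minimal sketch of wrapping a Cached loss in the Matryoshka modifier; the base model name, mini-batch size, and dimension list are placeholders, not values from the release notes.

```python
# Sketch: combine a Cached* loss with the MatryoshkaLoss modifier (compatible as of v3.4.0).
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import (CachedMultipleNegativesRankingLoss,
                                          MatryoshkaLoss)

model = SentenceTransformer("microsoft/mpnet-base")

# Cached loss allows large effective batch sizes at a fixed memory budget.
inner_loss = CachedMultipleNegativesRankingLoss(model, mini_batch_size=32)

# Matryoshka modifier trains the same embeddings to work at truncated dimensions.
loss = MatryoshkaLoss(model, inner_loss, matryoshka_dims=[768, 512, 256, 128, 64])
# `loss` can then be passed to SentenceTransformerTrainer as usual.
```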
10th Workshop on Representation Learning for NLP - Call for Papers The 10th Workshop on Representation Learning for NLP (RepL4NLP 2025), co-located with NAACL 2025 in Albuquerque, New Mexico, invites papers of a theoretical or experimental nature describing recent ad...

Disappointed with #ICLR or #NAACL reviews? Consider submitting your work to #Repl4NLP, whether it's full papers, extended abstracts, or cross-submissions. 🔥

Details on submissions 👉 sites.google.com/view/repl4nl...

⏰ Deadline January 30

23.01.2025 16:30 · 👍 2  🔁 2  💬 0  📌 1
Introducing Phi-4: Microsoft's Newest Small Language Model Specializing in Complex Reasoning | Microsoft Community Hub Today we are introducing Phi-4, our 14B parameter state-of-the-art small language model (SLM) that excels at complex reasoning in areas such as math, in...

Microsoft's latest small language model, Phi-4, is open source and now available on Hugging Face techcommunity.microsoft.com/blog/aiplatf...

09.01.2025 15:40 · 👍 10  🔁 4  💬 0  📌 0
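A hedged sketch of trying the released checkpoint with transformers; the model id follows the announcement, but the prompt and generation settings are arbitrary, and a 14B model needs substantial GPU memory (or quantization) to run.

```python
# Sketch: load Phi-4 from the Hugging Face Hub and generate a short completion.
# Assumes a recent transformers release and enough memory for a 14B model.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="microsoft/phi-4",
    torch_dtype="auto",   # pick an appropriate dtype automatically
    device_map="auto",    # spread the model across available devices
)

out = generator("Briefly explain what a small language model is.", max_new_tokens=96)
print(out[0]["generated_text"])
```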
LLMs are Also Effective Embedding Models: An In-depth Overview Large language models (LLMs) have revolutionized natural language processing by achieving state-of-the-art performance across various tasks. Recently, their effectiveness as embedding models has gaine...

LLMs are Also Effective Embedding Models: An In-depth Overview

Provides a comprehensive analysis of adopting LLMs as embedding models, examining both zero-shot prompting and tuning strategies for deriving text embeddings competitive with traditional models.

πŸ“ arxiv.org/abs/2412.12591

18.12.2024 06:40 β€” πŸ‘ 1    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0
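One common zero-shot recipe covered by surveys like this is last-token pooling of a decoder-only LM's hidden states; below is a hedged toy sketch of that idea with a small model standing in for a real LLM. The model choice and lack of any instruction prompt are my simplifications, not the survey's setup.

```python
# Toy sketch: derive a text embedding from a decoder-only LM via last-token pooling.
import torch
from transformers import AutoModel, AutoTokenizer

name = "gpt2"  # stand-in; the surveyed methods use much larger LLMs
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

def embed(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden[:, -1, :]  # embedding = hidden state of the final token

a, b = embed("A cat sat on the mat."), embed("A kitten rested on the rug.")
print(torch.cosine_similarity(a, b))
```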

You and users of the "but humans" argument assume different goals for AI.

The argument assumes that the goal is to develop human-level AI. (Or it's used to counter statements claiming AI systems are less intelligent than humans.) It's not a direct argument for their usefulness.

01.01.2025 10:23 · 👍 2  🔁 0  💬 0  📌 0

Not sure about this idea (and also the objective of maximizing impact), but I really like the "plain language summary" I've seen in some medical papers.

25.12.2024 10:05 · 👍 1  🔁 0  💬 0  📌 0
How Hallucinatory A.I. Helps Science Dream Up Big Breakthroughs (Gift Article) Hallucinations, a bane of popular A.I. programs, turn out to be a boon for venturesome scientists eager to push back the frontiers of human knowledge.

The upside of A.I. hallucination
gift link www.nytimes.com/2024/12/23/s...

24.12.2024 15:05 · 👍 104  🔁 30  💬 10  📌 4
What happened to BERT & T5? On Transformer Encoders, PrefixLM and Denoising Objectives – Yi Tay A Blogpost series about Model Architectures Part 1: What happened to BERT and T5? Thoughts on Transformer Encoders, PrefixLM and Denoising objectives

Good blog post on good old encoder-style models.

Glad to see ModernBERT recently brought something new to the field. So don't count BERT as GOFAI yet.

www.yitay.net/blog/model-a...

24.12.2024 12:16 · 👍 0  🔁 0  💬 0  📌 0

Announcement #1: our call for papers is up! 🎉
colmweb.org/cfp.html
And excited to announce the COLM 2025 program chairs @yoavartzi.com @eunsol.bsky.social @ranjaykrishna.bsky.social and @adtraghunathan.bsky.social

17.12.2024 15:48 · 👍 67  🔁 24  💬 0  📌 1
Computer Science Conference Deadlines Map Interactive world map of Computer Science, AI, and ML conference deadlines

Unsure where to submit your next research paper now that aideadlin.es is no longer updated? And let's be honest, isn't the location just as important as the conference itself?

πŸ—ΊοΈ Check out my latest side-project: deadlines.pieter.ai

23.12.2024 14:39 β€” πŸ‘ 13    πŸ” 4    πŸ’¬ 0    πŸ“Œ 0

🧪 New pre-print explores generative AI in medicine, highlighting applications for clinicians, patients, researchers, and educators. It also addresses challenges like privacy, transparency, and equity.
Additional details from the author linked below.
🩺🖥️
Direct link: arxiv.org/abs/2412.10337

22.12.2024 15:03 · 👍 20  🔁 4  💬 1  📌 0

As the year draws to an end, instead of listing my publications I want to shine a spotlight on the commonplace assumption that productivity must always increase. Good research is disruptive, and thinking time is central to high-quality scholarship and necessary for disruptive work.

20.12.2024 11:18 · 👍 1156  🔁 376  💬 21  📌 57

IMO, there's a great discussion over there (in my timeline, not the For You tab) with interesting insights from the OpenAI team.

Bluesky isn't there (yet).

22.12.2024 13:24 · 👍 4  🔁 0  💬 0  📌 0

Re your question at the end:

22.12.2024 13:18 · 👍 1  🔁 0  💬 1  📌 0

I'll get straight to the point.

We trained 2 new models. Like BERT, but modern. ModernBERT.

Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.

It's much faster, more accurate, longer context, and more useful. 🧵

19.12.2024 16:45 · 👍 628  🔁 148  💬 19  📌 34
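To make the "workhorse model" claim concrete, here is a hedged sketch of the simplest way to poke at the base checkpoint (masked-token prediction) with transformers; the sentence is arbitrary, and fine-tuning for retrieval or classification works the same way as for older BERT-style encoders.

```python
# Sketch: quick sanity check of ModernBERT with a fill-mask pipeline.
# Assumes a transformers release recent enough to include the ModernBERT architecture.
from transformers import pipeline

fill = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

# Print the top predictions for the masked token with their scores.
for pred in fill("Paris is the [MASK] of France."):
    print(f'{pred["token_str"]:>12}  {pred["score"]:.3f}')
```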

Now also out on arXiv:
www.arxiv.org/abs/2412.12119

18.12.2024 08:23 · 👍 30  🔁 5  💬 2  📌 0

1/ Okay, one thing that has been revealed to me from the replies to this is that many people don't know (or refuse to recognize) the following fact:

The units in ANNs are actually not a terrible approximation of how real neurons work!

A tiny 🧵.

🧠📈 #NeuroAI #MLSky

16.12.2024 20:03 · 👍 153  🔁 39  💬 21  📌 17
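For readers unsure what "unit" means here, a minimal sketch of the standard point-neuron abstraction used in ANNs: a weighted sum of inputs passed through a nonlinearity, loosely analogous to dendritic integration followed by a firing-rate response. The numbers are arbitrary.

```python
# Sketch: a single ANN unit (point neuron): weighted input sum + bias -> nonlinearity.
import math

def unit(inputs, weights, bias):
    # "Dendritic integration": weighted sum of incoming activity
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    # "Firing rate": squashing nonlinearity (sigmoid here; ReLU is more common in practice)
    return 1.0 / (1.0 + math.exp(-total))

print(unit(inputs=[0.2, 0.9, 0.1], weights=[1.5, -0.7, 2.0], bias=0.1))
```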

3-7d is totally fine with me

16.12.2024 09:36 · 👍 4  🔁 0  💬 0  📌 0
Large Concept Models: Language Modeling in a Sentence Representation Space LLMs have revolutionized the field of artificial intelligence and have emerged as the de-facto tool for many tasks. The current established technology of LLMs is to process input and generate output a...

Really interesting paper: The Large Concept Model is trained to perform autoregressive sentence prediction in an embedding space.
arxiv.org/abs/2412.08821

16.12.2024 09:35 · 👍 0  🔁 0  💬 0  📌 0
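This is not the paper's architecture, just a toy sketch of the core idea (autoregressively predicting the next sentence embedding rather than the next token); the dimensions, layer counts, and random "sentence embeddings" are all made up for illustration.

```python
# Toy sketch: predict the next sentence *embedding* from a sequence of sentence embeddings.
import torch
import torch.nn as nn

dim, seq_len = 256, 8                           # made-up embedding size / context length
sentence_embeddings = torch.randn(1, seq_len, dim)  # stand-in for precomputed embeddings

predictor = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2,
)
causal_mask = nn.Transformer.generate_square_subsequent_mask(seq_len)

hidden = predictor(sentence_embeddings, mask=causal_mask)
next_sentence_embedding = hidden[:, -1, :]      # prediction for the following sentence
print(next_sentence_embedding.shape)            # torch.Size([1, 256])
```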
uv: An In-Depth Guide to Python's Fast and Ambitious New Package Manager A comprehensive guide on why and how to start using uv, the package manager (and much more) that's taken the Python world by storm.

I'm not a Python developer, and often battle with environments and dependencies when I have to use it. This comprehensive introduction to the uv package manager makes me less hesitant to use Python! www.saaspegasus.com/guides/uv-de...

11.12.2024 08:41 · 👍 44  🔁 15  💬 3  📌 0

The strain on scientific publishing: we set out to characterise the remarkable growth of the scientific literature in the last few years, in spite of declining growth in total scientists. What is going on?

direct.mit.edu/qss/article/...

A 🧵 1/n
#AcademicSky #PhDchat #ScientificPublishing #SciPub

19.11.2024 12:27 · 👍 992  🔁 559  💬 46  📌 134

The journal Nature asked 6 biomedical scientists to briefly comment on what should be a U.S. priority going forward.
Here's mine.
www.nature.com/articles/d41...

25.11.2024 21:52 · 👍 245  🔁 54  💬 12  📌 9
