I just released Sentence Transformers v4.1, featuring ONNX and OpenVINO backends for rerankers (2-3x speedups) and improved hard negatives mining, which helps prepare stronger training datasets.
Details in 🧵
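A minimal sketch of how the two headline features might be used, assuming the v4.1 API (the backend argument on CrossEncoder and mine_hard_negatives from sentence_transformers.util); the model names, dataset, and mining arguments are illustrative and may differ slightly between versions.

```python
from datasets import load_dataset
from sentence_transformers import CrossEncoder, SentenceTransformer
from sentence_transformers.util import mine_hard_negatives

# Load a reranker on the ONNX backend (backend="openvino" works the same way)
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2", backend="onnx")
scores = reranker.predict([
    ("How many people live in Berlin?", "Berlin has about 3.7 million inhabitants."),
])

# Mine hard negatives for (query, positive) pairs with a bi-encoder
embedder = SentenceTransformer("all-MiniLM-L6-v2")
pairs = load_dataset("sentence-transformers/natural-questions", split="train[:1000]")
triplets = mine_hard_negatives(
    pairs,
    embedder,
    num_negatives=5,          # negatives per (query, positive) pair
    range_max=50,             # only consider the top-50 retrieved candidates
    output_format="triplet",
)
```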
📣 Call for emergency reviewers
I am serving as an AC for #ICML2025 and am seeking emergency reviewers for two submissions.
Are you an expert in Knowledge Distillation or AI4Science?
If so, send me a DM with your Google Scholar and OpenReview profiles.
Thank you!
We've just released MMTEB, our multilingual upgrade to the MTEB Embedding Benchmark!
It's a huge collaboration between 56 universities, labs, and organizations, resulting in a massive benchmark of 1000+ languages, 500+ tasks, and a dozen+ domains.
Details in 🧵
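For context, evaluating a model on (a subset of) the benchmark might look like this hedged sketch using the mteb package; the task selection and output folder are illustrative.

```python
import mteb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Pick any subset of benchmark tasks; MMTEB adds many multilingual ones
tasks = mteb.get_tasks(tasks=["Banking77Classification"])
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder="results/all-MiniLM-L6-v2")
```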
Can Cross Encoders Produce Useful Sentence Embeddings?
IBM found that early cross-encoder layers can produce effective sentence embeddings, enabling 5.15x faster inference while maintaining accuracy comparable to full dual encoders.
arxiv.org/abs/2502.03552
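As a rough illustration of the idea (not the paper's exact recipe), one can pool an early hidden layer of a cross-encoder's underlying transformer to get a sentence embedding; the checkpoint and layer index below are assumptions.

```python
import torch
from transformers import AutoModel, AutoTokenizer

name = "cross-encoder/ms-marco-MiniLM-L-6-v2"  # illustrative cross-encoder checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name, output_hidden_states=True)

def early_layer_embedding(sentence: str, layer: int = 2) -> torch.Tensor:
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).hidden_states[layer]  # (1, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)      # mean-pool over real tokens only
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

emb = early_layer_embedding("Cross encoders can double as sentence encoders.")
```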
Same issue for me
Why choose between strong #LLM reasoning and efficient models?
Use DeepSeek to generate high-quality training data, then distil that knowledge into ModernBERT for fast, efficient classification.
New blog post: danielvanstrien.xyz/posts/2025/d...
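The recipe in the post boils down to labelling data with a large teacher model, then fine-tuning a small encoder on those labels. A hedged sketch of the second step, assuming the answerdotai/ModernBERT-base checkpoint and a transformers release recent enough to support it; the tiny inline dataset stands in for your DeepSeek-labelled data.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Stand-in for data labelled by the teacher LLM (e.g. DeepSeek)
llm_labelled = Dataset.from_dict({
    "text": ["Great battery life", "Arrived broken"],
    "label": [1, 0],
})

name = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="modernbert-distilled", num_train_epochs=1),
    train_dataset=llm_labelled.map(tokenize, batched=True),
    tokenizer=tokenizer,  # enables padding via the default data collator
)
trainer.train()
```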
People often claim they know when ChatGPT wrote something, but are they as accurate as they think?
Turns out that while the general population is unreliable, those who frequently use ChatGPT for writing tasks can spot even "humanized" AI-generated text with near-perfect accuracy 💯
I wrote about a new AI evaluation called "Humanity's Last Exam," a collection of 3,000 questions submitted by leading academics to try to stump leading AI models, which mostly find today's college-level tests too easy.
www.nytimes.com/2025/01/23/t...
I just released Sentence Transformers v3.4.0, featuring a memory leak fix (memory not being cleared upon model & trainer deletion), compatibility between the powerful Cached... losses and the Matryoshka loss modifier, and a bunch of fixes & small features.
Details in 🧵
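A sketch of the newly compatible combination, wrapping a Cached loss in the Matryoshka modifier; the dataset and dimension list are illustrative.

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import CachedMultipleNegativesRankingLoss, MatryoshkaLoss

model = SentenceTransformer("all-MiniLM-L6-v2")
train_dataset = Dataset.from_dict({
    "anchor": ["What is the capital of France?", "Who wrote Hamlet?"],
    "positive": ["Paris is the capital of France.", "Hamlet was written by Shakespeare."],
})

# Cached loss (large effective batch via mini-batching) wrapped in MatryoshkaLoss
base_loss = CachedMultipleNegativesRankingLoss(model, mini_batch_size=32)
loss = MatryoshkaLoss(model, base_loss, matryoshka_dims=[384, 256, 128, 64])

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```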
Disappointed with #ICLR or #NAACL reviews? Consider submitting your work to #Repl4NLP, whether it's full papers, extended abstracts, or cross-submissions. 🔥
Details on submissions: sites.google.com/view/repl4nl...
⏰ Deadline: January 30
Microsoft's latest small language model - phi-4 - is open source and now available on Hugging Face techcommunity.microsoft.com/blog/aiplatf...
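Trying it out locally might look like the sketch below, assuming the checkpoint is published as microsoft/phi-4 on the Hub and you have enough GPU memory; the arguments are illustrative.

```python
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="microsoft/phi-4",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
messages = [{"role": "user", "content": "Explain knowledge distillation in one sentence."}]
print(generator(messages, max_new_tokens=64)[0]["generated_text"])
```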
LLMs are Also Effective Embedding Models: An In-depth Overview
Provides a comprehensive analysis of adopting LLMs as embedding models, examining both zero-shot prompting and tuning strategies for deriving text embeddings competitive with traditional models.
arxiv.org/abs/2412.12591
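A common way to derive embeddings from a decoder-only LLM is last-token pooling; here is a minimal sketch of that approach with an arbitrary small checkpoint (Qwen/Qwen2.5-0.5B is just a stand-in).

```python
import torch
from transformers import AutoModel, AutoTokenizer

name = "Qwen/Qwen2.5-0.5B"  # any small decoder-only LM works for this sketch
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

def embed(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden[:, -1]                            # last-token pooling

vec = embed("LLMs can also serve as text encoders.")
```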
You and users of the "but humans" argument assume different goals for AI.
The argument assumes that the goal is to develop human-level AI (or it's used to counter claims that AI systems are less intelligent than humans). It's not a direct argument for their usefulness.
Not sure about this idea (and also the objective of maximizing impact), but I really like the "plain language summary" I've seen in some medical papers.
The upside of A.I. hallucination
gift link www.nytimes.com/2024/12/23/s...
Good blog post on good old encoder-style models.
Glad to see ModernBERT recently brought something new to the field. So don't count BERT as GOFAI yet.
www.yitay.net/blog/model-a...
Announcement #1: our call for papers is up!
colmweb.org/cfp.html
And excited to announce the COLM 2025 program chairs @yoavartzi.com @eunsol.bsky.social @ranjaykrishna.bsky.social and @adtraghunathan.bsky.social
Unsure where to submit your next research paper now that aideadlin.es is no longer updated? And let's be honest, isn't the location just as important as the conference itself?
🗺️ Check out my latest side project: deadlines.pieter.ai
🧪 New pre-print explores generative AI in medicine, highlighting applications for clinicians, patients, researchers, and educators. It also addresses challenges like privacy, transparency, and equity.
Additional details from the author linked below.
🩺🖥️
Direct link: arxiv.org/abs/2412.10337
As the year draws to a close, instead of listing my publications I want to shine a spotlight on the commonplace assumption that productivity must always increase. Good research is disruptive, and thinking time is central to high-quality scholarship and necessary for disruptive research.
IMO, there's a great discussion over there (in my timeline, not the For You tab) with interesting insights from the OpenAI team.
Bluesky isn't there (yet).
Re your question at the end:
I'll get straight to the point.
We trained 2 new models. Like BERT, but modern. ModernBERT.
Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.
It's much faster, more accurate, longer context, and more useful. 🧵
Now also out on arXiv:
www.arxiv.org/abs/2412.12119
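A quick, hedged way to poke at the model, assuming a transformers version recent enough to include the ModernBERT architecture; the prompt is just an example.

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="answerdotai/ModernBERT-base")
for prediction in fill("ModernBERT handles sequences up to 8192 [MASK] long."):
    print(prediction["token_str"], prediction["score"])
```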
1/ Okay, one thing that has been revealed to me from the replies to this is that many people don't know (or refuse to recognize) the following fact:
The units in ANNs are actually not a terrible approximation of how real neurons work!
A tiny 🧵.
🧠 #NeuroAI #MLSky
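A toy sketch of the analogy the thread is making (my own illustration, not the author's): an ANN unit computes a weighted sum of its inputs and passes it through a nonlinearity, much like a simple firing-rate model of a neuron.

```python
import numpy as np

def unit(inputs: np.ndarray, weights: np.ndarray, bias: float) -> float:
    drive = float(weights @ inputs + bias)  # ~ summed synaptic input
    return max(0.0, drive)                  # ReLU ~ a firing rate that cannot go negative

rate = unit(np.array([0.2, 0.8, 0.1]), np.array([1.5, -0.7, 0.3]), bias=0.05)
```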
3-7d is totally fine with me
Really interesting paper: The Large Concept Model is trained to perform autoregressive sentence prediction in an embedding space.
arxiv.org/abs/2412.08821
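A very rough conceptual sketch of that idea (not Meta's implementation): encode each sentence to a vector, then train a causal model to predict the next sentence's embedding. The MiniLM encoder here is a stand-in for the SONAR encoder used in the paper.

```python
import torch
import torch.nn as nn
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in sentence encoder
sentences = ["The sky darkened.", "Rain began to fall.", "Everyone ran for cover."]
embs = torch.tensor(encoder.encode(sentences))     # (num_sentences, dim)

# Tiny causal transformer over sentence embeddings
layer = nn.TransformerEncoderLayer(d_model=embs.size(1), nhead=4, batch_first=True)
predictor = nn.TransformerEncoder(layer, num_layers=2)
mask = nn.Transformer.generate_square_subsequent_mask(embs.size(0) - 1)

pred = predictor(embs[:-1].unsqueeze(0), mask=mask)[0]  # predict embedding t+1 from <= t
loss = nn.functional.mse_loss(pred, embs[1:])           # next-sentence-embedding objective
```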
I'm not a Python developer, and often battle with environments and dependencies when I have to use it. This comprehensive introduction to the uv package manager makes me less hesitant to use Python! www.saaspegasus.com/guides/uv-de...
The strain on scientific publishing: we set out to characterise the remarkable growth of the scientific literature over the last few years, despite declining growth in the number of scientists. What is going on?
direct.mit.edu/qss/article/...
A 🧵 1/n
#AcademicSky #PhDchat #ScientificPublishing #SciPub
The journal Nature asked 6 biomedical scientists to briefly comment on what should be a U.S. priority going forward.
Here's mine.
www.nature.com/articles/d41...