LightOn (@lightonai) — Bluesky Profile

4 days ago

LightOn Enhances Its Paradigm Platform with Real-Time Web Access Through a Strategic Partnership with Linkup
LightOn, a leading French provider of secure AI for sensitive data, announces a strategic partnership with French startup Linkup, which specializes in web search designed for AI applications.

🗞️ Read the press release: www.lighton.ai/lighton-blog...

4 days ago

Private data and the open web, securely unified in a single pipeline.

Together, the two technologies enable organizations to build AI systems that are better informed, more reliable, and designed for demanding environments, combining search and reasoning to deliver accurate and actionable outputs.

4 days ago

AI agents are only as good as the information they can access.

🇪🇺 LightOn, an AI Search & Reason company, is partnering with Linkup to provide structured, real-time web search to Paradigm within a secure, fully European technology stack.

3 weeks ago

Day Zero of Multi-Vector Retrieval
Introducing ColBERT-Zero: a late-interaction model trained from scratch with PyLate

Models, checkpoints, training code under Apache 2.0.

🧑‍🍳 Kudos to the whole team @nohtow.bsky.social Luca Arnaboldi @amelietabatta.bsky.social @krzakalaf.bsky.social

🔗 Dive into the release: www.lighton.ai/lighton-blog...

3 weeks ago

🥇 SOTA on BEIR, <150M params
⚡ Supervised-first → distill = most of the gains for a fraction of the cost
🧠 Prompt alignment is non-negotiable to preserve peak performance through fine-tuning

3 weeks ago

In collaboration with @epfl-ai-center.bsky.social and the Swiss AI initiative, LightOn pre-trained ColBERT-Zero end-to-end for late-interaction retrieval.

3 weeks ago

Day Zero for Multi-Vector Retrieval.
Today we’re flipping the retrieval playbook: no dense model adaptation, no retrofit.

🏗️ Multi-vector from scratch, powered by PyLate.

Meet ColBERT-Zero

1 month ago

LateOn-Code & ColGrep: LightOn unveils state-of-the-art code retrieval models and code search tooling
The "Stronger Grep" for Modern Development
While AI coding assistants like Claude Code have transformed how code is written, their ability to navigate large codebases efficiently is often limited by k...

Give your coding agent the search it deserves.

Huge kudos to @nohtow.bsky.social and @raphaelsty.bsky.social

Read more: www.lighton.ai/lighton-blog...

1 month ago

What we measured with Claude Code:
🚀 70% win rate vs. vanilla grep
📉 ~60k tokens saved per question
🤏 56% fewer search operations

Built in Rust with Next-Plaid - 100% local - No code leaves your machine.

1 month ago

ColGrep is powered by LateOn-Code-edge (17M) and LateOn-Code (130M), the first late-interaction models purpose-built for code.

🏆 They top MTEB Code, outperforming models up to 17x their size while running instantly on a laptop.

1 month ago

ColGrep mirrors the grep interface your agents already use, but replaces pattern matching with semantic scoring, and supports hybrid queries that combine both. It plugs straight into Claude Code, OpenCode, or Codex.
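As a rough illustration of what a hybrid query could look like, the sketch below first narrows candidates with a regular expression and then ranks the survivors with a semantic scorer. It is not ColGrep's interface or implementation: `hybrid_search`, `toy_overlap_score`, and the sample snippets are hypothetical, and the word-overlap scorer is only a stand-in for a real late-interaction model.

```python
import re

def hybrid_search(snippets, query, score_fn, pattern=None, top_k=3):
    """Keep snippets matching `pattern` (if given), then rank them by score_fn."""
    regex = re.compile(pattern) if pattern else None
    candidates = [s for s in snippets if regex is None or regex.search(s)]
    ranked = sorted(
        ((score_fn(query, s), s) for s in candidates),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return ranked[:top_k]

def toy_overlap_score(query, text):
    """Stand-in scorer (Jaccard word overlap); a real tool would use a model."""
    q = set(re.findall(r"[a-z]+", query.lower()))
    t = set(re.findall(r"[a-z]+", text.lower()))
    return len(q & t) / len(q | t) if q | t else 0.0

# Hypothetical code snippets to search over.
snippets = [
    "def load_config(path): return parse(open(path).read())",
    "def save_user(user): db.write(user)",
    "class ConfigLoader: # read settings from a config file",
]

query = "read config file"
semantic_only = hybrid_search(snippets, query, toy_overlap_score)
functions_only = hybrid_search(snippets, query, toy_overlap_score, pattern=r"^def ")
```

With no pattern, the class definition ranks first; adding the `^def ` pattern restricts the ranking to function definitions, so `load_config` comes out on top.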

1 month ago

🔥 Stop burning tokens on blind grep searches. Give your coding agent semantic eyes.

Meet LateOn-Code & ColGrep: a Rust-powered search tool and two SOTA late-interaction models that bring intent-level code retrieval directly to your terminal.

1 month ago

Introducing LightOn NextPlaid
Multi-Vector Database Built for Sharper Retrieval and Frugal Inference

Huge kudos to @raphaelsty.bsky.social for shipping this breakthrough! 🙌

Read the full article here 👉 www.lighton.ai/lighton-blog...

1 month ago

NextPlaid represents the "Blanc" milestone in our Bleu/Blanc/Rouge roadmap for enterprise document intelligence. It follows the "Bleu" release, LightOnOCR-2, a SOTA 1B OCR model that converts complex documents into clean, usable text.

1 month ago

⚙️ Production Ready:
Built in Rust and optimized for CPUs, it supports incremental index updates and concurrent reads/writes, capabilities missing from standard implementations.

1 month ago

🚀 Seamless Integration:
NextPlaid runs alongside your existing vector database. You can add multi-vector retrieval to your established RAG pipeline without ripping anything out.

1 month ago

📉 Frugal Inference: High-signal context reduces the amount of noise sent to your LLM, allowing it to answer with fewer, more accurate tokens.

1 month ago

Why NextPlaid is the missing layer for your RAG stack:

🎯 Precision Matching: Retrieval matches at the token level, surfacing the exact passage that answers your question rather than just a document that vaguely relates.

1 month ago

By representing documents as sets of vectors, one per token, we preserve the distinct concepts and precise details that other search engines average away.
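To make the "one vector per token" idea concrete, here is a minimal sketch of ColBERT-style MaxSim scoring next to a single pooled-vector score. It is not NextPlaid's API or implementation; the function names and the 2-dimensional toy embeddings are made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def maxsim_score(query_vecs, doc_vecs):
    """Late interaction: each query token keeps its best-matching document token."""
    return sum(max(cosine(q, d) for d in doc_vecs) for q in query_vecs)

def pooled_score(query_vecs, doc_vecs):
    """Single-vector baseline: average the token vectors, then compare once."""
    def mean(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    return cosine(mean(query_vecs), mean(doc_vecs))

# Toy embeddings (assumed values): one query token, two candidate documents.
query = [[1.0, 0.0]]
# One token matches the query exactly; the other three are unrelated.
doc_needle = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
# Every token is only loosely related to the query.
doc_vague = [[0.7, 0.7], [0.7, 0.7]]

maxsim_prefers_needle = maxsim_score(query, doc_needle) > maxsim_score(query, doc_vague)
pooling_prefers_vague = pooled_score(query, doc_needle) < pooled_score(query, doc_vague)
```

In this toy case both flags are true: MaxSim ranks the document containing the exact match first, while averaging dilutes that token and ranks the vaguely related document higher.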

1 month ago

🔍🪡 To find the needle, you'd better index every straw of the haystack.

Today, LightOn is launching LightOn NextPlaid: a CPU-optimized multi-vector database that indexes at the token level.

1 month ago

How do you give artificial intelligence a reliable memory? With Amélie Chatelain, Head of Knowledge & Search at LightOn

In the enterprise:
📄 your documents are living things,
🔍 observability is essential,
🌳 noise is expensive, and GPUs don't grow on trees!

A dense, no-nonsense episode on AI in the enterprise.

🎧 Listen to the episode
👉 Spotify: open.spotify.com/episode/4Dtt...

1 month ago

@amelietabatta.bsky.social, Head of Knowledge & Search at @lightonai.bsky.social, is Laurent Nicolas-Guennoc's guest on the Converteo podcast "Changement d'époque"

Against the "bigger context = better" narrative, Amélie sets the record straight.

1 month ago

🎙️ "Putting all your documents into a model's context is like inviting 30 people to a meeting where one would do: it's expensive, it's noisy, and in the end the result is less accurate!"

1 month ago

Introducing Ettin Suite: the SOTA open recipe to outperform existing Generative & Retrieval Models
Introducing Ettin, the first-ever SOTA suite of paired encoder & decoder models, developed by Johns Hopkins University in collaboration with LightOn.

Congrats to @orionweller.bsky.social @jhuclsp.bsky.social @nohtow.bsky.social for pushing the boundaries of useful AI.

🧑‍🍳 Read the open recipe here: lighton.ai/lighton-blog...

1 month ago

Size matters less than the right architecture choice.
That’s why the smallest Ettin model is already being massively adopted to build high-performance Edge AI.

It’s time to stop forcing "Decoder-only" models on every problem. For high-value tasks, specialized engineering beats generic scale.

1 month ago

Ettin was built as the first-ever SOTA suite of paired encoder-only & decoder-only models to prove a point:

🔍 Encoders for classification & retrieval
✏️ Decoders for text generation

1 month ago

The Ettin suite paper has been accepted to @iclr-conf.bsky.social.

It highlights the elephant in the room:
🏗️ Architecture matters.

2 months ago

RAG isn’t Dead, Yours is!
Static ingestion, stale answers, lost trust.

The "G" in RAG only amplifies what the "R" provides.
If your retrieval layer is static, your AI is hallucinating on facts.

Here is how LightOn approaches RAG as critical infrastructure, not just a chatbot feature.
👉 www.lighton.ai/lighton-blog...

2 months ago

When you treat RAG as a simple add-on:

📉 Relevance drops as document versions change.
🔐 Security blocks you because access control wasn't enforced at query time.
⚠️ Trust erodes because the system generates confident answers based on last week's data.
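One way to picture access control enforced at query time is to filter retrieved chunks against the caller's permissions before anything reaches the generator. The sketch below assumes a simple group-based model; `Chunk`, `retrieve_for_user`, and the sample data are hypothetical and do not describe LightOn's implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

@dataclass(frozen=True)
class Chunk:
    text: str
    allowed_groups: FrozenSet[str]  # groups permitted to read this chunk
    score: float                    # relevance score from the retriever

def retrieve_for_user(candidates: List[Chunk],
                      user_groups: FrozenSet[str],
                      top_k: int = 2) -> List[Chunk]:
    """Drop chunks the user may not read, then keep the top_k most relevant."""
    visible = [c for c in candidates if c.allowed_groups & user_groups]
    return sorted(visible, key=lambda c: c.score, reverse=True)[:top_k]

# Hypothetical retriever output for a single query.
candidates = [
    Chunk("Salary bands for 2025", frozenset({"hr"}), 0.95),
    Chunk("Q3 pricing sheet", frozenset({"sales", "finance"}), 0.80),
    Chunk("Public product FAQ", frozenset({"sales", "hr", "support"}), 0.60),
]

context = retrieve_for_user(candidates, frozenset({"sales"}))
context_texts = [c.text for c in context]
```

Here the HR-only chunk has the highest relevance score but never enters the context of a sales user, because the check runs per query rather than once at ingestion.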

2 months ago

Stop building RAG like a feature. It is infrastructure.

RAG inherits every constraint of your organization: scale, heterogeneous data, and strict governance.
