Jean Barré

@jbarre.bsky.social

PhD student in Computational Literary Studies at LaTTiCe lab & École normale supérieure in Paris. I study French genre fiction (1860–1945) using quantitative and network-based approaches. https://jeanbarre.eu/

459 Followers  |  502 Following  |  89 Posts  |  Joined: 18.10.2023

Posts by Jean Barré (@jbarre.bsky.social)

Out in Evolutionary Human Sciences! With @mikekestemont.bsky.social, @jbcamps.bsky.social, @remcosleiderink.bsky.social & Anne Chao

New work applying unseen species models for cultural heritage to the question: how many stories were _shared_ between medieval French and Dutch literature?

lnkd.in/exyAWtir

28.02.2026 11:02 — 👍 14    🔁 8    💬 0    📌 0
Frontiers | Computational hermeneutics: evaluating generative AI as a cultural technology Generative AI (GenAI) systems are increasingly recognized as cultural technologies, yet current evaluation frameworks often treat culture as a variable to be...

I'm on a 38(!)-author paper just published in Frontiers in Artificial Intelligence, "Computational hermeneutics: evaluating generative AI as a cultural technology". We splice Schleiermacher and hermeneutic theory into AI debates, arguing that AI systems are "context machines".
www.frontiersin.org/journals/art...

26.02.2026 07:53 — 👍 53    🔁 20    💬 5    📌 2
New multi-institutional project to use AI to represent past historical periods A new project led by a team of researchers from four universities aims to create and evaluate language models that represent past historical periods. The project, "Artificial Intelligence for Cultural...

From time to time I mutter about a secret project that involves benchmarks and historical language models. Here's a formal announcement of the Schmidt Sciences grant. Other PIs include @dmimno.bsky.social , @lauraknelson.bsky.social, @andrewpiper.bsky.social, and @mattwilkens.bsky.social. And +

25.02.2026 19:33 — 👍 109    🔁 21    💬 15    📌 2
Memorization vs. generalization in deep learning: implicit biases, benign overfitting, and more Or: how I learned to stop worrying and love the memorization

What is the relationship between memorization and generalization in AI? Is there a fundamental tradeoff? In infinitefaculty.substack.com/p/memorizati... I’ve reviewed some of the evolving perspectives on memorization & generalization in machine learning, from classic perspectives through LLMs.

18.02.2026 15:54 — 👍 132    🔁 27    💬 4    📌 5
GitHub - lattice-8094/fictional-character-ontology: Repository for the data & code of the paper "Toward an Ontological Representation of Fictional Characters"

A bit ironically, we end up learning as much about the researchers (and their implicit representations) as about the characters themselves.

Kudos to @antoine-bourgois.bsky.social , first author & 1st-year PhD student 👏

📂 Data + code → github.com/lattice-8094/fictional-character-ontology

20.02.2026 13:35 — 👍 4    🔁 0    💬 0    📌 0

This flips the similarity question on its head.

It's no longer: "Are these two characters similar?"

But rather: "Along which dimensions did the scholars who designed this benchmark implicitly define similarity between these two characters?"

20.02.2026 13:35 — 👍 1    🔁 0    💬 1    📌 0
Box plot showing accuracy scores for all combinations of 1 to 17 ontological character attribute classes. Three trend lines track maximum (green), mean (red), and minimum (orange) accuracy. The key finding: maximum accuracy peaks at ~0.96 with just 4–5 carefully selected classes, while using all 17 classes simultaneously yields no improvement over the mean (~0.64). A small, well-chosen subset of ontological dimensions outperforms the full representation — more is not better.

We then built on @dbamman.bsky.social et al.'s (2014) triplets protocol and introduced CharaSim-fr, a brand new benchmark for character similarity

Key method: we exhaustively compute all possible mixes of concatenated ontological classes (131,054 combinations!) to find the optimal similarity signal
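
A minimal sketch of what such an exhaustive search could look like, not the authors' actual code. It assumes that a "mix" means any subset of at least two of the 17 classes, a reading that reproduces the quoted count (2^17 - 1 - 17 = 131,054). The names `count_mixes`, `iter_mixes`, `best_mix` and the `score` callback are hypothetical and stand in for whatever benchmark accuracy is computed on each concatenated representation.

```python
from itertools import combinations
from math import comb

N_CLASSES = 17  # number of ontological classes mentioned in the thread


def count_mixes(n, min_size=2):
    """Count the subsets of n classes that contain at least min_size classes."""
    return sum(comb(n, k) for k in range(min_size, n + 1))


def iter_mixes(classes, min_size=2):
    """Yield every subset of classes with at least min_size members, as tuples."""
    for k in range(min_size, len(classes) + 1):
        yield from combinations(classes, k)


def best_mix(classes, score, min_size=2):
    """Return the subset with the highest score (score is a hypothetical callback)."""
    return max(iter_mixes(classes, min_size), key=score)


print(count_mixes(N_CLASSES))  # 131054
```

With a real `score` function (for instance, triplet accuracy obtained from the concatenated class vectors), `best_mix` would return the combination of classes that gives the strongest similarity signal. Brute force stays tractable here because 17 classes produce only about 131,000 subsets.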

20.02.2026 13:35 — 👍 2    🔁 0    💬 1    📌 0

So we built an ontology of characterization w/ 17 classes grounded in narratological theory: actions, emotions, personality traits, relations, cognition, objects, body parts, ++

The goal: stop treating characters as bags of features and start asking along which dimensions they resemble each other.

20.02.2026 13:35 — 👍 2    🔁 0    💬 1    📌 0

≠ Rosanette: less refined, more eroticized, defined largely through dialogue
≠ Mme Arnoux: the central figure — the other two are satellites who exist mainly by contrast with her
≠ Mme Dambreuse: far more effaced, valued by Frédéric mainly for the social access she provides — little interiority

20.02.2026 13:35 — 👍 1    🔁 0    💬 1    📌 0

The story starts with an argument among co-authors about three women in Flaubert's L'Éducation sentimentale: Mme Arnoux, Rosanette, Mme Dambreuse.

The question: who is the most dissimilar in this triplet?

Three defensible answers
That is exactly where the tears began

20.02.2026 13:35 — 👍 2    🔁 0    💬 1    📌 0

Sweat, because we tried to grasp what it actually means for two characters to be "similar" measured across 100,000+ characters.
The standard approach? Character embeddings, cosine similarity, done.
We were so naive. So wrong.

20.02.2026 13:35 — 👍 1    🔁 0    💬 1    📌 0
Toward an Ontological Representation of Fictional Characters | Computational Humanities Research | Cambridge Core

New article! "Toward an Ontological Representation of Fictional Characters" by @antoine-bourgois.bsky.social, me, @oseminck.bsky.social & @tpoibeau.bsky.social

doi.org/10.1017/chr....

Nothing fancy here — only sweat & tears. 🧵

20.02.2026 13:35 — 👍 21    🔁 7    💬 1    📌 0
📢 Apply to the master's programs in #humanitésnumériques (digital humanities) at the École des chartes - @psl-univ.bsky.social.
The École offers two master's programs that train students to process sources (objects, texts, images) with digital technologies.
Applications: 17 Feb.-16 Mar.
▶️ Learn more: https://urls.fr/tr9XN4

11.02.2026 10:15 — 👍 4    🔁 2    💬 0    📌 0

Oh yes, that was fast!

10.02.2026 12:57 — 👍 1    🔁 0    💬 0    📌 0

This call is currently open for a Humanistica satellite event that might interest people in computational humanities (and not only). It is supported by CultureLab and welcomes long papers as well as lightning talks and posters.

20.01.2026 16:09 — 👍 4    🔁 7    💬 0    📌 0
ICDAR 2026 Competition on Multilingual Medieval Handwriting Recognition The ICDAR 2026 Competition on Multilingual Medieval Handwriting Recognition (CMMHWR26) seeks to evaluate the state of the art in...

You like Automatic Text Recognition (ATR/OCR/HTR)?
You like challenges?
Well, we are opening a competition for ATR/OCR of medieval manuscripts, in cross-lingual & diachronic settings for the Latin script.

📆 20/01 Registration
📆 21/03 Test set released
📆 3/04 Deadline for results

Link: cmmhwr26.inria.fr

20.01.2026 09:40 — 👍 25    🔁 11    💬 0    📌 3

How about a little fine-tuning of a BERT on your categories? You will need more examples, though.
Otherwise the frugal option is propagating your 1,000 examples with an SVM on your embeddings (6 months ago I was still on the bge-m3 models for multilingual work, and they were great)

02.01.2026 16:25 — 👍 0    🔁 0    💬 0    📌 0
Understanding Conversational AI | Ubiquity Press What do large language models really know, and what does it mean to live alongside them? This book offers a critical and interdisciplinary exploration of large language models (L...

New Publication: "Understanding Conversational AI: Philosophy, Ethics, and Social Impact of Large Language Models" (270 pages, Ubiquity Press, open access). Feel free to read it and share it widely! www.ubiquitypress.com/books/m/10.5...

17.12.2025 07:06 — 👍 16    🔁 7    💬 1    📌 1
Technologies of the Novel Cambridge Core - English Literature 1700-1830 - Technologies of the Novel

A distant reading (not computational). Cool read!
www.cambridge.org/core/books/t...

15.12.2025 20:44 — 👍 2    🔁 0    💬 0    📌 0

Amazing papers at #CHR2025; particularly enjoying the computational literary studies. An observation: questions about genre as a confounding factor seem to keep coming up. I do wonder if (and I'm also guilty of this) CLS can fixate on the x-axis of history and we ought to give genre more attention.

12.12.2025 11:24 — 👍 19    🔁 2    💬 2    📌 0
@jbarre.bsky.social, @oseminck.bsky.social, Antoine Bourgois, and @tpoibeau.bsky.social built a detective detector, tracing the different archetypes in French detective fiction #CHR2025

12.12.2025 12:20 — 👍 21    🔁 5    💬 0    📌 1

Thx @oseminck.bsky.social for the pic!

12.12.2025 10:27 — 👍 0    🔁 0    💬 0    📌 0
Yuri and I standing by a literal cannon while we talk about canonicity in the Luxembourg city casemates

Luxembourg is a good place to talk about Canonicity with Yuri #chr2025

12.12.2025 09:44 — 👍 14    🔁 0    💬 1    📌 0

Christmas comes earlier every year!

10.12.2025 10:34 — 👍 1    🔁 0    💬 0    📌 0
025 - A Perfect Job is the New Very Good Job A little disclaimer for once, because I usually prefer to praise if I name people. I do not know Dan Cohen nor his work, my criticism of his article is not directed against him personally, but rather

I added a new post on my research blog last week. I wanted to react to a post from Dan Cohen about Gemini 3 that I saw circulating on Bluesky, and figured I would add my critical 2 cents to the mix!

alix-tz.github.io/phd/posts/025/

03.12.2025 00:04 — 👍 19    🔁 5    💬 1    📌 3
Event Detection between Literary Studies and NLP. A Survey, a Narratological Reflection, and a Case Study Narrative structure in fiction relies on the strategic presentation of events, where the ordering and disclosure of information (syuzhet) shape reader engagement and tension. This study outlines a com...

New article in #JCLS 4(1)! 🎉
Visser Solissa, van Cranenburgh & @fpianz.bsky.social present a model for detecting syuzhet—the ordering and disclosure of events that shape a narrative—and formalize event annotation in fiction across multiple languages.
#CCLS25 #ComputationalNarratology

02.12.2025 19:13 — 👍 8    🔁 4    💬 1    📌 0

Antoine Mazière will be at CHR, btw! He has a poster presentation about the corpus

25.11.2025 16:07 — 👍 2    🔁 0    💬 0    📌 0

The task ahead remains enormous, of course, but bravo to the authors: harmonizing the corpus metadata at this scale must have been an absolute nightmare.

21.11.2025 15:11 — 👍 3    🔁 0    💬 0    📌 0

🚨 huge dataset for CLS -> 650K "contemporary" & multilingual books

tagging @tpoibeau.bsky.social + we need to bring Antoine Mazière around here

21.11.2025 15:10 — 👍 6    🔁 0    💬 2    📌 0
Edited by Taylor Arnold, Margherita Fantoli, and Ruben Ros

📢 The #CHR2025 proceedings are out!

97 papers, ~1600 pages of computational humanities 🔥 Now published via the new Anthology of Computers and the Humanities, with DOIs for every paper.

🔗 anthology.ach.org/volumes/vol0...

And don’t forget: registration closes tomorrow (20 Nov)!

19.11.2025 12:53 — 👍 56    🔁 41    💬 0    📌 2