Out in Evolutionary Human Sciences! With @mikekestemont.bsky.social, @jbcamps.bsky.social, @remcosleiderink.bsky.social & Anne Chao
New work applying unseen species models for cultural heritage to the question: how many stories were _shared_ between medieval French and Dutch literature?
lnkd.in/exyAWtir
28.02.2026 11:02 —
👍 14
🔁 8
💬 0
📌 0
Frontiers | Computational hermeneutics: evaluating generative AI as a cultural technology
Generative AI (GenAI) systems are increasingly recognized as cultural technologies, yet current evaluation frameworks often treat culture as a variable to be...
I'm on a 38(!)-author paper just published in Frontiers in Artificial Intelligence, "Computational hermeneutics: evaluating generative AI as a cultural technology". We splice Schleiermacher and hermeneutic theory into AI debates, arguing that AI systems are "context machines".
www.frontiersin.org/journals/art...
26.02.2026 07:53 —
👍 53
🔁 20
💬 5
📌 2
New multi-institutional project to use AI to represent past historical periods
A new project led by a team of researchers from four universities aims to create and evaluate language models that represent past historical periods. The project, "Artificial Intelligence for Cultural...
From time to time I mutter about a secret project that involves benchmarks and historical language models. Here's a formal announcement of the Schmidt Sciences grant. Other PIs include @dmimno.bsky.social , @lauraknelson.bsky.social, @andrewpiper.bsky.social, and @mattwilkens.bsky.social. And +
25.02.2026 19:33 —
👍 109
🔁 21
💬 15
📌 2
Memorization vs. generalization in deep learning: implicit biases, benign overfitting, and more
Or: how I learned to stop worrying and love the memorization
What is the relationship between memorization and generalization in AI? Is there a fundamental tradeoff? In infinitefaculty.substack.com/p/memorizati... I’ve reviewed some of the evolving perspectives on memorization & generalization in machine learning, from classic perspectives through LLMs.
18.02.2026 15:54 —
👍 132
🔁 27
💬 4
📌 5
This flips the similarity question on its head.
It's no longer: "Are these two characters similar?"
But rather: "Along which dimensions did the scholars who designed this benchmark implicitly define similarity between these two characters?"
20.02.2026 13:35 —
👍 1
🔁 0
💬 1
📌 0
Box plot showing accuracy scores for all combinations of 1 to 17 ontological character attribute classes. Three trend lines track maximum (green), mean (red), and minimum (orange) accuracy. The key finding: maximum accuracy peaks at ~0.96 with just 4–5 carefully selected classes, while using all 17 classes simultaneously yields no improvement over the mean (~0.64). A small, well-chosen subset of ontological dimensions outperforms the full representation — more is not better.
We then built on @dbamman.bsky.social et al.'s (2014) triplets protocol and introduced CharaSim-fr, a brand new benchmark for character similarity.
Key method: we exhaustively compute all possible mixes of concatenated ontological classes (131,054 combinations!) to find the optimal similarity signal
20.02.2026 13:35 —
👍 2
🔁 0
💬 1
📌 0
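A quick sanity check on the 131,054 figure quoted in the post above: it matches the number of subsets of 17 classes that contain at least two classes. That reading is an assumption, since the post does not spell out the counting rule. The minimal sketch below uses hypothetical placeholder class names (`class_1` to `class_17`) and a hypothetical helper `class_subsets`.

```python
from itertools import combinations
from math import comb

N_CLASSES = 17  # number of ontological classes stated in the thread

# Hypothetical placeholder names; the thread lists only some of the 17 classes.
classes = [f"class_{i}" for i in range(1, N_CLASSES + 1)]


def class_subsets(items, min_size=2):
    """Yield every subset of `items` with at least `min_size` members."""
    for size in range(min_size, len(items) + 1):
        yield from combinations(items, size)


# Closed form: all 2**17 subsets minus the empty set and the 17 singletons.
closed_form = 2**N_CLASSES - 1 - N_CLASSES

# Same count as a sum of binomial coefficients C(17, k) for k = 2..17.
by_sum = sum(comb(N_CLASSES, k) for k in range(2, N_CLASSES + 1))

# Same count again by enumerating the subsets explicitly.
enumerated = sum(1 for _ in class_subsets(classes))

print(closed_form, by_sum, enumerated)  # 131054 131054 131054
```

Passing `min_size=1` to `class_subsets` would also count single classes and yield 131,071 subsets.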
So we built an ontology of characterization w/ 17 classes grounded in narratological theory: actions, emotions, personality traits, relations, cognition, objects, body parts, ++
The goal: stop treating characters as bags of features and start asking along which dimensions they resemble each other.
20.02.2026 13:35 —
👍 2
🔁 0
💬 1
📌 0
≠ Rosanette: less refined, more eroticized, defined largely through dialogue
≠ Mme Arnoux: the central figure — the other two are satellites who exist mainly by contrast with her
≠ Mme Dambreuse: far more effaced, valued by Frédéric mainly for the social access she provides — little interiority
20.02.2026 13:35 —
👍 1
🔁 0
💬 1
📌 0
The story starts with an argument among co-authors about three women in Flaubert's L'Éducation sentimentale: Mme Arnoux, Rosanette, Mme Dambreuse.
The question: who is the most dissimilar in this triplet?
Three defensible answers
That is exactly where the tears began
20.02.2026 13:35 —
👍 2
🔁 0
💬 1
📌 0
Sweat, because we tried to grasp what it actually means for two characters to be "similar" measured across 100,000+ characters.
The standard approach? Character embeddings, cosine similarity, done.
We were so naive. So wrong.
20.02.2026 13:35 —
👍 1
🔁 0
💬 1
📌 0
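For readers unfamiliar with the "embeddings plus cosine similarity" baseline mentioned in the post above, here is a minimal sketch of how it could answer an odd-one-out question on a triplet. The vectors are toy values and the function names are hypothetical; this is not the authors' pipeline, only one plausible reading of the baseline.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def odd_one_out(embeddings):
    """Given exactly three named vectors, return the name left out of the most similar pair."""
    names = list(embeddings)
    closest_pair = max(
        combinations(names, 2),
        key=lambda pair: cosine(embeddings[pair[0]], embeddings[pair[1]]),
    )
    return next(name for name in names if name not in closest_pair)


# Toy vectors (assumption): A and B point almost the same way, C is orthogonal to both.
toy = {
    "A": [1.0, 0.0, 0.0],
    "B": [0.9, 0.1, 0.0],
    "C": [0.0, 0.0, 1.0],
}

print(odd_one_out(toy))  # C
```

A single similarity score like this always returns exactly one answer, so it cannot express that a triplet may have several defensible answers depending on the dimension compared.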
Toward an Ontological Representation of Fictional Characters | Computational Humanities Research | Cambridge Core
New article! "Toward an Ontological Representation of Fictional Characters" by @antoine-bourgois.bsky.social, me, @oseminck.bsky.social & @tpoibeau.bsky.social
doi.org/10.1017/chr....
Nothing fancy here — only sweat & tears. 🧵
20.02.2026 13:35 —
👍 21
🔁 7
💬 1
📌 0
📢 Apply to the master's programs in #humanitésnumériques at the École des chartes - @psl-univ.bsky.social.
The École offers two master's programs that train students to process sources (objects, texts, images) with digital technologies.
Applications: Feb. 17 to Mar. 16.
▶️ Learn more: https://urls.fr/tr9XN4
11.02.2026 10:15 —
👍 4
🔁 2
💬 0
📌 0
Oh yes, that's fast!
10.02.2026 12:57 —
👍 1
🔁 0
💬 0
📌 0
This call is currently open for a Humanistica-satellite event, that might interest people in computational humanities (and not only). It is supported by CultureLab and welcomes long papers as well as lightning talks and posters.
20.01.2026 16:09 —
👍 4
🔁 7
💬 0
📌 0
ICDAR 2026 Competition on Multilingual Medieval Handwriting Recognition
The ICDAR 2026 Competition on Multilingual Medieval Handwriting Recognition (CMMHWR26) seeks to evaluate the state of the art in...
Do you like Automatic Text Recognition (ATR/OCR/HTR)?
Do you like challenges?
Well, we are opening a competition for ATR/OCR of medieval manuscripts, in cross-lingual & diachronic settings for the Latin script.
📆 20/01 Registration
📆 21/03 Test set released
📆 3/04 Deadline for results
Link cmmhwr26.inria.fr
20.01.2026 09:40 —
👍 25
🔁 11
💬 0
📌 3
A quick fine-tune of a BERT on your categories? But you'll need more examples.
Otherwise, the frugal option is to propagate your 1,000 examples with an SVM on your embeddings (six months ago I was still using the multilingual bge-m3 models, and they were great).
02.01.2026 16:25 —
👍 0
🔁 0
💬 0
📌 0
Amazing papers at #CHR2025; particularly enjoying the computational literary studies. An observation: questions about genre as a confounding factor seem to keep coming up. I do wonder if (and I'm also guilty of this) CLS can fixate on the x-axis of history and we ought to give genre more attention.
12.12.2025 11:24 —
👍 19
🔁 2
💬 2
📌 0
@jbarre.bsky.social, @oseminck.bsky.social, Antoine Bourgois, and @tpoibeau.bsky.social built a detective detector, tracing the different archetypes in French detective fiction #CHR2025
12.12.2025 12:20 —
👍 21
🔁 5
💬 0
📌 1
Thx @oseminck.bsky.social for the pic!
12.12.2025 10:27 —
👍 0
🔁 0
💬 0
📌 0
Yuri and I standing by a literal cannon while we talk about canonicity in the Luxembourg city casemates
Luxembourg is a good place to talk about Canonicity with Yuri #chr2025
12.12.2025 09:44 —
👍 14
🔁 0
💬 1
📌 0
Christmas comes earlier every year!
10.12.2025 10:34 —
👍 1
🔁 0
💬 0
📌 0
Event Detection between Literary Studies and NLP. A Survey, a Narratological Reflection, and a Case Study
Narrative structure in fiction relies on the strategic presentation of events, where the ordering and disclosure of information (syuzhet) shape reader engagement and tension. This study outlines a com...
New article in #JCLS 4(1)! 🎉
Visser Solissa, van Cranenburgh & @fpianz.bsky.social present a model for detecting syuzhet—the ordering and disclosure of events that shape a narrative—and formalize event annotation in fiction across multiple languages.
#CCLS25 #ComputationalNarratology
02.12.2025 19:13 —
👍 8
🔁 4
💬 1
📌 0
Antoine Mazière will be at CHR, by the way! He has a poster presentation about the corpus.
25.11.2025 16:07 —
👍 2
🔁 0
💬 0
📌 0
The task ahead remains enormous, of course, but bravo to the authors: harmonizing the corpus metadata at this scale must have been an absolute nightmare.
21.11.2025 15:11 —
👍 3
🔁 0
💬 0
📌 0
🚨 huge dataset for CLS -> 650K "contemporary" & multilingual books
tagging @tpoibeau.bsky.social + we need to bring Antoine Mazière around here
21.11.2025 15:10 —
👍 6
🔁 0
💬 2
📌 0
Edited by Taylor Arnold, Margherita Fantoli, and Ruben Ros
📢 The #CHR2025 proceedings are out!
97 papers, ~1600 pages of computational humanities 🔥 Now published via the new Anthology of Computers and the Humanities, with DOIs for every paper.
🔗 anthology.ach.org/volumes/vol0...
And don’t forget: registration closes tomorrow (20 Nov)!
19.11.2025 12:53 —
👍 56
🔁 41
💬 0
📌 2