CLAUSE - Computational Linguistics @ Bielefeld University

@clausebielefeld.bsky.social

CompLing group (CLAUSE) at Bielefeld U (PI: Sina Zarrieß). We work on: NLG, Language & Vision, Pragmatics & Dialogue, HateSpeech, BabyLMs, DH, and more! clause-bielefeld.github.io

397 Followers  |  366 Following  |  36 Posts  |  Joined: 13.11.2024  |  2.448

Latest posts by clausebielefeld.bsky.social on Bluesky

Dialogue Is Not Enough to Make a Communicative BabyLM
(But Neither Is Developmentally Inspired Reinforcement Learning)
Francesca Padovani1∗ Bastian Bunzeck2∗ Manar Ali2 Omar Momen2
Arianna Bisazza1 Hendrik Buschmeier2 Sina Zarrieß2
1Center for Language and Cognition (CLCG), University of Groningen
2CRC 1646 – Linguistic Creativity in Communication, Bielefeld University
f.padovani@rug.nl bastian.bunzeck@uni-bielefeld.de

As part of this year's BabyLM challenge, we (researchers from @gronlp.bsky.social and @clausebielefeld.bsky.social) diverged from the established pretraining paradigm by training only on dialogue data from CHILDES.

28.10.2025 12:53 — 👍 16    🔁 3    💬 1    📌 0

Preprint alert! We release BabyBabelLM, a multilingual benchmark of developmentally plausible training data. I was responsible for German and Polish data as well as various child-directed wikis. Immensely rewarding project with exceptionally cool co-authors. 🥳🚀

14.10.2025 17:19 — 👍 11    🔁 3    💬 0    📌 1

𝐃𝐨 𝐲𝐨𝐮 𝐫𝐞𝐚𝐥𝐥𝐲 𝐰𝐚𝐧𝐭 𝐭𝐨 𝐬𝐞𝐞 𝐰𝐡𝐚𝐭 𝐦𝐮𝐥𝐭𝐢𝐥𝐢𝐧𝐠𝐮𝐚𝐥 𝐞𝐟𝐟𝐨𝐫𝐭 𝐥𝐨𝐨𝐤𝐬 𝐥𝐢𝐤𝐞? 🇨🇳🇮🇩🇸🇪

Here’s the proof! 𝐁𝐚𝐛𝐲𝐁𝐚𝐛𝐞𝐥𝐋𝐌 is the first Multilingual Benchmark of Developmentally Plausible Training Data available for 45 languages to the NLP community 🎉

arxiv.org/abs/2510.10159

14.10.2025 17:01 — 👍 40    🔁 16    💬 2    📌 1

Happening in an hour! 🥳

23.09.2025 13:36 — 👍 1    🔁 0    💬 0    📌 0

If you are at #IWCS, then you should not miss Sanne’s talk "Not Just Who or What: Modeling the Interaction of Linguistic and Annotator Variation in Hateful Word Interpretation" (Sanne Hoeken, Özge Alacam, Dong Nguyen, Massimo Poesio, Sina Zarrieß), tomorrow at 16:30! 🕟
@sannehoeken.bsky.social

22.09.2025 10:15 — 👍 4    🔁 1    💬 0    📌 1
Sina in front of a slide with different size circles

Sina Zarrieß is giving the KONVENS keynote on training BabyLMs #nlproc
The slide shows the number of words a 12yo human has seen in their lifetime compared to the numbers of words typical language models have seen in training #llm

11.09.2025 11:43 — 👍 6    🔁 3    💬 0    📌 0

Happening now: Sina’s keynote on our BabyLM work. 🥳

11.09.2025 11:34 — 👍 5    🔁 0    💬 0    📌 1

Great first day at #KONVENS2025 today. Looking forward to another engaging day with a keynote by Sina Zarrieß tomorrow 🤓
@clausebielefeld.bsky.social

10.09.2025 20:36 — 👍 2    🔁 1    💬 1    📌 0

Don’t miss Sina’s keynote on BabyLMs at #konvens tomorrow!

10.09.2025 11:09 — 👍 3    🔁 0    💬 0    📌 0

Final keynote of #semdial by David Schlangen on "Meaningful Interaction with Unreal Speakers?" 😇💬

05.09.2025 09:32 — 👍 2    🔁 0    💬 1    📌 0

Final day at #semdial2025 #bialogue: four more presentations, one keynote, and hopefully many engaging discussions. Let's go!

05.09.2025 06:11 — 👍 0    🔁 1    💬 0    📌 0

Second #semdial keynote by Robert Hawkins on "Foraging for common ground"

04.09.2025 14:03 — 👍 3    🔁 0    💬 0    📌 0

Day 2 of #semdial starts with a session on LMs and dialogue systems 🤩

04.09.2025 06:40 — 👍 3    🔁 0    💬 0    📌 0

Actually yes! Dialogue differs distinctly from monologues in terms of phonetic features and in the production of novel phonetic forms!

03.09.2025 09:41 — 👍 2    🔁 0    💬 0    📌 0

Leonie Schade asks whether it takes two to do an articulatory tango 😁

03.09.2025 09:24 — 👍 6    🔁 1    💬 1    📌 0

And the second talk features contributions by our PI Sina Zarrieß. 🤩

03.09.2025 08:35 — 👍 6    🔁 0    💬 1    📌 0

#semdial has begun 💬

03.09.2025 07:33 — 👍 1    🔁 0    💬 0    📌 0

#semdial is about to begin 🥳

03.09.2025 07:01 — 👍 2    🔁 2    💬 1    📌 0

Program: semdial2025.github.io/program/
Proceedings: purl.org/semdial/2025...

02.09.2025 20:11 — 👍 0    🔁 0    💬 0    📌 0

#semdial2025, the long-awaited #bialogue conference, starts tomorrow! We are looking forward to three wonderful conference days, featuring three exciting keynotes and many oral and poster presentations on the semantics and pragmatics of dialogue. 👄💬
Check out the program and proceedings below. 👇

02.09.2025 20:10 — 👍 3    🔁 0    💬 1    📌 1

Let’s go!

01.08.2025 10:00 — 👍 3    🔁 0    💬 0    📌 0

Is simpler child-directed language easier to learn?

Check out our CoNLL paper "Do Construction Distributions Shape Formal Language Learning in German BabyLMs?"

@conll-conf.bsky.social

01.08.2025 09:24 — 👍 2    🔁 2    💬 1    📌 0
Components of Creativity: Language Model-based Predictors for Clustering and Switching in Verbal Fluency Sina Zarrieß, Simeon Junker, Judith Sieker, Özge Alacam. Proceedings of the 29th Conference on Computational Natural Language Learning. 2025.

Find the paper here: aclanthology.org/2025.conll-1...

01.08.2025 09:14 — 👍 3    🔁 0    💬 1    📌 0

Our PI Sina will give an oral presentation on "Components of Creativity: Language Model-based Predictors for Clustering and Switching in Verbal Fluency" at @conll-conf.bsky.social in 45 minutes. Come check it out if you are at @aclmeeting.bsky.social #ACL2025NLP

01.08.2025 09:13 — 👍 4    🔁 1    💬 1    📌 0

Impromptu dinner after @conll-conf.bsky.social #ACL2025NLP, connecting Bielefeld and the Netherlands over Greek food 😇👌

31.07.2025 17:17 — 👍 6    🔁 0    💬 0    📌 0

Happening now: catch Simeon, Manar and Larissa presenting their paper "Are Multimodal Large Language Models Pragmatically Competent Listeners in Simple Reference Resolution Tasks?" in hall X5. #ACL2025NLP

28.07.2025 16:00 — 👍 3    🔁 1    💬 0    📌 0

It’s actually 41. 🙃

28.07.2025 08:46 — 👍 3    🔁 0    💬 0    📌 0

Happening now — Clara, Judith and Sina present their poster:
Can LLMs Ground when they (Don’t) Know: A Study on Direct and Loaded Political Questions
(Poster board 45) #ACL2025NLP

28.07.2025 08:45 — 👍 3    🔁 0    💬 1    📌 0
Overview of CLAUSE papers at ACL

The CLAUSE group from Bielefeld University is looking forward to next month’s ACL in Vienna, where we will be presenting quite a few papers. 🥳
Feel free to get in touch if you want to know more. 😇

26.06.2025 07:55 — 👍 9    🔁 6    💬 0    📌 0

@clausebielefeld is following 20 prominent accounts