
Onur Keleş

@onurkeles.bsky.social

PhD Student of Linguistics, Research Assistant at Bogazici University | Interested in visual-gestural modality, quantitative linguistics, and natural language processing

45 Followers  |  76 Following  |  16 Posts  |  Joined: 27.02.2025

Latest posts by onurkeles.bsky.social on Bluesky

We hope this benchmark sparks deeper collaboration between sign language linguistics and multimodal AI, highlighting signed languages as a rich testbed for visual grounding and embodiment.

15.10.2025 13:45 — 👍 0    🔁 0    💬 0    📌 0

Even the best model (Gemini 2.5 Pro) identified only 17/96 signs (~18%), far below the human baseline (40/96 for hearing non-signers). Also, unlike humans, models favor static iconic objects over dynamic iconic actions, revealing a key gap between visual AI and embodied cognition. ❌

15.10.2025 13:45 — 👍 1    🔁 0    💬 1    📌 0

We evaluated 13 VLMs (3 closed-source). Larger models (GPT-5, Gemini 2.5 Pro, Qwen2.5-VL 72B) showed moderate correlation with human iconicity judgments and mirrored some human phonological difficulty patterns, e.g., handshape harder than location.
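To make "moderate correlation with human iconicity judgments" concrete, a rank correlation between model ratings and mean human ratings is the usual measure. A minimal illustrative sketch with placeholder numbers, not the paper's data or evaluation code:

    # Minimal sketch with placeholder values, not the paper's data or code:
    # rank-correlate a VLM's iconicity ratings with mean human ratings.
    from scipy.stats import spearmanr

    human_means   = [6.2, 3.1, 5.5, 2.0, 4.8]   # hypothetical 1-7 human ratings per sign
    model_ratings = [5.0, 2.5, 6.0, 3.0, 4.0]   # hypothetical model ratings for the same signs

    rho, p = spearmanr(human_means, model_ratings)
    print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")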

15.10.2025 13:45 — 👍 0    🔁 0    💬 1    📌 0

The benchmark has three complementary tasks:

1️⃣ Phonological form prediction – predicting handshape, location, etc.
2️⃣ Transparency – inferring meaning from visual form.
3️⃣ Graded iconicity – rating how much a sign looks like what it means.
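For illustration, here is a rough sketch of how predictions for these tasks could be scored against gold labels; the feature names and exact-match scoring are assumptions, not the benchmark's actual protocol:

    # Rough sketch; feature names and exact-match scoring are assumptions,
    # not the benchmark's actual protocol.
    gold = {"handshape": "flat-B", "location": "chin", "meaning": "eat"}
    pred = {"handshape": "flat-B", "location": "neutral space", "meaning": "eat"}

    # Task 1: phonological form prediction, scored per feature.
    form_scores = {k: int(gold[k] == pred[k]) for k in ("handshape", "location")}

    # Task 2: transparency, scored as whether the inferred meaning matches.
    transparency_hit = int(gold["meaning"] == pred["meaning"])

    print(form_scores, transparency_hit)
    # Task 3 (graded iconicity) is instead evaluated by correlating model
    # ratings with human ratings, as in the correlation sketch above.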

15.10.2025 13:45 — 👍 0    🔁 0    💬 1    📌 0

We introduce the Visual Iconicity Challenge, a benchmark testing whether Vision–Language Models (VLMs) can recognize iconicity, i.e., the visual resemblance between form and meaning, using signs from the Sign Language of the Netherlands (NGT).

15.10.2025 13:45 — 👍 0    🔁 0    💬 1    📌 0

I’m very happy to share our new paper!

“The Visual Iconicity Challenge: Evaluating Vision–Language Models on Sign Language Form–Meaning Mapping”, co-authored with @asliozyurek.bsky.social, Gerardo Ortega, Kadir Gökgöz, and @esamghaleb.bsky.social

arXiv: arxiv.org/abs/2510.08482

15.10.2025 13:45 — 👍 4    🔁 1    💬 1    📌 0

Can BERT help save endangered languages?

Excited to present this paper tomorrow at LM4UC @naaclmeeting.bsky.social! We explored how multilingual BERT with augmented data performs POS tagging & NER for Hamshentsnag #NAACL

🔗 Paper: aclanthology.org/2025.lm4uc-1.9/
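For readers who want the general recipe, fine-tuning multilingual BERT for token-level tasks like POS tagging or NER follows the pattern sketched below; the tag set and example sentence are placeholders, not the paper's actual Hamshentsnag setup:

    # Rough sketch of multilingual BERT for token classification (POS/NER).
    # Tag set and example are placeholders, not the paper's Hamshentsnag data.
    from transformers import AutoTokenizer, AutoModelForTokenClassification

    labels = ["NOUN", "VERB", "ADJ", "ADP", "PUNCT"]          # placeholder tag set
    tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
    model = AutoModelForTokenClassification.from_pretrained(
        "bert-base-multilingual-cased", num_labels=len(labels)
    )

    # Word-level tags must be aligned to subword tokens before training;
    # augmented data from a related language would simply be added to the
    # training set at this stage.
    words = ["a", "placeholder", "sentence"]
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    pred_ids = model(**enc).logits.argmax(-1)                  # (1, seq_len) label indices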

03.05.2025 20:35 — 👍 6    🔁 1    💬 0    📌 0

Our attention analysis revealed that the GPT-2 models often shifted attention toward semantically plausible but syntactically incorrect noun phrases in reversed orders. LLaMA-3 maintained more stable attention patterns, suggesting more syntactically driven but less human-like processing.
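Attention patterns like these can be inspected with off-the-shelf tooling; a minimal sketch (using an English GPT-2 stand-in, not the paper's analysis code) of reading out which earlier tokens the final token attends to:

    # Minimal sketch, not the paper's analysis code: extract attention weights
    # from a causal LM and inspect where the final token attends (in the
    # Turkish SOV stimuli, the verb is sentence-final). English GPT-2 stand-in.
    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    name = "gpt2"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name, output_attentions=True)

    enc = tok("the man bit the dog", return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)

    # out.attentions: one (batch, heads, seq, seq) tensor per layer.
    last_layer = out.attentions[-1][0].mean(0)        # average over heads
    for t, w in zip(tok.convert_ids_to_tokens(enc["input_ids"][0]), last_layer[-1]):
        print(f"{t:>10s}  {w:.3f}")                   # attention from the final token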

02.05.2025 12:44 — 👍 0    🔁 0    💬 0    📌 0

We then tested 3 Turkish LLMs (GPT-2-Base, GPT-2-Large, LLaMA-3) on the same stimuli, measuring surprisal and attention patterns. GPT-2-Large surprisal significantly predicted human reading times at critical regions, while LLaMA-3 surprisal did not.
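Surprisal here is the standard negative log probability of a token given its left context; a minimal sketch of computing it with a causal LM (an English GPT-2 stand-in, not the paper's exact pipeline):

    # Minimal sketch, not the paper's exact pipeline: per-token surprisal
    # -log2 P(token | left context) from a causal LM, using English GPT-2
    # as a stand-in for the Turkish checkpoints.
    import math
    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    ids = tok("the man bit the dog", return_tensors="pt")["input_ids"]
    with torch.no_grad():
        logits = model(ids).logits                              # (1, seq, vocab)

    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)       # prefix predicts next token
    target = ids[:, 1:]
    surprisal = -log_probs.gather(-1, target.unsqueeze(-1)).squeeze(-1) / math.log(2)

    for t, s in zip(tok.convert_ids_to_tokens(ids[0, 1:]), surprisal[0]):
        print(f"{t:>10s}  {s:5.2f} bits")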

02.05.2025 12:44 — 👍 0    🔁 0    💬 1    📌 0

Despite Turkish having explicit morphosyntactic features like accusative case marking and the agentive postposition "tarafından" (by), participants still made interpretation errors 25% of the time for implausible but grammatical sentences, confirming good-enough parsing effects.

02.05.2025 12:44 — 👍 0    🔁 0    💬 1    📌 0
When Men Bite Dogs: Testing Good-Enough Parsing in Turkish with Humans and Large Language Models. Onur Keleş, Nazik Dinctopal Deniz. Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, 2025.

We conducted a self-paced reading experiment with native Turkish speakers processing sentences with reversed thematic roles (e.g., "the man bit the dog" instead of "the dog bit the man"), testing whether Turkish morphosyntactic marking prevents good-enough parsing.
🔗 aclanthology.org/2025.cmcl-1....

02.05.2025 12:44 — 👍 1    🔁 0    💬 1    📌 0

Are your LLMs good-enough? 🤔

Our new paper w/ @nazik.bsky.social at #CMCL2025
@naaclmeeting.bsky.social shows that both humans & smaller LLMs do good-enough parsing in Turkish role-reversal contexts. GPT-2 better predicts human RTs; LLaMA-3 relies less on heuristic parses but lacks predictive power.

02.05.2025 12:44 — 👍 2    🔁 0    💬 1    📌 0

Thanks!

27.03.2025 12:52 — 👍 0    🔁 0    💬 0    📌 0

I’ll be at #HSP2025 at the University of Maryland @UofMaryland, College Park, to present my MA thesis work on using pose estimation to detect phonetic reduction in Turkish Sign Language from March 27 to 29. Would love to meet and chat if you are in the area!

26.03.2025 13:11 — 👍 2    🔁 0    💬 0    📌 1

As referents become more accessible, signs undergo phonetic and kinematic reduction (shorter duration, smaller hand movement, & narrower signing space). Native deaf signers also retell events faster than late deaf signers.

I’ll present it at HSP 2025 on March 27.

Repo: github.com/kelesonur/MA...
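The kinematic measures can be approximated directly from MediaPipe hand landmarks; a rough sketch (not the thesis code; the video path is a placeholder) of one such measure, cumulative wrist displacement as a proxy for movement size:

    # Rough sketch, not the thesis code: track the wrist landmark with
    # MediaPipe Hands and sum per-frame displacement as a crude proxy for
    # movement size. Video path is a placeholder.
    import cv2
    import mediapipe as mp

    hands = mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=2)
    cap = cv2.VideoCapture("sign_clip.mp4")

    prev, path_length = None, 0.0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        res = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if res.multi_hand_landmarks:
            wrist = res.multi_hand_landmarks[0].landmark[0]   # landmark 0 = wrist
            if prev is not None:
                path_length += ((wrist.x - prev.x) ** 2 + (wrist.y - prev.y) ** 2) ** 0.5
            prev = wrist

    cap.release()
    print(f"normalized wrist path length: {path_length:.3f}")  # coordinates are image-normalized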

11.03.2025 05:02 — 👍 1    🔁 0    💬 0    📌 0
Thesis cover page

Thesis abstract:

Discourse Cohesion and Phonetics in Turkish Sign Language (TİD): An Experimental and Computational Approach

Theories of linguistic efficiency, such as Zipf's (1949) law, claim that languages reduce effort by favoring simpler or more economical forms whenever possible. Although this claim has widely been tested and confirmed in spoken languages (e.g., Givón, 1983; Gundel, Hedberg, & Zacharski, 1993), it has not been addressed as widely in sign languages. This thesis investigates how such theories would be realized in Turkish Sign Language (TİD) through discourse cohesion and phonetic reduction in TİD narratives. With a story-retelling production experiment, discourse cohesion has been analyzed using a quantized measure of accessibility adapted from Ariel's (1990) framework. In particular, I examine the referring expression forms (e.g., nominal versus verbal) that native and late-signing deaf adult TİD signers employ in narratives. The articulatory phonetic aspect of the narratives has been analyzed using MediaPipe, an open-source computer vision tool. The results of the discourse cohesion experiment display similarities with previous research on spoken language and other sign languages. A strong relationship was found between the cognitive accessibility scores of referents and the discourse context (e.g., first mention, maintenance, re-introduction), the type of the referring expression, and age of acquisition in TİD. The results of the computational phonetic analysis showed that the forms TİD signers used underwent phonetic reduction as the cognitive accessibility of a referent increased: they had shorter duration, smaller hand movement, and narrower signing space. Age of acquisition or delayed first acquisition did not significantly affect these measures except for duration, in which native signers retold the events faster than late signers.


What’s a better first Bluesky post than introducing my recently defended MA thesis?

I examined signed narratives with a production experiment and computer vision to answer a simple question:

“Do signers display language economy as evidenced by phonetic reduction and referring expression choice?”

11.03.2025 05:02 — 👍 3    🔁 0    💬 1    📌 0
