Ivan Kartáč

@ivankartac.bsky.social

PhD student @ Charles University. Researching evaluation and explainability of reasoning in language models.

65 Followers  |  223 Following  |  3 Posts  |  Joined: 30.03.2025

Latest posts by ivankartac.bsky.social on Bluesky

OpeNLGauge comes in two variants: a prompt-based ensemble and a smaller fine-tuned model, both built exclusively on open-weight LLMs (including training data!).

Thanks @tuetschek.bsky.social and @mlango.bsky.social!

23.08.2025 16:39 — 👍 1    🔁 0    💬 0    📌 0

We introduce an explainable metric for evaluating a wide range of natural language generation tasks, without any need for reference texts. Given an evaluation criterion, the metric provides fine-grained assessments of the output by highlighting and explaining problematic spans in the text.

23.08.2025 16:37 — 👍 0    🔁 0    💬 1    📌 0

Our paper "OpeNLGauge: An Explainable Metric for NLG Evaluation with Open-Weights LLMs" has been accepted to the #INLG2025 conference!

You can read the preprint here: arxiv.org/abs/2503.11858

23.08.2025 16:36 — 👍 4    🔁 2    💬 1    📌 0

#ACL2025NLP in Vienna 🇦🇹 starts today with 23 🤯 @ufal-cuni.bsky.social folks presenting their work both at the main conference and workshops. Check out our main conference papers today and on Wednesday 👇

28.07.2025 07:27 — 👍 22    🔁 8    💬 1    📌 1
Preview: Ondřej Dušek — "Evaluating LLM outputs with humans and LLMs" (MLPrague, 30 April 2025). Slides: https://bit.ly/mlprague25-od

Slides and links to papers at bit.ly/mlprague25-od 🤓

02.05.2025 19:25 — 👍 2    🔁 2    💬 0    📌 0

Today, @tuetschek.bsky.social presented his team's work on evaluating LLM text generation with both human annotation frameworks and LLM-based metrics. Their approach tackles the benchmark data leakage problem and shows how to obtain unseen data for unbiased LLM testing.

30.04.2025 12:02 — 👍 8    🔁 3    💬 1    📌 0
Preview: Large Language Models as Span Annotators (paper website)

How do LLMs compare to human crowdworkers in annotating text spans? 🧑🤖

And how can span annotation help us with evaluating texts?

Find out in our new paper: llm-span-annotators.github.io

Arxiv: arxiv.org/abs/2504.08697

15.04.2025 11:10 — 👍 20    🔁 7    💬 1    📌 2