Joel Mire's Avatar

Joel Mire

@joelmire.bsky.social

PhD student @ltiatcmu.bsky.social. he/him

104 Followers  |  225 Following  |  23 Posts  |  Joined: 17.11.2024
Posts Following

Posts by Joel Mire (@joelmire.bsky.social)

Cyberpunk!

03.02.2026 01:03 โ€” ๐Ÿ‘ 1    ๐Ÿ” 0    ๐Ÿ’ฌ 0    ๐Ÿ“Œ 0
Post image

๐ŸŽญ How do LLMs (mis)represent culture?
๐Ÿงฎ How often?
๐Ÿง  Misrepresentations = missing knowledge? spoiler: NO!

At #CHI2026 we are bringing โœจTALESโœจ a participatory evaluation of cultural (mis)reps & knowledge in multilingual LLM-stories for India

๐Ÿ“œ arxiv.org/abs/2511.21322

1/10

02.02.2026 21:38 โ€” ๐Ÿ‘ 45    ๐Ÿ” 21    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 2
Preview
CMU LTI Summer 2026 Internship Program Application We are looking for applicants for the Carnegie Mellon University Language Technology Institute's Summer 2026 "Language Technology for All" internship program. The main goal of this internship is to pr...

๐Ÿš€ Apply to CMU LTIโ€™s Summer 2026 โ€œLanguage Technology for Allโ€ internship! ๐ŸŽ“ Open to preโ€‘doctoral students new to language tech (nonโ€‘CS backgrounds welcome). ๐Ÿ”ฌ 12โ€“14 weeks inโ€‘person in Pittsburgh โ€” travel + stipend paid. ๐Ÿ’ธ Deadline: Feb 20, 11:59pm ET. Apply โ†’ forms.gle/cUu8g6wb27Hs...

02.02.2026 15:41 โ€” ๐Ÿ‘ 14    ๐Ÿ” 12    ๐Ÿ’ฌ 2    ๐Ÿ“Œ 0
Post image Post image Post image

New paper to appear at EACL 2026 main conference, and it's now up on arxiv: arxiv.org/pdf/2505.17536. The character limit here is insane, so I'll let the screenshots speak for themselves. We put together a new dataset for conversational role attribution & thread disentanglement.

29.01.2026 21:14 โ€” ๐Ÿ‘ 7    ๐Ÿ” 3    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 0

This was joint work with my wonderful collaborators @mariaa.bsky.social, @stevewilson.bsky.social, @zxma.bsky.social, Achyut Ganti, @andrewpiper.bsky.social, and advisor @maartensap.bsky.social, at CU Boulder, UM-Flint, @uconn.bsky.social, and @ltiatcmu.bsky.social!

19.12.2025 23:05 โ€” ๐Ÿ‘ 3    ๐Ÿ” 0    ๐Ÿ’ฌ 0    ๐Ÿ“Œ 0
Preview
Social Story Frames: Contextual Reasoning about Narrative Intent and Reception Reading stories evokes rich interpretive, affective, and evaluative responses, such as inferences about narrative intent or judgments about characters. Yet, computational models of reader response are...

Paper: arxiv.org/abs/2512.15925
Code: github.com/joel-mire/so...
SSF-Corpus: huggingface.co/datasets/joe...
SSF-Generator: huggingface.co/joelmire/lla...
SSF-Classifier: huggingface.co/joelmire/lla...

19.12.2025 23:05 โ€” ๐Ÿ‘ 1    ๐Ÿ” 0    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 0

SocialStoryFrames helps illuminate the social topography of narrative practices and reader response in online communities, opening avenues for future research in computational social science and cultural analytics!

19.12.2025 23:05 โ€” ๐Ÿ‘ 0    ๐Ÿ” 0    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 0
Scatterplot with x-axis for 'reader-centric normalized entorpy' and y-axis for 'author-centric normalized entropy'. Each point represents a subreddit and is color-coded according to a legend of topics (e.g., fashion, hobby).

Scatterplot with x-axis for 'reader-centric normalized entorpy' and y-axis for 'author-centric normalized entropy'. Each point represents a subreddit and is color-coded according to a legend of topics (e.g., fashion, hobby).

Lastly, we quantify the diversity (entropy) of author goals and reader reactions using SSF-Taxonomy.

Quadrants suggest distinct patterns: r/Frugal & r/techsupport have predictable scripts for authors and readers; r/Fitness has varied authorial approaches, predictable reader reactions.

19.12.2025 23:05 โ€” ๐Ÿ‘ 2    ๐Ÿ” 0    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 0
Scatterplot comparing subreddit-pair similarity according to two similarity measures: semantic similarity and ssf-sim. The plot shows clusters of labels at regions of high and low agreement between the two measures. For example, both measures score the CFB-nfl highly, while only the ssf-sim measure scores apple-books highly.

Scatterplot comparing subreddit-pair similarity according to two similarity measures: semantic similarity and ssf-sim. The plot shows clusters of labels at regions of high and low agreement between the two measures. For example, both measures score the CFB-nfl highly, while only the ssf-sim measure scores apple-books highly.

SocialStoryFrames provides a new wayโ€”beyond semantic similarityโ€”to compare social functions of storytelling across communities.

Even topically different communities (e.g., r/MakeupAddiction vs. r/buildapc) can show surprisingly similar social storytelling dynamics.

19.12.2025 23:05 โ€” ๐Ÿ‘ 3    ๐Ÿ” 0    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 0
Bar plots for the overall_goal and narrative_intent dimensions of SSF-Taxonomy. Each bar plot shows relative proportion of dimensions sub-labels. For example, 'provide_info_support' has the highest proportion of overall_goal labels, followed by provide_experiential_accounts, and persuade_debate.

Bar plots for the overall_goal and narrative_intent dimensions of SSF-Taxonomy. Each bar plot shows relative proportion of dimensions sub-labels. For example, 'provide_info_support' has the highest proportion of overall_goal labels, followed by provide_experiential_accounts, and persuade_debate.

Using the framework, we analyze narrative intents across Reddit. Most common: justify/challenge belief (40%), clarify (14%), vent (14%), show identity (10%).

Association tests cast particular forms of storytelling (e.g., conveying similar experience) as mechanisms for empathy.

19.12.2025 23:05 โ€” ๐Ÿ‘ 1    ๐Ÿ” 0    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 0
Table reporting inference classification performance for GPT-4.1 (k-shot) and SSF-Classifier across all SSF-Taxonomy dimensions.

Table reporting inference classification performance for GPT-4.1 (k-shot) and SSF-Classifier across all SSF-Taxonomy dimensions.

For inference classification, we build a zero-shot SSF-Classifier using the same distillation approach, guided by a k-shot GPT-4.1 teacher.

Expert human evaluation shows strong performance, with an average Micro F1 of 0.85 and Macro F1 of 0.79 across SSF-Taxonomy dimensions.

19.12.2025 23:05 โ€” ๐Ÿ‘ 0    ๐Ÿ” 0    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 0
Stacked bar plot comparing human plausibility ratings for SSF-Classifier and GPT-4o across all SSF-Taxonomy dimensions.

Stacked bar plot comparing human plausibility ratings for SSF-Classifier and GPT-4o across all SSF-Taxonomy dimensions.

For inference generation, we create SSF-Generator via SFT distillation of GPT-4o on a Llama3.1 base.

We validate inference plausibility through a human survey (N=382).

94% of inferences were deemed plausible, with 78% deemed very/somewhat likely.

19.12.2025 23:05 โ€” ๐Ÿ‘ 0    ๐Ÿ” 0    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 0

We define 2 tasks:
1. Generate plausible reader-response inferences given a story and its community/conversational context
2. Classify inferences onto taxonomy subdimensions.

We also curate SSF-Corpus: 6,140 storytelling contexts from 50+ Reddit communities.

19.12.2025 23:05 โ€” ๐Ÿ‘ 0    ๐Ÿ” 0    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 0
Diagram of SSF-Taxonomy. Shows information flow from the author and community + conversational context into readers, who then have many different forms of reader response. These range from author-oriented inferences (overall goal, narrative intent, author emotional response), causal inferences (causal explanation, prediction), value judgments (character appraisal, moral, stance), and feelings (narrative feeling, aesthetic feeling).

Diagram of SSF-Taxonomy. Shows information flow from the author and community + conversational context into readers, who then have many different forms of reader response. These range from author-oriented inferences (overall goal, narrative intent, author emotional response), causal inferences (causal explanation, prediction), value judgments (character appraisal, moral, stance), and feelings (narrative feeling, aesthetic feeling).

We introduce SocialStoryFrames, a framework for contextual reasoning about narrative intent and reader response for social media stories.

Its foundation is SSF-Taxonomy, a taxonomy of dimensions of reader response, grounded in narrative theory, pragmatics, and psychology.

19.12.2025 23:05 โ€” ๐Ÿ‘ 1    ๐Ÿ” 0    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 0
Screenshot of paper title and authors. 

Title: Social Story Frames: Contextual Reasoning about Narrative Intent and Reception
Authors: Joel Mire, Maria Antoniak, Steven R. Wilson, Zexin Ma, Achyutarama R. Ganti, Andrew Piper, Maarten Sap

Screenshot of paper title and authors. Title: Social Story Frames: Contextual Reasoning about Narrative Intent and Reception Authors: Joel Mire, Maria Antoniak, Steven R. Wilson, Zexin Ma, Achyutarama R. Ganti, Andrew Piper, Maarten Sap

Reading social media stories evokes a wide range of contextual reader reactionsโ€”inferential, affective, evaluativeโ€”yet we lack methods to study these at scale.

Excited to share our new paper that builds a framework for analyzing storytelling practices across online communities!

19.12.2025 23:05 โ€” ๐Ÿ‘ 22    ๐Ÿ” 7    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 1
A staircase in the new School of Computer, Data & Information Sciences building at Wisconsin Madison. Tan wood structures surround tapestry art and a small indoor garden.

A staircase in the new School of Computer, Data & Information Sciences building at Wisconsin Madison. Tan wood structures surround tapestry art and a small indoor garden.

A view from above of the staircases in the Wisconsin CDIS building

A view from above of the staircases in the Wisconsin CDIS building

An shot from below of winding wooden staircases and a glass atrium rooftop. The new School of Computer, Data & Information Sciences building at Wisconsin Madison.

An shot from below of winding wooden staircases and a glass atrium rooftop. The new School of Computer, Data & Information Sciences building at Wisconsin Madison.

A bicolor white cat with seal-colored markings, looking upwards with big wide dark eyes.

A bicolor white cat with seal-colored markings, looking upwards with big wide dark eyes.

It's the season for PhD apps!! ๐Ÿฅง ๐Ÿฆƒ โ˜ƒ๏ธ โ„๏ธ

Apply to Wisconsin CS to research
- Societal impact of AI
- NLP โ†โ†’ CSS and cultural analytics
- Computational sociolinguistics
- Human-AI interaction
- Culturally competent and inclusive NLP
with me!

lucy3.github.io/prospective-...

11.11.2025 22:32 โ€” ๐Ÿ‘ 51    ๐Ÿ” 16    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 1
Performance of a sweep of models on Oolong-synth and Oolong-real. Performance decreases with increasing context length, sometimes steeply.

Performance of a sweep of models on Oolong-synth and Oolong-real. Performance decreases with increasing context length, sometimes steeply.

Can LLMs accurately aggregate information over long, information-dense texts? Not yetโ€ฆ

We introduce Oolong, a dataset of simple-to-verify information aggregation questions over long inputs. No model achieves >50% accuracy at 128K on Oolong!

07.11.2025 17:07 โ€” ๐Ÿ‘ 50    ๐Ÿ” 20    ๐Ÿ’ฌ 3    ๐Ÿ“Œ 3
Post image

I'm recruiting multiple PhD students for Fall 2026 in Computer Science at @hopkinsengineer.bsky.social ๐Ÿ‚

Apply to work on AI for social sciences/human behavior, social NLP, and LLMs for real-world applied domains you're passionate about!

Learn more at kristinagligoric.com & help spread the word!

05.11.2025 14:56 โ€” ๐Ÿ‘ 28    ๐Ÿ” 17    ๐Ÿ’ฌ 0    ๐Ÿ“Œ 1
Post image

How and when should LLM guardrails be deployed to balance safety and user experience?

Our #EMNLP2025 paper reveals that crafting thoughtful refusals rather than detecting intent is the key to human-centered AI safety.

๐Ÿ“„ arxiv.org/abs/2506.00195
๐Ÿงต[1/9]

20.10.2025 20:04 โ€” ๐Ÿ‘ 9    ๐Ÿ” 3    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 0
Post image Post image

10 years after the initial idea, Artificial Humanities is here! Thanks so much to all who have preordered it. I hope you enjoy reading it and find this research approach as generative as I do. More to come!

27.09.2025 02:28 โ€” ๐Ÿ‘ 38    ๐Ÿ” 13    ๐Ÿ’ฌ 5    ๐Ÿ“Œ 3
Academic paper titled un-straightening generative ai: how queer artists surface and challenge the normativity of generative ai models

The piece is written by Jordan Taylor, Joel Mire, Franchesca Spektor, Alicia DeVrio, Maarten Sap, Haiyi Zhu, and Sarah Fox.

As an image titled 24 attempts at intimacy showing 24 ai generated images with the word intimacy, none of which seems to include same gender couples

Academic paper titled un-straightening generative ai: how queer artists surface and challenge the normativity of generative ai models The piece is written by Jordan Taylor, Joel Mire, Franchesca Spektor, Alicia DeVrio, Maarten Sap, Haiyi Zhu, and Sarah Fox. As an image titled 24 attempts at intimacy showing 24 ai generated images with the word intimacy, none of which seems to include same gender couples

๐Ÿณ๏ธโ€๐ŸŒˆ๐ŸŽจ๐Ÿ’ป๐Ÿ“ข Happy to share our workshop study on queer artistsโ€™ experiences critically engaging with GenAI

Looking forward to presenting this work at #FAccT2025 and you can read a pre-print here:
arxiv.org/abs/2503.09805

14.05.2025 18:38 โ€” ๐Ÿ‘ 27    ๐Ÿ” 4    ๐Ÿ’ฌ 2    ๐Ÿ“Œ 0
Post image

When it comes to text prediction, where does one LM outperform another? If you've ever worked on LM evals, you know this question is a lot more complex than it seems. In our new #acl2025 paper, we developed a method to find fine-grained differences between LMs:

๐Ÿงต1/9

09.06.2025 13:47 โ€” ๐Ÿ‘ 72    ๐Ÿ” 21    ๐Ÿ’ฌ 2    ๐Ÿ“Œ 2
An overview of the work โ€œResearch Borderlands: Analysing Writing Across Research Culturesโ€ by Shaily Bhatt, Tal August, and Maria Antoniak. The overview describes that We  survey and interview interdisciplinary researchers (ยง3) to develop a framework of writing norms that vary across research cultures (ยง4) and operationalise them using computational metrics (ยง5). We then use this evaluation suite for two large-scale quantitative analyses: (a) surfacing variations in writing across 11 communities (ยง6); (b) evaluating the cultural competence of LLMs when adapting writing from one community to another (ยง7).

An overview of the work โ€œResearch Borderlands: Analysing Writing Across Research Culturesโ€ by Shaily Bhatt, Tal August, and Maria Antoniak. The overview describes that We survey and interview interdisciplinary researchers (ยง3) to develop a framework of writing norms that vary across research cultures (ยง4) and operationalise them using computational metrics (ยง5). We then use this evaluation suite for two large-scale quantitative analyses: (a) surfacing variations in writing across 11 communities (ยง6); (b) evaluating the cultural competence of LLMs when adapting writing from one community to another (ยง7).

๐Ÿ–‹๏ธ Curious how writing differs across (research) cultures?
๐Ÿšฉ Tired of โ€œculturalโ€ evals that don't consult people?

We engaged with interdisciplinary researchers to identify & measure โœจcultural normsโœจin scientific writing, and show thatโ—LLMs flatten themโ—

๐Ÿ“œ arxiv.org/abs/2506.00784

[1/11]

09.06.2025 23:29 โ€” ๐Ÿ‘ 72    ๐Ÿ” 30    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 5

This looks incredible! Thanks for sharing the syllabus!

04.06.2025 17:23 โ€” ๐Ÿ‘ 1    ๐Ÿ” 0    ๐Ÿ’ฌ 0    ๐Ÿ“Œ 0
Post image

Iโ€™m thrilled to share RewardBench 2 ๐Ÿ“Šโ€” We created a new multi-domain reward model evaluation that is substantially harder than RewardBench, we trained and released 70 reward models, and we gained insights about reward modeling benchmarks and downstream performance!

02.06.2025 23:41 โ€” ๐Ÿ‘ 22    ๐Ÿ” 6    ๐Ÿ’ฌ 2    ๐Ÿ“Œ 1
Wisconsin-Madison's tree-filled campus, next to a big shiny lake

Wisconsin-Madison's tree-filled campus, next to a big shiny lake

A computer render of the interior of the new computer science, information science, and statistics building. A staircase crosses an open atrium with visibility across multiple floors

A computer render of the interior of the new computer science, information science, and statistics building. A staircase crosses an open atrium with visibility across multiple floors

I'm joining Wisconsin CS as an assistant professor in fall 2026!! There, I'll continue working on language models, computational social science, & responsible AI. ๐ŸŒฒ๐Ÿง€๐Ÿšฃ๐Ÿปโ€โ™€๏ธ Apply to be my PhD student!

Before then, I'll postdoc for a year in the NLP group at another UW ๐Ÿ”๏ธ in the Pacific Northwest

05.05.2025 19:54 โ€” ๐Ÿ‘ 145    ๐Ÿ” 14    ๐Ÿ’ฌ 16    ๐Ÿ“Œ 3
Post image

When interacting with ChatGPT, have you wondered if they would ever "lie" to you? We found that under pressure, LLMs often choose deception. Our new #NAACL2025 paper, "AI-LIEDAR ," reveals models were truthful less than 50% of the time when faced with utility-truthfulness conflicts! ๐Ÿคฏ 1/

28.04.2025 20:36 โ€” ๐Ÿ‘ 25    ๐Ÿ” 9    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 3
A bar plot comparing the storytelling rates for different topics in the example dataset of congressional speeches. There are often large differences between storytelling and non-storytelling for individual topics. For example, the topic whose top words read "NUM, years, service, great, state" has much more storytelling that non-storytelling.

A bar plot comparing the storytelling rates for different topics in the example dataset of congressional speeches. There are often large differences between storytelling and non-storytelling for individual topics. For example, the topic whose top words read "NUM, years, service, great, state" has much more storytelling that non-storytelling.

The top five congressional speeches for the topic "NUM, years, service, great state." All of the documents honor the lives of important people.

The top five congressional speeches for the topic "NUM, years, service, great state." All of the documents honor the lives of important people.

I updated our ๐Ÿ”ญStorySeeker demo. Aimed at beginners, it briefly walks through loading our model from Hugging Face, loading your own text dataset, predicting whether each text contains a story, and topic modeling and exploring the results. Runs in your browser, no installation needed!
โ†ณ

15.04.2025 12:05 โ€” ๐Ÿ‘ 20    ๐Ÿ” 6    ๐Ÿ’ฌ 1    ๐Ÿ“Œ 0

New work on multimodal framing! ๐Ÿ’ซ

Some fun results: comparisons of the same frame when expressed in images vs texts. When the "crime" frame is expressed in the article text, there are more political words in the text, but when the frame is expressed in the article image, more police words.

07.04.2025 09:48 โ€” ๐Ÿ‘ 44    ๐Ÿ” 10    ๐Ÿ’ฌ 0    ๐Ÿ“Œ 1

This was joint work with my co-author Zubin Aysola; collaborators @dchechel.bsky.social, Nick Deas, and @chryssazrv.bsky.social; and advisor @maartensap.bsky.social at @ltiatcmu.bsky.social @scsatcmu.bsky.social @columbiauniversity.bsky.social, and the @istecnico.bsky.social! (10/10)

06.03.2025 19:49 โ€” ๐Ÿ‘ 4    ๐Ÿ” 0    ๐Ÿ’ฌ 0    ๐Ÿ“Œ 0