✈️ Headed to @iclr-conf.bsky.social — whether you’ll be there in person or tuning in remotely, I’d love to connect!
We’ll be presenting our paper on pre-training stability in language models and the PolyPythias 🧵
🔗 ArXiv: arxiv.org/abs/2503.09543
🤗 PolyPythias: huggingface.co/collections/...
Work in progress -- suggestions for NLPers based in the EU/Europe & already on Bluesky are very welcome!
go.bsky.app/NZDc31B
I would like to be added! 😄
Hi, I'd like to be part of this!
👋
💬Panel discussion with Sally Haslanger and Marjolein Lanzing: A philosophical perspective on algorithmic discrimination
Is discrimination the right way to frame the issues of lang tech? Or should we answer deeper-rooted questions? And how does tech fit into systems of oppression?
📄Undesirable Biases in NLP: Addressing Challenges of Measurement
We also presented our own work on strategies for testing the validity and reliability of LM bias measures:
www.jair.org/index.php/ja...
🔑Keynote by @zeerak.bsky.social: On the promise of equitable machine learning technologies
Can we create equitable ML technologies? Can statistical models faithfully express human language? Or are tokenizers "tokenizing" people—creating a Frankenstein monster of lived experiences?
📄A Capabilities Approach to Studying Bias and Harm in Language Technologies
@hellinanigatu.bsky.social introduced us to the Capabilities Approach and how it can help us better understand the social impact of language technologies—with case studies of failing tech in the Majority World.
📄Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution
Flor Plaza discussed the importance of studying gendered emotional stereotypes in LLMs, and how collaborating with philosophers greatly benefits work on bias evaluation.
🔑Keynote by John Lalor: Should Fairness be a Metric or a Model?
While fairness is often viewed as a metric, using integrated models instead can help with explaining upstream bias, predicting downstream fairness, and capturing intersectional bias.
📄A Decade of Gender Bias in Machine Translation
Eva Vanmassenhove: how has research on gender bias in MT developed over the years? Important issues, like non-binary gender bias, now fortunately get more attention. Yet, fundamental problems (that initially seemed trivial) remain unsolved.
📄MBBQ: A Dataset for Cross-Lingual Comparison of Stereotypes in Generative LLMs
Vera Neplenbroek presented a multilingual extension of the BBQ bias benchmark to study bias across English, Dutch, Spanish, and Turkish.
"Multilingual LLMs are not necessarily multicultural!"
🔑Keynote by Dong Nguyen: When LLMs meet language variation: Taking stock and looking forward
Non-standard language is often treated as noisy or incorrect data, but this ignores the reality of language. Variation should play a larger role in LLM development, and sociolinguistics can help!
Last week, we organized the workshop "New Perspectives on Bias and Discrimination in Language Technology" 🤖 @uvahumanities.bsky.social @amsterdamnlp.bsky.social
We're looking back at two inspiring days of talks, posters, and discussions—thanks to everyone who participated!
wai-amsterdam.github.io
This is a friendly reminder that there are 7 days left to submit your extended abstract to this workshop!
(Since the workshop is non-archival, previously published work is welcome too. So consider submitting previous/future work to join the discussion in Amsterdam!)
This workshop is organized by University of Amsterdam researchers Katrin Schulz, Leendert van Maanen, @wzuidema.bsky.social, Dominik Bachmann, and myself.
More information on the workshop can be found on the website, which will be updated regularly.
wai-amsterdam.github.io
🌟The goal of this workshop is to bring together researchers from different fields to discuss the state of the art in bias measurement and mitigation in language technology and to explore new approaches.
One of the central issues discussed in the context of the societal impact of language technology is that ML systems can contribute to discrimination. Despite efforts to address these issues, we are far from solving them.
We're super excited to host Dong Nguyen, John Lalor, @zeerak.bsky.social and @azjacobs.bsky.social as invited speakers at this workshop! Submit an extended abstract to join the discussions; either in a 20min talk or a poster session.
📝Deadline Call for Abstracts: 15 Sep, 2024
Working on #bias & #discrimination in #NLP? Passionate about integrating insights from different disciplines? And do you want to discuss current limitations of #LLM bias mitigation work? 🤖
👋Join the workshop New Perspectives on Bias and Discrimination in Language Technology 4&5 Nov in #Amsterdam!
release day release day 🥳 OLMo 1B + 7B out today, and 65B soon...
OLMo accelerates the study of LMs. We release *everything*, from the toolkit for creating data (Dolma) to training/inference code
Blog: blog.allenai.org/olmo-open-la...
OLMo paper: allenai.org/olmo/olmo-pa...
Dolma paper: allenai.org/olmo/dolma-p...
But it's exciting to see more work dedicated to sharing models, checkpoints, and training data with the (research) community!
Don't forget EleutherAI's Pythia, which came out last year! dl.acm.org/doi/10.5555/...
@michahu.bsky.social did an interview laying out our recent paper, which contains the figures I insist on calling "the Mona Lisa of training visualizations"
I look forward to debates in the philosophy of (techno)science by people more knowledgeable than I am. I'd say we have some philosophical basis for believing people are capable of such tasks. But there is also sufficient reason to believe LLM ≠ human, so any trust in one does not automatically transfer to the other.
That is not to say I am categorically against using LLMs as epistemic tools, but from my own experience as a bias and interpretability researcher I think we should be careful of potential biases/failure modes. If we are transparent about their use and potential issues, I could see LLMs being useful.
I think it all boils down to the reliability and validity of the approach. We don't have good methodologies (yet) to assess these qualities for LLMs, compared to simpler, more interpretable techniques. And intuitively I think we have more reasons to trust (expert) human annotators—see also psychometrics.
A 🧵thread about strategies for improving social bias evaluations of LMs. #blueskAI 🤖
bsky.app/profile/ovdw...
Special thanks go to Dominik Bachmann (shared first author), whose insights from the perspective of psychometrics helped shape not only this paper, but also my views of current AI fairness practices more broadly.