Hellina Hailu Nigatu

@hellinanigatu.bsky.social

CS PhD candidate @UCBerkeley. Interested in multilingual and low-resourced language NLP + HCI. @SIGHPC CDS Fellow. Interned @MBZUAI. Current intern at DAIR. Website: https://hhnigatu.github.io

2,251 Followers 258 Following 136 Posts Joined Nov 2024
4 months ago

i’m looking to recruit a postdoc to work on this (documentation + evaluation on accuracy, reliability and societal impacts). hope to advertise detailed descriptions of the role in the coming weeks

28 29 1 0
4 months ago

Book #12 How Dare the Sun Rise? By Sandra Uwiringiyimana from the DRC

This was a heavy one...I had to sit for a while with the first few chapters as Sandra recounted her experience of loss and grief...

8 2 0 0
5 months ago

😂😂 works for me

1 0 0 0
5 months ago

ብቻዋን የበላች....😌 (She who ate alone....)

1 0 1 0
5 months ago

Congrats!!

0 0 0 0
5 months ago

There is so much about navigating the Internet in a low-resourced language that makes one unnecessarily vulnerable to malicious actors. It's not just a quality of experience difference, but literally the soft belly through which misinformation spreaders attack.

10 6 0 0
5 months ago

Thank you friend ❤

1 0 0 0
5 months ago

This work was done with my wonderful collaborators Nuredin Ali, Fiker Tewelde, @schancellor.bsky.social and @iamdaricia.bsky.social

5/n

3 0 1 0
5 months ago

Based on our findings, we introduce the concept of Data Horizons: a critical boundary where algorithmic structures begin to degrade the relevance and reliability of search results.

4/n

2 0 1 0
5 months ago

We investigate online health information on #YouTube and #TikTok in two low-web data languages, Amharic and Tigrinya. We find that linguistic, technological, and socio-cultural constraints on information access and production lead to degraded information quality for low-web data languages.

3/n

4 0 1 0
5 months ago

While social media platforms are increasingly being used as sources of information for critical sectors like healthcare, the quality and quantity of information available is not always guaranteed, especially for languages with limited data available online.

2/n

2 0 1 0
5 months ago

Very excited for our upcoming #AIES paper Into the Void: Understanding Online Health Information in Low-Web Data Languages.

Link: arxiv.org/pdf/2509.20245

1/n

9 1 1 1
5 months ago

I will DM you!

0 0 0 0
5 months ago

እንኳን አብሮ አደረሰን! (Happy holiday to us both!)
So far so good navigating the documentation! Will reach out if i need help or have questions 😊 thank you!

1 0 1 0
5 months ago

@meg48.bsky.social's Ethiopian New Year's gift to me is a new version of HornMorpho exactly as i am working on a project that requires a morphological analyzer for Amharic, Tigrinya, and Afan Oromo 💃💃

0 0 1 0
6 months ago

That explains a lot 😂😂

1 0 0 0
6 months ago

What are you up to Nina 👀

0 0 1 0
6 months ago

If you or your students are interested in visualization tools, may I suggest signing up for my student @parkie-doo.sh's study! We're learning *a lot* about how to build direct manipulation programming tools these days! Please pass the sign up link along to your labs!
docs.google.com/forms/d/e/1F...

5 1 0 0
6 months ago
Portrait of Milagros Miceli in a frame that reads TIME100/AI 2025.

I am thrilled to be recognized by TIME as one of the 100 most influential people worldwide in the field of artificial intelligence for my work with @dataworkersinquiry.bsky.social.

>> #TIME100AI time.com/time100ai

I want to take this opportunity to share a few reflections on this work 👇🧵

57 17 5 3
6 months ago

Oh no! I ran out of wall space for my tally!!!😌

1 0 0 0
6 months ago

I am gonna start a tally for every time i have to contend with publication policies at top tier conferences that implicitly stall Global South scholarship.

2 0 1 0
6 months ago

Came across a common Ethiopian name on one of the poems in this book as a dedication 😊

1 0 0 0
7 months ago

this is not to say all MT is bad or MT has no place in contribution...more on that as an output of my work @dairinstitute.bsky.social 😎

2 1 0 0
7 months ago

Lol here is an example:

A Google-translated Tigrinya article: ti.wikipedia.org/wiki/%E1%88%...

English version: en.wikipedia.org/wiki/Wedding...

I took the part that says "Ethiopia" from the English article and ran it through Google Translate...almost identical output save a few words.

1 2 1 0
7 months ago

Book #11
Missing in Action and Presumed Dead by Rashidah Ismaili from Benin

Got this from Thrift Books and by luck got a version with the author signature ☺️

It's a beautiful collection of poems and my fav one is Nomad attached in the picture below

3 0 1 0
7 months ago

Omg our advisor @schasins.bsky.social got us beanbags for our lab space a while back and we loveee them

2 0 0 0
7 months ago

This is a good step IMO...but i think we conflate "Wikipedia" with "English Wikipedia" and "AI Generated" with "LLM generated"

We should also be having conversations on Machine Translated text in non-English Wikipedia...those are also "AI Generated"

9 4 0 1
7 months ago

Was a pleasure to work with you Chinasa❤ here is to many more collaborations 🥂

1 0 1 0
7 months ago
Screenshot of paper on the ACL website with the title (Examining the Cultural Encoding of Gender Bias in LLMs for Low-Resourced African Languages) and abstract that reads: "Abstract
Large Language Models (LLMs) are deployed in several aspects of everyday life. While the technology could have several benefits, like many socio-technical systems, it also encodes several biases. Trained on large, crawled datasets from the web, these models perpetuate stereotypes and regurgitate representational bias that is rampant in their training data. Languages encode gender in varying ways; some languages are grammatically gendered, while others do not. Bias in the languages themselves may also vary based on cultural, social, and religious contexts. In this paper, we investigate gender bias in LLMs by selecting two languages, Twi and Amharic. Twi is a non-gendered African language spoken in Ghana, while Amharic is a gendered language spoken in Ethiopia. Using these two languages on the two ends of the continent and their opposing grammatical gender system, we evaluate LLMs in three tasks: Machine Translation, Image Generation, and Sentence Completion. Our results give insights into the gender bias encoded in LLMs using two low-resourced languages and broaden the conversation on how culture and social structures play a role in disparate system performances."

My latest work, “Examining the Cultural Encoding of Gender Bias in LLMs for Low-Resourced African Languages,” co-authored with Abigail Oppong and Hellina Nigatu, is now published at the Workshop on Gender Bias in Natural Language Processing at #ACL2025!

aclanthology.org/2025.gebnlp-...

7 3 1 0