Oscar Sainz @osainz - Bluesky Profile

Latest posts by osainz.bsky.social on Bluesky

We have made the #Latxa chatbot available for testing! latxa.hitz.eus

In response to the requests we have received, we have made available to you the most powerful version of Latxa, which comes close to ChatGPT while producing more polished Basque.

31.10.2025 06:57 — 👍 9 🔁 10 💬 1 📌 1

Yesterday one of our researchers, Oscar Sainz (@osainz.bsky.social), received the award for the best doctoral thesis in Artificial Intelligence from the Spanish Association for Artificial Intelligence (AEPIA). Congratulations! 🥳

10.07.2025 07:25 — 👍 5 🔁 2 💬 0 📌 0

"Hello, I am Latxa. What do you want to know today?" The HiTZ research center at EHU has created a Basque-language chatbot. It has not yet been released to the public, but developers and companies can get access to it. Texts from BERRIA were used to train Latxa.
t.co/OPVNnBG2xW?utm_...

16.06.2025 21:00 — 👍 2 🔁 3 💬 0 📌 1

While the experiments were not complicated, they required the work of amazing co-authors, many compute hours, and, of course, the impressive collaboration of the Basque community, which manually assessed the models in an arena-style evaluation.

Thank you!

11.06.2025 18:01 — 👍 1 🔁 1 💬 0 📌 0

In this work we face the challenge of developing instruct models for Basque, a low-resource language.

Continuing to pretrain base models is intuitive, but what about instructed models? We systematically analyze the different approaches to find the best solution.

2/3

11.06.2025 18:01 — 👍 2 🔁 1 💬 1 📌 0

Did you know that you can continue pretraining instructed LLMs without losing their instruction-following capabilities?

We did so to teach Basque to Llama models with promising results!

Interestingly, you only need English instructions and target language corpora 🤯

1/3

11.06.2025 18:01 — 👍 6 🔁 3 💬 1 📌 0

[1/7]
#newHitzPaper

Many languages are underserved by open LLMs, which raises the question: what is the best way to produce open instruction-tuned LLMs for low-resource languages?

We obtained great results for a cost-effective option!

📄Paper: arxiv.org/abs/2506.07597

11.06.2025 10:27 — 👍 7 🔁 3 💬 1 📌 0

@osainz is following 20 prominent accounts

Juan Antonio Pérez
@japer3z

associate professor · University of Alicante · language-inspired artificial intelligence, inclusive machine translation, grande cuisine with neural networks

Pukatata
@pukatata

Oier Ijurko
@oijurko

Jone
@joneareizaga

Illustrator. Currently ruining my life writing Makomia

Ona de Gibert
@onadegibert

PhD Student @HelsinkiNLP / Low-resource, Machine Translation, Knowledge Distillation, Multilinguality

Yusuke (Protein) Sakai
@yusuke1997

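Supposedly doing natural language processing... A gorilla who works out 8 times a week. My progress is muscle! My dream for the future is to be a bodybuilder! NAIST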

Imanol Miranda
@imirandam

PhD student at HiTZ Zentroa (@hitz-zentroa.bsky.social) / IXA Group and the University of the Basque Country (@upvehu.bsky.social).

Arturo
@arturocomputes

NLP PhD student at @naist-nlp.bsky.social

NAIST NLP
@naist-nlp

NLP lab at NAIST in Nara, Japan 🦌 nlp.naist.jp/en/

Rodrigo Agerri
@ragerri

Permanent researcher at HiTZ Center - Ixa, University of the Basque Country UPV/EHU. https://ragerri.github.io/ #NLProc #NLP

Jeremy Barnes
@jeremy-nlp

Assistant professor at UPV/EHU @hitz-zentroa.bsky.social. He/him. #NLP for low-resource settings, currently very interested in evaluation methodology. http://jerbarnes.github.io

Ana Marasović
@anamarasovic

Asst prof @ University of Utah · NLP · she/her 🇭🇷

Sasha Rush
@srushnlp

Professor, Programmer in NYC. Cornell, Hugging Face 🤗

Christopher Manning
@chrmanning

Stanford Linguistics and Computer Science. Director, Stanford AI Lab. Founder of @stanfordnlp.bsky.social . #NLP https://nlp.stanford.edu/~manning/

Anna Rogers
@annarogers

Associate professor at IT University of Copenhagen: NLP, language models, interpretability, AI & society. Co-editor-in-chief of ACL Rolling Review. #NLProc #NLP

Mark Riedl
@markriedl

AI for storytelling, games, explainability, safety, ethics. Professor at Georgia Tech. Associate Director of ML Center at GT. Time travel expert. Geek. Dad. he/him

Melanie Mitchell
@melaniemitchell

Professor, Santa Fe Institute. Research on AI, cognitive science, and complex systems. Website: https://melaniemitchell.me Substack: https://aiguide.substack.com/

Luca Soldaini 🎀
@soldaini.net

I like tokens! Lead for OLMo data at @ai2.bsky.social (Dolma 🍇) w @kylelo.bsky.social. Open source is fun 🤖☕️🍕🏳️‍🌈 Opinions are sampled from my own stochastic parrot more at https://soldaini.net

Emily M. Bender
@emilymbender

Book: https://thecon.ai Web: https://faculty.washington.edu/ebender

Margaret Mitchell
@mmitchell

Researcher trying to shape AI towards positive outcomes. ML & Ethics +birds. Generally trying to do the right thing. TIME 100 | TED speaker | Senate testimony provider | Navigating public life as a recluse. Former: Google, Microsoft; Current: Hugging Face