Muru Zhang @muruzhang - Bluesky Profile

Latest posts by muruzhang.bsky.social on Bluesky

Great to be part of this project led by the amazing @hamishivi.bsky.social. The most fun part (in retrospect) was watching how the results shifted as we scaled up the candidate pool, evaluation suite, and selection size :) And eventually we found that a simple method does best!

04.03.2025 21:14

How well do data-selection methods work for instruction-tuning at scale?

Turns out, when you look at large, varied data pools, lots of recent methods lag behind simple baselines, and a simple embedding-based method (RDS) does best!

More below ⬇️ (1/8)

04.03.2025 17:10

This is a great effort for the migration, thanks for putting it together! Can I be added to the list?

12.11.2024 22:23

@muruzhang is following 20 prominent accounts

Hamish Ivison
@hamishivi

I (try to) do NLP research. Antipodean abroad. currently doing PhD @uwcse, prev @usyd @ai2 🇦🇺🇨🇦🇬🇧 ivison.id.au

Mechanical Dirk
@mechanicaldirk

Training big models at @ai2.bsky.social.

Allen Chang
@cylumn

NLP PhD student at UPenn | Prev USC cylumn.com

Mina Lee
@mnlee

Assistant Professor @ UChicago CS/DSI (NLP & HCI) | Writing with AI ✍️ https://minalee-research.github.io/

Luke Zettlemoyer
@lukezettlemoyer

Professor at UW; Researcher at Meta. LMs, NLP, ML. PNW life.

Jiarui Zhang
@jiaruizhang

USC CS Ph.D. student | Prev Tsinghua Uni | NLP, Multimodal Learning, AI for Science | https://saccharomycetes.github.io/

Alisa Liu
@alisawuffles

phd student at @uwcse

Jenny Shen
@jennyshen056

1st year CS PhD student @UCSD

Ella Minzhi Li
@ellaminzhili

Visiting PhD at Stanford🌲, CS PhD student at NUS 🇸🇬, PhD Fellow @ Google, NLP researcher📒 | https://yocodeyo.github.io | Working on Social Intelligence and Evaluation

Jaspreet
@jaspreetranjit

3rd year PhD student @uscnlp. Interested in NLP x CSS | she/her

Dan Fu
@realdanfu

Incoming assistant professor at UCSD CSE in MLSys. Currently recruiting students! Also academic partner at Together AI. https://danfu.org/

Jonathan Frankle
@jfrankle.com

Chief AI Scientist at Databricks. Founding team at MosaicML. MIT/Princeton alum. Lottery ticket enthusiast. Working on data intelligence.

karpathy
@karpathy

AI @ OpenAI, Tesla, Stanford

Ruiyi Wang
@ruiyiwang

2nd year PhD at UCSD w/ @rajammanabrolu.bsky.social | Prev: @ltiatcmu.bsky.social @umich.edu | Research: Agents🤖, Reasoning🧠, Games👾

Marek Rei
@marekrei

AI/ML/NLP researcher and Senior Lecturer at Imperial College London. Working on language models for planning, reasoning and interpretable decision making

Songlin Yang
@sontaiscute

PhD student @MIT EECS & CSAIL. Working on principled and scalable methods in ML & LLM. | she/her/hers | sustcsonglin.github.io

Yisong Yue
@yisongyue

AI professor at Caltech. General Chair ICLR 2025. http://www.yisongyue.com

Isadora White
@izzycw

PhD Student at UC San Diego | LLM Agents, Reinforcement Learning, Human-AI Collaboration, Multi-Agent Systems

Sameer Singh
@sameer-singh

CS Prof at UC Irvine, CTO/Cofounder at Envive AI | Work on evaluation and robustness of LLMs

Prithviraj "Raj" Ammanabrolu
@rajammanabrolu

AI, RL, NLP, Games | Asst Prof at UCSD | Research Scientist at Nvidia | Lab: http://pearls.ucsd.edu | Personal: prithvirajva.com