
Greg Durrett

@gregdnlp.bsky.social

CS professor at UT Austin. Large language models and NLP. he/him

2,561 Followers  |  411 Following  |  17 Posts  |  Joined: 29.05.2023

Latest posts by gregdnlp.bsky.social on Bluesky

Picture of the UT Tower taken by me on my first day at UT as a postdoc in 2023!

News🗞️

I will return to UT Austin as an Assistant Professor of Linguistics this fall, and join its vibrant community of Computational Linguists, NLPers, and Cognitive Scientists!🤘

Excited to develop ideas about linguistic and conceptual generalization (recruitment details soon!)

02.06.2025 13:18 — 👍 65    🔁 7    💬 12    📌 2

Great to work on this benchmark with astronomers in our NSF-Simons CosmicAI institute! What I like about it:
(1) focus on data processing & visualization, a "bite-sized" AI4Sci task (not automating all of research)
(2) eval with VLM-as-a-judge (possible with strong, modern VLMs)

02.06.2025 15:49 — 👍 6    🔁 0    💬 0    📌 0
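For the curious, "VLM-as-a-judge" here just means handing the generated figure to a vision-language model along with a rubric. A minimal sketch of that pattern (the model name, rubric, and 1-5 scale are my illustrative choices, not the benchmark's actual protocol):

import base64
from openai import OpenAI

client = OpenAI()

def judge_plot(image_path: str, task: str) -> str:
    # Encode the generated plot so the vision-language model can see it.
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    response = client.chat.completions.create(
        model="gpt-4o",  # any strong VLM; illustrative choice
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": f"Task: {task}\n"
                         "Score this plot 1-5 on correctness and readability, "
                         "then justify the score in one sentence."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content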

The end of US leadership in science, technology, and innovation.

All in one little table.

A tremendous gift to China, courtesy of the GOP.

nsf-gov-resources.nsf.gov/files/00-NSF...

30.05.2025 21:26 — 👍 1057    🔁 424    💬 38    📌 29
Percy Liang on X: "What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire research and development process is public *and* anyone can contribute. We built Marin, an open lab, to fulfill this vision: https://t.co/racsvmhyA3"

Super excited Marin is finally out! Come see what we've been building! Code/platform for training fully reproducible models end-to-end, from data to evals. Plus a new high quality 8B base model. Percy did a good job explaining it on the other place. marin.community

x.com/percyliang/s...

19.05.2025 19:35 — 👍 17    🔁 6    💬 1    📌 0

Check out Anirudh's work on a new benchmark for C-to-Rust transpilation! 100 realistic-scale C projects, plus target Rust interfaces + Rust tests that let us validate the transpiled code beyond what prior benchmarks allow.

23.04.2025 18:37 — 👍 5    🔁 1    💬 0    📌 0

🚀Meet CRUST-Bench, a dataset for C-to-Rust transpilation for full codebases 🛠️
A dataset of 100 real-world C repositories across various domains, each paired with:
🦀 Handwritten safe Rust interfaces.
🧪 Rust test cases to validate correctness.
🧵[1/6]

23.04.2025 17:00 — 👍 17    🔁 5    💬 1    📌 1
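A rough sketch of how one paired instance could be represented and checked (the field names and harness layout are my assumptions, not CRUST-Bench's actual schema): drop the model's Rust into a crate that already contains the target interface and tests, then let cargo decide.

import subprocess
from dataclasses import dataclass
from pathlib import Path

@dataclass
class TranspilationTask:
    c_project: Path        # original C repository
    rust_interface: Path   # handwritten safe Rust interface the output must match
    rust_tests: Path       # test cases the transpiled code must pass

def validate(candidate_rust: str, crate_dir: Path) -> bool:
    """Write model-generated Rust into a crate prepared with the task's
    interface stubs and tests, then use `cargo test` as the verdict."""
    (crate_dir / "src" / "lib.rs").write_text(candidate_rust)
    result = subprocess.run(["cargo", "test"], cwd=crate_dir,
                            capture_output=True, text=True)
    return result.returncode == 0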

Check out Manya's work on evaluation for open-ended tasks! The criteria from EvalAgent can be plugged into LLM-as-a-judge or used for refinement. Great tool with a ton of potential, and there's LOTS to do here for making LLMs better at writing!

22.04.2025 16:30 — 👍 3    🔁 2    💬 0    📌 0
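The "used for refinement" half might look roughly like this critique-and-revise loop; the prompts and model choice are illustrative, not EvalAgent's actual interface:

from openai import OpenAI

client = OpenAI()

def refine(draft: str, criteria: list[str], rounds: int = 2) -> str:
    """Critique a draft against task-specific criteria (e.g., ones a system
    like EvalAgent extracts), then revise. Prompts are illustrative."""
    rubric = "\n".join(f"- {c}" for c in criteria)
    for _ in range(rounds):
        critique = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content":
                       f"Critique this draft against each criterion:\n{rubric}\n\nDraft:\n{draft}"}],
        ).choices[0].message.content
        draft = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content":
                       f"Revise the draft to address the critique.\n\nCritique:\n{critique}\n\nDraft:\n{draft}"}],
        ).choices[0].message.content
    return draft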

Check out Ramya et al.'s work on understanding discourse similarities in LLM-generated text! We see this as an important step in quantifying the "sameyness" of LLM text, which we think will be a step towards fixing it!

21.04.2025 22:10 — 👍 6    🔁 1    💬 0    📌 0
South by Semantics Workshop
Title: "Not-your-mother's connectionism: LLMs as cognitive models"
Speaker: Ellie Pavlick (Brown University)
Date and time: April 23, 2025. 3:30 - 5 PM.
Location: GDC 6.302

Our final South by Semantics lecture at UT Austin is happening on Wednesday April 23!

21.04.2025 13:39 — 👍 15    🔁 4    💬 2    📌 0

Check out @juand-r.bsky.social and @wenxuand.bsky.social 's work on closing generator-validator gaps in LLMs! I really like the formulation of the G-V gap we present, and I was pleasantly surprised by how well the ranking-based training closed the gap. Looking forward to following up in this area!

16.04.2025 18:18 — 👍 10    🔁 2    💬 0    📌 0
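Loosely, the G-V gap is how often a model rejects, as a validator, the very answers it produced as a generator. A toy way to measure that (prompts and model are illustrative, not the paper's setup):

from openai import OpenAI

client = OpenAI()

def gv_gap(questions: list[str]) -> float:
    """Fraction of questions where the model-as-validator says "no" to the
    answer the same model just generated. Illustrative prompts only."""
    disagreements = 0
    for q in questions:
        answer = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": q}],
        ).choices[0].message.content
        verdict = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content":
                       f"Question: {q}\nProposed answer: {answer}\n"
                       "Is this answer correct? Reply yes or no."}],
        ).choices[0].message.content
        if verdict.strip().lower().startswith("no"):
            disagreements += 1
    return disagreements / len(questions)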

If you're scooping up students off the street for writing op-eds, you're secret police, and should be treated accordingly.

26.03.2025 20:00 — 👍 9181    🔁 2355    💬 101    📌 40

I'm excited to announce two papers of ours which will be presented this summer at @naaclmeeting.bsky.social and @iclr-conf.bsky.social !
🧵

11.03.2025 22:03 — 👍 10    🔁 3    💬 1    📌 0

Excited about Proofwala, @amitayush.bsky.social's new framework for ML-aided theorem-proving.

* Paper: arxiv.org/abs/2502.04671
* Code: github.com/trishullab/p...

Proofwala allows the collection of proof-step data from multiple proof assistants (Coq and Lean) and multilingual training. (1/3)

22.02.2025 21:32 — 👍 21    🔁 5    💬 1    📌 1
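The shared format is the key bit: if Coq and Lean proof steps are serialized the same way, one model can train on both. A guess at what a collected step might look like (field names are hypothetical; the paper defines the real schema):

from dataclasses import dataclass

@dataclass
class ProofStep:
    assistant: str   # "coq" or "lean" (hypothetical field names throughout)
    theorem: str     # statement being proved
    goal_state: str  # pretty-printed proof state before the step
    tactic: str      # tactic that was applied: the prediction target

# A Lean-flavored example; (goal_state -> tactic) pairs pooled across both
# assistants would then form the multilingual training set.
step = ProofStep(
    assistant="lean",
    theorem="∀ n : Nat, n + 0 = n",
    goal_state="n : Nat ⊢ n + 0 = n",
    tactic="simp",
)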

Popular or not Dems cannot bend on the need for trans people to be treated with basic humanity and respect. If we give up that because the right made trans people unpopular, we give up everything. They'll dice us group by group like a salami. We die on this hill or we die alone in a ditch

05.02.2025 21:19 — 👍 6908    🔁 1360    💬 143    📌 169

Here are just a few of the NSF review panels that were shut down today, Chuck.

This is research that would have kept us competitive in computer science; now it will be delayed by many months, if not lost forever.

AI is fine but right now the top priority is keeping the lights on at NSF and NIH.

28.01.2025 03:06 — 👍 761    🔁 200    💬 10    📌 6

kicking off 2025 with our OLMo 2 tech report while payin homage to the sequelest of sequels 🫡

🚗 2 OLMo 2 Furious 🔥 is everythin we learned since OLMo 1, with deep dives into:

🚖 stable pretrain recipe
🚔 lr anneal 🤝 data curricula 🤝 soups
🚘 tulu post-train recipe
🚜 compute infra setup

👇🧵

03.01.2025 16:02 — 👍 69    🔁 17    💬 2    📌 1
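On the "lr anneal 🤝 data curricula 🤝 soups" line, my loose reading: anneal the learning rate over several candidate data mixes, then weight-average ("soup") the resulting checkpoints. A minimal souping sketch in PyTorch (not the report's actual code):

import torch

def soup(state_dicts: list[dict]) -> dict:
    """Uniform weight-space average of checkpoints that share an
    architecture, e.g., models annealed on different data mixes."""
    avg = {k: v.clone().float() for k, v in state_dicts[0].items()}
    for sd in state_dicts[1:]:
        for k in avg:
            avg[k] += sd[k].float()
    for k in avg:
        avg[k] /= len(state_dicts)
    return avg

# usage: model.load_state_dict(soup([torch.load(p) for p in checkpoint_paths]))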

Congrats to Prasann and all the other awardees! Full list is here: cra.org/about/awards...

03.01.2025 14:39 — 👍 1    🔁 0    💬 0    📌 0

Before his post-training work, Prasann did a great project on representing LM outputs with lattices, which remains one of my favorite algorithms-oriented papers from my group in the last few years, with a lot of potential for interesting follow-up work!

03.01.2025 14:39 — 👍 3    🔁 0    💬 1    📌 0

He then advanced our understanding of online DPO methods: how can we combine the strengths of reward models and DPO? (also at COLM 2024)

03.01.2025 14:39 — 👍 2    🔁 0    💬 1    📌 0

...established a critical weakness of RLHF with open reward models: spurious correlation with length (COLM 2024)

03.01.2025 14:39 — 👍 1    🔁 0    💬 1    📌 0
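One simple way to surface that kind of length confound: correlate the reward model's scores with response length over a batch of outputs. A quick diagnostic (the responses and rewards come from whatever policy and reward model you're auditing):

import numpy as np

def length_reward_correlation(responses: list[str], rewards: list[float]) -> float:
    """Pearson correlation between response length (in words) and reward.
    A strongly positive value suggests the reward model partly scores length."""
    lengths = np.array([len(r.split()) for r in responses], dtype=float)
    scores = np.array(rewards, dtype=float)
    return float(np.corrcoef(lengths, scores)[0, 1])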

Huge congrats to @prasannsinghal.bsky.social for being one of the 8 CRA Outstanding Undergraduate Researcher Award winners! It has been an absolute privilege to work with Prasann during his time at UT. (And he's applying for PhD programs this year...hint hint...)

Prasann's work 🧡

03.01.2025 14:37 — 👍 23    🔁 4    💬 1    📌 0

What's in an attention head? 🤯

We present an efficient framework – MAPS – for inferring the functionality of attention heads in LLMs ✨directly from their parameters✨

A new preprint with Amit Elhelo 🧵 (1/10)

18.12.2024 17:55 — 👍 63    🔁 14    💬 1    📌 0
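I'll hedge since the details are in the preprint, but parameter-only head analysis in this general spirit often reads off a head's OV circuit: push each token embedding through W_V and W_O and see which output tokens that head promotes. A generic sketch with random stand-in weights (this illustrates the idea, not MAPS itself):

import torch

d_model, d_head, vocab = 64, 16, 100
E = torch.randn(vocab, d_model)      # token embedding matrix (stand-in)
W_V = torch.randn(d_model, d_head)   # one head's value projection
W_O = torch.randn(d_head, d_model)   # one head's output projection

# OV circuit: how attending to each token moves the residual stream,
# read out against the embedding directions of candidate output tokens.
ov = E @ W_V @ W_O @ E.T             # (vocab, vocab) token-to-token map
top_outputs = ov.topk(3, dim=-1).indices  # top promoted outputs per source token
print(top_outputs[:5])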

Thanks for the comments! We fine-tune biases only for certain heads, but at both training and inference time, we still have to use the entire network. So it doesn't save inference-time compute.

11.12.2024 22:08 — 👍 0    🔁 0    💬 1    📌 0
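Concretely: freeze every weight, then learn one small additive bias on the output of each selected attention module via a forward hook. A minimal sketch (module selection and handling are illustrative, not LoFiT's actual code); it also shows why there's no inference-time savings, since the full forward pass still runs:

import torch
import torch.nn as nn

def add_head_biases(model: nn.Module, attn_modules: list[nn.Module],
                    d_model: int) -> nn.ParameterList:
    """Freeze all weights; attach one learnable bias per selected attention
    module, added to its output. Only these biases get gradients."""
    for p in model.parameters():
        p.requires_grad_(False)
    biases = nn.ParameterList(
        nn.Parameter(torch.zeros(d_model)) for _ in attn_modules)
    for attn, b in zip(attn_modules, biases):
        # Forward hooks let us add the bias without touching module code;
        # some attention modules return tuples, hence the isinstance check.
        attn.register_forward_hook(
            lambda mod, inp, out, b=b:
                out + b if isinstance(out, torch.Tensor)
                else (out[0] + b, *out[1:]))
    return biases  # pass biases.parameters() to the optimizer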

I'm at #Neurips2024 this week!

My work (arxiv.org/abs/2406.17692) w/ @gregdnlp.bsky.social & @eunsol.bsky.social exploring the connection between LLM alignment and response pluralism will be at pluralistic-alignment.github.io Saturday. Drop by to learn more!

11.12.2024 17:39 — 👍 28    🔁 6    💬 0    📌 0

bsky.app/profile/anam...
sorry but you gotta give taylor at least 60% of your poster real estate if you want sota in swiftiness

09.12.2024 07:12 — 👍 7    🔁 0    💬 1    📌 0

Missed out on #Swift tickets? No worries—swing by our #SVFT poster at #NeurIPS2024 and catch *real* headliners! 🎤💃🕺
📌Where: East Exhibit Hall A-C #2207, Poster Session 4 East
⏲️When: Thu 12 Dec, 4:30 PM - 7:30 PM PST

#AI #MachineLearning #PEFT #NeurIPS24

09.12.2024 05:55 — 👍 9    🔁 2    💬 1    📌 0

The legendary Putnam math competition had its 85th edition yesterday. Coincidentally, George Tsoukalas will present our paper on PutnamBench, a next-generation #AI4Math benchmark, at #NeurIPS2024 this week: arxiv.org/abs/2407.11214.
If you work on frontier AI for math/reasoning, talk to George!

08.12.2024 20:03 — 👍 15    🔁 3    💬 0    📌 0

I'll be at #NeurIPS2024 w/

- @fcyin.bsky.social's LoFiT: using interp to improve fine-tuning (Weds pm poster & MINT spotlight talk Sun)
- @thomlake.bsky.social's analysis of Overton pluralism (Pluralistic alignment Sat)

Please reach out to me to chat about interp, factuality, reasoning, &c!

08.12.2024 20:38 — 👍 46    🔁 8    💬 1    📌 1

I'm on the academic job market this year! I'm completing my @uwcse.bsky.social @uwnlp.bsky.social Ph.D. (2025), focusing on overcoming LLM limitations like hallucinations by building new LMs.
My Ph.D. work focuses on Retrieval-Augmented LMs to create more reliable AI systems 🧵

04.12.2024 13:26 — 👍 70    🔁 17    💬 3    📌 2

I should influence before my window closes

22.11.2024 23:02 — 👍 8    🔁 0    💬 0    📌 0
