I'll take this opportunity to say that UMass Amherst is amazing in many ways. But, their "8x #1 in dining by Princeton Review" dining halls are not as good as Mount Holyoke dining.
26.07.2025 23:36

@guha-anderson.com.bsky.social
hacker / CS professor https://www.khoury.northeastern.edu/~arjunguha/
I still don't understand why anyone cares about food quality. How else is a 21 year old boy to learn to cook if the dining hall food doesn't suck?
26.07.2025 20:49

I know folks in college PR, and I heard about it when a Reddit thread about my exam was reported up to the college president.
25.07.2025 20:27

I learned things about containers I didn't know from her containers book. Very annoying. I always get annoyed the more I learn about containers, independent of source.
25.07.2025 15:08

1850's baseball.
22.06.2025 18:37

The recent *Your Brain on ChatGPT* paper seems cool.
When a ugrad approaches me to do research, I still have them read a prefix of PLAI (1st ed.) and demonstrate that they understand it.
I wonder what would happen if I asked them to self-study with an LLM exclusively. Has anyone tried this?
Now that I've written more Python than I care to admit, I'm getting tired of duplicating abstractions: one for sync code and another for async. Effect polymorphism wanted.
21.05.2025 16:06

Looking forward to this.
15.04.2025 19:13

Are you hiring new grads (BS) for this kind of work? I can suggest some people.
29.03.2025 07:01

In this episode of HUB History, Elena Palladino discusses the creation of the Quabbin Reservoir, the four towns that were sacrificed for its construction, and her book Lost Towns of the Swift River Valley. Listen now!
www.hubhistory.com/episodes/wat...
I distinctly remember the moment in grad school when I realized I was not going to learn any more PL by taking classes. I felt bad for an instant, and then moved on.
08.03.2025 13:50

There seems to be a fundamental misunderstanding here. I don't think PhD students complete assigned tasks.
arstechnica.com/ai/2025/03/w...
Yes. Still there. Also the pinball machine, the PDP, and @shriram.bsky.social.
05.03.2025 21:07

Photo taken today at @browncsdept.bsky.social. I'm glad to see that the PhD students (@genevievemp.bsky.social), furniture, and faculty seem to have not changed in 10+ years.
05.03.2025 20:06

Congrats to Andy and Rich. A well-deserved recognition of their work and reinforcement learning in general!
awards.acm.org/about/2024-t...
The real lesson from DeepSeek is the importance of good old-fashioned computer science. Every day this week, they've been doing open source releases. The latest is their in-house distributed file system. github.com/deepseek-ai/...
28.02.2025 10:07

I think language devs can help in a few ways. Benchmarking is the easiest for us to do and necessary to guide LLM development. I've been meaning to write up my experience being the only PL person in the room for the StarCoder LLM development process. It was very informative.
26.02.2025 22:16

Please help amplify ARBOR, a fantastic new research opportunity! If you'd like to start contributing, NDIF is now hosting DeepSeek R1 8B and 70B, open for all researchers to experiment on via our API.
Sign up for API access here: login.ndif.us
Or, ask these products to write a 2 page ICFP workshop paper in one's area of expertise. OK if it's incremental, just has to be novel for 2025 and clearly positioned wrt related work. I know PhD students who can do this.
12.02.2025 22:05

Our tech report has more fun examples of short prompts that make reasoning models crunch for several minutes or longer: khoury.northeastern.edu/~arjunguha/m...
04.02.2025 02:37

If you want to read some deranged thoughts from frustrated models (R1 and Gemini Thinking), check them out here: huggingface.co/spaces/nuprl...
04.02.2025 02:37

We believe our benchmark is out-of-domain for DeepSeek-style models: RL with verifiable rewards on math and programming. It's remarkable that they generalize to this type of verbal reasoning. But, perhaps there are limits to what can be done with verifiable rewards exclusively.
04.02.2025 02:37

However, many problems are so hard that reasoning models "give up": they output solutions that they know are wrong or argue that the problem is impossible to solve. In some cases, R1 gets stuck "thinking forever". (See this example of R1 getting "frustrated.")
04.02.2025 02:37

Our benchmark reveals capability gaps and failure modes that are not evident in existing benchmarks. E.g., we find that o1 is significantly better at these tasks than other reasoning models.
04.02.2025 02:37

In short, we turn the weekly puzzles from the NPR Sunday Puzzle Challenge into a machine-checkable benchmark. These are hard problems, typically solved by a few hundred people a week. But, the answers are obvious when revealed (to U.S. adults).
04.02.2025 02:37

O1, R1, etc. are so good that we evaluate them on "PhD-level" benchmarks. But, these benchmarks are so hard that most people can't even understand what they are testing. We've built a benchmark with problems that are hard to solve but easy to verify: for both humans and models.
04.02.2025 02:37

Last one: there are a LOT of people to blame for this one. I think @jasvir.bsky.social is to blame for this problem in "Humanity's Last Exam".
28.01.2025 18:29

Ugh, who did this? @joepolitz.bsky.social? Wait, was it @dbp.bsky.social? Someone else from @shriram.bsky.social's group?
Also from "Humanity's Last Exam".
OK, who is responsible for this? Is it @natefoster.bsky.social?
Source: "Humanity's Last Exam" www.nytimes.com/2025/01/23/t...