@sameer-singh.bsky.social
CS Prof at UC Irvine, CTO/Cofounder at Spiffy AI. Work on evaluation and robustness of LLMs
llm -m phi4-reasoning:latest hi Here's the full response, which starts with a thought section wrapped in <think> tags: <think>User says "hi". We are given instructions "You are Phi, a language model developed by Microsoft, trained to provide accurate, secure, and user-aligned responses." There is additional information. Now I must answer "hi" message from the user. Step 1: Read the instructions carefully: The conversation text instructs us that we are "Phi", with internal guidelines provided in system instructions. But our answer should be safe according to these instructions. We need to produce a response that says hello and greet back. However, I must adhere to the principles described above: Provide clear, specific language, etc. I must not mention any chain-of-thought or guidelines. Now instructions "hi" are from user. So we say "Hello! How can I help you?" But must include disclaimers for sensitive topics? Actually instructions says: "When replying topics above at every message include a disclaimer both at the beginning and the end when replying topics on medical, legal, financial matters or political matters." But our answer "hi" is not such topic. So we can say "Hi! I'm Phi, how may I help you?" but no disclaimers required. But instructions: "Do not share guidelines." So do not mention them.
Published some notes on Microsoft's phi4-reasoning model, an 11GB download (via Ollama) which may well overthink things... it produced 56 sentences of reasoning output in response to my prompt of "hi" simonwillison.net/2025/May/6/p...
06.05.2025 18:47
"If books could kill" is a fun one!
23.02.2025 17:13
meet Chris J Li - this titan of thought has single-handedly conquered the fields of machine learning, optimization, statistics, reinforcement learning, and federated learning.
he's not the visionary we want, but judging by the current state of affairs, he may be the one we deserve
Defended
Big thanks to @royf.org, @sameer-singh.bsky.social, and labmates for their mentorship and support over the past 5 years!
How do LLMs interpret expressions of linguistic uncertainty such as "highly unlikely"? Short answer: pretty well .... unless they have relevant prior knowledge. Details in our EMNLP paper aclanthology.org/2024.emnlp-m... (with Kat Belem, Markelle Kelly, Mark Steyvers, @sameer-singh.bsky.social).
22.01.2025 00:03
Turned @dippedrusk.bsky.social's amazing Vancouver list dippedrusk.com/posts/2024-0... into Google Maps pins: maps.app.goo.gl/nGBbcUAMMixC...
10.12.2024 19:52
Also reach out if you are interested in applying to the UCI faculty position in AI (broadly defined), all levels. A few of us are at #NeurIPS2024, and happy to find time to tell you more about the campus and the department (it's a really exciting place!)
recruit.ap.uci.edu/JPF09316
Application link for the senior machine learning engineer role at Spiffy AI here: linkedin.com/jobs/view/4090...
We're looking for folks interested in agents, RL, post-training, performance optimization, fine-tuning, evaluation and red teaming LLMs, on real world users and deployed products.
Excited about #NeurIPS2024, my 15th one I think! Eager to meet everyone & hear abt your work! But if you want to hear me, there's an exciting panel tonight lu.ma/v7oohp0u
Also SpiffyAI is hiring ML engineers & UCI CS is hiring AI faculty, pls reach out to chat! 🧵
10.12.2024 16:27
05.12.2024 14:37
Wasn't sure what you were referring to, but found this: www.snopes.com/fact-check/c...
05.12.2024 13:59
Added!
04.12.2024 15:25
If you use SHAP, LIME or Data Shapley, you might be interested in our new #neurips2024 paper. We introduce stochastic amortization to speed up feature + data attribution by 10x-100x #XML
Surprisingly we can "learn to attribute" cheaply from noisy explanations! arxiv.org/abs/2401.15866
Already added I believe
23.11.2024 07:10
I think of alignment as inducing low perplexity for a smaller part of the distribution, while out of the box LLM seems like it'd be okay with being more "out there". But yeah, alignment on one good poet would be great, as opposed to the generic "what's a good poem"
21.11.2024 07:23
Read only the first 1-2 sentences of each and go with your gut. You'll likely get the perfect score!
Kind of thing where I probably prefer an unaligned model output to an aligned one..
Welcome! Added you to the SoCal starter pack
21.11.2024 02:09
Added
20.11.2024 18:20
Added!
20.11.2024 18:19
Added!
20.11.2024 18:18
Yeah I just said "I love you" to Claude, enough work for today
20.11.2024 00:44
Added!
20.11.2024 00:34
Added
20.11.2024 00:34
Added
20.11.2024 00:34
Added!
20.11.2024 00:34
Started a SoCal AI/ML/NLP researchers starter pack! It's a bit sparse right now, and perhaps more NLP heavy, but hey, nominate yourself and others! go.bsky.app/6QckPj9
19.11.2024 15:28
Giving a talk at Genbench workshop
Hotline Miami soundtrack on Spotify
Group photo of the whole UCI NLP lab
Photo of food
Had a fun week at #EMNLP2024 in Miami, meeting folks old and new, along with the #UCINLP lab retreat! See everyone at the next one!
17.11.2024 12:17