Joe Janizek

@joejanizek.bsky.social

physician-scientist, interested in AI safety/interpretability in biology/medicine. jjanizek.github.io

2,749 Followers  |  598 Following  |  134 Posts  |  Joined: 31.10.2023  |  2.1715

Latest posts by joejanizek.bsky.social on Bluesky

Thrilled to share that @ethanweinberger.bsky.social is becoming Dr. Weinberger in CSE2, where he is presenting his work, including the popular contrastiveVI for single-cell data (Weinberger et al. Nature Methods)! I feel so fortunate to work with such amazing Ph.D. students at @uwcse.bsky.social! 🎉🎓

22.04.2025 02:49 - 👍 7    🔁 2    💬 0    📌 0

Medical education 🌶️🔥 take: the threat of cognitive deskilling from genAI technologies is the #1 thing medical educators need to be talking about right now.

07.03.2025 13:49 - 👍 19    🔁 5    💬 2    📌 1

AI deployments in health are often understudied because they require time and careful analysis. ⌛️🤔

We share thoughts in @ai.nejm.org about a recent AI tool for emergency dept triage that: 1) improves wait times and fairness (!), and 2) helps nurses unevenly based on triage ability

27.02.2025 21:06 - 👍 30    🔁 8    💬 2    📌 1

Really cool paper!

21.02.2025 03:38 - 👍 2    🔁 0    💬 0    📌 0

AKA the “Concrete” distribution, which I think is a much better name lol

21.02.2025 01:24 - 👍 1    🔁 0    💬 0    📌 0

Basically a continuous relaxation of discrete random variables, lets you do stuff like differentiating through sampling (e.g. argmax) operations

21.02.2025 01:23 - 👍 2    🔁 0    💬 1    📌 0
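
As a rough illustration of the continuous relaxation described in the post above (the Gumbel-softmax, or Concrete, distribution), here is a minimal pure-Python sketch. It is not taken from the thread: the function names, the `temperature` parameter, and the example probabilities are all assumptions made for illustration. It only shows the sampling step. In an autodiff framework the same expression would be differentiable with respect to the logits, which is what lets gradients pass through the sampling operation.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) sample via inverse transform: -log(-log(U))."""
    # random() returns values in [0, 1); clamp away from 0 to avoid log(0).
    u = max(rng.random(), 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax_sample(logits, temperature, rng):
    """Return a relaxed one-hot vector: softmax((logits + Gumbel noise) / temperature)."""
    noisy = [(logit + sample_gumbel(rng)) / temperature for logit in logits]
    # Subtract the max before exponentiating for numerical stability.
    top = max(noisy)
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical three-class example with class probabilities 0.1, 0.6, 0.3.
rng = random.Random(0)
logits = [math.log(p) for p in (0.1, 0.6, 0.3)]

# Higher temperature gives a smoother vector; lower temperature pushes the
# sample toward a one-hot vector, approaching a discrete categorical draw.
soft = gumbel_softmax_sample(logits, temperature=1.0, rng=rng)
sharp = gumbel_softmax_sample(logits, temperature=0.05, rng=rng)
print(soft)
print(sharp)
```

The argmax of each relaxed sample follows the original categorical distribution (the Gumbel-max trick), so the temperature only controls how close the soft vector is to one-hot, not which class tends to win.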

The Gumbel-softmax distribution

21.02.2025 01:11 - 👍 2    🔁 0    💬 1    📌 0

I've been developing a semantic search tool that covers not just bioRxiv and medRxiv, but the entire PubMed database. This means you can search across a massive collection of biomedical research using keywords, questions, hypotheses, or even full abstracts. Try it out: mssearch.xyz

16.02.2025 19:02 - 👍 946    🔁 220    💬 35    📌 17

Machine Learning 101? Imagining a new syllabus for a first course on machine learning.

What should we teach our undergrads about machine learning? I wrote up some ideas for restructuring Machine Learning 101.

13.02.2025 15:39 - 👍 37    🔁 5    💬 2    📌 1

Reading that answer, I’m realizing this may actually qualify as “a big jeopardy person”

12.02.2025 04:42 - 👍 2    🔁 0    💬 1    📌 0

Not so much recently. I watched it a lot as a kid, and then my wife and I were watching it regularly for almost a year right at the start of Covid -- was pretty fun, you can also turn up the difficulty by trying to like, guess what the clues and answers are going to be just from the categories

12.02.2025 04:36 - 👍 1    🔁 0    💬 1    📌 0

yes, but only because of a final jeopardy back in 2020 that was actually trying to clue a different poem

12.02.2025 03:51 - 👍 1    🔁 0    💬 1    📌 1

the different midjourney variations are so interesting. like, the whole row of guys w/ really crazy eyes, not sure where it got that from the prompt, but has a real Ilya Repin -- Ivan the Terrible / Goya -- Saturn vibe

12.02.2025 00:19 - 👍 0    🔁 0    💬 0    📌 0

Think how much performance we might be leaving on the table by not training classifiers on increasingly invasive biometrics. Pictured: medium-term radiologist-AI centaur-configuration possibility

11.02.2025 23:55 - 👍 2    🔁 0    💬 3    📌 0

If this was an AI paper, you’d brand it as an interpretability technique that discovers a latent “node detection” circuit in the neural network

pubs.rsna.org/doi/10.1148/...

11.02.2025 23:40 - 👍 2    🔁 0    💬 0    📌 1

There is a lot of buzz about our new paper in Nature Medicine on the effects of LLMs (GPT-4) on physician management reasoning! I had TONS of fun working on this -- but what it MEANS requires some unpacking.

A 🧵⬇️
bsky.app/profile/ucsf...

08.02.2025 13:37 - 👍 20    🔁 13    💬 3    📌 0

Brainwide silencing of prion protein by AAV-mediated delivery of an engineered compact epigenetic editor
Prion disease is caused by misfolding of the prion protein (PrP) into pathogenic self-propagating conformations, leading to rapid-onset dementia and death. However, elimination of endogenous PrP halts...

Insanely good news if this holds up.

Brainwide silencing of prion protein by AAV-mediated delivery of an engineered compact epigenetic editor

www.science.org/doi/10.1126/...

05.02.2025 04:11 - 👍 68    🔁 21    💬 1    📌 2

What's Happening Inside the NIH and NSF

A long post about what’s happening to the science funding agencies in the US and why. As mentioned, this one just kept getting longer even as I kept stripping curse words from it.

www.science.org/content/blog...

04.02.2025 16:40 - 👍 419    🔁 296    💬 20    📌 42

The in-context inductive biases of vision-language models differ across modalities
Inductive biases are what allow learners to make guesses in the absence of conclusive evidence. These biases have often been studied in cognitive science using concepts or categories -- e.g. by testin...

New (short) paper showing how the in-context inductive biases of vision-language models -- the way that they generalize concepts learned in context -- depend on the modality and phrasing! arxiv.org/abs/2502.01530 Quick summary: 1/5

04.02.2025 16:53 - 👍 29    🔁 5    💬 1    📌 0

i use claude as a rubber duck a lot, and i always make sure to thank it. not because i think that it can appreciate my thanks but i refuse to surround myself with objects which i experience as human but refuse to treat as human. we should not be learning to dehumanize the experience of intelligence.

04.02.2025 07:09 - 👍 352    🔁 31    💬 9    📌 2

📈 How far are leading models from mastering realistic medical tasks? MedXpertQA, our new text & multimodal medical benchmark, reveals gaps in model abilities

📌 Percentage scores on our Text subset:
o3-mini: 37.30
R1: 37.76 - frontrunner among open-source models
o1: 44.67 - still room for improvement!

04.02.2025 13:29 - 👍 4    🔁 1    💬 1    📌 1

Anyway, not exactly a guide on how to prompt, but I do think interacting with multiple chat models is a great way to get an understanding of “what’s common” to different models/LLMs more broadly

04.02.2025 15:02 - 👍 1    🔁 0    💬 0    📌 0

I could imagine that playing around querying OpenEvidence and then submitting the same queries to, say, ChatGPT with and without search enabled could be an interesting way to understand what sort of questions models tend to be reliable for, when hallucinations are more likely, etc

04.02.2025 15:01 - 👍 1    🔁 0    💬 1    📌 0

OpenEvidence
The leading medical information platform.

For more clinically-oriented things, I’ve really enjoyed the OpenEvidence platform. Currently free, and grounds all of its generations in real guidelines/trials/etc. www.openevidence.com

04.02.2025 14:59 - 👍 1    🔁 0    💬 1    📌 0

Business idea: Anki decks but for Involuntary Memory

04.02.2025 06:04 - 👍 0    🔁 0    💬 0    📌 0

better screenshots

03.02.2025 23:00 - 👍 6    🔁 1    💬 0    📌 0

“An AI escaped from the lab!”

“Which one?”

“Uh, something named helpful-only”

“Dear god…”

03.02.2025 18:00 - 👍 32    🔁 4    💬 4    📌 0

If you can use Gemini to spin up reasonable reasoning traces at a fraction of the cost, even if real expert traces are eventually better, this lets you generate proof of concept that this is worthwhile to justify the investment

03.02.2025 16:49 - 👍 0    🔁 0    💬 0    📌 0

59K questions * 30 min * $50/hour is ballpark $1.5 million -- you could obviously imagine generating a smaller expert set of reasoning traces, but even the 1K they eventually distilled to would still be ballpark $50K. That’s a huge investment either way

03.02.2025 16:48 - 👍 1    🔁 0    💬 1    📌 0

Concretely, you can do a back of the envelope calculation on how much it would have cost to generate expert reasoning traces for their dataset -- they initially did 59K questions, which for this difficulty level would take experts greater than 30 min per question (see GPQA paper)

03.02.2025 16:48 - 👍 1    🔁 1    💬 1    📌 0
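
The back-of-the-envelope estimate in the two posts above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic under the thread's stated assumptions (59K questions, at least 30 minutes each, $50 per hour); the function name is made up for illustration. At exactly 30 minutes per question the 1K subset comes out nearer $25K, while the $50K figure corresponds to roughly an hour per question, which fits the "greater than 30 min" estimate.

```python
def annotation_cost_usd(num_questions, minutes_per_question, hourly_rate_usd):
    """Estimated cost of expert-written reasoning traces (illustrative helper)."""
    expert_hours = num_questions * minutes_per_question / 60
    return expert_hours * hourly_rate_usd


# Figures taken from the thread: 59K initial questions, 1K distilled questions,
# 30 minutes per question as a lower bound, $50 per expert hour.
full_set = annotation_cost_usd(59_000, 30, 50)       # 1,475,000 -> "ballpark $1.5 million"
distilled = annotation_cost_usd(1_000, 30, 50)       # 25,000 at exactly 30 min each
distilled_1h = annotation_cost_usd(1_000, 60, 50)    # 50,000 if each takes an hour

print(f"Full set (30 min each):  ${full_set:,.0f}")
print(f"1K subset (30 min each): ${distilled:,.0f}")
print(f"1K subset (60 min each): ${distilled_1h:,.0f}")
```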

@joejanizek is following 20 prominent accounts