Danielle Bitterman MD daniellebitterman

Humanity’s Next Medical Exam: Preparing to Evaluate Superhuman Systems The rapid advances in health care AI necessitate a fundamental shift in how we evaluate these systems. Palepu et al. (2025) demonstrate that AI can outperform medical trainees in breast cancer mana...

2. A proposal for evaluation of "superhuman" systems in healthcare: ai.nejm.org/doi/full/10....

04.12.2025 14:10 — 👍 1 🔁 0 💬 0 📌 0

The effect of using a large language model to respond to patient messages The relentless increase in administrative responsibilities, amplified by electronic health record (EHR) systems, has diverted clinician attention from direct patient care, fuelling burnout.1 In respon...

Check out our related work:
1. Gaps in ability of models to adjust response and differential diagnosis for cancer patients: www.thelancet.com/journals/lan...

@shan23chen.bsky.social

04.12.2025 14:10 — 👍 3 🔁 0 💬 1 📌 0

Our research has found that even when chatbots are given specific patient context, they often drift back toward generic, "average patient" responses. They see the data, but they don't always weigh it like a physician would.

04.12.2025 14:10 — 👍 1 🔁 0 💬 1 📌 0

As I shared in the NYT, models often see the data but fail to weigh it like a physician, drifting toward generic "average patient" responses. Context window ≠ Clinical reasoning.

www.nytimes.com/2025/12/03/w...

04.12.2025 14:10 — 👍 11 🔁 2 💬 1 📌 2

“‘Just because you’re providing all of this information to language models,’ @daniellebitterman.bsky.social says, ‘doesn't mean they're effectively using that info in the same way that a physician would’.

And once people upload this kind of data, they have limited control over how it is used.”🧪🛟

03.12.2025 20:54 — 👍 15 🔁 8 💬 2 📌 1

Check out our editorial on Zazzetti et al (2025)'s paper on synthetic data generation for breast cancer, in JCO CCI! Synthetic data could help with many gaps in clinical AI research, but challenges remain, especially (IMO) issues with out-of-domain generalization. @shan23chen.bsky.social

30.11.2025 17:37 — 👍 3 🔁 1 💬 0 📌 0

Super proud of @shan23chen.bsky.social for his podium presentation on his research into LLM sycophancy in the face of illogical medical queries at #AMIA25!

Full paper: www.nature.com/articles/s41...

Also cited yesterday in the NYT! www.nytimes.com/2025/11/16/w...

17.11.2025 21:44 — 👍 6 🔁 2 💬 0 📌 1

When helpfulness backfires: LLMs and the risk of false medical information due to sycophantic behavior - npj Digital Medicine

LLMs tend to prioritize helpfulness > reason. We show that safety-aware, compute-efficient fine-tuning helps models reason more critically in healthcare domain, and generalizes to improved safety alignment across other domains.
www.nature.com/articles/s41... @shan23chen.bsky.social

18.10.2025 14:18 — 👍 8 🔁 5 💬 0 📌 0

When helpfulness backfires: LLMs and the risk of false medical information due to sycophantic behavior - npj Digital Medicine

An overemphasis on helpfulness makes LLMs vulnerable.
Research shows models will comply with illogical medical requests, generating false information. This sycophantic tendency can be corrected with specific prompting and fine-tuning. #MedSky #MedAI #MLSky

17.10.2025 15:53 — 👍 7 🔁 4 💬 0 📌 0

Clinical Reporting: Mass General — unNatural Selection Signals Over Noise: Cleaning Up Cancer Trial Data

Mass General physician-scientist @daniellebitterman.bsky.social discusses how AI assists the clinical data pipeline leading to better treatments for patients. Listen to unNatural Selection & register for #WMIF2025 at the link in bio to hear more : www.unnaturalselection.net/podcast/s1e19
#MedTech

21.08.2025 16:06 — 👍 2 🔁 1 💬 0 📌 0

Our paper on multilingual reasoning is accepted to Findings of #EMNLP2025! 🎉 (OA: 3/3/3.5/4)

We show SOTA LMs struggle with reasoning in non-English languages; prompt-hack & post-training improve alignment but trade off accuracy.

📄 arxiv.org/abs/2505.22888
See you in Suzhou! #EMNLP

20.08.2025 20:02 — 👍 7 🔁 3 💬 0 📌 0

🚀 Join Us at the Forefront of AI & Cancer Care | Danielle Bitterman Are you driven to use cutting-edge AI to transform patient outcomes in oncology? My lab within the AI in Medicine Program (Mass General Brigham, Har...

Are you driven to use AI to transform patient outcomes in oncology? My lab in the AI in Medicine Program (Mass General Brigham, Harvard Medical School) is seeking Postdoctoral Fellows to pioneer applications of AI—especially LLMs—in cancer care. More here: www.linkedin.com/posts/daniel...

07.07.2025 12:22 — 👍 7 🔁 3 💬 1 📌 0

Reliability of Large Language Model Knowledge Across Brand and Generic Cancer Drug Names | JCO Clinical Cancer Informatics PURPOSE: To evaluate the performance and consistency of large language models (LLMs) across brand and generic oncology drug names in various clinical tasks, addressing concerns about potential fluctuati...

Reliability of Large Language Model Knowledge Across Brand and Generic Cancer Drug Names | JCO Clinical Cancer Informatics ascopubs.org/doi/abs/10.1... #JCOCCI @daniellebitterman.bsky.social

18.06.2025 17:15 — 👍 1 🔁 1 💬 0 📌 0

Does your LRM reason in your language? Check out new preprint led by ✨ @jiruiqi.bsky.social & @shan23chen.bsky.social. Implications for safety/human oversight & accuracy!

30.05.2025 16:24 — 👍 2 🔁 0 💬 0 📌 0

Led by @shan23chen.bsky.social!

22.05.2025 16:27 — 👍 1 🔁 0 💬 0 📌 0

Agents are all the rage and we need to track their abilities in the medical domain. Enter MedBrowseComp, the 1st benchmark to assess agents' abilities to reason, navigate the web, and search for verifiable med info!

Preprint: arxiv.org/abs/2505.14963
Site: moreirap12.github.io/mbc-browse-a...

22.05.2025 16:27 — 👍 3 🔁 1 💬 1 📌 0

"I think we have massive opportunity in cancer care to get patients to the right care, the most advanced care earlier, by taking those workforce shortages and using AI to get to solutions."

#STATBreakthrough

14.05.2025 21:41 — 👍 6 🔁 2 💬 2 📌 0

"The other thing I'm scared of is that a patient's voice is going to become lost in the conversation on what type of AI is developed and how we implement it," Danielle Bitterman said.

#STATBreakthrough

14.05.2025 21:56 — 👍 6 🔁 3 💬 0 📌 0

I’m thrilled to be in San Francisco for @statnews.com's Breakthrough West Summit! I’ll be bringing my firsthand perspective as a physician-scientist to speak about how AI is transforming cancer care, alongside leaders in the field.

Let's connect if you're here!
#STATBreakthroughSummitWest

14.05.2025 13:20 — 👍 2 🔁 2 💬 0 📌 0

A social card that reads Featured Session: AI in Cancer Care. Then underneath are four headshots and titles. They read: Danielle Bitterman, M.D., Clifford A. Hudis, M.D., Karen Knudsen, Ph.D., and STAT's Angus Chen.

AI in Cancer Care

Artificial intelligence has the potential to upend oncology, changing everything from diagnosis to treatment options. Get a wide-ranging view of how the use of technology could play out over the next few years.
Moderated by @angusrohan.bsky.social
#STATBreakthrough

13.05.2025 22:59 — 👍 2 🔁 2 💬 1 📌 0

ChemoTimelines 2025 Treatment regimens are key details in understanding the effects of genetic, epigenetic, and other factors on tumor behavior and responsiveness. As precision oncology progresses, insights into the fine...

Exciting news: we are organizing a shared task – 2nd edition of the Chemotherapy Treatment Timelines Extraction from the Clinical Narrative (text mining task) -- collocated with the Clinical NLP Workshop. Do LLMs solve the task? Check out bit.ly/ChemoTimelin...

23.04.2025 22:59 — 👍 3 🔁 1 💬 0 📌 0

Graph of the NIH basis for new drugs

A pie graph worth keeping in mind as the NIH budget plummets jamanetwork.com/journals/jam... for 356 new FDA drugs approved

23.03.2025 16:17 — 👍 4029 🔁 1647 💬 60 📌 85

Conference and professional societies: PLEASE make hybrid options available for attendees and presenters at your conferences so that scientists from HHS-funded agencies can attend. These are unmissable opportunities to promote all the great intramural science and scientists from our government.

18.02.2025 16:34 — 👍 4 🔁 0 💬 0 📌 0

Unseen Commercial Forces Could Undermine Artificial Intelligence Decision Support Artificial intelligence (AI) is poised to transform health care, yet without robust safeguards, unseen commercial interests could distort care by prioritizing profit over patient well-being. The ph...

My Perspective in @NEJM_AI. AI could distort clinical decision-making in ways that prioritize profit over patient care. Oversight & regulation must go beyond performance metrics alone to address hidden commercial forces that could shape decision support. ai.nejm.org/doi/full/10....

06.02.2025 16:14 — 👍 13 🔁 4 💬 2 📌 0

My opinion as an actual NIH-funded researcher (unlike Vinay) at UCSF: his lies about how NIH dollars are used reflect a complete lack of understanding of how research is performed, a lack of respect for research, and are harmful to the entire biomedical research enterprise #grifter

12.02.2025 02:23 — 👍 57 🔁 10 💬 2 📌 0

Budgeting for the next year of my grants and they will all need to be rescoped, even before the 15% IDC rate. NCI funding at 83% for new awards and another 10% reduction for renewals (current state). Essentially, we are getting 50% of what we asked for...how is this sustainable? @carlbergstrom.com

09.02.2025 15:38 — 👍 14 🔁 5 💬 1 📌 2

As a cancer doctor I see every day how NIH-funded clinical trials save lives and have made the U.S. a leader in medical innovation. Here's one example: In the 1970s, childhood cancer survival was only 58%. Today it is 85%, largely thanks to NIH/NCI funding of Children's Oncology Group trials.

05.02.2025 14:48 — 👍 21 🔁 14 💬 0 📌 0

Congressional delegation outside USAID now: “We are here to shed a light on a crime unfolding before our eyes.”

03.02.2025 18:00 — 👍 34952 🔁 8878 💬 1040 📌 604

Senator Andy Kim just went to the USAID building, talked to the security guard there to confirm employees are being barred entry, and then did a press gaggle right there in front to call it out.

This is doing something. This is making an effort on messaging. Other Democratic lawmakers: take notes.

03.02.2025 17:42 — 👍 70827 🔁 15752 💬 1503 📌 1019

LGBTQ+ Health - NYC Health

Gay? Lesbian? Trans? Intersex?

NYC Health has health information for everybody. 🏳️‍🌈🏳️‍⚧️

02.02.2025 17:43 — 👍 714 🔁 164 💬 2 📌 6

Posts by Danielle Bitterman MD (@daniellebitterman.bsky.social)