Daphne Kontogiorgos-Heintz

@daphnekh.bsky.social

CS PhD student @UW working on ML for nanopore protein sequencing

635 Followers  |  1,151 Following  |  1 Posts  |  Joined: 12.11.2024

Posts by Daphne Kontogiorgos-Heintz (@daphnekh.bsky.social)

RawBench: A Comprehensive Benchmarking Framework for Raw Nanopore Signal Analysis Techniques https://www.biorxiv.org/content/10.1101/2025.10.04.680405v1

06.10.2025 18:33 — 👍 4    🔁 2    💬 0    📌 0

BREAKING: Our client Mario Guevara, an Emmy-winning journalist detained by ICE in retaliation for livestreaming law enforcement activity, will be deported tomorrow to El Salvador.

Mario and his family are being punished for his reporting. This cruelty is meant to stifle our free press.

02.10.2025 22:42 — 👍 32081    🔁 15415    💬 645    📌 724

Happy to share that ShapeEmbed has been accepted at @neuripsconf.bsky.social 🎉 SE is a self-supervised framework that encodes 2D contours from microscopy & natural images into a latent representation invariant to translation, scaling, rotation, reflection & point indexing
📄 arxiv.org/pdf/2507.01009 (1/N)

23.09.2025 08:31 — 👍 71    🔁 26    💬 3    📌 5
The Washington Post Fired Me — But My Voice Will Not Be Silenced. I spoke out against hatred and violence in America — and it cost me my job.

Some personal news:

I've been fired from the Washington Post in the aftermath of the Charlie Kirk shooting.

Thread incoming.

substack.com/@karenattiah...

15.09.2025 11:07 — 👍 45168    🔁 15731    💬 2494    📌 2155
We present our new preprint titled "Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation".
We quantify LLM hacking risk through systematic replication of 37 diverse computational social science annotation tasks.
For these tasks, we use a combined set of 2,361 realistic hypotheses that researchers might test using these annotations.
Then, we collect 13 million LLM annotations across plausible LLM configurations.
These annotations feed into 1.4 million regressions testing the hypotheses. 
For a hypothesis with no true effect (ground truth $p > 0.05$), different LLM configurations yield conflicting conclusions.
Checkmarks indicate correct statistical conclusions matching ground truth; crosses indicate LLM hacking -- incorrect conclusions due to annotation errors.
Across all experiments, LLM hacking occurs in 31-50% of cases even with highly capable models.
Since minor configuration changes can flip scientific conclusions, from correct to incorrect, LLM hacking can be exploited to present anything as statistically significant.

🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**.

Paper: arxiv.org/pdf/2509.08825

12.09.2025 10:33 — 👍 303    🔁 106    💬 6    📌 23
Post image 03.08.2025 00:48 — 👍 18408    🔁 5557    💬 296    📌 157

For the many who asked about BLOW5 vs POD5 for nanopore signal data, here, finally, is a detailed benchmark we did:
biorxiv.org/content/10.1...
Summary: BLOW5 performance is >= POD5 (from roughly equal up to 100X, see below), with the benefit of having ~3 dependencies instead of >50.

05.07.2025 04:26 — 👍 14    🔁 9    💬 1    📌 0
Lossless data compression by large models - Nature Machine Intelligence Effective lossless compression requires that frequent patterns in the data can be identified. Li et al. explore using deep learning models to more effectively compress text, audio and video data.

"LMCompress shatters all previous lossless compression records on four media types: text, images, video and audio."

www.nature.com/articles/s42...

03.05.2025 14:19 — 👍 30    🔁 6    💬 0    📌 1

Analysis by Altmetric shows that research content is increasingly posted on Bluesky but is still shared (reposted) more on X.

We need to increase Bluesky connectivity and share more.

28.03.2025 14:27 — 👍 16    🔁 8    💬 0    📌 0
Toward single-molecule protein sequencing using nanopores - Nature Biotechnology Maglia and colleagues discuss advances in nanopore technology en route to single-molecule protein sequencing

Toward single-molecule protein sequencing using nanopores
www.nature.com/articles/s41...

18.03.2025 06:44 — 👍 16    🔁 5    💬 0    📌 0
Sequencing by Expansion (SBX) — a novel, high-throughput single-molecule sequencing technology Remarkable advances in high-throughput sequencing have enabled major biological discoveries and clinical applications, but achieving wider distribution and use depends critically on further improvemen...

Roche SBX preprint out

www.biorxiv.org/content/10.1...

24.02.2025 21:54 — 👍 31    🔁 15    💬 0    📌 1
Roche Xpounds on New Sequencing Technology Bar bets can be a powerful force in human society.  One of the best known books on the planet, The Guinness Book of World Records, originate...

Roche Xpounds on New Sequencing Technology

My deep dive on this exciting new entrant

omicsomics.blogspot.com/2025/02/roch...

🧬🖥️

20.02.2025 18:41 — 👍 19    🔁 11    💬 1    📌 0

Verena Rukes telling us about: Charge-based fingerprinting of unlabeled full-length proteins using an aerolysin nanopore

20.01.2025 11:44 — 👍 2    🔁 2    💬 0    📌 0

Timescales in Cell Biology

01.12.2024 13:35 — 👍 367    🔁 114    💬 8    📌 17
Dr. Margaret Oakley Dayhoff

I took biochem in 2001, and for nearly 20 years read amino acid sequences daily… and I never knew Dayhoff named them or even the logic behind things like Q until last Friday (h/t Mike Janech). Also, this is another big Dayhoff moment for me. She was incredible!

#proteomics #bioinformatics

24.11.2024 12:39 — 👍 198    🔁 79    💬 14    📌 7
Sky Follower Bridge - Chrome Web Store Instantly find and follow the same users from your Twitter follows on Bluesky.

This worked like a charm for importing the accounts I follow on Twitter: chromewebstore.google.com/detail/sky-f...

21.11.2024 05:37 — 👍 3    🔁 0    💬 1    📌 0