
Chris Miller

@chrismiller.science.bsky.social

I study cancer at Washington University in St Louis. Cancer Genomics, Bioinformatics, Data Viz, Tumor Evolution, AML, Immunotherapy, Irreverent humor 🧬 🖥️ mostly @chrisamiller on other platforms

2,080 Followers  |  208 Following  |  297 Posts  |  Joined: 18.08.2023

Latest posts by chrismiller.science on Bluesky

ALT: a man in a suit and tie holds his head in his hands

PSA: dbGaP authorized access/run selectors seem to be not working. I assume it's the shutdown - either they turned the lights off, or something's busted and no one is there to fix it

07.10.2025 20:13 - 👍 0    🔁 0    💬 0    📌 0

These 3-L bottles contain one million tiny colored spheres each.

One sphere is black (1 ppm).

Finding the black sphere is comparable to detecting a protein present at ~ 6,000 copies in the proteome of a human cell.

Quantifying the protein requires analyzing multiple jars.

27.09.2025 11:52 - 👍 19    🔁 6    💬 2    📌 0

I can't ever see the MTHFR gene symbol without doing a double take

07.10.2025 19:10 - 👍 3    🔁 0    💬 0    📌 0

I have been trying to get this published as an op-ed, but I am going to post it here since I think it is timely in light of the "consent" extortion events.

Deafening Quiet from the Scientific Establishment

jeremymberg.github.io/jeremyberg.g...

1/14

06.10.2025 23:27 - 👍 297    🔁 160    💬 13    📌 23

The lessons here:
1) Many gene names are stupid.
2) Edge cases may be rare, but they often matter. (TP53 is a key cancer gene that wouldn't be accessible without some special accommodations here).
3) As always, check your assumptions!
(fin)

06.10.2025 18:54 - 👍 0    🔁 0    💬 0    📌 0

For our little internal app, this probably won't matter much, and I will either set the number of records to 200 (because we generate almost no traffic) or might code up something that dynamically decides how many queries to return, based on which genes are in the input data. (8/n)

06.10.2025 18:54 - 👍 0    🔁 0    💬 1    📌 0
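
The "dynamically decide how many records to request" idea from (8/n) could look something like the sketch below. It is only an illustration: `choose_query_size`, the `needed_by_symbol` lookup, and the `floor` and `cap` defaults are hypothetical names and values, not anything taken from the actual app.

```python
def choose_query_size(input_symbols, needed_by_symbol, floor=10, cap=200):
    """Pick a result size large enough for the worst gene in the input data.

    needed_by_symbol maps a gene symbol to the number of records that must be
    returned before that exact symbol is guaranteed to appear (hypothetical
    lookup table, e.g. {"AR": 199, "MET": 31}).
    """
    # Symbols missing from the table fall back to the floor value.
    worst = max((needed_by_symbol.get(s, floor) for s in input_symbols), default=floor)
    # Never go below the floor or above the cap.
    return min(max(worst, floor), cap)
```

With a table like `{"AR": 199, "MET": 31}`, input data containing only MET would request 31 records, while input data containing AR would request 199.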
Plot showing how many records need to be returned to ensure that each completely typed gene will be in the list.

For those who are interested, the plot showing cumulative percentage of human HUGO gene names (from ensembl protein-coding genes) covered by a set number of records looks like this. So 8 results covers 99% of genes, 34 results covers 99.9% of genes, and it takes 199 to cover everything. (7/n)

06.10.2025 18:54 - 👍 0    🔁 0    💬 1    📌 0
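
The coverage numbers in (7/n) are percentiles of a "records needed per gene" distribution. A minimal sketch of that calculation, assuming a plain list of per-gene counts is already available (the function name and the toy data are made up for illustration):

```python
import math

def smallest_size_for_coverage(needed_counts, fraction):
    """Smallest result size such that at least `fraction` of genes are covered.

    needed_counts: one integer per gene, the number of records that must be
    returned for that gene to show up when fully typed.
    """
    ordered = sorted(needed_counts)
    # Number of genes that must be covered; the small epsilon guards against
    # floating-point noise such as 0.999 * 1000 landing just above 999.
    k = max(1, math.ceil(fraction * len(ordered) - 1e-9))
    return ordered[k - 1]

# Toy data: most genes need one record, a few need many.
toy_counts = [1, 1, 1, 1, 1, 1, 1, 2, 3, 199]
```

On the toy data, `smallest_size_for_coverage(toy_counts, 0.9)` returns 3 and `smallest_size_for_coverage(toy_counts, 1.0)` returns 199, which mirrors the shape of the curve described in the post: a small size covers almost everything, and the last few genes are expensive.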
ALT: a group of pokemon standing next to each other with gotta catch 'em all written on the bottom

So in order to guarantee that we'll get "AR" in the list, the value should be 200 records, which seems excessive. My instinctual guess of 30 wasn't bad, and covers 99.89% of gene names, but that's not all of them! (6/n)

06.10.2025 18:54 - 👍 0    🔁 0    💬 1    📌 0
199  AR
120  PC
100  KL
78   ZNF7
67   ZNF2
67   CS
58   CP
58   ADA
57   SI
55   ZNF3
52   TH
51   C2
43   MAG
42   ZNF8
42   TNF
41   GPR1
37   DEFB1
36   USP1
36   GAL
34   PLEK
31   MET


It introduces a new question, though - this failed on TP53 with 10 results, so how many results need to be returned to handle all genes correctly? A few seconds of bash/grep later, I get the following list of 21 genes that will still fail. (5/n)

06.10.2025 18:54 - 👍 0    🔁 0    💬 1    📌 0
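
The thread in (5/n) used bash/grep and does not show the commands. One way to get the same kind of list in Python is sketched below, under the assumption that the worst case for a fully typed symbol is that every other symbol sharing that prefix is ranked ahead of it. The function name and the toy symbols are illustrative only.

```python
from bisect import bisect_left

def records_needed(symbols):
    """Map each symbol to the number of symbols that start with it (itself included).

    Assumption: an exact match could be ranked last among all prefix matches,
    so this count is the result size needed to guarantee it is returned.
    """
    ordered = sorted({s.upper() for s in symbols})
    counts = {}
    for s in ordered:
        # In sorted order, all symbols beginning with s form one contiguous run.
        start = bisect_left(ordered, s)
        end = bisect_left(ordered, s + chr(0x10FFFF))
        counts[s] = end - start
    return counts

# Toy example, not the real Ensembl gene list.
toy_symbols = ["AR", "ARF1", "ARG1", "TP53", "TP53BP1", "MET", "METTL3"]
```

Here `records_needed(toy_symbols)["AR"]` is 3, because AR, ARF1, and ARG1 all begin with "AR". Filtering the result for counts above the chosen size gives the list of symbols that would still fail.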

After some digging, it turns out that mygene.info has a default max of 10 records returned for each query, and the first 10 hits include genes like "TP53TGS", "TP53TG3F", "TP53RK-DT", but not "TP53" itself. Adding "&size=30" to the query allows it to return 30 hits, which solved this problem (4/n)

06.10.2025 18:54 - 👍 0    🔁 0    💬 1    📌 0
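
For reference, a query with an explicit `size` might be built as in the sketch below. The exact query string used in the thread is truncated, so the `q`, `species`, and `fields` values here are assumptions for illustration; only the `size` parameter comes from the post. The helper names are hypothetical.

```python
from urllib.parse import urlencode

MYGENE_QUERY_ENDPOINT = "https://mygene.info/v3/query"

def build_symbol_query_url(prefix, size=30):
    """Build a query URL asking for up to `size` hits for a typed prefix."""
    params = {
        "q": f"symbol:{prefix}*",        # assumed wildcard query, not the thread's actual one
        "species": "human",              # assumed
        "fields": "symbol,ensembl.gene", # assumed
        "size": size,                    # overrides the default of 10 hits
    }
    return f"{MYGENE_QUERY_ENDPOINT}?{urlencode(params)}"

def exact_symbol_returned(hits, symbol):
    """Check whether the exact symbol is among the returned hits."""
    return any(hit.get("symbol") == symbol for hit in hits)
```

A check like `exact_symbol_returned(hits, "TP53")` on the parsed response is a cheap guard: if it comes back false for a gene known to be in the input data, the result size is probably too small.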

But when I manually tried the query string - something like mygene.info/v3/query?spe... - TP53 didn't appear in the returned json - I know that's not right! (3/n)

06.10.2025 18:54 - 👍 0    🔁 0    💬 1    📌 0

Digging through the backend code, I found that the tool was storing the ENSG id as the key (sensibly!), and then using an API call to mygene.info to match them up to gene names as they are typed. Seems fine... (2/n)

06.10.2025 18:54 - 👍 0    🔁 0    💬 1    📌 0
ALT: a close up of a man wearing glasses and a wig .

Today's little mystery started simply enough: I was hacking on a web tool that autocompletes gene names, and was surprised when searching for "TP53" didn't return that gene. I checked, and it was definitely in the input data, so I was left scratching my head (1/n)

06.10.2025 18:54 - 👍 2    🔁 0    💬 1    📌 0

How do we get @ensembl.org the infrastructure they need to not be unresponsive like three times a week? Like can we pass the hat? I'm in for twenty bucks.

06.10.2025 16:23 - 👍 0    🔁 0    💬 0    📌 0

Immigrants, particularly on H1Bs, are the lifeblood of American innovation. If you wanted to hurt US competitiveness in the next century, I can think of few more effective ways than a move like this

Even when found illegal, the mere intent will have irreparably harmed our future

20.09.2025 16:24 - 👍 129    🔁 37    💬 5    📌 1

Critical part of the President's new $100,000 charge for H1-B visas: The Administration can also offer a $100,000 discount to any person, company, or industry that it wants. Replacing rules with arbitrary discretion.

Want visas? You know who to call and who to flatter.

20.09.2025 13:40 - 👍 12620    🔁 4782    💬 742    📌 662

When you want to do reproducible analysis in R, some packages require you to set a RNG seed. I'm not sure I trust anyone who doesn't immediately run `set.seed(42)`

18.09.2025 19:47 - 👍 2    🔁 2    💬 1    📌 0

Zstandard's --long range mode works wonders for assemblies, but needs uninterrupted single line sequences.

*AllTheBacteria 661k, multiline fasta*
gzip (pigz): 751GB
zstandard --long: 641GB (30% original size)

*Single line fasta*
gzip (pigz): 700GB
zstandard --long: 232GB (10% original size)

09.09.2025 10:27 - 👍 36    🔁 12    💬 2    📌 3
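
Since the compression gain above depends on single-line sequences, here is a minimal sketch of unwrapping a multiline FASTA before compressing it. The function name is made up, and this is not necessarily how the original benchmark prepared its input.

```python
def unwrap_fasta(lines):
    """Yield FASTA lines with each record's sequence joined onto one line."""
    seq_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record's sequence, if any.
            if seq_parts:
                yield "".join(seq_parts)
                seq_parts = []
            yield line
        elif line:
            # Collect wrapped sequence lines; blank lines are skipped.
            seq_parts.append(line)
    if seq_parts:
        yield "".join(seq_parts)
```

Writing the yielded lines to a file (or to stdout) gives a single-line FASTA that can then be handed to zstandard with the `--long` mode mentioned in the post.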

For sure. Hybrid meetings are the worst of both worlds

03.09.2025 00:51 - 👍 2    🔁 0    💬 0    📌 0

Everyone has a second full time job being mad at the government now

03.09.2025 00:45 - 👍 6582    🔁 957    💬 59    📌 63

Good news. The House of Representatives stands behind the NIH budget with no cuts.

02.09.2025 00:53 - 👍 1067    🔁 145    💬 22    📌 6

These are the words of a lunatic who does not belong in government, much less as our nation's top health official.

It is dangerous to allow him to oversee ALL federal health research and public health infrastructure. It is never too late to do the right thing. Fire RFK Jr.

28.08.2025 19:35 - 👍 1618    🔁 382    💬 90    📌 24
Genome Informatics Cold Spring Harbor Laboratory Meetings & Courses -- a private, non-profit institution with research programs in cancer, neuroscience, plant biology, genomics, bioinformatics.

📢🚨📢🚨 Genome Informatics deadline extended to September 8! meetings.cshl.edu/meetings.asp....
Please spread the word. If you are like me and had at least one abstract that wasn't quite ready by last week's deadline, you get another swing. See you there!

26.08.2025 20:38 - 👍 29    🔁 19    💬 1    📌 2

The Genome Informatics conference (@ Cold Spring Harbor Lab, Nov 5 - 8) abstract deadline is **today**. We welcome your submissions! Topics include:
- PanGenomes
- Genome Assembly & Seq. Algos.
- Algorithmic Evo. Bio
- Single Cell & Spatial Omics
- Microbial Genomics
- AI/ML & Integrative Omics
πŸ™πŸ™πŸ™

22.08.2025 15:00 - 👍 11    🔁 6    💬 0    📌 1
Call for editors | Journal of Open Source Software Blog: Blog for the Journal of Open Source Software • https://joss.theoj.org

The Journal of Open Source Software (@joss-openjournals.bsky.social) is looking for editors. Come join our team!

I've only been an editor for a few months, but I love working with JOSS. Our peer review process is actually collaborative, and we're Diamond Open Access.

#AcademicSky 🧪 ⚗️ #CompChem

08.08.2025 13:02 - 👍 33    🔁 18    💬 0    📌 0

In my small corner of biology, I'm constantly reassured by how frequently we confirm main findings. And then when differences do show up, they very rarely end up being fraud, but instead tell us something important about differences in experimental design.

08.08.2025 03:37 - 👍 160    🔁 45    💬 2    📌 3
The Bluesky Dictionary Can Bluesky say every word in the English language? Well this is your chance to find out.

Only 35% of the dictionary? Yes, it might require some tortile posts from us nonathletes to get it done, but we're no tambourinists, we can do this!

www.avibagla.com/blueskydicti...

07.08.2025 01:46 - 👍 0    🔁 0    💬 0    📌 0

"It gives one that too-familiar mixture of rage, despair, and embarrassment that is peculiar to the Trump era, and for which the English language does not have an appropriate word."

06.08.2025 18:21 - 👍 2    🔁 0    💬 0    📌 0

As stressful as it is to run a laboratory in the US right now, my thoughts are often with the NIH staff living in the awful chaos trying to preserve the whole enterprise. I'm grateful for every moment they choose not to give up. These people are the only thing keeping it from entirely collapsing.

31.07.2025 00:15 - 👍 164    🔁 44    💬 3    📌 3
Photo of president Harry S. Truman laying the cornerstone of the NIH clinical center.

President Harry S. Truman laid the cornerstone in 1951, saying of the Clinical Center's future work: "Medical care is for the people and not just for the doctors and the rich." He mentioned that 75 million Americans then without health insurance would soon become a "medically indigent class" and he challenged the scientific community to "translate the new knowledge gained by research into better care for more people." "Research to prevent disease" was a better investment for federal dollars than "providing unlimited hospitalization to treat it."

In 1951, Harry S. Truman laid the cornerstone of the NIH clinical center, stating, "Research to prevent disease" was a better investment for federal dollars than "providing unlimited hospitalization to treat it."

30.07.2025 01:33 - 👍 28    🔁 11    💬 1    📌 0
