Sasha Gusev

@sashagusevposts.bsky.social

Statistical geneticist. Associate Prof at Dana-Farber / Harvard Medical School. www.gusevlab.org

6,766 Followers  |  700 Following  |  889 Posts  |  Joined: 27.09.2023

Posts by Sasha Gusev (@sashagusevposts.bsky.social)

I used Claude Opus 4.5/4.6 (and a bit of Codex GPT-5.3) to port edgeR to Python. See edgePython github.com/pachterlab/e...
This allowed me to develop a single-cell DE method that extends NEBULA with edgeR Empirical Bayes. All in one week. Details in doi.org/10.64898/202...

19.02.2026 16:46 - 👍 67    🔁 25    💬 3    📌 3

I was using the free version and indeed hit the limits very quickly. But it was enough to get a feel for how well it works and generate a few simple visualizers.

14.02.2026 14:11 - 👍 1    🔁 0    💬 0    📌 0

The app

14.02.2026 01:53 - 👍 1    🔁 0    💬 0    📌 0

As usual, several components didn't work on the first try, but simply telling Codex which parts weren't rendering or looked off was enough for it to identify issues.

13.02.2026 14:58 - 👍 3    🔁 0    💬 0    📌 0
Stabilizing Selection Dashboard

Gave Codex an old R simulation I had lying around and it very adeptly turned it into an interactive web app with tunable parameters and visualizations.

sashagusev.github.io/Stabilizing-...

13.02.2026 14:58 - 👍 18    🔁 2    💬 3    📌 0

This is Claude Code standalone through the terminal, but I'm playing around with Cursor next.

10.02.2026 15:47 - 👍 1    🔁 0    💬 1    📌 0

Yeah, it has been amazing at harmonizing across lots of different data sources and generating interactive data visualizations -- basically all the menial stuff.

10.02.2026 00:07 - 👍 1    🔁 0    💬 0    📌 0

Good idea, any database out there you currently use?

09.02.2026 18:56 - 👍 0    🔁 0    💬 1    📌 0

Ironically the hardest part was dealing with inconsistency across genome builds (I still think it is suboptimal in the implementation), an issue that has never really been resolved in our field.

09.02.2026 18:50 - 👍 8    🔁 0    💬 1    📌 0

But an hour of simply prompting Claude on which parts of the page weren't loading and pointing it to working examples was sufficient for it to identify all issues and resolve them -- zero coding. I cringed on several occasions at the layers of conditional logic it was implementing, but it works!

09.02.2026 18:50 - 👍 7    🔁 0    💬 1    📌 0

Claude took a one paragraph prompt listing sites I wanted integrated, spent ~30 minutes reading through the documentation, figured out which APIs would work for a static site call, and implemented the whole thing. Initially none of it worked.

09.02.2026 18:50 - 👍 5    🔁 0    💬 2    📌 0

Another Claude project: a static site that pulls in GWAS SNP data from Ensembl, multiple public biobanks, Open Targets, GTEx, the eQTL Catalogue, and OMIM.

sashagusev.github.io/gwas_lookup/

09.02.2026 18:50 - 👍 46    🔁 13    💬 2    📌 1

And then when it works it just ... works. /x

03.02.2026 03:10 - 👍 1    🔁 0    💬 0    📌 1

Lessons: it's not very creative and can probably make it seem like an exploratory project is dead on arrival. It can get stuck in a confusion cycle where improvements keep introducing new issues. But it's easy enough to just start from scratch with a more detailed prompt.

03.02.2026 03:10 - 👍 3    🔁 0    💬 1    📌 0
HopScreen - Movie Marathon Planner

This time Claude totally got it and generated a working site on the first shot. I asked it to improve the UX a bit in small ways and we were done. You can try it out here:
sashagusev.github.io/hopscreen/

03.02.2026 03:10 - 👍 2    🔁 0    💬 1    📌 0

Each time I was able to nudge it in the right direction but it felt like it was stuck in some fundamental misunderstanding.

Third attempt: I booted up a new project and wrote a detailed prompt from scratch, trying to incorporate challenges that had come up the previous round.

03.02.2026 03:10 - 👍 0    🔁 0    💬 1    📌 0

But it was clear it hadn't quite understood what I was going for. Initially the users had to select the individual show times, which didn't make sense. Then it showed other movies beyond what the user had selected. Then just pairs of movies.

03.02.2026 03:10 - 👍 0    🔁 0    💬 1    📌 0

Second attempt: I proposed to just ask the user to find the listing they are interested in and download it. I gave Claude a sample file, which it parsed and understood, and it quickly implemented a working prototype. Pretty impressive.

03.02.2026 03:10 - 👍 1    🔁 0    💬 1    📌 0

But ultimately it gave up and proposed a long plan to use fake data and implement a mock-up instead. The choice to use fake data was actually quite buried in the work plan and could easily have been missed. Creativity in problem solving still seems a bit limited.

03.02.2026 03:10 - 👍 1    🔁 0    💬 1    📌 0

First attempt: the original idea was to do web scraping directly. Claude explored ways to do this from various public movie listings but couldn't get the rendering. It tried searching for online guides and APIs on GitHub. Overall a very comprehensive attempt to solve the problem.

03.02.2026 03:10 - 👍 0    🔁 0    💬 1    📌 0

Quick Claude Code project: download movie showtimes from Fandango, upload to this site, select some movies, and it will generate a marathon schedule. For example, you could go see F1, MELANIA, Iron Lung, and Mercy at the AMC Boston Commons -- what a treat! Took <1hr to prompt ...

03.02.2026 03:10 - 👍 4    🔁 1    💬 1    📌 0
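
For readers curious what "generate a marathon schedule" might involve, here is a minimal Python sketch. It is not HopScreen's actual code, which these posts do not show. The function names, the input shape (a dict of titles to "HH:MM" start times plus a dict of runtimes), and the 10-minute gap between films are all assumptions made for illustration. It tries every viewing order, takes the earliest feasible showing of each film in that order, and keeps the order that finishes first.

```python
from itertools import permutations


def to_minutes(hhmm: str) -> int:
    """Convert an 'HH:MM' string to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def plan_marathon(showings, runtimes, gap=10):
    """Return the earliest-finishing schedule, or None if no order fits.

    showings: {title: ["HH:MM", ...]}   (hypothetical input format)
    runtimes: {title: runtime in minutes}
    gap:      assumed buffer in minutes between one film ending and the next starting
    Result:   [(title, start_minute), ...] in viewing order.
    """
    best = None  # (finish_time, schedule)
    for order in permutations(showings):
        schedule = []
        free_at = 0  # earliest minute at which the viewer can start a film
        for title in order:
            starts = sorted(to_minutes(s) for s in showings[title])
            # For a fixed order, taking the earliest feasible showing is optimal.
            start = next((s for s in starts if s >= free_at), None)
            if start is None:
                schedule = None
                break
            schedule.append((title, start))
            free_at = start + runtimes[title] + gap
        if schedule is not None and (best is None or free_at < best[0]):
            best = (free_at, schedule)
    return best[1] if best is not None else None


if __name__ == "__main__":
    demo_showings = {"Film A": ["12:00", "15:00"], "Film B": ["13:00", "14:30"]}
    demo_runtimes = {"Film A": 120, "Film B": 90}
    print(plan_marathon(demo_showings, demo_runtimes))
```

Trying every order is factorial in the number of films, which is fine for the handful of movies in a single-day marathon but would need a smarter search for long lists.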

Thanks, I remember this one! I feel like the next step is asking whether un-/semi- supervised LLM models can do a good job predicting important clinical end-points on top of the established diagnostic factors.

31.01.2026 15:29 - 👍 1    🔁 0    💬 0    📌 0

Yeah absolutely. I think this could be interesting as a thought experiment to distinguish "biological" aging variance from random causes of death but that would require quantifying how much variance is left after removing the latter (and good cause-of-death data).

31.01.2026 15:17 - 👍 3    🔁 0    💬 1    📌 0

Is there any health system that currently allows data analysis of unstructured psychiatric notes? Anyone doing that kind of work? We are analyzing oncologist notes and disagreement with structured data (e.g. ICD codes) is severe.

30.01.2026 17:25 - 👍 0    🔁 1    💬 1    📌 0
Genetic Data From Over 20,000 U.S. Children Misused for ‘Race Science’

Great piece in the NYtimes with quotes from @stairwaytokevin.bsky.social and @sashagusevposts.bsky.social. The misuse of NIH datasets with sensitive personal information for racist aims should be concerning for anybody interested in scientific integrity. www.nytimes.com/2026/01/24/u...

24.01.2026 14:50 - 👍 21    🔁 13    💬 0    📌 0

Yeah, this is very elegant

10.01.2026 14:02 - 👍 0    🔁 0    💬 0    📌 0

Interesting new multi-population aDNA selection scan from Colbran, Terhorst, Mathieson. Loci that are estimated to be under selection in one population also show enrichment in other populations, consistent with parallel or pre-split selection.

www.biorxiv.org/content/10.6...

09.01.2026 14:37 - 👍 18    🔁 4    💬 1    📌 0

We used a similar design in this paper:
www.sciencedirect.com/science/arti...

Found that in cis-MR with multiple colocalising instruments (i.e. multiple independent signals colocalise between eQTL and pQTL traits) sign concordance was around 95%:

07.01.2026 09:20 - 👍 10    🔁 3    💬 1    📌 0
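
The post above does not spell out how sign concordance was computed, so the following Python sketch is only one plausible reading and not the linked paper's procedure. It assumes each gene has several independent instruments, each with an eQTL effect and a pQTL effect, forms a per-instrument Wald ratio (pQTL effect divided by eQTL effect), and calls a gene concordant when all of its ratios share a sign. The function names and input layout are hypothetical.

```python
def wald_ratio(beta_exposure: float, beta_outcome: float) -> float:
    """Per-instrument ratio estimate: outcome effect divided by exposure effect."""
    return beta_outcome / beta_exposure


def sign_concordance(instruments_by_gene):
    """Fraction of genes whose per-instrument Wald ratios all share one sign.

    instruments_by_gene: {gene: [(beta_eqtl, beta_pqtl), ...]}  (assumed layout)
    Genes with fewer than two usable (non-zero) ratios are skipped.
    """
    concordant = 0
    eligible = 0
    for instruments in instruments_by_gene.values():
        ratios = [wald_ratio(bx, by) for bx, by in instruments if bx != 0]
        ratios = [r for r in ratios if r != 0]
        if len(ratios) < 2:
            continue
        eligible += 1
        if all(r > 0 for r in ratios) or all(r < 0 for r in ratios):
            concordant += 1
    if eligible == 0:
        raise ValueError("no gene has two or more usable instruments")
    return concordant / eligible


if __name__ == "__main__":
    # Made-up effect sizes, for illustration only.
    toy = {
        "GENE1": [(0.5, 0.2), (-0.3, -0.1)],   # both ratios positive
        "GENE2": [(0.4, 0.1), (0.2, -0.05)],   # ratios disagree in sign
        "GENE3": [(0.1, 0.3)],                 # single instrument, skipped
    }
    print(sign_concordance(toy))
```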
Quantifying genetic effects on disease mediated by assayed gene expression levels - Nature Genetics
Mediated expression score regression (MESC) is a new method that estimates disease heritability mediated by the cis genetic component of gene expression levels by using summary statistics from GWAS an...

When we tried to quantify this for steady-state expression in prior work, we saw a significant (but small) fraction of disease heritability captured by measurable steady-state QTL models, which suggests they are a useful starting point (and what else do we have anyway?)

www.nature.com/articles/s41...

07.01.2026 18:42 - 👍 2    🔁 0    💬 0    📌 0

Yeah this is an important point. Some (still unknown) fraction of context specific QTLs will still show up at steady state (and even in the "wrong" tissue). The context specificity hypothesis is plausible but has so far mostly been driven by the absence of evidence without good quantification.

07.01.2026 18:42 - 👍 1    🔁 0    💬 1    📌 0