Haky Im

@hakyim.bsky.social

Statistician doing genomic data science, faculty at the University of Chicago, Korean, Argentinean, American. Love kimchi, math, science, books with beautiful prose.

1,389 Followers  |  2,172 Following  |  28 Posts  |  Joined: 25.09.2023  |  1.7147

Latest posts by hakyim.bsky.social on Bluesky

Post image

I'm hiring a computational biologist interested in complex trait genetics using deep learning approaches. Reach out to me if interested.

12.09.2025 19:00 - 👍 37  🔁 46  💬 1  📌 0
Preview
scPrediXcan integrates deep learning methods and single-cell data into a cell-type-specific transcriptome-wide association study framework Zhou et al. introduce scPrediXcan, a novel transcriptome-wide association study framework that integrates the deep learning-based model ctPred for cell-type-specific expression prediction. Applied to ...

Check out our scPrediXcan paper
www.cell.com/cell-genomic...
Led by the talented @Charles_Zhou12 and supervised by @MengjieChen6
and me, with thanks to many contributors.

scPrediXcan integrates deep learning and single cell expression data into a powerful cell type specific TWAS framework.

14.05.2025 22:41 - 👍 14  🔁 6  💬 0  📌 1
Post image

Need a break from doomscrolling?

Do you study the genetics of disease subtypes? Have you ever wondered how to efficiently run gene-based associations with multicategorical outcomes? Join us for the IGES Journal Club on Wednesday, February 26, 2025! 11AM ET.

14.02.2025 20:24 - 👍 6  🔁 3  💬 0  📌 0

Ughhh, suffering through the PLOS system now. My collaborators won’t let me submit future papers to PLOS

11.02.2025 21:06 - 👍 1  🔁 0  💬 1  📌 0
Post image

Modern GWAS can identify 1000s of significant hits but it can be hard to turn this into biological insight. What key cellular functions link genetic variation to disease?

I'm very excited to present our new work combining associations and Perturb-seq to build interpretable causal graphs! A 🧵

26.01.2025 00:13 - 👍 320  🔁 118  💬 6  📌 10

1/At the #HoffmanLab lab meeting, we often have tech talks describing useful tools for other lab members. Since they might also prove useful for others, we've been posting almost every #HoffmanLabTechTalk for years. 🧪🧬💻

11.06.2024 21:24 - 👍 44  🔁 9  💬 2  📌 2

with invaluable contributions from @temicrates Lisha Zhu @ssalazar_02 Sarah Sumner Hyunki Kim Saideep Gona @Festus_nyasimi Rohit Kulkarni @drjosephpowell @madduri
@boxiangliu

15.11.2024 04:34 - 👍 1  🔁 0  💬 0  📌 0
Preview
scPrediXcan integrates advances in deep learning and single-cell data into a powerful cell-type–specific transcriptome-wide association study framework Transcriptome-wide association studies (TWAS) help identify disease causing genes, but often fail to pinpoint disease mechanisms at the cellular level because of the limited sample sizes and sparsity ...

Check out our new preprint, led by Charles Zhou and supervised by Mengjie Chen and me: doi.org/10.1101/2024... We present scPrediXcan, which integrates deep learning and single cell expression data into a powerful cell type specific TWAS framework.

15.11.2024 02:28 - 👍 29  🔁 13  💬 1  📌 0

Consistency is key

13.11.2024 03:26 - 👍 1  🔁 0  💬 0  📌 0

I’m at UChicago, develop methods to understand the biology of complex traits and diseases, aspire to help real people with my research. Currently very optimistic about predicting molecular traits from DNA sequences using millions/billions of parameters

13.11.2024 03:25 - 👍 16  🔁 4  💬 0  📌 0

🆕 Blog: Genomics scientists from and in Mexico community building lcolladotor.github.io/2024/11/11/g...

I want our voices🇲🇽 to be heard from the scientific community. But to be heard, we have to be a part of it

#LCG-UNAM #LCG-EJ-UNAM @cdsbmexico.bsky.social #NNB-UNAM #RMB

youtu.be/OVMw0k6AydA?...

11.11.2024 19:55 - 👍 20  🔁 5  💬 0  📌 0
Preview
Tuskegee syphilis study whistleblower Peter Buxtun has died at age 86 The whistleblower who exposed the Tuskegee syphilis study that left hundreds of Black men untreated has died at age 86.

Peter Buxtun, the whistleblower who revealed that the U.S. government allowed hundreds of Black men in rural Alabama to go untreated for syphilis in what became known as the Tuskegee study, has died. He was 86. apnews.com/article/buxt...

16.07.2024 15:18 - 👍 90  🔁 52  💬 2  📌 3

IGES (International Genetic Epidemiology Society) journal club recording "The All of Us Research Program: Data Resources and Access" by Dr. Anji Musick is available here:

drive.google.com/file/d/1bty6...

18.04.2024 15:43 - 👍 1  🔁 2  💬 0  📌 0
PA-23-189: Research Supplements to Promote Diversity in Health-Related Research (Admin Supp Clinical... NIH Funding Opportunities and Notices in the NIH Guide for Grants and Contracts: Research Supplements to Promote Diversity in Health-Related Research (Admin Supp Clinical Trial Not Allowed) PA-23-189....

Also worth noting: the US NIH offers financial support for those from ANY underrepresented group at ANY career stage to join an NIH funded lab!

Find a professor whose research you’re interested in and see if they are willing to host you! grants.nih.gov/grants/guide...

05.11.2023 11:02 - 👍 33  🔁 31  💬 3  📌 0

TWAS (transcriptome-wide association study) is a statistical method that prioritizes genes that are more likely to cause a disease. It uses data from GWAS (genome-wide association studies), which identify genomic loci associated with diseases

23.10.2023 14:53 - 👍 1  🔁 0  💬 0  📌 0

Preprint is up now www.biorxiv.org/content/10.1...

21.10.2023 13:44 - 👍 4  🔁 3  💬 0  📌 1

Preprint should be up shortly

18.10.2023 17:48 - 👍 2  🔁 0  💬 1  📌 0

4) LD is not necessary for the inflation to occur (our simulations were done using independent SNPs)

5) The inflation can be corrected by using the noncentral χ2 distribution with noncentrality parameter N h2δ Φ, where the factor Φ can be pre-calculated independent of the GWAS

18.10.2023 17:48 - 👍 3  🔁 0  💬 2  📌 0

2) Uncertainty in the prediction of the mediator does not cause inflation

3) Uncertainty in the prediction of the mediator reduces the power of the test

18.10.2023 17:47 - 👍 4  🔁 0  💬 1  📌 0

In summary

1) Polygenicity of the target trait induces inflation in the test statistics regardless of the genetic architecture of the mediating trait

18.10.2023 17:47 - 👍 4  🔁 0  💬 1  📌 0

Does this inflation affect other mediator-based *WAS?

Yes

What if we use PRS of GWAS traits to correlate with target traits? Is this going to be inflated?

Yes. You need to estimate Φ and use the noncentral χ2 distribution

18.10.2023 17:46 - 👍 3  🔁 0  💬 1  📌 0
Post image

Back to the effect of precision
Precision of prediction improves power; equivalently, prediction error reduces power but doesn’t increase inflation under the null

18.10.2023 17:46 - 👍 3  🔁 0  💬 1  📌 0
Post image

How well does your formula work under the alternative with finite N?

Pretty well

18.10.2023 17:46 - 👍 2  🔁 0  💬 1  📌 0
Post image

What happens under the alternative?

See figure for formula under the alternative

τ2 is the precision of the prediction

18.10.2023 17:45 - 👍 2  🔁 0  💬 1  📌 0
Post image

Can you estimate Φ?

Yes

See figure: most of the Φ values are around 10e-5

18.10.2023 17:45 - 👍 2  🔁 0  💬 1  📌 0

Can you fix it?

Yes, just use the noncentral chi2 to compute the p-value instead of the standard chi2

pchisq(chi2, ncp=N*h2Y*phi, df=1, lower.tail=FALSE)

where phi = Φ is the gene-specific factor
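The one-liner above can be expanded into a small comparison of the standard and corrected p-values. This is only a sketch: the values of `chi2`, `N`, `h2Y`, and `phi` below are made up for illustration and are not taken from the preprint; only the `pchisq` call itself comes from the post.

```r
# Hypothetical inputs, chosen only to illustrate the correction
chi2 <- 6.5      # observed TWAS chi-square statistic for one gene (assumed)
N    <- 1e5      # GWAS sample size (assumed)
h2Y  <- 0.5      # heritability of the target trait Y (assumed)
phi  <- 1e-5     # gene-specific factor Phi (assumed)

# Noncentrality parameter from the post: N * h2Y * phi
ncp <- N * h2Y * phi

# Standard p-value from the central chi-square with 1 degree of freedom
p_standard <- pchisq(chi2, df = 1, lower.tail = FALSE)

# Corrected p-value from the noncentral chi-square, as in the post
p_corrected <- pchisq(chi2, ncp = ncp, df = 1, lower.tail = FALSE)

print(c(standard = p_standard, corrected = p_corrected))
```

Because the noncentral distribution puts more mass in the upper tail, the corrected p-value is larger than the standard one for the same statistic, which is how the correction removes the inflation.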

18.10.2023 17:44 - 👍 3  🔁 0  💬 2  📌 0
Post image

Does it matter for TWAS in practice?

Yes

See inflation in figure when using UK Biobank genotype data and null traits Y. It will continue growing as N grows.
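To see why the inflation keeps growing with N, the thread's relation Eχ2 = 1 + N h2 Φ can be evaluated over a range of sample sizes. A minimal sketch, assuming illustrative values for `h2` and `phi` (they are not estimates from the UK Biobank analysis):

```r
# Hypothetical values, for illustration only
h2  <- 0.5                     # heritability of the target trait (assumed)
phi <- 1e-5                    # gene-specific factor Phi (assumed)
N   <- c(1e4, 1e5, 5e5, 1e6)   # sample sizes to compare (assumed)

# Expected chi-square under the null, following E chi2 = 1 + N * h2 * Phi
expected_chi2 <- 1 + N * h2 * phi

print(data.frame(N = N, expected_chi2 = expected_chi2))
```

The expected statistic is linear in N, so for a fixed gene the gap from the nominal value of 1 widens as biobanks grow.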

18.10.2023 17:44 - 👍 2  🔁 0  💬 1  📌 0
Post image

How well does your formula match with finite N?

Quite well

18.10.2023 17:43 - 👍 2  🔁 0  💬 1  📌 0
Post image

Does prediction error drive inflation?

No, it’s not the prediction error

See figure: Eχ2 does not depend on the precision, consistent with the established errors-in-variables literature

It’s the polygenicity of Y, since Eχ2 = 1 + N h2 Φ. The inflation is 0 when the polygenic component of Y = 0
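This relation lines up with the noncentral χ2 correction used elsewhere in the thread, because a noncentral χ2 with k degrees of freedom and noncentrality λ has mean k + λ. A sketch of that connection, assuming h^2 stands for the same heritability term the thread writes as h2, h2δ, or h2Y:

```latex
\mathbb{E}\!\left[\chi^2\right] = 1 + N h^2 \Phi,
\qquad
\chi^2 \sim \chi^2_{1}(\lambda)
\ \text{with}\ \lambda = N h^2 \Phi
\ \Longrightarrow\
\mathbb{E}\!\left[\chi^2\right] = 1 + \lambda .
```

Setting h^2 = 0 gives λ = 0, which recovers the standard central χ2 with one degree of freedom and mean 1, matching the statement that the inflation vanishes when Y has no polygenic component.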

18.10.2023 17:42 - 👍 2  🔁 0  💬 1  📌 0
Post image

Does this matter?

Yes

See the example in figure 1: the pink area is much larger than the negligible blue area

18.10.2023 17:40 - 👍 2  🔁 0  💬 1  📌 0