Peng He 何鹏

@penghe.bsky.social

Assistant professor at UCSF. Building a Pokédex of cell states and understanding their gene regulation. 🐦@PengHeAtlas

72 Followers  |  43 Following  |  18 Posts  |  Joined: 04.01.2024

Posts by Peng He 何鹏 (@penghe.bsky.social)

Toward informed batch correction for single-cell transcriptome integration - Nature Computational Science Batch effects pose substantial challenges for obtaining meaningful biological insights from large-scale yet heterogeneous single-cell RNA-sequencing datasets. Here the authors review widely adopted ba...

📢 New Perspective out! @penghe.bsky.social, @teichlab.bsky.social and colleagues review widely adopted batch correction methods and propose a path toward more informed, context-aware approaches for future method development. www.nature.com/articles/s43... 🖥️ 🧬

🔓 rdcu.be/e4pUJ

17.02.2026 15:52 | 👍 4    🔁 3    💬 0    📌 0

This paper wouldn't be possible without my wonderful co-authors: Shuang Li, Malte Lücken, and the support from the Marioni Lab and Teichmann Lab. 🤝
Huge thanks to the reviewers and community for the feedback! #SingleCell #Bioinformatics #scRNAseq #DataScience

17.02.2026 02:12 | 👍 1    🔁 0    💬 0    📌 0

8️⃣ The Future: Leverage high-quality references and interpretable models to convert "unknown" Class II effects into "known" Class I artifacts. 🔄
Long-term goal 🚀: Robust reference-based analyses resilient to batch effects, ultimately reducing the need for post hoc correction altogether.

17.02.2026 02:12 | 👍 0    🔁 0    💬 1    📌 0

7️⃣ Despite progress, gaps remain:
❌ "Black box": Models often obscure which genes are corrected.
❌ Inefficiency: We often train models "from scratch" for every dataset.
❌ Lack of guidance on overlapping/redundant covariates.

17.02.2026 02:12 | 👍 0    🔁 0    💬 1    📌 0

6️⃣ Notably, while the exact link between genes and technical variance remains complex, emerging feature selection insights may pave the way for interpretable frameworks that explicitly model these sources rather than blindly correcting them.

17.02.2026 02:12 | 👍 0    🔁 0    💬 1    📌 0

5️⃣ We further divide each category into three subgroups.
⚙️ Data Cleaning methods relate to physical models, offering mechanistic insights and precise analysis.
⬛ Data Integration methods are highly effective but implicit, often functioning as "black boxes" where adjustments are harder to trace.

17.02.2026 02:12 | 👍 0    🔁 0    💬 1    📌 0

4️⃣ However, there is a long history of method development handling Class I variations (known artifacts like sequencing depth). 🧹
We classify these as "Data Cleaning" methods. They explicitly model technical noise sources, a critical step often overshadowed by integration.

17.02.2026 02:12 | 👍 0    🔁 0    💬 1    📌 0
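
To make "explicitly modeling a known artifact" concrete, here is a minimal sketch of one common cleaning step for sequencing depth: library-size normalization followed by a log transform, in plain Python. The function name `normalize_depth`, the `target_sum` default, and the toy counts are illustrative assumptions and are not taken from the paper.

```python
import math


def normalize_depth(counts, target_sum=10_000):
    """Scale each cell's counts to a common total, then apply log1p.

    counts: list of cells, each a list of raw gene counts (hypothetical toy input).
    target_sum: the common per-cell total to scale to (an assumed default).
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            # An empty cell carries no signal; keep it as zeros instead of dividing by zero.
            normalized.append([0.0] * len(cell))
            continue
        scale = target_sum / total
        normalized.append([math.log1p(c * scale) for c in cell])
    return normalized


# Two hypothetical cells with the same composition but a 10x difference in depth.
cells = [[10, 30, 60], [1, 3, 6]]
result = normalize_depth(cells)
```

After this step the two toy cells have matching profiles, because the only difference between them was depth. Real pipelines typically rely on dedicated tools and richer noise models; this only illustrates the idea of removing a well-characterized technical factor directly.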

3️⃣ It's important to clarify that what is usually called "data integration" or "batch correction" (like MNN, Harmony, or scVI) is actually just one subset of methods.
These tools are typically designed for Class II effects: variations that are complex or batch-specific.

17.02.2026 02:12 | 👍 0    🔁 0    💬 1    📌 0

2️⃣ We classify batch effects into two categories (Fig 1a):
🔹 Class I: Better characterized, universally unwanted artifacts (e.g., ambient RNA, sequencing depth).
🔹 Class II: Poorly characterized, batch-specific variation (e.g., donor effects, protocol differences).

17.02.2026 02:12 | 👍 0    🔁 0    💬 1    📌 0

1️⃣ Batch effects remain the "elephant in the room" for single-cell genomics. 🐘
The core challenge? The trade-off between undercorrection (residual noise) and overcorrection (erasing fine-grained biological signals). We argue that not all batch effects are the same.

17.02.2026 02:12 | 👍 0    🔁 0    💬 1    📌 0
Toward informed batch correction for single-cell transcriptome integration Nature Computational Science - Batch effects pose substantial challenges for obtaining meaningful biological insights from large-scale yet heterogeneous single-cell RNA-sequencing datasets. Here...

Our Perspective paper "Toward informed batch correction for single-cell transcriptome integration" is out now in Nature Computational Science! 📄✨
We review a decade of batch-correction methods and propose a move from "blind" integration to "informed" modeling. 🧵👇 🔗 rdcu.be/e4cSp

17.02.2026 02:12 | 👍 6    🔁 1    💬 1    📌 0

give it a try and you may get addicted : )

16.06.2025 01:34 | 👍 2    🔁 1    💬 0    📌 0
Royal Match game ads '105' Make way for king
YouTube video by Potato Pseudo Gamer

Being a new PI 🔽 youtube.com/shorts/llV5_...

15.01.2025 20:12 | 👍 0    🔁 1    💬 0    📌 0
Open Postdoctoral Position in Computational Genomics and Systems Biology

🔬 Hiring: Computational Biology Postdoc @UCSF!

We are looking for someone to develop:
1️⃣ Novel deep learning models for spatial/single-cell multiomics
2️⃣ Single-cell analyses across development & disease
3️⃣ Open-source tools for the broader community

Apply here: opportunities.ucsf.edu/content/open...

#CompBio #PostdocJobs

11.12.2024 04:36 | 👍 1    🔁 4    💬 0    📌 0
He Lab - UCSF

🎉 Our new lab website is live! 🎉
Explore our research on single-cell 🌃, multi-omics 🔮, spatial-omics 🗺️, gene regulation 🧬, bioinformatics 🤖, and development 🌲. Meet our motivated team and stay tuned for our latest work 🚀 🌟 New members at all levels welcome!

peng-he-lab.github.io

30.11.2024 23:08 | 👍 4    🔁 0    💬 0    📌 0

He Lab is looking for a computational research assistant to help us build the best cell atlases in the world and dissect gene regulatory networks. This position can be a good fit for fresh grads taking a gap year before applying to grad school. More details: aprecruit.ucsf.edu/JPF05334

26.11.2024 05:05 | 👍 2    🔁 0    💬 0    📌 0

Smart design. I do have code to pool the replicates, demultiplex them together, and then separate the replicates again, in case you need it.

11.01.2024 15:12 | 👍 0    🔁 0    💬 0    📌 0

I've done 6 with uneven mixing. I'd guess 10+ is possible with even mixing + 30k cells per library.

11.01.2024 13:17 | 👍 1    🔁 0    💬 1    📌 0

I'm maintaining an active list of biology-related jobs at various levels (PIs, staff scientists, postdocs, PhDs, etc.). Please feel free to subscribe or contribute! github.com/brianpenghe/...

04.01.2024 15:13 | 👍 4    🔁 0    💬 0    📌 0