Cristian Groza, PhD

@cgroza.bsky.social

Computational Genomics. https://scholar.google.com/citations?user=DzNAR8YAAAAJ&hl=fr

340 Followers  |  491 Following  |  19 Posts  |  Joined: 14.07.2023  |  1.7944

Latest posts by cgroza.bsky.social on Bluesky

Inkscape plugin to rescale figures without distorting the text, and other useful features!
github.com/burghoff/Sci...

02.06.2025 12:15 - 👍 152    🔁 46    💬 6    📌 3

That's why I always start conversations by guessing higher. It's more flattering to be overestimated than underestimated.

02.05.2025 22:06 - 👍 3    🔁 0    💬 0    📌 0

Who would listen to AI generated music?

13.04.2025 15:57 - 👍 0    🔁 0    💬 0    📌 0

I'm happy to (finally!) share STRkit, a short tandem repeat genotyping tool for long reads that I've developed in @guilbourque.bsky.social's group over the past few years. Here we describe STRkit & demonstrate state-of-the-art performance in some benchmarks – long reads are great for resolving STRs!

29.03.2025 13:01 - 👍 8    🔁 3    💬 2    📌 1

Check out the new STR tool from @davidlougheed.bsky.social in the group! Works with both PacBio and Nanopore!

01.04.2025 07:15 - 👍 8    🔁 3    💬 0    📌 0

Pangenome graph augmentation from unassembled long reads https://www.biorxiv.org/content/10.1101/2025.02.07.637057v1

09.02.2025 02:50 - 👍 1    🔁 2    💬 0    📌 0

Pangenome graph augmentation from unassembled long reads https://www.biorxiv.org/content/10.1101/2025.02.07.637057v1 🧬🖥️🧪 https://github.com/ldenti/palss

09.02.2025 12:25 - 👍 14    🔁 3    💬 0    📌 1
Printing a data frame (tibble) in the R console, where 3 columns are hidden

The output of df |> print(width = Inf) which prints all columns of a data frame (tibble) in the R console

I don't know who needs to hear this, but if you want to look at all columns of a tibble (which has the somewhat annoying habit of only showing you as many columns as you have space for), just pipe it into print(width = Inf):

df |> print(width = Inf)

#rstats

02.12.2024 17:38 - 👍 70    🔁 13    💬 4    📌 1

This is what happens when you don't silence your transportable elements.

18.11.2024 18:17 - 👍 2    🔁 0    💬 0    📌 0

🖥️🧬 Pangenomics attacks rare disease! Check out our latest publication in Nature Communications, where we survey structural variation in the Genomics Answers for Kids cohort of disease genomes using minigraph.
rdcu.be/dwEJm

22.01.2024 18:07 - 👍 9    🔁 6    💬 2    📌 0
Genomics+Bioinformatics Starter Pack 🧬🖥️ Join the conversation

I created a genomics+bioinformatics starter pack. If I left you off, *please* reply and I'll add you! go.bsky.app/B5YYBfq

22.10.2024 12:34 - 👍 176    🔁 106    💬 99    📌 7
A unified framework to analyze transposable element insertion polymorphisms using graph genomes - Nature Communications
Transposable element (TE) activity affects genome structure. Here, authors present GraffiTE, a framework for analysing polymorphic TEs in long reads or assemblies. It combines state-of-the-art variant...

Proud to be a part of this project led by @clementgoubert.bsky.social and @cgroza.bsky.social.
GraffiTE identifies polymorphic mobile elements (TEs) from genomic assemblies or long-read sequencing data, and genotypes these variants using short or long read sets.
go.nature.com/4hdZ5aV

20.10.2024 03:52 - 👍 9    🔁 3    💬 0    📌 0
https://www.nature.com/articles/s41467-024-53294-2

🎉 Our #GraffiTE paper is out!!! t.co/TmoSL3WQ1P If you are interested in #Transposon insertion polymorphism, this is for you!

16.10.2024 20:47 - 👍 29    🔁 11    💬 0    📌 0

In conclusion, we suggest a way to use pangenomes to call SVs against a reference pangenome like HPRC to filter out common SVs and find the very rare SVs that are unique to a disease genome. This allows clinicians to focus on a set of SVs that are more likely to be pathogenic.

22.01.2024 18:15 - 👍 1    🔁 1    💬 0    📌 0
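
The filtering idea in this post can be sketched in a few lines of Python. This is only an illustration, not the pipeline from the paper: the SV identifiers, the `panel_freq` table, and the `filter_rare` helper are all hypothetical, and the sketch assumes each SV in the disease genome has already been matched to an allele in the reference pangenome.

```python
# Hypothetical sketch: keep only SVs that are absent from, or very rare in,
# a reference panel such as a pangenome built from healthy haplotypes.

def filter_rare(sample_svs, panel_freq, max_freq=0.01):
    """Return the SVs whose panel allele frequency is at most max_freq.

    SVs missing from panel_freq are treated as frequency 0.0,
    i.e. unique to the sample.
    """
    return [sv for sv in sample_svs if panel_freq.get(sv, 0.0) <= max_freq]


# Made-up identifiers and frequencies, for illustration only.
sample_svs = ["del_A", "ins_B", "del_C"]
panel_freq = {"del_A": 0.42, "ins_B": 0.001}

rare = filter_rare(sample_svs, panel_freq)
print(rare)
```

Here `del_A` is dropped because it is common in the panel, while `ins_B` (rare) and `del_C` (not in the panel at all) are kept for review.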

Therefore, we created a PBSV and minigraph consensus to obtain consensus rare SVs. When we used Phrank to rank the consensus rare SVs, we uncovered a deletion in KMT2E, which is likely to be a causal variant for a previously undiagnosed case.

22.01.2024 18:15 - 👍 0    🔁 1    💬 1    📌 0
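
One way to picture a two-caller consensus is to keep only the calls from one caller that are matched by a call from the other. The thread does not say which matching rule was used, so the sketch below assumes a reciprocal-overlap threshold, a common convention for comparing SV calls. The `SV` record, the 0.5 threshold, and the example coordinates are all assumptions made for illustration.

```python
from typing import NamedTuple


class SV(NamedTuple):
    chrom: str
    start: int
    end: int
    svtype: str


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Smaller of the two overlap fractions, or 0.0 if the calls cannot match."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    len_a, len_b = a.end - a.start, b.end - b.start
    if len_a <= 0 or len_b <= 0:
        return 0.0  # zero-length calls need a different matching rule
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / len_a, overlap / len_b)


def consensus(calls_a, calls_b, min_overlap=0.5):
    """Calls from calls_a that have a sufficiently overlapping partner in calls_b."""
    return [
        a for a in calls_a
        if any(reciprocal_overlap(a, b) >= min_overlap for b in calls_b)
    ]


# Made-up calls standing in for the output of two SV callers.
caller_one = [SV("chr1", 100, 200, "DEL"), SV("chr2", 500, 900, "DEL")]
caller_two = [SV("chr1", 110, 210, "DEL"), SV("chr3", 50, 80, "DEL")]

shared = consensus(caller_one, caller_two)
print(shared)
```

Only the `chr1` deletion survives, since it is the only call supported by both lists. Insertions are often reported with zero reference length, so a real comparison would need a separate rule for them.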

When looking for the very rare alleles, we found that current SV methods such as PBSV and minigraph generate too many false positive calls for a clinician to curate in a reasonable amount of time. This is especially so since false positives show up as rare SVs.

22.01.2024 18:14 - 👍 0    🔁 0    💬 1    📌 0

We found that minigraph calls 29,964 SVs in the twins, of which 84.96% are in both twins. PBSV calls 23,516 SVs in the same twins, of which 83.12% are in both twins.

22.01.2024 18:13 - 👍 0    🔁 0    💬 1    📌 0
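
To put the percentages in this post into absolute terms, the short calculation below converts them back into approximate SV counts. It assumes each percentage is taken over the total number of calls quoted for that caller; the results are back-derived from rounded figures, so they are estimates rather than numbers reported in the paper. The helper name `shared_calls` is made up for this sketch.

```python
def shared_calls(total_calls: int, percent_shared: float) -> int:
    """Approximate number of calls found in both twins."""
    return round(total_calls * percent_shared / 100)


# Totals and percentages as quoted in the post.
minigraph_total, minigraph_pct = 29_964, 84.96
pbsv_total, pbsv_pct = 23_516, 83.12

minigraph_shared = shared_calls(minigraph_total, minigraph_pct)
pbsv_shared = shared_calls(pbsv_total, pbsv_pct)

# Calls seen in only one twin: a mix of real differences and caller error.
minigraph_discordant = minigraph_total - minigraph_shared
pbsv_discordant = pbsv_total - pbsv_shared

print(minigraph_shared, minigraph_discordant)
print(pbsv_shared, pbsv_discordant)
```

Under that assumption, each caller leaves roughly four thousand calls that appear in only one twin, which gives a sense of how many candidate calls a curator might have to sift through.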

We found many alleles, but it would be helpful to know how assembly and pangenome approaches compare with long read SV callers such as PBSV. We made use of a pair of twins that is present within GA4K, and checked how many SVs are called in both twins using minigraph vs PBSV.

22.01.2024 18:12 - 👍 0    🔁 0    💬 1    📌 0

Overall, we have found 204,551 SV alleles that are unique to the GA4K cohort, with the most common allele occurring in only 88 of the 574 haplotypes. This highlights the importance of personal de novo assemblies in finding rare SVs.

22.01.2024 18:11 - 👍 1    🔁 0    💬 1    📌 0

The additional sequence is mostly composed of simple repeats, satellites, and TEs. However, we estimate that at least 4.7 Mbp is unique sequence that is not repeats and is not present in HPRC or in the reference genome.

22.01.2024 18:10 - 👍 0    🔁 0    💬 1    📌 0

First, we assembled 574 rare disease haplotypes using HiFi, hifiasm and parental data. We validated these assemblies with Flagger, developed within HPRC. Then we built a pangenome graph with minigraph on top of HPRC. This added 426 Mbp of sequence that is specific to the GA4K cohort.

22.01.2024 18:09 - 👍 0    🔁 0    💬 1    📌 0

It's seasoning, just like well kept cast iron.

20.01.2024 18:20 - 👍 3    🔁 0    💬 0    📌 1

MPs being mad about the tax hike reveals who they represent.

17.01.2024 20:12 - 👍 1    🔁 0    💬 0    📌 0

2FA on top of ssh keys 🤕

02.01.2024 16:04 - 👍 0    🔁 0    💬 0    📌 0

It's mostly Apple's fault. Ditched the iPhone for a Google Pixel, didn't look back.

01.01.2024 21:17 - 👍 0    🔁 0    💬 0    📌 0

Eh, Francis Fukuyama wrote worse

27.12.2023 19:59 - 👍 1    🔁 0    💬 0    📌 0

Thread of some really great works out this week in the world of countering naive genetic thinking (e.g. hereditarianism, genetic determinism, scientific racism)

07.11.2023 21:42 - 👍 40    🔁 26    💬 1    📌 0
GraffiTE: a Unified Framework to Analyze Transposable Element Insertion Polymorphisms using Genome-g...
bioRxiv - the preprint server for biology, operated by Cold Spring Harbor Laboratory, a research and educational institution

🖥️🧬 Our recent preprint on GraffiTE, a transposable element genotyping pipeline: www.biorxiv.org/content/10.1...
Together with Clement Goubert, who is the source of all these great ideas and applications in the field of transposable elements.

22.09.2023 20:18 - 👍 6    🔁 0    💬 1    📌 0

🧬🖥️ Who on here is doing genomics?

22.09.2023 17:00 - 👍 8    🔁 0    💬 3    📌 0
