Michael Hall

@mbhall88.bsky.social

Bioinformatics geek 🤓 crafting Rust-y tools 🦀 for microbial genomes 🦠 🧬. Trying to master Dad mode 👨‍🍼 See what I'm up to here: https://github.com/mbhall88

170 Followers  |  281 Following  |  14 Posts  |  Joined: 20.11.2024

Latest posts by mbhall88.bsky.social on Bluesky

AltaiR: a C toolkit for alignment-free and temporal analysis of multi-FASTA data. Abstract. Background: Most viral genome sequences generated during the latest pandemic have presented new challenges for computational analysis. Analyzing mi...

New from @dgpratas.bsky.social et al. for analyzing multiple sequences in multi-FASTA format using alignment-free methodologies. Scalable to millions of sequences for pandemic research and more

AltaiR: a C toolkit for alignment-free and temporal analysis of multi-FASTA data doi.org/10.1093/giga...

12.12.2024 10:28 | 👍 4    🔁 4    💬 0    📌 0
How the Web of Science takes a step back: The Web of Science, a major commercial indexing service of scientific journals operated by Clarivate, recently decided to remove eLife from its Science Citation Index Expanded (SCIE). eLife will on...

“Clarivate’s decision rewards journals for continuing the unhelpful practice of keeping peer review information hidden and unintentionally presenting incomplete and inadequate studies as sound science and punishes those journals that are more transparent.” 👏🙌

www.coalition-s.org/blog/how-the...

03.12.2024 09:49 | 👍 1    🔁 0    💬 0    📌 0

The DOI URL doesn't seem to be working for the preprint currently. You can find it here: www.biorxiv.org/content/10.1...

03.12.2024 04:02 | 👍 0    🔁 0    💬 0    📌 0
GitHub - mbhall88/lrge: Genome size estimation from long read overlaps

8/ Try it out!
LRGE is open-source and ready to integrate into your workflows as a Rust library or CLI application. Whether you’re on a high-performance cluster or a basic laptop, LRGE delivers fast and reliable genome size estimates. Get it here: github.com/mbhall88/lrge

03.12.2024 01:38 | 👍 3    🔁 0    💬 0    📌 0

7/ We validated LRGE on 3370 long read bacterial datasets which have associated high-quality RefSeq assemblies 🦠. We also confirmed it generalises to eukaryote organisms 🪰🌱🍞

03.12.2024 01:38 | 👍 0    🔁 0    💬 1    📌 0
Post image

6/ And it’s efficient! ⚡
LRGE uses significantly less CPU and memory than traditional approaches, making it ideal for both high-performance clusters and resource-limited environments.

03.12.2024 01:38 | 👍 0    🔁 0    💬 1    📌 0
Post image

5/ LRGE vs. the competition 🔥
LRGE delivers estimates as reliable as assembly-based methods and better than k-mer-based approaches.
Relative error (y-axis) measures the proportional difference between the estimated and true genome size.

03.12.2024 01:38 | 👍 0    🔁 0    💬 1    📌 0
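
The relative error mentioned in post 5/ can be written out in a couple of lines. This is a minimal sketch that assumes the signed form, (estimate - truth) / truth; the plot in the post may use the absolute value instead. The function name and the genome sizes are hypothetical, chosen only for illustration.

```python
def relative_error(estimated_size: float, true_size: float) -> float:
    """Signed proportional difference between an estimate and the true size."""
    if true_size <= 0:
        raise ValueError("true_size must be positive")
    return (estimated_size - true_size) / true_size


# Hypothetical numbers: an estimate of 4.5 Mbp for a 5 Mbp genome
# underestimates the true size by 10%.
print(relative_error(4_500_000, 5_000_000))  # -0.1
```

A negative value means the estimate is too small, a positive value means it is too large.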

4/ LRGE also provides a confidence interval for the estimated genome size, offering users an expected range of variation.

03.12.2024 01:38 | 👍 0    🔁 0    💬 1    📌 0
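
Post 4/ does not say how the interval is computed. One simple way to get an expected range from a set of per-read estimates is to report a pair of quantiles. The sketch below does that with linear interpolation; it is an illustration only and not necessarily how LRGE derives its interval. The function names, the 25%/75% bounds, and the example values are all assumptions, and the estimates are assumed to be finite.

```python
import math


def quantile(values: list[float], p: float) -> float:
    """Linear-interpolation quantile of finite values, with 0 <= p <= 1."""
    xs = sorted(values)
    pos = p * (len(xs) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    if lo == hi:
        return xs[lo]
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def estimate_range(estimates: list[float], lower: float = 0.25,
                   upper: float = 0.75) -> tuple[float, float]:
    """Range spanned by two quantiles of the per-read estimates."""
    return quantile(estimates, lower), quantile(estimates, upper)


# Hypothetical per-read genome size estimates, in base pairs.
per_read = [4.0e6, 4.2e6, 4.4e6, 4.6e6, 4.8e6]
print(estimate_range(per_read))  # (4200000.0, 4600000.0)
```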

3/ Why choose LRGE?
* Outperforms traditional k-mer-based tools in accuracy and resource usage.
* Comparable in accuracy to quick assembly tools (like Raven) but much faster and with lower memory requirements.
* Built in Rust, with zero external dependencies. 💻

03.12.2024 01:38 | 👍 0    🔁 0    💬 1    📌 0

2/ How does it work?
The basic idea is that, if we knew the genome size, we could calculate the expected number of overlaps between each read and all other reads. We invert this relationship to estimate the genome size from the observed number of overlaps for each read.

03.12.2024 01:38 | 👍 0    🔁 0    💬 1    📌 0
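
The inversion described in post 2/ can be sketched with a deliberately simplified model; this is not LRGE's actual formula or code. It assumes reads start uniformly at random on a genome of size G, so a query read of length q overlaps another read of mean length r by at least t bases with probability of about (q + r - 2t) / G. All function and parameter names below are hypothetical, and taking the median of the per-read estimates is just one robust way to combine them.

```python
import math
from statistics import median


def expected_overlaps(genome_size: float, n_reads: int, query_len: float,
                      mean_read_len: float, min_overlap: float = 0) -> float:
    """Forward model: expected number of reads overlapping one query read."""
    window = query_len + mean_read_len - 2 * min_overlap
    return n_reads * window / genome_size


def estimate_from_read(n_overlaps: int, n_reads: int, query_len: float,
                       mean_read_len: float, min_overlap: float = 0) -> float:
    """Inverse model: genome size implied by one read's observed overlaps."""
    if n_overlaps == 0:
        return math.inf  # no overlaps observed: the size is unbounded
    window = query_len + mean_read_len - 2 * min_overlap
    return n_reads * window / n_overlaps


def estimate_genome_size(observations: list[tuple[float, int]], n_reads: int,
                         mean_read_len: float, min_overlap: float = 0) -> float:
    """Median of per-read estimates; observations are (read_len, n_overlaps)."""
    per_read = [
        estimate_from_read(k, n_reads, q, mean_read_len, min_overlap)
        for q, k in observations
    ]
    return median(per_read)


# Hypothetical data: 1000 other reads with a mean length of 10 kbp.
obs = [(10_000, 20), (8_000, 18), (12_000, 22)]
print(estimate_genome_size(obs, n_reads=1000, mean_read_len=10_000))  # 1000000.0
```

Under this model a 10 kbp read on a 1 Mbp genome is expected to overlap 20 of the 1000 other reads, and observing 20 overlaps maps back to a 1 Mbp estimate.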

1/ Accurate genome size estimation is crucial for genomics, yet many tools are optimised for short reads, leaving long-read datasets underserved. Enter LRGE: a lightweight, fast, and highly efficient tool specifically designed for long-read sequencing technologies.

03.12.2024 01:38 | 👍 0    🔁 0    💬 1    📌 0

🌟 Excited to share my latest preprint with @lachlanjmc.bsky.social on @biorxivpreprint.bsky.social: "Genome size estimation from long read overlaps"! 🚀

Check it out here: doi.org/10.1101/2024...
And find the code here: github.com/mbhall88/lrge

🧵👇

03.12.2024 01:38 | 👍 29    🔁 14    💬 2    📌 1

Props to eLife for sticking to their guns and essentially telling Clarivate "stuff your Journal Impact Factor, we don't want/need it and neither should anyone else".

Having recently published in eLife, I can attest to the fact that their review process is smooth and high quality.

02.12.2024 00:37 | 👍 2    🔁 0    💬 0    📌 0
Poster of the EIPOD call

Join the Interdisciplinary Postdoc Fellowship Program at the European Molecular Biology Laboratory (EMBL), one of the best places to do research in modern biology and develop your career.

Great opportunities for statisticians, comp. biologists, AI experts, mathem. modelers!
www.embl.org/eipod-linc

22.11.2024 09:02 | 👍 41    🔁 32    💬 0    📌 0

Handy tip of the day: Settings->Moderation->Muted words & tags->[enter last name of buffoon who is going to be president of USA]

🧘‍♂️

21.11.2024 23:18 | 👍 2    🔁 0    💬 0    📌 0

For all you Pythonistas out there, if you haven't tried `uv` yet, give it a go. It will blow your mind 🤯 It is essentially `cargo` (from the Rust world) for Python! Amazing it took so long for the ecosystem to get here, but we did. Astral.sh are doing some incredible things, with incredible devs.

20.11.2024 22:31 | 👍 11    🔁 3    💬 1    📌 0
Jobs: View the current job vacancies at the Microbiology Society.

Microbial Genomics journal is looking for editors at senior level (functional genomics and microbe-host interactions) and handling editors (mainly eukaryotic microbial genomics but any within journal remit welcome). If you want to know more, let me know! microbiologysociety.org/who-we-are/j....

20.11.2024 22:09 | 👍 2    🔁 3    💬 0    📌 0

@mbhall88 is following 20 prominent accounts