Today's #RDKit blog post is a heartfelt plea for clearer communication.
greglandrum.github.io/rdkit-blog/p...
@me-datapoint.bsky.social
Professor for data science at HSD, @zdd-hsd.bsky.social | ML fan & critic | current research mostly #datascience, #machinelearning, #cheminformatics #dataviz #nlp | #openscience #openaccess #rse | living data point
Great post!
We also noted the same thing, which triggered us to point out some pitfalls of various fingerprints --> www.biorxiv.org/content/10.1...
BREAKING NEWS: #AI coding may not be helping as much as you think
"But for now, the disconnect between what coders thought they would get out of the tools efficiency-wise and what they actually did get out of them is cause for reevaluation." ~ @garymarcus.bsky.social
garymarcus.substack....
Paris cycling numbers double in one year thanks to massive investment and it's not stopping.
A visionary urban policy led by @annehidalgo.bsky.social
momentummag.com/paris-cyclin...
Here in @duesseldorf.bsky.social, for now every single parking space is still being defended instead...
(and unfortunately not only here)
I don't think anyone is prepared for what they just did w/ ICE.
This is not a simple budget increase. It is an explosion - making ICE bigger than the FBI, US Bureau of Prisons, DEA, & others combined.
It is setting up to make what's happening now look like child's play. And people are disappearing.
Hey, public administration digitalizers! On July 17 we are launching a new working group on the digitalization of public administration. Your expertise from the public sector is needed! Together we will shape the future of public, digital administration
d-64.org/veranstaltun...
Read in our MS Metabolomics themed collection, an #OpenAccess review from Kevin Mildau, Henry Ehlers, @jjjvanderhooft.bsky.social et al. at @w-u-r.bsky.social @tuwien.at covering effective data visualization strategies in untargeted metabolomics #natprod
Find it here
@jorainer.bsky.social and @philouail.bsky.social gave a great overview of the ecosystem around #RforMassSpectrometry and #XCMS!
#MetSoc25
I am super glad they now also provide options to combine with #Python and #matchms (thanks)
Poster 1001 at #MetSoc2025: Marilyn De Graeve on our #SpectriPy #rstats package to integrate #python and #rstats packages for #MassSpec data analysis. TODAY
23.06.2025 11:09
Hi, in case your phone didn't pick up the QR code to the slides of my Hitch-Hikers Guide to Computational Metabolomics talk this morning at #Metabolomics2025, featuring #xcms, #massbank, not #metfrag but #CASMI and #MetFamily, please find them at doi.org/10.5281/zeno...
25.06.2025 09:15
Slide from presentation of Steffen Neumann
Great keynote by @sneumann.bsky.social at #MetSoc25, strongly advocating for #opensource , data-sharing, and making things interoperable.
Glad to also spot #matchms in this universe :)
Proud of Niek de Jonge who did a fantastic job in presenting his work on cross-ion mode spectral similarity scoring!
Work with Florian Huber @me-datapoint.bsky.social
#metabolomics #CompMetabolomics #MetSoc25 #MS2DeepScore
Chemical Space Visualizations using UMAP and various molecular fingerprints.
4/4
We also highlight options for count fingerprints, such as log-counts and IDF weighted counts. The latter can be used to adjust the bit importance to a dataset of your choice.
One example use case is chemical space visualization.
Preprint: www.biorxiv.org/content/10.1...
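To make the two weighting options concrete, here is a minimal sketch in plain Python. It assumes count fingerprints are stored as `{bit_id: count}` dictionaries and uses the textbook IDF form `log(N / n_bit)`; the exact formulas in the preprint may differ. The function names and toy data are hypothetical, and the similarity is the count-based (Ruzicka) generalization of Tanimoto.

```python
import math
from collections import Counter


def log_counts(fp):
    """Dampen large counts: c -> log(1 + c)."""
    return {bit: math.log1p(count) for bit, count in fp.items()}


def idf_weights(dataset):
    """IDF per bit, assuming the textbook form log(N / n_molecules_with_bit)."""
    n_molecules = len(dataset)
    doc_freq = Counter(bit for fp in dataset for bit in fp)
    return {bit: math.log(n_molecules / df) for bit, df in doc_freq.items()}


def apply_idf(fp, idf):
    """Scale counts by IDF; bits unseen in the reference set get weight 0 (an assumption)."""
    return {bit: count * idf.get(bit, 0.0) for bit, count in fp.items()}


def weighted_tanimoto(a, b):
    """Ruzicka similarity: sum of minima divided by sum of maxima."""
    bits = set(a) | set(b)
    numerator = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in bits)
    denominator = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in bits)
    return numerator / denominator if denominator else 0.0


# Made-up toy fingerprints: bit 1 occurs in every molecule, bit 3 in only one.
toy_dataset = [{1: 3, 2: 1}, {1: 1, 3: 2}, {1: 2, 2: 4}]
idf = idf_weights(toy_dataset)

raw_score = weighted_tanimoto(toy_dataset[0], toy_dataset[1])
idf_score = weighted_tanimoto(apply_idf(toy_dataset[0], idf),
                              apply_idf(toy_dataset[1], idf))
print(raw_score, idf_score)
```

In this toy set the only bit the first two molecules share is the one present everywhere, so IDF weighting drops their similarity from 1/6 to 0. That is the sense in which the weights adapt bit importance to the chosen dataset.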
3/4
A huge issue is bit collisions.
Fingerprints with a high bit occupation (RDKit, MAP4) often lead to (1) arbitrary misinterpretations, (2) shifts to high Tanimoto scores, (3) very different handling of small and large molecules.
--> Consider using sparse fingerprints!
--> Morgan >> MAP4 / RDKit
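A toy illustration of why bit collisions push Tanimoto scores up. It assumes sparse feature IDs are folded into a fixed-length vector by taking them modulo the vector length, which is one common scheme and not necessarily what every fingerprint implementation does. The feature IDs below are made up for the example.

```python
def fold(features, n_bits):
    """Fold sparse feature IDs into a fixed-size bit set via modulo (assumed scheme)."""
    return {feature % n_bits for feature in features}


def tanimoto(a, b):
    """Binary Tanimoto (Jaccard) similarity on sets of on-bits."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical sparse feature IDs for two different molecules.
mol_a = {3, 1027, 5000}
mol_b = {3, 2051, 9096}

sparse_score = tanimoto(mol_a, mol_b)
folded_score = tanimoto(fold(mol_a, 1024), fold(mol_b, 1024))
print(sparse_score, folded_score)
```

The sparse sets share one of five distinct features, but after folding to 1024 bits both molecules land on the same two bits and look identical. The denser the fingerprint, the more often this happens.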
Benchmarking plot on fingerprint duplications.
2/4
We focused on weaknesses of the fingerprints.
Many show frequent duplicates, i.e. the same fingerprint for different compounds. Most problematic: this can include *very* different compounds ending up with identical fingerprints.
- MAP4 >> Morgan-type >> daylight
- count >> binary
#cheminformatics
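Checking a dataset for such duplicates is straightforward. The sketch below groups compounds by identical fingerprint, again assuming fingerprints are `{bit_id: count}` dictionaries. The compound names, bit IDs and counts are invented for illustration and are not real fingerprint output.

```python
from collections import defaultdict


def duplicate_groups(fingerprints):
    """Return lists of compound names that share an identical fingerprint."""
    groups = defaultdict(list)
    for name, fp in fingerprints.items():
        key = tuple(sorted(fp.items()))  # hashable, order-independent key
        groups[key].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def binarize(fp):
    """Collapse a count fingerprint to presence/absence."""
    return {bit: 1 for bit, count in fp.items() if count > 0}


# Invented count fingerprints: two chains differing only in how often a bit occurs.
count_fps = {
    "hexane": {10: 2, 11: 4},
    "decane": {10: 2, 11: 8},
    "ethanol": {10: 1, 11: 1, 20: 1},
}
binary_fps = {name: binarize(fp) for name, fp in count_fps.items()}

print(duplicate_groups(count_fps))
print(duplicate_groups(binary_fps))
```

Because binarizing only discards information, a binary fingerprint can never produce fewer duplicates than the count fingerprint it was derived from.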
Sketch of count/binary fingerprints and weighting options.
New preprint out!
1/4
@julianpollmann.bsky.social and I went down several rabbit holes to assess some commonly used molecular fingerprints.
Bottom line: For large datasets, make an effort to select suitable settings. "We used Tanimoto" is not good enough.
--> www.biorxiv.org/content/10.1...
Good start for me at #metabolomics2025 with a hands-on workshop on MS2LDA by Jonas Dietrich, Rosina Torres Ortega and @jjjvanderhooft.bsky.social.
23.06.2025 08:11
Elbe river seen from a train somewhere after Dresden.
Went by train to #Prague for #metabolomics2025.
These are the kind of moments that remind me how great the European project is. No border controls, no visas. Just a train following a river to the neighboring country.
Orwell's 1984, but with LLMs
22.06.2025 02:07
All results here: fahrradklima-test.adfc.de/ergebnisse
(Especially from the Ruhr area to Cologne, it is unfortunately pretty sad)
Screenshot of the ADFC Fahrradklima-Test 2024 for Düsseldorf.
The mayor @duesseldorf.bsky.social can invoke the "cycling capital" as often as he likes... in the end it takes a bit more than a few blobs of paint.
#Düsseldorf still holding steady at a grade of 4- in the #ADFC Klimatest. Going great. @adfcnrw.bsky.social @adfc-duesseldorf.de
At #ASMS2025, we presented a next-gen workflow pairing the #timsMetabo MoRE (Mobility Range Enhancement) with #mzmine and #DreaMS
Compared to conventional LC-MS2, the data delivered:
- 84% more detected features
- 5.7× more MS² spectra
- 3× more spectral matches
Read the full poster to learn MoRE
When you prepare lesson material while being hungry...
(added some text edits and more sketches/figures to the NLP chapters of the "Hands-on Introduction to #DataScience with #Python" textbook)
florian-huber.github.io/data_science...
#OpenScience #Teaching #CCBY
Look at the gender breakdown of who speaks in popular films!
26.05.2025 14:50
Loving this: "The Copilot Delusion"
https://deplet.ing/the-copilot-delusion/
OpenAI just updated ChatGPT to be able to use RDKit, a cheminformatics Python package.
OpenAI's president says this makes ChatGPT "useful for scientific work across health, biology, and chemistry," but it is hilariously still not good at chemistry (🧵)
#chemsky #AI 🧪
New release of my "Hands-on Introduction to Data Science with Python" textbook!
Contains many text edits and figure updates, for instance in the sections on Clustering and Machine Learning.
All fully #opensource and #openaccess. Figures are #CCBY.
--> florian-huber.github.io/data_science...
It is with tremendous emotion that I share with you our recent work @jcheminf.bsky.social rb.gy/gynwlf that resulted in the update of the MIADB and the generation of valuable spectrometric signatures that could be used as #MassQL queries
S. Szwarc @univparissaclay.bsky.social @adafede.bsky.social
Great thing about working in "Data Science" is working on so many different fun projects and topics!
Til Hunke, a master's student supervised by Jochen Steffens and me, used NLP tools to analyze the lyrics of pop songs in the German charts from 1954 to 2022.
--> journals.sagepub.com/doi/10.1177/...