4 word horror story for an R user:
"has a Java dependency"
@flaviaerius.bsky.social
I work with Polygenic Risk Scores for Brazilians @ Mendelics #rstats #bioinformatics
- The first thing you need (told last here bc you may already know it, even if I didn't) is at least a superficial understanding of the concept of tokens and how they may limit your output (you can find the limit empirically, depending on the application).
11.10.2025 00:19
My takes on doing this for the 1st time on Gemini 2.5 Pro, to extract info from unstructured data, are:
- it makes a difference whether you submit a CSV or a TSV file (the latter worked better);
- depending on the day, it will be able to process more or less data (timeouts vary with server load);
Probably late, but this week I finally understood the meaning of prompt engineering.
I used a prompt, extracted results, measured them, then adjusted both the input and prompt for better results, measured them, adjusted inputs again, measured results, and finally applied to a new dataset.
Ever wondered why analyzing RNA-seq data feels like walking through a fog with 20,000 dimensions?
Let’s talk about the curse of dimensionality in bioinformatics—and why it’s not just a math problem, it’s a biological one. 🧵
I just want to share that I had a blast at #positconf2025! Met so many wonderful people, and gave a talk for the first time. It was awesome!
27.09.2025 16:28
I’m new to Bluesky, so... hi!
I’m Beatriz / Bea, from Brasil, and I’m currently a postdoc researcher.
I started to learn how to program in R in 2018 to use it on my research, and have been programming since then.
I have a blog where I post and share presentations, etc.
beamilz.com
#rstats 👋
I have just experienced the magic of rv in recovering my R packages after having to format my laptop.
Completely recommend!
github.com/A2-ai/rv
#rstats
True story. So hard to fix this!
02.08.2025 17:03
Every #Rstats user should read this post from @jennybryan.bsky.social.
Especially beginner to intermediate-level ones.
www.tidyverse.org/blog/2017/12...
My biggest, loudest job application advice:
WHY do you want this job? This SPECIFIC job? What about it interests you? What do you think of it? What can you bring to it?
I know this sounds obvious but a generic cover letter isn't doing you favours - really. *Show me you know the job!*
1/7
TIL you can set different alpha values per category in #ggplot2.
Set 'aes(alpha = categories_col)', and then use scale_alpha_manual() to set the transparency of each category.
I used a lower alpha here to make the most important categories stand out.
#rstats
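A minimal sketch of this tip, assuming {ggplot2} is installed. The data frame `df`, the column `category`, and the alpha values are made up for illustration.

```r
library(ggplot2)

# Hypothetical data: `category` decides how transparent each point is
df <- data.frame(
  x = c(1, 2, 3, 4),
  y = c(2, 4, 3, 5),
  category = c("important", "important", "background", "background")
)

p <- ggplot(df, aes(x = x, y = y, alpha = category)) +
  geom_point(size = 4) +
  # Named vector: one alpha value per category level
  scale_alpha_manual(values = c(important = 1, background = 0.2))

p
```

Naming the values in scale_alpha_manual() ties each alpha to a category explicitly, so the result does not depend on the order of the factor levels.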
1/ When should you use a package vs. solving from scratch in bioinformatics?
I get asked this all the time. Here’s my approach: 👇
TIL about the .before argument of mutate() from {dplyr}.
It lets you place the new column(s) before a chosen column: with .before = 1 they go to the *beginning* of the data frame, instead of to the end. Quite useful!
Source: #RforDataScience (2e)
(sometimes you should read stuff you think you already know)
#rstats
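A small sketch of how this could look, assuming {dplyr} 1.0.0 or later. The tibble and column names are invented for the example.

```r
library(dplyr)

# Hypothetical data
df <- tibble(a = 1:3, b = c(10, 20, 30))

# Default behaviour: the new column goes to the end
at_end <- mutate(df, total = a + b)

# .before = 1 places the new column in the first position
at_start <- mutate(df, total = a + b, .before = 1)

# .before also accepts a column name
before_b <- mutate(df, total = a + b, .before = b)

names(at_end)
names(at_start)
names(before_b)
```

There is a matching .after argument for placing new columns after a given column.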
“The idea that taking walks, reading things unrelated to your research, and hanging out with strangers in a campus pub should be considered part of the serious process of thinking, but might well meet with skepticism in practice.”
www.pnas.org/doi/10.1073/...
When chaining bash scripts, define the variable with "export" + its name and value, so that it is also available in the called scripts.
For example:
export LD_CORRELATION=0.2
bash run_ld_correlation_reference.sh
bash run_ld_correlation_samples.sh
#bioinformatics
I think prenatal genetic testing should become part of IVF. The cost of even a single embryo mix up is enormous. NIPT (for parental relatedness) is affordable and can be done early during pregnancy. Parents must have assurance about their child.
thenightly.com.au/australia/mo...
Quick #bioinformatics tip:
If you use PLINK in your pipeline, convert your VCF to pgen or bgen as early as possible!
Plink files are much faster to work with.
I have to say that Lonely Planet is still better than ChatGPT at finding interesting attractions for a trip.
05.06.2025 23:28
I’m so sorry about that, Dariia. Stay safe!
05.06.2025 23:28
jq is so useful.
05.06.2025 20:11
I have the impression, when reading papers from the 1990s and before, that concepts were explained more clearly than nowadays.
27.05.2025 18:34
I have just used #Gemini 2.5 Flash as a reviewer for my Quarto doc, and it gave me interesting recommendations.
My prompt asked for both text and #Rstats code reviews.
It returned strong points and improvement suggestions.
Useful tool especially for academics.
Today I attended the weekly seminar of the department (FMUSP) where I do my postdoc, which is mostly remote.
It was great to interact with other researchers in person. I miss this sometimes.
This workshop is tomorrow, don't miss your last chance to register!
Details: bit.ly/3wBeY4S
Please share!
#AcademicSky #EconSky #Python
To clarify my poor post, it is NOT present in base R. Only when loading {data.table}.
Still exciting for me, though!
#rstats
To clarify, it is not present in base R. Only when loading {data.table}.
Sorry for my poor phrasing!
Yes, this is something I already knew! I was surprised mainly because I use {data.table} a lot, and it is already there, you know? Easier than setting up my own %notin% operator.
Thanks for sharing, though!
It is from library(data.table).
Sorry, I got so excited that I did not polish my writing to become clearer 😅.
YIL that %notin% is already a thing in R.
I know it's been out for a while, from {data.table}, but I only discovered it yesterday.
Yes, I was still using the structure !(column %in% vector). 🫠
Sharing in case you've missed it too!
#rstats
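A quick sketch comparing both forms, assuming a {data.table} version that exports %notin% (I believe it arrived in 1.15.0, so check your installed version). The gene vectors are just example data.

```r
library(data.table)  # assumption: a version that exports %notin%

# Hypothetical example vectors
genes   <- c("BRCA1", "TP53", "APOE", "LDLR")
exclude <- c("TP53", "LDLR")

# The structure I was using before
old_way <- genes[!(genes %in% exclude)]

# Same filter with the operator from {data.table}
new_way <- genes[genes %notin% exclude]

identical(old_way, new_way)
```

As noted in the clarifications above, the operator is not in base R, so it only works after loading {data.table}.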