Data dictionary template: osf.io/ynqcu
Project summary template: osf.io/q6g8d
Dataset level README template: osf.io/tk4cb
@cghlewis.bsky.social
Research Data Management Consultant | cghlewis.com
Co-organizer @r-ladies-stl.bsky.social
Co-organizer POWER Data Management Hub | https://osf.io/ap3tk/
Author of DMLSER: https://datamgmtinedresearch.com/
RDM Weekly: https://rdmweekly.substack.com/
An old friend called me this morning:
"Crystal, I need help. We've been running a project and I didn't get everything set up right from the beginning. 🫣 What can I do now to document our data?"
Me: "Nooooooooo. But I've got you." :)
Data documentation templates sent over ✔️
TIL: Microsoft Teams can be configured to use ISO 8601 date formatting.
I don't know why this makes me so excited. Perhaps it's the analyst in me who's had to wrestle with and wrangle various date formats in the past ...
Obligatory @xkcd.com link: xkcd.com/1179/
#dataBS
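A minimal Python sketch of one reason ISO 8601 is easier to wrangle: in YYYY-MM-DD form, plain string sorting matches chronological order. The sample dates and the `to_iso` helper are hypothetical illustrations, not anything from the post or from Teams.

```python
from datetime import datetime

# Hypothetical sample dates written US-style (MM/DD/YYYY); not from the post.
us_style = ["10/08/2025", "02/27/2013", "12/01/2024"]


def to_iso(us_date: str) -> str:
    """Convert an MM/DD/YYYY string to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(us_date, "%m/%d/%Y").date().isoformat()


iso_style = [to_iso(d) for d in us_style]

# Sorting the raw strings: US-style sorts by month first,
# while ISO 8601 strings sort chronologically.
sorted_us = sorted(us_style)
sorted_iso = sorted(iso_style)

print(sorted_us)
print(sorted_iso)
```

The same property is why a date suffix like 2025-08-15 keeps files in chronological order in a directory listing.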
A screenshot of the Posit Blog homepage showing statistics (936+ posts, 22+ categories, 386+ tags) and two featured blog post cards. Both posts are AI Newsletter roundups from September 2025 by Sara Altman and myself, featuring AI-related R package hexes in their hero images.
ICYMI, @sara-altman.bsky.social and I have been writing a biweekly newsletter on AI and open source data science on the @posit.co blog!
A bit about how that came to be on my #rstats blog: www.simonpcouch.com/blog/2025-10...
Day 2 of #Hiddenref2025 and a reminder that people are as important as outputs.
08.10.2025 09:50 — 👍 10 🔁 5 💬 1 📌 0
Before I code something from scratch, does anyone have a #rstats function like setdiff but it works with named lists and/or data frame rows? Optionally dropping duplicate values and keeping the differences from the first list?
08.10.2025 01:36 — 👍 4 🔁 3 💬 3 📌 0
A month or two ago someone posted a link to their really amazing set of LLM system instructions for writing #rstats code with good tidy/NSE patterns. (They were also good for humans!) Does anyone recall who or where that was?
07.10.2025 19:17 — 👍 16 🔁 6 💬 2 📌 1
No one prepares you that when you work for yourself, you no longer have IT support available to you. But today, thanks to YouTube, I became my own IT person. 😂🙏
07.10.2025 19:35 — 👍 27 🔁 0 💬 4 📌 0
All blessing, no curse. :)
07.10.2025 17:52 — 👍 1 🔁 0 💬 2 📌 0
Today on my blog: some thoughts on the best data management strategies for collaboration: dataabinitio.com?p=1204
What's your best data tip for collaborative research?
Thanks for checking the newsletter out! Ooof, I don't think I can choose a favorite because they all are very interesting and helpful for different reasons. But I think the AI-generated participant data article is one that probably piques a lot of interest right now.
07.10.2025 13:13 — 👍 0 🔁 0 💬 1 📌 0
Issue 16 of RDM Weekly is out! 📬
It includes:
- Data is Not Available Upon Request @ianhussey.mmmdata.io
- AI Generated Participants in Social Science @jamiecummins.bsky.social @science.org
- Why’s it Hard to Teach Data Cleaning? @randyau.com
and more!
rdmweekly.substack.com/p/rdm-weekly...
🥴
06.10.2025 21:05 — 👍 15 🔁 1 💬 3 📌 1
Just filled out a web survey with a bunch of Likert-type items. The response categories were in the same order on each item, but not the order you'd expect:
fairly important
important
unimportant
very important
Pretty sure "alphabetize response categories" is not best practice in survey design.
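A quick Python sketch of what likely happened with those response categories: sorting the labels as plain strings reproduces the survey's order exactly, while an explicit rank mapping keeps the scale meaningful. The "semantic" order below is an assumed least-to-most reading of the scale, and the sample responses are hypothetical.

```python
# Response labels in the order the survey displayed them (from the post).
as_shown = ["fairly important", "important", "unimportant", "very important"]

# Assumed semantic order, least to most important (an assumption, not stated in the post).
semantic_order = ["unimportant", "fairly important", "important", "very important"]

# Plain string sorting reproduces the order the survey used.
alphabetical = sorted(semantic_order)

# Map each label to an explicit rank so responses sort and code meaningfully.
rank = {label: i for i, label in enumerate(semantic_order, start=1)}

# Hypothetical responses, ordered by rank rather than by spelling.
responses = ["very important", "unimportant", "important"]
ordered_responses = sorted(responses, key=rank.get)

print(alphabetical == as_shown)
print(ordered_responses)
```

Storing the rank alongside the label (or using an ordered categorical type) avoids relying on whatever order a tool happens to sort strings in.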
When you've been working with someone for a while and you start to see the little ways that you are impacting how they work with data. 🤩
The name of a file someone just shared with me
"feedback_survey_raw_2025-08-15"
It's so deflating to lose an irreplaceable staff member. It's worse when you lose them to another unit on campus. I view that as a clear administrative failure and so should the admin. Academic staff is the glue that holds everything together yet they're so routinely underpaid and underappreciated.
06.10.2025 15:44 — 👍 37 🔁 4 💬 1 📌 0
Oh no! 😅 I'm sorry, John!
06.10.2025 14:43 — 👍 2 🔁 0 💬 0 📌 0
That is definitely a way to look on the bright side!
06.10.2025 14:41 — 👍 2 🔁 0 💬 1 📌 0
Does it mean you're doing too much when you get the late start date wrong and you get your kiddo to school 1.5 hours late? 🤦
06.10.2025 14:20 — 👍 9 🔁 0 💬 2 📌 0
"Deloitte Australia will issue a partial refund to the federal government after admitting that artificial intelligence had been used in the creation of a $440,000 report littered with errors including three nonexistent academic references and a made-up quote from a Federal Court judgement."
06.10.2025 00:31 — 👍 276 🔁 137 💬 3 📌 13
Some Saturday reading 📖
cghlewis.com/blog/excel_e...
You know you're watching something from the 90s when you hear the term "The Net".
03.10.2025 15:15 — 👍 18 🔁 0 💬 3 📌 0
I'm giving a talk next week about my favourite thing: functions! Come along!
What: Practical Functions - Practically Magic
When: 8th/9th October - www.timeanddate.com/worldclock/f...
Where: Online, via Salt Lake City R User Group www.meetup.com/slc-rug/even...
How: @juliasilge.com
#rstats
Prioritize documentation that has the biggest ROI for you, integrate documentation into your project workflow (assigning team members as responsible for it and setting aside times to update it), and also automate what you can (for instance versioning).
02.10.2025 18:41 — 👍 2 🔁 0 💬 1 📌 0
I think teams know it takes time and they struggle to keep up with it. Also, some teams are just unsure how to get started with this type of documentation.
02.10.2025 16:49 — 👍 2 🔁 0 💬 1 📌 0
The questions you ask are dependent on the data and the issues you run into. If you don't want to slow down a workflow, make sure you obtain all the documentation necessary to allow you to understand data lineage. Otherwise, be prepared to start asking those questions. :)
02.10.2025 16:36 — 👍 2 🔁 1 💬 1 📌 0
If you want to be a good data manager, you have to get really comfortable with asking a lot of questions. When something is unclear or doesn't seem right, you can't settle or make assumptions. That's how you end up with bad data. Stay curious.
02.10.2025 15:10 — 👍 45 🔁 7 💬 1 📌 1
A graphic showing the concept of mapply in R, with multiple input vectors being paired and processed by a function returning a single output vector.
Just published my new R article: 'Mapply: When You Need to Iterate Over Multiple Inputs'! 🚀 If `sapply` doesn't quite cut it for your multi-variable iterations, `mapply` is your friend. Learn to pair inputs beautifully. #RStats #Mapply
https://drmo.site/bhXeDb