Crystal Lewis

@cghlewis.bsky.social

Research Data Management Consultant | cghlewis.com | Co-organizer @r-ladies-stl.bsky.social | Co-organizer POWER Data Management Hub: https://osf.io/ap3tk/ | Author of DMLSER: https://datamgmtinedresearch.com/ | RDM Weekly: https://rdmweekly.substack.com/

5,462 Followers  |  1,655 Following  |  1,311 Posts  |  Joined: 15.08.2023

Latest posts by cghlewis.bsky.social on Bluesky

Data dictionary template: osf.io/ynqcu
Project summary template: osf.io/q6g8d
Dataset level README template: osf.io/tk4cb

08.10.2025 14:54 — 👍 13    🔁 1    💬 0    📌 0

An old friend called me this morning:

"Crystal, I need help. We've been running a project and I didn't get everything set up right from the beginning. 🫣 What can I do now to document our data?"

Me: "Nooooooooo. But I've got you." :)

Data documentation templates sent over ✔️

08.10.2025 14:53 — 👍 7    🔁 0    💬 1    📌 0
ISO 8601

TIL: Microsoft Teams can be configured to use ISO 8601 date formatting.

I don't know why this makes me so excited. Perhaps it's the analyst in me who's had to wrestle with and wrangle various date formats in the past ...

Obligatory @xkcd.com link: xkcd.com/1179/

#dataBS

08.10.2025 14:46 — 👍 8    🔁 1    💬 1    📌 0
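
One practical reason ISO 8601 dates are easier to wrangle: as plain strings they sort in chronological order. A minimal Python sketch, assuming a few made-up dates that are not from the post:

```python
from datetime import date

# Hypothetical example dates, chosen only for illustration.
dates = [date(2025, 10, 8), date(2023, 8, 15), date(2025, 1, 31)]

# ISO 8601 (YYYY-MM-DD): plain string sorting matches chronological order.
iso_strings = sorted(d.isoformat() for d in dates)

# Day-first strings (DD.MM.YYYY): plain string sorting does not.
dmy_strings = sorted(d.strftime("%d.%m.%Y") for d in dates)

print(iso_strings)
print(dmy_strings)
```

The day-first list comes out ordered by day of month, so an analyst would have to parse each value before sorting; the ISO list needs no parsing.
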
A screenshot of the Posit Blog homepage showing statistics (936+ posts, 22+ categories, 386+ tags) and two featured blog post cards. Both posts are AI Newsletter roundups from September 2025 by Sara Altman and myself, featuring AI-related R package hexes in their hero images.

ICYMI, @sara-altman.bsky.social and I have been writing a biweekly newsletter on AI and open source data science on the @posit.co blog!

A bit about how that came to be on my #rstats blog: www.simonpcouch.com/blog/2025-10...

08.10.2025 13:16 — 👍 12    🔁 3    💬 0    📌 0

Day 2 of #Hiddenref2025 and a reminder that people are as important as outputs.

08.10.2025 09:50 — 👍 10    🔁 5    💬 1    📌 0

Before I code something from scratch, does anyone have a #rstats function like setdiff but it works with named lists and/or data frame rows? Optionally dropping duplicate values and keeping the differences from the first list?

08.10.2025 01:36 — 👍 4    🔁 3    💬 3    📌 0

A month or two ago someone posted a link to their really amazing set of LLM system instructions for writing #rstats code with good tidy/NSE patterns. (They were also good for humans!) Does anyone recall who or where that was?

07.10.2025 19:17 — 👍 16    🔁 6    💬 2    📌 1

No one prepares you that when you work for yourself, you no longer have IT support available to you. But today, thanks to YouTube, I became my own IT person. 😂🙏

07.10.2025 19:35 — 👍 27    🔁 0    💬 4    📌 0
ALT: a man is asking a woman if she is crying or something.
07.10.2025 19:34 — 👍 0    🔁 0    💬 1    📌 0

All blessing, no curse. :)

07.10.2025 17:52 — 👍 1    🔁 0    💬 2    📌 0
Data Management for Collaborations » Data Ab Initio

Today on my blog: some thoughts on the best data management strategies for collaboration: dataabinitio.com?p=1204

What's your best data tip for collaborative research?

06.10.2025 22:14 — 👍 7    🔁 3    💬 0    📌 0

Thanks for checking the newsletter out! Ooof, I don't think I can choose a favorite because they all are very interesting and helpful for different reasons. But I think the AI-generated participant data article is one that probably piques a lot of interest right now.

07.10.2025 13:13 — 👍 0    🔁 0    💬 1    📌 0
RDM Weekly - Issue 016 A weekly roundup of Research Data Management resources.

Issue 16 of RDM Weekly is out! 📬

It includes:
- Data is Not Available Upon Request @ianhussey.mmmdata.io
- AI Generated Participants in Social Science @jamiecummins.bsky.social @science.org
- Why’s it Hard to Teach Data Cleaning? @randyau.com
and more!

rdmweekly.substack.com/p/rdm-weekly...

07.10.2025 12:56 — 👍 18    🔁 7    💬 2    📌 0

🥴

06.10.2025 21:05 — 👍 15    🔁 1    💬 3    📌 1

Just filled out a web survey with a bunch of Likert-type items. The response categories were in the same order on each item, but not the order you'd expect:

fairly important
important
unimportant
very important

Pretty sure "alphabetize response categories" is not best practice in survey design.

06.10.2025 18:42 — 👍 44    🔁 2    💬 3    📌 1
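
The ordering problem described above can be sketched in Python. The "intended" order below is an assumption about what the survey designers meant, not something stated in the post:

```python
# Response categories in the order the survey displayed them.
as_shown = ["fairly important", "important", "unimportant", "very important"]

# Assumed intended order, from lowest to highest importance (hypothetical).
intended = ["unimportant", "fairly important", "important", "very important"]

# The displayed order is exactly what alphabetical sorting produces.
is_alphabetical = sorted(as_shown) == as_shown

# Sort by an explicit rank instead of by label text.
rank = {label: position for position, label in enumerate(intended)}
reordered = sorted(as_shown, key=lambda label: rank[label])

print(is_alphabetical)
print(reordered)
```

Storing an explicit rank alongside each label keeps ordinal categories in scale order wherever they are displayed or analyzed.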

When you've been working with someone for a while and you start to see the little ways that you are impacting how they work with data. 🤩

The name of a file someone just shared with me
"feedback_survey_raw_2025-08-15"

06.10.2025 18:27 — 👍 27    🔁 1    💬 3    📌 0

It's so deflating to lose an irreplaceable staff member. It's worse when you lose them to another unit on campus. I view that as a clear administrative failure and so should the admin. Academic staff is the glue that holds everything together yet they're so routinely underpaid and underappreciated.

06.10.2025 15:44 — 👍 37    🔁 4    💬 1    📌 0

Oh no! 😅 I'm sorry, John!

06.10.2025 14:43 — 👍 2    🔁 0    💬 0    📌 0

That is definitely a way to look on the bright side!

06.10.2025 14:41 — 👍 2    🔁 0    💬 1    📌 0

Does it mean you're doing too much when you get the late start date wrong and you get your kiddo to school 1.5 hours late? 🤦

06.10.2025 14:20 — 👍 9    🔁 0    💬 2    📌 0

"Deloitte Australia will issue a partial refund to the federal government after admitting that artificial intelligence had been used in the creation of a $440,000 report littered with errors including three nonexistent academic references and a made-up quote from a Federal Court judgement."

06.10.2025 00:31 — 👍 276    🔁 137    💬 3    📌 13
Tips for data entry in Excel | Crystal Lewis This post provides a few tips for collecting higher quality and more usable data when using Excel as a data entry tool.

Some Saturday reading 📖

cghlewis.com/blog/excel_e...

04.10.2025 13:38 — 👍 19    🔁 6    💬 0    📌 0

You know you're watching something from the 90s when you hear the term "The Net".

03.10.2025 15:15 — 👍 18    🔁 0    💬 3    📌 0
Event Time Announcer - Salt Lake City talk: Practical Functions, Practically Magic Event Time Announcer shows time for Salt Lake City talk: Practical Functions, Practically Magic in locations all over the world. In Salt Lake City it happens on Tuesday, October 7, 2025 at 4:00:00 pm.

I'm giving a talk next week about my favourite thing: functions! Come along!

What: Practical Functions - Practically Magic
When: 8th/9th October - www.timeanddate.com/worldclock/f...
Where: Online, via Salt Lake City R User Group www.meetup.com/slc-rug/even...
How: @juliasilge.com

#rstats

03.10.2025 04:29 — 👍 17    🔁 4    💬 1    📌 3

Prioritize documentation that has the biggest ROI for you, integrate documentation into your project workflow (assigning team members as responsible for it and setting aside times to update it), and also automate what you can (for instance versioning).

02.10.2025 18:41 — 👍 2    🔁 0    💬 1    📌 0

I think teams know it takes time and they struggle to keep up with it. Also, some teams are just unsure how to get started with this type of documentation.

02.10.2025 16:49 — 👍 2    🔁 0    💬 1    📌 0

The questions you ask are dependent on the data and the issues you run into. If you don't want to slow down a workflow, make sure you obtain all the documentation necessary to allow you to understand data lineage. Otherwise, be prepared to start asking those questions. :)

02.10.2025 16:36 — 👍 2    🔁 1    💬 1    📌 0

If you want to be a good data manager, you have to get really comfortable with asking a lot of questions. When something is unclear or doesn't seem right, you can't settle or make assumptions. That's how you end up with bad data. Stay curious.

02.10.2025 15:10 — 👍 45    🔁 7    💬 1    📌 1
A graphic showing the concept of mapply in R, with multiple input vectors being paired and processed by a function returning a single output vector.

Just published my new R article: 'Mapply: When You Need to Iterate Over Multiple Inputs'! 🚀 If `sapply` doesn't quite cut it for your multi-variable iterations, `mapply` is your friend. Learn to pair inputs beautifully. #RStats #Mapply
https://drmo.site/bhXeDb

02.10.2025 13:02 — 👍 29    🔁 8    💬 0    📌 0
