
Johannes B. Gruber

@jbgruber.bsky.social

Senior Researcher @gesis.org // Data Editor @polcommjournal.bsky.social 🔎 political communication (#polsky + #commsky) with text analysis and #rstats (#opendata + #openscience) 🌐 JohannesBGruber.eu 👨‍💻 research software github.com/JBGruber

1,261 Followers  |  898 Following  |  391 Posts  |  Joined: 21.09.2023

Latest posts by jbgruber.bsky.social on Bluesky

It works pretty well in my limited testing. But streaming is a bit weird in R and the event objects are a bit strange to wrangle into a table. So there are still some things to do!

03.08.2025 15:01 | 👍 2    🔁 0    💬 0    📌 0

I love this #rstats package!! Go test it out!

03.08.2025 14:32 | 👍 14    🔁 3    💬 0    📌 0
support firehose · Issue #23 · JBGruber/atrrr See: https://docs.bsky.app/docs/advanced-guides/firehose I tried it at some point and the socket part works fine. However, decoding the CBOR event data is a bigger challenge. The Python package use...

The next version of the #rstats 📦 {atrrr} will likely contain a function to stream from the Bluesky Firehose! Looking for people to test and comment now:

github.com/JBGruber/atr...

03.08.2025 12:40 | 👍 16    🔁 7    💬 2    📌 1
Postdoctoral Researcher for Platform Data and Computational Social Science at GESIS 100% TV-L 14, 4 years with possible tenure, starting on October 1, 2025 our Departments Computational Social Science (CSS) and Data Services for the Social Sciences (DSS) located in Cologne

New job ad @gesis.org:

👉 Research on Platform Data and #CSS
👉 Coordinate access to online platform data (#DSA)
👉 100% TV-L 14, 4 years with possible tenure
❗ good German language abilities

No deadline, apply soon 👇!

01.08.2025 13:41 | 👍 7    🔁 5    💬 0    📌 0

"Showing the important things you missed. Posts and replies from people you follow highlighting posts that got a lot of traction since you were last online. No retweets."

(Not sure if this one is still necessary since "Popular With Friends" is there by default)

bsky.app/profile/did:...

30.07.2025 14:02 | 👍 3    🔁 0    💬 1    📌 0

"See posts from your mutual follows"

bsky.app/profile/did:...

30.07.2025 14:02 | 👍 2    🔁 0    💬 1    📌 0

A bit of an oldschool one that just collects all posts with #rstats (created before hashtags were properly implemented here)

bsky.app/profile/did:...

30.07.2025 14:02 | 👍 1    🔁 0    💬 1    📌 0

"TADAsky is a cross-disciplinary feed on text-as-data, natural language processing and computational social science research and discussions."

bsky.app/profile/did:...

30.07.2025 14:02 | 👍 2    🔁 0    💬 1    📌 0

"A feed for communication scholars to have focused discussion and enjoy serendipitous discovery of each other's work."

bsky.app/profile/did:...

30.07.2025 14:02 | 👍 2    🔁 0    💬 1    📌 0

I talked to some people recently who said they got bored or overwhelmed by Bluesky after a while. Turns out they didn't know about alternative feeds yet and were scrolling the chronological one.

If you like that, great, but here are some alternative feeds I like.

30.07.2025 13:57 | 👍 9    🔁 4    💬 1    📌 0

Ah yes! Sorry I posted the other one on the same day and grabbed the wrong link...

23.07.2025 08:55 | 👍 0    🔁 0    💬 0    📌 0

They implemented something wrong in their API. Whether this is an accident or on purpose, I do not know

23.07.2025 07:54 | 👍 1    🔁 0    💬 1    📌 0
Collecting TikTok Data Getting TikTok data (<https://tiktok.com/>) through the official and unofficial APIs; in other words, you can track TikTok. Originally a port of Deen Freelon's Pyktok (<https://github.com/dfreelon/pykt...

FYI: the comparison was done with #rstats {traktok}.

Link to package: jbgruber.github.io/traktok/
Link to tool paper: osf.io/preprints/so...

23.07.2025 07:51 | 👍 6    🔁 1    💬 1    📌 0

This should yield the same results and the IN method saves you some API requests. But in reality, I would have missed 80% of the videos going with the IN method.

23.07.2025 07:51 | 👍 1    🔁 0    💬 1    📌 0

I then tried two different approaches: 1. iterating through the accounts and searching for videos where the creator EQuals each account; 2. searching for videos where the creator is IN the list of party accounts

23.07.2025 07:51 | 👍 1    🔁 0    💬 1    📌 0
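
The two search approaches from the post above can be sketched as request bodies. This is a minimal Python illustration, not the {traktok} code used for the comparison. It assumes the TikTok Research API's video query format (a `query` object with `and` conditions made of `operation`, `field_name` and `field_values`, plus `start_date` and `end_date`). The account names, dates, and helper functions are hypothetical placeholders, and nothing is sent over the network.

```python
import json


def build_body(condition, start_date, end_date):
    """Wrap one condition in a request body (assumed Research API shape)."""
    return {
        "query": {"and": [condition]},
        "start_date": start_date,  # assumed format: YYYYMMDD
        "end_date": end_date,
    }


def eq_bodies_for(accounts, start_date, end_date):
    """Approach 1: one request per account, creator EQuals that account."""
    return [
        build_body(
            {"operation": "EQ", "field_name": "username", "field_values": [name]},
            start_date,
            end_date,
        )
        for name in accounts
    ]


def in_body_for(accounts, start_date, end_date):
    """Approach 2: a single request, creator is IN the list of accounts."""
    return build_body(
        {"operation": "IN", "field_name": "username", "field_values": list(accounts)},
        start_date,
        end_date,
    )


# Hypothetical account names and date window, for illustration only.
accounts = ["party_a", "party_b", "candidate_c"]
eq_bodies = eq_bodies_for(accounts, "20250101", "20250131")
in_body = in_body_for(accounts, "20250101", "20250131")

print(f"{len(eq_bodies)} EQ requests vs. 1 IN request")
print(json.dumps(in_body, indent=2))
```

On paper both approaches describe the same set of videos, and the IN variant needs fewer requests. The thread reports that the results nevertheless differed, so comparing the video IDs returned by each approach is a cheap sanity check.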

I wanted to see which videos the German parties/top candidates had posted between the announcement of the federal election and the election. These are the accounts I found in a 5-minute search.

23.07.2025 07:51 | 👍 1    🔁 0    💬 1    📌 0

Pretty wild that depending on how you search the #tiktok research API, you get wildly different results

23.07.2025 07:43 | 👍 28    🔁 10    💬 3    📌 0

QTA-DUB2? 😃

28.06.2025 07:56 | 👍 2    🔁 0    💬 0    📌 0

Then, once you scrape that news, make it available to others for "non-consumptive research" - osf.io/gz3xf_v1 @jbgruber.bsky.social @vanatteveldt.com #ica25

26.06.2025 20:25 | 👍 4    🔁 1    💬 0    📌 0

Very impressive work to build a modular open-source infrastructure for news scraping by @jbgruber.bsky.social that should be adapted and built upon by anyone scraping news!

26.06.2025 20:21 | 👍 4    🔁 1    💬 0    📌 1

Want to easily scrape data from TikTok?

There's an R package for that!

traktok

"While it is neither the first nor only tool to do so ... the [package provides] ... an easy-to-understand consistent syntax [meant] to encourage TikTok research"

osf.io/preprints/so...

26.06.2025 13:41 | 👍 72    🔁 22    💬 3    📌 0

Thanks for the share, I didn't even notice it got past moderation already!

26.06.2025 17:12 | 👍 4    🔁 0    💬 0    📌 0

Can I buy a t-shirt with this post somewhere 😂

26.06.2025 17:08 | 👍 2    🔁 0    💬 0    📌 0

Want to easily scrape data from news media sites?

There's an R package for that!

paperboy

"paperboy offers writers of web scrap[ers] a clear path to publish their code & earn co-authorship on the package, while deliver[ing] news media data from many websites in a consistent format."

26.06.2025 13:46 | 👍 121    🔁 35    💬 6    📌 2

I don't think anyone ICAed harder than @damiantrilling.net, at least judging by his badge. #ica25

16.06.2025 22:44 | 👍 11    🔁 0    💬 0    📌 0

I would summarize this great last presentation of #ica25, but @camilambpp.bsky.social has already done it better 👇

16.06.2025 22:06 | 👍 8    🔁 2    💬 1    📌 0

Always worth sticking around until the end of #ica25: @profvaccari.bsky.social presenting an insightful study on whether people trust misinformation on WhatsApp simply because someone put a BBC logo on it (they do).

16.06.2025 21:29 | 👍 14    🔁 1    💬 0    📌 0

Some people have left #ica25 already, but great research is still being presented, like that of @gongbaobao.bsky.social, who shows this co-consumption network of German media

16.06.2025 20:32 | 👍 11    🔁 1    💬 0    📌 0

This was also a theme in many @icamobile.bsky.social panels... working with existing datasets to run secondary analyses to answer new questions.

16.06.2025 17:19 | 👍 14    🔁 1    💬 0    📌 0

@jbgruber.bsky.social & @vanatteveldt.com argue that as data become more difficult to collect (no APIs, scraping more difficult) we need to share existing data more. And how non-consumptive research is a way to do that without going to jail! #ica25

16.06.2025 17:07 | 👍 32    🔁 5    💬 1    📌 2

@jbgruber is following 20 prominent accounts