Thanks to @dallascard.bsky.social and @davidjurgens.bsky.social for their help on this project! We also received great feedback from members of the Blablablab and CLC lab
14.11.2024 22:36
@blitt.bsky.social
PhD student interested in computational approaches to language, politics, and media
Iowa | Michigan
Interested in working with SPoRC? Our data, paper, and code for creating data and doing the analysis are freely available!
data: huggingface.co/datasets/bli...
paper: arxiv.org/abs/2411.07892
processing code: github.com/blitt2018/SP...
analysis code: github.com/blitt2018/SP...
We're excited for people to use this data to explore the dynamics of long-form conversation, linguistic style matching, diffusion of information, understanding power and prestige within the podcast ecosystem, and more!
14.11.2024 22:36
What about the audio aspect of podcasts? We provide speaker turn information, along with audio features that capture this information, such as pitch, allowing future research to consider elements like emotion, humor, or sarcasm
14.11.2024 22:36
...Discussion of George Floyd was widespread across categories, with 21% of podcasts saying his name in at least one of their episodes in our time period. Furthermore, discussion of racial justice peaked quickly around George Floyd but transitioned to a longer-lasting focus on Black Lives Matter
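A rough illustration of how a "share of podcasts mentioning a name in at least one episode" figure can be computed. This is only a sketch under stated assumptions, not the analysis code linked in the thread: the function name, the `(podcast, transcript)` input format, and the case-insensitive substring match are all hypothetical choices, and the episodes are made-up toy data.

```python
def share_of_podcasts_mentioning(episodes, phrase):
    """Fraction of podcasts with at least one episode containing `phrase`.

    `episodes` is an iterable of (podcast, transcript) pairs.
    Assumption: a plain case-insensitive substring match counts as a mention.
    """
    phrase = phrase.lower()
    all_podcasts = set()
    mentioning = set()
    for podcast, transcript in episodes:
        all_podcasts.add(podcast)
        if phrase in transcript.lower():
            mentioning.add(podcast)
    if not all_podcasts:
        return 0.0
    return len(mentioning) / len(all_podcasts)


# Toy, made-up episodes for illustration only.
toy_episodes = [
    ("Show A", "Today we talk about George Floyd and the protests."),
    ("Show A", "A recap of last week's game."),
    ("Show B", "Market update and earnings."),
    ("Show C", "Remembering george floyd."),
    ("Show D", "Gardening tips for June."),
]
share = share_of_podcasts_mentioning(toy_episodes, "George Floyd")
print(share)
```

Counting at the podcast level rather than the episode level means a show with many episodes counts once, however often the name comes up.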
14.11.2024 22:36
How does the podcast ecosystem react to major events? As a case study, we consider collective attention in the podcast ecosystem following the murder of George Floyd in 2020...
14.11.2024 22:36
A network figure where podcasts are connected by edges if they have hosted the same guest. Color is assigned based on self-ascribed podcast category labels. Layout is determined with the force-directed Yifan-Hu algorithm. Podcasts in the same category appear closer.
How do the creators of podcast content exchange ideas and form communities? We find that the Business, Sports, and News categories form communities through shared guests, whereas other large categories such as Religion and Society do not
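The shared-guest network described above (podcasts linked when they have hosted the same guest) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name, the `(podcast, guest)` input format, and the idea of weighting an edge by the number of shared guests are assumptions, and the data is made up.

```python
from collections import defaultdict
from itertools import combinations


def shared_guest_edges(appearances):
    """Return {(podcast_a, podcast_b): number_of_shared_guests}.

    `appearances` is an iterable of (podcast, guest) pairs.
    """
    # Group podcasts by the guest they hosted.
    podcasts_by_guest = defaultdict(set)
    for podcast, guest in appearances:
        podcasts_by_guest[guest].add(podcast)

    edges = defaultdict(int)
    for podcasts in podcasts_by_guest.values():
        # Every pair of podcasts that hosted this guest gets an edge.
        for a, b in combinations(sorted(podcasts), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Toy, made-up appearances for illustration only.
toy = [
    ("Show A", "Guest 1"),
    ("Show B", "Guest 1"),
    ("Show B", "Guest 2"),
    ("Show C", "Guest 2"),
    ("Show A", "Guest 2"),
    ("Show D", "Guest 3"),
]
edges = shared_guest_edges(toy)
print(edges)
```

The resulting edge list could then be handed to a graph tool for a force-directed layout; a podcast whose guests appear nowhere else (here "Show D") simply has no edges.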
14.11.2024 22:36
A figure where podcast episodes are projected such that distance indicates topical similarity. Color is assigned based on the self-ascribed podcast category label.
Podcasts have categories, but how similar are podcasts within categories in terms of what they talk about? In our content analysis, we find it's mixed! Some topics belong to distinct categories, but other topics like "racial justice" or "spirituality" cut across many categories!
14.11.2024 22:36
SPoRC covers nearly all English episodes during May-June 2020, with transcripts + host/guest inferences for over 1M episodes, and audio features + speaker turns for over 370K episodes. Using this data, we study the content, structure, and responsiveness of the podcast ecosystem
14.11.2024 22:36
Podcasts are a popular medium, but data for computational research is limited! We introduce the Structured Podcast Research Corpus (SPoRC - huggingface.co/datasets/bli...), a large, multimodal dataset of English podcasts 🧵
arxiv.org/abs/2411.07892