Happy birthday, #Wikidata!
L1349 P569 Q2013 :P
@wikidatacommunity.bsky.social #WikidataBirthday
@nintendofan885.bsky.social
Also https://mastodon.social/@Nintendofan885 Pfp: fennec fox at a zoo Banner: Towers in Scunthorpe
heh, web.archive.org/details/tld:... got updated with data up to May (previously didn't include the 2025 bar)
Captures: 2,204,810,318
URLs: 1,725,452,488
New URLs: 1,506,233,721
I feel like the Archive Team project may have helped a bit :P
From the first web page created in 1991... to 1 trillion web pages archived today.
Every meme, blog, tweet & vanished site is part of our shared story. This is our collective memory. And it's being saved.
Join in our celebration this October: blog.archive.org/trillion/
#Wayback1T #WaybackMachine
now over a petabyte :)
20.06.2025 16:21
nice :)
FYI looks like you accidentally made a typo with the year
The BBC visited the Internet Archive to discuss archival efforts and public access!
www.youtube.com/watch?v=jh98...
We are wrapping up seed list nominations for #EOT2024.
Going forward, you can submit government URL nominations using a similar tool: digital2.library.unt.edu/nomination/G...
Crawls of post-EOT seeds won't be part of the #EOTArchive, but will appear in the @archive.org Wayback Machine
FYI you linked to the nomination tool for the 2016 one
The correct link is digital2.library.unt.edu/nomination/G...
Ever wanted to restore a taken down site from a web archive, with link navigation and references intact?
Our latest tooling makes this simpler than ever!
Announcing govarchive.us - a dynamic, web archive-powered mirror of US government sites!
More details on our blog
webrecorder.net/blog/2025-03...
The Trump administration's erasure of federal data has put the Internet Archive in the spotlight. The organization, with its small but mighty team, is working to help save the world's digital history.
24.03.2025 23:49
ah, just realised that [something]-mil.govarchive.us works for .mil domains
25.03.2025 19:28
nice :)
What about domains not on .gov? (e.g. military sites)
The Trump administration's erasure of federal data has put the Internet Archive in the spotlight. The organization, with its small but mighty team, is working to help save the world's digital history.
23.03.2025 14:02
🗽 Give us your hidden, your overlooked,
Your orphaned Gov URLs yearning to be preserved,
The forgotten databases of your civic shore.
Send these, the neglected, the soon-to-vanish, to us,
We lift our crawler beside the open web. 🗽
#EOTArchive #EOT2024
Screenshot of Webrecorder's public collection gallery of US government sites web archive on Browsertrix
We're excited to share the first batch of US Government websites that Webrecorder has archived as part of the
@eotarchive.org initiative. They're now available on our public collections gallery app.browsertrix.com/explore/usgo...
#WebArchiving #Browsertrix #EOTarchive
FYI the URL is now web.archive.org/collection-s... after the collection was fixed
17.02.2025 17:32
FYI the URL is now web.archive.org/collection-s... after the collection was fixed
17.02.2025 17:32
USgovernment tracker items 1.00B done + 33.78M out + 0 to do data 539.17 TiB 578.92 KiB/u
tracker.archiveteam.org/usgovernment/ just hit 1 billion URLs! (~592 TB)
17.02.2025 15:55
here's ~450 YouTube channels I found for EOT (will include a few duplicates) if it's useful.
digital2.library.unt.edu/nomination/e...
since EOT also saves videos: archive.org/details/EndO...
sorry, I mean in relation to the Archive Team project
09.02.2025 20:55
BTW did ScienceBase URLs ever get run? (since I remember mentioning in the IRC on the first day)
09.02.2025 20:50
thanks whoever replied to my email about it :)
08.02.2025 19:00
We just launched a 16TB archive of every dataset that has been available on data.gov since November. This will be updated day by day as new datasets appear. It can be freely copied, and we're sharing the code behind it to help others make their own archives of data they depend on.
06.02.2025 21:23
Penn is getting a lot of questions about Data Refuge. That effort no longer exists, but several efforts are currently active. I've created a doc from what I & others have suggested. I'll update as I hear more. Feel free to share or suggest: docs.google.com/document/d/1...
03.02.2025 16:13
It's still ongoing. The first crawl ran from September to the inauguration and the second crawl (post-inauguration) started on the 1st and continues until about April
04.02.2025 00:40
huh, looks like they've fixed both of them now after I emailed
03.02.2025 13:02
Links so they're clickable:
Wayback Machine: web.archive.org
End of Term Web Archive: eotarchive.org
Environmental Data and Governance Initiative: envirodatagov.org
The Data Liberation Project: www.data-liberation-project.org
Maggie Koerth: bsky.app/profile/magg...
Last year we started a project to download and preserve public data. lil.law.harvard.edu/blog/2025/01... Since saving public data is in the news today, but is always needed, let's talk about what you can do to help.
31.01.2025 20:59
looks like the #EOTArchive blog post has been shared quite a bit
01.02.2025 20:00