drewdevault.com

[mirror] blog and personal website of Drew DeVault git clone https://hacktivis.me/git/mirror/drewdevault.com.git

Archive-it-or-miss-it.md (2210B)


---
date: 2017-06-19
layout: post
title: Archive it or you will miss it
tags: [linkrot]
---

Let's open with some quotes from the [Wikipedia article on link
rot](https://en.wikipedia.org/wiki/Link_rot):

>In 2014, bookmarking site Pinboard's owner Maciej Cegłowski reported a “pretty
>steady rate” of 5% link rot per year... approximately 50% of the URLs in
>U.S. Supreme Court opinions no longer link to the original information...
>(analysis of) more than 180,000 links from references in... three major open
>access publishers... found that overall 24.5% of links cited were no longer
>available.

I hate link rot. It has always been common: servers disappear and domains
expire, in the past and still today. But link rot is now on the rise under the
influence of more sinister factors. Abuse of the DMCA. Region locking. Paywalls.
Maybe it just no longer serves the interests of a walled garden to host the
content. Maybe the walled garden went out of business. Users rely on platforms
to host content, and links rot by the millions when the platforms die. Movies
disappear from Netflix. Music vanishes from Spotify. Accounts are banned from
SoundCloud. YouTube channels are banned over false DMCA requests issued by
robots.

At this point, link rot is an axiom of the internet. In the face of this, I
keep a personal offline archive of *anything* I want to see twice. When I see a
cool YouTube video, I archive the entire channel right away. Rather than
subscribe to it, I update my archive with a cron job. I scrape content out of
RSS feeds and into offline storage, and I have dozens of websites archived with
wget. I mirror most git repositories I'm interested in. I have DRM-free offline
copies of all of my music, TV shows, and movies, ill-gotten or not.
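
As a rough illustration of the git mirroring habit, here is a minimal sketch. The `~/archives/git` layout and the function names `mirror_name` and `mirror_repo` are assumptions made for this example, not a description of the actual setup behind this post.

```shell
#!/bin/sh
# Hypothetical layout: one bare mirror per repository under $ARCHIVE_DIR/git.
ARCHIVE_DIR="${ARCHIVE_DIR:-$HOME/archives}"

# Derive a local directory name from a clone URL, e.g.
# https://example.org/foo/bar.git -> bar.git
mirror_name() {
    name="${1%/}"        # drop a trailing slash, if any
    name="${name##*/}"   # keep only the last path component
    case "$name" in
        *.git) ;;                 # already ends in .git
        *) name="$name.git" ;;    # otherwise append it
    esac
    printf '%s\n' "$name"
}

# Clone the repository as a mirror the first time, refresh it afterwards.
mirror_repo() {
    url="$1"
    dest="$ARCHIVE_DIR/git/$(mirror_name "$url")"
    if [ -d "$dest" ]; then
        # Fetch all refs and drop the ones deleted upstream.
        git -C "$dest" remote update --prune
    else
        mkdir -p "$ARCHIVE_DIR/git"
        git clone --mirror "$url" "$dest"
    fi
}
```

Calling `mirror_repo` with a clone URL creates the mirror on the first run and only fetches new objects on later runs, which makes it cheap to repeat from cron. Note that `--prune` also removes branches that vanish upstream; leave it out if the goal is to keep everything that was ever fetched.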

I suggest you do the same. It's sad that it's come to this. Let's all do
ourselves a favor. Don't build unsustainable platforms and ask users to trust
you with their data. Pay for your domain. Give people DRM-free downloads. Don't
cripple your software when it can't call home. If you run a website, let
archive.org scrape it.

And archive anything you want to see again.

```
0 0 * * 0 cd ~/archives && wget -m https://drewdevault.com
```
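
The crontab line above mirrors a single site. For dozens of sites, one possible approach is to keep a list of URLs and loop over it. This is only a sketch: the list file, the `mirror_sites` function, the `DRY_RUN` switch, and the `~/archives/www` directory are all assumptions for the example, and the extra wget options (page requisites, link conversion, a one second wait between requests) are a matter of taste.

```shell
#!/bin/sh
# Assumed input: a text file with one URL per line; '#' starts a comment line.
ARCHIVE_DIR="${ARCHIVE_DIR:-$HOME/archives}"

# Mirror every URL listed in the given file into $ARCHIVE_DIR/www.
# Set DRY_RUN=1 to print the wget commands instead of running them.
mirror_sites() {
    list="$1"
    mkdir -p "$ARCHIVE_DIR/www"
    while IFS= read -r url; do
        case "$url" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        # Build the command once so the dry run prints exactly what would run.
        set -- wget --mirror --page-requisites --convert-links \
            --adjust-extension --no-parent --wait=1 \
            --directory-prefix="$ARCHIVE_DIR/www" "$url"
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf '%s\n' "$*"
        else
            # Report a failed site but keep going with the rest of the list.
            "$@" || printf 'mirror failed: %s\n' "$url" >&2
        fi
    done < "$list"
}
```

Running it with `DRY_RUN=1` first is a cheap way to check the list before a weekly cron job starts pulling down whole websites.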