British Library: Our new Science web archive collection. “We have interpreted ‘science’ widely to include engineering and communications, but not IT, as that already has a collection. Our collection is arranged according to the standard disciplines such as biology, chemistry, engineering, earth sciences and physics, and then subdivided according to their common divisions, based on the treatment of science in the Universal Decimal Classification.”
MIT Technology Review: The race to save the first draft of coronavirus history from internet oblivion. “According to Brewster Kahle, the Internet Archive’s founder, his organization is already collecting about 1 billion URLs a day across the web. Archiving the pandemic means trying to identify and collect the pages their ordinary efforts might otherwise overlook, relying on a network of library professionals and members of the public: local and international public health pages, petitions, resources for medical professionals trying to fight covid-19, and accounts from those who have had the virus. It’s not easy. ‘The average life of a web page is only 100 days before it’s changed or deleted,’ he says.”
New York Times: Meet Your Meme Lords. “Future researchers can rest easy: Know Your Meme, Urban Dictionary, Creepypasta and Cute Overload have all been preserved by the Library of Congress. So has the band website for They Might Be Giants and the entire published output of The Toast, the humor site that shut down in 2016. And while the Library of Congress owns a rare print copy of the Gutenberg Bible, the web archive features the LOLCat Bible Translation Project, which rendered the bible in LOLspeak.”
British Library: 15 Years of the UK Web Archive – The Early Years. “Think back 15 years to the beginning of 2005. Future Prime Minister David Cameron wasn’t yet Leader of the Conservative party and Google Maps, Twitter and the iPhone all had yet to be launched. It was, however, the year that we started collecting copies of UK published websites for permanent preservation and access.”
Columbia University Libraries: Just Launched: Vaccination in Modern America: Misinformation vs. Public Health Advocacy Web Archive. “Developed by librarians within the Ivy Plus Libraries Confederation, the archive preserves webpages representing the current state of public discourse and contrasting approaches to authority on vaccination in the United States, with a focus on sites that are both pro- and anti-vaccination. The purpose of this collection is to capture potentially ephemeral information about vaccination that could be used by health service researchers, information scientists, sociologists, and others to understand the motivations, practices, and outcomes of health information and information on the web.”
Internet Archive: Archiving Information on the Novel Coronavirus (Covid-19). “The Internet Archive’s Archive-It service is collaborating with the International Internet Preservation Consortium’s (IIPC) Content Development Group (CDG) to archive web-published resources related to the ongoing Novel Coronavirus (Covid-19) outbreak. The IIPC Content Development Group consists of curators and professionals from dozens of libraries and archives from around the world that are preserving and providing access to the archived web.”
National Library of New Zealand: Is your Facebook account an archive of the future? “Over the next few months, we’re inviting New Zealanders to donate their Facebook archives to the Alexander Turnbull Library. In collecting personal Facebook archives of New Zealanders here and abroad, we will be continuing the work that has always been part of our mission: documenting the lives of New Zealanders today to support the emerging and anticipated research needs of the future.”
Ball State University: Ball State to archive websites of community organizations, local businesses. “Ball State University Libraries’ Archives and Special Collections has launched a community archiving initiative to preserve and make accessible the websites of area organizations and businesses. The Ball State University Libraries’ web archive creates snapshots of selected websites at regular time intervals to capture and preserve culturally and historically significant information relevant to the city and county. The archived webpages are fully searchable and accessible.”
Library of Congress: Breaking: A New “News” Archive! “A new digital collection, The General News on the Internet, is a free archive of online-only news sites collected from the web. The Library of Congress began preserving these sites in June 2014. How are these news-based sites captured? The Library uses a hybrid approach of weekly captures of the websites, augmented with twice-daily capture of known RSS feeds (Real Simple Syndication). This produces a more complete news archive. Given the dynamic nature of the 24-hour news cycle of today, these archives are meant to capture as much of the news distribution as possible given current limitations in technology and resources.”
UK Web Archive Blog: Collecting Interactive Fiction. “Works of interactive fiction are stories where the reader/player can guide or affect the narrative in some way. This can be through turning to a specific page as in ‘Choose Your Own Adventure’, or clicking a link or typing text in digital works.”
Library of Congress: The Library of Congress Web Archives: Dipping a Toe in a Lake of Data. “Over the last two decades, the Library of Congress Web Archiving Program has acquired and made available over 16,000 web archives, as part of more than 114 event and thematic collections. Each Web Archive is an aggregate of one or more websites, which in turn, are aggregates of many files presented together as a Web page in a browser; this aggregate of files are the images you see on the landing page of your favorite news (or gossip, no judging) site; they are the text that fills the articles; they are the bits of code that give you that clean, crisp modern layout. All of this together gives you a single web page. With an archive of over 1.7 petabytes of data in total, keeping track of every web object forming a website, which in turn form web archives, can be a bit like, well… herding cats.”
Lifehacker: Create Your Own Personal Archive of Web Pages With This Chrome Extension. “Websites change. Websites go out of business. This week I came across a new browser extension that makes saving those sites a little easier, WebSatchel. Obviously, not many people are going to have the specific use case that I do. That said, there are plenty of reasons to save a website, be it a story you enjoyed reading and might want to read again or even a recipe for something you’d like to try out later.”
Archive-It Blog: Announcing the “Pitch a Collection” Contest Winners. “We are excited to introduce the winners of our first ever Pitch a Collection contest! The selected collections are as diverse as our partners and will ensure the preservation of online content from a variety of under-represented subject areas.” Ooo, you had me at Interactive Fiction Web Archive.
Straits Times: NLB archiving Singapore websites, digital materials. “The National Library Board (NLB) has taken on the enormous task of archiving 180,000 Singapore websites ending with the .sg domain, as well as digital materials published in Singapore. The annually updated National Day Parade website and the Fighting SARS Together! website launched during the Severe Acute Respiratory Syndrome outbreak in 2003 are among the 2,000 websites that have been archived so far.”
Library of Congress: Science Blogs Web Archive. “This guest post is an interview with Lisa Massengale, Head of the Science Reference Section, with contributions by the Web Archive’s creator Jennifer Harbster, a Science Reference and Research Specialist for the Science, Technology and Business Division from Oct. 2001- Dec. 2015. Along with her reference duties for the Library’s Science Reference Service, she created Everyday Mysteries an online collection of fun and scientifically interesting questions and answers about everyday phenomena. Jennifer is the author of the Saving Science Blogs which provides additional information about the collection.”