Library of Congress: The Library of Congress Web Archives: Dipping a Toe in a Lake of Data. “Over the last two decades, the Library of Congress Web Archiving Program has acquired and made available over 16,000 web archives, as part of more than 114 event and thematic collections. Each Web Archive is an aggregate of one or more websites, which in turn, are aggregates of many files presented together as a Web page in a browser; this aggregate of files are the images you see on the landing page of your favorite news (or gossip, no judging) site; they are the text that fills the articles; they are the bits of code that give you that clean, crisp modern layout. All of this together gives you a single web page. With an archive of over 1.7 petabytes of data in total, keeping track of every web object forming a website, which in turn form web archives, can be a bit like, well… herding cats.”
Upcoming Webinar from the Digital Library of Georgia: Revealing Hidden Collections: The Our Story Digitization Project at the Atlanta University Center | The Mechanics- Part 2. “This session – part two in a series of three – will provide attendees with a deeper dive into the mechanics of implementing a complex project with multiple partners. Topics include writing the proposal, vendor selection, preparing collections for digitization, metadata creation, designing workflows and making the collections accessible. Speakers will focus on lessons learned and project management strategies that should be applicable to similar initiatives. The third webinar will focus on strategies for outreach, dissemination and incorporating content into curriculum.” The first webinar is available for viewing.
The Register: FYI: Twitter’s API still spews enough metadata to reveal exactly where you lived, worked. “Researchers have demonstrated yet again that location metadata from Twitter posts can be used to infer private information like users’ home addresses, workplaces, and sensitive locations they’ve visited.”
Entomology Today: How to Make Your Bug Info Invasive (on Social Media). “You’ve built a spiffy website about your insect research, or written a new extension bulletin about the latest pest. But for some reason, it looks weird on Facebook, and never shows up in Google Search. Why not? The reason may lie behind the scenes, in the code of your website. Fortunately, you don’t have to be a code-wonk to fix this problem! There are a few simple changes you can make to spiffy up your presence on the web.” There is nothing earth-shattering here, but in this quick article Dr. Gwen Pearson beautifully lays out why you should care about things like metadata. It’s one of those articles you send to friends with the note “When I said x, this is what I meant.”
SEO Roundtable: Google: Fill In Your Meta Descriptions Because You Know Your Content Best. “Google’s John Mueller said on Twitter that it is best not to leave your meta descriptions blank. Instead he said try to fill them out because you know your content best. This is despite a recent study from Yoast that shows that Google often uses your web site content, not your meta description, for your Google search result snippets.”
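For anyone who wants to check their own pages for blank descriptions, here is a minimal sketch using only Python's standard library. The class and function names (MetaTagCollector, missing_metadata) are made up for illustration, and the list of tags to check is an assumption: "description" is the standard meta description, while the Open Graph "og:" properties are the tags social platforms such as Facebook commonly read for link previews. This is not Google's or Yoast's method, just one way to spot empty or missing tags.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect <meta> name/property -> content pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so normalise to "".
        attr_map = {key: (value or "") for key, value in attrs}
        key = attr_map.get("name") or attr_map.get("property")
        if key:
            self.meta[key.lower()] = attr_map.get("content", "").strip()


def missing_metadata(html, wanted=("description", "og:title", "og:description")):
    """Return the wanted meta keys that are absent or have empty content.

    The default `wanted` tuple is an illustrative assumption, not a standard.
    """
    collector = MetaTagCollector()
    collector.feed(html)
    return [key for key in wanted if not collector.meta.get(key)]


# Hypothetical page: the description tag exists but was left blank.
sample = (
    '<html><head><title>Bug blog</title>'
    '<meta name="description" content="">'
    '<meta property="og:title" content="Bug blog">'
    '</head><body></body></html>'
)
print(missing_metadata(sample))  # ['description', 'og:description']
```

A blank description is reported the same way as a missing one, which matches the advice above: the tag only helps if you actually fill it in.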
Google Blog: Image rights metadata in Google Images. “As part of a collaboration between Google, photo industry consortium CEPIC, and IPTC, the global technical standards body for the news media, you can now access rights-related image metadata in Google Images.”
Wired: Twitter’s vast metadata haul is a privacy nightmare for users. “Metadata is everywhere. Everything you tweet, every picture you take, and every status update you post on Facebook. It’s used by police and security forces to identify people who try to hide their identities and locations, while associated metadata in selfies can inadvertently ensnare criminals unaware that the data can destroy their alibi. And metadata on Twitter can also be used in extremely precise identification [of] each and every one of us – according to a new paper by researchers at University College London and the Alan Turing Institute.”