Search Engine Roundtable: Should Google Index The Entire Web & Not Cherry Pick Pages To Index?

Search Engine Roundtable: Should Google Index The Entire Web & Not Cherry Pick Pages To Index?. “For years and years Google has told us Google doesn’t index all the content and URLs they know about on the web. Not just because there is a directive telling them not to but because Google chooses not to index those pages because of various factors like PageRank, duplication, other quality signals. But a WebmasterWorld thread is asking, should they index the whole web?”

Breaking: A New “News” Archive! (Library of Congress)

Library of Congress: Breaking: A New “News” Archive!. “A new digital collection, The General News on the Internet, is a free archive of online-only news sites collected from the web. The Library of Congress began preserving these sites in June 2014. How are these news-based sites captured? The Library uses a hybrid approach of weekly captures of the websites, augmented with twice-daily capture of known RSS feeds (Really Simple Syndication). This produces a more complete news archive. Given the dynamic nature of the 24-hour news cycle of today, these archives are meant to capture as much of the news distribution as possible given current limitations in technology and resources.”
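For readers curious how the RSS half of a hybrid capture like this might work, here is a minimal sketch. It is not the Library's actual pipeline: the sample feed, the function name `new_item_links`, and the idea of tracking already-seen links in a set are all assumptions made for illustration. It parses an RSS 2.0 document and returns only the item links that have not been seen on an earlier poll.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, inlined so the sketch runs without network access.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item><title>Story one</title><link>https://example.com/1</link></item>
    <item><title>Story two</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""


def new_item_links(feed_xml, seen):
    """Return item links in feed_xml that are not yet in `seen`.

    `seen` is a set that is updated in place, so polling the same feed
    again only yields links that appeared since the previous poll.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link not in seen:
            seen.add(link)
            fresh.append(link)
    return fresh


if __name__ == "__main__":
    seen_links = set()
    # First poll: every item is new.
    print(new_item_links(SAMPLE_FEED, seen_links))
    # Second poll of the unchanged feed: nothing new to capture.
    print(new_item_links(SAMPLE_FEED, seen_links))
```

Polling a feed this way a couple of times a day would catch stories that appear and disappear between weekly site crawls, which is presumably why the two approaches are combined.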

The World Wide Web Turns 30: Our Favorite Memories From A To Z (The Verge)

The Verge: The World Wide Web Turns 30: Our Favorite Memories From A To Z. “Over the past 30 years, major portions of the web have come and gone. They’ve made us laugh and cringe, let us waste time and find friends, and reshaped the world in the process. For its anniversary, we’re looking back at some of our favorite websites, from A to Z, as well as some key people and technologies. Of course, there was far too much good stuff to include, so we had to note some additional favorites along the way.”

The Verge: Tim Berners-Lee says we can still save the web

The Verge: Tim Berners-Lee says we can still save the web. “The World Wide Web is 30 years old tomorrow. Its founder, English engineer and computer scientist Tim Berners-Lee, first proposed the system that would become the WWW on March 12th, 1989. To acknowledge the anniversary, he’s revisited his ideas about the internet in a new letter published today.”

Why does that website take forever to load? Clues: Three syllables, starts with a J, rhymes with crock of sh… (The Register)

The Register: Why does that website take forever to load? Clues: Three syllables, starts with a J, rhymes with crock of sh…. “If the web seems slow, blame third-party advertising and analytics scripts. Many internet users have already come to that conclusion but Patrick Hulce, founder of Dallas, Texas-based Eris Ventures and a former Google engineer, has assembled data that clarifies the impact of third-party scripts in the hope it prompts more efficient coding.”

MakeUseOf: What Is Web Scraping? How to Collect Data From Websites

MakeUseOf: What Is Web Scraping? How to Collect Data From Websites. “Think of a type of data and you can probably collect it by scraping the web. Real estate listings, sports data, email addresses of businesses in your area, and even the lyrics from your favorite artist can all be sought out and saved by writing a small script.” This article has a couple of good examples, but it’s mostly an overview. (This is not meant as a criticism; it’s an incredibly broad topic that nobody could cover in one article!)
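To give a feel for what “a small script” can look like, here is a minimal sketch using only Python's standard library. It is not taken from the MakeUseOf article: the sample HTML, the `LinkCollector` class, and the `scrape_links` helper are hypothetical names invented for illustration. It pulls the text and target of every link out of a page.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs for every <a href="..."> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def scrape_links(html):
    """Parse an HTML string and return its links as (text, href) tuples."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page fragment standing in for a downloaded listings page.
SAMPLE_HTML = """
<ul>
  <li><a href="/listing/1">Two-bedroom flat</a></li>
  <li><a href="/listing/2">Studio <b>downtown</b></a></li>
</ul>
"""

if __name__ == "__main__":
    for text, href in scrape_links(SAMPLE_HTML):
        print(f"{text} -> {href}")
```

A real scraper would first download the page (for example with `urllib.request`) and should respect the site's terms of use and robots.txt; the parsing step stays much the same.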