Technical.ly: Volunteer data scrapers helped Philadelphia Lawyers for Social Equity preserve client court records

Technical.ly: Volunteer data scrapers helped Philadelphia Lawyers for Social Equity preserve client court records. “As the first state to implement the Clean Slate Law in 2018, Pennsylvania committed to sealing millions of criminal records. The law was enacted to remove educational and vocational disadvantages for people with eligible records, including those associated with certain misdemeanors and people found not guilty in court. While the law cleared barriers to housing, education and employment for individuals across the state, it indirectly created new technological barriers for Philadelphia Lawyers for Social Equity (PLSE).”

Towards Data Science: How to Scrape Tweets From Twitter

Towards Data Science: How to Scrape Tweets From Twitter. “This tutorial is meant to be a quick straightforward introduction to scraping tweets from Twitter in Python using Tweepy’s Twitter API or Dmitry Mottl’s GetOldTweets3. To provide direction for this tutorial I decided to focus on scraping through two avenues: scraping a specific user’s tweets and scraping tweets from a general text search.”

Hackaday: Think You Know cURL? Care To Prove It?

Hackaday: Think You Know cURL? Care To Prove It? “Do you happen to remember a browser-based game ‘You Can’t JavaScript Under Pressure’? It presented coding tasks of ever-increasing difficulty and challenged the player to complete them as quickly as possible. Inspired by that game, [Ben Cox] re-implemented it as You Can’t cURL Under Pressure!”

Make Tech Easier: How to Use a Data-Scraping Tool to Extract Data from Webpages

Make Tech Easier: How to Use a Data-Scraping Tool to Extract Data from Webpages. “If you’re copying and pasting things off webpages and manually putting them in spreadsheets, you either don’t know what data scraping (or web scraping) is, or you do know what it is but aren’t really keen on the idea of learning how to code just to save yourself a few hours of clicking. Either way, there are a lot of no-code data-scraping tools that can help you out, and Data Miner’s Chrome extension is one of the more intuitive options.”

BBC: Sham news sites make big bucks from fake views

BBC: Sham news sites make big bucks from fake views. “There are 350 million registered domain names on the internet. Experts say it’s impossible to count how many are sham news sites. But just like legitimate websites, they earn money from the major tech companies that pay them to display ads.”

Codementor: How to Extract Google Maps Coordinates

Codementor: How to Extract Google Maps Coordinates. “Have you ever thought you can make money by knowing how many restaurants there are in a square mile? There is no free lunch, however, if you know how to use Google Maps, you can extract and collect restaurant’s GPS and store them in your own database. With that information on hand and some math calculations, you are off to creating a big data online service. In this article, I will show you how to quickly extract Google Maps coordinates with a simple and easy method.”

MakeUseOf: The Scrapestack API Makes It Easy to Scrape Websites for Data

MakeUseOf: The Scrapestack API Makes It Easy to Scrape Websites for Data. “Finding it time-consuming to visit all your favorite websites and read everything that matters? One solution is a web scraper, a software tool that gathers information you need from other sites. We’re going to look at the scrapestack API, a web scraping service that you can subscribe to. Once set up, you can use scrapestack to grab whatever data you want from other sites.”

Ars Technica: Web scraping doesn’t violate anti-hacking law, appeals court rules

Ars Technica: Web scraping doesn’t violate anti-hacking law, appeals court rules. “Scraping a public website without the approval of the website’s owner isn’t a violation of the Computer Fraud and Abuse Act, an appeals court ruled on Monday. The ruling comes in a legal battle that pits Microsoft-owned LinkedIn against a small data-analytics company called hiQ Labs.”

SwissInfo: Study finds Big Data eliminates confidentiality in court judgements

SwissInfo: Study finds Big Data eliminates confidentiality in court judgements. “Swiss researchers have found that algorithms that mine large swaths of data can eliminate anonymity in federal court rulings. This could have major ramifications for transparency and privacy protection.”

Hongkiat: 10 Best Web Scraping Tools to Extract Online Data

Hongkiat: 10 Best Web Scraping Tools to Extract Online Data. “Web Scraping tools are specifically developed for extracting information from websites. They are also known as web harvesting tools or web data extraction tools. These tools are useful for anyone trying to collect some form of data from the Internet. Web Scraping is the new data entry technique that don’t require repetitive typing or copy-pasting.”

Techdirt: Court Says Scraping Websites And Creating Fake Profiles Can Be Protected By The First Amendment

Techdirt: Court Says Scraping Websites And Creating Fake Profiles Can Be Protected By The First Amendment. “It’s no secret that the Computer Fraud and Abuse Act (CFAA) is a mess. Originally written by a confused and panicked Congress in the wake of the 1980s movie War Games, it was supposed to be an ‘anti-hacking’ law, but was written so broadly that it has been used over and over again against any sort of ‘things that happen on a computer.’ It has been (not so jokingly) referred to as ‘the law that sticks,’ because when someone has done something “icky” using a computer, if no other law is found to be broken, someone can almost always find some weird way to interpret the CFAA to claim it’s been violated. The two most problematic parts of the CFAA are the fact that it applies to ‘unauthorized access’ or to ‘exceeding authorized access’ on any ‘computer… which is used in or affecting interstate or foreign commerce or communications.’ In 1986 that may have seemed limited. But, today, that means any computer on the internet. Which means basically any computer.”

Wolfram Blog: Web Scraping with the Wolfram Language, Part 1: Importing and Interpreting

Wolfram Blog: Web Scraping with the Wolfram Language, Part 1: Importing and Interpreting. “Do you want to do more with data available on the web? Meaningful data exploration requires computation—and the Wolfram Language is well suited to the tasks of acquiring and organizing data. I’ll walk through the process of importing information from a webpage into a Wolfram Notebook and extracting specific parts for basic computation.”

UpGuard: Dark Cloud: Inside The Pentagon’s Leaked Internet Surveillance Archive

UpGuard: Dark Cloud: Inside The Pentagon’s Leaked Internet Surveillance Archive. “While a cursory examination of the data reveals loose correlations of some of the scraped data to regional US security concerns, such as with posts concerning Iraqi and Pakistani politics, the apparently benign nature of the vast number of captured global posts, as well as the origination of many of them from within the US, raises serious concerns about the extent and legality of known Pentagon surveillance against US citizens. In addition, it remains unclear why and for what reasons the data was accumulated, presenting the overwhelming likelihood that the majority of posts captured originate from law-abiding civilians across the world.”

New York Times: U.S. Judge Says LinkedIn Cannot Block Startup From Public Profile Data

New York Times: U.S. Judge Says LinkedIn Cannot Block Startup From Public Profile Data. “A U.S. federal judge on Monday ruled that Microsoft Corp’s LinkedIn unit cannot prevent a startup from accessing public profile data, in a test of how much control a social media site can wield over information its users have deemed to be public. U.S. District Judge Edward Chen in San Francisco granted a preliminary injunction request brought by hiQ Labs, and ordered LinkedIn to remove within 24 hours any technology preventing hiQ from accessing public profiles.”