Nature: A large dataset of scientific text reuse in Open-Access publications. “We present the Webis-STEREO-21 dataset, a massive collection of Scientific Text Reuse in Open-access publications. It contains 91 million cases of reused text passages found in 4.2 million unique open-access publications. Cases range from overlap of as few as eight words to near-duplicate publications and include a variety of reuse types, ranging from boilerplate text to verbatim copying to quotations and paraphrases.”
Tag Archives: text
MakeUseOf: The 4 Best Pastebin Alternatives for Sharing Code and Text
MakeUseOf: The 4 Best Pastebin Alternatives for Sharing Code and Text. “The aptly named Pastebin.com was the first text storage website of its kind. It’s used for easily storing and sharing snippets of code or text with other people online. But if you don’t care for it, you’ll find plenty of alternatives to Pastebin on the web. Let’s look at the best Pastebin alternatives you can use for storing text and code. We’ll examine their best features and why they’re worth using over the well-known service.”
WIRED: How to Extract the Text From Any Image
WIRED: How to Extract the Text From Any Image. “There are plenty of reasons why you might want to pull the text out of an image you find online: instructions on a YouTube still, for example, or items on a printed menu, or inspirational quotes in your Instagram feed. Whatever the reason, there are text extraction tools that will do the job of recognizing and copying the words inside those images for you. As image identification techniques improve, these tools are getting better and better at accurately converting text in an image into usable, editable text.”
Greycoder: A List Of Text-Only News Sites (Updated 2022)
Greycoder: A List Of Text-Only News Sites (Updated 2022). “Text-only websites are quite useful, especially today. Web pages are increasingly filled with ads, videos, and bandwidth-heavy content. Here is a list of text-only, clutter-free news sites.”
New York Times: A.I. Is Mastering Language. Should We Trust What It Says?
New York Times: A.I. Is Mastering Language. Should We Trust What It Says? “GPT-3 belongs to a category of deep learning known as a large language model, a complex neural net that has been trained on a titanic data set of text: in GPT-3’s case, roughly 700 gigabytes of data drawn from across the web, including Wikipedia, supplemented with a large collection of text from digitized books. GPT-3 is the most celebrated of the large language models, and the most publicly available, but Google, Meta (formerly known as Facebook) and DeepMind have all developed their own L.L.M.s in recent years.”
Boing Boing: This website turns text into music
Boing Boing: This website turns text into music. “Typatone assigns musical tones to letters. Start typing (or paste in any text) to hear original music.” I tried it. The music I made reminded me of some of those trippy Sesame Street animations from the early 70s.
UX Collective: The power of seeing only the questions in a piece of writing
UX Collective: The power of seeing only the questions in a piece of writing. “I’ve been watching how writers use questions lately, and thought: Hmmm, it’d be cool to see only the questions in a piece of prose. I probably started down this line of thinking because last fall I created a little web tool that removes everything but the punctuation from a piece of writing. That tool wound up being a pretty intriguing type of literary x-ray: I discovered, for example, that I use a ton of parentheticals (and way too many m-dashes). Since I already had the code for that, it wasn’t too hard for me to program a version that focuses on questions instead.”
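For the curious, here is a rough idea of how the two filters described in that excerpt might work. This is only a sketch and not the author's actual code: the function names `only_questions` and `only_punctuation` are made up for illustration, and the sentence splitting is a naive regular expression that will stumble on abbreviations such as "Dr." or on quoted dialogue.

```python
import re
import unicodedata


def only_questions(text: str) -> list[str]:
    """Return the sentences in `text` that end with a question mark."""
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if s.endswith("?")]


def only_punctuation(text: str) -> str:
    """Keep only the characters Unicode classifies as punctuation."""
    return "".join(ch for ch in text if unicodedata.category(ch).startswith("P"))


if __name__ == "__main__":
    sample = "Is this it? Yes. Why not? Fine!"
    print(only_questions(sample))
    print(only_punctuation("Hello, world! (ok)"))
```

Using Unicode categories rather than a fixed list of ASCII symbols means dashes and curly quotes are kept too, which matters for the kind of "literary x-ray" the excerpt describes.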
Make Tech Easier: 6 of the Best Online Summarizer Tools to Shorten Text
Make Tech Easier: 6 of the Best Online Summarizer Tools to Shorten Text. “Using these nifty online tools, you can copy-paste text or URLs into a box, set your parameters for just how heavily summarized you want it to be, then click a big button to get the low-down on a given article in just a few sentences. Here are our favorite tools for this purpose.”
Analytics India: Google Releases Wikipedia-Based Image Text (WIT) Dataset
Analytics India: Google Releases Wikipedia-Based Image Text (WIT) Dataset. “Google recently released a Wikipedia-Based Image Text (WIT) dataset, a large multimodal dataset created by extracting various text selections associated with an image from Wikimedia image links and articles. It was conducted by rigorous filtering to retain high-quality image-text sets.”
New York Times: Text Memes Are Taking Over Instagram
New York Times: Text Memes Are Taking Over Instagram. “Known in internet slang as shitposting, this style of posting involves people publishing low-quality images, videos or comments online. On Instagram, this means barraging people’s feeds with seemingly indiscriminate content, often accompanied by humorous or confessional commentary. A growing ecosystem of Instagram accounts has embraced this text-heavy posting style, which has exploded in popularity among Gen Z users during the pandemic.”
FedTech: History of Lorem Ipsum
FedTech: History of Lorem Ipsum. “Have you ever seen the term Lorem Ipsum on a new website? Perhaps you have even tried entering it on Google Translate, but no sensible results came through. Most people who see it the first time think they are in the wrong address only to refresh and come back to the same page. But, what is this mysterious text that you see on pages?”
The Verge: Facebook is making it easier to export text posts
The Verge: Facebook is making it easier to export text posts. “Facebook is rolling out a new feature today, allowing users across the globe to have the option to archive their posts and notes created on the social media site and transfer a copy of that data onto Google Docs, WordPress, or Blogger.”
The Verge: Google introducing a feature in Chrome 90 to create links to highlighted text on a webpage
The Verge: Google introducing a feature in Chrome 90 to create links to highlighted text on a webpage. “An upcoming feature in Chrome 90 will allow users to create a link to a section of a website that they’ve highlighted. First launched as a browser extension called Link to Text Fragment last year, Google has now added the feature within Chrome itself.”
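The excerpt does not say what such a link looks like. As far as I know, these links use the text fragment syntax, in which the highlighted passage is appended to the page address after `#:~:text=`. A minimal sketch of building one by hand follows; the helper name `text_fragment_url` and the example address are made up for illustration, and this covers only the simplest form (a single exact passage, no prefix, suffix, or range).

```python
from urllib.parse import quote


def text_fragment_url(page_url: str, highlighted: str) -> str:
    """Build a link that asks the browser to scroll to `highlighted`."""
    # Percent-encode everything that is not an unreserved character.
    # The dash is also encoded because it has a special meaning in
    # the fragment syntax (it marks prefix and suffix terms).
    encoded = quote(highlighted, safe="").replace("-", "%2D")
    # Drop any fragment the page address already carries.
    base = page_url.split("#", 1)[0]
    return f"{base}#:~:text={encoded}"


if __name__ == "__main__":
    print(text_fragment_url("https://example.com/article", "highlighted text"))
```

Browsers that do not support the syntax should simply ignore the fragment and load the page as usual.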
Boing Boing: Web tool that generates flowcharts from text
Boing Boing: Web tool that generates flowcharts from text. “You type in words; they appear in a flowchart box. To make a new box with a pointer going towards it, you indent the line. You can link back to an earlier box by using its line number.”
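The rules in that description (one box per line, indentation draws an arrow from the enclosing line, a line number links back to an earlier box) are simple enough to sketch. The code below is my own guess at the idea, not the tool's implementation: the function name `parse_flowchart` is hypothetical, and so is the convention that a back-link is written as a line containing only a number in parentheses, such as `(1)`. It counts leading spaces only, so tabs are not handled.

```python
import re


def parse_flowchart(source: str) -> tuple[dict[int, str], list[tuple[int, int]]]:
    """Turn indented text into boxes (by line number) and arrows."""
    nodes: dict[int, str] = {}
    edges: list[tuple[int, int]] = []
    stack: list[tuple[int, int]] = []  # (indent, line number) of enclosing lines
    for number, raw in enumerate(source.splitlines(), start=1):
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        label = raw.strip()
        # Close every line that is not less indented than this one.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1] if stack else None
        back_link = re.fullmatch(r"\((\d+)\)", label)
        if back_link:
            # Assumed syntax: "(N)" points at the box defined on line N.
            if parent is not None:
                edges.append((parent, int(back_link.group(1))))
            continue
        nodes[number] = label
        if parent is not None:
            edges.append((parent, number))
        stack.append((indent, number))
    return nodes, edges


if __name__ == "__main__":
    boxes, arrows = parse_flowchart("Start\n  Check input\n    Done\n    (1)\n")
    print(boxes)
    print(arrows)
```

The result is just a list of boxes and arrows; drawing them is left to whatever graph library you prefer.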
Science Daily: Computer scientists develop new tool that generates videos from themed text
Science Daily: Computer scientists develop new tool that generates videos from themed text. “A global team of computer scientists, from Tsinghua and Beihang Universities in China, Harvard University in the US and IDC Herzliya in Israel, have developed ‘Write-A-Video,’ a new tool that generates videos from themed text. Using words and text editing, the tool automatically determines which scenes or shots are chosen from a repository to illustrate the desired storyline. The tool enables novice users to produce quality video montages in a simple and user-friendly manner that doesn’t require professional video production and editing skills.”