- Manipulating Google Scholar Citations and Google Scholar Metrics: simple, easy and tempting (PDF) — scholarly paper on how to citespam your paper up Google Scholar’s results list. Fortunately calling your paper “AAAAAA In-vitro Qualia of …” isn’t one of the winning techniques.
- Seamless Astronomy — brings together astronomers, computer scientists, information scientists, librarians and visualization experts involved in the development of tools and systems to study and enable the next generation of online astronomical research.
- Eye Wire — a citizen science game where you map the 3D structure of neurons.
- Open Science is a Research Accelerator (Nature Chemistry) — challenge was: get rid of this bad-tasting compound from malaria medicine, without raising cost. Did it with open notebooks and collaboration, including LinkedIn groups. Lots of good reflection on advertising, engaging, and speed.
ENTRIES TAGGED "spam"
Facebook scraping could lead to machine-generated spam so good that it's indistinguishable from legitimate messages.
Cite Spam, Astro Science Labs, Citizen Science, and Accelerating Research
The cycle of good, bad, and stable has happened at every layer of the stack. It will happen with big data, too.
Illuminated Mario, Touchstone Facts, Calculating Spamicity, and Abstract Quantified Self
- Gravity in the Margins (Got Medieval) — illuminating illuminated manuscripts with Mario. (via BoingBoing)
- Hours Days, Who’s Counting? (Jon Udell) — What prompted me to check? My friend Mike Caulfield, who’s been teaching and writing about quantitative literacy, says it’s because in this case I did have some touchstone facts parked in my head, including the number 10 million (roughly) for barrels of oil imported daily to the US. The reason I’ve been working through a bunch of WolframAlpha exercises lately is that I know I don’t have those touchstones in other areas, and want to develop them. The idea of “touchstone facts” resonates with me.
- Spotting Fake Reviewer Groups in Consumer Reviews (PDF) — gotta love any paper that says We calculated the “spamicity” (degree of spam) of each group by assigning 1 point for each spam judgment, 0.5 point for each borderline judgment and 0 point for each non-spam judgment a group received and took the average of all 8 labelers. (via Google Research Blog)
- Visualizing Physical Activity Using Abstract Ambient Art (Quantified Self) — kinda like the iTunes visualizer but for your Fitbit Tracker.
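The "spamicity" scoring quoted in the Fake Reviewer Groups item above is simple enough to sketch. This is a minimal illustration of the arithmetic as the quote describes it, not the paper's code: the function name, the label strings, and the sample judgments are all assumptions made for the example.

```python
# Points per judgment, as described in the quoted passage.
# The label strings themselves are hypothetical.
POINTS = {"spam": 1.0, "borderline": 0.5, "non-spam": 0.0}


def spamicity(judgments):
    """Average the points from every labeler's judgment of one group."""
    if not judgments:
        raise ValueError("need at least one judgment")
    return sum(POINTS[j] for j in judgments) / len(judgments)


# Made-up example: 8 labelers, of whom 5 say spam, 2 borderline, 1 non-spam.
labels = ["spam"] * 5 + ["borderline"] * 2 + ["non-spam"]
print(spamicity(labels))  # (5 + 1 + 0) / 8 = 0.75
```

A group scores 1.0 only when every labeler calls it spam, and 0.0 only when every labeler clears it.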
Fingerprinting Cameras, Stopping Spambots, Generic Infographics, and Open Source Healthcare Records
- Fingerprinting Cameras Through Sensor Noise — using the pattern of noise consistent between images taken from the same camera to uniquely identify the device. (via Pete Warden)
- Stopping Bots with Hashes and Honeypots (Ned Batchelder) — solid techniques for preventing spambots. (via Andy Baio)
- Most Popular Infographics Generalized (Flowing Data) — it’s only funny because it’s true.
- London Hospital to Deploy Open Source Record System — hot on the heels of the NHS canning a failed expensive development of electronic health records. (via Glyn Moody)
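For the hashes-and-honeypots item above, here is a rough sketch of the general idea, not Ned Batchelder's actual implementation. It assumes a form with a hidden `website` field that humans never see (the honeypot) and a `token` field carrying an HMAC of the form's timestamp and the client IP. The field names, the secret, and the one-hour lifetime are all hypothetical.

```python
import hashlib
import hmac

SECRET = b"change-me"    # hypothetical server-side secret
MAX_AGE_SECONDS = 3600   # assumed lifetime of a rendered form


def make_token(timestamp, client_ip, secret=SECRET):
    """Hash the timestamp and client IP so the form cannot be replayed elsewhere."""
    message = f"{timestamp}:{client_ip}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def looks_like_bot(form, client_ip, now, secret=SECRET):
    """Return True if the submission fails the honeypot, age, or hash check."""
    # Honeypot: the field is hidden from humans, so any value suggests a bot.
    if form.get("website", ""):
        return True
    try:
        timestamp = int(form.get("timestamp", ""))
    except (TypeError, ValueError):
        return True
    # Reject forms that are stale or claim to come from the future.
    if not 0 <= now - timestamp <= MAX_AGE_SECONDS:
        return True
    expected = make_token(timestamp, client_ip, secret)
    supplied = str(form.get("token", ""))
    # Constant-time comparison of the supplied token against the expected one.
    return not hmac.compare_digest(expected.encode(), supplied.encode())
```

The server would call `make_token` when rendering the form and `looks_like_bot` when the form comes back, passing the current Unix time as `now`.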
Terminal Tool, Gamifying Education, Exponential Shortcut, and Kindle Spam
- tmux — GNU Screen-alike, with vertical splits and other goodies. (via Hacker News)
- Gamifying Education (Escapist) — a more thoughtful and reasoned approach than crude badgification, but I’d still feel happier meddling with kids’ minds if there was research to show efficacy and distribution of results. (via Ed Yong)
- Rule of 72 (Terry Jones) — common piece of financial mental math, but useful outside finance when you’re calculating any kind of exponential growth (e.g., bad algorithms). (via Tim O’Reilly)
- Spam Hits the Kindle Bookstore (Reuters) — create a system of incentives and it will be gamed, whether it’s tax law, search engines, or ebook stores. Aspiring spammers can even buy a DVD box set called Autopilot Kindle Cash that claims to teach people how to publish 10 to 20 new Kindle books a day without writing a word. (via Clive Thompson)
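The Rule of 72 mentioned above says that something growing at r percent per period doubles in roughly 72 / r periods. A small sketch comparing the shortcut with the exact figure for compound growth follows; the function names and sample rates are chosen only for illustration.

```python
import math


def doubling_time_rule_of_72(rate_percent):
    """Mental-math estimate: periods needed to double at the given growth rate."""
    return 72.0 / rate_percent


def doubling_time_exact(rate_percent):
    """Exact doubling time for compound growth: ln(2) / ln(1 + r)."""
    return math.log(2) / math.log(1 + rate_percent / 100.0)


for rate in (2, 6, 8, 12):
    estimate = doubling_time_rule_of_72(rate)
    exact = doubling_time_exact(rate)
    print(f"{rate}% growth: rule of 72 gives {estimate:.2f}, exact is {exact:.2f}")
```

The estimate is closest at around 8 percent and drifts as the rate moves away from that.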
Alistair Croll and Sean Power examine the impact of Facebook's embedded comments tool.
Facebook's new embedded comments option offers websites an additional social layer, but does it attract or drive away content engagement?
New Copyright Laws Proposed, GMail APIs, Internet Book Roundup, and Chrome Farm
- White House Will Propose New Digital Copyright Laws (CNet) — If the Internet were truly empowering citizenry and bringing us this new dawn of digital democracy, the people who run it would be able to stop the oppressive grind of the pro-copyright machinery. There’s no detail about what the proposed law would include, except that it will be based on a white paper of “legislative proposals to improve intellectual property enforcement,” and it’s expected to encompass online piracy. I predict a jump in the online trading of those “You can keep the change” posters that were formerly the exclusive domain of the Tea Party, and the eventual passage of bad law. As the article says, digital copyright tends not to be a particularly partisan topic.
- The Information: How the Internet Gets Inside Us (New Yorker) — thoughtful roundup of books and their positions on whether the Internet’s fruits are good for us. He divides them into never better, better never (as in “we’d be better off if it had never been invented”), and ever-was (as in, “we have always been changed by our technology, so big deal”). (via Bernard Hickey on Twitter)
- New Chrome Extension Blocks Sites from Search Results — Google testing whether users successfully identify and report content farms.
Intrusion Recovery, MTurk Spam, Open Source, and Google Pottymouth
- Gawker Tech Team Didn’t Adequately Secure Our Platform — internal memo from CTO to staff after the break-in. Notable for two things: the preventative steps, which include things like two-factor authentication and not collecting commenter details; and the lack of defensiveness. When your executives taunt 4chan and your systems get pwned as a result, it must be mighty hard not to point the finger at those executives. I hope I can be as adult as Tom Plunkett when shit next happens to me. (via Andy Baio)
- Mechanical Turk Spam — 40% of the HITs from new requesters are spam. The list of tasks is the online fraud hitlist: faking votes/comments/etc on social sites, making fake accounts, submitting fake leads through lead gen sites, fake clicks on ads, posting fake ads to Craigslist, requesting personal info of the MTurk worker. (via Andy Baio who is on fire)
- 2010 The Year Open Source Went Invisible (Matt Asay) — All of which is a long way of saying that while open source has become integral to so much software development, it hasn’t remotely ended the reign of proprietary software. Indeed, much (most?) open-source software is paid for out of proprietary profits. This might have been shocking news in, say, 2004, but it’s common knowledge in 2010. Open source is how we do business 10 years into this new millennium.
- Quantitative Analysis of Culture Using Millions of Digitized Books (Science) — We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. This is related to Google Labs’ latest toy, the n-gram viewer whose correct name should be Google Pottymouth if the things people are graphing are anything to go by.