- PirateBox 1.0 — turns a wireless router into a filesharing joy. v1.0 has, among other things, a responsive UI for use on tablets and phones.
- Dystopia Tracker — keep on top of which scifi dystopic predictions have been realised. I’d like filters for incubators, investors, and BigCos so you can see who is investing in dystopia.
- The Harvester, the Botmaster, and the Spammer (PDF) — research paper on the spam supply chain.
- Technical Interviewing (Moishe Lettvin) — lessons learned from conducting >250 technical interviews at Google. Why do I care? Chances are, your technical interviews suck so you’re hiring poorly.
Facebook scraping could lead to machine-generated spam so good that it's indistinguishable from legitimate messages.
A recent blog post inquired about the incidence of Facebook-based spear phishing: the author suddenly started receiving email that appeared to be from friends (though it wasn’t posted from their usual email addresses), making the usual kinds of offers and asking him to click on the usual links. He wondered whether this was a phenomenon and how it happened — how does a phisherman get access to your Facebook friends?
The answers are “yes, it happens” and “I don’t know, but it’s going to get worse.” Seriously, my wife’s name has been used in Facebook phishing. A while ago, several of her Facebook friends said that her email account had been hacked. I was suspicious; she only uses Gmail, and hacking Google isn’t easy, particularly with two-factor authentication. So, I asked her friends to send me the offending messages. It was obvious that they hadn’t come from my wife’s account; they had come from Yahoo accounts with her name but an unrecognizable email address, exactly what this blogger had seen.
How does this happen? How can a phisher discover your name and your Facebook friends? I don’t know, but Facebook is such a morass of weird and conflicting security settings that it’s impossible to know just how private or how public you are. If you’ve ever friended people you don’t know (a practice that remains entirely too common), and if you’ve ever enabled visibility to friends of friends, you have no idea who has access to your conversations.
Cite Spam, Astro Science Labs, Citizen Science, and Accelerating Research
- Manipulating Google Scholar Citations and Google Scholar Metrics: simple, easy and tempting (PDF) — scholarly paper on how to citespam your paper up Google Scholar’s results list. Fortunately calling your paper “AAAAAA In-vitro Qualia of …” isn’t one of the winning techniques.
- Seamless Astronomy — brings together astronomers, computer scientists, information scientists, librarians and visualization experts involved in the development of tools and systems to study and enable the next generation of online astronomical research.
- Eye Wire — a citizen science game where you map the 3D structure of neurons.
- Open Science is a Research Accelerator (Nature Chemistry) — challenge was: get rid of this bad-tasting compound from malaria medicine, without raising cost. Did it with open notebooks and collaboration, including LinkedIn groups. Lots of good reflection on advertising, engaging, and speed.
Illuminated Mario, Touchstone Facts, Calculating Spamicity, and Abstract Quantified Self
- Gravity in the Margins (Got Medieval) — illuminating illuminated manuscripts with Mario. (via BoingBoing)
- Hours Days, Who’s Counting? (Jon Udell) — What prompted me to check? My friend Mike Caulfield, who’s been teaching and writing about quantitative literacy, says it’s because in this case I did have some touchstone facts parked in my head, including the number 10 million (roughly) for barrels of oil imported daily to the US. The reason I’ve been working through a bunch of WolframAlpha exercises lately is that I know I don’t have those touchstones in other areas, and want to develop them. The idea of “touchstone facts” resonates with me.
- Spotting Fake Reviewer Groups in Consumer Reviews (PDF) — gotta love any paper that says: “We calculated the ‘spamicity’ (degree of spam) of each group by assigning 1 point for each spam judgment, 0.5 point for each borderline judgment and 0 point for each non-spam judgment a group received and took the average of all 8 labelers.” (via Google Research Blog)
- Visualizing Physical Activity Using Abstract Ambient Art (Quantified Self) — kinda like the iTunes visualizer but for your Fitbit Tracker.
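The "spamicity" scoring quoted in the fake reviewer groups item above is simple enough to sketch in a few lines. This is only an illustration of the quoted rule (1 point per spam judgment, 0.5 per borderline, 0 per non-spam, averaged over the labelers); the function name, label strings, and sample judgments are hypothetical and do not come from the paper.

```python
from statistics import mean

# Points per judgment, as described in the quoted passage.
# The label strings themselves are an assumption made for this sketch.
POINTS = {"spam": 1.0, "borderline": 0.5, "non-spam": 0.0}


def spamicity(judgments):
    """Average the points over all labelers' judgments for one group."""
    return mean(POINTS[j] for j in judgments)


# Hypothetical group judged by 8 labelers: 5 spam, 2 borderline, 1 non-spam.
labels = ["spam"] * 5 + ["borderline"] * 2 + ["non-spam"]
print(spamicity(labels))  # (5 * 1 + 2 * 0.5 + 0) / 8 = 0.75
```

A score near 1 means the labelers largely agreed the group was spam; a score near 0 means they largely agreed it was not.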
Fingerprinting Cameras, Stopping Spambots, Generic Infographics, and Open Source Healthcare Records
- Fingerprinting Cameras Through Sensor Noise — using the pattern of noise consistent between images taken from the same camera to uniquely identify the device. (via Pete Warden)
- Stopping Bots with Hashes and Honeypots (Ned Batchelder) — solid techniques for preventing spambots. (via Andy Baio)
- Most Popular Infographics Generalized (Flowing Data) — it’s only funny because it’s true.
- London Hospital to Deploy Open Source Record System — hot on the heels of the NHS canning a failed expensive development of electronic health records. (via Glyn Moody)
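For the hashes-and-honeypots item above, here is a minimal sketch of the general idea: a hidden honeypot field that humans leave empty, plus a keyed hash tying the form to a timestamp and client address. It is not taken from the linked article; the field names, secret, and age limit are all assumptions chosen for illustration.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"        # hypothetical server-side secret
MAX_AGE_SECONDS = 3600       # hypothetical limit on how old a form may be
HONEYPOT_FIELD = "website"   # hypothetical field hidden from humans via CSS


def make_token(timestamp, client_ip):
    """Keyed hash embedded in the form when the page is rendered."""
    msg = f"{timestamp}:{client_ip}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def looks_like_bot(form, client_ip, now=None):
    """Return True if a submitted form fails any of the checks."""
    now = time.time() if now is None else now
    # Bots tend to fill in every field, including the hidden one.
    if form.get(HONEYPOT_FIELD, ""):
        return True
    try:
        timestamp = int(form.get("timestamp", ""))
    except ValueError:
        return True
    # Reject forms from the future or older than the allowed age.
    if not 0 <= now - timestamp <= MAX_AGE_SECONDS:
        return True
    # The token must match the timestamp and the submitting address.
    expected = make_token(timestamp, client_ip)
    return not hmac.compare_digest(
        expected.encode(), form.get("token", "").encode()
    )
```

A server would call `make_token` when rendering the form and `looks_like_bot` when handling the submission, dropping or flagging anything that returns `True`.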
Terminal Tool, Gamifying Education, Exponential Shortcut, and Kindle Spam
- tmux — GNU Screen-alike, with vertical splits and other goodies. (via Hacker News)
- Gamifying Education (Escapist) — a more thoughtful and reasoned approach than crude badgification, but I’d still feel happier meddling with kids’ minds if there was research to show efficacy and distribution of results. (via Ed Yong)
- Rule of 72 (Terry Jones) — common piece of financial mental math, but useful outside finance when you’re calculating any kind of exponential growth (e.g., bad algorithms). (via Tim O’Reilly)
- Spam Hits the Kindle Bookstore (Reuters) — create a system of incentives and it will be gamed, whether it’s tax law, search engines, or ebook stores. Aspiring spammers can even buy a DVD box set called Autopilot Kindle Cash that claims to teach people how to publish 10 to 20 new Kindle books a day without writing a word. (via Clive Thompson)
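The Rule of 72 item above is easy to check numerically. The rule estimates the doubling time of anything growing at a fixed percentage per period as 72 divided by that percentage; the exact value is ln 2 / ln(1 + r). The sketch below compares the two. The function names are hypothetical.

```python
import math


def doubling_time_rule_of_72(rate_percent):
    """Mental-math estimate: periods needed to double at rate_percent growth."""
    return 72.0 / rate_percent


def doubling_time_exact(rate_percent):
    """Exact doubling time for compound growth at rate_percent per period."""
    return math.log(2) / math.log(1 + rate_percent / 100.0)


for rate in (2, 6, 8, 12):
    approx = doubling_time_rule_of_72(rate)
    exact = doubling_time_exact(rate)
    print(f"{rate:>2}% growth: rule of 72 = {approx:.2f}, exact = {exact:.2f}")
```

The estimate stays close to the exact figure for single-digit and low double-digit rates, which is why it works as mental math.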
Alistair Croll and Sean Power examine the impact of Facebook's embedded comments tool.
Facebook's new embedded comments option offers websites an additional social layer, but does it attract or drive away content engagement?
New Copyright Laws Proposed, GMail APIs, Internet Book Roundup, and Chrome Farm
- White House Will Propose New Digital Copyright Laws (CNet) — If the Internet were truly empowering citizenry and bringing us this new dawn of digital democracy, the people who run it would be able to stop the oppressive grind of the pro-copyright machinery. There’s no detail about what the proposed law would include, except that it will be based on a white paper of “legislative proposals to improve intellectual property enforcement,” and it’s expected to encompass online piracy. I predict a jump in the online trading of those “You can keep the change” posters that were formerly the exclusive domain of the Tea Party, and the eventual passage of bad law. As the article says, digital copyright tends not to be a particularly partisan topic.
- The Information: How the Internet Gets Inside Us (New Yorker) — thoughtful roundup of books and their positions on whether the Internet’s fruits are good for us. He divides them into never better, better never (as in “we’d be better off if it had never been invented”), and ever-was (as in, “we have always been changed by our technology, so big deal”). (via Bernard Hickey on Twitter)
- New Chrome Extension Blocks Sites from Search Results — Google testing whether users successfully identify and report content farms.
Intrusion Recovery, MTurk Spam, Open Source, and Google Pottymouth
- Gawker Tech Team Didn’t Adequately Secure Our Platform — internal memo from CTO to staff after the break-in. Notable for two things: the preventative steps, which include things like two-factor authentication and not collecting commenter details; and the lack of defensiveness. When your executives taunt 4chan and your systems get pwned as a result, it must be mighty hard not to point the finger at those executives. I hope I can be as adult as Tom Plunkett when shit next happens to me. (via Andy Baio)
- Mechanical Turk Spam — 40% of the HITs from new requesters are spam. The list of tasks is the online fraud hitlist: faking votes/comments/etc on social sites, making fake accounts, submitting fake leads through lead gen sites, fake clicks on ads, posting fake ads to Craigslist, requesting personal info of the MTurk worker. (via Andy Baio who is on fire)
- 2010 The Year Open Source Went Invisible (Matt Asay) — All of which is a long way of saying that while open source has become integral to so much software development, it hasn’t remotely ended the reign of proprietary software. Indeed, much (most?) open-source software is paid for out of proprietary profits. This might have been shocking news in, say, 2004, but it’s common knowledge in 2010. Open source is how we do business 10 years into this new millennium.
- Quantitative Analysis of Culture Using Millions of Digitized Books (Science) — “We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively.” This is related to Google Labs’ latest toy, the n-gram viewer, whose correct name should be Google Pottymouth if the things people are graphing are anything to go by.