- Princeton Open Access Report (PDF) — academics will need written permission to assign copyright of a paper to a journal. Of course, the faculty already had exclusive rights in the scholarly articles they write; the main effect of this new policy is to prevent them from giving away all their rights when they publish in a journal. (via CC Huang)
- Good Faith Collaboration — a book on Wikipedia’s culture, from MIT Press. Distributed, appropriately, under a Creative Commons Non-Commercial Share-Alike license.
- The Local-Global Flip — an EDGE conversation (or monologue) by Jaron Lanier that contains more thought-provocation per column-inch than anything else you’ll read this week. “[I]ncreasing efficiency by itself doesn’t employ people. There is a difference between saving and making money when you’re unemployed. Once you’re already rich, saving money and making money is the same thing, but for people who are on the bottom or even in the middle classes, saving money doesn’t help you if you don’t have the money to save in the first place.” and “The beauty of money is it creates a system of people leaving each other alone by mutual agreement. It’s the only invention that does that that I’m aware of. In a world of finite limits where you don’t have an infinite West you can expand into, money is the thing that gives you a little bit of peace and quiet, where you can say, ‘It’s my money, I’m spending it’.” and “I’m astonished at how readily a great many people I know, young people, have accepted a reduced economic prospect and limited freedoms in any substantial sense, and basically traded them for being able to screw around online. There are just a lot of people who feel that being able to get their video or their tweet seen by somebody once in a while gets them enough ego gratification that it’s okay with them to still be living with their parents in their 30s, and that’s such a strange tradeoff. And if you project that forward, obviously it does become a problem.” are things I’m still chewing on, many days after first reading.
- Trolled by Gerry Sussman (Bryan O’Sullivan) — Bryan gave a tutorial on Haskell at a conference on leading-edge programming languages and distributed systems, and recounts: “At one point, Gerry had a pretty amusing epigram to offer. ‘Haskell is the best of the obsolete programming languages!’ he pronounced, with a mischievous look. Now, I know when I’m being trolled, so I said nothing and waited a moment, whereupon he continued, ‘but don’t take it the wrong way—I think they’re all obsolete!’”
Princeton Open Access, Wikipedia Culture, Food for Thought, and Trolled by Sussman
education, wikipedia, metrics, brain, science, 3d, fabbing, @fourshort
- Who Writes Wikipedia — reported widely as “bots make most of the contributions to Wikipedia”, but which really should have been “edits are a lousy measure of contributions”. The top bots are doing things like ensuring correctly formatted ISBN references and changing the names of navboxes: things which humans could do, but which would be a scandalous waste of human effort if they did. We analyse edits because it’s easy to get data on edits; analysis of value is a different matter.
- How I Failed and Finally Succeeded at Learning How to Code (The Atlantic) — great piece on teaching and learning programming, focusing on Project Euler. “Kids are naturally curious. They love blank slates: a sandbox, a bag of LEGOs. Once you show them a little of what the machine can do they’ll clamor for more. They’ll want to know how to make that circle a little smaller or how to make that song go a little faster. They’ll imagine a game in their head and then relentlessly fight to build it. Along the way, of course, they’ll start to pick up all the concepts you wanted to teach them in the first place. And those concepts will stick because they learned them not in a vacuum, but in the service of a problem they were itching to solve.”
- The Believing Brain — Belief comes quickly and naturally, skepticism is slow and unnatural, and most people have a low tolerance for ambiguity.
- 3D Printed Rocket — stainless steel rocket engine.
Parsing link rot, visualizing Wikipedia edits, and deconstructing autocorrect
In the latest Strata Week: How quickly do URLs die? Where in the world are Wikipedia editors? How does the iPhone autocorrect work (or not)?
Internet Cafe Culture, Image Processing, Library Mining, and MediaWiki Parsing
- Chinese Internet Cafes (Bryce Roberts) — a good quick read. My note: people valued the same things in Internet cafes that they value in public libraries, and the uses are very similar. They pose a similar threat to the already-successful, which is why public libraries are threatened in many Western countries.
- SIFT — a library, built on OpenCV, that implements the Scale Invariant Feature Transform, a method to detect distinctive, invariant image feature points, which can easily be matched between images to perform tasks such as object detection and recognition, or to compute geometrical transformations between images. The licensing seems dodgy: MIT code but lots of “this isn’t a license to use the patent!” warnings in the LICENSE file. (via Joshua Schachter)
- The Secret Life of Libraries (Guardian) — I like the idea of the most-stolen-books revealing something about a region; it’s an aspect of data revealing truth. For a while, Terry Pratchett was the most-shoplifted author in England but newspapers rarely carried articles about him or mentioned his books (because they were genre fiction not “real” literature). (via Brian Flaherty)
- Sweble — MediaWiki parser library. Until today, Wikitext had been poorly defined. There was no grammar, no defined processing rules, and no defined output like a DOM tree based on a well-defined document object model. This is to say, the content of Wikipedia is stored in a format that is not an open standard. The format is defined by 5000 lines of PHP code (the parse function of MediaWiki). That code may be open source, but it is incomprehensible to most. That’s why there are 30+ failed attempts at writing alternative parsers. (via Dirk Riehle)
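To make the "defined output like a DOM tree" point concrete, here is a minimal Python sketch that turns one small corner of Wikitext, internal links of the form `[[Target]]` or `[[Target|label]]`, into a flat list of typed nodes. It is an illustration only: it is not Sweble's API, and the names `Text`, `Link`, and `parse_links` are hypothetical. Real Wikitext (templates, nested markup, tables) is far harder, which is presumably why regex-based attempts tend to fail.

```python
import re
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Text:
    """A run of plain text (hypothetical node type for this sketch)."""
    value: str


@dataclass
class Link:
    """An internal link with a target page and a display label."""
    target: str
    label: str


Node = Union[Text, Link]

# Matches [[Target]] or [[Target|label]]. Nested or malformed markup
# is deliberately not handled in this sketch.
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def parse_links(wikitext: str) -> List[Node]:
    """Split wikitext into Text and Link nodes, in document order."""
    nodes: List[Node] = []
    pos = 0
    for match in LINK_RE.finditer(wikitext):
        if match.start() > pos:
            nodes.append(Text(wikitext[pos:match.start()]))
        target = match.group(1).strip()
        # Without an explicit label, the target doubles as the label.
        label = match.group(2) if match.group(2) is not None else target
        nodes.append(Link(target, label))
        pos = match.end()
    if pos < len(wikitext):
        nodes.append(Text(wikitext[pos:]))
    return nodes


if __name__ == "__main__":
    for node in parse_links("See [[Haskell (programming language)|Haskell]] and [[Lisp]]."):
        print(node)
```

Even this toy shows the value of a defined output: downstream code can work with `Link` and `Text` nodes instead of re-parsing strings.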
The online encyclopedia is a great resource for data scientists
Wikipedia is an essential tool in the data scientist's armory. Today's Strata Gem shows how it can be used to help computers distinguish between different senses of common words.
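As a rough illustration of the idea, the Python sketch below picks the sense of an ambiguous word by counting the words its context shares with a short description of each sense. This is a simplified Lesk-style overlap, not necessarily the method the Strata Gem describes. The `SENSES` descriptions are hand-written stand-ins for Wikipedia article text, and the function names and stopword list are assumptions made for the example.

```python
import re
from typing import Dict, Set

# A tiny, hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "of", "and", "or", "that", "on", "in", "to"}


def tokens(text: str) -> Set[str]:
    """Lowercase the text and return its set of non-stopword words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def disambiguate(context: str, senses: Dict[str, str]) -> str:
    """Return the sense whose description shares the most words with the context."""
    ctx = tokens(context)
    return max(senses, key=lambda name: len(ctx & tokens(senses[name])))


# Hand-written stand-ins for the opening sentences of two Wikipedia articles.
SENSES = {
    "Bank (finance)": "A bank is a financial institution that accepts deposits and makes loans of money.",
    "Bank (geography)": "A bank is the land alongside a river, stream, or lake.",
}

if __name__ == "__main__":
    print(disambiguate("She sat on the bank of the river and watched the water", SENSES))
    print(disambiguate("He went to the bank to deposit money and ask about loans", SENSES))
```

With real article text in place of the stand-ins, each sense has far more vocabulary to match against, which is what makes an encyclopedia useful for this task.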