A Quantitative Literary History of 2,958 Nineteenth-Century British Novels: The Semantic Cohort Method (PDF) — This project was simultaneously an experiment in developing quantitative and computational methods for tracing changes in literary language. We wanted to see how far quantifiable features such as word usage could be pushed toward the investigation of literary history. Could we leverage quantitative methods in ways that respect the nuance and complexity we value in the humanities? To this end, we present a second set of results, the techniques and methodological lessons gained in the course of designing and running this project. Even litcrit is becoming a data game.
Easy6502 — get started writing 6502 assembly language. Fun way to get started with low-level coding.
How Analytics Really Work at a Small Startup (Pete Warden) — The key for us is that we’re using the information we get primarily for decision-making (should we build out feature X?) rather than optimization (how can we improve feature X?). Nice rundown of tools and systems he uses, with plug for KissMetrics.
What Tim Berners-Lee Doesn’t Know About HTML DRM (Guardian) — Cory Doctorow lays it out straight. HTML DRM is a bad idea, no two ways. The future of the Web is the future of the world, because everything we do today involves the net and everything we’ll do tomorrow will require it. Now it proposes to sell out that trust, on the grounds that Big Content will lock up its “content” in Flash if it doesn’t get a veto over Web-innovation. [...] The W3C has a duty to send the DRM-peddlers packing, just as the US courts did in the case of digital TV.
What Teens Get About The Internet That Parents Don’t (The Atlantic) — the Internet has been a lifeline for self-directed learning and connection to peers. In our research, we found that parents more often than not have a negative view of the role of the Internet in learning, but young people almost always have a positive one. (via Clive Thompson)
Portable C64 — beautiful piece of C64 hardware hacking to embed a screen and battery in it. (via Hackaday)
School District Builds Own Software — By taking a not-for-profit approach and using freely available open-source tools, Saanich officials expect to develop openStudent for under $5 million, with yearly maintenance pegged at less than $1 million. In contrast, the B.C. government says it spent $97 million over the past 10 years on the B.C. enterprise Student Information System — also known as BCeSIS — a provincewide system already slated for replacement.
Giving a Presentation From an Apple ][ — A co-worker used an iPad to give a presentation. I thought: why take a machine as powerful as an early Cray to do something as low-overhead as display slides? Why not use something with much less computing power? From this asoft_presenter was born. The code is a series of C programs that read text files and generate a large Applesoft BASIC program that actually presents the slides. (via Jim Stogdill)
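To make the "program that writes a program" idea concrete, here is a minimal Python sketch. It is not the asoft_presenter code (that is written in C). The function name `slides_to_applesoft`, the blank-line slide separator, and the line-numbering scheme are all assumptions made for illustration. The only Applesoft statements it emits are `HOME`, `PRINT`, `GET`, and `END`.

```python
def slides_to_applesoft(text, start=10, step=10):
    """Turn slide text into Applesoft BASIC source.

    Assumed input format (hypothetical): slides separated by a blank line.
    """
    lines = []
    number = start

    def emit(statement):
        # Append one numbered BASIC line and advance the line counter.
        nonlocal number
        lines.append(f"{number} {statement}")
        number += step

    slides = [s for s in text.strip().split("\n\n") if s.strip()]
    for slide in slides:
        emit("HOME")  # clear the text screen
        for row in slide.splitlines():
            # A string literal cannot contain a double quote, so swap it out.
            safe = row.replace('"', "'")
            emit(f'PRINT "{safe}"')
        emit("GET K$")  # wait for a keypress before the next slide
    emit("END")
    return "\n".join(lines)


if __name__ == "__main__":
    print(slides_to_applesoft("HELLO\n\nBYE"))
```

Each slide becomes a clear-screen, a run of `PRINT` statements, and a wait for a keypress, so the generated program can be typed or loaded into the machine and stepped through one key at a time.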
AirBnB TechTalks — impressive collection of interesting talks, part of the AirBnB techtalks series.
Gawker’s Realtime Dashboard — this is not just technically and visually cool, but also food for thought about what they’re choosing to measure and report on in real time (new vs returning split, social engagement, etc.). Does that mean they hope to be able to influence those variables in real time? (via Alex Howard)
Credibility Ranking of Tweets During High Impact Events (PDF) — interesting research. Situational awareness information is information that leads to gain in the knowledge or update about details of the event, like the location, people affected, causes, etc. We found that on average, 30% content about an event, provides situational awareness information about the event, while 14% was spam. (via BoingBoing)
The Commodore 64 — interesting that Chuck Peddle (who designed the 6502) and Bob Yannes (who designed the SID chip) are still alive. This article safely qualifies as Far More Than You Ever Thought You Wanted To Know About The C64 but it is fascinating. The BASIC housed in its ROM (“BASIC 2.0”) was painfully antiquated. It was actually the same BASIC that Tramiel had bought from Microsoft for the original PET back in 1977. Bill Gates, in a rare display of naivete, sold him the software outright for a flat fee of $10,000, figuring Commodore would have to come back soon for another, better version. He obviously didn’t know Jack Tramiel very well. Ironically, Commodore did have on hand a better BASIC 4.0 they had used in some of the later PET models, but Tramiel nixed using it in the Commodore 64 because it would require a more expensive 16 K rather than 8 K of ROM chips to house.
Faked Research is Endemic in China (New Scientist) — open access promises the unbundling of publishing, quality control, reputation, and recommendation. Reputation systems for science are going to be important: you can’t blacklist an entire country’s researchers. Can you demand reproducibility?
The Hobbit — ambitious very early game, timely to remember as the movie launches. Literally, no two games of The Hobbit are the same. I can see what Milgrom and the others were striving toward: a truly living, dynamic story where anything can happen and where you have to deal with circumstances as they come, on the fly. It’s a staggeringly ambitious, visionary thing to be attempting.
How to Get Startup Ideas (Paul Graham) — The essay is full of highly-quotable apothegms like Live in the future, then build what’s missing and The verb you want to be using with respect to startup ideas is not “think up” but “notice.”
Learn to Write 6502 Assembly Language — if retro-gaming is the gateway drug you’re using to attract kids to programming, this is the crack you wheel out after three months of getting high. Ok, this metaphor is broken on many levels. (via Hacker News)
Small Political Pieces, Loosely Joined — MySociety: We believe that the wrong answer to this challenge is to just say “Well then, everyone should build their own sites from scratch.” [...] Our plan is to collaborate with international friends to build a series of components that deliver quite narrow little pieces of the functionality that make up bigger websites. Common software components, perhaps interchangeable data … good things coming.
How Fair Use Can Solve Orphan Works — preprint of legal paper claiming non-profit libraries can begin to work on orphaned works under the aegis of fair use. Finally, regardless of a work’s orphan status, many uses by libraries and archives will fit squarely under the umbrella of uses favored by the first fair use factor (the “purpose of the use”), and their digitization of entire works for preservation and access should often be justified under the third fair use factor (the amount used). As such, fair use represents an important, and for too long unsung, part of the solution to the orphan works problem.
Microsoft BASIC for 6502 — reverse-engineering magic, this person has RE’d the assembly language for various versions of the BASIC interpreter that shipped on microcomputers in the 80s. This page talks about the changes in each version, the easter eggs, and the hacks. This, kids, is how real programmers do it :)
The Sudden Rise of Peer-to-Peer Commerce (Casey Research) — Today, businesses are sprouting up around the world based on the idea of connecting individuals directly to each other to trade products and services. While the idea is very much in its infancy still, like the music business at the dawn of Napster, we’re beginning to grasp the potential. Something we are tracking at O’Reilly as well.
The Sensor/itive Side of Android (Luke Wroblewski) — lots of details about sensors in Android, from a Google I/O talk. Sampling rates change between devices. The data has variance and static because it comes from cost-effective components for mobile phones not robust and industry-grade sensors.
Hours Days, Who’s Counting? (Jon Udell) — What prompted me to check? My friend Mike Caulfield, who’s been teaching and writing about quantitative literacy, says it’s because in this case I did have some touchstone facts parked in my head, including the number 10 million (roughly) for barrels of oil imported daily to the US. The reason I’ve been working through a bunch of WolframAlpha exercises lately is that I know I don’t have those touchstones in other areas, and want to develop them. The idea of “touchstone facts” resonates with me.
Spotting Fake Reviewer Groups in Consumer Reviews (PDF) — gotta love any paper that says We calculated the “spamicity” (degree of spam) of each group by assigning 1 point for each spam judgment, 0.5 point for each borderline judgment and 0 point for each non-spam judgment a group received and took the average of all 8 labelers. (via Google Research Blog)
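The scoring rule quoted above is simple enough to sketch in a few lines of Python. This follows only the sentence quoted from the paper. The function name `spamicity`, the label strings, and the sample judgments are assumptions made up for illustration and are not data from the paper.

```python
# Points per judgment, as described in the quoted sentence.
POINTS = {"spam": 1.0, "borderline": 0.5, "non-spam": 0.0}


def spamicity(judgments):
    """Average the points that one group received across its labelers."""
    if not judgments:
        raise ValueError("need at least one judgment")
    return sum(POINTS[j] for j in judgments) / len(judgments)


if __name__ == "__main__":
    # Hypothetical judgments from 8 labelers for a single group.
    example = ["spam"] * 5 + ["borderline"] * 2 + ["non-spam"]
    # (5 * 1.0 + 2 * 0.5 + 1 * 0.0) / 8 = 0.75
    print(spamicity(example))
```

A group that every labeler marks as spam scores 1.0, and one that nobody flags scores 0.0, so the value reads as the fraction of labeler agreement, with borderline calls counting half.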
Superfastmatch — open source text comparison tool, used to locate plagiarism/churnalism in online news sites. You can pull out the text engine and use it for your own “find where this text is used elsewhere” applications (e.g., what’s being forwarded out in email, how much of this RFP is copy and paste, what’s NOT boilerplate in this contract, etc.). (via Pete Warden)
Why Our Kids Should Be Taught To Code (Guardian) — if we don’t act now we will be short-changing our children. [...] their world will be also shaped and configured by networked computing and if they don’t have a deeper understanding of this stuff then they will effectively be intellectually crippled. They will grow up as passive consumers of closed devices and services, leading lives that are increasingly circumscribed by technologies created by elites working for huge corporations such as Google, Facebook and the like. We will, in effect, be breeding generations of hamsters for the glittering wheels of cages built by Mark Zuckerberg and his kind. (via Karl von Randow)
The Pwn Plug — $770 gets you a wall-wart full of network attack tools and wifi for remote access. Plug and Pwn. (via Ars Technica)
Mobile Phone as Companion Species (Matt Jones) — They see the world differently to us, picking up on things we miss. They adapt to us, our routines. They look to us for attention, guidance and sustenance. We imagine what they are thinking, and vice-versa.
8-Bit Linux — Ubuntu 9 ported to a 6.5KHz 8-bit CPU (running a 32-bit emulator because Linux itself requires at least a 32-bit system). Takes 2 hours to boot up the kernel, four more to get to a login prompt. Moore’s Law for the win: I’ve seen more than 1000x improvement in speed from my first computer (1MHz C64) to current (1.7GHz i5). (via Slashdot)