Transparency is Not Enough (danah boyd) — we need people to not just have access to the data, but have access to the context surrounding the data. A very thoughtful talk from Gov 2.0 Expo about meaningful data release.
Feed6 — the latest from Rohit Khare is a sort of “hot or not” for pictures posted to Twitter. Slightly addictive, while somewhat purposeless. Remarkable for how banal the “most popular” pictures are, it reminds me of the way Digg, Reddit, and other such sites trend towards the uninteresting and dissatisfying. Flickr’s interestingness still remains one of the high points of user-curated notability. (via rabble on Twitter)
People are Walking Architecture — presentation by Matt Jones of BERG, taking a new lens to this AR/ubicomp/whatever-it-is-today world. “[Mobile phones are] a whole toy box full of playful, inventive strategies for exploring cities ….”
Lexicalist — insight into geographic and age distribution of language use, based on Twitter data. (via Language Log)
Advanced Visualization Techniques — nice overview of some non-standard visualization techniques. Short shameful confession: I love polar dendrograms with a passion. These techniques are to visualizers as algorithms and data structures are to programmers: each is used in specific circumstances and compromises some things to gain in others. (via Flowing Data)
iPad Usability Report (Nielsen-Norman Group) — 93-page report based on user studies. The iPad etched-screen aesthetic does look good. No visual distractions or nerdy buttons. The penalty for this beauty is the re-emergence of a usability problem we haven’t seen since the mid-1990s: Users don’t know where they can click. For the last 15 years of Web usability research, the main problems have been that users don’t know where to go or which option to choose — not that they don’t even know which options exist. With iPad UIs, we’re back to this square one. (via Andrew Savikas)
Researchers Show How To Use Mobiles to Spy on People — Using information from the GSM network they could identify a mobile phone user’s location, and they showed how they could easily create dossiers on people’s lives and their behavior and business dealings. They also demonstrated how they were able to identify a government contractor for the US Department of Homeland Security through analyzing phone numbers and caller IDs. [...] The researchers have not released details of the tools they developed, and have alerted the major GSM carriers about their results. Bailey said the carriers were “very concerned,” but mitigating these sorts of attacks would not be easy. In the meantime there is little mobile phone users can do to protect themselves short of turning off their phones. Oh joy. (via Roger Dennis)
Interview with Tim Bell (MP3) — author of Computer Science Unplugged, which teaches computational thinking in a fashion that can have five year olds understanding error correction codes, and one of the people behind a new high-school curriculum for CS in New Zealand.
How I Learned to Love Twitter (Guardian) — fascinating piece from writer Margaret Atwood. “The Twittersphere is an odd and uncanny place. It’s something like having fairies at the bottom of your garden” is one of my favourite things that’s ever been written about Twitter, but the whole article is delightfully written.
Aren’t You Being a Little Hasty in Making This Data Free? — very nice deconstruction of a letter sent by ESRI and competitors to the British Government, alarmed at the announcement that various small- and mid-sized datasets would no longer be charged for. In short, companies that make money reselling datasets hate the idea of free datasets. The arguments against charging are that the cost of gating access exceeds revenue and that open access maximises economic gain. (via glynmoody on Twitter)
A German Library for the 21st Century (Der Spiegel) — But browsing in Europeana is just not very pleasurable. The results are displayed in thumbnail images the size of postage stamps. And if you click through for a closer look, you’re taken to the corresponding institute. Soon you’re wandering helplessly around a dozen different museum and library Web sites — and you end up lost somewhere between the “Vlaamse Kunstcollectie” and the “Wielkopolska Biblioteka Cyfrowa.” Would it not be preferable to incorporate all the exhibits within the familiar scope of Europeana? “We would have preferred that,” says Gradmann. “But then the museums would not have participated.” They insist on presenting their own treasures. This is a problem encountered everywhere around the world: users hate silos but institutions hate the thought of letting go of their content. We’re going to have to let go to win. (via Penny Carnaby)
StoryGarden — a web-based tool for gathering and analyzing a large number of stories contributed by the public. The content of the stories, along with some associated survey questions, are processed in an automated semantic computing process for an immediate, interactive display for the lay public, and in a more thorough manual process for expert analysis.
Google Apps Script — VBA for the 2010s. Currently mainly for spreadsheets, but some hooks into Gmail and Google Calendar.
There’s a Rootkit in the Closet — lovely explanation of finding and isolating a rootkit, reconstructing how it got there and deconstructing the rootkit to figure out what it did. It’s a detective story, no less exciting than when Cliff Stoll wrote The Cuckoo’s Egg.
Who Is Going To Build The New Public Services? — a thoughtful exploration of the possibilities and challenges of third parties building public software systems. There’s a lot of talk of “just put up the data and we’ll build the apps” but I think this is a more substantial consideration of which apps can be built by whom.
Quake 3 for Android — kiss the weekend goodbye, Nexus One owners! My theory is that no platform has “made it” until a first person shooter has been ported to it. (via BoingBoing)
Graph Mining — slides and reading list from seminar series at UCSB on different aspects of mining graphs. Relevant because, obviously, social networks are one such graph to be mined.
Treadmill Desk — I want one. Staying fit while working at a sedentary job is important but not easy. I tried to type while using a stepper, but that’s just a recipe for incomprehensible typing fail. (via BoingBoing)
datapkg — a data packaging tool, so you can easily find and install datasets.
On Karma — very detailed look at user reputation, full of great takeaways. As with the FICO score, it is a bad idea to co-opt a reputation system for another purpose, and it dilutes the actual meaning of the score in its original context.
Of Tandoori and Epicuration (JP Rangaswami) — Curation is the process by which aggregate data is imbued with personalised trust.
Siri — a personal assistant iPhone app, like IWantSandy but with voice recognition.
Evaluating the Reasons for Non-use of Cornell University's Institutional Repository — great lessons for all open data projects. The reward structure established by each discipline largely…
Facebook Data Team: Distributed Data Analysis at Facebook — job ad from Facebook gives numbers on company use of their Hive data warehouse tool built on top of Hadoop: Today, Facebook counts 29% of its employees (and growing!) as Hive users. More than half (51%) of those users are outside of Engineering. They come from distinct groups like User Operations, Sales, Human Resources, and Finance. Many of them had never used a database before working here. Thanks to Hive, they are now all data ninjas who are able to move fast and make great decisions with data. (via Simon Willison)
Open Source Enters The World of Atoms — an academic statistical analysis of open design. We indicated that, in open design communities, tangible objects can be developed in very similar fashion to software; one could even say that people treat a design as source code to a physical object and change the object via changing the source.
Why I Like Redis (Simon Willison) — coherent explanation of why Simon likes and uses a particular NoSQL system. I can run a long running batch job in one Python interpreter (say loading a few million lines of CSV in to a Redis key/value lookup table) and run another interpreter to play with the data that’s already been collected, even as the first process is streaming data in. I can quit and restart my interpreters without losing any data. And because Redis semantics map closely to Python native data types, I don’t have to think for more than a few seconds about how I’m going to represent my data.
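For a feel of the workflow Simon describes, here is a minimal sketch of streaming CSV rows into a key/value lookup table. It is an illustration, not Simon's code: `FakeRedis`, `load_csv`, and the sample column names are all hypothetical, and the in-memory stand-in exists only so the sketch runs without a server. With the redis-py package installed and a server running, a `redis.Redis(decode_responses=True)` client could be passed in instead, since it also offers `set` and `get`.

```python
import csv
import io


class FakeRedis:
    """Hypothetical in-memory stand-in exposing set/get like a Redis client."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        # Returns None for a missing key, as a Redis client does.
        return self._data.get(key)


def load_csv(client, csv_file, key_column, value_column):
    """Stream CSV rows into the store, one set() call per row.

    Returns the number of rows written.
    """
    count = 0
    for row in csv.DictReader(csv_file):
        client.set(row[key_column], row[value_column])
        count += 1
    return count


# Made-up sample data standing in for the "few million lines of CSV".
sample = io.StringIO("code,name\nnz,New Zealand\nde,Germany\n")
client = FakeRedis()
loaded = load_csv(client, sample, "code", "name")
print(loaded, client.get("nz"))  # prints: 2 New Zealand
```

Against a real server, a second interpreter could call `get` on the same keys while the loader is still running, which is the property the post highlights; the in-memory stand-in cannot show that part.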
Your Movements Speak For Themselves (Jeff Jonas) — Mobile devices in America are generating something like 600 billion geo-spatially tagged transactions per day. Every call, text message, email and data transfer handled by your mobile device creates a transaction with your space-time coordinate (to roughly 60 meters accuracy if there are three cell towers in range), whether you have GPS or not. Got a Blackberry? Every few minutes, it sends a heartbeat, creating a transaction whether you are using the phone or not. If the device is GPS-enabled and you’re using a location-based service your location is accurate to somewhere between 10 and 30 meters. Using Wi-Fi? It is accurate below 10 meters. A thought-provoking roundup of the information leakage with modern locative systems. (via TomC on Twitter)