ENTRIES TAGGED "cs"
Tabular Data API, Open Stanford Courses, Wearable TV, and Wearable Sensors
- Tablib — MIT-licensed open source library for manipulating tabular data. Reputed to have a great API. (via Tim McNamara)
- Stanford Education Everywhere — courses in CS, machine learning, math, and engineering that are open for all to take. Over 58,000 have already signed up for the introduction to artificial intelligence taught by Peter Norvig, Google’s Director of Research.
- Wearable LED Television — 160×120 RGBs powered by a 12v battery, built for Burning Man (natch). (via Bridget McKendry)
- Temporary Tattoo Biosensors (Science News) — early work putting flexible sensors into temporary tattoos. (via BoingBoing)
- Which Banks are Enabling Fake AV Scams? — some nice detective work to reveal the mechanisms and actors who take money from the marks in AV scams. (via BoingBoing)
- Developer Experience — new site from ex-Google developer evangelist Pamela Fox, talking about the experience that API- and software-offering companies give to the developers they’re wooing.
- Pros and Cons of Mechanical Turk for Scientific Surveys (Scientific American blogs) — So far, some indicators suggest Turk is a trustworthy source. Rand (2011) used IP address logging to verify subjects’ self-reported country of residence, and found that 97% of responses are accurate. He also compared the consistency of a range of demographic variables reported by the same subjects across two different studies, and found between 81% and 98% agreement, depending on the variable. (via Vaughan Bell)
Organising Conferences, Moving to the JVM, Language Crowdsourcing, and Bayesian Computing
- Conference Organisers Handbook — accurate guide to running a two-day 300-person conference. See also Yet Another Perl Conference guidelines.
- Twitter Shifting More Code to JVM — interesting how, at scale, there are some tools and techniques of the scorned Enterprise that the web cool kids must turn to. Some. Business Process Workflow XML Schemas will never find love.
- Luis von Ahn on Duolingo — from the team that gave us “OCR books as you verify you are a human” CAPTCHAs comes “learn a new language as you translate the web”. I would love to try this; it sounds great (and is an example of what crowdsourcing can be).
- Fully Bayesian Computing (PDF) — A fully Bayesian computing environment calls for the possibility of defining vector and array objects that may contain both random and deterministic quantities, and syntax rules that allow treating these objects much like any variables or numeric arrays. Working within the statistical package R, we introduce a new object-oriented framework based on a new random variable data type that is implicitly represented by simulations. Perl made text processing easy because strings were first-class objects with a rich set of functions to operate on them; Node.js has a sweet HTTP library; it’s interesting to see how much more intuitive an algorithm becomes when random variables are a data type. (via BigData)
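To make the "random variables as a data type" idea from the Fully Bayesian Computing entry concrete, here is a minimal Python sketch. The paper works in R; the `RandomVariable` class, its method names, and the shared generator below are hypothetical and written only for illustration. They are not the paper's API. Each variable is stored as a list of simulation draws, and arithmetic is applied draw by draw.

```python
import random
import statistics

# Shared generator so separately created variables are independent (illustrative choice).
_RNG = random.Random(0)


class RandomVariable:
    """A quantity represented implicitly by its simulation draws (hypothetical class)."""

    def __init__(self, draws):
        self.draws = list(draws)

    @classmethod
    def normal(cls, mean, sd, n=10_000):
        # Draw n samples from a normal distribution.
        return cls(_RNG.gauss(mean, sd) for _ in range(n))

    def _combine(self, other, op):
        # Random-with-random: combine draw by draw (assumes equal numbers of draws).
        if isinstance(other, RandomVariable):
            return RandomVariable(op(a, b) for a, b in zip(self.draws, other.draws))
        # Random-with-constant: apply the constant to every draw.
        return RandomVariable(op(a, other) for a in self.draws)

    def __add__(self, other):
        return self._combine(other, lambda a, b: a + b)

    __radd__ = __add__

    def __mul__(self, other):
        return self._combine(other, lambda a, b: a * b)

    __rmul__ = __mul__

    def mean(self):
        return statistics.fmean(self.draws)

    def sd(self):
        return statistics.stdev(self.draws)


x = RandomVariable.normal(0, 1)
y = RandomVariable.normal(3, 2)
z = 2 * x + y + 1  # reads like ordinary arithmetic on numbers
```

Here `z.mean()` should land near 4 and `z.sd()` near 2.83 (the square root of 8), up to simulation noise. The formula for `z` is written exactly as it would be for plain numbers.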
Buying a Micro, Education Entrepreneurship, Faceted Search, Vector-Graphics Scripting
- Electric Dreams – The 1980s ‘The Micro Home Computer Of 1982’ (YouTube) — from a reality show where a gadget-using family are forced to relive 30 years of technology invention, one year each day. This clip is where they’re forced to choose a microcomputer from the rush of early hobbyist machines in the 80s: Spectrum, Dragon-32, etc. (via Skud)
- K-12 Entrepreneurship: Slow Entry, Distant Exit (PDF) — paper (from the set I pointed to yesterday) laying out in stark terms the difficulty of educational entrepreneurship. Keeping the lights on and a teacher in every classroom consumes most of the annual money spent on education so that little is left over to generate or try new tools, techniques or approaches. Out of every dollar spent on education in 2005, only 3.5 cents was spent on materials, tools and services. Subtract the big mandatory purchases of textbooks and annual testing, and one is left with almost no free funds to deploy creatively. With class size reduction and teacher incentive pay ramping up around the country, the pressure on these budget lines continues to increase, reducing the dollars available for investment in breakthrough tools and services.
- Here Be Dragons (Bryan O’Sullivan) — the thorny problem of printing floating point numbers. Prior to Steele and White’s “How to print floating-point numbers accurately”, implementations of printf and similar rendering functions did their best to render floating point numbers, but there was wide variation in how well they behaved. A number such as 1.3 might be rendered as 1.29999999, for instance, or if a number was put through a feedback loop of being written out and its written representation read back, each successive result could drift further and further away from the original.
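The round-trip problem described in the Here Be Dragons entry can be illustrated with a short Python sketch. This is a naive search for the shortest decimal string that reads back as the same double. It is not Steele and White's algorithm, and the helper name `shortest_roundtrip` is made up for this example.

```python
def shortest_roundtrip(x: float) -> str:
    """Return the shortest %g-style string that parses back to exactly x.

    Naive approach: try 1 to 17 significant digits. Seventeen digits are
    always enough to round-trip a finite IEEE 754 double.
    """
    for precision in range(1, 18):
        text = format(x, f".{precision}g")
        if float(text) == x:
            return text
    return repr(x)  # only reached for values such as NaN


value = 0.1 + 0.2

# Too few digits: the written form no longer reads back as the same number.
lossy = format(value, ".6g")
print(lossy, float(lossy) == value)

# Enough digits, but no more than necessary.
exact = shortest_roundtrip(value)
print(exact, float(exact) == value)

print(shortest_roundtrip(1.3))
```

Writing a value out with too few digits and reading it back changes the number, which is how repeated write/read loops can drift. Picking the shortest string that still round-trips avoids both the drift and noisy output such as 1.29999999.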
Sorting Out 9/11, Tagging Text, Unlocking Scientific Publishing, and Internet Archive's Meatspace Branch
- Sorting Out 9/11 (New Yorker) — the thorniest problem for the 9/11 memorial was the ordering of the names. Computer science to the rescue!
- Tagger — Python library for extracting tags (statistically significant words or phrases) from a piece of text.
- Free Science, One Paper at a Time (Wired) — Jonathan Eisen’s attempt to collect and distribute his father’s scientific papers (which were written while a federal employee, so in the public domain), thwarted by old-fashioned scientific publishing. “But now,” says Jonathan Eisen, “there’s this thing called the Internet. It changes not just how things can be done but how they should be done.”
- Internet Archive Launches Physical Archive — I’m keen to see how this develops, because physical storage has problems that digital does not. I’d love to see the donor agreement require the donor to give the archive full rights to digitize and distribute under open licenses. That’d put the Internet Archive a step in front of traditional archives, museums, libraries, and galleries, whose donor agreements typically let donors place arbitrary specifications on use and reuse (“must be inaccessible for 50 years”, “no commercial use”, “no use that compromises the work”, etc.), all of which are barriers to wholesale digitization and reuse.
Email Game, Faster B Trees, RFID+Projectors, and Airport Express Broken
- The Email Game — game mechanics to get you answering email more efficiently. Can’t wait to hear that conversation with corporate IT. “You want us to install what on the Exchange server?” (via Demo Day Wrapup)
- Stratified B-trees and versioning dictionaries — A classic versioned data structure in storage and computer science is the copy-on-write (CoW) B-tree — it underlies many of today’s file systems and databases, including WAFL, ZFS, Btrfs and more. Unfortunately, it doesn’t inherit the B-tree’s optimality properties; it has poor space utilization, cannot offer fast updates, and relies on random IO to scale. Yet, nothing better has been developed since. We describe the ‘stratified B-tree’, which beats all known semi-external memory versioned B-trees, including the CoW B-tree. In particular, it is the first versioned dictionary to achieve optimal tradeoffs between space, query and update performance. (via Bob Ippolito)
- DisplayCabinet (Ben Bashford) — We embedded a group of inanimate ornamental objects with RFID tags. Totems or avatars that represent either people, products or services. We also added RFID tags to a set of house keys and a wallet. Functional things that you carry with you. This group of objects combine with a set of shelves containing a hidden projector and RFID reader to become DisplayCabinet. (via Chris Heathcote)
- shairport — Aussie pulled the encryption keys from an Airport Express device, so now you can have software pretend to be an Airport Express.
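For readers unfamiliar with the copy-on-write versioning mentioned in the stratified B-trees entry above, here is a minimal Python sketch of the general idea. It uses a plain binary search tree rather than a B-tree, and it does not implement the stratified structure from the paper. The `Node`, `insert`, and `lookup` names are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int, value: str) -> Node:
    """Return the root of a NEW version; the old version is left untouched.

    Only the nodes on the path from the root to the changed key are copied.
    Every other subtree is shared between the old and new versions.
    """
    if root is None:
        return Node(key, value)
    if key < root.key:
        return Node(root.key, root.value, insert(root.left, key, value), root.right)
    if key > root.key:
        return Node(root.key, root.value, root.left, insert(root.right, key, value))
    return Node(key, value, root.left, root.right)  # overwrite in the new version


def lookup(root: Optional[Node], key: int) -> Optional[str]:
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None


v1 = None
for k, v in [(5, "e"), (2, "b"), (8, "h")]:
    v1 = insert(v1, k, v)

v2 = insert(v1, 8, "H")  # new version; v1 remains readable as a snapshot

print(lookup(v1, 8), lookup(v2, 8))
print(v2.left is v1.left)  # the untouched subtree is shared, not copied
```

Every update copies a whole root-to-leaf path. That keeps old versions cheap to read but makes each write touch several nodes, which is the kind of cost the entry's abstract attributes to CoW B-trees.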
Android Firefox, CloudPlayer Licenses, Github Lessons, and Data Structures
- Firefox for Android — faster than stock browser, apparently.
- Amazon CloudPlayer Needs No Licenses (Ars Technica) — that’s what Amazon claim, anyway. Because users upload the files (rather than accessing a central single copy of the ripped music), Amazon think they need no license. If this holds, expect Google and other competitors to follow suit.
- Ten Lessons from Github’s First Year — Your customers are most likely early adopters and love to see new features roll out every few weeks. If this results in a little bit of downtime, they’ll easily forgive you, as long as those features are sweet. In the early days of GitHub, we’d deploy up to ten times in one afternoon, always inching closer to that target. Make good use of that first year, because once the big important customers start rolling in, you have to be a lot more careful about hitting one of them with a stray bullet. Later in the game, downtime and botched deploys are money lost and you have to rely more on building instruments to predict where you should aim. Thoughtful take on agile and continuous deployment, among other things.
- What Are The Lesser-Known But Cool Data Structures? (Stack Overflow) — I have no joke here, I just like to say “cool data structures”. (via Joshua Schachter)
Commandline for Story, Dystopic Predictions, Studying Failures, and Two Great Tastes
- Curveship — a new interactive fiction system that can tell the same story in many different ways. Check out the examples on the home page. Important because interactive fiction and the command-lines of our lives are inextricably intertwined.
- Egypt’s Revolution: Coming to an Economy Near You (Umair Haque) — more dystopic prediction, but this phrase rings true: The lesson: You can’t steal the future forever — and, in a hyperconnected world, you probably can’t steal as much of it for as long.
- Why Startups Fail — failure is a more instructive teacher than success, so simply studying successful startups isn’t enough. (via Hacker News)
- Computer Science and Philosophy — Oxford is offering a program studying CS and Philosophy together. The two disciplines share a broad focus on the representation of information and rational inference, embracing common interests in algorithms, cognition, intelligence, language, models, proof, and verification. Computer Scientists need to be able to reflect critically and philosophically about these, as they push forward into novel domains. Philosophers need to understand them within a world increasingly shaped by computer technology, in which a whole new range of enquiry has opened up, from the philosophy of AI, artificial life and computation, to the ethics of privacy and intellectual property, to the epistemology of computer models (e.g. of global warming). I wish every CS student had taken a course in ethics.