ENTRIES TAGGED "writing"
Bio-Writing, Internet Fame, Obama's Tech, and Precog Software
- How to Write a Good Bio (Scott Berkun) — something we all have to do, and rarely do well the first time. Excellent advice.
- Scumbag Steve’s Advice for Annoying Facebook Girl — “Some people can’t distinguish the internet from real life. There are people who refuse to believe my name isn’t Steve and that I am not really the scumbag (well not all the time, that is). Just remember who you are. And that you know you’re a decent kid.” Blake (the guy whose image was adopted as “Scumbag Steve” by meme-makers) was 21 when he wrote that, and it remains the best advice for anyone dealing with sudden visibility in the public eye.
- The Battle for Obama’s Tech (The Verge) — same old story: the software that got Obama elected won’t be released. Instead it’ll atrophy and have to be rewritten in four years’ time. How do I know this? The morons at the Democratic Party did it with Kerry’s run and again for Obama’s first campaign. It’s a choice the OFA developers warn could not only squander the digital advantage the Democrats now hold, but also severely impact their ability to recruit top tech talent in the future.
- Precog Software (Wired) — researchers assembled a dataset of more than 60,000 crimes, including homicides, then wrote an algorithm to find the people behind the crimes who were more likely to commit murder when paroled or put on probation. Berk claims the software could identify eight future murderers out of 100. The software parses about two dozen variables, including criminal record and geographic location. The type of crime and the age at which it was committed, however, turned out to be two of the most predictive variables. [...] The software aims to replace the judgments parole officers already make based on a parolee’s criminal record and is currently being used in Baltimore and Philadelphia. I look forward to the study comparing human judgement from parole officers against algorithmic judgement.
A letter asking for an introduction meets a meditation on self-reliance.
Text Analysis Bundle, Scala Probabilistic Modeling, Game Analytics, and Encouraging Writing
- Pattern — a BSD-licensed bundle of Python tools for data retrieval, text analysis, and data visualization. If you were going to get started with accessible data (Twitter, Google), the fundamentals of analysis (entity extraction, clustering), and some basic visualizations of graph relationships, you could do a lot worse than to start here.
- Factorie (Google Code) — Apache-licensed Scala library for a probabilistic modeling technique successfully applied to [...] named entity recognition, entity resolution, relation extraction, parsing, schema matching, ontology alignment, latent-variable generative models, including latent Dirichlet allocation. The state-of-the-art big data analysis tools are increasingly open source, presumably because the value lies in their application not in their existence. This is good news for everyone with a new application.
- Playtomic — analytics as a service for gaming companies to learn what players actually do in their games. There aren’t many fields untouched by analytics.
- Write or Die — iPad app for writers where, if you don’t keep writing, it begins to delete what you wrote earlier. Good for production to deadlines; reflective editing and deep thought not included.
Access Over Ownership, Retro Programming, Replaying Writing, and Wearable Sensors
- Steve Case and His Companies (The Atlantic) — Maybe you see three random ideas. Case and his team saw three bets that paid off thanks to a new Web economy that promotes power in numbers and access over ownership. “Access over ownership” is a phrase that resonated. (via Walt Mossberg)
- Back to the Future — teaching kids to program by giving them microcomputers from the 80s. I sat my kids down with a C64 emulator and an Usborne book to work through some BASIC examples. It’s not a panacea, but it solves a lot of bootstrapping problems with teaching kids to program.
- Replaying Writing an Essay — Paul Graham wrote an essay using one of his funded startups, Stypi, and then had them hack it so you could replay the development with the feature that everything that was later deleted is highlighted yellow as it’s written. The result is fascinating to watch. I would like my text editor to show me what I need to delete ;)
- Jawbone Live Up — wristband that syncs with the iPhone. Interesting wearable product, tied into the ability to gather data on ourselves. The product looks physically nice, but the quantified-self user experience needs the same polish and smoothness. An intrusive experience (“and now I’m quantifying myself!”) limits the audience to nerds or the VERY motivated.
Commandline for Story, Dystopic Predictions, Studying Failures, and Two Great Tastes
- Curveship — a new interactive fiction system that can tell the same story in many different ways. Check out the examples on the home page. Important because interactive fiction and the command-lines of our lives are inextricably intertwined.
- Egypt’s Revolution: Coming to an Economy Near You (Umair Haque) — more dystopic prediction, but this phrase rings true: The lesson: You can’t steal the future forever — and, in a hyperconnected world, you probably can’t steal as much of it for as long.
- Why Startups Fail — failure is a more instructive teacher than success, so simply studying successful startups isn’t enough. (via Hacker News)
- Computer Science and Philosophy — Oxford is offering a program studying CS and Philosophy together. “The two disciplines share a broad focus on the representation of information and rational inference, embracing common interests in algorithms, cognition, intelligence, language, models, proof, and verification. Computer Scientists need to be able to reflect critically and philosophically about these, as they push forward into novel domains. Philosophers need to understand them within a world increasingly shaped by computer technology, in which a whole new range of enquiry has opened up, from the philosophy of AI, artificial life and computation, to the ethics of privacy and intellectual property, to the epistemology of computer models (e.g. of global warming).” I wish every CS student had taken a course in ethics.
Thumb Drives and the Cloud, FCC APIs, Mining on GFS, Check Your Prose with Scribe
- CloudUSB — a USB key containing your operating environment and your data + a protected folder so nobody can access your data, even if you lose the key + a backup program which keeps a copy of your data on an online disk, with double password protection. (via ferrouswheel on Twitter)
- FCC APIs — for spectrum licenses, consumer broadband tests, census block search, and more. (via rjweeks70 on Twitter)
- Sibyl: A system for large scale machine learning (PDF) — paper from Google researchers on how to build machine learning on top of a system designed for batch processing. (via Greg Linden)
- The Surprisingness of What We Say About Ourselves (BERG London) — I made a chart of word-by-word surprisingness: given the statement so far, could Scribe predict what would come next?
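The word-by-word surprisingness idea in the BERG London item can be tried at toy scale. The sketch below is not how Scribe works (the post does not say); it assumes a tiny bigram model with add-one smoothing, and the function names `train_bigrams` and `surprisal` are made up for illustration. Surprisingness here is measured in bits: the less likely a word is given the word before it, the higher its score.

```python
import math
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count which word follows which in a list of training sentences."""
    follows = defaultdict(Counter)
    vocab = set()
    for sentence in sentences:
        # "<s>" is a hypothetical start-of-statement marker.
        words = ["<s>"] + sentence.lower().split()
        vocab.update(words)
        for prev, word in zip(words, words[1:]):
            follows[prev][word] += 1
    return follows, vocab


def surprisal(follows, vocab, statement):
    """Return (word, bits) pairs: -log2 of the smoothed bigram probability."""
    words = ["<s>"] + statement.lower().split()
    slots = len(vocab) + 1  # one extra slot reserved for unseen words
    scores = []
    for prev, word in zip(words, words[1:]):
        counts = follows.get(prev, Counter())
        # Add-one smoothing keeps unseen words from having zero probability.
        p = (counts[word] + 1) / (sum(counts.values()) + slots)
        scores.append((word, -math.log2(p)))
    return scores


if __name__ == "__main__":
    corpus = [
        "i am a writer",
        "i am a designer",
        "i am a writer and editor",
    ]
    follows, vocab = train_bigrams(corpus)
    for word, bits in surprisal(follows, vocab, "i am a plumber"):
        print(f"{word:10s} {bits:.2f} bits")
```

With a corpus of real bios in place of the three made-up sentences, plotting the bits per word would give a chart in the same spirit as the one in the post.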
Crowdsourced Climate Science, Underground Map of Science, Programming Clue, and Great Molbio Writing
- GalaxyZoo for Climate Science? — GalaxyZoo crowdsources physics research. A group of climate scientists wants the same, to help predict “weather events”. See also the Guardian article. (via adw_tweets on Twitter)
- Crispian’s Science Map — gorgeous Underground-style map showing scientists and their contributions. (via arjenlentz on Twitter)
- Programming Things I Wish I Knew Earlier (Ted Dziuba) — opinionated piece, but boils down to “keep it simple until you can’t”, and “the more you know about the actual hardware, the better you can code”. With EC2, when Amazon says “I/O performance: High”, what does that even mean? Is that suitable for a heavy random read scenario? (via Hacker News)
- The Molecular Biology Carnival, 2nd edition — a collection of excellent blog writing about molecular biology. (via BioinfoTools on Twitter)
Narrative and Structure, Teaching Science, Time-Series Statistics, and Who Benefits from Open Source
- Why Narrative and Structure are Important (Ed Yong) — Ed looks at how Atul Gawande’s piece on death and dying, which is 12,000 words long, is an easy and fascinating read despite the length.
- Understanding Science (Berkeley) — simple teaching materials to help students understand the process of science. (via BoingBoing comments)
- Sax: Symbolic Aggregate approXimation — SAX is the first symbolic representation for time series that allows for dimensionality reduction and indexing with a lower-bounding distance measure. In classic data mining tasks such as clustering, classification, index, etc., SAX is as good as well-known representations such as Discrete Wavelet Transform (DWT) and Discrete Fourier Transform (DFT), while requiring less storage space. In addition, the representation allows researchers to avail of the wealth of data structures and algorithms in bioinformatics or text mining, and also provides solutions to many challenges associated with current data mining tasks. One example is motif discovery, a problem which we recently defined for time series data. There is great potential for extending and applying the discrete representation on a wide class of data mining tasks. Source code has “non-commercial” license. (via rdamodharan on Delicious)
- Open Source OSCON (RedMonk) — The business of selling open source software, remember, is dwarfed by the business of using open source software to produce and sell other services. And yet historically, most of the focus on open source software has accrued to those who sold it. Today, attention and traction is shifting to those who are not in the business of selling software, but rather share their assets via a variety of open source mechanisms. (via Simon Phipps)
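For the SAX item above, here is a minimal sketch of the usual three steps of the representation: z-normalize the series, average it into a fixed number of segments (piecewise aggregate approximation), then map each segment mean to a letter using breakpoints that cut the standard normal curve into equally likely regions. This is not the authors' code, which carries the “non-commercial” license mentioned above. The function names are invented for illustration, the population standard deviation is an assumption, and the lower-bounding distance measure is not implemented.

```python
import string
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale the series to zero mean and unit variance."""
    mu = mean(series)
    sigma = pstdev(series)  # assumption: population standard deviation
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, segments):
    """Piecewise aggregate approximation: the mean of each equal-width chunk."""
    if len(series) % segments != 0:
        raise ValueError("this sketch needs len(series) divisible by segments")
    size = len(series) // segments
    return [mean(series[i * size:(i + 1) * size]) for i in range(segments)]


def breakpoints(alphabet_size):
    """Cut points splitting the standard normal into equiprobable regions."""
    normal = NormalDist()
    return [normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def sax(series, segments, alphabet_size):
    """Turn a numeric series into a short word over a small alphabet."""
    cuts = breakpoints(alphabet_size)
    reduced = paa(znormalize(series), segments)
    # bisect_right finds which region each segment mean falls into.
    return "".join(string.ascii_lowercase[bisect_right(cuts, v)] for v in reduced)


if __name__ == "__main__":
    print(sax([1, 2, 3, 4, 5, 6, 7, 8], segments=4, alphabet_size=4))
```

Once a series is a string, text-mining and bioinformatics tools for strings apply to it, which is the reuse the SAX description points at.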