- Gravity in the Margins (Got Medieval) — illuminating illuminated manuscripts with Mario. (via BoingBoing)
- Hours Days, Who’s Counting? (Jon Udell) — “What prompted me to check? My friend Mike Caulfield, who’s been teaching and writing about quantitative literacy, says it’s because in this case I did have some touchstone facts parked in my head, including the number 10 million (roughly) for barrels of oil imported daily to the US. The reason I’ve been working through a bunch of WolframAlpha exercises lately is that I know I don’t have those touchstones in other areas, and want to develop them.” The idea of “touchstone facts” resonates with me.
- Spotting Fake Reviewer Groups in Consumer Reviews (PDF) — gotta love any paper that says: “We calculated the ‘spamicity’ (degree of spam) of each group by assigning 1 point for each spam judgment, 0.5 point for each borderline judgment and 0 point for each non-spam judgment a group received and took the average of all 8 labelers.” (via Google Research Blog)
- Visualizing Physical Activity Using Abstract Ambient Art (Quantified Self) — kinda like the iTunes visualizer but for your Fitbit Tracker.
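The spamicity scoring quoted in the fake reviewer groups item above is simple enough to sketch in a few lines of Python. This is only an illustration of the arithmetic as the quoted sentence describes it, not the paper's code: the function name, the label strings, and the sample judgments are all made up for the example.

```python
# Points per judgment, as given in the quoted passage:
# 1 for spam, 0.5 for borderline, 0 for non-spam.
# The label strings themselves are an assumption for this sketch.
POINTS = {"spam": 1.0, "borderline": 0.5, "non-spam": 0.0}


def spamicity(judgments):
    """Average the per-labeler points that one reviewer group received."""
    if not judgments:
        raise ValueError("need at least one judgment")
    return sum(POINTS[j] for j in judgments) / len(judgments)


# Hypothetical judgments from 8 labelers (invented data, not from the paper).
example = ["spam"] * 5 + ["borderline"] * 2 + ["non-spam"]
print(spamicity(example))  # (5 * 1 + 2 * 0.5 + 0) / 8 = 0.75
```

A group that every labeler calls spam scores 1.0, and one that nobody flags scores 0.0, so the number reads as a rough "fraction of labelers convinced."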
ENTRIES TAGGED "machine learning"
Kaggle now accepting data before a contest, HP's Autonomy purchase comes into focus, Cloudera's new Hadoop distribution.
In this week's data news, Kaggle launches Prospect, HP unveils its big data plans, and Cloudera releases CDH4 (the latest version of its Hadoop distribution).
Illuminated Mario, Touchstone Facts, Calculating Spamicity, and Abstract Quantified Self
Elective Dickery, Probabilistic Data Analysis, Data Cleaning, and SSL Security
- Punting on SxSW (Brad Feld) — I came across this old post and thought: if you can make money by being a dick, or make money by being a caring family person, why would you choose to be a dick? As far as I can tell, being a dick is optional. Brogrammers, take note. Be more like Brad Feld, who prioritises his family and acts accordingly.
- Probabilistic Structures for Data Mining — readable introduction to useful algorithms and datastructures showing their performance, reliability, and resources trade-off. (via Hacker News)
- Many HTTPS Servers are Insecure — 75% still vulnerable to the BEAST attack.
Text Similarity, Designing Engagement, Clustering Stories, and Prince of Persia
- Superfastmatch — open source text comparison tool, used to locate plagiarism/churnalism in online news sites. You can pull out the text engine and use it for your own “find where this text is used elsewhere” applications (e.g., what’s being forwarded out in email, how much of this RFP is copy and paste, what’s NOT boilerplate in this contract, etc.). (via Pete Warden)
- Ten Design Principles for Engaging Math Tasks (Dan Meyer) — education gold, engagement gold, and some serious ideas you can use in your own apps.
- Clustering Related Stories (Jenny Finkel) — description of how to cluster related stories, talks about some of the tricks. Interesting without being too scary.
- Prince of Persia (GitHub) — I have waited to see if the novelty wore off, but I still find this cool: 1980s source code on GitHub.
Rich machine learning products come from skilled and knowledgeable teams.
Specific insights into a problem and careful model design separate a machine learning system that doesn't work from one that people will actually use.
Carsharing boosts city governments, why complex systems fail, and what web ops teams could do with big data.
This week on O'Reilly: How Zipcar's technology is saving big money for U.S. city governments, why scalable clouds need simple parts, and pondering the possibilities of web ops and machine learning.
In this first episode of "Editorial Radar," O'Reilly editors Mike Loukides and Mike Hendrickson discuss the important technologies they're tracking.
Google Maps alternatives, inside Dart, and the upside of offline.
This week on O'Reilly: StreetEasy's Sebastian Delmont explained why his team left Google Maps behind, we looked at the ins and outs of the Dart programming platform, and Jim Stogdill considered the alternatives to always-on living.
Squirrel Targeting with Computer Vision, Audio Recognition, Single Page Apps, and Persisting at Failing
- Militarizing Your Backyard With Python and Computer Vision (video) — using a water cannon, computer video, Arduino, and Python to keep marauding squirrel hordes under control. See the finished result for Yakkity Saxed moist rodent goodness.
- Soundbite — dialogue search for Apple’s Final Cut Pro and Adobe Premiere Pro. Boris Soundbite quickly and accurately finds any word or phrase spoken in recorded media. Shoot squirrels with computer vision, search audio with computer hearing. We live in the future, people. (via Andy Baio)
- Why Finish Books? (NY Review of Books) — the more bad books you finish, the fewer good ones you’ll have time to start. Applying this to the rest of life is left as an exercise for the reader.