- Gravity in the Margins (Got Medieval) — illuminating illuminated manuscripts with Mario. (via BoingBoing)
- Hours Days, Who’s Counting? (Jon Udell) — “What prompted me to check? My friend Mike Caulfield, who’s been teaching and writing about quantitative literacy, says it’s because in this case I did have some touchstone facts parked in my head, including the number 10 million (roughly) for barrels of oil imported daily to the US. The reason I’ve been working through a bunch of WolframAlpha exercises lately is that I know I don’t have those touchstones in other areas, and want to develop them.” The idea of “touchstone facts” resonates with me.
- Spotting Fake Reviewer Groups in Consumer Reviews (PDF) — gotta love any paper that says: “We calculated the ‘spamicity’ (degree of spam) of each group by assigning 1 point for each spam judgment, 0.5 point for each borderline judgment and 0 point for each non-spam judgment a group received and took the average of all 8 labelers.” (via Google Research Blog)
- Visualizing Physical Activity Using Abstract Ambient Art (Quantified Self) — kinda like the iTunes visualizer but for your Fitbit Tracker.
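The spamicity scoring quoted from the fake reviewer groups paper is simple enough to sketch in a few lines. This is a minimal illustration of the quoted rule only (1 point for spam, 0.5 for borderline, 0 for non-spam, averaged over the labelers). The function name, the label strings, and the example judgments are assumptions made up for the sketch, not taken from the paper.

```python
from statistics import mean

# Points per judgment, following the rule quoted from the paper.
# The label strings themselves are an assumption for this sketch.
POINTS = {"spam": 1.0, "borderline": 0.5, "non-spam": 0.0}


def spamicity(judgments):
    """Average the per-labeler points for one reviewer group."""
    if not judgments:
        raise ValueError("need at least one judgment")
    return mean(POINTS[j] for j in judgments)


# Hypothetical group judged by 8 labelers (labels invented for illustration).
example = ["spam"] * 5 + ["borderline"] * 2 + ["non-spam"]
print(spamicity(example))  # 0.75
```

Five spam votes and two borderline votes give 6 points, which averages to 0.75 over 8 labelers.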
ENTRIES TAGGED "machine learning"
Strata Week: Data prospecting with Kaggle
Kaggle now accepting data before a contest, HP's Autonomy purchase comes into focus, Cloudera's new Hadoop distribution.
In this week's data news, Kaggle launches Prospect, HP unveils its big data plans, and Cloudera releases CDH4 (the latest version of its Hadoop distribution).
Four short links: 10 May 2012
Illuminated Mario, Touchstone Facts, Calculating Spamicity, and Abstract Quantified Self
Four short links: 2 May 2012
Elective Dickery, Probabilistic Data Analysis, Data Cleaning, and SSL Security
- Punting on SxSW (Brad Feld) — I came across this old post and thought: if you can make money by being a dick, or make money by being a caring family person, why would you choose to be a dick? As far as I can tell, being a dick is optional. Brogrammers, take note. Be more like Brad Feld, who prioritises his family and acts accordingly.
- Probabilistic Structures for Data Mining — readable introduction to useful algorithms and data structures, showing their performance, reliability, and resource trade-offs. (via Hacker News)
- Dataset — a JavaScript library for transforming, querying, and manipulating data from different sources.
- Many HTTPS Servers are Insecure — 75% still vulnerable to the BEAST attack.
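To give a feel for the performance, reliability, and resource trade-off that probabilistic structures make, here is a minimal sketch of one well-known example, a Bloom filter. Choosing a Bloom filter is an assumption for illustration; the link description above does not say which structures the article covers. The class name, the default sizes, and the SHA-256-based hashing are likewise arbitrary choices for the sketch.

```python
import hashlib


class BloomFilter:
    """Set-membership test that may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Illustrative defaults; real sizes depend on the expected item count.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive several bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))


seen = BloomFilter()
seen.add("squirrel")
print(seen.might_contain("squirrel"))  # True
```

The memory used is fixed up front regardless of how many items are added; the price is a false-positive rate that grows as the filter fills.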
Four short links: 19 April 2012
Text Similarity, Designing Engagement, Clustering Stories, and Prince of Persia
- Superfastmatch — open source text comparison tool, used to locate plagiarism/churnalism in online news sites. You can pull out the text engine and use it for your own “find where this text is used elsewhere” applications (e.g., what’s being forwarded out in email, how much of this RFP is copy and paste, what’s NOT boilerplate in this contract, etc.). (via Pete Warden)
- Ten Design Principles for Engaging Math Tasks (Dan Meyer) — education gold, engagement gold, and some serious ideas you can use in your own apps.
- Clustering Related Stories (Jenny Finkel) — description of how to cluster related stories, talks about some of the tricks. Interesting without being too scary.
- Prince of Persia (GitHub) — I have waited to see if the novelty wore off, but I still find this cool: 1980s source code on GitHub.
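The “find where this text is used elsewhere” idea from the Superfastmatch link can be sketched with overlapping word windows (shingles). This is not Superfastmatch's algorithm, which is not described here; it is a generic technique swapped in for illustration, and the function names and the window size of 3 are assumptions.

```python
def shingles(text, size=3):
    """Return the set of overlapping word windows of length `size`."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(candidate, source, size=3):
    """Fraction of the candidate's shingles that also appear in the source."""
    cand = shingles(candidate, size)
    if not cand:
        return 0.0  # candidate shorter than one window
    return len(cand & shingles(source, size)) / len(cand)


# Hypothetical snippets, invented for illustration.
press_release = "the new widget ships next month at a lower price"
article = "sources say the new widget ships next month in most regions"
print(overlap("the new widget ships next month", article))
```

A score near 1.0 suggests the candidate was largely copied from the source; a score near 0.0 suggests the wording is mostly original.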
What it takes to build great machine learning products
Rich machine learning products come from skilled and knowledgeable teams.
Specific insights into a problem and careful model design separate a machine learning system that doesn't work from one that people will actually use.
Editorial Radar with Mike Loukides & Mike Hendrickson
Discussion on machine learning, 3D printing, devices and JavaScript
In this first episode of "Editorial Radar," O'Reilly editors Mike Loukides and Mike Hendrickson discuss the important technologies they're tracking.
Strata Week: Machine learning vs domain expertise
Debating the data skills of machines and experts, a key data move for Microsoft, and Google Analytics gets social.
This week's data news includes another look at the Strata Conference's debate about machine learning versus subject matter expertise, Raghu Ramakrishnan moves from Yahoo to Microsoft, and more social data comes to Google Analytics.
Four short links: 16 March 2012
Squirrel Targeting with Computer Vision, Audio Recognition, Single Page Apps, and Persisting at Failing
- Militarizing Your Backyard With Python and Computer Vision (video) — using a water cannon, computer video, Arduino, and Python to keep marauding squirrel hordes under control. See the finished result for Yakkity Saxed moist rodent goodness.
- Soundbite — dialogue search for Apple’s Final Cut Pro and Adobe Premiere Pro. Boris Soundbite quickly and accurately finds any word or phrase spoken in recorded media. Shoot squirrels with computer vision, search audio with computer hearing. We live in the future, people. (via Andy Baio)
- Single Page Apps with Backbone.js — interesting and detailed dissection of how one site did it. Single page apps are where the server sends back one HTML file which changes (via JavaScript) in response to the user’s activity, possibly with API calls happening in the background, but where the browser is very definitely not requesting more full HTML pages from the server. The idea is to have speed (pull less across the wire each time the page changes) and also to use the language you already know to build the web page (JavaScript).
- Why Finish Books? (NY Review of Books) — the more bad books you finish, the fewer good ones you’ll have time to start. Applying this to the rest of life is left as an exercise for the reader.