"computer vision" entries

Using Python for Computer Vision

Jan Erik Solem describes elements and useful tools for computer vision

In this interview, Jan Erik Solem, author of the upcoming book "Programming Computer Vision with Python," describes the uses for some common operations and the choices programmers have.

Four short links: 2 May 2011

Internet Cafe Culture, Image Processing, Library Mining, and MediaWiki Parsing

  1. Chinese Internet Cafes (Bryce Roberts) — a good quick read. My note: people valued the same things in Internet cafes that they value in public libraries, and the uses are very similar. They pose a similar threat to the already-successful, which is why public libraries are threatened in many Western countries.
  2. SIFT — the Scale Invariant Feature Transform library, built on OpenCV, is a method to detect distinctive, invariant image feature points, which easily can be matched between images to perform tasks such as object detection and recognition, or to compute geometrical transformations between images. The licensing seems dodgy–MIT code but lots of “this isn’t a license to use the patent!” warnings in the LICENSE file. (via Joshua Schachter)
  3. The Secret Life of Libraries (Guardian) — I like the idea of the most-stolen-books revealing something about a region; it’s an aspect of data revealing truth. For a while, Terry Pratchett was the most-shoplifted author in England but newspapers rarely carried articles about him or mentioned his books (because they were genre fiction not “real” literature). (via Brian Flaherty)
  4. Sweble — MediaWiki parser library. Until today, Wikitext had been poorly defined. There was no grammar, no defined processing rules, and no defined output like a DOM tree based on a well defined document object model. This is to say, the content of Wikipedia is stored in a format that is not an open standard. The format is defined by 5000 lines of php code (the parse function of MediaWiki). That code may be open source, but it is incomprehensible to most. That’s why there are 30+ failed attempts at writing alternative parsers. (via Dirk Riehle)
Four short links: 22 March 2011

Local Community, Building Memories, Social Media, and ChumbyVision

  1. EveryBlock Redesigned — EB has been defined for a while now as “that site that makes my city’s statistics useful and relevant”. Now they’re getting more into the user-reporting: As valuable as automated updates of crime, media mentions, and other EveryBlock news are, contributions from your fellow neighbors are significantly more meaningful and useful. While we’re not removing our existing aggregation of public records and other neighborhood information (more on this in a bit), we’ve come to realize that human participation is essential, not only as a layer on top but as the bedrock of the site. They have a new mission: our goal is to help you make your block a better place. That’s a bold goal, and quite a big change from where they were at. Will they manage any aspect of journalism, or will this become a GroupOn-ad-filled geo-portal for MSNBC? Looking forward to finding out.
  2. Typography in 8 Bits: System Fonts — nifty rundown of fonts from the microcomputer days. I still go a bit weak-kneed at the sight of the C64 fonts. Which aspect of the system you’re building will be remembered with weak knees in (gulp) thirty years’ time? (via Joshua Schachter)
  3. Twitter in the Christchurch Earthquake — analysis of the tweets around the quake, including words and retweets. (via Richard Wood)
  4. ChumbyCV — computer vision framework for Chumby. CV on low-power ubiquitous hardware makes devices smarter and lets them act as higher-level sensors of activity and objects. (via BERG London)
Four short links: 15 October 2010

Long Tail, Copyright vs Preservation, Diminished Reality, and Augmented Data

  1. Mechanical Turk Requester Activity: The Insignificance of the Long Tail: For Wikipedia we have the 1% rule, where 1% of the contributors (this is 0.003% of the users) contribute two thirds of the content. In the Causes application on Facebook, there are 25 million users, but only 1% of them contribute a donation. […] The lognormal distribution of activity, also shows that requesters increase their participation exponentially over time: They post a few tasks, they get the results. If the results are good, they increase by a percentage the size of the tasks that they post next time. This multiplicative behavior is the basic process that generates the lognormal distribution of activity.
  2. Copyright Destroying Historic Audio — so says the Library of Congress. Were copyright law followed to the letter, little audio preservation would be undertaken. Were the law strictly enforced, it would brand virtually all audio preservation as illegal. Copyright laws related to preservation are neither strictly followed nor strictly enforced. Consequently, some audio preservation is conducted.
  3. Diminished Reality (Ray Kurzweil) — removes objects from video in real time. Great name, “diminished reality”. (via Andy Baio)
  4. Data Enrichment Service — using linked government data to augment text with annotations and links. (via Jo Walsh on Twitter)
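The multiplicative behavior quoted in the first item above can be illustrated with a small simulation. This is only a rough sketch: the number of requesters, the number of rounds, and the per-round growth range (between -20% and +30%) are made-up assumptions, not figures from the post. It shows that when each task size is the previous one scaled by a random percentage, the raw sizes end up heavily right-skewed while their logarithms look roughly symmetric, which is the signature of a lognormal distribution.

```python
import math
import random
import statistics


def simulate_task_sizes(n_requesters=2000, n_rounds=50, seed=0):
    """Grow each requester's task size by a random percentage per round.

    All parameters are illustrative assumptions, not data from the post.
    """
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_requesters):
        size = 1.0
        for _ in range(n_rounds):
            # Multiplicative step: change the size by -20% to +30%.
            size *= 1.0 + rng.uniform(-0.2, 0.3)
        sizes.append(size)
    return sizes


def skewness(values):
    """Population skewness: mean of the cubed standardized values."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


sizes = simulate_task_sizes()
log_sizes = [math.log(s) for s in sizes]

print("skewness of raw sizes:", round(skewness(sizes), 2))
print("skewness of log sizes:", round(skewness(log_sizes), 2))
```

Because the logarithm of a product is a sum of many independent terms, the log sizes tend toward a normal distribution, so the sizes themselves tend toward a lognormal one.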
Four short links: 1 October 2010

Javascript Hacking, Digitization, Computer Vision, and Cyborg Contemplation

  1. Interview with Marcin Wichary (Ajaxian) — interview with the creator of Google’s Pacman logo, the original HTML5 slide deck. One of the first popular home video game consoles was 1977’s Atari VCS 2600. It was an incredibly simple piece of hardware. It didn’t even have video memory – you literally had to construct pixels just moments before they were handed to the electron gun. It was designed for very specific, trivial games: two players, some bullets and a very sparse background. All the launch games looked like that. But within five years, companies figured out how to make games like Pitfall, which were much, much cooler and more sophisticated. Here’s the kicker: if you were to take those games, go back in time, and show them even to the *creators* of VCS, I bet they would tell you “Naah, it’s impossible to do that. The hardware we just put together won’t ever be able to handle this.” Likewise, if you were to take Google Maps or iPhone Web apps, take your deLorean to 1991 and show them to Tim Berners-Lee, he’d be all like “get the hell out of here.” (via Russ Weakley)
  2. Liberating Lives: The historian Tim Hitchcock, behind projects such as the Old Bailey Online and London Lives, has reflected on the impact of digitisation on our access to archives. Archives, he notes, tend to reflect the assumptions and practices of the institutions that created them. But by providing new ways into these records systems, technology can undermine the power relations that persist within their structures. Read the entire post, which has a moving description of the bureaucracy of Australia’s racism and the modern-day projects built on it. (via spanishmanners on Twitter)
  3. Deblurring Images — interesting research work reconstructing original scenes from blurred images. (via anselm on Twitter)
  4. 50 Years of Cyborgs: I Have Not the Words (Quinn Norton) — We need language that lets us talk about the terrorism of little changes. Be they good or bad, they are terrible in aggregate. Thought-provoking essay pushing our ideas of change, future, technology, and culture until they break. (via kevinmarks on Twitter)
Four short links: 4 August 2010

Python Reasoning, Learning the Right Way, Curated Folksonomy, Arduino Image Correction

  1. FuXi: Python-based, bi-directional logical reasoning system for the semantic web from the folks at the Open Knowledge Foundation. (via About Inferencing)
  2. Harness the Power of Being an Idiot: I learn by trying to build something, there’s no other way I can discover the devils-in-the-details. Unfortunately that’s an incredibly inefficient way to gain knowledge. I basically wander around stepping on every rake in the grass, while the A Students memorize someone else’s route and carefully pick their way across the lawn without incident. My only saving graces are that every now and again I discover a better path, and faced with a completely new lawn I have an instinct for where the rakes are.
  3. Stack Overflow’s Curated Folksonomy — community-driven tag synonym system to reduce the chaos of different names for the same thing. (via Skud)
  4. Image Deblurring using Inertial Measurement Sensors (Microsoft Research) — using Arduino to correct motion blur. (via Jon Oxer)
Four short links: 5 May 2010

Web IDEs, Timely Election Displays, Face Recognition, # Books/Kindle

  1. Sketch for Processing — an IDE for Processing based on Mozilla’s Bespin.
  2. British Election Results to be Broadcast on Big Ben — the monument is the message. Lovely integration of real-time data and architecture, an early step for urban infrastructure as display.
  3. Face.com API — an alpha API for face recognition.
  4. Average Number of Books/Kindle — short spreadsheet figuring out the average from cited numbers. (Spoiler: the answer is 27)
Four short links: 28 April 2010

Fair Use Economy, Deconstituted Appliances, 3D Vision, Redis for Fun and Profit

  1. Fair Use in the US Economy (PDF) — prepared by IT lobby in the US, it’s the counterpart to Big ©’s fictitious billions of dollars of losses due to file sharing. Take each with a grain of salt, but this is interesting because it talks about the industries and businesses that the fair use laws make possible.
  2. Disassembled Household Appliances — neat photos of the pieces in common equipment like waffle irons, sandwich makers, can openers, etc. (via evilmadscientist)
  3. GelSight — gel block on a sheet of glass, lit from below with lights and then scanned with cameras, lets you easily capture 3D qualities of the objects pressed into it. Very cool demo–you can see finger prints, pulse, and even make out designs on a $100 bill.
  4. Redis Tutorial (Simon Willison) — Redis is a very fast collection of useful behaviours wrapped around a distributed key-value store. You get locks, IDs, counters, sets, lists, queues, replication, and more.
Four short links: 22 January 2010

notmuch Email, Mobile Processing, Realtime Mocap, and Making Money from Books

  1. notmuch — commandline tagging and fast search for a mailbox, regardless of which mail client you use.
  2. Processing for Android — pre-release versions of Processing for Android devices. Mobile visual programming makes for interesting possibilities.
  3. Binary Body Double: Microsoft Reveals the Science Behind Project Natal for Xbox 360 — machine learning to recognize poses and render in the game at 30fps. It’s a basic real-time mocap and render.
  4. The Monetization Paradox — interesting post by Charlie Stross about the quandary of authors. The proposed $9.99 cap on ebooks replaces the high-end $24 hardcover. Not only does it mean less royalties for the authors, it means less money for the publishers — or, more importantly, their marketing divisions. Here’s a hint: if I wanted to spend my time marketing my books I’d have gone into marketing. I’m a writer. Every hour spent on marketing activities is an hour spent not writing. Ditto editing, proofreading, commissioning cover art, and so on. This is what I have publishers for.
Four short links: 8 December 2009

Python Moratorium, Math Pictures, Assemblers Needed, Tennis Vision

  1. Python’s Moratorium — Python language designers have declared a moratorium on enhancement proposals (feature requests) while the world’s Python programmers get used to the last batch of New And Shiny they shipped. I’m reasonably sure that the ALGOL designers went through exactly the same discussions, and I know Perl did too. So, don’t be afraid of it – don’t think that Python is evolutionarily dead – it’s not. We’re taking a stability and adoption break, a breather. We’re doing this to help users and developers, not to just be able to say “no” to every random idea sent to python-ideas, and not because we’re done. Reminds me of Perl god Jarkko Hietaniemi’s signature file: “There is this special biologist word we use for ‘stable’. It is ‘dead’.” — Jack Cohen.
  2. This Week’s Finds in Mathematical Physics — I can’t meaningfully contribute to the math, but golly them pictures are purty! (via Hacker News)
  3. x86 Assembly Encounter: To use a construction industry metaphor, an average x86 assembler has the complexity and usefulness of a hammer, while the DSP world is using high-speed mag-rail blast-o-matic nail guns with automatic feeders and superconducting magnets. […] I find it ridiculous that the most popular computing platform in the world does not have a decent assembler. What’s even worse, from the discussions I’ve seen on the net, people are mostly interested in how fast the assembler is (?!) rather than how much time it saves the programmer. (via Hacker News)
  4. Finding Tennis Courts in Aerial Photos — more hacking with computer vision techniques and publicly-available data. This is going to lead to good things (and some unpleasant surprises, as that which was formerly “too hard to find” ceases to be so). (via Simon Willison)