"cv" entries

Four short links: 27 July 2015

Google’s Borg, Georgia v. Malamud, SLAM-aware system, and SmartGPA

  1. Large-scale Cluster Management at Google with Borg: Google’s Borg system is a cluster manager that runs hundreds of thousands of jobs, from many thousands of different applications, across a number of clusters, each with up to tens of thousands of machines. […] We present a summary of the Borg system architecture and features, important design decisions, a quantitative analysis of some of its policy decisions, and a qualitative examination of lessons learned from a decade of operational experience with it.
  2. Georgia Sues Carl Malamud (TechDirt) — for copyright infringement… for publishing an official annotated copy of the state's laws. […] the state points directly to the annotated version as the official laws of the state.
  3. Monocular SLAM Supported Object Recognition (PDF) — a monocular SLAM-aware object recognition system that is able to achieve considerably stronger recognition performance, as compared to classical object recognition systems that function on a frame-by-frame basis. (via Improving Object Recognition for Robots)
  4. SmartGPA: How Smartphones Can Assess and Predict Academic Performance of College Students (PDF) — We show that there are a number of important behavioral factors automatically inferred from smartphones that significantly correlate with term and cumulative GPA, including time series analysis of activity, conversational interaction, mobility, class attendance, studying, and partying.
Four short links: 6 July 2015

DeepDream, In-Flight WiFi, Computer Vision in Preservation, and Testing Distributed Systems

  1. DeepDream — the software that’s been giving the Internet acid-free trips.
  2. In-Flight WiFi Business — numbers and context for why some airlines (JetBlue) have fast free in-flight wifi while others (Delta) have pricey slow in-flight wifi. Four years ago ViaSat-1 went into geostationary orbit, putting all other broadband satellites to shame with 140 Gbps of total capacity. This is the Ka-band satellite that JetBlue’s fleet connects to, and while the airline has to share that bandwidth with homes across North America that subscribe to ViaSat’s Excede residential broadband service, it faces no shortage of capacity. That’s why JetBlue is able to deliver 10-15 Mbps speeds to its passengers.
  3. British Library Digitising Newspapers (The Guardian) — as well as photogrammetry methods used in the Great Parchment Book project, Terras and colleagues are exploring the potential of a host of techniques, including multispectral imaging (MSI). Inks, pencil marks, and paper all reflect, absorb, or emit particular wavelengths of light, ranging from the infrared end of the electromagnetic spectrum, through the visible region and into the UV. By taking photographs using different light sources and filters, it is possible to generate a suite of images. “We get back this stack of about 40 images of the [document] and then we can use image-processing to try to see what is in [some of them] and not others,” Terras explains.
  4. Testing a Distributed System (ACM) — This article discusses general strategies for testing distributed systems as well as specific strategies for testing distributed data storage systems.
Four short links: 1 July 2015

Recovering from Debacle, Open IRS Data, Time Series Requirements, and Error Messages

  1. Google Dev Apologizes After Photos App Tags Black People as Gorillas (Ars Technica) — this is how you recover from an unequivocally horrendous mistake.
  2. IRS Finally Agrees to Release Non-Profit Records (BoingBoing) — Today, the IRS released a statement saying they’re going to do what we’ve been hoping for, saying they are going to release e-file data and this is a “priority for the IRS.” Only took $217,000 in billable lawyer hours (pro bono, thank goodness) to get there.
  3. Time Series Database Requirements — classic paper, laying out why time-series databases are so damn weird. Their access patterns are so unique because of the way data is over-gathered and pushed ASAP to the store. It’s mostly recent, mostly never useful, and mostly needed in order. (via Thoughts on Time-Series Databases)
  4. Compiler Errors for Humans — it’s so important, and generally underbaked in languages. A decade or more ago, I was appalled by Python’s errors after Perl’s very useful messages. Today, appreciating Go’s generally handy errors. How a system handles the operational failures that will inevitably occur is part and parcel of its UX.
Four short links: 10 June 2015

Product Sins, Container Satire, Dong Detection, and Evolving Code Designs

  1. The 11 Deadly Sins of Product Development (O’Reilly Radar) — they’re traps that are easy to fall into.
  2. It’s the Future — satire, but like all good satire it’s built on a rich vein of truth. Genuine guffaw funny, but Caution: Contains Rude Words.
  3. Difficulty of Dong Detection — accessible piece about how automated “inappropriate” detection remains elusive. (via Mind Hacks)
  4. Evolution of Code Design at Facebook — you may not have Facebook-scale problems, but if you’re having scale problems then Facebook’s evolution (not just their solutions) will interest you.
Four short links: 5 June 2015

IoT and New Hardware Movement, OpenCV 3, FBI vs Crypto, and Transactional Datastore

  1. New Hardware and the Internet of Things (Jon Bruner) — The Internet of Things and the new hardware movement are not the same thing. The new hardware movement is driven by new tools for: Prototyping (inexpensive 3D printers, CNC machine tools, cheap and powerful microcontrollers, high-level programming languages on embedded systems); Fundraising and business development (Highway1, Lab IX); Manufacturing (PCH, Seeed); Marketing (Etsy, Quirky). The IoT is driven by: Ubiquitous connectivity; Cheap hardware (i.e., the new hardware movement); Inexpensive data processing and machine learning.
  2. OpenCV 3.0 Released — I hadn’t realised how much hardware acceleration comes out of the box with OpenCV.
  3. FBI: Companies Should Help us Prevent Encryption (WaPo) — as Mike Loukides says, we are in a Post-Modern age where we don’t trust our computers and they don’t trust us. It’s jarring to hear the organisation that (over-zealously!) investigates computer crime arguing that citizens should not be able to secure their communications. It’s like police arguing against locks.
  4. cockroach: a scalable, geo-replicated, transactional datastore. The Wired piece about it drops the factoid that the creators of GIMP worked on Colossus, Google’s massive successor to the Google File System. From Photoshop-alike to massive file systems. Love it.
Four short links: 20 May 2015

Robots and Shadow Work, Time Lapse Mining, CS Papers, and Software for Reproducibility

  1. Rise of the Robots and Shadow Work (NY Times) — In “Rise of the Robots,” Ford argues that a society based on luxury consumption by a tiny elite is not economically viable. More to the point, it is not biologically viable. Humans, unlike robots, need food, health care and the sense of usefulness often supplied by jobs or other forms of work. Two thought-provoking and related books about the potential futures as a result of technology-driven change.
  2. Time Lapse Mining from Internet Photos (PDF) — First, we cluster 86 million photos into landmarks and popular viewpoints. Then, we sort the photos by date and warp each photo onto a common viewpoint. Finally, we stabilize the appearance of the sequence to compensate for lighting effects and minimize flicker. Our resulting time-lapses show diverse changes in the world’s most popular sites, like glaciers shrinking, skyscrapers being constructed, and waterfalls changing course.
  3. Git Repository of CS Papers: The intention here is to both provide myself with backups and easy access to papers, while also collecting a repository of links so that people can always find the paper they are looking for. Pull the repo and you’ll never be short of airplane/bedtime reading.
  4. Software For Reproducible Science: This quality is indeed central to doing science with code. What good is a data analysis pipeline if it crashes when I fiddle with the data? How can I draw conclusions from simulations if I cannot change their parameters? As soon as I need trust in code supporting a scientific finding, I find myself tinkering with its input, and often breaking it. Good scientific code is code that can be reused, that can lead to large-scale experiments validating its underlying assumptions.
Four short links: 11 May 2015

Age of Infrastructure, Facial Expressions, Proof Assistants, and Programmer Talent

  1. Welcome to the Age of Infrastructure (Annalee Newitz) — The Internet isn’t that thing in there, inside your little glowing box. It’s in your washing machine, kitchen appliances, pet feeder, your internal organs, your car, your streets, the very walls of your house. You use your wearable to interface with the world out there.
  2. Facial Performance Sensing Head-Mounted Display (YouTube) — glorious use of an Oculus headset, to capture (for reproduction on an avatar) fine-grained facial expressions. From SIGGRAPH 2015.
  3. Mathematical Proof Assistants — human augmentation in mathematics.
  4. The Programmer Talent Myth (LWN) — Jacob Kaplan-Moss on the distribution of programmer talent and the damage that the bimodal myth causes.
Four short links: 12 March 2015

Billion Node Graphs, Asynchronous Systems, Deep Learning Hardware, and Vision Resources

  1. Mining Billion Node Graphs: Patterns and Scalable Algorithms (PDF) — slides from a CMU academic’s talk at C-BIG 2012.
  2. There Is No Now: One of the most important results in the theory of distributed systems is an impossibility result, showing one of the limits of the ability to build systems that work in a world where things can fail. This is generally referred to as the FLP result, named for its authors, Fischer, Lynch, and Paterson. Their work, which won the 2001 Dijkstra Prize for the most influential paper in distributed computing, showed conclusively that some computational problems that are achievable in a “synchronous” model in which hosts have identical or shared clocks are impossible under a weaker, asynchronous system model.
  3. Deep Learning Hardware Guide: One of the worst things you can do when building a deep learning system is to waste money on hardware that is unnecessary. Here I will guide you step by step through the hardware you will need for a cheap high performance system.
  4. Awesome Computer Vision — curated list of computer vision resources.
Four short links: 22 December 2014

Manufacturers and Consumers, Time Management, Ethical Decisions, and Faux Faces

  1. Manufacturers and Consumers (Matt Webb) — manufacturers never spoke to consumers before. They spoke with distributors and retailers. But now products are connected to the Internet, manufacturers suddenly have a relationship with the consumer. And they literally don’t know what to do.
  2. Calendar Hacks (Etsy) — inspiration for your New Year’s resolution to waste less time.
  3. Making an Ethical Decision — there actually is an [web] app for that.
  4. Masks That Look Human to Computers — an artist creates masks that look like faces to face-recognition algorithms, but not necessarily to us. cf Deep Neural Networks are Easily Fooled.
Four short links: 24 November 2014

Magic Leap, Constant Improvement, Philanthropofallacies, and Chinese Manufacturing

  1. How Magic Leap is Secretly Creating a New Alternate Reality (Gizmodo) — amazing piece of investigative tech journalism.
  2. Better All The Time (New Yorker) — What we’re seeing is, in part, the mainstreaming of excellent habits. […] Everyone works hard. Everyone is really good.
  3. Stop Trying to Save the World (New Republic) — What I want to talk shit on is the paradigm of the Big Idea—that once we identify the correct one, we can simply unfurl it on the entire developing world like a picnic blanket. (note: some pottymouth language in this article, and some analysis I wholeheartedly agree with.)
  4. Christmas in Yiwu: We travelled by container ship across the East China Sea before following the electronics supply chain around China, visiting factories, distributors, wholesalers and refineries. Fascinating! 22km of corridors in the mall that dollar store buyers visit to fill their shelves. I had never seen so many variations of the same product. Dozens of Christmas stockings bearing slightly different Santas and snowmen. Small tweaks on each theme. An in-house designer creates these designs. It feels like a brute force approach to design, creating every single possibility and then letting the market decide which it wants to buy. If none of the existing designs appeal to a buyer they can get their own designs manufactured instead. When a custom design is successful, with the customer placing a large order, it is copied by the factory and offered in their range to future buyers. The factory sales agent indicated that designs weren’t protected and could be copied freely, as long as trademarks were removed. Parallels with web design left as exercise to the reader. (via the ever-discerning Mr Webb)