- UAV Offers of Assistance in Colorado Rebuffed by FEMA — we were told by FEMA that anyone flying drones would be arrested. [...] Civil Air Patrol and private aircraft were authorized to fly over the small town tucked into the base of Rockies. Unfortunately due to the high terrain around Lyons and large turn radius of manned aircraft they were flying well out of a useful visual range and didn’t employ cameras or live video feed to support the recovery effort. Meanwhile we were grounded on the Lyons high school football field with two Falcons that could have mapped the entire town in less than 30 minutes with another few hours to process the data providing a near real time map of the entire town.
- Texas Bans Some Private Use of Drones (DIY Drones) — growing move for govt to regulate drones.
- IETF PRISM-Proof Plans (Parity News) — Baker starts off by listing out the attack degree including the likes of information / content disclosure, meta-data analysis, traffic analysis, denial of service attacks and protocol exploits. The author then describes the different capabilities of an attacker and the ways in which an attack can be carried out – passive observation, active modification, cryptanalysis, covert channel analysis, lawful interception, Subversion or Coercion of Intermediaries among others.
- Data Mining and Analysis: Fundamental Concepts and Algorithms (PDF) — 650 pages on clustering, sequence mining, SVMs, and more. (via author’s page)
ENTRIES TAGGED "Big Data"
Drones Dismissed, Drones Denied, Passing PRISM, and Data Analysis and Mining
Remote Work, Raspberry Pi Code Machine, Low-Latency Data Processing, and Probabilistic Table Parsing
- Fog Creek’s Remote Work Policy — In the absence of new information, the assumption is that you’re producing. When you step outside the HQ work environment, you should flip that burden of proof. The burden is on you to show that you’re being productive. Is that because we don’t trust you? No. It’s because a few normal ways of staying involved (face time, informal chats, lunch) have been removed.
- MillWheel (PDF) — a framework for building low-latency data-processing applications that is widely used at Google. Users specify a directed computation graph and application code for individual nodes, and the system manages persistent state and the continuous flow of records, all within the envelope of the framework’s fault-tolerance guarantees. From Google Research.
- Probabilistic Scraping of Plain Text Tables — the method leverages topological understanding of tables, encodes it declaratively into a mixed integer/linear program, and integrates weak probabilistic signals to classify the whole table in one go (at sub second speeds). This method can be used for any kind of classification where you have strong logical constraints but noisy data.
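The idea in the table-scraping item above, combining noisy per-line signals with hard logical constraints to label everything at once, can be sketched in a few lines. This is only an illustration: the labels, the ordering constraint, and the scores are all made up for the example, and a brute-force search stands in for the mixed integer/linear program the article describes.

```python
import itertools
import math

# Hypothetical label set for the lines of a plain-text table.
LABELS = ("header", "data", "footer")

def is_valid(labels):
    """Hard constraint (assumed for illustration): headers come first,
    then data rows, then footers, and there is at least one data row."""
    order = [LABELS.index(label) for label in labels]
    return order == sorted(order) and "data" in labels

def classify(scores):
    """scores: one dict per line mapping label -> probability (the weak signal).
    Returns the valid labeling with the highest joint log-probability.
    Brute force over all labelings; a real solver would scale far better."""
    best, best_ll = None, -math.inf
    for labels in itertools.product(LABELS, repeat=len(scores)):
        if not is_valid(labels):
            continue
        ll = sum(math.log(s[label]) for s, label in zip(scores, labels))
        if ll > best_ll:
            best, best_ll = labels, ll
    return best

# Made-up scores: line 3 looks like a header when judged on its own.
scores = [
    {"header": 0.7, "data": 0.2, "footer": 0.1},
    {"header": 0.1, "data": 0.8, "footer": 0.1},
    {"header": 0.6, "data": 0.3, "footer": 0.1},
    {"header": 0.1, "data": 0.7, "footer": 0.2},
]
print(classify(scores))
```

Labeling each line independently would call the third line a header, which breaks the ordering rule; scoring whole labelings lets the constraint override the noisy signal.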
PaaS Vendors, Educational MMO, Changing Culture, Data Mythologies
- Amazon Compute Numbers (ReadWrite) — AWS offers five times the utilized compute capacity of each of its other 14 top competitors—combined. (via Matt Asay)
- MIT Educational MMO — The initial phase will cover topics in biology, algebra, geometry, probability, and statistics, providing students with a collaborative, social experience in a systems-based game world where they can explore how the world works and discover important scientific concepts. (via KQED)
- Changing Norms (Atul Gawande) — neither penalties nor incentives achieve what we’re really after: a system and a culture where X is what people do, day in and day out, even when no one is watching. “You must” rewards mere compliance. Getting to “X is what we do” means establishing X as the norm.
- The Mythologies of Big Data (YouTube) — Kate Crawford at UC Berkeley iSchool. The six myths: ‘Big data are new’, ‘Big data is objective’, ‘Big data don’t discriminate’, ‘Big data makes cities smart’, ‘Big data is anonymous’, ‘You can opt out of big data’. (via Sam Kinsley)
Constant KV Store, Google Me, Learned Bias, and DRM-Stripping Lego Robot
- Sparkey — Spotify’s open-sourced simple constant key/value storage library, for read-heavy systems with infrequent large bulk inserts.
- The Truth of Fact, The Truth of Feeling (Ted Chiang) — story about what happens when lifelogs become searchable. Now with Remem, finding the exact moment has become easy, and lifelogs that previously lay all but ignored are now being scrutinized as if they were crime scenes, thickly strewn with evidence for use in domestic squabbles. (via BoingBoing)
- Algorithms Magnifying Misbehaviour (The Guardian) — when the training set embodies biases, the machine will exhibit biases too.
- Lego Robot That Strips DRM Off Ebooks (BoingBoing) — so. damn. cool. If it had been controlled by a C64, Cory would have hit every one of my geek erogenous zones with this find.
Big Diner, Fab Future, Browser Crypto, and STEM Crisis Questioned
- In Search of the Optimal Cheeseburger (Hilary Mason) — playing with NYC menu data. There are 5,247 cheeseburgers you can order in Manhattan. Her Ignite talk from Ignite NYC15.
- James Burke Predicting the Future — spoiler: massive disruption from nano-scale personal fabbing.
- The STEM Crisis is a Myth (IEEE Spectrum) — Every year U.S. schools grant more STEM degrees than there are available jobs. When you factor in H-1B visa holders, existing STEM degree holders, and the like, it’s hard to make a case that there’s a STEM labor shortage.
Bezos on Business, CS Ratios, Easier Hadoopery, and AWS CLI
- Bezos at the Post (Washington Post) — “All businesses need to be young forever. If your customer base ages with you, you’re Woolworth’s,” added Bezos.[...] “The number one rule has to be: Don’t be boring.” (via Julie Starr)
- How Carnegie Mellon Increased Women in Computer Science to 42% — outreach, admissions based on potential not existing advantage, making CS classes practical from the start, and peer support.
- Summingbird (Github) — Twitter open-sourced library that lets you write streaming MapReduce programs that look like native Scala or Java collection transformations and execute them on a number of well-known distributed MapReduce platforms like Storm and Scalding.
- aws-cli (Github) — commandline for Amazon Web Services. (via AWS Blog)
The Internet of Americas, Pharma Pricey, Who's Watching, and Data Mining Course
- Bradley Manning and the Two Americas (Quinn Norton) — The first America built the Internet, but the second America moved onto it. And they both think they own the place now. The best explanation you’ll find for wtf is going on.
- Staggering Cost of Inventing New Drugs (Forbes) — $5B to develop a new drug; and subject to an inverse-Moore’s law: A 2012 article in Nature Reviews Drug Discovery says the number of drugs invented per billion dollars of R&D invested has been cut in half every nine years for half a century.
- Who’s Watching You — (Tim Bray) threat modelling. Everyone should know this.
- Data Mining with Weka — learn data mining with the popular open source Weka platform.
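The inverse-Moore’s law figure in the Forbes item above is easy to sanity-check. Taking the quoted numbers at face value (a halving every nine years, sustained for fifty years), the cumulative decline is 2^(50/9). A minimal sketch; the function name is just for this example:

```python
def decline_factor(years: float, halving_period: float) -> float:
    """How many times smaller the output per dollar is after `years`,
    if it halves once every `halving_period` years."""
    return 2 ** (years / halving_period)

# Quoted figures: halved every nine years for half a century.
factor = decline_factor(50, 9)
print(f"roughly {factor:.0f}x fewer drugs per billion dollars of R&D")
```

That works out to roughly a 47-fold drop in drugs per R&D dollar over the period.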
Approximate Queries, Spreadsheet as Database, China Robot Plans, and Open Source Google App Engine
- blinkdb — The current version of BlinkDB supports a slightly constrained set of SQL-style declarative queries and provides approximate results for standard SQL aggregate queries, specifically queries involving COUNT, AVG, SUM and PERCENTILE and is being extended to support any User-Defined Functions (UDFs). Queries involving these operations can be annotated with either an error bound, or a time constraint, based on which the system selects an appropriate sample to operate on.
- China Plans to Become a Leader in Robotics (Quartz) — The ODCCC too funds high risk research initiatives through the Thousand Talent Project (TTP), a three-year term project with possible extension. The goal of the TTP is to recruit thousands of foreign researchers with strong expertise in hardware and software to help develop innovation in China. There are already more than 100 foreign researchers working in China since 2008, the year TTP started.
- AppScale (GitHub) — open source implementation of Google App Engine.
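The error-bound idea in the BlinkDB item above can be illustrated with a toy approximate AVG: sample the data, estimate the margin of error, and grow the sample until the margin fits the bound. This is a sketch under simple assumptions (uniform random sampling, a normal approximation, a doubling schedule), not BlinkDB’s actual sample-selection strategy, and every name in it is hypothetical.

```python
import math
import random
import statistics

def approximate_avg(values, max_error, z=1.96, start=100, seed=0):
    """Estimate the mean of `values` (a list) from a random sample.

    The sample doubles in size until the estimated margin of error
    (z * stdev / sqrt(n), about 95% confidence for z=1.96) is within
    `max_error`, or until the whole list has been used.
    Returns (estimate, margin, sample_size).
    """
    rng = random.Random(seed)
    n = min(max(start, 2), len(values))  # stdev needs at least two points
    while True:
        sample = rng.sample(values, n)
        mean = statistics.fmean(sample)
        margin = z * statistics.stdev(sample) / math.sqrt(n)
        if margin <= max_error or n == len(values):
            return mean, margin, n
        n = min(n * 2, len(values))

# Made-up data: 100,000 values cycling through 0..99 (true mean 49.5).
data = [float(i % 100) for i in range(100_000)]
estimate, margin, used = approximate_avg(data, max_error=1.0)
print(f"AVG is about {estimate:.2f} +/- {margin:.2f} using {used} of {len(data)} rows")
```

The trade-off is the one the item describes: a looser error bound lets the query touch fewer rows and return sooner.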
Aural Viz, SPOF ID, Information Asymmetry, and Support IA
- choir.io explained (Alex Dong) — Sound is the perfect medium for wearable computers to talk back to us. Sound has a dozen of properties that we can tune to convey different level of emotions and intrusivenesses. Different sound packs would fit into various contexts.
- Identity Single Point of Failure (Tim Bray) — continuing his excellent series on federated identity. There’s this guy here at Google, Eric Sachs, who’s been doing Identity stuff in the white-hot center of the Internet universe for a lot of years. One of his mantras is “If you’re typing a password into something, unless they have 100+ full-time engineers working on security and abuse and fraud, you should be nervous.” I think he’s right.
- What Does It Really Matter If Companies Are Tracking Us Online? (The Atlantic) — Rather, the failures will come in the form of consumers being systematically charged more than they would have been had less information about that particular consumer. Sometimes, that will mean exploiting people who are not of a particular class, say upcharging men for flowers if a computer recognizes that he’s looking for flowers the day after his anniversary. A summary of Ryan Calo’s paper. (via Slashdot)
- Life Inside Brewster’s Magnificent Contraption (Jason Scott) — I’ve been really busy. Checking my upload statistics, here’s what I’ve added to the Internet Archive: Over 169,000 individual objects, totaling 245 terabytes. You should subscribe and keep them in business. I did.