"cloud computing" entries

Four short links: 3 March 2016

Tagging People, Maintenance Anti-Pattern, Insourced Brains, and Chat UI

Human Traffickers Using RFID Chips (NPR) — It turns out this 20-something woman was being pimped out by her boyfriend, forced to sell herself for sex and hand him the money. “It was a small glass capsule with a little almost like a circuit board inside of it,” he said. “It’s an RFID chip. It’s used to tag cats and dogs. And someone had tagged her like an animal, like she was somebody’s pet that they owned.”
Software Maintenance is an Anti-Pattern — Governments often use two anti-patterns when sustaining software: equating the “first release” with “complete” and moving to reduce sustaining staff too early; and how a reduction of staff is managed when a reduction in budget is appropriate.
Cloud Latency and Autonomous Robots (Ars Technica) — “Accessing a cloud computer takes too long. The half-second time delay is too noticeable to a human,” says Ishiguro, an award-winning roboticist at Osaka University in Japan. “In real life, you never wait half a second for someone to respond. People answer much quicker than that.” Tech moves in cycles, from distributed to centralized and back again. As with mobile phones, the question becomes, “what is the right location for this functionality?” It’s folly to imagine everything belongs in the same place.
Chat as UI (Alistair Croll) — The surface area of the interface is almost untestable. The UI is the log file. Every user interaction is also a survey. Chat is a great interface for the Internet of Things. It remains to be seen how many deep and meaningfuls I want to have with my fridge.

Graph databases are powering mission-critical applications

The O’Reilly Data Show Podcast: Emil Eifrem on popular applications of graph technologies, cloud computing, and company culture.

by Ben Lorica | @bigdata | +Ben Lorica | December 3, 2015

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science.

While most people associate graphs with social media analysis, there are a wide range of applications — including recommendations, fraud detection, I.T. operations, and security — that are routinely framed using graphs. This wide variety of use cases has led to rise to many interesting tools for storing, managing, visualizing, and analyzing massive graphs. The important thing to note is that graph databases are not limited to reporting and analytics, but are also being used to power mission critical applications.

In this episode of the O’Reilly Data Show, I sat down with Emil Eifrem, CEO and co-founder of Neo Technology. We talked about the early days of NoSQL, applications of graph databases, cloud computing, and company culture in the U.S. and Sweden.

Graph and NoSQL databases

The relational database had been an accelerator, and here it’s really slowing us down. What we ended up concluding was that the problem was this mismatch between the shape of the data and the abstractions that were exposed by our infrastructure. At that point, we said, okay, what if we had a database that just exposed these amazing network-oriented data structures or graph-oriented data structures, but other than that, had all the properties of a relational database. Wouldn’t that be great? … Ultimately, we said the famous last words: ‘Hey, let’s just build it ourselves. How hard can it be?’ It turns out it’s 15 years later!

…

2007 is when both the Dynamo paper had been published and the BigTable paper had been published out of Amazon and Google, respectively. That’s when, in early adopter circuits, the discourse started to change … maybe the era of the one-size-fits-all database is over. Maybe our job isn’t to take all of our data and shove it through a relational database. Maybe there are some other tools and technologies and abstractions out there that make better sense for some data. That was in ’07. I really think it was as if lightning struck in the community. … . [Dynamo and BigTable were announced] and the next day, 12 open source projects, implementing it, and then the next day, 24 new ones. It was just crazy back then.

Read more…

Jai Ranganathan on architecting big data applications in the cloud

The O’Reilly Data Show podcast: The Hadoop ecosystem, the recent surge in interest in all things real time, and developments in hardware.

by Ben Lorica | @bigdata | +Ben Lorica | November 19, 2015

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science.

Given the quick pace of innovation in the data ecosystem, we like to take a step back from the details of individual components, architecture, and applications, in order to take a wider view of the landscape of big data. This allows us to evaluate the progress of technology and infrastructure along the way, shifting our attention from the details of individual components like Spark and Kafka, to larger trends.

Some of the larger trends we’ve been exploring include the capabilities of distributed machine learning and the tradeoffs and design decisions involved in cloud architecture and stream processing.

In this episode of the O’Reilly Data Show, I sat down with Jai Ranganathan, senior director of product management at Cloudera. We talked about the trends in the Hadoop ecosystem, cloud computing, the recent surge in interest in all things real time, and hardware trends:

Large-scale machine learning

This sounds a bit like this should already exist in really good form right now, but one of the things that I’m really interested in is expanding the set of capabilities for distributed machine learning. While there are systems out there today that do do this, I think relative to what you can experience from a singular environment learning scikit-learn or R, the set of things you can do in a distributed fashion is limited. … It’s not easy to distribute various algorithms and model-building techniques. I think there is still a lot of work for us to do to improve that experience. … And I do want to have good open source options like MLlib. MLlib may be the right answer. I would be perfectly happy if that’s the final answer, but we do need systems just to provide the kind of depth that you typically are used to in the singular environment. That’s just a matter of time and investment because these are non-trivial problems, but they are things that people are working on.

Read more…

Startups suggest big data is moving to the clouds

A look at the winners from a showcase of some of the most innovative big data startups.

by Alistair Croll | @acroll | +Alistair Croll | May 11, 2015

At Strata + Hadoop World in London last week, we hosted a showcase of some of the most innovative big data startups. Our judges narrowed the field to 10 finalists, from whom they — and attendees — picked three winners and an audience choice.

Underscoring many of these companies was the move from software to services. As industries mature, we see a move from custom consulting to software and, ultimately, to utilities — something Simon Wardley underscored in his Data Driven Business Day talk, and which was reinforced by the announcement of tools like Google’s Bigtable service offering.

This trend was front and center at the showcase:

Winner Modgen, for example, generates recommendations and predictions, offering machine learning as a cloud-based service.
While second-place Brytlyt offers their high-performance database as an on-premise product, their horizontally scaled-out architecture really shines when the infrastructure is elastic and cloud based.
Finally, third-place OpenSensors’ real-time IoT message platform scales to millions of messages a second, letting anyone spin up a network of connected devices.

Ultimately, big data gives clouds something to do. Distributed sensors need a widely available, connected repository into which to report; databases need to grow and shrink with demand; and predictive models can be tuned better when they learn from many data sets. Read more…

Public vs. private cloud: Price isn’t enough

The risk relative to the savings isn’t enough to justify a shift to public cloud.

by Jim Stogdill | @jstogdill | +Jim Stogdill | March 10, 2015

Servers_Paul_Hammond_Flickr

This post was originally published on Limn This. The lightly edited version that follows is republished with permission.

Last October, Simon Wardley and I stood on a rainy sidewalk at 28th St. in New York City arguing politely (he’s British) about the future of cloud adoption. He argued, rightly, that the cost advantages from scale would be overwhelming compared to home-brew private clouds. He went on to argue, less certainly in my view, that this would lead inevitably to their wholesale and deep adoption across the enterprise market.

I think Simon bases his argument on something like the rational economic man theory of the enterprise. Or, more specifically, the rational economic chief financial officer (CFO). If the costs of a service provider are destined to be lower than the costs of internally operated alternatives, and your CFO is rational (most tend to be), then the conclusion is foregone.

And, of course, costs are going down just as they are predicted to. Look at this post by Avi Deitcher: Does Amazon’s Web Services Pricing Follow Moore’s Law? I think the question posed in the title has a fairly obvious answer. No. Services aren’t just silicon; they include all manner of linear terms, like labor, so the price decreases will almost certainly be slower than Moore’s Law, but his analysis of the costs of a modestly-sized AWS solution and in-house competition is really useful.

Not only is AWS’ price dropping fast (56% in three years), but it’s significantly cheaper than building and operating a platform in house. Avi does the math for 600 instances over three years and finds that the cost for AWS would be $1.1 million (I don’t think this number considers out-year price decreases) versus $2.3 million for DIY. Your mileage might vary, but these numbers are a nice starting point for further discussion.

These results raise an interesting question: if the numbers are so compelling, why did Walmart just reveal that they are building a ginormous private cloud? Why would anyone? Read more…

How did we end up with a centralized Internet for the NSA to mine?

The Internet is naturally decentralized, but it's distorted by business considerations.

by Andy Oram | @praxagora | +Andy Oram | January 8, 2014

I’m sure it was a Wired editor, and not the author Steven Levy, who assigned the title “How the NSA Almost Killed the Internet” to yesterday’s fine article about the pressures on large social networking sites. Whoever chose the title, it’s justifiably grandiose because to many people, yes, companies such as Facebook and Google constitute what they know as the Internet. (The article also discusses threats to divide the Internet infrastructure into national segments, which I’ll touch on later.)

So my question today is: How did we get such industry concentration? Why is a network famously based on distributed processing, routing, and peer connections characterized now by a few choke points that the NSA can skim at its leisure?
Read more…

"cloud computing" entries

Tagging People, Maintenance Anti-Pattern, Insourced Brains, and Chat UI

The O’Reilly Data Show podcast: Evan Chan on the early days of Spark+Cassandra, FiloDB, and cloud computing.

Bringing Apache Spark & Apache Cassandra together