ENTRIES TAGGED "streams"

Four short links: 13 March 2014

Parallel Programming, Malignant Computation, Politicised GDS, and Data Stream Toolkit

  1. Is Parallel Programming Hard? And, If So, What Can You Do About It? — book by Paul E. McKenney, on single-machine multi-CPU parallel programming.
  2. Malignant Computation — The bitcoin mining network would work just as well if it had far less computation devoted to it. Bitcoins would be mined at exactly the same rate if 1/2 or 1/4 of the computational resources were devoted. This means that bitcoin has incentivized a tremendous amount of computational busy work.
  3. GDS Becomes Political (Computer Weekly) — She [Opposition MP] said that digital should not be about imposing a way of working on the public sector – Labour is not fond of the “digital by default” mantra – but about supporting public service delivery. [...] “When this government decided upon the digitalisation of this [online job search] service they apparently did not take into account those with poor literacy skills, mental health issues or learning difficulties – who, as most people would have predicted, make up a higher-than-average proportion of the unemployed.”
  4. streamtools (Github) — a graphical toolkit for dealing with streams of data. Streamtools makes it easy to explore, analyse, modify and learn from streams of data. (via OpenNews)
Four short links: 30 October 2013

Offline Javascript, Android Coding, Stats Fails, and Stream Data

  1. Offline.js — Javascript library so web app developers can gracefully deal with users going offline.
  2. Android Guides — lots of info on coding for Android.
  3. Statistics Done Wrong — learn from these failure modes. Not medians or means. Modes.
  4. Streaming, Sketching, and Sufficient Statistics (YouTube) — how to process huge data sets as they stream past your CPU (e.g., those produced by sensors). (via Ben Lorica)

Collecting, Aggregating, and Analyzing Data Exhaust

Next week, O'Reilly's Research Director Roger Magoulas will lead an exciting panel discussion on Big Data. The focus will be on the piles of data that companies have been collecting, and are just beginning to analyze: The internet and social media create a mountain of random, unstructured, and at times ephemeral data by-products, which may appear to be trash. Yet,…

Counting Unique Users in Real-time with Streaming Databases

As the web increasingly becomes real-time, marketers and publishers need analytic tools that can produce real-time reports. As an example, the basic task of calculating the number of unique users is typically done in batch mode (e.g. daily) and in many cases using a random sample from relevant log files. If unique user counts can be accurately computed in real-time, publishers and marketers can mount A/B tests or referral analysis to dynamically adjust their campaigns.
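As a rough illustration of counting unique users over a stream without storing every user ID, here is a minimal sketch of a K-minimum-values estimator. The excerpt does not say which technique the post or any streaming database uses, so the choice of algorithm, the class name `KMVSketch`, and the parameter `k` are all assumptions made for this example.

```python
import hashlib
import heapq


class KMVSketch:
    """Hypothetical K-minimum-values sketch for approximate distinct counts."""

    def __init__(self, k=256):
        self.k = k
        self._heap = []        # negated hashes, so heap[0] is the largest kept hash
        self._members = set()  # hashes currently kept, to skip repeat visitors

    @staticmethod
    def _hash(item):
        # Map an ID to a pseudo-uniform float in [0, 1).
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def add(self, item):
        h = self._hash(item)
        if h in self._members:
            return
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the largest kept one: swap them.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self):
        if len(self._heap) < self.k:
            return len(self._heap)  # exact while fewer than k distinct hashes seen
        kth_smallest = -self._heap[0]
        return int((self.k - 1) / kth_smallest)


# Usage: feed user IDs as they arrive and read the estimate at any time.
sketch = KMVSketch(k=256)
for user_id in ["alice", "bob", "alice", "carol"]:
    sketch.add(user_id)
print(sketch.estimate())
```

Memory stays bounded by `k` regardless of traffic, and the estimate can be read at any point in the stream. The trade-off is accuracy: a larger `k` gives a tighter estimate at the cost of more state.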

Pipelining and Real-time Analytics with MapReduce Online

Some organizations create their own real-time analysis tools, while others turn to specialized solutions. In a previous post, I highlighted SQL-based real-time analytic tools that can handle large amounts of data. I noted that other big data management systems such as MPP databases and MapReduce/Hadoop were too batch-oriented to deliver analysis in near real-time. At least for MapReduce/Hadoop systems things may have changed slightly. A group of researchers from UC Berkeley and Yahoo recently modified MapReduce to allow for pipelining between operators.

Big Data and Real-time Structured Data Analytics

The emergence of sensors as sources of Big Data highlights the need for real-time analytic tools. Popular web apps like Twitter, Facebook, and blogs are also faced with having to analyze (mostly unstructured) data in near real-time. But as Truviso founder and UC Berkeley CS Professor Michael Franklin recently noted, there are mountains of structured data generated by web apps that lend themselves to real-time analysis.


Analytics: Are Streams the New Hits?

The term "online video stream" can mean different things on different sites. This kind of ambiguity hurts everyone involved.
