ENTRIES TAGGED "graphs"

There are many use cases for graph databases and analytics

Business users are becoming more comfortable with graph analytics.

The rise of sensors and connected devices will lead to applications that draw from network/graph data management and analytics. As the number of devices surpasses the number of people (Cisco estimates 50 billion connected devices by 2020), one can imagine applications that depend on data stored in graphs with many more nodes and edges than the ones currently maintained by social media companies.

This means that researchers and companies will need to produce real-time tools and techniques that scale to much larger graphs (measured in terms of nodes and edges). I previously listed tools for tapping into graph data, and I continue to track improvements in accessibility, scalability, and performance. For example, at the just-concluded Spark Summit, it was apparent that GraphX remains a high-priority project within the Spark ecosystem.

Read more…

Network Science dashboards

Network graphs can be used as primary visual objects, with conventional charts used to supply detailed views

With Network Science well on its way to being an established academic discipline, we’re beginning to see tools that leverage it. Applications that draw on this discipline rely heavily on visual representations and come with interfaces aimed at business users. For business analysts used to consuming bar and line charts, network visualizations take some getting used to. But with enough practice, and for the right set of problems, they are an effective visualization model.

In many domains, network graphs can be the primary visual objects, with conventional charts used to supply detailed views. I recently got a preview of some dashboards built using Financial Network Analytics (FNA). Read more…

Big Data solutions through the combination of tools

Applications get easier to build as packaged combinations of open source tools become available

As a user who tends to mix and match many different tools, not having to deal with configuring and assembling a suite of tools is a big win. So I’m really liking the recent trend towards more integrated and packaged solutions. A recent example is the relaunch of Cloudera’s Enterprise Data Hub to include Spark and Spark Streaming. Users benefit by gaining automatic access to the analytic engines that come with Spark. Besides simplifying things for data scientists and data engineers, easy access to analytic engines is critical for streamlining the creation of big data applications.

Another recent example is Dendrite, an interesting new graph analysis solution from Lab41. It combines Titan (a distributed graph database), GraphLab (for graph analytics), and a front end that leverages AngularJS into a graph exploration and analysis tool for business analysts.

Read more…

Four short links: 5 February 2014

Graph Drawing, DARPA Open Source, Quantified Vehicle, and IoT Growth

  1. sigma.js — JavaScript graph-drawing library (node-edge graphs, not charts).
  2. DARPA Open Catalog — all the open source published by DARPA. Sweet!
  3. Quantified Vehicle Meetup — Boston meetup around intelligent automotive tech including on-board diagnostics, protocols, APIs, analytics, telematics, apps, software and devices.
  4. AT&T See Future In Industrial Internet — partnering with GE, M2M-related customers increased by more than 38% last year. (via Jim Stogdill)

Semi-automatic method for grading a million homework assignments

Organize solutions into clusters and “force multiply” feedback provided by instructors

One of the hardest things about teaching a large class is grading exams and homework assignments. In my teaching days a “large class” was only in the few hundreds (still a challenge for the TAs and instructor). But in the age of MOOCs, classes with a few (hundred) thousand students aren’t unusual.

Researchers at Stanford recently combed through over one million homework submissions from a large MOOC class offered in 2011. Students in the machine-learning course submitted programming code for assignments that consisted of several small programs (the typical submission was about 16 lines of code). While over 120,000 students enrolled, only about 10,000 completed all homework assignments (about 25,000 submitted at least one assignment).

The researchers were interested in figuring out ways to ease the burden of grading the large volume of homework submissions. The premise was that by sufficiently organizing the “space of possible solutions”, instructors could provide feedback on a few submissions, and that feedback could then be propagated to the rest.
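To make the premise concrete, here is a minimal sketch of the "cluster, then propagate feedback" idea. It is not the Stanford researchers' method, which this excerpt does not describe: the token-set Jaccard similarity, the greedy clustering, the 0.6 threshold, and the sample submissions are all assumptions chosen for illustration.

```python
import re

def tokens(code):
    """Split source text into a set of identifier, number, and symbol tokens."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code))

def jaccard(a, b):
    """Jaccard similarity between two token sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def cluster(submissions, threshold=0.6):
    """Greedy clustering: a submission joins the first cluster whose
    representative (its first member) is at least `threshold` similar.
    The threshold value is an illustrative assumption."""
    clusters = []  # each cluster is a list of submission ids
    for sid, code in submissions.items():
        for members in clusters:
            representative = submissions[members[0]]
            if jaccard(tokens(code), tokens(representative)) >= threshold:
                members.append(sid)
                break
        else:
            clusters.append([sid])
    return clusters

def propagate(clusters, feedback):
    """Copy feedback written for any one cluster member to every member."""
    result = {}
    for members in clusters:
        note = next((feedback[m] for m in members if m in feedback), None)
        for m in members:
            result[m] = note
    return result

# Hypothetical submissions, keyed by a made-up student id.
submissions = {
    "s1": "theta = theta - alpha * grad",
    "s2": "theta = theta - alpha * gradient",
    "s3": "for i in range(n): total += x[i] * y[i]",
}
clusters = cluster(submissions)
graded = propagate(clusters, {"s1": "Correct update rule."})
```

Here the instructor comments only on `s1`, and the near-identical `s2` inherits the same note, while `s3` lands in its own cluster and still needs a human look. A real system would likely compare program structure or behavior rather than raw tokens.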

Read more…

Four short links: 26 June 2013

Neural Memory Allocation, DoD Synthbio, Sierra Leone Makers, and Complex Humanities Networks

  1. Memory Allocation in Brains (PDF) — The results reviewed here suggest that there are competitive mechanisms that affect memory allocation. For example, new dentate gyrus neurons, amygdala cells with higher excitability, and synapses near previously potentiated synapses seem to have the competitive edge over other cells and synapses and thus affect memory allocation with time scales of weeks, hours, and minutes. Are all memory allocation mechanisms competitive, or are there mechanisms of memory allocation that do not involve competition? Even though it is difficult to resolve this question at the current time, it is important to note that most mechanisms of memory allocation in computers do not involve competition. Does the dissector use a slab allocator? Tip your waiter, try the veal.
  2. Living Foundries (DARPA) — one motivating, widespread and currently intractable problem is that of corrosion/materials degradation. The DoD must operate in all environments, including some of the most corrosively aggressive on Earth, and do so with increasingly complex heterogeneous materials systems. This multifaceted and ubiquitous problem costs the DoD approximately $23 Billion per year. The ability to truly program and engineer biology, would enable the capability to design and engineer systems to rapidly and dynamically prevent, seek out, identify and repair corrosion/materials degradation. (via Motley Fool)
  3. Innovate Salone — finalists from a Sierra Leone maker/innovation contest. Part of David Sengeh’s excellent work.
  4. Arts, Humanities, and Complex Networks — ebook series, conferences, talks, on network analysis in the humanities. Everything from Protestant letter networks in the reign of Mary, to the repertory of 16th century polyphony, to a data-driven update to Alfred Barr’s diagram of cubism and abstract art (original here).

Maps not lists: network graphs for data exploration

Preview of upcoming Strata session on data exploration

Amy Heineike is Director of Mathematics for Quid Inc, where she has been since its inception, prototyping and launching the company’s technology for analyzing document sets. Below is the teaser for her upcoming talk at Strata Santa Clara.

I recently discovered that my favorite map is online. It used to hang on my housemate’s wall in our little house in London back in 2005. At the time I was working to understand how London was evolving and changing, and how different policy or infrastructure changes (a new tube line, land use policy changes) would impact that.

The map was originally published as a center-page pull-out from the Guardian, showing the ethnic groups that dominate different neighborhoods across the city. The legend was as long as the image, and the small print labels necessitated standing up close, peering and reading, tracing your finger to discover the Congolese on the West Green Road, our neighbors the Portuguese on the Stockwell Road, or the Tamils in Chessington in the distant south west.

Read more…

Four short links: 4 July 2012

Inside Anonymous, Kanban Board, Extending Objective C, and Football Graphs

  1. How Anonymous Works (Wired) — Quinn Norton explains how the decentralized Anonymous operates, and how the transition to political activism happened. Required reading to understand post-state post-structure organisations, and to make sense of this chaotic unpredictable entity.
  2. Kanban For 1 — very nice progress board for tasks, for the lifehackers who want to apply agile software tools to the rest of their life.
  3. libextobj (GitHub) — library of extensions to Objective C to support patterns from other languages. (via Ian Kallen)
  4. Graph Theory to Understand Football (Tech Review) — players are nodes, passes build edges, and you can see strengths and strategies of teams in the resulting graphs.
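
The "players are nodes, passes build edges" idea in the last link can be sketched in a few lines. This is only an illustration, not the analysis from the linked article: the pass log, the player labels, and the choice of weighted degree as an involvement measure are assumptions.

```python
from collections import Counter, defaultdict

def build_pass_graph(passes):
    """Build a weighted directed graph {passer: {receiver: count}}
    from a sequence of (passer, receiver) pairs."""
    graph = defaultdict(Counter)
    for passer, receiver in passes:
        graph[passer][receiver] += 1
    return graph

def involvement(graph):
    """Passes made plus passes received for each player (weighted degree)."""
    totals = Counter()
    for passer, receivers in graph.items():
        for receiver, count in receivers.items():
            totals[passer] += count
            totals[receiver] += count
    return totals

# Hypothetical pass log; the player labels are made up.
passes = [("A", "B"), ("B", "C"), ("A", "B"), ("C", "A"), ("B", "A")]
graph = build_pass_graph(passes)
totals = involvement(graph)
```

Edge weights such as `graph["A"]["B"]` show which passing combinations a team leans on, and `totals` gives a rough picture of which players the play runs through.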
Four short links: 15 June 2012

On Anonymous, Graph Database, Leap Second, and Debugging Creativity

  1. In Flawed, Epic Anonymous Book, the Abyss Gazes Back (Wired) — Quinn Norton’s review of a book about Anonymous is an excellent introduction to Anonymous. Anonymous made us, its mediafags, masters of hedging language. The bombastic claims and hyperbolic declarations must be reported from their mouths, not from our publications. And yet still we make mistakes and publish lies and assumptions that slip through. There is some of this in all of journalism, but in a world where nothing is true and everything is permitted, it’s a constant existential slog. It’s why there’s not many of us on this beat.
  2. Titan (GitHub) — Apache2-licensed distributed graph database optimized for storing and processing large-scale graphs within a multi-machine cluster. Cassandra and HBase backends, implements the Blueprints graph API. (via Hacker News)
  3. Extra Second This June — we’re getting a leap second this year: there’ll be 2012 June 30, 23h 59m 60s. Calendars are fun.
  4. On Creativity (Beta Knowledge) — I wanted to create a game where even the developers couldn’t see what was coming. Of course I wasn’t thinking about debugging at this point. The people who did the debugging asked me what was a bug. I could not answer that. — Keita Takahashi, game designer (Katamari Damacy, Noby Noby Boy). Awesome quote.
Four short links: 2 March 2012

Robotics for Kids, Benchmarking Context Needed, JavaScript Time Series Graphs, and Amazing Programming Video

  1. Interview: Hanno Sander on Robotics (Circuit Cellar) — this is what Mindstorms wants to be when it grows up. AAA++ for teaching kids. Hanno is a Kiwi Foo Camper.
  2. Context Needed: Benchmarks. Benchmarks fall into a few common traps because of under-reporting in context and lack of detail in results. The typical benchmark report doesn’t reveal the benchmark’s goal, full details of the hardware and software used, how the results were edited if at all, how to reproduce the results, detailed reporting on the system’s performance during the test, and an interpretation and explanation of the results. (via Jesse Robbins)
  3. Morris.js (GitHub) — a lightweight library that uses jQuery and Raphaël to make drawing time-series graphs easy.
  4. Bret Victor: Inventing on Principle (Vimeo) — the first 20m has amazing demos of a coding environment with realtime feedback. Must see this! (via Sacha Judd)