"search" entries

Four short links: 1 February 2016

Curation & Search, Developer Tenure, AI/IA History, and Catapulting Drones

  1. Curation & Search — (Twitter) All curation grows until it requires search. All search grows until it requires curation.—Benedict Evans. (via Lists are the New Search)
  2. Average Developer Tenure (Seattle Times) — The average tenure of a developer in Silicon Valley is nine months at a single company. In Seattle, that length is closer to two years. (via Rands)
  3. An Interview with John Markoff (Robohub) — the interview will give you a flavour of his book, Machines of Loving Grace, a sweet history of AI told through the stories of the people who pioneered and now shape the field.
  4. Catapult Drone Launch (YouTube) — utterly nuts. That’s an SUV off its rear wheels! (via IEEE)

10 Elasticsearch metrics to watch

Track key metrics to keep Elasticsearch running smoothly.

Elasticsearch is booming. Together with Logstash, a tool for collecting and processing logs, and Kibana, a tool for searching and visualizing data in Elasticsearch, it makes up the “ELK” stack, and its adoption continues to grow by leaps and bounds. Elasticsearch also generates a huge number of metrics. Instead of taking on the formidable task of covering all of them in one blog post, I’ll take a look at 10 Elasticsearch metrics to watch. This should be helpful to anyone new to Elasticsearch, and also to experienced users who want a quick start on performance monitoring.

Most of the charts in this piece group metrics either by displaying multiple metrics in one chart, or by organizing them into dashboards. This is done to provide context for each of the metrics we’re exploring.

To start, here’s a dashboard view of the 10 Elasticsearch metrics we’re going to discuss:

10 Elasticsearch metrics in one compact SPM dashboard. This dashboard image, and all images in this post, are from Sematext’s SPM Performance Monitoring tool.

Now, let’s dig into each of the 10 metrics one by one and see how to interpret them.
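
For a sense of where such metrics come from, here is a minimal sketch, assuming a node stats response (`GET /_nodes/stats`) has already been fetched and parsed into a dict. The sample payload and its values are invented, the helper names `summarize_node` and `summarize` are hypothetical, and the three fields shown are illustrative picks, not necessarily among the 10 metrics discussed here.

```python
# Hypothetical sample shaped like a trimmed `GET /_nodes/stats` response.
# The node IDs, names, and values are invented for illustration only.
SAMPLE_STATS = {
    "nodes": {
        "node-a": {
            "name": "es-1",
            "jvm": {"mem": {"heap_used_percent": 62}},
            "indices": {"search": {"query_total": 400, "query_time_in_millis": 1000}},
        },
        "node-b": {
            "name": "es-2",
            "jvm": {"mem": {"heap_used_percent": 81}},
            "indices": {"search": {"query_total": 0, "query_time_in_millis": 0}},
        },
    }
}


def summarize_node(node):
    """Pull a few example metrics out of one node's stats (hypothetical helper)."""
    search = node["indices"]["search"]
    queries = search["query_total"]
    # Average query latency since the node started; None when no queries have run.
    avg_ms = search["query_time_in_millis"] / queries if queries else None
    return {
        "name": node["name"],
        "heap_used_percent": node["jvm"]["mem"]["heap_used_percent"],
        "avg_query_ms": avg_ms,
    }


def summarize(stats):
    """Summarize every node in a parsed node stats response."""
    return [summarize_node(node) for node in stats["nodes"].values()]


if __name__ == "__main__":
    for row in summarize(SAMPLE_STATS):
        print(row)
```

The counters in such a response are cumulative, so a monitoring tool would typically sample them repeatedly and chart the differences between samples rather than the raw totals.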

Four short links: 27 April 2015

Living Figures, Design vs Architecture, Faceted Browsing, and Byzantine Comedy

  1. ‘Living Figures’ Make Their Debut (Nature) — In July last year, neurobiologist Björn Brembs published a paper about how fruit flies walk. Nine months on, his paper looks different: another group has fed its data into the article, altering one of the figures. The update — to figure 4 — marks the debut of what the paper’s London-based publisher, Faculty of 1000 (F1000), is calling a living figure, a concept that it hopes will catch on in other articles. Brembs, at the University of Regensburg in Germany, says that three other groups have so far agreed to add their data, using software he wrote that automatically redraws the figure as new data come in.
  2. Strategies Against Architecture (Seb Chan and Aaron Straup Cope) — the story of the design of the Cooper Hewitt’s clever “pen,” which visitors to the design museum use to collect the info from their favourite exhibits. (Visit the Cooper Hewitt when you’re next in NYC; it’s magnificent.)
  3. Two Way Street — an independent explorer for The British Museum collection, letting you browse by year acquired, year created, type of object, etc. I note there are more things from a place called “Brak” than there are from USA. Facets are awesome. (via Courtney Johnston)
  4. The Saddest Moment (PDF) — “How can you make a reliable computer service?” the presenter will ask in an innocent voice before continuing, “It may be difficult if you can’t trust anything and the entire concept of happiness is a lie designed by unseen overlords of endless deceptive power.” The presenter never explicitly says that last part, but everybody understands what’s happening. Making distributed systems reliable is inherently impossible; we cling to Byzantine fault tolerance like Charlton Heston clings to his guns, hoping that a series of complex software protocols will somehow protect us from the oncoming storm of furious apes who have somehow learned how to wear pants and maliciously tamper with our network packets. Hilarious. (via Tracy Chou)
Four short links: 5 May 2014

After Search, Instrumenting Pompeii, Replaceable Work, and The Coding Adventure

  1. This is What Comes After Search (Quartz) — it’s “context”, aka knowing what you’re doing and thinking to the point where the device can tell you what you need to know before you search for it. Also known as the apotheosis of passive consumption.
  2. Wiretapping the Ruins of Pompeii — Pompeii is on its way to being one of the most instrumented cities in the world, a mere two thousand years since it was last inhabited. (via Pete Warden)
  3. Technology is Taking Over English Departments — banausic—the kind of labor that can be outsourced to non-specialists. (via Courtney Johnston)
  4. phabricator — Open software engineering platform and fun adventure game. TAKE AWESOME.
Four short links: 10 September 2013

Constant KV Store, Google Me, Learned Bias, and DRM-Stripping Lego Robot

  1. Sparkey — Spotify’s open-sourced simple constant key/value storage library, for read-heavy systems with infrequent large bulk inserts.
  2. The Truth of Fact, The Truth of Feeling (Ted Chiang) — story about what happens when lifelogs become searchable. Now with Remem, finding the exact moment has become easy, and lifelogs that previously lay all but ignored are now being scrutinized as if they were crime scenes, thickly strewn with evidence for use in domestic squabbles. (via BoingBoing)
  3. Algorithms Magnifying Misbehaviour (The Guardian) — when the training set embodies biases, the machine will exhibit biases too.
  4. Lego Robot That Strips DRM Off Ebooks (BoingBoing) — so. damn. cool. If it had been controlled by a C64, Cory would have hit every one of my geek erogenous zones with this find.
Four short links: 4 July 2013

Model-Driven Configuration, 1,000 RSS Readers Bloom, JSON Query Language, and Doug Engelbart's Vision

  1. ansible — Model-driven configuration management, multi-node deployment/orchestration, and remote task execution system. Uses SSH by default, so no special software has to be installed on the nodes you manage. Ansible can be extended in any language.
  2. The Golden Age of RSS — One of the things I expected least to see in 2013 was that this year would mark the greatest flourishing of RSS reader applications in the decade since it first came to prominence on the web.
  3. JSONiq: the JSON Query Language — expressive and highly optimizable language to query and update NoSQL stores. It enables developers to leverage the same productive high-level language across a variety of NoSQL products. Implemented in Zorba, an Apache-licensed virtual machine for JSONiq and XQuery queries.
  4. Bret Victor on Doug Engelbart — If you attempt to make sense of Engelbart’s design by drawing correspondences to our present-day systems, you will miss the point, because our present-day systems do not embody Engelbart’s intent. Engelbart hated our present-day systems. Poetic, articulate, and bang on the money.
Four short links: 2 July 2013

Microvideos for Microhelp, Organic Search, Probabilistic Programming, and Cluster Management

  1. How to Make Help Microvideos For Your Site (Adrian Holovaty) — Instead of one monolithic video, we decided to make dozens of tiny, five-second videos separately demonstrating features.
  2. How Google is Killing Organic Search — 13% of the real estate is organic results in a search for “auto mechanic”, 7% for “italian restaurant”, 0% if searching on an iPhone where organic results are four page scrolls away. SEO Book did an extensive analysis of just how important the top left of the page, previously occupied by organic results, actually is to visitors. That portion of the page is now all Google. (via Alex Dong)
  3. Church — probabilistic programming language from MIT, with tutorials. (via Edd Dumbill)
  4. mesos — a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, MPI, Hypertable, Spark (a new framework for low-latency interactive and iterative jobs), and other applications. Mesos is open source in the Apache Incubator. (via Ben Lorica)
Four short links: 19 June 2013

Thread Problems, Better Image Search, Open Standards, and GitHub Maps

  1. Multithreading is Hard — The compiler and the processor both conspire to defeat your threads by moving your code around! Be warned and wary! You will have to do battle with both. Sample code and explanation of WTF the eieio barrier is (hint: nothing to do with Old McDonald’s server farm). (via Erik Michaels-Ober)
  2. Improving Photo Search (Google Research) — volume of training images, number of CPU cores, and Freebase entities. (via Alex Dong)
  3. Is Google Dumping Open Standards for Open Wallets? (Matt Asay) — it’s easier to ship than standardise, to innovate than integrate, but the ux of a citizen in the real world is pants. Like blog posts? Log into Facebook to read your friends! (or Google+) Chat is great, but you’d better have one client per corporation your friends hang out on. Nobody woke up this morning asking for features to make web pages only work on one browser. The user experience of isolationism is ugly.
  4. GitHub Renders GeoJSON — Under the hood we use Leaflet.js to render the geoJSON data, and overlay it on a custom version of MapBox’s street view baselayer — simplified so that your data can really shine. Best of all, the base map uses OpenStreetMap data, so if you find an area to improve, edit away.
Four short links: 27 May 2013

Search API, Cyberwar=Cyberbollocks, 4k Magic, and Geoparsing

  1. techu Search Server — Techu exposes a RESTful API for realtime indexing and searching with the Sphinx full-text search engine. We leverage Redis, Nginx and the Python Django framework to make searching easy to handle & flexible.
  2. In Defence of Digital Freedom — a member of the European Parliament’s piece on the risks to our online freedoms caused by framing computer security into cyberwarfare. Digital freedoms and fundamental rights need to be enforced, and not eroded in the face of vulnerabilities, attacks, and repression. In order to do so, essential and difficult questions on the implementation of the rule of law, historically place-bound by jurisdiction rooted in the nation-state, in the context of a globally connected world, need to be addressed. This is a matter for the EU as a global player, and should involve all of society. (via BoingBoing)
  3. Inside a 4k Demo — what it’s like to write an amazing demo with only 4k of code. (via Nelson Minar)
  4. CLAVIN — open source (Apache2) Java library for document geotagging and geoparsing that employs context-based geographic entity resolution. (via Pete Warden)
Four short links: 15 March 2013

Search Ads Meh, Hacked Website Help, Web Design Sins, and Lazy Correlations

  1. Consumer Heterogeneity and Paid Search Effectiveness: A Large Scale Field Experiment (PDF) — We find that new and infrequent users are positively influenced by ads but that existing loyal users whose purchasing behavior is not influenced by paid search account for most of the advertising expenses, resulting in average returns that are negative. We discuss substitution to other channels and implications for advertising decisions in large firms. eBay-commissioned research, so salt to taste. (via Guardian)
  2. Google’s Help for Hacked Webmasters — what it says.
  3. 14 Lousy Web Design Trends Making a Comeback Thanks to HTML 5 — “mystery meat icons” a pet bugbear of mine.
  4. The Human Microbiome 101 (SlideShare) — SciFoo alum Jonathan Eisen’s talk. Informative, but super-notable for “complexity is astonishing, massive risk for false positive associations”. Remember this the next time your Big Data Scientist (aka kid with R) tells you one surprising variable predicts 66% of anything. I wish I had the audio from this talk!