Antigenic Cartography: Fighting the Flu with Maps

This guest post was written by Terry Jones, former antigenic cartographer and founder of fluidinfo.

At the O’Reilly ETech conference in March I gave a talk describing Antigenic Cartography, a new method being used to visualize virus evolution and to aid in the design of vaccines, in the context of influenza. My slides from the talk are online.

Having “the flu” is something of a generic complaint. Other common and less debilitating diseases are often mistaken for influenza. As a result, people tend to regard flu in casual terms, and to take it less seriously than they should, at least until they actually get the real thing! Flu hits hard and fast and is often followed by opportunistic infections such as pneumonia. The result is often a week in bed, followed by an extremely weakened state.

Flu kills about half a million people a year. It also periodically enters the human population from another species, such as birds, causing worldwide pandemics that kill millions. There were three pandemics in the 20th century: in 1918/19, 1957 and 1968. The 1918 pandemic is estimated to have killed up to 100 million people. The scale of suffering and the societal and economic impact of such a pandemic are probably beyond what we can imagine. I suggest John Barry’s book “The Great Influenza” as a good starting point for learning more about the events of 1918/19.

Because the virus mutates constantly, getting the flu does not provide long-term immunity to the disease. For this reason, the flu vaccine must be regularly updated to remain effective. The vaccine contains carefully selected and weakened elements of the flu virus. If the virus strains used to make the vaccine are well chosen, they provoke your immune system into running a full dress-rehearsal that prepares your body in case you later run into the real virus.

Choosing the components of the vaccine is extraordinarily difficult. Under the leadership of the WHO Global Influenza Programme, a handful of experts get together twice a year to make the decision that will find its way into the arms of about 300 million fellow humans. Antigenic Cartography is now being used to help in this process.

[Figure: antigenic map]

It’s quite simple to explain conceptually how an antigenic map is made. If you look in the back of any road atlas, you’ll find the familiar driving distance table giving inter-city distances. It’s easy to make such a table. You jump in a car, drive between cities, and let the odometer record distances. Or, more carbon friendly, just measure the distances directly from a map. More interesting and challenging is to try the reverse process: given a table of distances, reconstruct the original map. It turns out this can be done with high accuracy, even in the face of data which is ambiguous, inconsistent, noisy, and partially missing. We use a mathematical optimization technique known as multi-dimensional scaling to do this. Applying it to virus data from national and international labs all over the world, we produce maps that allow us to visualize viral evolution.
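The "reverse process" can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: it uses classical (closed-form) multi-dimensional scaling on a complete, noise-free distance table, which is not necessarily the optimization used to build antigenic maps and does not handle missing or inconsistent measurements. The city coordinates and the function names `classical_mds` and `distance_table` are made up for the example, and NumPy is assumed to be available.

```python
import numpy as np

def classical_mds(distances, dims=2):
    """Recover coordinates (up to rotation, reflection, and shift) from a distance table."""
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    # Double-center the squared distances to obtain an inner-product (Gram) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # The top eigenvectors, scaled by the square root of their eigenvalues,
    # give the coordinates.
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dims]
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

def distance_table(points):
    """Build the pairwise Euclidean distance table for a set of points."""
    p = np.asarray(points, dtype=float)
    return np.linalg.norm(p[:, None, :] - p[None, :, :], axis=-1)

# Hypothetical "cities" on a flat map (made-up coordinates).
cities = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0), (1.0, 2.0)]
table = distance_table(cities)
recovered = classical_mds(table)

# The recovered map may be rotated or mirrored, but its distances match the table.
max_error = float(np.abs(distance_table(recovered) - table).max())
```

Only the distances are used, so the recovered map has no fixed orientation: north need not point up. With noisy or partially missing data there is no exact solution, and an iterative method that minimizes the mismatch between map distances and table distances would be the more natural choice.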

The maps show the virus jumping through an abstract “antigenic space”. This allows us to suggest vaccine components by predicting the timing and direction of these jumps. The figure shows an antigenic map of 35 years of the evolution of influenza H3N2, the virus subtype that currently affects humans most severely. As you can see, the map shows definite directionality over time (the two digits in each cluster label give the year). A child could make a pretty good guess as to where and when the next cluster will appear. To suggest a candidate virus strain for a new vaccine, we choose one at or near the center of the current cluster to confer wide general coverage. While vaccine strain selection is far from child’s play, Antigenic Cartography does give a very clear picture of viral evolution and allows us to automatically identify potential vaccine strains.
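As a rough illustration of choosing a strain "at or near the center of the current cluster", the sketch below picks the strain closest to the cluster's centroid. This is an assumption made for the example: the post does not say how the center is defined, and real strain selection weighs much more than map position. The strain names, coordinates, and the function name `most_central_strain` are all hypothetical.

```python
import math

def most_central_strain(cluster):
    """Return the name of the strain closest to the cluster's centroid.

    `cluster` maps a strain name to its (x, y) position on the antigenic map.
    """
    n = len(cluster)
    cx = sum(x for x, _ in cluster.values()) / n
    cy = sum(y for _, y in cluster.values()) / n
    return min(
        cluster,
        key=lambda name: math.hypot(cluster[name][0] - cx, cluster[name][1] - cy),
    )

# Made-up strain names and map coordinates, for illustration only.
current_cluster = {
    "strain_a": (10.0, 2.0),
    "strain_b": (11.0, 3.0),
    "strain_c": (10.5, 2.4),
    "strain_d": (10.2, 3.1),
}
candidate = most_central_strain(current_cluster)
```

Here `candidate` is `"strain_c"`, the point nearest the average position of the four strains.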

Best of all, because the technique is not specific to influenza, there is reason to hope for similar success when it is applied to data from other diseases. The antigenic maps also shed light on other aspects of virology, but those will have to wait for a future Radar posting.

You can learn more about Antigenic Cartography in the original Science paper.

  • Falafulu Fisi

    MDS (multi-dimensional scaling) belongs to a class of linear-algebra algorithms known as dimensionality reduction. Numerous dimensionality reduction algorithms have been published in the literature; a familiar one is LSI (latent semantic indexing), which is popular in today's text search engines. LSI is computed via the dimensionality reduction algorithm called SVD (singular value decomposition). SVD has also been applied in automated online recommendation engines, as described in this paper:

    Incremental Singular Value Decomposition Algorithms for Highly Scalable Recommender Systems

    I use a kernel version of MDS (kernel multi-dimensional scaling) for various data-analysis development work. It is similar to the algorithm described in the abstract of the following paper:

    A Kernel Approach to Metric Multidimensional Scaling

    MDS is a linear algorithm, while kernel MDS is a non-linear algorithm that is more robust (lower error) than its linear counterpart. MDS is an old algorithm (at least 30 years old), but new variants have kept appearing in the literature over the years, with better error rates than previously published ones.

    On the map shown above, the clusters BK79, SI87 and BE89 are too close together and may overlap, which makes classification a bit hard. If the same data were run through a non-linear kernel MDS, the clusters would appear distinct, i.e., separated from each other with very little overlap, which would make classification a bit easier. Dimensionality reduction algorithms are usually applied to the data as a preprocessing step; the output is then fed into a classification algorithm, such as a neural network or a support vector machine, which learns from the extracted features.