Big data comes to the big screen

Using data science to predict the Oscars

By Michael Gold, Farsite

Sophisticated algorithms are not going to write the perfect script or crawl YouTube to find the next Justin Bieber (that last one I think we can all be thankful for!). But a model can predict the probability of a nominee winning the Oscar, and recently our model has Argo overtaking Lincoln as the likely winner of Best Picture. Every day we've been describing applications of data science for the media and entertainment industry, illustrating how our models work, and updating the likely winners based on the outcomes of the Awards Season leading up to the Oscars.

Just as predictive analytics provides valuable decision-making tools in sectors from retail to healthcare to advocacy, data science can also empower smarter decisions for entertainment executives, which led us to launch the Oscar forecasting project. While the potential for data science to impact any organization is as unique as each company itself, we thought we’d offer a few use cases that have wide application for media and entertainment organizations.

Sales projections: Everyone wants to know as early as possible how an intellectual property will perform. Predictive analytics can inform box office projections, TV ratings, song downloads, or ticket sales.

Geospatial analysis: Part of better revenue planning is understanding who audiences are and where they can be found. Geospatial analysis answers tactical questions on a local level, which drives better analysis and projections. This includes identifying the markets and theatres in which a film will perform best, or which cities a band should hit on a tour.

Marketing optimization: From a film’s P&A spend to the allocation of digital spend on an album release, marketing attribution and micro-targeting optimize spending and maximize ROI.

And of course – predicting award winners!

So how do our Oscar forecasting models work? We took decades of movie and Academy Awards data and built regression-based models that isolate the key variables that likely lead to an Oscar win. We combined that historical perspective with real-time data, including betting markets and nominee wins at awards such as the Golden Globes leading up to this year’s Oscars. The combination of rich historical data and real-time information produces models that aim to capture both the long history of Academy voting behavior and the dynamism of nominees generating momentum throughout the awards season.


While we are predicting six awards (Best Picture, Best Director, Best Actor, Best Actress, Best Supporting Actor and Best Supporting Actress), Best Picture garners the most attention. There are a number of key drivers in our model, including a director’s previous nominations and wins, odds in the betting markets, wins in the awards season leading up to the Oscars, and total nominations for the film in the current year. This year the total nominations variable favors Spielberg's Lincoln, which leads the pack with 12 nominations.
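To make the idea of a regression-based model concrete, here is a minimal sketch of one way such a model could be set up. It is not our production model. It assumes a plain logistic regression fitted by gradient descent, a hypothetical set of four features loosely based on the drivers listed above, and training rows that are invented placeholders rather than real awards data. Rescaling each year's field so the probabilities sum to one is also an assumption made for illustration.

```python
import math

# Hypothetical feature order, loosely following the drivers named above:
# [director_prior_nominations, betting_implied_probability,
#  precursor_award_wins, total_nominations / 10]
# Every number below is an invented placeholder, not real awards data.
TRAIN_X = [
    [1, 0.60, 3, 1.0],
    [0, 0.10, 0, 0.5],
    [2, 0.15, 1, 0.7],
    [0, 0.55, 2, 1.1],
    [1, 0.20, 0, 0.6],
    [3, 0.05, 1, 0.8],
]
TRAIN_Y = [1, 0, 0, 1, 0, 0]  # 1 = won, 0 = lost


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def score(w, b, x):
    """Modelled win probability for one nominee, before per-field rescaling."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def log_loss(w, b, X, y):
    """Average negative log-likelihood of the observed outcomes."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(score(w, b, xi), eps), 1 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / len(X)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = score(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def win_probabilities(w, b, nominees):
    """Score each nominee, then rescale so one category's field sums to 1."""
    raw = {name: score(w, b, x) for name, x in nominees.items()}
    total = sum(raw.values())
    return {name: p / total for name, p in raw.items()}


if __name__ == "__main__":
    w, b = fit_logistic(TRAIN_X, TRAIN_Y)
    # A made-up two-film field; real categories have more nominees.
    field = {"Film A": [1, 0.50, 2, 0.7], "Film B": [2, 0.30, 1, 1.2]}
    print(win_probabilities(w, b, field))
```

In practice a statistics package would replace the hand-written fitting loop, and the real-time inputs (betting odds, precursor wins) would be refreshed as the awards season unfolds.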

One of the strongest correlations is between Best Picture and Best Director. Directors are rarely nominated without having their film in the Best Picture category. Since 1970, 83.3% of Best Picture winners have also won Best Director. Yet significant changes to the nomination process require additional analysis. The Best Picture field is now up to 10 films, while the Best Director category still has 5 nominees. Does this mean that a Best Director nominee is more likely to have their film up for Best Picture, and thus the variable should be more important? Or is it less likely that a Best Picture nominee will have its director nominated for Best Director, so that the strength of the variable should be reconsidered?
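The arithmetic behind those two questions can be sketched in a few lines. The function below is a hypothetical helper, and it assumes each film has exactly one director and each director nominee is attached to exactly one film. It computes only the upper bounds that the field sizes allow, not observed rates.

```python
def max_overlap_shares(picture_slots, director_slots):
    """Return the largest possible overlap between the two fields as
    (share of director nominees whose film is a Best Picture nominee,
     share of Best Picture nominees whose director is nominated)."""
    overlap = min(picture_slots, director_slots)
    return overlap / director_slots, overlap / picture_slots


# Ten Best Picture slots against five Best Director slots.
print(max_overlap_shares(10, 5))  # (1.0, 0.5)

# A five-film Best Picture field, as used before the expansion.
print(max_overlap_shares(5, 5))   # (1.0, 1.0)
```

With ten films and five directors, every director nominee can still have a film in the Best Picture race, but at most half of the Best Picture nominees can have a nominated director. That asymmetry is why the direction in which the variable is read matters.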

How a data scientist interprets this analysis significantly informs the outcome of the Best Picture and Best Director models. This question is particularly relevant this year since Ben Affleck and Argo have been racking up wins for Director and Film throughout the Awards Season, even though Affleck is not nominated for the Best Director Oscar. This underscores the importance of the human element of data science. It is crucial for data scientists to understand industry dynamics and build models that are responsive to changes in a fast-moving and competitive landscape.

At the end of the day, data science and predictive analytics are incredible tools that can enhance any executive’s decision-making process. The creative geniuses who have built the media industry will further grow and enhance their sector and advance their craft with the insights offered by better data on consumers, the market, and trends. Data science won’t replace development executives, media buyers, marketing departments, or studio or network executives. But it will make everyone in the media industry smarter and more informed.

Disclaimer: Oscar®, Oscars®, Oscar Night®, ©A.M.P.A.S.® and Academy Award(s)® are trademarked by the Academy of Motion Picture Arts and Sciences.
