“Do you want to become a farmer?!” In a sense, yes.

Two years ago an informal group met for drinks in downtown Palo Alto: a mix of grad students, investors, and data science experts in Silicon Valley. In the back and forth of our conversation, we took turns describing planned projects. At the time, prominent VC firms were racing headlong into health care ventures. Much of our group seemed pointed in that direction.

In my turn, I mentioned one word: Agriculture.

That drew laughter: “You want to become a farmer?!”

In a sense, yes.

Impact of data science beyond Silicon Valley

Practices involving large-scale data, machine learning, cluster computing, etc., have toppled entire sectors over the past decade. Retail (Amazon) went first, followed closely by Advertising (Google). Automotive (Tesla) may be next. Clearly, the impact of data science has moved beyond Silicon Valley, with mainstream industries leveraging data that matters: not simply to improve marketing funnels, but to overhaul their supply chains, manufacturing, global deployments, etc. Advances in remote sensing and the “Industrial Internet” accelerate that process, with IoT data rates growing orders of magnitude beyond what social networks have experienced, which in turn compels new technologies.

Sometimes when a group of insiders starts guffawing, there is perhaps a subtle point being missed. Consider that Silicon Valley has spent the past decade extracting billions from e-commerce, ad-tech, social networks, anti-fraud, etc. Extracting is the quintessential word there. I wondered: among the industries outside of Silicon Valley undergoing disruptions due to large-scale data, where did Agriculture fit? Why did it seem laughable to experts as a data science opportunity?

Agriculture provides a livelihood for 40% of the world’s population

On the one hand, it’s true that agriculture is known for resisting change. Farmers have the reputation of being conservative and fiercely independent; sort of an antithesis of tattooed SF high-tech hipsters riding fixie bikes or Google buses. Technology adoption curves in agriculture are measured in decades. On the other hand, farming represents the single largest employer globally. It provides the primary livelihood for 40% of the world’s population. Of the more than half a billion small farms worldwide, most are family-run and rely on rain-fed agriculture. Overall, agriculture represents $15T in annual GDP globally. Plus, consider the dependencies downstream: in Asia, every $1.00 spent on agriculture generates $0.80 in other industries.

Let’s bring this closer to home: the US holds over $2T in agricultural real estate. More than 90% of US cropland experiences high annual rates of soil depletion, some 10–40% faster than the soil could be replenished. The US is unique in the world due to a large, rich, arable heartland distributed throughout the Mississippi River basin – one that notably does not border another nation. That geography allowed for the rise of a large middle class of farmers, who enjoyed low-cost infrastructure for shipping goods to markets. It created a perpetuum mobile for socio-economic vitality and the projection of geopolitical power. Since Andrew Jackson’s troops marched on New Orleans, no external power has breached that system – except for civil war and, say, Hurricane Katrina.

High stakes

However, there are external risks. Let’s consider the core constraints of agricultural production, the three crucial inputs: water, nitrogen (N), and phosphorus (P). Nitrogen sources tend to derive from petroleum (conventional practices) or from fish (organic practices). With wars fought for oil and wild catch populations collapsing, both will be diminished. Phosphorus sources come from mining. Deposits are waning in both quantity and quality — mostly located in countries that maintain, shall we say, political agendas. Expect disruptions in supply.

Next, let’s consider water resources. In California, the most productive agricultural region in the US, aqueduct allocations were shut down in early 2014. Aqueduct supplies depend on a complex cycle of runoff and evaporation, ultimately tracing back to snowpack levels in the mountains. Available research indicates a stark lack of data about both that water cycle and per-farm consumption of water, except for two points. Point one: agriculture consumes 70% of the world’s freshwater in aggregate, and that figure is expected to reach 89% by 2050. Point two: the variance of snowpack levels has been increasing – likely due to climate change. As snowpack variance increases, water infrastructure gets stressed from the mountains to the sea. That limits freshwater supplies and increases the rate of seawater incursion in littoral areas, some of the most valuable farmland. Historically, water conservation had not been much of a concern to farmers in California. They focused on what crop buyers wanted – until resources began to collapse. Literally, the ground collapsed in parts of California’s Central Valley, as overdrawn aquifers led to ground subsidence and huge sinkholes.

Now vegetable crops are being plowed under as farmers sell their water to more capital-intensive operations – e.g., orchards, vineyards, livestock – which cannot afford not to buy water. Expect this trend to continue, as high-margin crops continue to assume more risk, and nutritional crops get displaced to other regions.

Some people point out that technology advances have enabled large increases in agricultural yields throughout the past century. Precision agriculture promises even more improvements. Why worry, if science can compensate for changes in environment? That position overlooks the fact that yield increases have come at the expense of disproportionately higher increases in crucial inputs (the N and P) throughout that same period. Again, those are derived from dwindling resources. That alone should be cause for alarm, but the nuances of data at scale have raised more urgent concerns.

Bold opportunities

Enter the technology giants and how data matters. Clearly, we cannot approach problems such as protecting strategic water supplies without more data. Recent acquisitions by Monsanto Growth Ventures among Silicon Valley tech start-ups (over $1B spent to acquire Climate Corp and part of Solum) signaled a new era of data science intimately involved in agriculture. However, at a recent Ag+Data investor breakfast held in Palo Alto, Silicon Valley VC firms were noticeably absent — except for Khosla Ventures. Strategic funds run by Monsanto, BASF, Dow Chemical, Mitsui, and other industrial firms sent investment teams. The rest of the room was packed with “family office” private investors. Google, Facebook, and Alibaba are likely to become big players in this area too, with strategic moves into remote sensing, mobile apps, and “last mile” technologies for rural regions worldwide.

These technology giants share an agenda: they need data. Remote sensing technologies (small satellites, atmosats, drones, etc.) service a portion of that requirement. Actual farm data (from IoT sensors) must fill in the remainder. John Deere and Dell have teamed up to turn tractors essentially into drones, collecting massive amounts of data on every vehicle pass through a field – not unlike a Google Street View for farms.

Here’s where the story pivots. Farmers tend to share a common view about handing their data over to any third party, let alone a technology giant such as Google or Monsanto: they are terrified. Think about the legal and financial implications of flow meters on farms, leaking data so to speak… local governments could levy special taxes, neighboring farms could litigate based on overdraws, bankers could boost interest rates on loans – not to mention all manner of extractive “magic” that hedge funds could perform using that data.

Most of the emerging precision agriculture technologies rely on cloud computing to aggregate vast amounts of remote sensing data and IoT telemetry. No matter what security measures get taken, there will be breaches – more so at stages where data gets aggregated, if the recent history of e-commerce is any guide. Monsanto’s long-term strategy relies on obtaining and aggregating that data, one way or another. A wide variety of bad actors in Finance will pay top dollar to obtain and aggregate that data, one way or another.

Therein lies an essential tension, especially about yield. In the US much of our agricultural base has been consolidated into corporate farms. While the corporate farms are terrified to release their data, they have an almost single-minded focus on increasing yield. Monsanto and other tech giants insist on obtaining data, with an almost single-minded focus on promising to increase yield. Finance wants the data to place bets about yield. In the best of scenarios a stalemate ensues; more likely outcomes are considerably less positive. That tension stems from the data, and from how we as a society determine its appropriate use.

Meanwhile, Monsanto itself may become well-positioned as a large and uniquely powerful hedge fund in the near future: consider that the firm has begun to pinpoint crop yields, per plot, at the time of harvest. How readily could data products based on that vantage point be played in commodity markets?

Here’s where the story pivots again. Recall that 40% of the world’s population works in agriculture. More than 80% of all agricultural holdings measure less than two hectares: smallholder and family farms. Those account for more than 98% of all farms, and more than 56% of global agricultural production. So much of that is outside the US, and those farmers do not necessarily focus on yield. Instead their family livelihoods depend on personal labor, return on investment, and highly efficient use of resources. Nor does high yield imply success: in a buyer’s market, farmer earnings may collapse.

Where are the Ag+Data start-ups?

Among Ag+Data tech start-ups, many are based outside the US, far outside of Silicon Valley. The majority of those are based in the Southern Hemisphere. Chile in particular has led investments promoting innovation – far removed from the entanglements in the US. While corporate farms predominate in areas of high potential yield, smallholder and family farms tend to be stewards of the marginal lands. Their highly specialized knowledge sustains production as natural resource challenges escalate. In some regions they use water 30–60% more efficiently than corporate farms can sustain. That makes sense given how many large operations deliberately overwater to reduce near-term risks such as soil salinization.

As three poles of Ag+Data (corporate farms, technology giants, finance sector) square off in the US in an extractive race to the bottom, broader questions arise. Will emerging tech start-ups in the Southern Hemisphere lead innovation and best practices for using natural resources more efficiently? Could they in turn take a lead in producing the world’s nutrition? In any case, their vested interests imply conflict with the core strategies of US-based technology giants. Meanwhile, the clock is ticking: crucial inputs are running out. This situation leads to a number of production asymmetries and policy conflicts, largely revolving around applications of data at scale.

Recently we produced a white paper at The Data Guild to explore these issues in greater depth: Agriculture + Data: Outlook 2Q14. Feedback is welcome! If you’d like to continue the discussion with us, contact:

Many thanks to David Gutelius, Chris Diehl, Cameron Turner, Bill Worzel, and Brad Martin for collaboration on the Ag+Data white paper.