Getting started with data science in the cloud

Learn how to manipulate data and how to construct and evaluate models in Azure ML, using a complete data science example.

Large-scale machine learning, or predictive analytics, is having a powerful impact across many industries. By using machine learning, companies, governments, and not-for-profits are replacing guesses and seat-of-the-pants estimates with valuable data-driven predictions.

Deriving value from machine learning, however, is often impeded by complex technology deployments and long model-development cycles. Fortunately, machine learning and data science are undergoing democratization. Workflow environments make tools for building and evaluating sophisticated machine learning models accessible to a wider range of users. Cloud-based environments provide secure, ubiquitous access to data storage and powerful data science tools.

To get you started creating and evaluating your own machine learning models, O’Reilly has commissioned a new report: “Data Science in the Cloud, with Azure Machine Learning and R.” We use an in-depth data science example, predicting bicycle rental demand, to show you how to perform basic data science tasks, including data management, data transformation, machine learning, and model evaluation in the Microsoft Azure Machine Learning cloud environment. With a free-tier Azure ML account, the example R scripts, and the data provided, you can work through this practical example hands-on.

Specifically, this report shows you how to complete the following tasks using Azure ML and R:

  • Manage and transform data, using a highly scalable cloud environment
  • Build and evaluate machine learning models
  • Produce R graphics
  • Publish your models as web services

The dataset and R code used throughout the report are available on GitHub. Free-tier Azure ML accounts are now available with a Microsoft ID.

Download the free report.

This post is part of a collaboration between O’Reilly and Microsoft. See our statement of editorial independence.

  • Seems like the download link doesn’t work – I get a:
    Bad Request

    Your browser sent a request that this server could not understand.
    Size of a request header field exceeds server limit.



  • hxy0135

    Hi Stephen,

    I am reading through the ebook downloaded from O’Reilly Media. I found that I cannot reproduce Figure 8 and Figure 9 – Time series plot of bike demand for the 0700 hr. Is the code in the book exactly the same as the code you used to produce Figure 8?