Learn how to use Microsoft R Server to analyze large datasets using R, one of the most powerful programming languages.
Understand what you need to succeed in this course and determine if it's the right fit for your learning goals.
What you need before starting this Analyzing Big Data with Microsoft R course:
Prior R Experience Required
This course assumes that participants are already familiar with R. It is not designed for absolute beginners.
Everything you need to know about this online course, from duration to certification
This course is part of the Microsoft Professional Program Certificate in Data Science and the Microsoft Professional Program Certificate in Big Data.
The open-source programming language R has long been popular, particularly in academia, for data processing and statistical analysis. Among R's strengths are that it is a succinct programming language and has an extensive repository of third-party libraries for performing all kinds of analyses. Together, these two features make it possible for a data scientist to go very quickly from raw data to summaries, charts, and even full-blown reports.
However, one deficiency of R is that it traditionally uses a lot of memory, both because it needs to load a copy of the data in its entirety as a data.frame object, and because processing the data often involves making further copies (a behavior sometimes referred to as copy-on-modify). This is one of the reasons R has been received more reluctantly by industry than by academia.
The main component of Microsoft R Server (MRS) is the RevoScaleR package, an R library that offers a set of functions for processing large datasets without having to load them into memory all at once. RevoScaleR offers a rich set of distributed statistical and machine learning algorithms, and more are added over time. Finally, RevoScaleR offers a mechanism for taking code developed on a laptop and deploying it on a remote server such as SQL Server or Spark (where the infrastructure is very different under the hood) with minimal effort.
In this course, we will show you how to use MRS to run an analysis on a large dataset and provide some examples of how to deploy it on a Spark cluster or a SQL Server database. Upon completion, you will know how to use R for big-data problems.
Since RevoScaleR is an R package, we assume that course participants are familiar with R.
A solid understanding of R data structures (vectors, matrices, lists, data frames, environments) is required. Familiarity with third-party packages such as dplyr is also helpful.
edX offers financial assistance for learners who want to earn Verified Certificates but who may not be able to pay the fee. To apply for financial assistance, enroll in the course, then complete an application for assistance.
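The course description contrasts loading a whole dataset into memory with processing it in pieces. The sketch below illustrates that general idea only: it is written in Python rather than R, it does not use or imitate the RevoScaleR API, and the function names (chunked_rows, streaming_mean), the column names, and the sample data are all hypothetical. It computes a column mean while holding at most one small chunk of rows in memory at a time.

```python
import csv
import io
from itertools import islice


def chunked_rows(handle, chunk_size):
    """Yield lists of at most chunk_size parsed CSV rows (hypothetical helper)."""
    reader = csv.DictReader(handle)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


def streaming_mean(handle, column, chunk_size=2):
    """Compute the mean of one column, keeping only one chunk in memory."""
    count = 0
    total = 0.0
    for chunk in chunked_rows(handle, chunk_size):
        values = [float(row[column]) for row in chunk]
        # Only the running count and sum survive between chunks.
        count += len(values)
        total += sum(values)
    return total / count if count else float("nan")


# Tiny in-memory stand-in for a file that would be too large to load at once.
sample = io.StringIO("fare,tip\n10.0,2.0\n20.0,3.0\n30.0,4.0\n40.0,7.0\n50.0,9.0\n")
mean_fare = streaming_mean(sample, "fare", chunk_size=2)
print(mean_fare)
```

Because only running totals are carried from one chunk to the next, memory use depends on the chunk size rather than on the size of the dataset. The same pattern extends to any statistic that can be updated incrementally.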
Difficulty Level
Intermediate
Some foundational knowledge required
Subject Category
Data Analysis & Statistics
Part of our Data Analysis & Statistics curriculum
Course Language
English
All materials in English
Professional Course
Investment around $99
Pricing may vary. Check the course provider for current promotions and exact pricing.