What you'll learn

  • Importing data into R from different file formats

  • Web scraping

  • How to tidy data using the tidyverse to better facilitate analysis

  • String processing with regular expressions (regex)

  • Wrangling data using dplyr

  • How to work with date and time formats, and text mining

Course description

In this course, part of our Professional Certificate Program in Data Science, we cover several standard steps of the data wrangling process, such as importing data into R, tidying data, string processing, HTML parsing, working with dates and times, and text mining. Rarely are all these wrangling steps necessary in a single analysis, but a data scientist will likely face them all at some point.

Data is very rarely easy to access in a data science project. More often it sits in a file or a database, or must be extracted from documents such as web pages, tweets, or PDFs. In these cases, the first step is to import the data into R and tidy it using the tidyverse collection of packages. The process of converting data from its raw form to tidy form is called data wrangling.

This process is a critical step for any data scientist. Knowing how to wrangle and clean data will enable you to uncover insights that would otherwise stay hidden.
