
Data Preparation Made Easy With Tableau Prep

Last month, Tableau added a brand-new tool, Tableau Prep, to its data visualization product suite. Tableau Prep is a data preparation tool designed for analysts and business users who want to prepare their own data but might get stuck because they don’t have traditional ETL expertise. The tool lets users combine, shape, and cleanse data for analysis in Tableau.

As you would expect, Tableau Prep is just as intuitive as Tableau Desktop. The user interface provides a visual experience that gives users a deeper understanding of their data. Smart features, simple data prep, and integration with Tableau mean a shorter path to insight. Below I share my thoughts on some of the initial features available in Tableau Prep, along with a few tips.


The User Interface

Tableau Prep has a clean and friendly user interface. The look feels like the final form of Tableau Desktop’s data source screen. Above is a screenshot of a Superstore “flow” in Tableau Prep. There are a few key panes on the screen to be aware of:

  • Connections pane: Shows the databases and files you are connected to. Add connections to one or more databases and then drag the tables you want to work with into the flow pane.
  • Flow pane: As you clean, shape, and combine your data, steps appear in the flow. This visual overview lets you see all of your changes at a glance. You can accomplish a variety of data cleaning tasks in moments, using fuzzy clustering and other smart features.
  • Profile pane: Displays a summary of each field so you can see the shape of your data and begin to identify any issues.
  • Changes pane: Tableau Prep keeps track of the changes you make, in the order you make them, so you can always go back and review or edit those changes.
  • Data grid: Lets the user see row-level detail and verify individual records.

Cleaning Your Data

To create a Flow and begin cleaning our data, we first need to connect to a data source. Tableau Prep supports a wide variety of on-premises and cloud-based data sources. Here, we’ll connect to three text files and create a Flow.

A Flow is the series of cleaning and transforming steps that you perform on the data; it is where you do all of your cleaning, filtering, calculations, and aggregations. We can create calculated fields, split data, and use a variety of smart cleaning features built into Tableau Prep.

In our example, we notice that our discount field was brought in as a text field and has some issues. We can edit values directly in Prep and begin transforming our data by replacing the “OO” values with a zero and changing our data type to a decimal.
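Tableau Prep handles this fix through its visual interface, with no code required. For readers who think in code, the same logic can be sketched in plain Python. This is only an illustration: the function name and the sample values are hypothetical, not taken from the Superstore files.

```python
def clean_discount(raw: str) -> float:
    """Convert a text discount to a decimal, treating the placeholder "OO" as zero."""
    value = raw.strip()
    if value == "OO":
        value = "0"
    return float(value)


# Hypothetical sample values standing in for the text discount field.
raw_discounts = ["0.2", "OO", "0.15", " OO "]
discounts = [clean_discount(v) for v in raw_discounts]
print(discounts)
```

Any value that is neither a number nor “OO” raises a ValueError here, which is a reasonable way to surface further data issues before changing the type.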

We can also profile our data in Prep to spot issues. For example, one of our datasets has a lot of NULL Order IDs. We can select those in Prep and begin to understand these NULLs before we move to the analysis phase. In this case, we can see that all the NULL Order IDs occurred during a few months in early 2013. We can rename these ‘Missing Order ID’ for further analysis later.
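As a rough code analogy for this profiling step (Prep itself does it by clicking on the NULL values in the Profile pane), the sketch below counts NULL Order IDs by month and then relabels them. The rows and dates are made-up sample data, and None stands in for NULL.

```python
from collections import Counter
from datetime import date

# Hypothetical rows: (order_id, order_date); None stands in for NULL.
orders = [
    ("CA-1001", date(2013, 1, 5)),
    (None, date(2013, 1, 17)),
    (None, date(2013, 2, 3)),
    ("CA-1002", date(2013, 6, 9)),
    (None, date(2013, 2, 21)),
]

# Count NULL Order IDs per (year, month) to see where they cluster.
null_months = Counter(
    (d.year, d.month) for order_id, d in orders if order_id is None
)

# Relabel NULLs so they stay visible in later analysis.
labeled = [
    (order_id if order_id is not None else "Missing Order ID", d)
    for order_id, d in orders
]

for (year, month), count in sorted(null_months.items()):
    print(f"{year}-{month:02d}: {count} missing Order ID(s)")
```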


Integrating Data

Once we’ve cleaned our data, we need to put it together. Prep makes it easy to union and join datasets. In the view below, we see the results of our Union and can immediately identify any mismatched fields or level-of-aggregation issues between our datasets.
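To make the idea of mismatched fields concrete, here is a minimal Python sketch of a union over rows stored as dictionaries. It is not how Prep works internally; the helper names, table names, and field names are all hypothetical.

```python
def union(*tables):
    """Stack lists of dict rows; fields missing from a table become None."""
    all_fields = []
    for table in tables:
        for row in table:
            for field in row:
                if field not in all_fields:
                    all_fields.append(field)
    rows = [
        {f: row.get(f) for f in all_fields}
        for table in tables
        for row in table
    ]
    return all_fields, rows


def mismatched_fields(*tables):
    """Return the fields that do not appear in every input table."""
    field_sets = [set().union(*(row.keys() for row in table)) for table in tables]
    common = set.intersection(*field_sets)
    return sorted(set().union(*field_sets) - common)


# Hypothetical inputs: the second file names its sales field differently.
orders_east = [{"Order ID": "E-1", "Sales": 100.0}]
orders_west = [{"Order ID": "W-1", "Revenue": 80.0}]

fields, rows = union(orders_east, orders_west)
mismatched = mismatched_fields(orders_east, orders_west)
print(mismatched)
```

Here “Sales” and “Revenue” are flagged because each exists in only one input, which is the kind of mismatch you would resolve by merging the two fields.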

Sometimes the data you join may be at differing levels of granularity. Prep allows users to easily aggregate and disaggregate data to bring it to the same grain of detail prior to joining.
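The reason grain matters is easiest to see in a small example. In the sketch below, hypothetical line-item rows (several per order) are rolled up to one row per order before being joined to order-level shipping rows. All names and values are invented for illustration.

```python
from collections import defaultdict

# Hypothetical line-item rows: several per order.
line_items = [
    {"Order ID": "A", "Sales": 10.0},
    {"Order ID": "A", "Sales": 5.0},
    {"Order ID": "B", "Sales": 7.5},
]

# Hypothetical order-level rows: exactly one per order.
shipping = [
    {"Order ID": "A", "Ship Mode": "Standard"},
    {"Order ID": "B", "Ship Mode": "Express"},
]

# Aggregate line items to the order grain by summing Sales per Order ID.
totals = defaultdict(float)
for item in line_items:
    totals[item["Order ID"]] += item["Sales"]

# Both sides now have one row per order, so the join does not duplicate rows.
joined = [
    {**ship, "Sales": totals[ship["Order ID"]]}
    for ship in shipping
    if ship["Order ID"] in totals
]
print(joined)
```

Joining without the aggregation step would repeat order “A” once per line item, which is how mismatched grain tends to inflate row counts.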

Users can customize join conditions, select join types, and view the results of the join, all within Prep. Since Prep is iterative, if anything doesn’t look right, it’s easy to undo changes via the Changes pane.
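For readers less familiar with join types, the following sketch shows how an inner join and a left join differ on the same inputs. It covers only those two types and uses hypothetical tables; Prep offers these choices through its join configuration rather than through code.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dict rows on `key`; `how` is "inner" or "left"."""
    if how not in ("inner", "left"):
        raise ValueError(f"unsupported join type: {how}")
    # Index the right-hand rows by key so each lookup is a dict access.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    right_fields = [f for row in right for f in row if f != key]
    result = []
    for row in left:
        matches = index.get(row[key], [])
        for match in matches:
            result.append({**row, **match})
        if not matches and how == "left":
            # A left join keeps unmatched rows and fills right-hand fields with None.
            result.append({**row, **{f: None for f in right_fields}})
    return result


# Hypothetical inputs: order "C" has no matching row in the returns table.
order_rows = [{"Order ID": "A", "Sales": 15.0}, {"Order ID": "C", "Sales": 3.0}]
return_rows = [{"Order ID": "A", "Returned": "Yes"}]

inner = join(order_rows, return_rows, "Order ID", how="inner")
left = join(order_rows, return_rows, "Order ID", how="left")
print(len(inner), len(left))
```

Comparing the row counts of the two results is a quick check on how many records fail to match, similar to what the join results view in Prep shows.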


Loading Data into Tableau

Loading data into Tableau after we’ve prepared it is seamless. We simply add an output step at the end of our Flow and click the “play symbol.”

After running our Flow, we can load the Hyper data extract into Tableau Desktop or publish it to Tableau Server as a Published Data Source. We can also choose to generate a .csv file instead.

Final Thoughts

Tableau Prep gives data professionals a tool to cleanse and shape their data with a similar experience to Tableau Desktop. Also, for certain use cases, Tableau Prep will eliminate the need for complicated and costly ETL tools. To get started using Tableau Prep, download a free trial here.


Want to know more? Email Ian Hagerman at ian.hagerman@infoworks-tn.com