Multi-File Datasets vs Multi Datasets

Managing data in Oracle Analytics can be done in a variety of ways – So how do you know which is the best way?

Lets looks at the options

Create Dataset from Database Tables
- Create single Table Dataset
- Create a single Datasets using Multiple Tables
Upload Data using DV
- Upload Single File into a Dataset
- Upload Multiple Files in a single datasets
Use Dataflows
- Combine multiple Datasets into a Single Datasets using a Dataflow
Use Mutiple Datasets in a single workbook

Here’s a great video showing the muti table approach

So lets recap the simple file upload methods we can use to analyse Race data:

From our earlier post we saw how to upload a file. You can add columns and change column types during the upload and dataset creation.

There is actually a method to upload more than one file into a single dataset (Option 4 above). this method allows you to define the join for the data, and make both sets of columns available in the analysis. Much in the same way that the video highlighted above.

Thats is fine where you have a join, but our data is the same in each file, just for a different race, so there is no join

Luckily, we can also create a dataset from a dataset, using a Data Flow. This gives us flexibility to separate the process, which enable analysis of the raw information, unchanged on input, and then analyse the transformed data. I find this useful for bug fixing, identifying where the problem sits.

We can also start to bring in more race data when using a 2-step process (Dataset then Dataflow) and importantly use the UNION step.

Adding Race 1 and Race together:

Above we can see that the two datasets can be placed into a union. However, we would not be able to distinguish between race 1 and 2 if we wanted to, so we add a column to each, before the Union:

You then get the added column in the entire new dataset

Note how the little diagram show the input steps “Add Columns”, you may want to make this more obvious what is going on by updating the step name:

The input now shows this:

We can now add columns to the unioned dataset

After we run the Data Flow, we should see a new dataset, with both races, and a additional columns for both races

And now all races can be analysed together, and compared, or we can see better averages over a bigger dataset.

And the the great thing is that we can start to add more races, without having to remember the formulas and positions of the calculated columns:

Upload the Race 3 csv file

Save the Dataset

Add it to the Data Flow

Add the Race Number Column

Add a union step

I choose to union race 2 and 3,

then union those to Race 1

Lets run the Dataflow and see what we get…

Our Third Race data arrived, and our analysis is going to better for it!

And now we start to ask ourselves, how far can this go?

In summary, we can bring lots of files into DV, and combine them using UNION or Joins. Union the similar data, and join the related data

So after 4 races, can we see any patterns emerging, and what about that speed comparision we saw last time?

Wow, its pretty consistent, Upwind boat speed on Port is always faster than on Starboard, and Downwind on Starboard is faster than downwind on Port. Cant see that, try some heatmap

The darker the colour the higher the value, but given that the wind strength varied, the heatmap can be a little hard to read!

Oracle Analytics Technical

Share this: