Creating a Dataset

Overview

Business Insights for Automation Service lets you expose your Bizagi processes' business information and save it in Automation Service, so you can use it for specific purposes such as creating reports with third-party tools or running AI analysis.

You can configure how and when the information is extracted from processes in Bizagi. This ensures that data preparation (cleaning and selecting data) is complete and that your Datasets contain high-quality data that is final and reliable when consumed.

For introductory information on this application, refer to Datasets service.

 

This section describes how to get started with the Datasets service and create a new Dataset.

 

Procedure

To create a new Dataset, follow these steps:

 

1. Log in to the Bizagi Datasets service portal.

At https://bi-[CustomerName].bizagi.com/, provide your user credentials and log in.

 

2. Create a new Datasets service project.

Click the plus icon to add a project.

 

Cloud_Datasets5

 

Give the Datasets service project a Name and a meaningful Description, and click Create Project when you are ready.

 

A Datasets service project is not related to a Bizagi project and its processes.

A Datasets service project's main purpose is to help you organize and manage the datasets you may have.

 

3. Create a new dataset.

Click the New dataset button for a given Datasets service project.

 

Cloud_BizagiDataset8

 

Give the dataset a Name and a meaningful Description, and define its column structure.

You can either load a .csv file for the dataset to take the file's columns as definitions, or manually define columns.

 

Cloud_BizagiDataset9

 

Defining the structure from a .csv

If using the .csv option, click Select a .csv to upload your file from its location:

 

Cloud_Datasets_Def1

 

The dataset automatically detects the data type of each column.

 

Cloud_Datasets_Def3

 

However, it is important that you double-check that each column is set with the appropriate data type. If there is a problem, you can manually change a column's definition.

You can choose String, Numeric, Boolean or Datetime as the data type.

 

Cloud_BizagiDataset11

 

Click Create dataset to finish up.
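Before uploading, it can help to check locally which data type each column of your .csv is likely to map to, so that you know what to look for when you review the detected types. The following Python sketch is only an illustration: the rules it applies (true/false for Boolean, ISO 8601 text for Datetime) and the function names are assumptions, and it is not the detection logic used by the Datasets service.

```python
import csv
import io
from datetime import datetime


def _is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


def _is_iso_datetime(text):
    # Assumption: dates are written in ISO 8601 form, e.g. 2024-01-15.
    try:
        datetime.fromisoformat(text)
        return True
    except ValueError:
        return False


def guess_type(values):
    """Guess String, Numeric, Boolean or Datetime for one column's values."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "String"
    if all(v.lower() in ("true", "false") for v in non_empty):
        return "Boolean"
    if all(_is_number(v) for v in non_empty):
        return "Numeric"
    if all(_is_iso_datetime(v) for v in non_empty):
        return "Datetime"
    return "String"


def guess_column_types(csv_text):
    """Return a {column name: guessed type} mapping for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return {
        name: guess_type([row[name] or "" for row in rows])
        for name in reader.fieldnames
    }


# Hypothetical sample data, for illustration only.
sample = """OrderId,Customer,Paid,CreatedOn
1001,Acme,true,2024-01-15
1002,Globex,false,2024-02-03T10:30:00
"""
print(guess_column_types(sample))
```

If the portal shows a different type from the one you expected for a column, change that column's definition manually before you create the dataset.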

 

Defining the structure manually

Click Create a schema manually to define each of the columns by entering its column name and choosing its data type:

 

Cloud_Datasets_Def2

 

You can choose String, Numeric, Boolean, Datetime or Collection as the data type.

Click Create dataset to finish up.

 

Note: Regardless of how the structure was defined, you can update it by adding extra columns manually. For more information, refer to Adding a column to an existing dataset.

 

At this point, your dataset is created and you can start using it right away.

The new Dataset appears under its Datasets service project, along with its three default environments: Development, Testing and Production.

 

Cloud_Datasets10

 

To learn about these different environments and their uses, refer to Dataset environments.

 

Adding collection columns

Datasets support collections up to two levels deep. The procedure for defining a collection depends on how you define your Dataset structure: from a .csv file or manually. Both options are explained below.

 

Adding collections from a .csv

If you want to create a Dataset with collections in its structure, edit the .csv file to include the columns of your collection following this pattern:

 

CollectionName__Column

 

In this pattern, use a double underscore (__) to separate the name of the collection from the name of the column. In addition, repeat the parent record once for each item in the collection.

 

Cloud_BI_col1

 

In the example above, the Dataset has a Numeric column (CountryCode), a String column (CountryName) and a Collection column called City, which contains two columns (Code and Name). Because the collection for the Colombia record has five items, the record values 57, Colombia are repeated five times. This is important when you feed your dataset from an environment. For more information, refer to Working with Dataset environments.
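If your data starts out as a parent record with a list of items, a short script can produce a .csv in this layout. The following Python sketch assumes a single collection per record; the function names and the city codes and names are placeholders chosen for illustration, and are not taken from the example image.

```python
import csv
import io


def flatten_record(parent, collection_name, items):
    """Repeat the parent record once per item, using CollectionName__Column headers."""
    rows = []
    for item in items:
        row = dict(parent)
        for column, value in item.items():
            row[f"{collection_name}__{column}"] = value
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize the flattened rows as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


country = {"CountryCode": 57, "CountryName": "Colombia"}
# Placeholder collection items (five, to match the example's item count).
cities = [
    {"Code": 1, "Name": "Bogota"},
    {"Code": 2, "Name": "Medellin"},
    {"Code": 3, "Name": "Cali"},
    {"Code": 4, "Name": "Barranquilla"},
    {"Code": 5, "Name": "Cartagena"},
]

csv_text = to_csv(flatten_record(country, "City", cities))
print(csv_text)
```

The output has one header row (CountryCode, CountryName, City__Code, City__Name) followed by five rows that each begin with 57,Colombia.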

 

If you use only one underscore character to split the collection and the column, the underscore is removed and the column is not interpreted as a Collection. For example:

 

Input value     Interpreted as collection     Column name(s)
City_Code       No                            citycode
CityCode_       No                            citycode
CityCode__      No                            citycode
City__Code      Yes                           city, code
City___Code     Yes                           city, code
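To check your headers before uploading, you can mirror the examples above in a few lines of Python. This is a minimal sketch that reproduces only those five cases: the function name and the regular expression are assumptions, not the portal's actual parsing rules, and the result for any other header shape (for example, a name with more than one separator) is not documented here.

```python
import re

# Assumption: a collection header is "Name", two or more underscores, "Column".
_COLLECTION_HEADER = re.compile(r"^([^_]+)_{2,}([^_]+)$")


def interpret_header(header):
    """Return (is_collection, column_names) for a .csv header."""
    match = _COLLECTION_HEADER.match(header)
    if match:
        collection, column = match.groups()
        return True, [collection.lower(), column.lower()]
    # Not a collection: underscores are removed from the column name.
    return False, [header.replace("_", "").lower()]


for name in ["City_Code", "CityCode_", "CityCode__", "City__Code", "City___Code"]:
    print(name, interpret_header(name))
```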

 

Back in the Bizagi Datasets service portal, click Select a .csv to upload your file from its location:

 

Cloud_BI_col2

 

The Dataset automatically detects the Data type of each column, including the collections and their inner columns.

 

Cloud_BI_col3

 

As always, it is important that you double-check that each column is set with the appropriate Data type. If there is a problem, you can manually change a column's definition.

A column detected as a collection cannot be changed to another Data type.

 

Adding collections manually

In the Bizagi Datasets service portal, click Create a schema manually, enter the name of the collection in Column Name and choose Collection as its Data type:

 

Cloud_BI_col4

 

Click Add column and then click the pencil icon at the right of the collection type row.

 

Cloud_BI_col5

 

In the new window, add the columns of the collection.

 

Cloud_BI_col6

 

You can choose String, Numeric, Boolean, Datetime or Collection as the Data type.

Click Close to finish up and return to the dataset. The collection column now shows the number of columns it contains in parentheses.
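When planning a structure with collections, it can be useful to write it down and check it against the limits described in this section before entering it in the portal. The Python sketch below models a structure as nested columns and reports collections nested more than two levels deep. The Column class and the validate function are hypothetical helpers, not part of any Bizagi API, and counting a collection inside a collection as two levels is an interpretation of the limit.

```python
from dataclasses import dataclass, field
from typing import List

SCALAR_TYPES = {"String", "Numeric", "Boolean", "Datetime"}
MAX_COLLECTION_DEPTH = 2  # "up to two levels deep", as interpreted here


@dataclass
class Column:
    name: str
    data_type: str
    columns: List["Column"] = field(default_factory=list)  # used by collections


def validate(columns, depth=0, path=""):
    """Return a list of problems found in a planned dataset structure."""
    problems = []
    for column in columns:
        full_name = f"{path}{column.name}"
        if column.data_type == "Collection":
            if depth + 1 > MAX_COLLECTION_DEPTH:
                problems.append(
                    f"{full_name}: collections nest deeper than "
                    f"{MAX_COLLECTION_DEPTH} levels"
                )
            else:
                problems.extend(
                    validate(column.columns, depth + 1, f"{full_name}.")
                )
        elif column.data_type not in SCALAR_TYPES:
            problems.append(f"{full_name}: unknown data type {column.data_type!r}")
    return problems


schema = [
    Column("CountryCode", "Numeric"),
    Column("CountryName", "String"),
    Column("City", "Collection", [Column("Code", "Numeric"), Column("Name", "String")]),
]
too_deep = [
    Column("A", "Collection", [
        Column("B", "Collection", [Column("C", "Collection")]),
    ]),
]

print(validate(schema))
print(validate(too_deep))
```

An empty list means the planned structure stays within the data types and nesting depth described above.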