When you have created an AI experiment, as described in Creating models and experiments, you see its basic properties and the diagnosis steps.
This section describes how to work with an experiment and explore the possible combinations of input parameters and algorithms, so you can assemble an experiment whose results satisfy your use case statement.
When you have created an AI experiment, you see its basic properties (Name, Description and source Dataset). You need to perform three further steps to enable predictive analysis.
The steps are Predicted Feature, Feature Selection and Summary Prediction Model.
For the experiment shown above, input data was created in a Dataset called DiabetesIndianWomen.
Records in this Dataset help determine if diabetes results will be positive for a given patient, based on the information held in the testresult attribute/variable, as shown in the column presented below:
The actual data was taken from https://archive.ics.uci.edu/ml/datasets/pima+indians+diabetes, and it includes attributes/variables such as:
•Number of times pregnant (shortened to pregnant).
•Plasma glucose concentration at 2 hours in an oral glucose tolerance test (shortened to plasma).
•Diastolic blood pressure, in mm Hg (shortened to bpressure).
•Triceps skin fold thickness, in mm (shortened to skinthickness).
•Two-hour serum insulin, in mu U/ml (shortened to 2h_insulin).
•Body mass index, as weight in kg/(height in m)^2 (shortened to BMI).
•Diabetes pedigree function (shortened to diabetespf).
•Age in years (shortened to years).
Step 1: Predicted Feature
The Predicted Feature step lets you select the attribute you want to predict (which refers to your use case statement).
For the example described above, we want to predict, based on the other variables, whether diabetes can be accurately suspected before running tests.
Select the name of the attribute to predict in the drop-down list.
Selecting testresult makes it become the predicted feature:
Click Next when you are ready.
Step 2: Feature Selection
The Feature Selection step lets you mark the attributes that you consider relevant to your predicted feature and a direct influence on it.
In the example described above, we want to identify which variables have a relationship with diabetes (for either a positive or negative diabetes result).
To do this, check all significant attributes to make them selected features (i.e., predictors).
For example, you can uncheck the checkbox for age if you think that age should not be taken into account by the analysis.
You can see sample data for each feature, and at any time you can switch to Customize Features for an in-depth analysis or to modify how the application treats the values of the features.
Further options are available to help you make an informed decision on feature selection:
•Modify the data type per feature.
•Define how to replace empty values per feature.
•Look up a values distribution chart per feature.
You can also use the Suggest features button so that Bizagi Artificial Intelligence automatically marks and highlights features it identifies as relevant.
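The source does not state how suggestions are computed. As a rough intuition only, the sketch below ranks features by the strength of their linear correlation with the predicted feature. The function names `pearson` and `suggest`, the 0.5 threshold, the 0/1 coding of testresult and the sample values are all assumptions made for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length number lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def suggest(features, target, threshold=0.5):
    """Return the names of features whose |correlation| with target >= threshold."""
    return [name for name, column in features.items()
            if abs(pearson(column, target)) >= threshold]

# Made-up values for illustration, not taken from the dataset.
features = {
    "plasma": [85, 90, 150, 170],
    "bpressure": [70, 72, 71, 69],
}
testresult = [0, 0, 1, 1]  # assumed coding: 1 = positive, 0 = negative

suggested = suggest(features, testresult)
print(suggested)
```

With these made-up rows only plasma passes the threshold. A column with a single constant value would cause a division by zero in this sketch, so a real implementation would need to handle that case.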
By default, the predicted feature is shown as well and it cannot be unselected:
Modifying the data type
In Customize Features, you can edit the data type identified by Artificial Intelligence.
Available data types are: Numeric, Category, or Date.
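To see why an identified data type may need editing, consider the minimal sketch below, which guesses one of the three data types from sample values. It is not Bizagi's detection logic: the function name `infer_type` and the single supported date format (YYYY-MM-DD) are assumptions made for illustration.

```python
from datetime import datetime

def _is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False

def _is_date(text):
    try:
        datetime.strptime(text, "%Y-%m-%d")  # assumed date format
        return True
    except ValueError:
        return False

def infer_type(samples):
    """Guess "Numeric", "Date" or "Category" from a list of string samples."""
    non_empty = [s for s in samples if s]  # ignore empty strings and None
    if non_empty and all(_is_number(s) for s in non_empty):
        return "Numeric"
    if non_empty and all(_is_date(s) for s in non_empty):
        return "Date"
    return "Category"

print(infer_type(["33.6", "26.6", ""]))      # numeric-looking samples
print(infer_type(["positive", "negative"]))  # text labels
print(infer_type(["0", "1", "1"]))           # coded labels look numeric
```

In this sketch a feature coded as 0 and 1 is guessed as Numeric even when you intend it as a Category, which is the kind of case where a manual override helps.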
Defining how to replace empty values
In Customize Features, you can decide if you want to replace empty values with a default value:
You can set empty values (those shown as null) to Zero (0), the average value of the whole set (Mean), the most frequently used value (Mode), or a value that you define (User defined).
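The four replacement options can be sketched in plain Python as follows. This is an illustration, not Bizagi's implementation: the function name `fill_empty` and the strategy keywords are hypothetical, and the sketch assumes that Mean and Mode are computed over the non-empty values only.

```python
from statistics import mean, mode

def fill_empty(values, strategy, user_value=None):
    """Return a copy of `values` with None entries replaced.

    `strategy` is "zero", "mean", "mode" or "user" (hypothetical names that
    mirror the Zero, Mean, Mode and User defined options).
    """
    known = [v for v in values if v is not None]
    if strategy == "zero":
        fill = 0
    elif strategy == "mean":
        fill = mean(known)   # average of the non-empty values (assumption)
    elif strategy == "mode":
        fill = mode(known)   # most frequent non-empty value
    elif strategy == "user":
        fill = user_value
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]

# Made-up blood pressure readings for illustration, not taken from the dataset.
bpressure = [70, None, 80, 72, 70, None]

for strategy in ("zero", "mean", "mode"):
    print(strategy, fill_empty(bpressure, strategy))
print("user", fill_empty(bpressure, "user", user_value=75))
```

The choice matters for the analysis: with these made-up readings, Zero pulls the feature toward an implausible value, while Mean and Mode keep replaced entries inside the observed range.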
Looking up a values distribution chart
In the Advanced view, you can click the icon for a given feature, and view its chart representing how its data is distributed:
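If you want to reproduce a similar view outside the application, a values distribution can be approximated by counting how many values fall into equal-width ranges. The sketch below is a text-only stand-in for the chart; the function name `distribution`, the bin width of 10 and the sample ages are assumptions made for illustration.

```python
from collections import Counter

def distribution(values, bin_width):
    """Count how many non-empty values fall into each bin of `bin_width`."""
    counts = Counter(
        (v // bin_width) * bin_width for v in values if v is not None
    )
    return dict(sorted(counts.items()))

# Made-up ages for illustration, not taken from the dataset.
ages = [21, 25, 33, 38, 41, 22, None]
width = 10
counts = distribution(ages, width)

# Print one bar per bin; each key is the lower bound of its bin.
for start, n in counts.items():
    print(f"{start}-{start + width - 1}: {'#' * n}")
```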
Step 3: Summary Predict Model
When you are done with the experiment's configuration, click Train Model > to have the Artificial Intelligence capabilities interpret the data and generate the model that presents a given certainty.
Bizagi Artificial Intelligence chooses the best algorithm for your specific use case statement and carries out machine learning analysis steps such as training the model.
For the example above, where we are determining whether a diabetes result (true or false) can be predicted, the analysis yields an accuracy, because true and false values make the diabetes result a category data type:
In a hypothetical case where we want to predict the age of the patient, based on variables such as the patient having diabetes, the analysis would yield a standard error (age has numeric values that we do not want to interpret as a category data type).
Even though age recorded in years for humans takes a limited set of values, it can still be treated as continuous, because we want a predicted age together with an offset for that prediction.
Depending on how good the presented certainty is for your use case, you can choose to create additional experiments that may provide higher accuracy or a lower standard error.
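The difference between the two certainty measures can be illustrated with a short sketch. The source does not give the formulas that Bizagi applies, so the code below assumes accuracy is the fraction of correct category predictions and uses the root mean squared difference as one common form of standard error. The function names and sample values are made up for illustration.

```python
from math import sqrt

def accuracy(actual, predicted):
    """Fraction of predictions that match the actual category."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)

def standard_error(actual, predicted):
    """Root mean squared difference between actual and predicted numbers."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(actual))

# Category case: predicting testresult (made-up values).
actual_result = [True, False, True, True]
predicted_result = [True, False, False, True]
print(accuracy(actual_result, predicted_result))

# Numeric case: predicting age in years (made-up values).
actual_age = [30, 40, 50, 60]
predicted_age = [33, 36, 50, 60]
print(standard_error(actual_age, predicted_age))
```

A higher accuracy is better for a category prediction, while a lower standard error is better for a numeric one, because it measures how far predictions typically fall from the actual values.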
Testing the experiment
Once a model has been generated, you can click the Test predictions button to manually input sample values and evaluate whether the prediction's certainty is good enough for your use case.
To run the test, enter values (selecting them from the drop-down menu for category data types, or leaving some blank), then click Test Prediction:
At this point you have created an AI experiment. You need to publish it so your Bizagi processes can rely on AI capabilities by using the experiment.
For more information about this next step, refer to Publishing an AI experiment.
Advanced options for data scientists
At any point you can also choose to edit the experiment's parameters in an advanced mode (e.g., for data scientists) to select a different machine learning algorithm.
For more information about the advanced mode, refer to AI experiments advanced options.
If you are looking for sets of data to use as you try out Bizagi Artificial Intelligence, several web sites publish sample data, such as http://mldata.org/repository/tags/data/earthquakes/.
The data has the structure defined in the Data tab shown at: http://mldata.org/repository/data/viewslug/global-earthquakes/.
Similarly, sets of data can be found at https://archive.ics.uci.edu/ml/datasets.html.