After you create an AI experiment, as described in Creating models and experiments, you are presented with its basic properties and the diagnosis steps.
This section describes how to work with an experiment and explore the possible combinations of input parameters and algorithms, until you settle on the experiment whose results are satisfactory for your use case.
Once the experiment is created, you see its basic properties (Name, Description and source Dataset) and need to complete three further steps for predictive analysis.
These steps are Predicted Feature, Feature Selection and Summary Predict Model.
For the experiment shown above, input data was created in a Dataset called DiabetesIndianWomen.
Records in this Dataset indicate whether a given patient tested positive for diabetes. This result is held in the testresult attribute/variable, as shown in the column presented below:
The actual data was taken from https://archive.ics.uci.edu/ml/datasets/pima+indians+diabetes, and it considers attributes/variables such as:
•Number of times pregnant (shortened to pregnant).
•Plasma glucose concentration at 2 hours in an oral glucose tolerance test (shortened to plasma).
•Diastolic blood pressure, in mm Hg (shortened to bpressure).
•Triceps skin fold thickness, in mm (shortened to skinthickness).
•2-hour serum insulin, in mu U/ml (shortened to 2h_insulin).
•Body mass index, that is, weight in kg/(height in m)^2 (shortened to BMI).
•Diabetes pedigree function (shortened to diabetespf).
•Age in years (shortened to years).
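As a rough illustration of the attribute list above, a record can be sketched as a small Python data class. The field names follow the shortened names used in this section, except `insulin_2h`, which stands in for 2h_insulin because that is not a valid Python identifier. The `bmi` helper, the `PatientRecord` class and all sample values are hypothetical and are not taken from the Dataset.

```python
from dataclasses import dataclass


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kg divided by the square of height in m."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / (height_m ** 2)


@dataclass
class PatientRecord:
    pregnant: int
    plasma: float
    bpressure: float
    skinthickness: float
    insulin_2h: float  # stands in for the 2h_insulin attribute
    BMI: float
    diabetespf: float
    years: int
    testresult: bool


# Made-up values for illustration only.
sample = PatientRecord(
    pregnant=2, plasma=120.0, bpressure=70.0, skinthickness=25.0,
    insulin_2h=90.0, BMI=bmi(70.0, 1.75), diabetespf=0.4, years=33,
    testresult=False,
)
print(round(sample.BMI, 2))  # 22.86
```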
Step 1: Predicted Feature
The Predicted Feature step allows you to select the attribute you want to predict (the one your use case is about).
For the example described above, we want to predict, based on the other variables, whether diabetes can be accurately suspected before running tests.
To do so, select the name of the attribute to predict in the drop-down list.
Selecting testresult makes it the predicted feature:
Click Next when finished.
Step 2: Feature Selection
The Feature Selection step allows you to mark the attributes you consider relevant and directly influencing your predicted feature.
For the example described above, we would be identifying which variables have a relationship with diabetes (for either a positive or negative diabetes result).
To do so, tick all the attributes that should become selected features (i.e., predictors).
For example, you may unmark the checkbox for age if you decide that age should not be taken into account by the analysis.
Though you can see sample data for each feature, at any time you may switch to the Customize Features view for an in-depth analysis or to modify how the values of the features are treated.
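Conceptually, choosing a predicted feature and ticking selected features amounts to splitting each record into a target value and a list of predictor values. The following is a minimal sketch of that idea, assuming records are plain dictionaries keyed by the shortened attribute names. The `split_record` helper and the sample values are hypothetical and are not part of Bizagi.

```python
PREDICTED_FEATURE = "testresult"

ALL_ATTRIBUTES = ["pregnant", "plasma", "bpressure", "skinthickness",
                  "2h_insulin", "BMI", "diabetespf", "years", "testresult"]

# Every attribute except the predicted feature starts out ticked;
# "years" is then unticked to leave age out of the analysis.
selected_features = [a for a in ALL_ATTRIBUTES
                     if a not in (PREDICTED_FEATURE, "years")]


def split_record(record, features, target):
    """Return (predictor values in feature order, target value)."""
    return [record[name] for name in features], record[target]


# Made-up values for illustration only.
record = {"pregnant": 2, "plasma": 120, "bpressure": 70, "skinthickness": 25,
          "2h_insulin": 90, "BMI": 22.9, "diabetespf": 0.4, "years": 33,
          "testresult": False}
x, y = split_record(record, selected_features, PREDICTED_FEATURE)
print(selected_features)
print(x, y)
```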
Further options are available to help you make an informed decision on feature selection, as described below:
•Modifying the data type per feature.
•Defining how to replace empty values per feature.
•Looking up a values distribution chart per feature.
You may also use the Suggest features button so that Bizagi Artificial Intelligence automatically marks those features it identifies as relevant (and highlights them).
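This section does not say how Suggest features decides which features are relevant. Purely to illustrate the general idea, the sketch below ranks numeric features by the absolute Pearson correlation with the predicted feature. The function names, the 0.3 threshold and the sample values are all assumptions, not Bizagi's method.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def suggest_features(columns, target, threshold=0.3):
    """Names of columns whose |correlation| with target reaches threshold,
    strongest first."""
    scores = {name: abs(pearson(values, target))
              for name, values in columns.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda name: -scores[name])


# Made-up values; testresult is encoded as 1 (positive) or 0 (negative).
columns = {
    "plasma": [85, 150, 90, 170, 100, 160],
    "bpressure": [70, 70, 80, 80, 75, 75],
}
testresult = [0, 1, 0, 1, 0, 1]
print(suggest_features(columns, testresult))  # ['plasma']
```

A correlation ranking only captures linear relationships, so a suggestion of this kind is a starting point rather than a replacement for your own judgment.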
By default, the predicted feature is also shown and cannot be unmarked:
Modifying the data type
In the Customize Features view, you may edit the data type that the Artificial Intelligence capabilities identified automatically.
Available data types are: Numerical, Categorical, or Date.
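How the data type is identified automatically is not described here. As a hedged sketch of the general idea, the following guesses Numerical, Categorical or Date from raw string values. The `infer_type` function, its rules and the `%Y-%m-%d` date format are assumptions.

```python
from datetime import datetime


def _is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


def _is_date(text, fmt="%Y-%m-%d"):
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


def infer_type(values):
    """Guess a column type from its non-empty string values."""
    present = [v for v in values if v not in ("", None)]
    if present and all(_is_number(v) for v in present):
        return "Numerical"
    if present and all(_is_date(v) for v in present):
        return "Date"
    return "Categorical"


print(infer_type(["33", "", "51"]))              # Numerical
print(infer_type(["2020-01-31", "2021-07-04"]))  # Date
print(infer_type(["true", "false"]))             # Categorical
```

A rule of this kind would classify a column of 0/1 codes as Numerical even when it really encodes categories, which is one reason being able to edit the detected type is useful.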
Defining how to replace empty values
In the Customize Features view, you may decide whether to replace empty values with a given value:
Empty values (i.e., those seen as null) can be replaced with Zero (0), the average value of the whole set (Mean), the most frequently used value (Mode), or a value defined by you (User defined).
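The four replacement options can be sketched in a few lines of Python. This is an illustration only: the `replace_empty` function is hypothetical, empty values are represented as `None`, and Mean and Mode are computed over the non-empty values, which is an assumption about how the whole set is treated.

```python
from statistics import mean, mode


def replace_empty(values, strategy, user_value=None):
    """Return a copy of values with None replaced according to strategy."""
    present = [v for v in values if v is not None]
    if strategy == "Zero":
        fill = 0
    elif strategy == "Mean":
        fill = mean(present)
    elif strategy == "Mode":
        fill = mode(present)
    elif strategy == "User defined":
        fill = user_value
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Made-up 2h_insulin values with two empty entries.
insulin = [90, None, 90, 120, None]
print(replace_empty(insulin, "Zero"))   # [90, 0, 90, 120, 0]
print(replace_empty(insulin, "Mean"))   # [90, 100, 90, 120, 100]
print(replace_empty(insulin, "Mode"))   # [90, 90, 90, 120, 90]
print(replace_empty(insulin, "User defined", user_value=75))
```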
Looking up a values distribution chart
In the Advanced view, you may click on the icon for a given feature, and view its chart representing how data is distributed:
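A values distribution chart is essentially a histogram. As a rough sketch of what such a chart summarizes, the code below counts values per fixed-width bin and prints a text rendering. The function names, the bin width of 10 and the sample ages are assumptions for illustration.

```python
from collections import Counter


def distribution(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def text_chart(counts, bin_width):
    """Render the counts as one line of '#' marks per bin."""
    return "\n".join(
        f"{low:>3}-{low + bin_width - 1:<3} {'#' * n}"
        for low, n in counts.items()
    )


years = [21, 25, 29, 33, 34, 41, 45, 52, 58, 63]  # made-up ages
counts = distribution(years, bin_width=10)
print(text_chart(counts, bin_width=10))
```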
Step 3: Summary Predict Model
Once you are done with the experiment's configuration, click Train Model > to have the Artificial Intelligence capabilities interpret the data and generate a model, together with a measure of its certainty.
Note that Bizagi Artificial Intelligence chooses the best algorithm for your specific use case, and also carries out machine learning analysis steps such as training the model.
For the example above, where we determine whether a diabetes result (true or false) can be predicted, the analysis yields an accuracy (because true and false values make diabetes a categorical data type):
In a hypothetical case where we wanted to predict the age of the patient, based on variables such as whether that patient has diabetes, the analysis would yield a standard error (because age has numeric values that we do NOT want to interpret as a categorical data type).
Note that even though a human age recorded in years takes a limited set of values, it can still be considered continuous, mainly because we want a predicted age together with an offset for that prediction.
Depending on how good this certainty is for your use case (look for a higher accuracy or a lower standard error), you may decide to create additional experiments.
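To make the two kinds of certainty concrete, the sketch below computes an accuracy for categorical predictions and an error figure for numeric predictions. This section does not define how the standard error is calculated, so root-mean-square error is used here as a stand-in. Both functions and all values are illustrative assumptions.

```python
import math


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual category."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


def rmse(actual, predicted):
    """Root-mean-square error between actual and predicted numbers."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up predictions for illustration only.
# Categorical target (testresult): higher accuracy is better.
print(accuracy([True, False, True, True], [True, False, False, True]))  # 0.75

# Numeric target (age in years): lower error is better.
print(round(rmse([30, 40, 50], [33, 36, 50]), 3))  # 2.887
```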
Testing the experiment
Once a model has been generated, you may click the Test predictions button to manually input sample values and evaluate if the prediction's certainty is good enough for your use case.
To run the test, input values, select them from the drop-down (for categorical data types), or leave some blank, and then click Test Prediction:
At this point you have created an AI experiment, and you need to publish it so that it can be used externally (e.g., so that your Bizagi processes can rely on AI capabilities).
For more information about this next step, refer to Publishing an AI experiment.
Advanced options for data scientists
At any point you may also choose to edit this experiment's parameters in an advanced mode (e.g., for data scientists), so that you can select a different machine learning algorithm.
For more information about this advanced mode, refer to AI experiments advanced options.
If you are looking for sets of data to explore while getting started with Bizagi Artificial Intelligence, you may rely on external web sites that publish sample data, such as: http://mldata.org/repository/tags/data/earthquakes/.
That data has the structure defined in the Data tab shown at: http://mldata.org/repository/data/viewslug/global-earthquakes/.
Similarly, sets of data can be found at https://archive.ics.uci.edu/ml/datasets.html.