Merging multiple model evaluation results in Azure ML

Evaluating prediction models in Azure ML Studio is always a must. But when your experiment consists of multiple models and you want to compare them all at the same time, it can get a bit tricky.

Suppose we have a regression problem and the experiment consists of three models.


After running the experiment, to see the evaluation scores you need to open the result of each model separately. Each time, the result looks as follows:


At this point there are a couple of ways to compare the models:

  1. Adding the “Add Rows” module twice, one after another
  2. Merging the results of two models into one “Evaluate Model” module, or
  3. Adding an “Execute R Script” module together with the “Add Rows” module for cleaner output

By far the best is the third approach, because we add the labels to the results ourselves and therefore always know which row belongs to which model when judging which one performs best.

When merging the results of two models into one “Evaluate Model” module, as shown in the diagram,


the main problem is the labels:


Intuitively, you can assume that the rows follow the order of the inputs, but there is no guarantee that this order will always be preserved. Moreover, a third model cannot be added to the “Evaluate Model” module, because it compares only two models. Therefore, the “Add Rows” module must be used.

So the experiment will look something like this:


Again, the “Add Rows” module offers no way to expose the model name, so only the assumed row order tells you which result row corresponds to which model.
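To see why row order is the only clue, here is a minimal sketch in plain R. It assumes that “Add Rows” behaves roughly like `rbind` on two data frames with identical columns. The column names and metric values are made up for illustration and are not taken from the experiment.

```r
# Hypothetical one-row evaluation results for two models (values are made up)
eval_a <- data.frame(Mean.Absolute.Error = 1.20, Root.Mean.Squared.Error = 1.85)
eval_b <- data.frame(Mean.Absolute.Error = 1.05, Root.Mean.Squared.Error = 1.60)

# Stacking the rows keeps the metrics but carries no hint of the source model
stacked <- rbind(eval_a, eval_b)
print(stacked)
```

The stacked result has two rows of metrics and no column that identifies the model, which is exactly the gap the labels are meant to close.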


To avoid confusion over model labels (imagine having 20 models to compare, where the models differ not by name but by a particular parameter setting), we add an “Execute R Script” module after each evaluation. The finalized experiment will look like this:


Together, the “Execute R Script” and “Add Rows” modules produce the best final result, with a label and description for each model.


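Populate the “Execute R Script” module with the following R code:

dataset1 <- maml.mapInputPort(1) # class: data.frame; evaluation results
# Label the row with the model name, then append metric columns 2 to 6
data.set <- data.frame(Algorithm = 'Bayesian Linear Regression')
data.set <- cbind(data.set, dataset1[2:6])
# Return the labeled row through the output port
maml.mapOutputPort("data.set")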
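When many models differ only by a parameter setting, the label can be built from those settings instead of being typed by hand. The following is a minimal sketch in plain R, not part of the original experiment: `make_label` and `label_result` are hypothetical helpers, the model name and parameters are examples, and the metric values are made up. Inside the module, the `evaluation` data frame would come from `maml.mapInputPort(1)`.

```r
# Hypothetical helper: build a readable label from a model name and its settings
make_label <- function(model, params) {
  settings <- paste(names(params), unlist(params), sep = "=", collapse = ", ")
  paste0(model, " (", settings, ")")
}

# Hypothetical helper: prepend the label to a one-row evaluation result
label_result <- function(evaluation, label) {
  cbind(data.frame(Algorithm = label, stringsAsFactors = FALSE), evaluation)
}

# Made-up evaluation row standing in for the module's input port
evaluation <- data.frame(Mean.Absolute.Error = 1.20, Root.Mean.Squared.Error = 1.85)

label <- make_label("Boosted Decision Tree", list(trees = 100, learning_rate = 0.2))
labeled <- label_result(evaluation, label)
print(labeled)
```

Keeping the label logic in one small function means every “Execute R Script” module can reuse the same pattern and only the name and settings change.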

How the returned dataset is built is up to the data scientist. Note that not all algorithms return the same evaluation metrics. For example, the “Bayesian Linear Regression” model returns the negative log likelihood, whereas “Linear Regression” does not. So when using an R script for the output, make sure that all merged datasets have the same number of columns with the same names (and, more importantly, that they all contain the same metrics!).
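One way to satisfy the same-columns requirement is to keep only the metrics that every model reports. The sketch below shows the idea in plain R, outside the module's port constraints. `merge_evaluations` is a hypothetical helper, and the column names and values are made up for illustration.

```r
# Hypothetical helper: label each result, keep only shared columns, then stack
merge_evaluations <- function(results) {
  # Columns present in every evaluation result
  shared <- Reduce(intersect, lapply(results, names))
  rows <- lapply(names(results), function(model) {
    cbind(data.frame(Algorithm = model, stringsAsFactors = FALSE),
          results[[model]][, shared, drop = FALSE])
  })
  do.call(rbind, rows)
}

# Made-up results: only the Bayesian model reports a negative log likelihood
results <- list(
  "Bayesian Linear Regression" = data.frame(
    Negative.Log.Likelihood = 250.3, Mean.Absolute.Error = 1.20,
    Root.Mean.Squared.Error = 1.85),
  "Linear Regression" = data.frame(
    Mean.Absolute.Error = 1.05, Root.Mean.Squared.Error = 1.60)
)

merged <- merge_evaluations(results)
print(merged)
```

In the experiment itself, each “Execute R Script” module handles a single result, so the equivalent step is to select the same metric columns in every script and let “Add Rows” do the stacking.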

Finally, after running the experiment, we get the following export:


This is how the export of a model comparison should be done, and now you don’t have to worry about mixing up the names and the models.