This sample demonstrates how to compare multiple multi-class classifiers using the letter recognition dataset.
## Compare Multi-class Classifiers: Letter Recognition

This sample demonstrates how to create multi-class classifiers and evaluate and compare the performance of multiple models.

## Data

For this experiment, we use the letter image recognition data from the [UCI repository](http://archive.ics.uci.edu/ml/machine-learning-databases/letter-recognition). The first column is the label, which identifies each row as one of the 26 letters, A-Z. The remaining 16 columns are feature columns. The dataset contains 20,000 instances. A description and other details about the data can be found at [http://archive.ics.uci.edu/ml/machine-learning-databases/letter-recognition/letter-recognition.names](http://archive.ics.uci.edu/ml/machine-learning-databases/letter-recognition/letter-recognition.names).

![image1]

For this experiment, we used the **Split** module to randomly divide the dataset using an 80-20 ratio of training data to test data. Then we trained multiple models using the **Train Model** module with the training dataset as input.

![image2] ![image3] ![image4]

## Models

We compared four _multi-class classification algorithms_ provided in Azure ML Studio: **Multiclass Neural Network**, **Multiclass Decision Jungle**, **Multiclass Logistic Regression**, and **Multiclass Decision Forest**.

![image5]

Azure ML Studio also provides a module called **One-vs-All Multiclass**, which can use any binary classifier as input to solve a multi-class classification problem, based on the [one-vs-all](http://en.wikipedia.org/wiki/Multiclass_classification#One-vs.-rest) method. Therefore, as the fifth model for comparison in our letter recognition task, we used a binary classification module, **Two-Class Support Vector Machine**, and connected it to the **One-vs-All Multiclass** module.

![image6]

## Results

We used the **Score Model** module on each combination of a trained model and the test data, and then used the **Evaluate Model** module to compute the confusion matrix of results.
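Outside Studio, what **Evaluate Model** computes can be sketched in a few lines: tally each (actual, predicted) label pair into a matrix whose rows are actual classes and columns are predicted classes. This is a minimal illustration in Python, not the Studio module itself, and the toy labels below are made up rather than drawn from the letter dataset:

```python
from collections import Counter

def confusion_matrix(actual, predicted, classes):
    """Count (actual, predicted) pairs into a nested dict: rows = actual, columns = predicted."""
    counts = Counter(zip(actual, predicted))
    return {a: {p: counts.get((a, p), 0) for p in classes} for a in classes}

# Toy labels standing in for the 26 letter classes (illustrative only).
classes = ["A", "B", "C"]
actual    = ["A", "A", "B", "B", "C", "C", "A", "B"]
predicted = ["A", "B", "B", "B", "C", "A", "A", "C"]

cm = confusion_matrix(actual, predicted, classes)
for a in classes:
    print(a, [cm[a][p] for p in classes])
```

Diagonal entries are correct predictions; off-diagonal entries show which letters are confused with which.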
To view the confusion matrix, right-click the output port of the **Evaluate Model** module and select **Visualize**. Because **Evaluate Model** has two input ports, you can compare the confusion matrices of two different models side by side by connecting the optional second port to the output of another **Score Model** module.

![image7]

Next, we used custom R code in an **Execute R Script** module to compute the _macro precision_ and _macro recall_ for the individual models, and a series of **Add Rows** modules to combine those results. The output of the final **Add Rows** module shows the completed results dataset. From these results, we can see that the model created by using **Multiclass Decision Forest** has the best macro precision, while the model created by using **Multiclass Neural Network** has the best macro recall (although only marginally better than the decision forest model).

![image8]

We used **Visualize** on the output of **Evaluate Model** to review the confusion matrix. However, to obtain the raw data behind the prediction results, we can use an **Execute R Script** module to pass the same data through without modification and visualize it from the module's output port. In the experiment, we did this only for the branch containing the **Multiclass Neural Network** model, to illustrate how you can see the prediction results as counts. These numbers can also be used to compute other metrics such as _micro precision_ and _micro recall_.

![image9]

If we scroll to the right side of the visualization panel, we can also see these metrics for each class: **Average Log Loss**, **Precision**, and **Recall**.
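The macro averages computed by the R script work like this: for each class, precision is the diagonal count divided by its column sum and recall is the diagonal count divided by its row sum, and the macro values are the unweighted means over all classes. A Python sketch of the same calculation (the small confusion matrix below is hand-made for illustration, not taken from the experiment's results):

```python
def macro_precision_recall(cm, classes):
    """Macro averages: per-class precision/recall, then the unweighted mean over classes."""
    precisions, recalls = [], []
    for c in classes:
        tp = cm[c][c]
        predicted_c = sum(cm[a][c] for a in classes)  # column sum: everything predicted as c
        actual_c = sum(cm[c][p] for p in classes)     # row sum: everything actually c
        precisions.append(tp / predicted_c if predicted_c else 0.0)
        recalls.append(tp / actual_c if actual_c else 0.0)
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n

# Hand-made confusion matrix (rows = actual, columns = predicted), illustrative only.
classes = ["A", "B", "C"]
cm = {
    "A": {"A": 2, "B": 1, "C": 0},
    "B": {"A": 0, "B": 2, "C": 1},
    "C": {"A": 1, "B": 0, "C": 1},
}
macro_p, macro_r = macro_precision_recall(cm, classes)
print(round(macro_p, 3), round(macro_r, 3))
```

Micro precision and micro recall, by contrast, pool the counts across classes before dividing, so frequent classes weigh more heavily; macro averaging treats all 26 letters equally.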
![image10]

<!-- Images -->
[image1]: http://az712634.vo.msecnd.net/samplesimg/v1/18/data.PNG
[image2]: http://az712634.vo.msecnd.net/samplesimg/v1/18/read_split_graph.PNG
[image3]: http://az712634.vo.msecnd.net/samplesimg/v1/18/read.PNG
[image4]: http://az712634.vo.msecnd.net/samplesimg/v1/18/split.PNG
[image5]: http://az712634.vo.msecnd.net/samplesimg/v1/18/models.PNG
[image6]: http://az712634.vo.msecnd.net/samplesimg/v1/18/svm.PNG
[image7]: http://az712634.vo.msecnd.net/samplesimg/v1/18/conf_mat.PNG
[image8]: http://az712634.vo.msecnd.net/samplesimg/v1/18/macro_avgs.PNG
[image9]: http://az712634.vo.msecnd.net/samplesimg/v1/18/conf_mat_num.PNG
[image10]: http://az712634.vo.msecnd.net/samplesimg/v1/18/per_class_metrics.PNG