Simplest PyTorch Model Implementation for Multiclass Classification

Sep 17, 2022

using msdlib

Intro:

PyTorch is a flexible, Pythonic framework for designing deep learning models. In this post, we walk through an easy way to build a classification model in PyTorch, then train it and validate its performance on a multi-class classification task.

Multi-class classification task:

Multi-class classification is a type of classification task where we want to assign each sample to one of a fixed number of categories.

For this task, we need class labels for the training data. For the validation data, we use the trained model to generate class predictions.

Classification model example with 4 classes

Data set:

Here we use the handwritten digits data set that ships with scikit-learn under the datasets submodule (load_digits). It is a small MNIST-style digit recognition data set of 8×8 grayscale images, not the full 28×28 MNIST data set. The task has 10 classes in total (digits 0 to 9).

Each sample digit is stored as one flattened feature vector of 64 values. If we un-flatten it, we can visualize the digit as an image. In short, our input data are feature vectors, and each feature corresponds to one pixel value of the digit image.
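As a small illustration of the un-flattening step, the sketch below reshapes each 64-value feature vector back into an 8×8 grid. The variable names (`digits`, `flat`, `images`) are chosen here for illustration and are not used elsewhere in the post.

```python
from sklearn.datasets import load_digits

digits = load_digits()

# Each row of `data` is one digit stored as a flat vector of 64 pixel values
flat = digits["data"]

# Un-flatten: one 8x8 grid of pixel intensities per sample
images = flat.reshape(-1, 8, 8)

print(flat.shape, images.shape)
print(images[0].astype(int))  # pixel grid of the first sample
```

To view a digit as a picture, the 8×8 array can be passed to an image plotting function such as matplotlib's `imshow`.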

Importing libraries:

import pandas as pd
from sklearn.datasets import load_digits
import torch
import os
import sys
from msdlib import mlutils
from msdlib import msd

For installation of msdlib, please visit here: https://github.com/abdullah-al-masud/msdlib

Loading and organizing data-label:

source_data = load_digits()
feature_names = source_data['feature_names'].copy()
data = pd.DataFrame(source_data['data'], columns=feature_names)
label2index = {name: i for i, name in enumerate(source_data['target_names'])}
label = pd.Series(source_data['target']).replace(label2index)

Standardization and Splitting:

data = msd.standardize(data)
splitter = msd.SplitDataset(data, label, test_ratio=.1)
outdata = splitter.random_split(val_ratio=.1)
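For readers who want to see what these two steps amount to, here is a minimal NumPy sketch. It assumes that standardization means scaling each feature to zero mean and unit variance, and that both ratios are fractions of the full data set. These are assumptions for illustration, not a description of how msdlib implements `standardize` or `SplitDataset`, and the data here is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a feature table: 100 samples, 4 features
X = rng.normal(loc=5.0, scale=2.0, size=(100, 4))
X[:, 3] = 0.0  # a constant column, e.g. a pixel that is blank in every image

# Z-score standardization per feature (assumed behaviour, see lead-in)
mean = X.mean(axis=0)
std = X.std(axis=0)
std_safe = np.where(std == 0, 1.0, std)  # avoid dividing by zero
X_std = (X - mean) / std_safe

# Random split: 10% test, 10% validation, remaining 80% train
idx = rng.permutation(len(X))
n_test = int(0.1 * len(X))
n_val = int(0.1 * len(X))
test_idx = idx[:n_test]
val_idx = idx[n_test:n_test + n_val]
train_idx = idx[n_test + n_val:]

print(len(train_idx), len(val_idx), len(test_idx))
```

The guard on zero standard deviation matters for image data, where a pixel that never changes would otherwise produce NaN values after scaling.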

Model layers definition:

Here we are defining the model’s layers, number of units/neurons in each hidden layer, activation function, dropout rate etc.

layers = mlutils.define_layers(
    input_units=data.shape[1],
    output_units=label.unique().shape[0],
    unit_factors=[100, 100, 100, 100, 100, 100],
    dropout_rate=.2,
    actual_units=True,
    activation=torch.nn.ReLU(),
    model_type='regressor'
)

One thing to note about the define_layers call: ‘model_type’ is set to ‘regressor’, which may look misleading because our goal is a multi-class classification model. We use ‘regressor’ here so that the model’s output does not pass through a softmax activation function. Later we use the torch.nn.CrossEntropyLoss class as the loss function, which applies log-softmax internally during loss calculation. For prediction, if we need probabilities as model output, we have to apply softmax ourselves.

Another option is to define the model_type here as ‘multi-classifier’ and use torch.nn.NLLLoss as the loss function in the next section.
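The relationship between raw outputs, softmax, and the two loss functions can be checked directly in PyTorch. The sketch below uses random tensors in place of real model outputs. It shows that CrossEntropyLoss on raw scores gives the same value as NLLLoss on log-softmax outputs, and that softmax turns the raw scores into probabilities.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in for raw model outputs (logits): 5 samples, 10 classes
logits = torch.randn(5, 10)
targets = torch.randint(0, 10, (5,))

# CrossEntropyLoss takes raw logits and applies log-softmax internally
ce = nn.CrossEntropyLoss()(logits, targets)

# NLLLoss expects log-probabilities, so log-softmax is applied first
nll = nn.NLLLoss()(torch.log_softmax(logits, dim=1), targets)

# For prediction, softmax turns logits into class probabilities
probs = torch.softmax(logits, dim=1)
pred = probs.argmax(dim=1)

print(ce.item(), nll.item())
print(probs.sum(dim=1))
print(pred)
```

Because softmax preserves the ordering of the scores, the predicted class can also be read from the raw outputs with `argmax` when probabilities are not needed.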

Registering model as torchModel class:

Here, we register the model layers to construct a torchModel object, which provides support for model training, prediction, evaluation, data loaders, TensorBoard visualization, etc.

tmodel = mlutils.torchModel(
    layers=layers,
    model_type='multi-classifier',
    tensorboard_path='runs',
    savepath='examples/multiclass-classification_torchModel',
    batch_size=64,
    epoch=150,
    learning_rate=.0001,
    lr_reduce=.995
)

Here, by setting ‘model_type’ to ‘multi-classifier’, we tell the torchModel class to apply CrossEntropyLoss (along with a few other things). If we want to use the NLLLoss function instead, we need to specify it here with the ‘loss_func’ argument. In both cases, model_type must be ‘multi-classifier’ here.

The torchModel class also accepts a custom model through the ‘model’ argument. For full documentation, please visit: https://msdlib.readthedocs.io/en/latest/mlutils.html#msdlib.mlutils.torchModel
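As a rough idea of what a custom model could look like, here is a plain PyTorch module with the same shape as the network defined earlier (six hidden layers of 100 units, ReLU, dropout of 0.2). The class name `DigitClassifier` and its constructor arguments are hypothetical and made up for this sketch. What exactly torchModel expects from a custom model should be checked in the msdlib documentation.

```python
import torch
from torch import nn


class DigitClassifier(nn.Module):
    """Hypothetical MLP that returns raw class scores (no softmax)."""

    def __init__(self, n_features=64, n_classes=10, hidden=100,
                 n_hidden_layers=6, dropout=0.2):
        super().__init__()
        layers = []
        in_units = n_features
        for _ in range(n_hidden_layers):
            layers += [nn.Linear(in_units, hidden), nn.ReLU(), nn.Dropout(dropout)]
            in_units = hidden
        # Final layer outputs one raw score per class
        layers.append(nn.Linear(in_units, n_classes))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


custom_model = DigitClassifier()
out = custom_model(torch.randn(3, 64))  # 3 samples, 64 features each
print(out.shape)
```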

Model training:

tmodel.fit(
    outdata['train']['data'],
    outdata['train']['label'],
    val_data=outdata['validation']['data'],
    val_label=outdata['validation']['label']
)

After training, the learning curve is stored in the directory that was given through the savepath parameter when registering the model.

Learning curve generated after model training

The fit function takes numpy arrays as input. It can also take torch tensors or a PyTorch data loader. For full documentation, please check: https://msdlib.readthedocs.io/en/latest/mlutils.html#msdlib.mlutils.torchModel.fit
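To make the three input types concrete, the sketch below converts numpy arrays to tensors, wraps them in a DataLoader, and runs a bare-bones training loop in plain PyTorch. It is not msdlib's implementation. The Adam optimizer, the per-epoch exponential learning rate decay (used here as an analogue of lr_reduce=.995), the larger learning rate, the tiny network, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
rng = np.random.default_rng(0)

# Synthetic stand-in for standardized digits: 256 samples, 64 features, 10 classes
X_np = rng.normal(size=(256, 64)).astype(np.float32)
y_np = rng.integers(0, 10, size=256)

# numpy array -> torch tensor -> data loader
X = torch.from_numpy(X_np)
y = torch.from_numpy(y_np).long()
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

model = nn.Sequential(nn.Linear(64, 100), nn.ReLU(), nn.Linear(100, 10))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Multiply the learning rate by 0.995 after every epoch (assumed analogue of lr_reduce)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.995)

history = []  # mean training loss per epoch, i.e. the learning curve
for epoch in range(5):
    model.train()
    running = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
        running += loss.item() * len(xb)
    scheduler.step()
    history.append(running / len(X))

print(history)
```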

Model evaluation:

result, all_results = tmodel.evaluate(
    data_sets=[outdata['train']['data'], outdata['test']['data']],
    label_sets=[outdata['train']['label'], outdata['test']['label']],
    set_names=['Train', 'Test'],
    savepath='examples/multiclass-classification_torchModel'
)

With the evaluate function, we can pass any number of validation or test data sets for performance evaluation of the model. The function returns the results and also saves them as tables showing the classification scores and confusion matrices for each evaluation data set.
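For readers unfamiliar with these outputs, here is a small sketch of how classification scores and a confusion matrix can be computed for one data set with scikit-learn. The labels and predictions are made-up toy values, and this is not a reproduction of what the evaluate function saves.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Toy example with 3 classes: true labels and model predictions
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])

acc = accuracy_score(y_true, y_pred)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)

# Per-class recall: correct predictions divided by true samples of that class
recall = cm.diagonal() / cm.sum(axis=1)

print(acc)
print(cm)
print(recall)
```

The off-diagonal entries show which classes get confused with each other, which is often more informative than a single accuracy number.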