Simplest PyTorch Model Implementation for Regression
Intro:
PyTorch is a widely used library for designing deep learning models and writing production-level machine learning code. Today I am going to show an easy way to train a deep learning model with PyTorch in only a few lines of code.
For this we are going to use a helper library named msdlib. It is open source, completely free, and installable through pip. It is efficient and gives a lot of flexibility for model training, prediction, evaluation, and more.
Let's go straight to the code.
Installing libraries:
First, open your command prompt and install these two libraries for this tutorial.
pip install msdlib
pip install scikit-learn
You also need to install PyTorch. Follow the instructions at https://pytorch.org/ to install the build that matches your hardware.
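Once the installs finish, a quick check like the one below can confirm that PyTorch imports correctly and whether it can see a GPU. This is only a sanity check suggested here; it is not part of msdlib, and the variable name has_gpu is just illustrative.

```python
import torch

# Report the installed PyTorch version
print("PyTorch version:", torch.__version__)

# True only when a CUDA build of PyTorch finds a usable GPU
has_gpu = torch.cuda.is_available()
print("CUDA available:", has_gpu)
```

If the second line prints False, training still works on the CPU, only more slowly.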
Import libraries:
# torchModel() regression example
import pandas as pd
from sklearn.datasets import fetch_california_housing
import torch
import os
from msdlib import mlutils
from msdlib import msd
Here we are using the California housing data set, which is a regression data set: the target is a continuous house value rather than a class label. The necessary libraries are now loaded, so next we prepare the data set for training and evaluation.
Preparing data:
# Loading the data and separating data and label
source_data = fetch_california_housing()
feature_names = source_data['feature_names'].copy()
data = pd.DataFrame(source_data['data'], columns=feature_names)
label = pd.Series(source_data['target'], name=source_data['target_names'][0])
Data Standardization and Splitting into train, validation and test:
# Standardizing numerical data
data = msd.standardize(data)
# Splitting data set into train, validation and test
splitter = msd.SplitDataset(data, label, test_ratio=.1)
outdata = splitter.random_split(val_ratio=.1)
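To make these two steps less of a black box, here is a minimal NumPy sketch of what standardization and a random three-way split usually mean. It runs on synthetic data and does not reproduce msdlib's internals: the z-score formula and the way the 10% ratios are applied to the full data set are assumptions, and msd.standardize or SplitDataset may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature table: 1000 rows, 8 columns
X = rng.normal(loc=5.0, scale=3.0, size=(1000, 8))
y = rng.normal(size=1000)

# Z-score standardization: zero mean and unit variance per column
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Shuffle the row indices, then carve out 10% test and 10% validation
idx = rng.permutation(len(X_std))
n_test = int(0.1 * len(idx))
n_val = int(0.1 * len(idx))
test_idx = idx[:n_test]
val_idx = idx[n_test:n_test + n_val]
train_idx = idx[n_test + n_val:]

X_train, y_train = X_std[train_idx], y[train_idx]
X_val, y_val = X_std[val_idx], y[val_idx]
X_test, y_test = X_std[test_idx], y[test_idx]

print(X_train.shape, X_val.shape, X_test.shape)
```

Note that this sketch, like the tutorial code, standardizes before splitting. A stricter setup would compute the mean and standard deviation on the training rows only and reuse them for validation and test.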
Preparing PyTorch data loaders:
# Wrapping the train and validation sets as PyTorch data sets
train_set = mlutils.DataSet(
    torch.tensor(outdata['train']['data']),
    torch.tensor(outdata['train']['label']).squeeze(),
    dtype=torch.float32)
val_set = mlutils.DataSet(
    torch.tensor(outdata['validation']['data']),
    torch.tensor(outdata['validation']['label']).squeeze(),
    dtype=torch.float32)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=128)
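For comparison, the same idea can be written with PyTorch's built-in TensorDataset. The sketch below uses synthetic tensors and is not a drop-in replacement for mlutils.DataSet, whose exact behavior is not shown here; it only illustrates how a DataLoader cuts a data set into batches.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in: 300 samples with 8 features and one target each
features = torch.randn(300, 8, dtype=torch.float32)
targets = torch.randn(300, dtype=torch.float32)

# TensorDataset pairs each feature row with its target
dataset = TensorDataset(features, targets)
loader = DataLoader(dataset, batch_size=128)

# 300 samples at batch size 128 give batches of 128, 128 and 44
batch_sizes = [xb.shape[0] for xb, yb in loader]
print(batch_sizes)
```

In plain PyTorch it is common to pass shuffle=True for the training loader so that each epoch sees the batches in a different order.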
Build model:
# defining layers inside a list
layers = mlutils.define_layers(
    input_units=data.shape[1],
    output_units=1,
    unit_factors=[100, 100, 100, 100, 100, 100],
    dropout_rate=.2,
    model_type='regressor',
    actual_units=True,
    activation=torch.nn.ReLU())
# building model
tmodel = mlutils.torchModel(
    layers=layers,
    model_type='regressor',
    tensorboard_path='runs',
    interval=120,
    savepath='regression_torchModel',
    epoch=150,
    learning_rate=.0001,
    lr_reduce=.995)
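If you are curious what such a network looks like in plain PyTorch, the sketch below builds a comparable stack by hand: six hidden layers of 100 units with ReLU and dropout of 0.2, followed by a single output unit. The helper build_mlp is hypothetical, and the layer order (Linear, then ReLU, then Dropout) is an assumption; define_layers may arrange or extend the layers differently.

```python
import torch
from torch import nn


def build_mlp(input_units, hidden_units, output_units, dropout_rate):
    """Stack Linear -> ReLU -> Dropout blocks, ending in a Linear output."""
    layers = []
    in_features = input_units
    for units in hidden_units:
        layers += [nn.Linear(in_features, units),
                   nn.ReLU(),
                   nn.Dropout(dropout_rate)]
        in_features = units
    # Final layer has no activation because this is a regression output
    layers.append(nn.Linear(in_features, output_units))
    return nn.Sequential(*layers)


# 8 input features, matching the California housing data set
model = build_mlp(8, [100] * 6, 1, 0.2)
n_params = sum(p.numel() for p in model.parameters())

# Forward pass on a dummy batch of 4 samples
out = model(torch.randn(4, 8))
print(out.shape, n_params)
```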
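Model training and saving:
# Training the model (savepath was set when building tmodel)
tmodel.fit(train_loader=train_loader, val_loader=val_loader)
# Loading the saved model back from the same folder
model_dict = {tmodel.model_name: tmodel.model}
model_dict = mlutils.load_models(model_dict, 'regression_torchModel')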
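A single fit call hides the usual PyTorch training loop. The sketch below shows a bare-bones version of such a loop on synthetic data. It is not msdlib's implementation: the Adam optimizer, the MSE loss, the small network, and the mapping of lr_reduce to an exponential learning-rate decay are all assumptions made for illustration.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic regression data: the target is a noisy linear function
X = torch.randn(512, 8)
true_w = torch.randn(8)
y = X @ true_w + 0.1 * torch.randn(512)
loader = DataLoader(TensorDataset(X, y), batch_size=128, shuffle=True)

model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
# Assumed reading of lr_reduce: multiply the learning rate by 0.995 per epoch
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.995)

epoch_losses = []
for epoch in range(20):
    model.train()
    total = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        pred = model(xb).squeeze(-1)  # shape (batch,) to match yb
        loss = loss_fn(pred, yb)
        loss.backward()
        optimizer.step()
        total += loss.item() * len(xb)
    scheduler.step()
    # Mean squared error averaged over all samples in this epoch
    epoch_losses.append(total / len(X))

print(f"first epoch loss: {epoch_losses[0]:.4f}, last: {epoch_losses[-1]:.4f}")
```

A fuller loop would also evaluate on the validation loader after each epoch and keep the best weights, which is the kind of bookkeeping a helper library takes off your hands.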
Model evaluation:
# Evaluating the model's performance
result, all_results = tmodel.evaluate(
    data_sets=[outdata['train']['data'],
               outdata['test']['data']],
    label_sets=[outdata['train']['label'].ravel(),
                outdata['test']['label'].ravel()],
    set_names=['Train', 'Test'],
    savepath='regression_torchModel')
print('regression result:\n', result)
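To help read a regression report, here is a small sketch of three common regression scores (RMSE, MAE, and R squared) computed by hand with NumPy. Which scores tmodel.evaluate actually reports is not stated in this post, so treat the selection as an assumption; the function name regression_metrics is hypothetical.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R^2 for two equally sized 1-D sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = np.sum(err ** 2)                         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        "r2": float(1.0 - ss_res / ss_tot),
    }


# Tiny hand-made example: two of the four predictions are off by 0.5
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.0, 7.5, 9.0])
print(metrics)
```

Lower RMSE and MAE are better, while an R squared close to 1 means the predictions explain most of the variance in the target. Comparing the Train and Test rows side by side is a quick way to spot overfitting.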
For more information and documentation on msdlib, please visit https://msdlib.readthedocs.io/en/latest/index.html