Convolutional Architectures Lab#

This lab explores three influential convolutional neural network (CNN) architectures:

  1. AlexNet: A deep CNN that significantly improved image classification performance.

  2. VGG: Known for its simplicity and depth, using small, uniform convolutional filters.

  3. ResNet: Introduced residual connections, allowing for much deeper networks and addressing the vanishing gradient problem.

You will implement these architectures and apply them to the OrganSMNIST dataset from the MedMNIST collection. This hands-on experience will help you understand the design principles and performance characteristics of each architecture.

The lab begins with data loading and preprocessing, as shown in the code below. You’ll then implement and train each model, comparing their performance on the 11-class organ classification task.

[ ]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torch.utils.data import sampler

import torchvision.datasets as dset
import torchvision.transforms as T
import matplotlib.pyplot as plt

import numpy as np

USE_GPU = True
dtype = torch.float32 # We will be using float throughout this tutorial.

if USE_GPU and torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')

# Constant to control how frequently we print train loss.
print_every = 100
print('using device:', device)
[ ]:
transform = T.Compose([
                T.ToTensor(),
                T.Normalize((0.5719,), (0.1684,))  # one-element tuples: single-channel mean and std
            ])

# Load the OrganSMNIST dataset from MedMNIST

from medmnist import OrganSMNIST
dataset_train = OrganSMNIST(root='utils/datasets', split='train', transform=transform, download=True)
loader_train = DataLoader(dataset_train, batch_size=64, shuffle=True, num_workers=2)

# Evaluation data does not need shuffling
dataset_val = OrganSMNIST(root='utils/datasets', split='val', transform=transform, download=True)
loader_val = DataLoader(dataset_val, batch_size=64, shuffle=False, num_workers=2)

dataset_test = OrganSMNIST(root='utils/datasets', split='test', transform=transform, download=True)
loader_test = DataLoader(dataset_test, batch_size=64, shuffle=False, num_workers=2)

[ ]:
# View a sample image from the training set (index 5)
image, label = dataset_train[5]
print(f"Image shape: {image.shape}")
plt.imshow(image.squeeze().numpy(), cmap='gray')
plt.title(f'Label: {label}')
plt.axis('off')
plt.show()

AlexNet Implementation#

AlexNet, introduced in 2012, was a groundbreaking convolutional neural network architecture that significantly improved image classification performance. It consists of 5 convolutional layers followed by 3 fully connected layers. Key features include:

  1. ReLU activations for faster training

  2. Local Response Normalization (LRN) for improved generalization

  3. Overlapping max pooling to reduce overfitting

  4. Data augmentation and dropout for regularization

Your task is to implement AlexNet using PyTorch, adapting it for the OrganSMNIST dataset. You will need to modify the original architecture slightly, because OrganSMNIST images (28x28) are much smaller than the ImageNet inputs (224x224) the network was designed for. Please follow the instructions in the utils/models/alexnet.py file and run the cell below to load and test the model.
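To see why the original layer settings cannot be reused as-is, it helps to trace the spatial size through the first few layers. The sketch below uses the standard output-size formula, floor((size + 2 * padding - kernel) / stride) + 1. The `small_stem` configuration is only a hypothetical adaptation for illustration, not the one required by utils/models/alexnet.py.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace(size, layers):
    """Return the spatial size after each (name, kernel, stride, padding) layer."""
    sizes = []
    for name, kernel, stride, padding in layers:
        size = conv_out(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes


# First two conv/pool stages with ImageNet-style AlexNet settings.
alexnet_stem = [
    ("conv1 11x11, stride 4", 11, 4, 2),
    ("pool1 3x3, stride 2", 3, 2, 0),
    ("conv2 5x5, stride 1", 5, 1, 2),
    ("pool2 3x3, stride 2", 3, 2, 0),
]

# Hypothetical adaptation for 28x28 inputs: small kernels, stride 1, 2x2 pooling.
small_stem = [
    ("conv1 3x3, stride 1", 3, 1, 1),
    ("pool1 2x2, stride 2", 2, 2, 0),
    ("conv2 3x3, stride 1", 3, 1, 1),
    ("pool2 2x2, stride 2", 2, 2, 0),
]

for label, size, stem in [
    ("224x224, original stem", 224, alexnet_stem),
    ("28x28, original stem", 28, alexnet_stem),
    ("28x28, adapted stem", 28, small_stem),
]:
    print(label, "->", [s for _, s in trace(size, stem)])
```

With a 28x28 input, the original settings shrink the feature map to 2x2 after the first pooling layer, and the second 3x3 pooling window no longer fits (the formula yields 0). Smaller kernels and strides keep enough spatial resolution for the later layers.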

[ ]:
import utils.models.alexnet as alexnet
import torch

model = alexnet.AlexNet(num_classes=11, in_channels=1)

def test_shape(model):
    x = torch.randn(16, 1, 28, 28)
    assert model(x).shape == (16, 11)
    print("Test passed")

test_shape(model)

Training AlexNet#

Now that you have implemented the AlexNet architecture, follow these steps to set up the training process:

  1. Define the loss function and optimizer:

    • Choose an appropriate loss function (e.g., CrossEntropyLoss)

    • Select an optimizer (e.g., Adam) and set the learning rate

  2. Create a training loop:

    • Iterate through a specified number of epochs

    • For each batch in the training data:

      • Move data to the appropriate device (CPU/GPU)

      • Perform forward pass, calculate loss, and backpropagate

      • Update model parameters

    • Print training progress at regular intervals

  3. Implement a validation function:

    • Create a function to evaluate the model’s performance on a given dataset

    • Calculate and return the accuracy

  4. Train the model and monitor performance:

    • Run the training loop

    • After each epoch, evaluate the model on the validation set

    • Track and plot training loss and validation accuracy over time

  5. Test the model:

    • Evaluate the trained model on the test set

    • Report the final test accuracy

Remember to move your model to the appropriate device (CPU or GPU) before training.

Experiment with different hyperparameters such as learning rate, batch size, and number of epochs to improve the model’s performance.
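The steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of utils/train.py: the names `fit` and `evaluate` are hypothetical, the tiny linear model and random tensors stand in for the real network and data loaders, and the `view(-1).long()` call assumes labels arrive with shape (N, 1), as they do with MedMNIST datasets.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader, criterion, device):
    """Return (mean loss, accuracy) of `model` over all samples in `loader`."""
    model.eval()
    total_loss, correct, count = 0.0, 0, 0
    with torch.no_grad():
        for x, y in loader:
            x = x.to(device)
            y = y.to(device).view(-1).long()  # flatten (N, 1) labels to (N,)
            logits = model(x)
            total_loss += criterion(logits, y).item() * y.size(0)
            correct += (logits.argmax(dim=1) == y).sum().item()
            count += y.size(0)
    return total_loss / count, correct / count


def fit(model, loader_train, loader_val, criterion, optimizer, device, num_epochs):
    """Train for `num_epochs`, validating after every epoch."""
    results = {"train_loss": [], "val_loss": [], "val_accuracy": []}
    model.to(device)
    for epoch in range(num_epochs):
        model.train()
        running_loss, count = 0.0, 0
        for x, y in loader_train:
            x = x.to(device)
            y = y.to(device).view(-1).long()
            optimizer.zero_grad()              # clear gradients from the last step
            loss = criterion(model(x), y)      # forward pass and loss
            loss.backward()                    # backpropagate
            optimizer.step()                   # update parameters
            running_loss += loss.item() * y.size(0)
            count += y.size(0)
        val_loss, val_acc = evaluate(model, loader_val, criterion, device)
        results["train_loss"].append(running_loss / count)
        results["val_loss"].append(val_loss)
        results["val_accuracy"].append(val_acc)
        print(f"epoch {epoch + 1}: train loss {running_loss / count:.4f}, "
              f"val loss {val_loss:.4f}, val acc {val_acc:.4f}")
    return model, results


# Synthetic stand-ins so the sketch runs on its own (not the lab data).
torch.manual_seed(0)
toy_images = torch.randn(64, 1, 28, 28)
toy_labels = torch.randint(0, 11, (64, 1))
toy_loader = DataLoader(TensorDataset(toy_images, toy_labels), batch_size=16)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 11))
toy_optimizer = torch.optim.Adam(toy_model.parameters(), lr=1e-3)
toy_model, toy_results = fit(toy_model, toy_loader, toy_loader, nn.CrossEntropyLoss(),
                             toy_optimizer, torch.device("cpu"), num_epochs=2)
```

Switching between `model.train()` and `model.eval()` matters once the network contains dropout or batch normalization, since both layers behave differently during training and evaluation.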

Please fill in the remaining code in utils/train.py and run the cell below to train the model. You should expect a test accuracy above 75%.

[ ]:
model = alexnet.AlexNet(num_classes=11, in_channels=1)

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Set number of epochs
num_epochs = 10

from utils.train import train, check_accuracy

# Train the model
model, results = train(model, loader_train, loader_val, criterion, optimizer, device, num_epochs)

# Check accuracy on the test set
test_loss, test_accuracy = check_accuracy(model, loader_test, criterion, device)
train_loss, train_accuracy = check_accuracy(model, loader_train, criterion, device)
print(f"Test Loss: {test_loss:.4f}, Test Accuracy: {test_accuracy:.4f}")
print(f"Train Loss: {train_loss:.4f}, Train Accuracy: {train_accuracy:.4f}")


[ ]:
# Plot the results
import matplotlib.pyplot as plt

# Extract variables from results dictionary
train_loss = results['train_loss']
val_loss = results['val_loss']
val_accuracy = results['val_accuracy']

# Create the plot
plt.figure(figsize=(12, 4))

# Loss plot
plt.subplot(1, 2, 1)
plt.plot(train_loss, label='Train Loss')
plt.plot(val_loss, label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

# Accuracy plot
plt.subplot(1, 2, 2)
plt.plot(val_accuracy, label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

plt.tight_layout()
plt.show()

VGG Implementation#

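VGG, introduced in 2014, is known for its simplicity and depth, using small, uniform convolutional filters. Its best-known variants, VGG-16 and VGG-19, have 13 and 16 convolutional layers respectively, each followed by 3 fully connected layers; the standard VGG-13 configuration used in this lab has 10 convolutional layers. Key features include: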

  1. Small, uniform convolutional filters (3x3)

  2. Max pooling for downsampling

  3. Dropout for regularization

  4. Data augmentation

Follow the instructions in the utils/models/vgg.py file and run the cell below to load and test the model.
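The case for small, uniform 3x3 filters can be checked with a little arithmetic: stacking stride-1 3x3 convolutions covers the same receptive field as one larger filter while using fewer weights and adding more non-linearities. The sketch below assumes equal input and output channel counts and ignores biases; the channel count of 64 is just an example value.

```python
def stacked_receptive_field(kernel, num_layers):
    """Receptive field of `num_layers` stacked stride-1 convs with the same kernel size."""
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels, num_layers=1):
    """Weight count (no biases) for stacked convs with `channels` inputs and outputs."""
    return num_layers * kernel * kernel * channels * channels


channels = 64  # example value only
for num_layers, single_kernel in [(2, 5), (3, 7)]:
    field = stacked_receptive_field(3, num_layers)
    stacked = conv_weights(3, channels, num_layers)
    single = conv_weights(single_kernel, channels)
    print(f"{num_layers} x (3x3): receptive field {field}x{field}, {stacked} weights; "
          f"one {single_kernel}x{single_kernel}: {single} weights")
```

Two 3x3 layers match a 5x5 filter with 18C² instead of 25C² weights, and three 3x3 layers match a 7x7 filter with 27C² instead of 49C² weights, where C is the channel count.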

[ ]:
import utils.models.vgg as vgg
import torch

model = vgg.VGG13(num_classes=11, in_channels=1)

def test_shape(model):
    x = torch.randn(16, 1, 28, 28)
    assert model(x).shape == (16, 11)
    print("Test passed")

test_shape(model)
[ ]:
model = vgg.VGG13(num_classes=11, in_channels=1)

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Set number of epochs
num_epochs = 10

from utils.train import train, check_accuracy

# Train the model
model, results = train(model, loader_train, loader_val, criterion, optimizer, device, num_epochs)

# Check accuracy on the test set
test_loss, test_accuracy = check_accuracy(model, loader_test, criterion, device)
train_loss, train_accuracy = check_accuracy(model, loader_train, criterion, device)
print(f"Test Loss: {test_loss:.4f}, Test Accuracy: {test_accuracy:.4f}")
print(f"Train Loss: {train_loss:.4f}, Train Accuracy: {train_accuracy:.4f}")


Question#

How do the training results of AlexNet and VGG differ? Can you explain the difference in terms of the two architectures?

[ ]:
# Plot the results
import matplotlib.pyplot as plt

# Extract variables from results dictionary
train_loss = results['train_loss']
val_loss = results['val_loss']
val_accuracy = results['val_accuracy']

# Create the plot
plt.figure(figsize=(12, 4))

# Loss plot
plt.subplot(1, 2, 1)
plt.plot(train_loss, label='Train Loss')
plt.plot(val_loss, label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

# Accuracy plot
plt.subplot(1, 2, 2)
plt.plot(val_accuracy, label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

plt.tight_layout()
plt.show()

Add regularization#

Re-train VGG with dropout and L2 regularization. To add dropout you may need to modify the code in the utils/models/vgg.py file. Think carefully about where in the network dropout should go.
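As a rough sketch of the two mechanisms: dropout is a layer (`nn.Dropout`) placed inside the model, while L2 regularization can be applied through the optimizer's `weight_decay` argument. The classifier head below is hypothetical, with placeholder layer sizes rather than those in utils/models/vgg.py, and putting dropout between the fully connected layers is one common placement, not the only valid choice. The values `p=0.5` and `weight_decay=1e-4` are example settings to tune.

```python
import torch
import torch.nn as nn

# Hypothetical classifier head; layer sizes are placeholders for illustration.
classifier = nn.Sequential(
    nn.Flatten(),
    nn.Linear(512, 256),
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.5),      # randomly zeroes activations, only in training mode
    nn.Linear(256, 11),
)

# weight_decay adds an L2 penalty term to the gradient of every parameter.
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3, weight_decay=1e-4)

features = torch.randn(8, 512, 1, 1)  # stand-in for the output of the conv layers

classifier.train()                    # dropout active: repeated passes differ
train_out = classifier(features)

classifier.eval()                     # dropout disabled: output is deterministic
eval_a = classifier(features)
eval_b = classifier(features)
print(train_out.shape, torch.equal(eval_a, eval_b))
```

Because dropout is only active in training mode, forgetting to call `model.eval()` before validation makes the reported accuracy noisy and pessimistic.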

[ ]:
# Training code below with dropout and L2 regularization

ResNet Implementation#

ResNet, proposed in 2015, uses residual (skip) connections that make much deeper networks trainable by easing the vanishing gradient problem. It is built from stacks of convolutional blocks, each wrapped by a residual connection. Key features include:

  1. Residual connections, which let gradients flow directly through identity shortcuts

  2. Batch normalization after each convolution, which stabilizes and speeds up training

Follow the instructions in the utils/models/resnet.py file and run the cell below to load and test the model.
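For orientation, a residual block computes `F(x) + shortcut(x)`, so the gradient always has a direct path through the shortcut. The sketch below shows a basic two-convolution block with batch normalization. It is an illustration under assumptions: the class name `BasicResidualBlock` is hypothetical, and ResNet-50 itself uses three-layer bottleneck blocks, so the structure requested in utils/models/resnet.py will differ.

```python
import torch
import torch.nn as nn


class BasicResidualBlock(nn.Module):
    """Two 3x3 convs with batch norm, added to a shortcut of the input."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
        )
        if stride != 1 or in_channels != out_channels:
            # Project the input with a 1x1 conv so its shape matches the body output.
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )
        else:
            self.shortcut = nn.Identity()
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # Residual connection: add the (possibly projected) input back in.
        return self.relu(self.body(x) + self.shortcut(x))


block = BasicResidualBlock(16, 32, stride=2)
out = block(torch.randn(4, 16, 28, 28))
print(out.shape)  # channels doubled, spatial size halved
```

The 1x1 projection is only needed when the block changes the number of channels or the spatial size; otherwise the shortcut is a plain identity.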

[ ]:
import utils.models.resnet as resnet
import torch

model = resnet.ResNet50(num_classes=11, in_channels=1)

def test_shape(model):
    x = torch.randn(16, 1, 28, 28)
    assert model(x).shape == (16, 11)
    print("Test passed")

test_shape(model)

Training ResNet#

Now that you have implemented the ResNet architecture, follow these steps to set up the training process:

  1. Define the loss function and optimizer:

    • Choose an appropriate loss function (e.g., CrossEntropyLoss)

    • Select an optimizer (e.g., Adam) and set the learning rate

  2. Create a training loop:

    • Iterate through a specified number of epochs

You should expect a test accuracy above 74%.

[ ]:
# Training code below
model = resnet.ResNet50(num_classes=11, in_channels=1)

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.0005)

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Set number of epochs
num_epochs = 10

# Import necessary functions from utils.train
from utils.train import train, check_accuracy

# Train the model
model, results = train(model, loader_train, loader_val, criterion, optimizer, device, num_epochs)

# Check accuracy on the test set
test_loss, test_accuracy = check_accuracy(model, loader_test, criterion, device)
train_loss, train_accuracy = check_accuracy(model, loader_train, criterion, device)
print(f"Test Loss: {test_loss:.4f}, Test Accuracy: {test_accuracy:.4f}")
print(f"Train Loss: {train_loss:.4f}, Train Accuracy: {train_accuracy:.4f}")


[ ]:
# Plot the results: loss curves and validation accuracy

Question#

What are some possible issues with the training process of the ResNet?

[ ]: