Key Steps

  1. Data Preparation: Loaded custom dataset, split into train/val/test (80/10/10), applied augmentation transforms
  2. Model Selection: Evaluated a pretrained ResNet-18 model on test images
  3. Data Balancing: Applied data balancing techniques to handle class imbalance in the training data
  4. Model Training: Fine-tuned ResNet-18 using PyTorch Lightning with a frozen backbone, regularization, an SGD optimizer with momentum, and early stopping
  5. Evaluation: Computed per-class accuracy, confusion matrix, ROC curves, and Precision-Recall curves
  6. Export: Converted model to ONNX format for cross-platform deployment
  7. Inference: Tested the ONNX model on sample images with accessibility features (text-to-speech [pyttsx3])
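
Steps 1 and 3 above can be illustrated with a small, dependency-free sketch. It is not the project's code. The summary does not say how the split was drawn or which balancing technique was used, so the per-class (stratified) split and the inverse-frequency sample weights below are assumptions, and the function names and toy "cat"/"dog" labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, percents=(80, 10, 10), seed=0):
    """Return (train, val, test) index lists, splitting each class separately.

    Assumption: the split is stratified by class; the summary only states
    the 80/10/10 ratio.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Integer arithmetic avoids floating-point rounding surprises.
        n_train = len(indices) * percents[0] // 100
        n_val = len(indices) * percents[1] // 100
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def inverse_frequency_weights(labels):
    """Give each sample a weight of 1 / (number of samples in its class)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


# Toy imbalanced dataset (made up for illustration): 80 "cat", 20 "dog".
labels = ["cat"] * 80 + ["dog"] * 20
train_idx, val_idx, test_idx = stratified_split(labels)

# Balance only the training portion, as step 3 describes.
train_labels = [labels[i] for i in train_idx]
weights = inverse_frequency_weights(train_labels)
```

With these weights every class carries the same total weight, so a weighted sampler (for example PyTorch's `WeightedRandomSampler`) would draw minority-class images about as often as majority-class ones. Class-weighted loss and oversampling are common alternatives.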

Results

High per-class accuracy with strong precision-recall trade-offs. Model ready for production deployment.
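
For readers unfamiliar with the metrics named here, the sketch below shows one way to derive per-class accuracy and per-class precision from a confusion matrix. It is a minimal illustration, not the project's evaluation code: the function names are hypothetical, and the label lists are made-up toy data, not the project's results.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Build a matrix where entry [t][p] counts samples of true class t predicted as p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_accuracy(matrix):
    """Fraction of each true class predicted correctly (equivalent to per-class recall)."""
    result = []
    for i, row in enumerate(matrix):
        total = sum(row)
        result.append(row[i] / total if total else float("nan"))
    return result


def per_class_precision(matrix):
    """Fraction of predictions for each class that were correct."""
    result = []
    for j in range(len(matrix)):
        predicted = sum(row[j] for row in matrix)
        result.append(matrix[j][j] / predicted if predicted else float("nan"))
    return result


# Toy labels for three classes (illustrative only).
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
accuracy = per_class_accuracy(cm)
precision = per_class_precision(cm)
```

Reporting both numbers per class matters on imbalanced data, because a class can have high accuracy (few misses) while still attracting many false positives from other classes.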