Convolutional Neural Networks (Images)

Initial modeling for images was performed using three CNN architectures. All evaluations below use the preprocessed image datasets split into split_train and split_val directories, with images organized into class folders. Images are converted to grayscale and resized but not flattened, preserving the spatial information the CNN models rely on.
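
For orientation, a minimal sketch of how these splits could be loaded with Keras is shown below; the 128x128 target size and batch size of 32 are assumptions, since the exact preprocessing parameters are not restated here.

```python
import tensorflow as tf

IMG_SIZE = (128, 128)   # assumed target size; the exact resize dimensions are not stated here

# Load the preprocessed splits; class labels are inferred from the folder names.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "split_train",
    labels="inferred",
    label_mode="int",          # integer labels match sparse categorical crossentropy
    color_mode="grayscale",    # images are grayscaled, so a single channel
    image_size=IMG_SIZE,
    batch_size=32,
)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "split_val",
    labels="inferred",
    label_mode="int",
    color_mode="grayscale",
    image_size=IMG_SIZE,
    batch_size=32,
)
```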


CNN Model 1

The first CNN architecture serves as an initial baseline for image classification. It uses a simple convolutional architecture with three convolutional layers followed by a single dense layer.
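
The sketch below illustrates this kind of architecture. The 27-class output and the 128-unit dense layer follow from the rest of this report, while the filter counts (32/64/128) and the 128x128 grayscale input shape are illustrative assumptions rather than the exact training script.

```python
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 27   # classes 0-26 in the classification reports below

# Three convolutional layers, then a single 128-unit dense layer
# before the softmax classification head.
model1 = keras.Sequential([
    layers.Input(shape=(128, 128, 1)),          # assumed grayscale 128x128 input
    layers.Rescaling(1.0 / 255),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
```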

Training Configuration

Optimizer: Adam with default learning rate
Loss Function: Sparse Categorical Crossentropy
Callbacks:

  • Early Stopping (patience=3)
  • ReduceLROnPlateau (factor=0.5, patience=2)
  • ModelCheckpoint (save best model)
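
Put together, the training setup could look roughly like the following; the checkpoint filename, restore_best_weights, the accuracy metric, and the 20-epoch budget are illustrative assumptions (train_ds, val_ds, and model1 refer to the earlier sketches).

```python
from tensorflow import keras

callbacks = [
    keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=2),
    keras.callbacks.ModelCheckpoint("best_model.keras", save_best_only=True),  # assumed filename
]

# Adam with its default learning rate (0.001) and sparse categorical crossentropy,
# since the labels are integer class indices rather than one-hot vectors.
model1.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

history = model1.fit(
    train_ds,
    validation_data=val_ds,
    epochs=20,          # upper bound; early stopping usually halts training sooner
    callbacks=callbacks,
)
```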

Training and Validation Accuracy Plot

[Figure: Training and validation accuracy and loss curves for CNN Model 1]

The plots show that the model's training accuracy improves steadily over five epochs, while validation accuracy increases only slightly before leveling off.

At the same time, training loss decreases consistently, but validation loss begins to rise after the second epoch. This indicates that the model is overfitting.

Classification Report

Class   Precision   Recall   F1-score   Support
0       0.5780      0.4118   0.4809     612
1       0.3357      0.4416   0.3814     539
2       0.8252      0.7926   0.8086     786
3       0.5000      0.0342   0.0641     146
4       0.1966      0.3143   0.2419     961
5       0.1839      0.1344   0.1553     424
6       0.3598      0.5031   0.4195     974
7       0.2857      0.1302   0.1789     169
8       0.2198      0.0789   0.1161     507
9       0.4450      0.1443   0.2180     672
10      0.4114      0.3139   0.3561     1013
11      0.5679      0.7015   0.6277     841
12      0.2202      0.1752   0.1951     137
13      0.2648      0.4305   0.3279     1029
14      0.6667      0.0118   0.0231     170
15      0.6440      0.6242   0.6340     942
16      0.3779      0.5852   0.4592     986
17      0.2451      0.2059   0.2238     306
18      0.4738      0.4844   0.4790     991
19      0.3089      0.1645   0.2147     462
20      0.6669      0.5809   0.6209     2047
21      0.2810      0.1467   0.1927     525
22      0.6415      0.8375   0.7265     517
23      0.8827      0.9153   0.8987     189
24      0.4263      0.5106   0.4646     521
25      0.2308      0.1261   0.1630     357
26      0.2472      0.1366   0.1760     161

Metric                   Value
Accuracy                 0.4411
Macro avg Precision      0.4254
Macro avg Recall         0.3680
Macro avg F1-score       0.3647
Weighted avg Precision   0.4513
Weighted avg Recall      0.4411
Weighted avg F1-score    0.4288

CNN Model 2

The second CNN architecture builds upon the first model with improvements to the network structure:

  • Double convolutional layers in each block
  • Batch normalization after each convolutional layer
  • Dropout layers to reduce overfitting
  • Increased capacity in the dense layer (256 vs 128 neurons)
  • 'same' padding to preserve spatial dimensions

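A rough sketch of these changes is given below, reusing the assumed 128x128 grayscale inputs; the doubled convolutions, batch normalization, 'same' padding, dropout, and 256-unit dense layer come from the list above, while the filter counts and dropout rates are illustrative guesses.

```python
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 27

def conv_block(filters):
    """Two 'same'-padded convolutions, each followed by batch normalization,
    then pooling and dropout."""
    return [
        layers.Conv2D(filters, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.Conv2D(filters, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Dropout(0.25),        # assumed dropout rate
    ]

model2 = keras.Sequential(
    [layers.Input(shape=(128, 128, 1)), layers.Rescaling(1.0 / 255)]
    + conv_block(32)
    + conv_block(64)
    + conv_block(128)
    + [
        layers.Flatten(),
        layers.Dense(256, activation="relu"),   # 256 units vs. 128 in Model 1
        layers.Dropout(0.5),                    # assumed dropout rate
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ]
)
```
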
Training Configuration

Optimizer: Adam with default learning rate
Loss Function: Sparse Categorical Crossentropy
Callbacks:

  • Early Stopping (patience=3)
  • ReduceLROnPlateau (factor=0.5, patience=2)
  • ModelCheckpoint (save best model)

Training and Validation Accuracy Plot

[Figure: Training and validation accuracy and loss curves for CNN Model 2]

The training accuracy and loss show steady improvement, indicating that the model is learning the training data effectively.

However, the sharp fluctuations and overall weaker validation accuracy and loss suggest that the model struggles to generalize. This could stem from class imbalance destabilizing the validation metrics, or from an overly high learning rate causing erratic updates during training.

Classification Report

Class   Precision   Recall   F1-score   Support
0       0.6376      0.5376   0.5833     612
1       0.5000      0.4304   0.4626     539
2       0.7780      0.8919   0.8311     786
3       0.7500      0.1233   0.2118     146
4       0.2556      0.3788   0.3052     961
5       0.2366      0.0731   0.1117     424
6       0.4005      0.6817   0.5046     974
7       0.5300      0.3136   0.3941     169
8       0.3307      0.1677   0.2225     507
9       0.3784      0.2500   0.3011     672
10      0.4025      0.5173   0.4527     1013
11      0.7118      0.7432   0.7272     841
12      0.4742      0.3358   0.3932     137
13      0.3799      0.3780   0.3790     1029
14      0.4400      0.0647   0.1128     170
15      0.7053      0.7240   0.7145     942
16      0.5440      0.5578   0.5508     986
17      0.3926      0.3105   0.3467     306
18      0.5979      0.5298   0.5618     991
19      0.4183      0.1883   0.2597     462
20      0.5668      0.8119   0.6676     2047
21      0.5000      0.1695   0.2532     525
22      0.8636      0.8085   0.8352     517
23      0.9669      0.9259   0.9459     189
24      0.6551      0.5432   0.5939     521
25      0.4420      0.1709   0.2465     357
26      0.5000      0.2733   0.3534     161

Metric                   Value
Accuracy                 0.5247
Macro avg Precision      0.5318
Macro avg Recall         0.4408
Macro avg F1-score       0.4564
Weighted avg Precision   0.5242
Weighted avg Recall      0.5247
Weighted avg F1-score    0.5064

CNN Model 3 (Current version "best_model_miri.keras")

The third CNN architecture keeps the same overall structure as CNN Model 2 but changes the learning rate strategy: instead of a constant learning rate, it uses a cosine decay schedule, which gradually reduces the learning rate over training and allows smoother convergence.
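
A sketch of how such a schedule could be wired into Adam is shown below; the initial rate of 1e-3 (Adam's default) and the 20-epoch decay horizon are assumptions, and model2/train_ds refer to the earlier sketches.

```python
import tensorflow as tf
from tensorflow import keras

steps_per_epoch = int(tf.data.experimental.cardinality(train_ds).numpy())

# Cosine decay gradually shrinks the learning rate from its initial value toward zero.
lr_schedule = keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,          # assumed starting rate (Adam's default)
    decay_steps=steps_per_epoch * 20,    # assumed 20-epoch decay horizon
)

model3 = keras.models.clone_model(model2)   # same architecture as CNN Model 2, fresh weights
model3.compile(
    optimizer=keras.optimizers.Adam(learning_rate=lr_schedule),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```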

Training Configuration

Optimizer: Adam with cosine decay learning rate schedule
Loss Function: Sparse Categorical Crossentropy
Callbacks:

  • Early Stopping (patience=3)
  • ReduceLROnPlateau (factor=0.5, patience=2)
  • ModelCheckpoint (save best model)

Training and Validation Accuracy Plot

[Figure: Training and validation accuracy and loss curves for CNN Model 3]

Compared to CNN Model 2, CNN Model 3 shows more stable improvements in both validation accuracy and loss, despite some early fluctuation.

The training metrics still improve steadily, but the validation curves settle into a clearer improving trend, suggesting better generalization. This improvement is most likely due to the cosine decay of the learning rate, which gradually reduces the step size, helps the model converge more smoothly, and avoids the erratic behavior seen in CNN Model 2.

Classification Report

Class   Precision   Recall   F1-score   Support
0       0.5451      0.6324   0.5855     612
1       0.4132      0.5213   0.4610     539
2       0.8566      0.8817   0.8690     786
3       0.1982      0.3082   0.2413     146
4       0.3111      0.2477   0.2758     961
5       0.1809      0.2099   0.1943     424
6       0.4955      0.5041   0.4997     974
7       0.3593      0.4911   0.4150     169
8       0.2281      0.2623   0.2440     507
9       0.2837      0.2723   0.2779     672
10      0.4934      0.3662   0.4204     1013
11      0.7612      0.7241   0.7422     841
12      0.2925      0.5401   0.3795     137
13      0.4351      0.3712   0.4006     1029
14      0.1483      0.2294   0.1801     170
15      0.7061      0.7219   0.7139     942
16      0.5567      0.5527   0.5547     986
17      0.3205      0.4608   0.3780     306
18      0.6578      0.4985   0.5672     991
19      0.2226      0.3117   0.2597     462
20      0.7539      0.5970   0.6663     2047
21      0.2828      0.3162   0.2986     525
22      0.7875      0.8743   0.8286     517
23      0.8960      0.9577   0.9258     189
24      0.5753      0.5643   0.5698     521
25      0.2537      0.2913   0.2712     357
26      0.3636      0.4969   0.4199     161

Metric                   Value
Accuracy                 0.5064
Macro avg Precision      0.4585
Macro avg Recall         0.4891
Macro avg F1-score       0.4682
Weighted avg Precision   0.5252
Weighted avg Recall      0.5064
Weighted avg F1-score    0.5115

Summary

Three CNN architectures were evaluated on the image classification task.

  1. CNN Model 1: Achieved 44.11% accuracy with a macro F1-score of 0.3647
  2. CNN Model 2: Achieved 52.47% accuracy with a macro F1-score of 0.4564
  3. CNN Model 3: Achieved 50.64% accuracy with a macro F1-score of 0.4682

CNN Model 2 shows the highest overall accuracy, while CNN Model 3 demonstrates more balanced performance across classes with better recall in many categories.

All models perform well on certain classes (particularly classes 2, 11, 15, 22, and 23) while struggling with others (classes 3, 5, 8, and 14). This suggests that some classes have more distinctive visual features than others, or that the classes are unevenly represented in the training data.