Parameter | Purpose | Values Tried | Best Choice |
---|---|---|---|
Conv filters | Capture various features | 32, 64, 128 | All three, at different stages |
Activation | Introduce non-linearity | ReLU, Leaky ReLU | ReLU |
Pooling | Reduce spatial dimensions | 2x2, 3x3 | Both required |
Dropout Rate | Prevent overfitting | 0.2, 0.25, 0.5 | 0.25 and 0.5 |
Batch Norm | Stabilize training, speed up convergence | Before/After Convolution | After convolution |
Learning Rate | Control the size of weight updates | 0.05, 0.01, 0.001, 0.0005, 0.0001 | 0.0001 |
Epochs | Ensure enough training time to learn | 50, 100, 150, 250 | 250 |
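
The values in the table can be written down as a plain search space, which makes it easy to see how many configurations a full sweep would cover. The sketch below is only illustrative: the key names (`SEARCH_SPACE`, `BEST_CONFIG`, `iter_configs`, and the dictionary keys) are assumptions, and so is the idea of an exhaustive grid, since the table does not say how the trials were combined. The convolution filter counts are left out of the grid because all three are used, one per stage.

```python
from itertools import product

# Values tried, copied from the table. Key names are hypothetical.
SEARCH_SPACE = {
    "activation": ["relu", "leaky_relu"],
    "pool_size": [2, 3],
    "dropout_rate": [0.2, 0.25, 0.5],
    "batch_norm_position": ["before_conv", "after_conv"],
    "learning_rate": [0.05, 0.01, 0.001, 0.0005, 0.0001],
    "epochs": [50, 100, 150, 250],
}

# Best choices as reported in the table.
BEST_CONFIG = {
    "conv_filters": [32, 64, 128],   # all three, at different stages
    "activation": "relu",
    "pool_sizes": [2, 3],            # both pooling sizes are kept
    "dropout_rates": [0.25, 0.5],    # both rates are kept
    "batch_norm_position": "after_conv",
    "learning_rate": 0.0001,
    "epochs": 250,
}


def iter_configs(space):
    """Yield one dict per combination of the values in `space`."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


# Size of an exhaustive grid over the tried values (an assumed strategy).
n_configs = sum(1 for _ in iter_configs(SEARCH_SPACE))
print(n_configs)  # 2 * 2 * 3 * 2 * 5 * 4 = 480
```

A full grid of this size is rarely trained end to end; varying one parameter at a time around a baseline is a common cheaper alternative, and the same dictionary layout supports either approach.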