
DPL-4 Deep Learning Exercise - to clarify concepts  




Referring to the exercise on creating a CNN model for the fashion_mnist dataset.

1. I used EarlyStopping together with validation_split when fitting. For EarlyStopping, I specified the monitor argument as either 'val_loss' or 'val_accuracy'. However, it seems that EarlyStopping monitors 'loss' or 'accuracy' (which I assume refer to the training data), not 'val_loss' or 'val_accuracy' for the validation data. Is this understanding correct?
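For what it's worth, the "patience" logic that EarlyStopping applies to whichever metric `monitor` names can be sketched in plain Python. This is only an illustration of the mechanism, not Keras's actual implementation, and the `val_loss` history values below are made up:

```python
# Minimal sketch of the patience logic behind early stopping,
# applied to a hypothetical per-epoch validation-loss history.
def early_stop_epoch(history, patience=2, min_delta=0.0):
    """Return the 1-based epoch at which training would stop,
    or None if the monitored value keeps improving."""
    best = float("inf")
    wait = 0
    for epoch, value in enumerate(history, start=1):
        if value < best - min_delta:   # improvement on the monitored metric
            best = value
            wait = 0
        else:
            wait += 1
            if wait >= patience:       # no improvement for `patience` epochs
                return epoch
    return None

# Hypothetical val_loss that improves, then plateaus:
val_loss = [0.60, 0.45, 0.40, 0.41, 0.42, 0.43]
print(early_stop_epoch(val_loss, patience=2))   # → 5
```

With patience=2, training stops at epoch 5 because epochs 4 and 5 both fail to improve on the best value seen at epoch 3.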


2. I was trying out the number of layers, filters and dense units to use. I noted that repeated runs of the same model definition can produce somewhat different results. When trying out model architectures/parameters, should I have specified a seed right at the beginning, for example np.random.seed(1)? Will specifying this also fix the random_state of whatever functions have a random_state parameter?
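As a small sanity check on what np.random.seed actually pins down: it fixes NumPy's global generator only. TensorFlow/Keras weight initialization draws from TensorFlow's own RNG (seeded separately via tf.random.set_seed), so np.random.seed alone will generally not make Keras training runs identical. A minimal sketch:

```python
import numpy as np

# Seeding NumPy's global generator makes NumPy draws reproducible...
np.random.seed(1)
a = np.random.rand(3)

np.random.seed(1)          # re-seed with the same value
b = np.random.rand(3)
print(np.array_equal(a, b))   # → True: same seed, same draws

# ...but a further draw without re-seeding differs, as expected.
c = np.random.rand(3)
print(np.array_equal(a, c))   # → False

# Note: TensorFlow/Keras has its own generator (tf.random.set_seed),
# which np.random.seed does not touch.
```

Functions that take an explicit random_state argument (as in scikit-learn) typically fall back to NumPy's global state only when random_state is left as None; passing an explicit random_state overrides the global seed for that call.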


3. The CNN illustration given has this:

misclassified_idx = np.where(p_test != y_test)[0]

What is the meaning of the [0] at the end?
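In case it helps others with the same question: np.where called with a single condition returns a tuple of index arrays, one per dimension of the input. For 1-D arrays the tuple has exactly one element, and [0] unpacks that index array. A short sketch with made-up stand-ins for p_test and y_test:

```python
import numpy as np

# Hypothetical stand-ins for p_test (predictions) and y_test (true labels)
p_test = np.array([0, 1, 2, 2, 1])
y_test = np.array([0, 2, 2, 1, 1])

result = np.where(p_test != y_test)   # a tuple: one index array per dimension
print(result)                          # → (array([1, 3]),)

misclassified_idx = result[0]          # [0] unpacks the 1-D index array
print(misclassified_idx)               # → [1 3]
```

Without the [0] you would be holding a one-element tuple rather than the array of misclassified positions.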


4. I noted the CNN illustration used 3 convolutional layers with an increasing number of filters. Is this "increasing number of filters" through the Conv2D layers a best practice, or was it obtained by experimentation? What about Dense layers? The limited illustrations I have come across seem to go narrower (decreasing number of units/nodes) through the Dense layers. Any comments here?
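One common heuristic (a general rule of thumb, not something stated in the course material) is that each 2x2 pooling step halves the spatial dimensions, so increasing the filter count compensates and keeps the per-layer activation volume from collapsing too quickly. A quick computation, assuming a 28x28 input (fashion_mnist), 'same' padding, and 2x2 pooling after each Conv2D:

```python
# Activation volume through a hypothetical 28x28 CNN with
# filters doubling and 2x2 max pooling after each Conv2D.
filters = [32, 64, 128]
side = 28
volumes = []
for f in filters:
    volume = side * side * f          # activations in this stage
    volumes.append(volume)
    print(f"{side}x{side}x{f} = {volume}")
    side //= 2                        # 2x2 pooling halves each spatial dim

# → 28x28x32 = 25088
# → 14x14x64 = 12544
# → 7x7x128  = 6272
```

So doubling the filters only halves (rather than quarters) the volume per stage, which is one intuition for the widening-Conv2D, narrowing-Dense pattern; in practice the exact counts are usually tuned by experimentation.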


Lastly, just a comment. I used 1 Conv2D layer with 64 filters, with MaxPooling and BatchNormalization, then 1 Dense layer with 128 units, with Dropout. This gives a val_accuracy of 90-91%. If anyone has a good model (beyond the given illustration), I would appreciate your sharing it. Thanks, ym.

Oh, to clarify, the "Exercise" and "Illustration" I was referring to are the ones under Quiz. I noted the workbook for the code walkthrough used a simpler architecture, which works fine (not sure if it's right, but adding layers doesn't seem to improve the model).

