Working with Complex Image data for CNNs



Table of contents

  1. Get started with TensorFlow and Deep Learning

  2. Computer Vision with TensorFlow

  3. Using Convolutional Neural Networks with TensorFlow

  4. Extending what Convolutional Neural Nets can do

  5. Working with Complex Image data for CNNs

All the code used here is available in my GitHub repository.

This is the fifth part of a series on TensorFlow for Deep Learning and Machine Learning. In the previous blog post, you saw how to apply a Convolutional Neural Network to a Computer Vision problem with a real-life data set, and it did the job pretty nicely. This time you’re going to work with more complex data and do even more with it. I believe in hands-on coding, so there are plenty of exercises and demos you can try yourself. I recommend playing around with them: change the hyper-parameters and experiment with the code. If you have not read the previous article, consider reading it first, since this one continues from it.

Reading the Data

In the previous blog post, we worked with MNIST data, which was pretty simple: grayscale 28 x 28 images in which the thing you want to classify is centered. That dataset was also very uniform. Real-life data is different: the images are more complex, and your subject might be anywhere in the image, not necessarily in the center. This time we’ll also work with a larger dataset, the Cats vs Dogs dataset, to try these things out for ourselves. TensorFlow has something called ImageDataGenerator, which simplifies things by reading images directly from disk and labeling them automatically. You first need two directories, train and validation. Each of them has two subdirectories, Cats and Dogs, that contain the respective images, and ImageDataGenerator labels every image according to the subdirectory it sits in. Here’s how the directory structure looks:

[Figure: The directory structure]
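As a rough sketch of that layout, the snippet below builds an empty copy of the tree in a temporary folder and derives integer labels by sorting the class subdirectory names, which is how flow_from_directory assigns class indices when no explicit class list is given. The lowercase folder names (`train`, `validation`, `cats`, `dogs`) and the helpers `make_layout` and `infer_class_indices` are illustrative assumptions, not names taken from the dataset or from TensorFlow.

```python
import tempfile
from pathlib import Path


def make_layout(root):
    """Create train/ and validation/, each holding cats/ and dogs/ (assumed names)."""
    for split in ("train", "validation"):
        for label in ("cats", "dogs"):
            (Path(root) / split / label).mkdir(parents=True, exist_ok=True)


def infer_class_indices(split_dir):
    """Map each class subdirectory to an integer, in sorted name order."""
    names = sorted(p.name for p in Path(split_dir).iterdir() if p.is_dir())
    return {name: index for index, name in enumerate(names)}


# Build the tree in a throwaway folder and read the labels back from it.
with tempfile.TemporaryDirectory() as root:
    make_layout(root)
    class_indices = infer_class_indices(Path(root) / "train")

print(class_indices)  # {'cats': 0, 'dogs': 1}
```

With this layout, every image placed under `cats` would get label 0 and every image under `dogs` label 1, without any label file.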

Let’s now see this in code. ImageDataGenerator lives in tensorflow.keras.preprocessing.image, so let’s first import it:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

Once you do this, you can use ImageDataGenerator:

train_image_generator = ImageDataGenerator(rescale=1./255)

train_data_gen = train_image_generator.flow_from_directory(

We first pass in rescale=1./255 to normalize the images, so that pixel values move from the 0 to 255 range into the 0 to 1 range. You can then call the flow_from_directory method to load images from a directory and its sub-directories. Taking the above diagram as a reference, you would pass in the training directory.
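To make the effect of rescale concrete, here is a minimal sketch in plain Python that applies the same factor to a handful of pixel values. The helper `rescale_pixels` and the sample values are made up for illustration; ImageDataGenerator performs the equivalent multiplication on whole image arrays.

```python
RESCALE = 1.0 / 255  # same factor as ImageDataGenerator(rescale=1./255)


def rescale_pixels(pixels, factor=RESCALE):
    """Multiply every raw pixel value by the rescale factor."""
    return [value * factor for value in pixels]


raw_row = [0, 51, 128, 255]  # made-up 8-bit pixel values
scaled_row = rescale_pixels(raw_row)

# Black stays at 0 and white lands at (approximately) 1.
print(scaled_row)
```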

Images in your data might be of different sizes, so you resize them all to one size with target_size. This is a very important step, as all inputs to the neural network should be of the same size. A nice thing about this code is that the images are resized for you as they’re loaded, so you don’t need to preprocess thousands of images on your file system; it is done at runtime instead.

The images are loaded for training and validation in batches, which is more efficient than loading them one by one. You specify the batch size with batch_size. There are a lot of factors to consider when choosing a batch size, which we will not discuss in this blog post, but you can experiment with different sizes to see the impact on performance.
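One practical consequence of the batch size is the number of batches needed to see every image once. The arithmetic can be sketched as below; the image count of 2000 and the batch size of 128 are assumed values chosen only for the example.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to cover every image once (last batch may be smaller)."""
    return math.ceil(num_images / batch_size)


# 2000 images in batches of 128: 15 full batches plus a final batch of 80 images.
print(steps_per_epoch(2000, 128))  # 16
```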

This is a binary classifier, that is, it picks between two different things, cats and dogs, so we specify that by setting class_mode to 'binary'.
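Putting the arguments discussed above together, a complete call might look like the following sketch. The image size of 150 x 150, the batch size of 128, the path string, and the helper names `flow_settings` and `make_train_generator` are all assumptions for illustration; only the argument names (directory, target_size, batch_size, class_mode) come from the flow_from_directory API.

```python
IMG_HEIGHT = 150  # assumed target height
IMG_WIDTH = 150   # assumed target width
BATCH_SIZE = 128  # assumed batch size


def flow_settings(directory):
    """Collect the flow_from_directory arguments discussed above."""
    return {
        "directory": directory,
        "target_size": (IMG_HEIGHT, IMG_WIDTH),  # every image is resized to this
        "batch_size": BATCH_SIZE,
        "class_mode": "binary",  # two classes, one 0/1 label per image
    }


def make_train_generator(train_dir):
    """Build a rescaling generator and point it at the training directory."""
    # Imported here so the settings can be inspected without loading TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    generator = ImageDataGenerator(rescale=1.0 / 255)
    return generator.flow_from_directory(**flow_settings(train_dir))


settings = flow_settings("path/to/train")  # hypothetical path
print(settings["target_size"], settings["class_mode"])
```

Calling `make_train_generator` with a real training directory would return the iterator that is later handed to the model for training.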

And that’s all you need to read your data, auto-label it according to its directories, and do some processing at run time. So let’s do the same for the validation data too:

validation_image_generator = ImageDataGenerator(rescale=1./255)
val_data_gen = validation_image_generator.flow_from_directory(
                 directory= validation_dir,

Training the model

Another great thing about ImageDataGenerator is that it requires little or no change to how you build and train the model. So let's build a sample model for the dogs vs cats problem and then compile it.

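from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(16, (3,3), padding='same', activation='relu',
           input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    MaxPooling2D(),  # halve the feature-map size after each convolution
    Conv2D(32, (3,3), padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, (3,3), padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),  # turn the feature maps into a vector for the Dense layers
    Dense(512, activation='relu'),
    Dense(1, activation='sigmoid')])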


You can see that you don't have to make any changes while compiling your model to make it work with ImageDataGenerator. Let's get to the training part now.
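The compile step is not shown in this section, so here is a minimal sketch of what it could look like for a model ending in a single sigmoid unit. The choice of the adam optimizer and the accuracy metric is an assumption, and `compile_binary_classifier` is a hypothetical helper; binary cross-entropy is the usual loss for a one-unit sigmoid output.

```python
def compile_binary_classifier(model):
    """Compile a Keras model whose last layer is Dense(1, activation='sigmoid')."""
    model.compile(
        optimizer="adam",            # assumed optimizer
        loss="binary_crossentropy",  # pairs with a single sigmoid output
        metrics=["accuracy"],        # assumed metric
    )
    return model
```

Nothing in this call refers to ImageDataGenerator, which is the point made above: the way the data is loaded does not change how the model is compiled.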

history = model.fit_generator(

One difference you will see here is that instead of passing the training data directly after loading it, I now pass train_data_gen, which reads the data from disk using ImageDataGenerator and applies the transformations to it. You can do the same with the validation data too.
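A full training call could be sketched as follows. Note that fit_generator is deprecated in newer TensorFlow 2 releases, where model.fit accepts generators directly, so this sketch uses model.fit. The helper `train` and its parameters total_train, total_val, batch_size, and epochs are hypothetical names standing in for values you would define yourself.

```python
import math


def train(model, train_data_gen, val_data_gen, total_train, total_val,
          batch_size, epochs):
    """Fit a compiled Keras model on generators produced by flow_from_directory."""
    return model.fit(
        train_data_gen,  # yields (images, labels) batches read from disk
        steps_per_epoch=math.ceil(total_train / batch_size),
        epochs=epochs,
        validation_data=val_data_gen,
        validation_steps=math.ceil(total_val / batch_size),
    )
```

The returned history object records the loss and metrics per epoch, which is handy for comparing training and validation accuracy.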

Trying out for yourself

All the code we just talked about is implemented in this notebook. The model we build here is not yet a good model and suffers from overfitting; we will see how to tackle this problem in the next blog in this series.

You can use the Open in Colab button to directly open the notebook in Colab or even download it and run it on your system.

About Me

Hi everyone I am Rishit Dagli



If you want to ask me some questions, report any mistake, suggest improvements, give feedback you are free to do so by emailing me at —
