Transfer Learning Explained
One of the most popular notions in deep learning is that it is rarely a good idea to train a deep neural network from scratch. Why is that?
Let’s say that you successfully trained a cat/dog classifier and got fair performance on unseen images of cats and dogs. In other words, your classifier can recognize cats and dogs with 99% accuracy. A few days later, you want to train a horse/human classifier, but you only have a handful of images of horses and humans. What do you do? Is it possible to transfer the learned features of your first classifier to the new one?
The answer is yes. A model trained to recognize cats and dogs can be modified to recognize horses or humans. The early layers of the first classifier hold low-level features such as edges and lines, while the later layers hold high-level features (face, nose, ear, and so on), which means you can reuse the learned features of those early layers without any problem.
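As a minimal sketch of this idea (the small model below is just a stand-in for the cat/dog classifier, and the layer choice is illustrative), Keras lets you cut off the task-specific head and reuse the remaining layers as a feature extractor:
import tensorflow as tf

# A stand-in for the trained cat/dog classifier (hypothetical architecture).
cat_dog_model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # task-specific head
])

# Reuse everything except the head as a feature extractor for the new task.
feature_extractor = tf.keras.Model(
    inputs=cat_dog_model.input,
    outputs=cat_dog_model.layers[-2].output,
)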
That being said, transfer learning is a deep learning technique in which a model developed for one task is reused as the starting point for another task.
Why Transfer Learning?
There are three main reasons to use transfer learning in deep neural networks:
- The amount of data needed to train deep neural networks is huge, and without such big data the network performs poorly. Transferring learned features (from a model trained on a big dataset) allows us to get fair performance on small datasets.
- The amount of computing resources needed to train deep neural networks is huge. By using pretrained networks and freezing their layers, we avoid retraining them and thus reduce the training time and cost.
- Leveraging novelties and state-of-the-art models. Transfer learning allows us to take advantage of state-of-the-art models and research by introducing these models into the task at hand. These include ResNet, VGG, Inception, Xception, MobileNet, etc. (see the loading sketch right after this list).
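As a minimal sketch (the architecture choices here are just examples), these state-of-the-art models ship with tf.keras.applications and can be loaded with their ImageNet weights in a couple of lines:
import tensorflow as tf

# Each call downloads the ImageNet weights the first time it runs.
resnet = tf.keras.applications.ResNet50(weights="imagenet")
vgg = tf.keras.applications.VGG16(weights="imagenet")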
Typical Flow of Transfer Learning in Practice
The following are the typical steps to do transfer learning:
- Initializing the primary model (often a pretrained model) and its weights
- Freezing the layers of the primary model
- Creating the new model by stacking a classification layer on top of the primary model's layers
- Training the new model
- Evaluating and improving (or fine-tuning) the new model
Most deep learning frameworks such as TensorFlow, Keras, and PyTorch let you easily initialize pretrained models with their weights, so you can focus on adding the later layers to build the new model and improve it.
For example, below is how you can load a MobileNet model in TensorFlow. In this case, the MobileNet model acts as the base, or primary, model.
import tensorflow as tf

# Load MobileNetV2 with ImageNet weights, dropping its original
# classification head (include_top=False) so we can add our own.
primary_model = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3),
    include_top=False,
    weights="imagenet",
)
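Continuing this example, here is a minimal sketch of the remaining steps: freezing the base, stacking a new classification head, and training. The dataset variables (train_ds, val_ds) and the binary head are placeholders you would adapt to your own task:
# Step 2: freeze the pretrained layers so their weights are not updated.
primary_model.trainable = False

# Step 3: stack a new classification head on top of the frozen base.
new_model = tf.keras.Sequential([
    primary_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # e.g. horse vs. human
])

# Step 4: train only the new head.
new_model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
# new_model.fit(train_ds, validation_data=val_ds, epochs=5)

# Step 5 (fine-tuning): optionally unfreeze the base and retrain
# end to end with a much lower learning rate.
# primary_model.trainable = True
# new_model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
#                   loss="binary_crossentropy", metrics=["accuracy"])
# new_model.fit(train_ds, validation_data=val_ds, epochs=5)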
To recap, transfer learning gives us the ability to take advantage of open-source models by reusing them for the task at hand!
Until Next Time, Stay Deep!!