Transfer learning
Using a pre-trained model, i.e. its trained weights, as the starting point for training a model on a different data set or task.
Take a pre-trained net, replace the last layer(s) with randomly initialized ones (e.g. a custom softmax layer for the new classes), and freeze the parameters of all earlier layers.
Options: train only the newly added layers, or fine-tune additional later layers as well (sketch below).
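A minimal sketch of this setup, assuming PyTorch/torchvision (the ResNet-18 choice and the class count are illustrative, not part of the original notes):

```python
import torch.nn as nn
import torchvision.models as models

# Load a pre-trained network (illustrative choice; any pre-trained net works).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the parameters of all pre-trained layers.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with a randomly initialized head for the new task;
# newly constructed parameters are trainable by default.
num_classes = 10  # hypothetical number of classes in the new task
model.fc = nn.Linear(model.fc.in_features, num_classes)
# The softmax is applied implicitly by nn.CrossEntropyLoss during training.
```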
Prereqs:
- Tasks A and B have the same input x
- Much data exists for the existing model/task A, but only little for the new task B
- Low-level features from A could be helpful for task B
- The pre-trained model needs to generalize
Another trick:
- Precompute the output of the frozen layers for all samples once; later training epochs then only run the new head, which saves computation (sketch below)
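A sketch of this caching trick, again assuming PyTorch/torchvision; `dataloader` is a hypothetical loader over the training samples:

```python
import torch
import torch.nn as nn
import torchvision.models as models

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()  # strip the old head; the frozen trunk remains
backbone.eval()              # fixed batch-norm/dropout behaviour

features, labels = [], []
with torch.no_grad():            # no gradients needed for frozen layers
    for x, y in dataloader:      # dataloader: hypothetical sample loader
        features.append(backbone(x))
        labels.append(y)
torch.save((torch.cat(features), torch.cat(labels)), "features.pt")
# Later, train only the small new head directly on the cached features.
```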
For a larger set of samples:
- Only freeze the first layers; train the last few layers (and replace the softmax), as sketched below
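A sketch of this partial fine-tuning, assuming the same PyTorch/torchvision setup (ResNet-18 and the class count are illustrative):

```python
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False        # freeze everything first...
for param in model.layer4.parameters():
    param.requires_grad = True         # ...then unfreeze the last block
model.fc = nn.Linear(model.fc.in_features, 10)  # fresh head (10: hypothetical)
```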
For a very large set of samples:
- Use the pre-trained weights only as initialization, then train the whole network (and replace the softmax); see the sketch below
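A sketch of full fine-tuning from a pre-trained initialization (PyTorch/torchvision assumed; the optimizer settings are illustrative):

```python
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 10)  # fresh head (10: hypothetical)
# Nothing is frozen: the pre-trained weights only initialize the network.
# A small learning rate keeps training close to that initialization.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
```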
Image-based NNs
Pre-trained models that generalize well can be used as starting models, e.g. the ResNet models from https://github.com/KaimingHe/deep-residual-networks.
In CNNs, the more generic features (e.g. edges and textures) usually sit in the earlier layers, while later layers become increasingly task-specific.
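An illustrative sketch, assuming PyTorch/torchvision: cutting a pre-trained ResNet after an early block keeps exactly those generic features as a reusable extractor:

```python
import torch.nn as nn
import torchvision.models as models

resnet = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
# conv1 through layer1 capture generic edges/textures; later blocks are
# increasingly task-specific, so only the early part is reused here.
early_features = nn.Sequential(
    resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool, resnet.layer1
)
```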