
Transfer learning

Use a pre-trained model (its trained weights) as a starting point for training a model on a different data set or task.

Take the pre-trained net, replace the last layer(s) with randomly initialized ones (e.g. a new softmax layer for the target classes), and freeze the parameters of the earlier layers.

Options: train only the newly added layers, or also unfreeze and train additional layers of the network.
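
A minimal sketch of this recipe, assuming PyTorch/torchvision (the ResNet-18 backbone and the class count of 10 are placeholder assumptions, not from the original notes):

  import torch
  import torch.nn as nn
  from torchvision import models

  # Load a net pre-trained on ImageNet.
  model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

  # Freeze the parameters of all pre-trained layers.
  for param in model.parameters():
      param.requires_grad = False

  # Replace the last layer with a randomly initialized one for the
  # new task; the softmax is applied implicitly by the loss function.
  num_classes = 10  # assumption: depends on the new task
  model.fc = nn.Linear(model.fc.in_features, num_classes)

  # Only the new layer's parameters are trained.
  optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
  criterion = nn.CrossEntropyLoss()  # cross entropy over softmax outputs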

Prereqs:

  • Tasks A and B have the same input x (e.g. images)
  • There is much data for the existing model/task A, but only little data for the new task B
  • Low-level features learned for A could be helpful for task B
  • The pre-trained model needs to generalize (not be overfit to task A)

Another trick:

  • Precompute the output of the frozen layers for all samples once and cache it (saves computation in later training epochs, since the frozen part acts as a fixed feature extractor); a sketch follows below.
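
A sketch of this caching trick under the same PyTorch assumptions (the random stand-in data only makes the example self-contained):

  import torch
  import torch.nn as nn
  from torchvision import models

  # Frozen backbone: pre-trained net with its classifier head removed.
  backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
  backbone.fc = nn.Identity()  # now outputs 512-dim feature vectors
  backbone.eval()
  for p in backbone.parameters():
      p.requires_grad = False

  # Stand-in data set; replace with the real samples.
  images = torch.randn(64, 3, 224, 224)
  labels = torch.randint(0, 10, (64,))
  loader = torch.utils.data.DataLoader(
      torch.utils.data.TensorDataset(images, labels), batch_size=16)

  # Run every sample through the frozen layers exactly once and cache it.
  with torch.no_grad():
      feats = torch.cat([backbone(x) for x, _ in loader])

  # Later training epochs only touch the cached features and the new head.
  head = nn.Linear(512, 10)  # 512 = resnet18 feature size, 10 = assumed classes
  optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)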

For a larger set of samples:

  • Only freeze the first layers and train the last few layers (and replace the softmax layer); a sketch follows below.
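
A possible sketch of this middle regime (the cut point at layer4, the last residual block of ResNet-18, is an assumption):

  import torch.nn as nn
  from torchvision import models

  model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

  # Freeze everything first ...
  for p in model.parameters():
      p.requires_grad = False
  # ... then unfreeze only the last few layers.
  for p in model.layer4.parameters():
      p.requires_grad = True

  # Replace the softmax/classifier head with a trainable one.
  model.fc = nn.Linear(model.fc.in_features, 10)  # 10 = assumed class count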

For a large set of samples:

  • Use the pre-trained weights only as initialization, then train the whole network (and replace the softmax layer); a sketch follows below.
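
A minimal sketch of this full fine-tuning variant (the learning rate is an assumption; a smaller one than for training from scratch is common so the pre-trained weights are not destroyed early on):

  import torch
  import torch.nn as nn
  from torchvision import models

  model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
  model.fc = nn.Linear(model.fc.in_features, 10)  # new head for the new task

  # No freezing: all parameters are trained, starting from the
  # pre-trained weights instead of a random initialization.
  optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)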

Pre-trained models that generalize well can be used as a starting model, e.g. the ResNet models from https://github.com/KaimingHe/deep-residual-networks.

In CNNs, the more generic features (e.g. edges, textures) are usually contained in the earlier layers, while later layers become more task-specific.
