data_mining:neural_network:cnn:resnet

Residual block

Normal flow:

$a^{[l]} \rightarrow \text{Linear} \rightarrow \text{ReLU} \rightarrow a^{[l+1]} \rightarrow \text{Linear} \rightarrow \text{ReLU} \rightarrow a^{[l+2]}$

With residual block:

A skip connection passes $a^{[l]}$ directly to the ReLU before $a^{[l+2]}$, so that $a^{[l+2]} = g(z^{[l+2]} + a^{[l]})$, where $g$ is the ReLU activation.
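A minimal sketch of such a block in Python/PyTorch, assuming two same-shaped convolutional layers so that $z^{[l+2]}$ and $a^{[l]}$ can be added; the class name and layer sizes are illustrative, not from these notes:

  import torch
  import torch.nn as nn
  import torch.nn.functional as F

  class ResidualBlock(nn.Module):
      # One residual block: two linear (conv) layers plus a skip connection.
      def __init__(self, channels):
          super().__init__()
          self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
          self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

      def forward(self, a_l):
          # Normal flow: a[l] -> linear -> ReLU -> a[l+1] -> linear -> z[l+2]
          z = self.conv2(F.relu(self.conv1(a_l)))
          # Skip connection: a[l+2] = g(z[l+2] + a[l])
          return F.relu(z + a_l)

  # Hypothetical usage: shapes must match for the addition.
  block = ResidualBlock(channels=64)
  a_l = torch.randn(1, 64, 32, 32)
  a_l_plus_2 = block(a_l)  # same shape as a_l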

This makes it possible to train much deeper networks.

With skip connections, the identity function is easy for the residual block to learn, so adding the block does not hurt performance.
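Spelled out with the notation above (a short derivation; $W^{[l+2]}$ and $b^{[l+2]}$ denote the parameters of the second linear layer):

$a^{[l+2]} = g(z^{[l+2]} + a^{[l]}) = g(W^{[l+2]} a^{[l+1]} + b^{[l+2]} + a^{[l]})$

If $W^{[l+2]} = 0$ and $b^{[l+2]} = 0$ (e.g. through weight decay), this reduces to $a^{[l+2]} = g(a^{[l]}) = a^{[l]}$, since $g$ is ReLU and $a^{[l]} \geq 0$.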

In deep plain networks without skip connections, it is difficult to choose parameters such that a layer learns even the identity function, which is one reason why adding layers can degrade training performance.
