CNN
Convolutional Neural Networks
A typical CNN layer consists of three stages:
- Apply several convolutions in parallel to produce a set of linear activations
- Apply a non-linear activation function (such as ReLU) to each linear activation; this is called the detector stage
- Apply a pooling function
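The three stages can be sketched in PyTorch as follows. This is only an illustration: the input size, channel counts, kernel size, and the choice of max pooling are assumptions made for the example, not something these notes prescribe.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical input: batch of 1, 3 channels, 32x32 pixels.
x = torch.randn(1, 3, 32, 32)

conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
relu = nn.ReLU()
pool = nn.MaxPool2d(kernel_size=2)

linear = conv(x)          # stage 1: convolutions produce linear activations
detected = relu(linear)   # stage 2: detector stage (non-linearity)
pooled = pool(detected)   # stage 3: pooling summarizes each 2x2 neighborhood

print(linear.shape)    # torch.Size([1, 8, 32, 32])
print(detected.shape)  # torch.Size([1, 8, 32, 32])
print(pooled.shape)    # torch.Size([1, 8, 16, 16])
```

In practice the three modules are often chained with nn.Sequential; they are kept separate here so each stage's output is visible.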
Convolution
TODO
Pooling
Pooling replaces the output of the network at a given location with a summary statistic of the nearby outputs.
- Max pooling: Select maximum output among neighbors
- Average pooling: Take average of neighbors
- Norm pooling: Take the norm (commonly the L2 norm) of neighbors
- Weighted average: Take a weighted average of neighbors, with weights based on distance from the central pixel
The goal is to make the output approximately invariant to small changes in the input, such as small translations.
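A small sketch of three of the pooling variants above, using functions from torch.nn.functional. The 4x4 input values and the 2x2 window are made up for illustration. The last part shows the invariance idea in its simplest form: moving a peak by one pixel leaves the max-pooled output unchanged, but only because the peak stays inside the same pooling window.

```python
import torch
import torch.nn.functional as F

# Hypothetical 4x4 single-channel input, shaped (batch, channels, H, W).
x = torch.tensor([[1., 2., 0., 1.],
                  [3., 4., 1., 0.],
                  [0., 1., 5., 6.],
                  [2., 0., 7., 8.]]).reshape(1, 1, 4, 4)

max_out = F.max_pool2d(x, kernel_size=2)                   # maximum of each 2x2 block
avg_out = F.avg_pool2d(x, kernel_size=2)                   # mean of each 2x2 block
l2_out = F.lp_pool2d(x, norm_type=2.0, kernel_size=2)      # L2 norm of each 2x2 block

print(max_out.squeeze())  # tensor([[4., 1.], [2., 8.]])
print(avg_out.squeeze())  # tensor([[2.5000, 0.5000], [0.7500, 6.5000]])
print(l2_out.squeeze())

# Invariance to a small shift: the same peak, moved one pixel to the right.
a = torch.zeros(1, 1, 4, 4)
a[0, 0, 0, 0] = 1.0
b = torch.zeros(1, 1, 4, 4)
b[0, 0, 0, 1] = 1.0
same = torch.equal(F.max_pool2d(a, 2), F.max_pool2d(b, 2))
print(same)  # True
```

If the shift moved the peak across a window boundary, the pooled outputs would differ, which is why the invariance is only approximate.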
torch.nn.Conv2d
It takes in_channels, out_channels, and kernel_size (an int, or a tuple (H, W)) and creates a weight tensor of shape [out_channels, in_channels, kernel_h, kernel_w]. With the default groups=1, this means it keeps a separate 2D kernel for each input-output channel combination.
Each output channel has its own set of kernels, so different output channels can learn to detect different features.
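The weight shape can be checked directly. The channel counts, the non-square kernel, and the input size below are arbitrary values picked for the example; the layer uses the defaults stride=1, padding=0, and groups=1.

```python
import torch
import torch.nn as nn

# Hypothetical configuration: 3 input channels, 16 output channels, 3x5 kernel.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=(3, 5))

print(conv.weight.shape)  # torch.Size([16, 3, 3, 5]) -> [out_ch, in_ch, ker_h, ker_w]
print(conv.bias.shape)    # torch.Size([16]) -> one bias per output channel

# One 2D kernel per (output channel, input channel) pair.
num_kernels = conv.weight.shape[0] * conv.weight.shape[1]
print(num_kernels)  # 48

x = torch.randn(2, 3, 28, 28)  # (batch, in_ch, H, W)
y = conv(x)
print(y.shape)  # torch.Size([2, 16, 26, 24])
```

With no padding and stride 1, each spatial dimension shrinks by (kernel size - 1), which gives 28 - 2 = 26 rows and 28 - 4 = 24 columns here.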