I first learnt about kernels and convolution a few months ago during a Computer Vision module at University - it was really insightful. The exact methods used to perform Gaussian blurring, edge detection etc. were something I hadn't given much thought to before.
A cool fact about the Gaussian filter is that it's separable - you can convolve in the X direction (using a 1 * n kernel), and then convolve the result again along the Y direction (using an m * 1 kernel) - the final result will be the same as convolving with a single m * n kernel, but for an n * n kernel the work per pixel is O(n) rather than O(n^2) (you only have m+n multiplications per pixel rather than m * n per pixel).
Not every filter is separable - it's only possible if the n * m filter can be expressed as the product of an n * 1 and a 1 * m matrix (i.e. the kernel matrix has rank 1).
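To make the separability claim concrete, here is a minimal NumPy sketch. The helper names (`gaussian_1d`, `convolve2d_full`), the 5-tap kernel, sigma = 1.0 and the random test image are all made up for illustration. It compares two 1D passes against a single pass with the outer-product kernel, and uses the matrix rank as a quick separability check.

```python
import numpy as np

def gaussian_1d(size, sigma):
    """Sampled, normalised 1D Gaussian (hypothetical helper)."""
    x = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def convolve2d_full(image, kernel):
    """Naive 'full' 2D convolution: one shifted, scaled copy of the image per tap."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih + kh - 1, iw + kw - 1))
    for i in range(kh):
        for j in range(kw):
            out[i:i + ih, j:j + iw] += kernel[i, j] * image
    return out

size, sigma = 5, 1.0                      # illustrative values
g = gaussian_1d(size, sigma)
row = g[np.newaxis, :]                    # 1 * n kernel (X direction)
col = g[:, np.newaxis]                    # m * 1 kernel (Y direction)
kernel_2d = col @ row                     # m * n kernel as an outer product

rng = np.random.default_rng(0)
image = rng.random((16, 20))              # stand-in for a real image

# Two cheap passes (m + n multiplications per pixel) ...
two_pass = convolve2d_full(convolve2d_full(image, row), col)
# ... versus one expensive pass (m * n multiplications per pixel).
single_pass = convolve2d_full(image, kernel_2d)

print(np.allclose(two_pass, single_pass))

# A kernel is separable exactly when its matrix has rank 1.
laplacian = np.array([[0.0, 1.0, 0.0],
                      [1.0, -4.0, 1.0],
                      [0.0, 1.0, 0.0]])
print(np.linalg.matrix_rank(kernel_2d), np.linalg.matrix_rank(laplacian))
```

The Laplacian kernel has rank 2, so it can't be split into a single row pass and a single column pass.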
Two really cool facts about the Gaussian filter are: 1) it's the only separable isotropic (i.e. 'round') kernel, and 2) there's a Deriche approximation where the number of operations per pixel doesn't depend on filter size: https://espace.library.uq.edu.au/view/UQ:10982/IIR.pdf
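For the first point, it may help to write out the continuous 2D Gaussian. The symbols here (sigma for the standard deviation, g for the 1D Gaussian) are just the usual notation, not something from the linked paper:

```latex
G(x, y) = \frac{1}{2\pi\sigma^2} \exp\!\left(-\frac{x^2 + y^2}{2\sigma^2}\right)
        = g(x)\, g(y),
\qquad
g(t) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{t^2}{2\sigma^2}\right)
```

It depends on x and y only through r^2 = x^2 + y^2, which is what makes it isotropic, and it factors into a function of x times a function of y, which is what makes it separable.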
Another cool fact is that you can perform convolution as a point-wise multiplication in Fourier space (see http://en.wikipedia.org/wiki/Convolution_theorem).
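A rough NumPy sketch of the convolution theorem, assuming a small random image and an arbitrary 3 * 3 kernel (both made up for the demo, as is the `convolve2d_full` helper). Both inputs are zero-padded to the full output size first, because the FFT product otherwise gives a circular (wrap-around) convolution rather than a linear one.

```python
import numpy as np

def convolve2d_full(image, kernel):
    """Naive 'full' 2D convolution, used only as a reference result."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih + kh - 1, iw + kw - 1))
    for i in range(kh):
        for j in range(kw):
            out[i:i + ih, j:j + iw] += kernel[i, j] * image
    return out

rng = np.random.default_rng(0)
image = rng.random((8, 10))               # stand-in for a real image
kernel = rng.random((3, 3))               # arbitrary, not necessarily separable

# Size of the full linear convolution in each dimension.
shape = (image.shape[0] + kernel.shape[0] - 1,
         image.shape[1] + kernel.shape[1] - 1)

# fft2 zero-pads each input up to `shape` before transforming.
F_image = np.fft.fft2(image, s=shape)
F_kernel = np.fft.fft2(kernel, s=shape)

# Point-wise multiplication in Fourier space, then transform back.
via_fft = np.fft.ifft2(F_image * F_kernel).real

direct = convolve2d_full(image, kernel)
print(np.allclose(direct, via_fft))
```

The FFT route tends to pay off for large kernels; for small ones like 3 * 3 the direct sum is usually cheaper.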