Kreiman Lab

Biological and Computer Vision

Gabriel Kreiman

Additional Materials

Chapter VIII: Teaching computers how to see

Biologically plausible models of vision should be image computable, based on neural networks, and show the fundamental properties of selectivity, invariance, and generalization. State-of-the-art models use a divide-and-conquer hierarchical architecture with elementary compositional computations at each step. Ascending through the architecture, units show larger receptive fields, tuning for more complex features, and increased tolerance to feature transformations. At the heart of neural networks are weights that control the influence of a pre-synaptic unit on its post-synaptic target. Many models can be trained in an end-to-end fashion by adjusting those weights to optimize performance in specific tasks via back-propagation in supervised learning scenarios. Whether and how back-propagation can be implemented by biological hardware remains an issue of contention. An essential step when training models is cross-validation, which mitigates the risk of overfitting in models with large numbers of free parameters. Model outputs can be compared directly to behavioral measurements to assess the extent to which the model can explain perception. In addition, the responses of model units to any arbitrary image can be measured and compared with the responses of real neurons in biological brains, which probes the inner workings of the model.
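
The weight-adjustment and validation steps described above can be illustrated with a minimal sketch. It is not taken from the book: the toy task (predicting x squared from x), the network size, the learning rate, and all names (`forward`, `gradients`, `train`, `train_set`, `val_set`) are assumptions chosen for illustration. A single hold-out split stands in for full k-fold cross-validation to keep the example short.

```python
import math
import random

def forward(x, W1, W2):
    """Forward pass: input -> tanh hidden layer -> one linear output unit."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = sum(w * hj for w, hj in zip(W2, h))
    return h, y

def loss(data, W1, W2):
    """Mean squared error over a list of (input, target) pairs."""
    return sum((forward(x, W1, W2)[1] - t) ** 2 for x, t in data) / len(data)

def gradients(data, W1, W2):
    """Back-propagation: gradients of the mean squared error w.r.t. W1 and W2."""
    gW1 = [[0.0] * len(W1[0]) for _ in W1]
    gW2 = [0.0] * len(W2)
    n = len(data)
    for x, t in data:
        h, y = forward(x, W1, W2)
        dy = 2.0 * (y - t) / n                    # dL/dy for this example
        for j, hj in enumerate(h):
            gW2[j] += dy * hj                     # dL/dW2[j]
            dpre = dy * W2[j] * (1.0 - hj * hj)   # error passed back through tanh
            for i, xi in enumerate(x):
                gW1[j][i] += dpre * xi            # dL/dW1[j][i]
    return gW1, gW2

def train(data, W1, W2, lr=0.05, epochs=500):
    """Full-batch gradient descent; returns new weights, inputs are not mutated."""
    for _ in range(epochs):
        gW1, gW2 = gradients(data, W1, W2)
        W1 = [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(W1, gW1)]
        W2 = [w - lr * g for w, g in zip(W2, gW2)]
    return W1, W2

random.seed(0)

# Toy data set (assumed): inputs are [x, 1.0], where the constant 1.0 acts as a bias.
xs = [-1.0 + 2.0 * k / 19 for k in range(20)]
data = [([x, 1.0], x * x) for x in xs]
random.shuffle(data)
train_set, val_set = data[:15], data[15:]   # hold out 5 examples for validation

# Small random initial weights: 3 hidden units, 2 inputs each, 1 output unit.
W1 = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(3)]
W2 = [random.uniform(-0.5, 0.5) for _ in range(3)]

W1_fit, W2_fit = train(train_set, W1, W2)

print(f"training loss: {loss(train_set, W1_fit, W2_fit):.4f}")
print(f"held-out loss: {loss(val_set, W1_fit, W2_fit):.4f}")
```

Comparing the two printed losses is the point of the hold-out split: a held-out loss far above the training loss would suggest that the weights have fit peculiarities of the training examples rather than the underlying mapping.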

[1] Figures in powerpoint format for teaching
[2] Further reading
[3] iTheory: Visual cortex and deep learning (Tomaso Poggio) [46:11 duration]

[4] Neural networks for Machine Learning (Geoffrey Hinton, Full Course)
[5] Machine learning (Andrew Ng, Full Course)
[6] Constraints: visual object recognition (Patrick Winston) [51:31 duration]

[7] How to teach computers to understand pictures (Fei-Fei Li) [18:02 duration]

[8] A step by step backpropagation example (Matt Mazur)
[9] Deep Learning (Goodfellow, Bengio, Courville, MIT Press 2016)