Objections to Neural Networks
Neural Network Theory
Neural networks, as we heard earlier in the week, work by Bayesian probabilistic inference. For example, one might assume any number of things about the number of minutes of injury time in a Leeds United match. However, if we assume a normal distribution, we can state the probability that any given value belongs to Leeds. It turns out that, under this Gaussian assumption, the posterior probability of class membership is a sigmoid function of the input, so the sigmoid gives the optimal decision boundary. Thus, neural networks find the best fit by Bayesian probabilities. They are not doing arbitrary function fitting.
Moreover, introducing compatible or incompatible evidence shifts the sigmoid curve right or left, and as noise increases the slope of the sigmoid changes (the "temperature"). Thus, given a simple classification problem, and assuming the Gaussian description, a single neuron is a reasonable and in some cases optimal model (Neural Networks for Pattern Recognition by Christopher Bishop). So: not arbitrary!
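To make that concrete, here is a minimal numerical sketch (Python/NumPy) of the textbook result being cited. The means, shared variance, and priors are made-up illustrative values, not anything from the talk: with two Gaussian classes of equal variance, Bayes' rule gives a posterior that is exactly a sigmoid of a linear function of the input, the log prior odds shift the curve left or right, and the variance (noise) sets how steep it is.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gaussian(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def posterior_bayes(x, mu0, mu1, sigma, prior1=0.5):
    """P(class 1 | x) computed directly from Bayes' rule."""
    p0 = gaussian(x, mu0, sigma) * (1 - prior1)
    p1 = gaussian(x, mu1, sigma) * prior1
    return p1 / (p0 + p1)

def posterior_sigmoid(x, mu0, mu1, sigma, prior1=0.5):
    """The same posterior written as a sigmoid of a linear function of x.
    The slope is set by the class separation over the variance (more noise
    gives a shallower slope), and the bias is shifted by the log prior odds
    (extra evidence for one class slides the curve left or right)."""
    w = (mu1 - mu0) / sigma ** 2
    b = (mu0 ** 2 - mu1 ** 2) / (2 * sigma ** 2) + np.log(prior1 / (1 - prior1))
    return sigmoid(w * x + b)

x = np.linspace(-5, 10, 7)
print(np.allclose(posterior_bayes(x, 0.0, 4.0, 1.5),
                  posterior_sigmoid(x, 0.0, 4.0, 1.5)))   # True
```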
Excitatory Interactions: Multisensory Integration
It is often useful to integrate information coming from multiple modalities. In the brain, such multi-modal integration is computed in the superior colliculus. Indeed, much of its behavior can be characterized as detecting correlations and controlling saccades to them. Recordings from ferret superior colliculi indicate that the sigmoid function fits the data very well: the activations produced by auditory and visual stimuli are almost exactly what you get by adding the inputs and passing them through a sigmoid.
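As a rough illustration of that claim (not the ferret data themselves), here is what an additive-inputs-through-a-sigmoid model looks like; the function name, weights, and bias are invented for the example. A nice side effect of the sigmoid form is the classic multisensory enhancement: for weak stimuli, the bimodal response comes out much larger than the sum of the two unimodal responses.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sc_response(auditory, visual, bias=-6.0, w_aud=1.0, w_vis=1.0):
    """Toy superior-colliculus unit: unimodal drives summed, then a sigmoid."""
    return sigmoid(w_aud * auditory + w_vis * visual + bias)

# For weak inputs, the combined response exceeds the sum of the unimodal ones.
aud, vis = 2.0, 2.0
print(sc_response(aud, 0.0) + sc_response(0.0, vis))  # ~0.036
print(sc_response(aud, vis))                          # ~0.119
```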
Inhibitory Interactions
Finding contradictory information has the effect of shifting the curve to the right, but finding support for a conflicting theory has the effect of depressing the curve (since the total probability can only be one). This kind of inhibition (divisive) is actually better, because not as much information is lost (subtractive inhibition does lose data). Indeed, early on in the superior colliculi, divisive inhibition is quite common.
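Here is a small sketch of the contrast being drawn; the particular forms of the two kinds of inhibition and all the numbers are my own illustrative assumptions. Subtractive inhibition shifts the whole curve and squashes weak responses down near the floor, while divisive inhibition rescales the responses, so their relative structure (and hence much of the information) survives.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def subtractive(drive, inhibition):
    """Inhibition subtracted from the drive: the curve shifts rightwards."""
    return sigmoid(drive - inhibition)

def divisive(drive, inhibition):
    """Inhibition divides the output: the curve is depressed but keeps its shape."""
    return sigmoid(drive) / (1.0 + inhibition)

drive = np.array([-2.0, 0.0, 2.0, 4.0])
print(np.round(subtractive(drive, 4.0), 3))  # weak drives crushed toward 0
print(np.round(divisive(drive, 4.0), 3))     # all responses scaled, ratios kept
```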
Taming Neural Networks: The Role of Constraints
Slow learning, too many parameters, non-reproducibility, and the lack of correspondence to physical data can all be reduced somewhat by adding constraints to one's models. While you might be uncomfortable with this, efficient computation can never be bought at the price of survival: as some have calculated, using even 5% of your neurons at once would mean you would have to eat a Mars bar a minute. Thus, maximally efficient computation in the sense of a massively parallel, dense representation is not a good idea. Instead, we do a kind of independent components analysis, which serves to minimize firing rates (generally by making the components that respond to non-dependent inputs independent of one another); a rough sketch of what such a firing-rate constraint looks like in a model follows.
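The sketch below is only an illustration of the general idea, not the speaker's actual setup: the linear generative model, the penalty weight, and the random patterns are all assumptions. The point is simply that the cost function gets an extra term that charges every unit for firing, so accurate but dense codes become expensive.

```python
import numpy as np

def constrained_cost(activity, weights, stimulus, firing_penalty=0.1):
    """Usual fit term (reconstruction through a linear generative model)
    plus a metabolic charge on mean absolute firing. Minimising this
    trades accuracy against how many spikes the representation costs."""
    fit = np.mean((stimulus - activity @ weights) ** 2)
    metabolic = np.mean(np.abs(activity))
    return fit + firing_penalty * metabolic

# The two random patterns below reconstruct nothing; they are here only to
# show how the metabolic term scales with how many units are active.
rng = np.random.default_rng(0)
dense = rng.normal(size=32)                           # every unit firing
sparse = np.where(rng.random(32) < 0.05, dense, 0.0)  # only a few units firing
print(np.mean(np.abs(dense)), np.mean(np.abs(sparse)))
```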
Optimal Codes
Tlearn's networks often involve all of the nodes. While great for information storage, this is bad for food consumption. Unlike traditional neural networks, sparsely coded networks only fire when they see a very specific type of information. This is bad for information, but good for food consumption. An exponential (Boltzmann-like) distribution of firing rates is the compromise between the two and maximizes the information carried per unit of food.
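One way to see the trade-off numerically; the rate bins, the energy budget of roughly 2 units of average firing, and the three distributions are all invented for the illustration, nothing here is recorded data. Holding the average firing rate (the food bill) fixed, the exponential, Boltzmann-like distribution carries the most bits.

```python
import numpy as np

def entropy_bits(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

rates = np.arange(0, 51)                      # firing-rate bins (arbitrary units)

# Three codes, all spending the same average rate (about 2 units per neuron):
dense = np.where(rates <= 4, 0.2, 0.0)        # spread evenly over the levels it can afford
sparse = np.where(rates == 0, 0.8, 0.0)       # usually silent ...
sparse[10] = 0.2                              # ... with a rare big burst
expo = (1.0 / 3.0) * (2.0 / 3.0) ** rates     # exponential ("Boltzmann-like")
expo /= expo.sum()

for name, p in [("dense", dense), ("sparse", sparse), ("exponential", expo)]:
    print(name, "mean rate:", round(float(np.sum(rates * p)), 2),
          "bits:", round(float(entropy_bits(p)), 2))
```

For the same average rate, the exponential code comes out with the most bits, which is the sense in which it is the compromise between dense and sparse coding.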
So what does the brain follow? Not too surprisingly, the Boltzmann-like exponential distribution, for the macaque and cat visual cortex. This is true on fine as well as gross time scales. Interestingly, this means you can calculate how efficient each neuron is at communicating information while minimizing connections. Indeed, V1 cells and IT cells seem to be operating at about 97% efficiency. Damn good, really.
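For a sense of what a number like 97% might mean, here is a hedged sketch of one way to score a cell; the spike-count histogram is made up, not a recorded neuron. The idea: compare the entropy of the observed firing-rate distribution with the entropy of a geometric (discretised exponential) distribution having the same mean rate, which is roughly the most any code with that food budget could carry.

```python
import numpy as np

def entropy_nats(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def coding_efficiency(rate_histogram):
    """Entropy of the observed firing-rate distribution, divided by the
    entropy of a geometric distribution with the same mean rate --
    roughly the most information a code with that mean rate could carry."""
    p = rate_histogram / rate_histogram.sum()
    rates = np.arange(len(p))
    mean = float(np.sum(rates * p))
    q = mean / (1.0 + mean)                   # geometric distribution with that mean
    best = (1.0 - q) * q ** rates
    best /= best.sum()
    return entropy_nats(p) / entropy_nats(best)

# Made-up spike-count histogram standing in for a recorded cell.
observed = np.array([70.0, 15.0, 7.0, 4.0, 2.0, 1.0, 1.0])
print(round(coding_efficiency(observed), 2))   # a number a little below 1
```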
Indeed, simply by saying that you want to represent everything you can, as efficiently as you can, you get networks that develop nearly every kind of cell found in the visual cortex, even color-opponent cells, just from watching naturalistic stimuli.
Conclusions
Neural networks are good computing machines. However, given recent work in physiology, we can make a few suggestions: they should be off most of the time, and they should have decay as well.
Comments to: ghollich@yahoo.com
Last Modified: Sep 20, 1999