Lecture 6: Nick Chater
Neural Networks and the Problem of Induction:
Implications for Development

OR

The Mystery of Neural Networks... NOT!

Careful study of neural networks reveals that the quantities they compute, and the way they function, are actually very similar to Bayesian inference.

Outline

Learning and the Problem of Induction

Learning involves deriving general rules from observations, just as science involves deriving hypotheses from data. This leads naturally to the characterization of children as tiny scientists.

Why Is Induction Hard?

Infinite numbers of hypotheses are consistent with any finite body of data, and these hypotheses make wildly diverging predictions. E.g., a series can be continued in ANY way: 1, 2, 3, 4, ... 6, 7, 8? 9, 0, 6? 3, 2, 1?
    Of course, you might argue that the first continuation is the only real hypothesis, but that is because you are one of the best inductive machines on the planet.
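
To make this concrete, here is a minimal Python sketch (my own toy illustration, not from the lecture). It builds a family of polynomial "hypotheses" that all reproduce the observed series 1, 2, 3, 4 exactly, yet predict completely different continuations; the constant c is an arbitrary free parameter invented for this example.

# Each hypothesis is a rule p(x) = x + c*(x-1)(x-2)(x-3)(x-4).
# For ANY value of c it reproduces the observed series at x = 1..4,
# but predicts a different continuation at x = 5.

def make_hypothesis(c):
    """Return a rule that agrees with 1, 2, 3, 4 at x = 1..4 for any c."""
    return lambda x: x + c * (x - 1) * (x - 2) * (x - 3) * (x - 4)

for c in [0, 1, -2, 0.5]:
    h = make_hypothesis(c)
    observed = [h(x) for x in (1, 2, 3, 4)]   # reproduces 1, 2, 3, 4
    prediction = h(5)                         # differs wildly with c
    print(f"c={c:>4}: fits data {observed}, but predicts next = {prediction}")

Since c can take any value, infinitely many rules fit the data perfectly while disagreeing completely about what comes next.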

How Is Learning Possible?

Clearly, learning requires a bias towards some hypotheses rather than others.

Naturally, biased learners only do well if they are biased in the right way. How do they come to have the right biases? Perhaps through evolution.

The Bayesian Approach

Induction is really a kind of uncertain reasoning.

IF    P(S) = degree of belief in S
THEN  P(S) should obey the laws of probability.

Probability theory is a general method of reasoning about uncertainty.
     In order to learn, we need to revise our probabilities of hypotheses in the light of our observations, D.

start from P(H) (the prior)
revise to  P(H|D)
           = P(H given the data)

So, how do we calculate the probability of a hypothesis given the data? Well, intuitively, the more implausible the data, the stronger the evidence it provides for a hypothesis that predicts it. Mathematically,

P(Hj given the data) = K * P(Data given Hj) * P(Hj)
P(Hj|D) = K * P(D|Hj) * P(Hj),   where K = 1/P(D) is just a normalizing constant.

Thus, the Bayesian system gives us a way of reasoning about the world: our revised belief in a hypothesis is in proportion to our prior belief in it and to how well it predicts the new evidence.
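
As a hedged illustration of the update rule P(Hj|D) = K * P(D|Hj) * P(Hj), here is a toy Python example (my own, not from the lecture): three hypotheses about a coin's bias are updated after observing a run of heads. The hypothesis names, priors, and data are made up for illustration.

# Priors P(Hj) and the probability of heads under each hypothesis.
priors  = {"fair (p=0.5)": 0.90, "biased heads (p=0.9)": 0.05, "biased tails (p=0.1)": 0.05}
p_heads = {"fair (p=0.5)": 0.5,  "biased heads (p=0.9)": 0.9,  "biased tails (p=0.1)": 0.1}

data = "HHHHH"  # five heads in a row

def likelihood(hyp, flips):
    """P(D|Hj): product of per-flip probabilities under hypothesis hyp."""
    prob = 1.0
    for flip in flips:
        prob *= p_heads[hyp] if flip == "H" else 1 - p_heads[hyp]
    return prob

# Bayes' rule: posterior is proportional to likelihood times prior.
unnormalized = {h: likelihood(h, data) * priors[h] for h in priors}
K = 1.0 / sum(unnormalized.values())          # the normalizing constant
posterior = {h: K * u for h, u in unnormalized.items()}

for h, p in posterior.items():
    print(f"P({h} | {data}) = {p:.3f}")

Notice how the strong prior on "fair" and the strong evidence for "biased heads" trade off against each other, exactly as the rule says.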

Neural Networks: A Bayesian Perspective

Neural networks don't learn by "magic" - they use the Bayesian approach to induction. HUH?

Thus backpropagation, completely in accord with Bayesian inference, revises its "hypotheses" (the weights) in proportion to the weights it held before (the prior) and to how well those weights account for the "data" (the error signal).

Note: this works best when the prior over the weights is Gaussian (i.e., the sum of squared weights is penalized, as in weight decay) and the structure of the network limits the hypothesis space. Of course, many of the problems we want to solve may not necessarily fit a Gaussian prior. In a sense, such problems violate our initial assumptions, just as in statistics.
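
A minimal sketch of the correspondence this note points to, under the standard assumption (not spelled out in these notes) that squared-error training with weight decay amounts to finding the most probable weights given a Gaussian prior on the weights (MAP estimation). The toy model, data, and hyperparameters below are invented for illustration.

# Toy model: one weight w, predictions y = w * x, roughly y = 2x in the data.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]   # (x, y) pairs
w = 0.0                                       # initial "hypothesis"
learning_rate = 0.01
decay = 0.1   # strength of the Gaussian prior: larger = stronger pull toward w = 0

for step in range(500):
    # Gradient of the error term sum (w*x - y)^2 -- the "data" / error signal.
    grad_error = sum(2 * (w * x - y) * x for x, y in data)
    # Gradient of the penalty term decay * w^2 -- the bias toward small weights.
    grad_prior = 2 * decay * w
    w -= learning_rate * (grad_error + grad_prior)

print(f"learned weight: {w:.3f}")  # close to 2, pulled slightly toward 0 by the prior

The weight-decay term plays exactly the role of the prior: it biases the learner toward small weights unless the error signal (the data) pushes strongly the other way.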

And so we see that connectionist learning is not mysterious, and really not new: it follows standard Bayesian inference, and it all depends on biases (e.g., the structure of the network and the priors on the weights).

Implications

1) Learning and Innateness: learning requires biases. Thus the tabula rasa learner cannot exist.

2) What is learned? Networks are not special in the kind of learning they employ (statistical), but in the hypotheses they use (non-linear, complex, not typical symbolic hypotheses).

3) A New Statistical Model of Mind: Like associationism, the Rescorla-Wagner model, Kelley's ANOVA model, and many others, neural nets are really just the latest installment in viewing the mind as a statistical engine.

WARNING! Real people don't always treat disconfirming evidence as disconfirming. Indeed, they often ignore such data. (Although some studies have shown that such ignoring can actually help in solving complex problems, so this positive bias isn't necessarily bad.)

 


 Comments to: ghollich@yahoo.com

 Last Modified: Sep 20, 1999