An introduction to error back-propagation neural networks

Each hidden and output node is associated with an activation function, i.e. a differentiable function of the node's total input. Several functions can be used as activation functions, but the most common choice is the sigmoid function:
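Assuming the standard logistic form of the sigmoid, with $x$ standing for the total input of a node, it can be written together with its derivative as:

```latex
f(x) = \frac{1}{1 + e^{-x}}, \qquad f'(x) = f(x)\,\bigl(1 - f(x)\bigr)
```

The derivative depends only on the function value itself, which keeps the weight updates in the backward pass cheap to compute.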
Provided that the activation function of the hidden layer nodes is non-linear, an error back-propagation neural network with an adequate number of hidden nodes is able to approximate any continuous non-linear function. A neural network works at its best only if all its synaptic weights have been properly adjusted.

The error back-propagation algorithm is a way to compute these weights and involves four steps: (1) the network is initialized by assigning random values to the synaptic weights; (2) a training pattern is fed to the network and propagated forward to compute an output value for each output node; (3) the computed outputs are compared with the expected outputs; (4) a backward pass through the network is performed, changing the synaptic weights on the basis of the observed output errors. Steps 2 through 4 are iterated for each pattern in the training set. The network performance is then checked (usually on the basis of a mean squared error) and, if further optimization is needed, the training patterns are submitted to the network again (i.e. a new epoch is started).

In the case of neural networks with a single hidden layer, like the ones that were used for phytoplankton production modeling, the forward propagation step is carried out as follows:
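In a standard form of this step, each hidden node $k$ receives a weighted sum of the input layer outputs (the symbol $a_k$ for this sum is an assumption made here):

```latex
a_k = \sum_{j} w_{jk}\, i_j
```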
where $i_j$ are the outputs of the input layer (i.e. the network inputs and 1 for the bias node) and $w_{jk}$ are the weights of the connections between input and hidden layers. To compute the outputs of the hidden layer, these weighted sums are passed to the activation function, except for the bias node, which is forced to have an output equal to 1:
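In the notation assumed here, $v_k$ is the output of hidden node $k$, $f$ is the activation function, and index 0 is reserved for the bias node:

```latex
v_k =
\begin{cases}
  f\!\left(\sum_{j} w_{jk}\, i_j\right) & \text{if } k \ge 1 \\
  1 & \text{if } k = 0 \text{ (bias node)}
\end{cases}
```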
Then, the network outputs are computed in the same way:
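Writing $v_k$ for the hidden layer outputs (an assumed symbol, with the bias node contributing a constant 1) and $z_{kl}$ for the weights of the connections between hidden and output layers, a consistent form is:

```latex
o_l = f\!\left(\sum_{k} z_{kl}\, v_k\right)
```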
After the forward propagation, the estimated outputs $o_l$ are compared with the expected outputs $y_l$ and a mean quadratic error for the current pattern is computed as:
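A common convention, assumed here, uses a factor of 1/2 that cancels when the error is differentiated. Dividing by the number of output nodes instead would only rescale the learning rate:

```latex
E = \frac{1}{2} \sum_{l} \left(y_l - o_l\right)^2
```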
Then, in the back-propagation step, all the synaptic weights are adjusted in order to follow a gradient descent on the error surface. For the connections between hidden and output layers, the weights $z_{kl}$ are changed as:
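In the usual gradient descent form, with $\eta$ denoting the learning rate, $v_k$ an assumed symbol for the output of hidden node $k$, and the pattern error taken as $E = \tfrac{1}{2}\sum_l (y_l - o_l)^2$ (also an assumption):

```latex
z_{kl} \leftarrow z_{kl} + \Delta z_{kl}, \qquad
\Delta z_{kl} = -\eta\, \frac{\partial E}{\partial z_{kl}} = \eta\, \delta_l\, v_k
```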
where $\eta$ is a constant (the learning rate) and:
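For a sigmoid activation function and the half squared error assumed here, the error term $\delta_l$ of output node $l$ takes the form:

```latex
\delta_l = (y_l - o_l)\, f'\!\Bigl(\sum_{k} z_{kl}\, v_k\Bigr) = (y_l - o_l)\, o_l\, (1 - o_l)
```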
The weights $w_{jk}$ of the connections between input and hidden layers are also adjusted:
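With $\eta$ denoting the learning rate and $\delta_k$ an error term attached to hidden node $k$, the update has the same shape as the one used for the output layer weights:

```latex
w_{jk} \leftarrow w_{jk} + \Delta w_{jk}, \qquad
\Delta w_{jk} = \eta\, \delta_k\, i_j
```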
where:
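Applying the chain rule through a sigmoid hidden node, with $v_k$ as the assumed symbol for the output of hidden node $k$ and $\delta_l = (y_l - o_l)\, o_l\, (1 - o_l)$ as the error term of output node $l$, the error term of hidden node $k$ is:

```latex
\delta_k = v_k\, (1 - v_k) \sum_{l} \delta_l\, z_{kl}
```

The sum uses the values of $z_{kl}$ as they were before being updated, so that both weight layers follow the gradient of the same error.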
The network training is iterated until a given stopping condition is met. This usually involves minimizing the quadratic error, but other criteria can also be used. It should be stressed, however, that the weight adjustment process does not yield a unique optimized result, since several non-deterministic factors (e.g. different starting values of the synaptic weights) can affect the network training. Moreover, the gradient descent on the error surface may end in a local minimum rather than in the global one.

Error back-propagation neural networks are available in many commercial, shareware and public domain software packages. However, the Win95/NT executable and/or the FORTRAN source code of a very simple EBP NN implementation can be requested from the author.
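For readers who would rather experiment directly, the four-step procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the author's FORTRAN implementation: it assumes a logistic sigmoid, a half squared error per pattern, weights updated after every pattern, and initial weights drawn uniformly from [-0.5, 0.5]. The function names (`init_network`, `forward`, `train_pattern`, `train`) and the default values of `eta`, `max_epochs` and `tol` are hypothetical choices made for this example.

```python
import math
import random


def sigmoid(x):
    """Logistic activation function (assumed form of the sigmoid)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_in, n_hid, n_out, rng):
    """Step 1: random weights. Row 0 of each matrix belongs to the bias node."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_in + 1)]
    z = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_hid + 1)]
    return w, z


def forward(w, z, inputs):
    """Step 2: propagate one pattern forward; bias outputs are forced to 1."""
    i = [1.0] + list(inputs)
    v = [1.0] + [sigmoid(sum(w[j][k] * i[j] for j in range(len(i))))
                 for k in range(len(w[0]))]
    o = [sigmoid(sum(z[k][l] * v[k] for k in range(len(v))))
         for l in range(len(z[0]))]
    return i, v, o


def train_pattern(w, z, inputs, targets, eta):
    """Steps 3 and 4: compare outputs with targets, then adjust all weights."""
    i, v, o = forward(w, z, inputs)
    # Output-node error terms: (y_l - o_l) * o_l * (1 - o_l)
    d_out = [(y - ol) * ol * (1.0 - ol) for y, ol in zip(targets, o)]
    # Hidden-node error terms, computed before z is changed.
    # Hidden node k sits at position k + 1 because position 0 is the bias node.
    d_hid = [v[k + 1] * (1.0 - v[k + 1])
             * sum(d_out[l] * z[k + 1][l] for l in range(len(o)))
             for k in range(len(w[0]))]
    for k in range(len(v)):
        for l in range(len(o)):
            z[k][l] += eta * d_out[l] * v[k]
    for j in range(len(i)):
        for k in range(len(d_hid)):
            w[j][k] += eta * d_hid[k] * i[j]
    # Half squared error of this pattern, measured before the update.
    return 0.5 * sum((y - ol) ** 2 for y, ol in zip(targets, o))


def train(w, z, patterns, eta=0.5, max_epochs=2000, tol=1e-3):
    """Run epochs until the mean pattern error falls below tol."""
    err = float("inf")
    for _ in range(max_epochs):
        err = sum(train_pattern(w, z, x, y, eta) for x, y in patterns) / len(patterns)
        if err < tol:
            break
    return err


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are repeatable
    w, z = init_network(2, 3, 1, rng)
    # Toy training set: the logical OR of two inputs.
    data = [((0, 0), (0,)), ((0, 1), (1,)), ((1, 0), (1,)), ((1, 1), (1,))]
    print("final mean error:", train(w, z, data))
```

Changing the seed passed to `random.Random` changes the starting weights and, in general, the final weights as well, which is one of the non-deterministic factors mentioned above.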