Neural networks: a new approach to phytoplankton production modeling (1997).



Michele Scardi

Stazione Zoologica “A. Dohrn” di Napoli, Villa Comunale, 80121 Napoli, Italy

 


Why neural networks?

Phytoplankton production models are very important tools in oceanographic research, mainly because direct production measurements are expensive, time-consuming and difficult to carry out on a routine basis. They are also necessary to exploit remote sensing and other instrumental estimates of phytoplankton biomass and photosynthetic efficiency (e.g. by pump and probe fluorometers).
Several phytoplankton production models have been developed during the last three decades and some have provided useful results (see Behrenfeld and Falkowski, 1996, for an up-to-date review). Empirical models are probably the most important subset of these models, since they provide reasonably accurate production estimates on the basis of widely available predictive variables (e.g. irradiance, biomass, etc.) that are linked to primary production by direct causal relationships.
However, oceanographic data sets are growing larger, so new and more effective approaches are needed. Neural networks have recently been applied to phytoplankton production modeling (Scardi, 1996) as well as to other ecological problems (e.g. Lek et al., 1996). They are a powerful tool for empirical modeling of complex processes, since they can accurately reproduce any non-linear relationship, even those that are unknown or not fully understood, provided that enough representative data are available.
 

What is a neural network?

Neural networks (NN), more properly referred to as "artificial" neural networks (ANNs), are a very large and diverse set of processing devices. Their design mimics, in a simplified way, the neuronal structure of the mammalian cerebral cortex and most of them have some sort of "training" rule that makes them "learn" from examples.
The most common NNs are those that use the Error Back-Propagation (EBP) algorithm (Rumelhart et al., 1986) for their training. A typical EBP NN is shown in Fig. 1. It consists of several layers of nodes that are somewhat analogous to neurons: an input layer (i), one or more hidden layers (h) and an output layer (o). The example in Fig. 1 is a 3-5-1 NN, since it has 3 input nodes, 5 hidden nodes and 1 output node.

Each node receives its input from the outputs of the nodes in the previous layer or from the network input. The connections between nodes are associated with weights (W, Z) that are adjusted by the EBP training procedure. The input and hidden layers also include a bias node with a constant output (usually 1), which plays the same role as the constant term in a multiple regression. Each hidden and output node is associated with an activation function, i.e. a differentiable function of the node's total input. Even though several functions can be used, the most common is the sigmoid function: f(a)=1/[1+exp(-a)]. If the activation function of the hidden layer nodes is non-linear and an adequate number of nodes is available, an EBP NN can approximate any non-linear function.
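The forward pass described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the poster: the function names, the assignment of W to the input-to-hidden connections and of Z to the hidden-to-output connections, and the weight values are all assumptions.

```python
import math

def sigmoid(a):
    """Sigmoid activation: f(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, W, Z):
    """Forward pass of a network with one hidden layer and one output node.

    x: input values (length n_in)
    W: input-to-hidden weights, one row per hidden node; each row has
       n_in + 1 entries, the last one multiplying the bias node (output 1)
    Z: hidden-to-output weights, n_hidden + 1 entries (last one is the bias)
    """
    x_b = list(x) + [1.0]                      # append the bias node output
    h = [sigmoid(sum(w * v for w, v in zip(row, x_b))) for row in W]
    h_b = h + [1.0]                            # bias node of the hidden layer
    return sigmoid(sum(z * v for z, v in zip(Z, h_b)))

# A 3-5-1 network with arbitrary (hypothetical, untrained) weights
W = [[0.1 * (j + 1) - 0.05 * (i + 1) for j in range(4)] for i in range(5)]
Z = [0.3, -0.2, 0.1, 0.4, -0.5, 0.05]
y = forward([0.2, 0.5, 0.9], W, Z)
print(y)
```

Because the output node also uses the sigmoid function, the output always lies between 0 and 1, so the target variable has to be scaled accordingly before training.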
A good introduction to neural networks can be found in Abdi (1994), but check also the URL below the title of this poster.
 

A simple model

A simple phytoplankton production (PP) model was set up using three predictive variables that are easily available from remote sensing: Sea Surface Temperature (SST), surface irradiance (I0) and surface chlorophyll concentration (CHL0). The model was based on a 3-4-1 EBP NN and was calibrated (i.e. trained) using a small data set that was extracted from the OPPWG database (see ftp://warrior.das.bnl.gov/pub/Database/Database2.html).
This data set includes 97 PP measurements that were carried out in the western Mediterranean during 3 different spring cruises. The scope of the resulting model is somewhat limited, but it provides a good example of the way a NN works in comparison with other empirical models calibrated on the same data set: a very simple linear model and the Vertically Generalized Production Model (VGPM) developed by Behrenfeld & Falkowski (1997). The linear model is based on a composite predictive variable (SST·I0·CHL0) and its intercept is set to zero. The VGPM is much more complex and is based on a set of empirical relationships that take into account several predictive variables, some of which have to be indirectly assessed from available data. Since VGPM was developed from MARMAP data (NW Atlantic), a linear correction was applied to improve its fit to Mediterranean data.
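As an illustration of the simplest of the three models, a linear model with a zero intercept on the composite variable SST·I0·CHL0 can be fitted by ordinary least squares as sketched below. The function name and the data values are hypothetical, and the sketch does not reproduce any transformation that may have been applied to the actual data.

```python
def fit_through_origin(x, y):
    """Least-squares slope b of y = b * x, with the intercept fixed at zero."""
    sxx = sum(v * v for v in x)
    if sxx == 0:
        raise ValueError("the predictor is identically zero")
    return sum(u * v for u, v in zip(x, y)) / sxx

# Made-up values, for illustration only (not taken from the OPPWG data)
sst = [14.0, 15.5, 17.0, 16.2]
i0 = [30.0, 40.0, 50.0, 45.0]
chl0 = [0.50, 0.30, 0.20, 0.25]
pp = [420.0, 390.0, 350.0, 380.0]

# Composite predictive variable SST * I0 * CHL0
composite = [s * i * c for s, i, c in zip(sst, i0, chl0)]
slope = fit_through_origin(composite, pp)
pp_hat = [slope * v for v in composite]
print(slope, pp_hat)
```

With a single coefficient and no intercept, this model can only scale the composite variable, which is why it serves here as a baseline for the other two models.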

The color-coded outputs of the models are compared with observed PP values in Fig. 2. The 3-4-1 NN (red diamonds) provides the best results in terms of overall goodness of fit and error distribution (the latter does not differ from a normal one according to a Shapiro-Wilk test, W = 0.968, p < 0.11). Both the VGPM (blue squares) and the linear model (green triangles) tend to underestimate PP in the 100-1000 mg C m⁻² day⁻¹ range and their error distributions are less symmetrical and leptokurtic than the NN one, even though in the VGPM case the hypothesis of normality could not be rejected. The mean square error (MSE) of the log-transformed 3-4-1 NN outputs (0.059) is much lower than the MSEs of the VGPM (0.130) and of the linear model (0.239).
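The goodness-of-fit measure used above can be computed as in the following sketch. It assumes that both observed and estimated PP values are log-transformed before the errors are squared and that base-10 logarithms are used. Neither detail is stated in the text, and the function name is hypothetical.

```python
import math

def log_mse(observed, predicted):
    """Mean square error between log10-transformed observed and predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(
        (math.log10(o) - math.log10(p)) ** 2
        for o, p in zip(observed, predicted)
    ) / len(observed)

# Made-up PP values (mg C per square meter per day), for illustration only
observed = [120.0, 450.0, 800.0]
predicted = [150.0, 400.0, 700.0]
print(log_mse(observed, predicted))
```

Working on log-transformed values means that an error is measured as a ratio rather than as a difference, so an overestimate by a factor of two counts the same at 100 and at 1000 mg C m⁻² day⁻¹.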

Even though a NN can fit observed data better than other models do, its real power is best seen in its response to different combinations of input variable values.
The surfaces in Fig. 3 represent the PP estimates provided by the different models given varying I0 and CHL0 values and SST=18.94 °C (i.e. the mean June SST at 43°N, 8°E). It can be clearly seen that the NN surface is more complex and "feature-rich" than the linear model and the VGPM surfaces are. Of course, it does not meet some of the theoretical constraints that the other models meet (e.g. PP is not null when I0=0), but under real world conditions this is hardly a problem.
NNs are cast in the mould of their data, so they tend to retain all the features that are found in their training data sets. However, especially when small training data sets are used, some generalization is needed. The surface in the lower right corner of Fig. 3 shows the output of an overtrained NN, i.e. of a NN that acts as a memory rather than as a model of a process. In this case the NN response is much more complex and accurately "maps" a subset of the training data, even though it does not make sense from a more general point of view. Of course, overtraining can be easily avoided by means of different techniques: limiting the number of training cycles (early stopping), adding white noise to the training patterns, etc.
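The two techniques mentioned above can be sketched as follows. This is a generic illustration rather than the procedure used for the models shown here. It assumes that the error is monitored on a held-out validation set, that the noise is Gaussian, and that training stops after a fixed number of cycles without improvement (the `patience` parameter). All names are hypothetical.

```python
import random

def add_white_noise(pattern, sd, rng):
    """Return a copy of a training pattern with Gaussian noise added to each input."""
    return [v + rng.gauss(0.0, sd) for v in pattern]

def early_stopping(train_one_cycle, validation_error, max_cycles, patience):
    """Train until the validation error has not improved for `patience` cycles.

    Returns (best_cycle, best_error). In a real application the weights
    would also be saved at the best cycle and restored at the end.
    """
    best_cycle, best_error = 0, validation_error()
    for cycle in range(1, max_cycles + 1):
        train_one_cycle()
        err = validation_error()
        if err < best_error:
            best_cycle, best_error = cycle, err
        elif cycle - best_cycle >= patience:
            break
    return best_cycle, best_error

# Simulated validation error curve: it decreases, then rises as overtraining sets in
curve = [1.0, 0.8, 0.6, 0.5, 0.55, 0.6, 0.7, 0.9]
state = {"cycle": 0}

def fake_training_cycle():
    state["cycle"] += 1

def fake_validation_error():
    return curve[min(state["cycle"], len(curve) - 1)]

result = early_stopping(fake_training_cycle, fake_validation_error,
                        max_cycles=100, patience=2)
print(result)

rng = random.Random(42)
print(add_white_noise([18.94, 40.0, 0.3], sd=0.05, rng=rng))
```

In this simulated run the validation error is lowest after the third cycle, so training stops two cycles later and the third cycle is reported as the best one.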

Finally, an application of the 3-4-1 NN model is shown in Fig. 4, where the mean June PP was assessed for the whole Mediterranean Sea. Since the NN model was trained on Western Mediterranean data only and CZCS data were used as model input, this map has to be regarded as an example rather than as an accurate PP estimate.
 

Sensitivity analysis

A better understanding of the role of each input variable in a model can be achieved by means of sensitivity analysis. This is much more interesting when NNs are used, since, as already shown, they produce a more complex output than other models do.

Since SST, I0 and CHL0 values are independent of each other under real world conditions, three very simple sensitivity tests were carried out. In each test one of the input variables was allowed to vary randomly within a [-50%,+50%] range about the observed value, and 1000 input patterns were drawn by resampling the training data set. The results of the sensitivity analysis are shown in Fig. 5. The random variations in I0 had the largest impact on MSE (+30.17%). CHL0 follows closely with a +22.85% variation in MSE, whereas SST induced only minor changes in MSE (+4.11%). This result provides a good approximation of the relative importance of these variables in determining PP under the observed conditions.
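A sensitivity test of this kind can be sketched as follows. The sketch assumes that patterns are resampled with replacement, that the perturbation is uniformly distributed within the ±50% range, and that the perturbed MSE is compared with the MSE of the same resampled patterns left unperturbed. The toy model and the data are hypothetical stand-ins for the trained NN and its training set.

```python
import random

def mse(model, patterns, targets):
    """Mean square error of a model over a list of input patterns."""
    return sum((model(p) - t) ** 2 for p, t in zip(patterns, targets)) / len(patterns)

def sensitivity(model, patterns, targets, index, n=1000, spread=0.5, seed=0):
    """Percent change in MSE when input `index` is perturbed within +/- spread."""
    rng = random.Random(seed)
    picks = [rng.randrange(len(patterns)) for _ in range(n)]   # resampling
    base_p = [patterns[i] for i in picks]
    base_t = [targets[i] for i in picks]
    perturbed = []
    for p in base_p:
        q = list(p)
        q[index] *= 1.0 + rng.uniform(-spread, spread)          # e.g. -50% .. +50%
        perturbed.append(q)
    base = mse(model, base_p, base_t)
    if base == 0:
        raise ValueError("baseline MSE is zero; percent change is undefined")
    return 100.0 * (mse(model, perturbed, base_t) - base) / base

# Toy stand-in for a trained model: it uses inputs 0 and 1 and ignores input 2
def toy_model(p):
    return 2.0 * p[0] + p[1]

data_rng = random.Random(1)
patterns = [[data_rng.uniform(1.0, 2.0) for _ in range(3)] for _ in range(50)]
targets = [toy_model(p) + (0.1 if k % 2 else -0.1) for k, p in enumerate(patterns)]

for i in range(3):
    print(i, sensitivity(toy_model, patterns, targets, index=i))
```

An input that the model ignores leaves the MSE unchanged, whereas perturbing an input with a strong effect on the output inflates it. This is the sense in which the percent change in MSE ranks the input variables.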
 

A more complex model


A PP model that is based on SST, I0 and CHL0 can hardly handle large-scale (let alone global) systems. As shown in the left plot in Fig. 6, these input variables were not sufficient to adequately train a NN when a large data set, including about 3000 observations from many different geographical regions, was used. Therefore additional information was needed to model the relationships between predictive variables and PP. Photoperiod, latitude, longitude and Julian date (the latter two variables mapped into a circular reference system by means of sine and cosine transformations) provided such information. The much improved results of an “upgraded” 9-9-1 NN model are shown in the right plot in Fig. 6 (this model is currently participating in the Primary Productivity Algorithm Round Robin, which will select a “consensus” algorithm for SeaWiFS data processing).
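The sine and cosine transformations mentioned above can be sketched as follows. Each circular variable is replaced by two inputs, which brings the total to nine: SST, I0, CHL0, photoperiod, latitude, two values for longitude and two for Julian date. The function names, the order of the inputs, the 365.25-day period and the example values are assumptions made for illustration.

```python
import math

def circular(value, period):
    """Map a periodic variable onto the unit circle as a (sine, cosine) pair."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

def build_inputs(sst, i0, chl0, photoperiod, lat, lon, julian_day):
    """Assemble the nine inputs of a 9-9-1 network (the order is an assumption)."""
    lon_sin, lon_cos = circular(lon, 360.0)            # longitude in degrees
    day_sin, day_cos = circular(julian_day, 365.25)    # day of the year
    return [sst, i0, chl0, photoperiod, lat, lon_sin, lon_cos, day_sin, day_cos]

# Made-up example values for a station at 43°N, 8°E in mid June
inputs = build_inputs(sst=18.94, i0=45.0, chl0=0.3, photoperiod=15.4,
                      lat=43.0, lon=8.0, julian_day=166)
print(inputs)

# Values at the two ends of the scale end up next to each other
print(circular(-180.0, 360.0), circular(180.0, 360.0))
print(circular(1, 365.25), circular(365, 365.25))
```

Without this mapping, 180°W and 180°E (or December 31 and January 1) would lie at opposite ends of the input range even though they are adjacent in reality.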
 

Discussion

NN models of phytoplankton production are intrinsically more effective than other empirical models because NNs are powerful computational engines. Like other empirical models, of course, NN models are only as good as the data set on which they are based, but the capability of incorporating information which is usually difficult to manage (e.g. binary or nominal data, geographical coordinates, etc.) gives them a significant edge over conventional models.
NNs will certainly be used more and more frequently in oceanographic research, since they have also provided useful results in other oceanographic applications (Scardi et al., in prep.) and their computational requirements are now easily met by most PCs.
 

References

  1. Abdi H. (1994). A neural network primer. Journal of Biological Systems, 2 (3): 247-281
  2. Behrenfeld M.J. & Falkowski P.G. (1996). A consumer's guide to phytoplankton primary productivity models [on line manuscript]. Available from Internet: <ftp://warrior.das.bnl.gov/pub/Reports/ps_files/paper2.ps> [14/11/97]
  3. Behrenfeld M.J. & Falkowski P.G. (1997). Photosynthetic rates derived from satellite-based chlorophyll concentration. Limnology & Oceanography, 42(1): 1-20
  4. Lek S., Delacoste M., Baran P., Dimopoulos I., Lauga J. & Aulagnier S. (1996). Application of neural networks to modelling nonlinear relationships in ecology. Ecological Modelling, 90(1): 39-52
  5. Rumelhart D.E., Hinton G.E. & Williams R.J. (1986). Learning representations by back-propagating errors. Nature 323: 533-536
  6. Scardi M. (1996). Artificial neural networks as empirical models of phytoplankton production. Marine Ecology Progress Series, 139: 289-299
  7. Scardi M., Conversano F. & Ribera d'Alcalà M., in prep. A new method for calibration-validation of oxygen data collected with CTD oxygen probes.
