Livestock Research for Rural Development 23 (3) 2011


Prediction of Second Parity Milk Yield of Kenyan Holstein-Friesian Dairy Cows on First Parity Information Using Neural Network System and Multiple Linear Regression Methods

D M Njubi*, J W Wakhungu and M S Badamana

Department of Animal Production, University of Nairobi, P.O. Box 29053-00625,
Nairobi Kenya
* P.O Box 30623-00100 Nairobi Kenya, 0722 206 856


Abstract

Artificial neural networks (ANN) have been used for prediction in many fields of knowledge and are now being applied in agriculture. The objective of this study was to investigate the usefulness of ANN in the prediction of second parity 305-day milk yield (SLMY305) of Kenyan Holstein-Friesian dairy cows based on first parity information. The ANN was compared with the multiple linear regression (MLR) method.

From a total of 2808 first and second parity records of Holstein-Friesian cows, 1685 records were used to train a back propagation neural network and the rest were used for the validation and testing sets. The network architecture was optimized by testing several types of structures. Model efficiency and accuracy were measured using the root mean square error (RMSE) and the coefficient of determination (R2). The best ANN structure was a multilayer perceptron with an 8-1-1 architecture, with RMSE=682 and R2=0.86, using a tangent sigmoid transfer function for the hidden layer.

The correlation coefficients between the observed and the predicted SLMY305 for the two estimation methods were generally high (>0.80). The ANN had the higher figure, but the difference was not statistically significant (P>0.05).

The results illustrate the potential of ANNs in predicting SLMY305 in dairy cows. The implication is that dairy farmers could make selection decisions on prospective productive cows early, thereby increasing the genetic potential of dairy herds.

Keywords: artificial neural networks, 305-d lactation yields, prediction, selection decisions


Introduction

The growth of the dairy sector in Kenya has been heralded as a great success story, yet further gains in dairy production are constrained by a wide range of problems, among them inefficient breeding programs. One of the main challenges in the industry, therefore, is to improve the productivity of breeds through the design of efficient breeding programs.

Breeding programs are based primarily on milk yield, so it is imperative that milk yield is accurately measured or predicted. In designing breeding programs, the ultimate goal is to optimize selection and mating strategies under the best population structure. Selection is therefore a very important aspect of an effective animal breeding program and can affect farm profitability and eventually the national dairy industry. The accuracy of predicting second parity milk yield is thus extremely important for increasing selection intensity and, eventually, overall genetic gain.

Milk yield is affected by genetic and environmental factors and the interactions between them. The mathematical models that have been used in dairy science to predict milk yield have limitations: some consider only linear relations between the inputs and the desired output, and others assume an a priori relationship between the input and output variables that is not necessarily true.

An artificial neural network (ANN) is a form of simulated human central nervous system (Wildberger 1990; Adamczyk et al 2005) in which the models are nonlinear in nature. ANNs have been found to be tolerant of both the noise and the ambiguity in data resulting from environmental influences (Widrov et al 1994; Finn et al 1996).

The use of ANNs in agricultural applications has increased recently. For example, Peacock et al (2007) applied ANNs in plant protection and biosecurity, while Wade and Lacroix (1994) investigated the role of ANNs in animal breeding. Neural networks have also been found appropriate for dairy prediction problems (Widrov et al 1994; Finn et al 1996).

The results generated from the use of ANNs have been applied to design expert systems supporting decision-making processes in many sectors of agriculture. For example, Wen (2007) presented a knowledge-based intelligent e-commerce system for selling agricultural products, while Patel et al (1998) developed an expert system for egg sorting.

The wide applicability of ANNs stems from their flexibility and ability to model both linear and non-linear systems without prior knowledge of an empirical model.

The aim of the present study was to investigate the ability and accuracy of ANN in assessing and predicting second parity milk yield of Kenyan Holstein-Friesian dairy cows from first parity information, since selecting high producing cows as prospective producers and parents of the next generation is important. The ANN results were compared with those of multiple linear regression (MLR).

Materials and methods

The input data for the present study were collected from the Dairy Recording Society of Kenya (DRSK). Only Holstein-Friesian cattle data were used because this breed comprises a high proportion of the exotic dairy animals raised in Kenya. Large- and medium-scale farms rear approximately 24% of the Holstein-Friesian dairy cows in the country and produce most of the milk sold in the main urban centers (Ojango and Pollott 2001; Wakhungu 2001).

A total of 2808 records of Holstein-Friesian cows were available. Each record contained the following information: herd identification, individual cow identification, cow's date of birth, cow's calving dates, lactation milk yield (kg), lactation length (days), parity, sire and dam. The variables average daily milk yield, age at calving, season of calving, herd average milk yield and 305-day milk yield were derived.
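Two of the derived variables follow directly from the recorded fields, and a minimal sketch of that step is given below. The function name, field names and example values are hypothetical. The derivation of calving season, herd average and 305-day yield is not specified in the text and is therefore left out.

```python
from datetime import date


def derive_variables(birth: date, calving: date,
                     lactation_yield_kg: float,
                     lactation_length_days: int) -> dict:
    """Derive age at calving (days) and average daily milk yield (kg/day).

    Assumes one cow-lactation record; all names are illustrative only.
    """
    if lactation_length_days <= 0:
        raise ValueError("lactation length must be positive")
    return {
        "age_at_calving_days": (calving - birth).days,
        "avg_daily_yield_kg": lactation_yield_kg / lactation_length_days,
    }


# Made-up example record, not taken from the DRSK data.
example = derive_variables(date(2000, 1, 15), date(2002, 6, 1), 6100.0, 305)
print(example)
```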

The data were pre-processed to remove inconsistencies, e.g. animals without a known sire and dam. For the breeding value analysis the data were coded so that sires were coded lower than dams and dams lower than cows. A pedigree file was then created.

For prediction, a feed-forward artificial neural network (ANN) trained by back propagation was programmed in MATLAB (2002), while the multiple linear regression (MLR) analyses used SAS (2003). T-tests and correlations were employed to compare the observed values with the ANN and MLR predictions.
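The comparison of observed and predicted yields can be sketched as follows. The text does not state which form of t-test was used, so a paired t-test on the prediction differences is assumed here. The function names and yield values are hypothetical. The sketch returns the t statistic and degrees of freedom only, which would then be compared with a tabulated critical value.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(observed, predicted):
    """Paired t statistic and degrees of freedom for predicted minus observed."""
    diffs = [p - o for o, p in zip(observed, predicted)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Made-up 305-day yields (kg), for illustration only.
observed = [5200, 6100, 4800, 7000, 5600]
predicted = [5350, 5900, 4950, 6800, 5700]

r = pearson_r(observed, predicted)
t_stat, df = paired_t(observed, predicted)
print(r, t_stat, df)
```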

Neural network model construction

The back propagation training algorithm was employed to predict the second lactation 305-day milk yield (SLMY305). The multilayer perceptron (MLP) is a layered feed-forward network typically trained with the back propagation learning algorithm. MLPs have been proven to be universal approximators (Reed and Marks 1998), capable of approximating any given function through the use of various non-linear transfer functions.

Several training algorithms, viz., (i) the conjugate gradient descent algorithm, (ii) the Quasi-Newton algorithm, (iii) the Levenberg–Marquardt algorithm and (iv) online back propagation, along with various network architectural parameters, i.e., data partitioning strategy, initial synaptic weights, number of hidden layers, number of neurons in each hidden layer, activation functions, etc., were experimentally investigated to arrive at the best model for predicting the SLMY305. The tansig transfer function gave the best accuracy; this hyperbolic tangent function compresses a unit's net input into an activation value in the range [-1,1].
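The tangent sigmoid transfer function can be written out as a short sketch. The form 2/(1+exp(-2n))-1 is mathematically identical to the hyperbolic tangent. The Python function below is an illustration and not the MATLAB toolbox code.

```python
import math


def tansig(n: float) -> float:
    """Tangent sigmoid: maps a net input n to an activation in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0


# The formula agrees with math.tanh for moderate net inputs.
for n in (-5.0, -1.0, 0.0, 0.5, 5.0):
    print(n, tansig(n), math.tanh(n))
```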

The input layer of the model consisted of nodes corresponding to the following first parity variables of the dam: 305-day milk yield, average daily milk yield, herd average milk yield, age at calving in days, calving season and breeding value of the sire. The output layer (representing the variable being predicted) consisted of the node for the second parity milk yield of the same cow. These first parity inputs and the single second parity output were presented to a neural network with three layers (input, hidden and output) of 8, 20 and 1 neurons respectively. The model was constructed with the neural network toolbox (MATLAB 2002).
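A forward pass through a network of this shape can be sketched as below, assuming 8 inputs, 20 tanh hidden units and a single linear output unit. The linear output, the random initial weights and the example input vector are assumptions made for illustration; the text specifies only the layer sizes and the hidden-layer transfer function.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """One weight row per output unit; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward(x, hidden_w, output_w):
    """Hidden layer with tanh activation, then a linear output unit (assumed)."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row[:-1], x)) + row[-1])
              for row in hidden_w]
    return sum(w * h for w, h in zip(output_w[:-1], hidden)) + output_w[-1]


rng = random.Random(0)
N_INPUTS, N_HIDDEN = 8, 20
hidden_w = init_layer(N_INPUTS, N_HIDDEN, rng)
output_w = init_layer(N_HIDDEN, 1, rng)[0]

x = [0.1] * N_INPUTS  # placeholder for scaled first parity inputs
y_hat = forward(x, hidden_w, output_w)
print(y_hat)
```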

The network architecture was optimized by selecting the best number of hidden layers, nodes per layer and epochs. MLP with one hidden layer was shown to model the second lactation milk yield with the best accuracy.  It has been demonstrated that at most two hidden layers are sufficient to solve any problem (Haykin 1999).

To avoid overoptimistic and biased results, the data were randomly split, with 60%, 30% and 10% used for the training, validation and testing sets respectively. Root mean squared error (RMSE), the correlation coefficient and the coefficient of determination (R2) were among the performance measures used to assess the generalization performance of the feed-forward neural network.
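The random 60/30/10 partition can be sketched as follows. The seed and the function name are hypothetical, and the rounding rule is an assumption. With 2808 records this rounding gives 1685 training records, which agrees with the count reported in the abstract.

```python
import random


def split_indices(n_records, train_frac=0.6, val_frac=0.3, seed=42):
    """Shuffle record indices and cut them into train, validation and test."""
    idx = list(range(n_records))
    random.Random(seed).shuffle(idx)
    n_train = round(train_frac * n_records)
    n_val = round(val_frac * n_records)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])  # remainder forms the test set


train_idx, val_idx, test_idx = split_indices(2808)
print(len(train_idx), len(val_idx), len(test_idx))
```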

Results and discussion

The architecture [8-1-1] had the best fitness (Table 1). Multilayer perceptrons (MLP) with a varying number of hidden layers and a number of nodes per layer ranging from 1 to 20 were analyzed. The MLP with one hidden layer and 8 nodes had the highest prediction accuracy. Beyond one hidden layer there could be over-fitting, resulting in poor generalization (Chan et al 2006); furthermore, it has been demonstrated that at most two hidden layers are sufficient to solve any problem (Haykin 1999).

Training was limited to 2000 epochs, but in most cases there was no significant improvement in the mean square error after 500 epochs. The training set size was set at 25%, 50%, 75% and 90% of the sample set. The best MLP had one hidden layer, eight nodes and a 50% training set size (Figure 1). Selection of a proper sample size and randomization are very important for training an ANN. The correlation between the predicted and the observed SLMY305 was 0.857, while that of the test set was 0.937, which shows that the ANN was trained well. The absolute error (AE) of the training data decreased and stabilised at 500 epochs; therefore at most 500 epochs can be used for future models.
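The idea of capping training at a maximum number of epochs and stopping once the error no longer improves can be sketched as below. This is plain online back propagation in pure Python on a made-up one-input toy problem. It is not the MATLAB toolbox routine or the conjugate gradient algorithm reported here, and the learning rate, tolerance and toy data are hypothetical.

```python
import math
import random


def train_mlp(samples, n_hidden=8, max_epochs=2000, lr=0.05, tol=1e-6, seed=0):
    """Train a one-hidden-layer MLP (tanh hidden, linear output) online.

    Returns the mean squared error recorded at each epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
          for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    history = []
    for epoch in range(max_epochs):
        sq_err = 0.0
        for x, target in samples:
            xb = list(x) + [1.0]  # append bias input
            h = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in w1]
            hb = h + [1.0]
            y = sum(w * v for w, v in zip(w2, hb))
            err = y - target
            sq_err += err * err
            # Backward pass: gradients of 0.5 * err**2, using pre-update w2.
            delta_h = [err * w2[j] * (1.0 - h[j] ** 2) for j in range(n_hidden)]
            for j in range(n_hidden + 1):
                w2[j] -= lr * err * hb[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w1[j][i] -= lr * delta_h[j] * xb[i]
        history.append(sq_err / len(samples))
        # Stop early once the epoch-to-epoch change in MSE is negligible.
        if epoch > 0 and abs(history[-2] - history[-1]) < tol:
            break
    return history


# Toy data: a smooth one-dimensional target, for illustration only.
samples = [([i / 10.0], 0.2 * (i / 10.0) + 0.5 * (i / 10.0) ** 2)
           for i in range(-10, 11)]
history = train_mlp(samples)
print(len(history), history[0], history[-1])
```

Plotting `history` against the epoch number would give an error curve of the kind described for Figure 1.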

Of the several network algorithms investigated, the conjugate gradient descent algorithm gave the best results for the correlation and the R2. The predicted second parity 305-day milk yield (p_i) and the observed milk yield (o_i) were not significantly different (P>0.01), as measured by relative error (RE), for ANN and MLR, with R2 and correlation coefficients of 0.87 and 0.871 respectively.

RE = (p_i - o_i) / p_i
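The relative error above, together with RMSE and R2, can be computed as in the following sketch. R2 is taken here as 1 - SS_res/SS_tot, which is an assumption because the text does not define it. The function names and yield values are hypothetical.

```python
import math


def relative_errors(pred, obs):
    """Relative error per record: (predicted - observed) / predicted."""
    return [(p - o) / p for p, o in zip(pred, obs)]


def rmse(pred, obs):
    """Root mean square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def r_squared(pred, obs):
    """Coefficient of determination, assumed as 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for p, o in zip(pred, obs))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Made-up 305-day yields (kg), for illustration only.
obs = [5000.0, 6000.0, 7000.0]
pred = [5100.0, 5900.0, 7000.0]

re_values = relative_errors(pred, obs)
print(re_values, rmse(pred, obs), r_squared(pred, obs))
```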

Table 1. The fitness for verified architecture

[Table values were lost in extraction; surviving column headings: Train Error, Validation Error, Test Error.]

Figure 1 AE of training with the best MLP with one hidden layer (H=8, training set size=50%)

This is further reinforced by the high correlation for the test set for ANN and MLR, which was over 0.85. The results of the ANN study were compared with those of the MLR model (Tables 2 and 3) and there was no significant difference (P>0.05). The implication is that both ANN and MLR are reliable for predicting the milk yield trait. Figures 2 and 3 plot the values predicted by the ANN and by the MLR model, respectively, against the observed second parity 305-day milk yield. The high correlation (0.859) from a random third of the data shows that ANNs are reliable as predictors. Similarly, the correlation for the MLR was high (0.835).

Table 2. Differences between the actual values of 305d milk yield trait and predicted by ANN and MLR for the test set

[Table values were lost in extraction; the only surviving row label is "Standard deviation".]

Table 3. The correlation coefficient and root mean square error between the actual values of 305d milk yield trait and predicted by ANN and MLR for the test set

Figure 2: Regression of ANN predicted on the observed data for SLMY305 (R=0.86)

Figure 3: Regression of MLR predicted on the observed data for SLMY305 (R=0.84)

The results indicate that the proposed ANN can be used successfully for the prediction of second parity 305-day milk yield in dairy cows. These results corroborate the findings of Hosseinia et al (2007). If breeding organizations and breeders in Kenya work together to accurately select top females and then actively use modern breeding tools, the chances of retaining competitiveness locally and in the global dairy breeding industry can be drastically improved.



Acknowledgement

Appreciation is expressed to the Dairy Recording Society of Kenya for providing data.


References

Adamczyk K, Molenda K, Szarek J and Skrzynski G 2005 Prediction of bulls' slaughter value from growth data using artificial neural network. Journal of Central European Agriculture 6:133-142.


Chan Z S H, Ngan H W, Rad A B, David A K and Kasabov N 2006 Short-term ANN load forecasting from limited data using generalization learning strategies. Neurocomputing 70:409-419.


Finn G D, Lister R, Szabo R, Simonetta D, Mulder H and Young R 1996 Neural Networks applied to a large biological database to analyse dairy industry pattern. Neural Computing and Applications 4:237-253.


Haykin S 1999 Neural Networks: A Comprehensive Foundation. Prentice Hall.


Hosseinia P, Edrisi M, Edrisi M A and Nilforooshan M A 2007 Prediction of second parity milk yield and fat percentage of dairy cows based on first parity information using neural network system. Journal of Applied Science 7:3274-3279.


MATLAB 2002  Matlab 6.5 (Release 13), The Language of Technical Computing, The MathWorks, Natick, Mass, USA.


Ojango J M K and Pollott G E 2001 Genetics of milk yield and fertility traits in Holstein-Friesian cattle on large-scale Kenyan farms. Journal of Animal Science 79:1742-1750.


Patel V C, McClendon R W and Goodrum J W 1998 Development and evaluation of an expert system for egg sorting. Computers and Electronics in Agriculture 20: 97-116.


Peacock L, Worner S and Pitt J 2007 The application of artificial neural networks in plant protection. Bulletin OEPP/EPPO Bulletin 37:277-282.


Reed R D and Marks R J 1998 Neural smithing: Supervised learning in feedforward artificial neural networks. Cambridge, MA: MIT Press.


SAS 2003 Procedures guide for personal computers (version 9.1 edition). SAS Institute Inc, Cary, NC, USA.


Wade K M and Lacroix R 1994 The role of artificial neural networks in animal breeding. Proceedings of the 5th World Congress on Genetics Applied to Livestock Production, Guelph, Canada, Volume 22, pp 31-34.


Wakhungu J W 2001 Dairy cattle breeding policy for Kenyan smallholders: an evaluation based on a demographic stationary state productivity model. PhD Thesis, University of Nairobi, Kenya.

Wen W 2007 A knowledge-based intelligent electronic commerce system for selling agricultural products. Computers and Electronics in Agriculture 57: 33-46.


Widrov B, Rundhart D E and Lehr M A 1994 Neural networks: applications in industry, business and science. Communications of the ACM 37:93-105.


Wildberger A M 1990 Neural networks as a modelling tool. AI and Simulation: Theory and Application, San Diego, California, pp 65-74.

Received 20 July 2010; Accepted 15 November 2010; Published 6 March 2011
