Livestock Research for Rural Development 19 (7) 2007


Identification of production systems and assessment of heterogeneity of variance components for Holstein-Friesian cattle in the tropics

T K Muasya, T M Magothe*, **, E D Ilatsia** and A K Kahi**

National Animal Husbandry Research Centre, Kenya Agricultural Research Institute, PO Box 25, 20117 Naivasha, Kenya
*Livestock Recording Centre, Ministry of Livestock and Fisheries Development, PO Box 257, 20117 Naivasha, Kenya
**Animal Breeding and Genetics Group, Department of Animal Sciences, Egerton University, PO Box 536, 20107 Njoro, Kenya

Abstract


Milk yield data (n=120307) from 116 Holstein-Friesian herds were used to group herds into clusters and to characterize the production environments in Kenya genetically. Herds were clustered on the herd mean and the herd standard deviation of 305-day milk yield. Variance components for the clusters were estimated with univariate animal models using a derivative-free REML algorithm, and significance tests were done using the Fmax procedure.

Based on the descriptive variables, three production environments or clusters were identified. Phenotypic, additive genetic and residual variances varied across production levels: 1134608.1, 1513952, and 827057; 4144955, 503934, and 122837; and 1134608, 918189, and 661768, respectively, for herd production environments 1, 2 and 3. The heritability estimates were 0.23 ± 0.04, 0.33 ± 0.04 and 0.14 ± 0.03, respectively. Differences in production environments are important and cause heterogeneous variances, which should be accounted for in the genetic evaluation of Holstein-Friesian cattle in Kenya.

Key words: clusters, heterogeneity, Holstein-Friesian, milk yield, production environments, variance components

Introduction


The dairy industry in Kenya is based on exotic breeds and their crosses with indigenous breeds. Bos taurus dairy cattle breeds such as Holstein-Friesian, Ayrshire, Guernsey, Jersey and crosses among themselves, and with Sahiwal or the East African Zebu are found in various agro-ecological zones where they are raised in different production systems. Over 76% of dairy cattle are raised under the smallholder production system while the rest are raised in production systems found on medium and large-scale farms (Peeler and Omore 1997). Smallholder production systems predominate where land sizes are small, while medium and large scale farms are common where land is not limiting.

The breeding programme is oriented towards increasing milk yield, and both locally bred sires and semen from foreign bulls are used (Ojango 2000; Bebe et al 2002). In the genetic evaluation of locally bred sires, herds are fitted as fixed effects (Ojango 2000; Olukoye and Mosi 2002; Magothe et al 2006). Given the range of production systems and the large number of herds, the use of multiple-trait models becomes computationally infeasible. Inadequate genetic ties between herds can lead to erroneous covariance estimates. Variances of milk yield vary with the level of management and environment (Costa et al 2000) due to genotype by environment correlation and methods of feeding concentrates (Brotherstone and Hill 1986). Bias arises in genetic evaluations from differences in variation within herds, and may become more severe as intensity of selection increases (Vinson 1987). Where different production environments or herds exist (Olukoye and Mosi 2002), herds can be grouped according to management, climatic and genetic information (Naya et al 2002; Weigel and Rekaya 2000) to increase the precision of genetic parameter estimates in structural covariance models. Such clustering of herds can support borderless evaluations, or even evaluations specific to each production system, environment or herd (Weigel and Rekaya 2000; Lohuis and Dekkers 1998).

Dairy cattle evaluation using Best Linear Unbiased Prediction (BLUP) requires appropriate variance components to provide solutions. Use of BLUP assumes that genetic and environmental variances are independent of the mean, that they are homogeneous across herds or environments, and that the genetic correlation between genetic values in different environmental variance groups is unity (Meyer 1998). Heteroscedasticity across production environments (Olukoye and Mosi 2002; Costa et al 2000; See 1998; Visscher et al 1991) reduces the accuracy of predicted breeding values relative to the population mean (De Mattos et al 2000; Verrier et al 1993) and can lead to favouring high performers from more variable herds over high performers from low-variance herds, causing a reduction in response to selection (Hill 1984). These biases in evaluations accumulate over time as dams and daughters tend to express records in the same herds or environments (Vinson 1987).

Non-genetic factors are important causes of heterogeneity of variance at the phenotypic level (Olukoye and Mosi 2002) in the Holstein-Friesian population in Kenya. Response to selection is a function of selection intensity, heritability and phenotypic standard deviation (Falconer 1989), and therefore genetic variances should be investigated for heteroscedasticity. Dairy cattle evaluation using BLUP is just being implemented in Kenya (Magothe et al 2006), and the usefulness of the evaluations will depend on how well the assumptions of homogeneity of variance components match the data. The objective of the study was to determine whether evidence exists for heterogeneity of variance of milk yield in the Holstein-Friesian population in Kenya using cluster analyses.
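The dependence of selection response on heritability and phenotypic standard deviation can be illustrated with the standard relationship R = i × h² × σp. The sketch below is only an illustration: the heritabilities and phenotypic variances are the values quoted in the abstract for production environments 2 and 3, while the selection intensity of 1.0 and the function name are assumptions made for the example.

```python
import math

def response_to_selection(intensity, heritability, phenotypic_variance):
    """Expected response per generation: R = i * h^2 * sigma_p."""
    return intensity * heritability * math.sqrt(phenotypic_variance)

# (heritability, phenotypic variance in kg^2) as quoted in the abstract
clusters = {
    "cluster 2": (0.33, 1513952.0),
    "cluster 3": (0.14, 827057.0),
}
intensity = 1.0  # hypothetical selection intensity, not from the paper

for name, (h2, var_p) in clusters.items():
    r = response_to_selection(intensity, h2, var_p)
    print(f"{name}: expected response about {r:.0f} kg per generation")
```

With the same selection intensity, the environment with the higher heritability and larger phenotypic variance is expected to respond more, which is why pooling such environments under a single set of variance components can mislead selection decisions.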

Materials and methods

305-day milk yield data were obtained from the Dairy Recording Services of Kenya on herds participating in performance recording in Kenya. The data consisted of cows that calved between 1985 and 2005 and had completed the current lactation by the time of the analysis. Information in the data included the pedigree of each cow, season and year of calving, parity and herd.

Two variables were used to identify production environments. Mean 305-day milk yield (LMYD) and average standard deviation (SDLMYD) for each herd provided information about intensity of management on each farm.

The variables were defined as follows:

Standard deviation of milk

This variable provides a measure of production intensity in each herd. It assumes that more effective management elicits greater performance variability within a herd (Dong and Mao 1990; Naya et al 2002; Raffrenato et al 2003).

Average herd 305-day milk yield

This variable provides a measure of the intensity of feeding and general management. Where cows had dried off earlier or had not finished lactating, lactations were extended to 305-day yield equivalents using Wood's gamma function (Muasya 2005).
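Wood's gamma function describes daily yield on day t of lactation as y(t) = a × t^b × exp(−c × t). A minimal sketch of how a fitted curve could be summed to a 305-day equivalent is given below; the parameter values are hypothetical, and the exact extension procedure used in Muasya (2005) is not described here, so this is an assumption about the general approach rather than the method actually applied.

```python
import math

def wood_daily_yield(day, a, b, c):
    """Wood's incomplete gamma curve: y(t) = a * t**b * exp(-c * t)."""
    return a * day ** b * math.exp(-c * day)

def projected_305_day_yield(a, b, c, days=305):
    """Sum of predicted daily yields from day 1 to `days` of lactation."""
    return sum(wood_daily_yield(t, a, b, c) for t in range(1, days + 1))

def peak_day(b, c):
    """Day of peak yield, obtained by setting dy/dt = 0, giving t = b / c."""
    return b / c

# Hypothetical curve parameters, for illustration only
a, b, c = 15.0, 0.20, 0.004

print("peak day:", round(peak_day(b, c)))
print("projected 305-day yield (kg):", round(projected_305_day_yield(a, b, c)))
```

In practice the parameters a, b and c would be estimated from the test-day records available for each cow before projecting the unrecorded part of the lactation.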

Identification of the herd clusters was performed with cluster analysis techniques, using the variables defined above. The original data on the descriptive variables were standardized to a mean of zero and a standard deviation of one using PROC STANDARD (SAS 2002). The resultant data were then subjected to hierarchical clustering in PROC CLUSTER using Ward's minimum-variance method. Derivation of the appropriate number of clusters was based on the pseudo F statistic. After cleaning and editing, 12307 records were available for analyses.
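The two steps described above, standardization followed by Ward's minimum-variance clustering, can be sketched in plain Python. This is not the SAS procedure used in the study: the herd values are invented, the number of clusters is fixed at three rather than chosen with the pseudo F statistic, and the function names are hypothetical.

```python
from statistics import mean, stdev

def standardize(values):
    """Scale values to mean 0 and (sample) standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def ward_clusters(points, n_clusters):
    """Naive agglomerative clustering using Ward's criterion.

    At each step the pair of clusters whose merger gives the smallest
    increase in within-cluster sum of squares is joined:
    increase = n_a * n_b / (n_a + n_b) * ||centroid_a - centroid_b||^2
    """
    clusters = [([i], tuple(p)) for i, p in enumerate(points)]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (mi, ci), (mj, cj) = clusters[i], clusters[j]
                ni, nj = len(mi), len(mj)
                dist2 = sum((x - y) ** 2 for x, y in zip(ci, cj))
                cost = ni * nj / (ni + nj) * dist2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        (mi, ci), (mj, cj) = clusters[i], clusters[j]
        ni, nj = len(mi), len(mj)
        centroid = tuple((ni * x + nj * y) / (ni + nj) for x, y in zip(ci, cj))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((mi + mj, centroid))
    return [sorted(members) for members, _ in clusters]

# Hypothetical herd means (LMYD) and standard deviations (SDLMYD), in kg
herd_means = [4200, 4300, 6100, 6000, 7900, 8100]
herd_sds = [900, 950, 1300, 1250, 1100, 1150]

points = list(zip(standardize(herd_means), standardize(herd_sds)))
groups = ward_clusters(points, 3)
print(sorted(groups))
```

Standardizing first keeps the variable with the larger numerical scale (here the herd mean) from dominating the distance between herds.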

Least squares means and tests for statistical differences were obtained using PROC GLM (SAS 2002) with the following fixed effects model:


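$$Y_{ijklm} = \mu + C_i + P_j + S_k + T_l + e_{ijklm}$$

where:

$Y_{ijklm}$ = 305-day milk yield

$\mu$ = underlying mean

$C_i$ = fixed effect of herd cluster, with i = 1, 2, 3

$P_j$ = fixed effect of parity, with j = 1 to 6

$S_k$ = fixed effect of season of calving, with k = 1, 2, 3, 4

$T_l$ = fixed effect of year of calving, with l = 1 (1985) to 22 (2005)

$e_{ijklm}$ = random residual error, NID$(0, I\sigma^2_e)$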

Genetic characterization of the clusters was done by estimating phenotypic variance, additive genetic variance, residual variances and heritability using a univariate repeatability animal model.
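Heritability in this setting is the ratio of additive genetic variance to phenotypic variance, h² = σ²a / σ²p. As a rough illustration, the sketch below applies that ratio to the variance components quoted in the abstract for production environments 2 and 3; the function name is hypothetical, and the published estimates come from the REML analysis, not from this arithmetic.

```python
def heritability(additive_variance, phenotypic_variance):
    """Narrow-sense heritability: h^2 = sigma2_a / sigma2_p."""
    return additive_variance / phenotypic_variance

# (additive genetic variance, phenotypic variance) quoted in the abstract
estimates = {
    "cluster 2": (503934.0, 1513952.0),
    "cluster 3": (122837.0, 827057.0),
}

for name, (var_a, var_p) in estimates.items():
    print(f"{name}: h2 = {heritability(var_a, var_p):.3f}")
```

The ratios land close to the reported heritability estimates of 0.33 and 0.14 for these two environments.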

The mixed model in matrix notation was:


$$Y = Xb + Zu + e$$

where Y is a vector of observations; X and Z are known incidence matrices for fixed and random effects, respectively; b and u are unknown vectors of fixed and random effects, respectively; and e is a vector of residuals.

Variance components were estimated using the DFREML package (Meyer 1998), and the Fmax procedure was used to test for homogeneity of variances.

Results


Three clusters were formed. Their basic statistics, namely mean 305-day milk yield (LMYD), average standard deviation (SDLMYD), number of herds and the resultant number of records, are shown in Table 1. All clusters shared the same sires. The herd clusters differed significantly (P<0.05) for the two descriptive variables.

Table 1. Characteristics of the production clusters

Production environment | No. of herds | No. of records | LMYD, kg | SDLMYD, kg
Cluster 1 | | | |
Cluster 2 | | | |
Cluster 3 | | | |

Different superscripts denote different means (P<0.05)

A cluster is a group of herds with homogeneous variance components

Cluster 2 had the highest mean 305-day milk yield, while cluster 1 had the largest standard deviation for milk yield. Cluster 3 had the highest number of herds and total number of records but the lowest standard deviation for milk yield and mean 305-day milk yield.

Phenotypic, additive genetic and residual variances and heritability estimates for each production level are presented in Table 2. Estimates of phenotypic, additive genetic and residual variances increased with herd production level, as did heritability estimates. Table 2 shows higher variance for cluster 1.

Table 2. Estimates of variance components and heritability for milk yield by cluster

Production environment | Phenotypic variance (σ2p) | Additive genetic variance (σ2a) | Residual variance (σ2e) | Heritability (h2 ± SE)
Cluster 1 | | | | 0.23 ± 0.04
Cluster 2 | | | | 0.33 ± 0.04
Cluster 3 | | | | 0.14 ± 0.03

Different superscripts denote different means (P<0.001)

A cluster is a group of herds with homogeneous variance components

The Fmax test revealed that all the variances (phenotypic, additive genetic and residual) were significantly different (P<0.05) across clusters. The results in Table 2 clearly indicate heterogeneity of variance components for milk yield.
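Assuming the Fmax procedure refers to Hartley's test, the statistic is the ratio of the largest to the smallest group variance, which is then compared with a tabulated critical value for the number of groups and degrees of freedom. The sketch below computes only the ratio, using the phenotypic variances quoted in the abstract; the function name is hypothetical and the critical value lookup is deliberately left out.

```python
def f_max(variances):
    """Hartley's Fmax statistic: largest variance divided by smallest."""
    return max(variances) / min(variances)

# Phenotypic variances for production environments 1, 2 and 3 (abstract)
phenotypic_variances = [1134608.1, 1513952.0, 827057.0]

print(f"Fmax = {f_max(phenotypic_variances):.2f}")
```

A ratio near 1 would indicate similar variances across clusters; larger ratios point towards heterogeneity, subject to the critical value for the design.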


Discussion

Herd clusters

Three distinct herd clusters were identified by the cluster analysis, all significantly different from one another in terms of the descriptive variables used (Table 1). The differences among clusters in their original variables (Table 1) show that milk yield varies with the level of management and environment (Costa et al 2000), which could be due to general feeding, genotype by environment correlation and methods of feeding concentrates (Brotherstone and Hill 1986).

Variance components

Phenotypic, additive genetic and residual variances and the respective heritability estimates were different for all herd clusters (Table 2). The results of this study agree with those of Neser (2002), Weigel and Rekaya (2000) and Naya et al (2002), who demonstrated that variance components varied with change in production environment. The differences in heritability estimates among the clusters affect response to selection (Hill 1984) and would reduce the accuracy of predicted breeding values by favouring high performers in more variable clusters at the expense of their counterparts in less variable clusters (Hill 1984).

The biases that arise with heterogeneous variances can therefore influence the selection of bull dams and superior sires in a breeding programme such as Kenya's, and would cause a reduction in response to selection (Hill 1984). Heterogeneous variances may give rise to genotype by environment interaction, implying that different production environments may require different sets of sires.

Conclusions and recommendations

Acknowledgements


The authors wish to acknowledge the LRC and DRSK for provision of data and the Kenya Agricultural Research Institute (KARI) and Egerton University (EU) for provision of computing facilities.

References


Bebe B O, Udo H M J and Thorpe W 2002 Development of smallholder dairy systems in the Kenya highlands. Outlook on Agriculture 31:113-120.

Brotherstone S and Hill W G 1986 Heterogeneity of variance among herds for milk production. Animal Production 42:297-303.

Costa C N, Blake R W, Pollak E J, Oltenacu P A, Quaas R L and Searle S R 2000 Genetic analysis of Holstein cattle populations in Brazil and the United States. Journal of Dairy Science 83:2963-2974.

De Mattos D, Misztal I and Bertrand J K 2000 Variance and covariance components for weaning weight for Herefords in three countries. Journal of Animal Science 78:33-37.

Dong M C and Mao I L 1990 Heterogeneity of covariance and heritability in different levels of intra-herd milk production variance and of herd average. Journal of Dairy Science 73:843-851

Falconer D S 1989 Introduction to quantitative genetics. Third edition, Longman, London.

Hill W G 1984 On selection among groups with heterogeneous variance. Animal Production 39:473-477

Lohuis M M and Dekkers J C M 1998 Merits of borderless evaluations in different countries. Proceedings of the 6th World Congress on Genetics Applied to Livestock Production, Armidale, Australia XXVI:169-172.

Magothe T M, Ilatsia E D, Wasike C B, Migose S A and Kahi A K 2006 Genetic evaluation of milk yield of Bos taurus dairy breeds in Kenya. Proceedings of the 10th KARI scientific conference, Nairobi, Kenya.

Meyer K 1998 DFREML version 3.0 user notes.

Muasya T K 2005 Genetic evaluation of the dairy cattle herd at the University of Nairobi veterinary farm. MSc thesis. University of Nairobi.

Naya H J, Urioste I and Franco J 2002 Identification of production environments and presence of G x E interactions in Uruguay using Holstein herds records. In Proceedings of the 7th World Congress on Genetics Applied to Livestock Production, August 19-23, 2002, Montpellier, France.

Neser F W C 2002 A preliminary investigation into the use of cluster analyses in genotype x environment interaction studies in beef cattle. In Proceedings of the 7th World Congress on Genetics Applied to Livestock Production, August 19-23, 2002, Montpellier, France.

Ojango J M 2000 Performance of Holstein-Friesian cattle in Kenya and the potential for genetic improvement using international breeding values. PhD thesis. Wye University College, University of London.

Olukoye G A and Mosi R O 2002 Non-genetic causes of heterogeneity of variance in milk yield among Holstein-Friesian herds in Kenya. The Kenya Veterinarian 25:18-23.

Peeler E J and Omore A O 1997 Manual of Livestock Production Systems in Kenya, KARI/DFID National Agricultural Research Project II. Nairobi, Kenya.

Raffrenato E, Blake R W, Oltenacu P A, Carvalheira J and Licitra G 2003 Genotype by environment interaction for yield and somatic cell score. Journal of Dairy Science 86:2470-2479.

See M T 1998 Heterogeneity of (co)variance among herds for backfat measures of swine. Journal of Animal Science 76:2568-2574.

Statistical Analysis System (SAS) 2002 The Statistical Analysis System, Version 8 for Windows. SAS Institute Inc., Cary NC, USA.

Verrier E, Colleau J J and Foulley J L 1993 Long-term effects of selection based on the animal model BLUP in a finite population. Theoretical and Applied Genetics 87:446-454.

Vinson W E 1987 Potential bias in genetic evaluations from difference in variation within herds. Journal of Dairy Science 70:2450-2455.

Visscher P M, Thompson R and Hill W G 1991 Estimation of genetic and environmental variances for fat yield in individual herds and an investigation into heterogeneity of variance between herds. Livestock Production Science 28:273-290.

Weigel K A and Rekaya R 2000 A multiple-trait herd cluster model for international dairy sire evaluation. Journal of Dairy Science 83:815-821.

Received 26 February 2007; Accepted 22 April 2007; Published 6 July 2007
