Livestock Research for Rural Development 14 (1) 2002

http://www.cipav.org.co/lrrd/lrrd14/1/segu141.htm

Choice of phenotypic (co)variances structure for test day records in Bos taurus x Bos indicus cows under a dual-purpose cattle system

J C Segura and M M Osorio*

Facultad de Medicina Veterinaria y Zootecnia, Universidad Autónoma de Yucatán. Km 15.5 Carretera Mérida-Xmatkuil, AP 4-116, Mérida, Yucatán, México.

*Campus Tabasco del Colegio de Postgraduados, AP 24, Cárdenas, Tabasco, México.

Data on test day (TD) milk yield of cows from a dual-purpose herd were analysed with a mixed linear model in order to: choose the appropriate phenotypic (co)variance matrix structure; analyse the existence of trends in variances and covariances over time; and obtain lactation curves for Bos taurus x Bos indicus cows from an experimental station of the Colegio de Postgraduados at Cárdenas, Tabasco, Mexico. The data set comprised 5566 test day milk yield records from 321 lactations (233 Holstein x Zebu and 88 Holstein x Sahiwal) collected between 1992 and 1999. Lactations up to 287 days were divided into 21 intervals of 14 days each, starting from day 7 of lactation. Three different (co)variance structures were fitted to the TD data: a compound symmetry (CS) structure; an unstructured (UN) matrix; and a structure that combined CS with a first-order autoregressive process [CS + AR(1)].

Goodness-of-fit criteria favoured the UN and CS + AR(1) structures. The CS + AR(1) structure was then used to obtain lactation curves for the ¾ Holstein x ¼ Zebu and ½ Holstein x ½ Sahiwal cows. The mixed model methodology described well the lactation curve of Bos taurus x Bos indicus cows.

Keywords: Dual-purpose cattle, milk yield, (co)variance structure, test day models, tropics.

Introduction

Recently, the analysis of repeated data such as test day (TD) milk yield records has increased. The TD models can be used to analyse individual TD records of cows rather than a cumulative lactation yield, which is currently used. In addition, TD models allow for: a more accurate estimation of fixed effects; the consideration of a different number of records from each lactation; the variation of fixed effects estimates across herds and lactation stages; the adjustment for different effects of sampling date (Stanton et al 1992; Swalve 2000) and the prediction of daily milk yields from a limited number of TD records (Pool and Meuwissen 1999). Furthermore, the shape of the lactation curve can be accommodated in the model to account for differences in lactation curves among cows and problems associated with persistency (Brotherstone et al 2000).

The analysis of repeated records requires special attention due to problems of correlated errors and heterogeneity in the error (co)variance matrix structure. The general mixed linear model methodology is able to address these issues by directly modelling the (co)variance structure among repeated measures (Little et al 1998). Such a property seems suited for dual-purpose milk production analyses involving environmental effects on daily milk yield which can vary markedly across lactation stages, mainly due to aspects of the dual-purpose management system. Ptak and Schaeffer (1993) used a TD model, which assumes that covariances between successive TD are equal to those between TD that are far apart. Garcia and Holmes (2001) found the mixed model suitable for the description of the lactation curve of dairy cows in pasture-based systems.

In this study, TD yields of cows from a dual-purpose herd were analysed by mixed linear methodology in order to: model the (co)variance structure; analyse the existence of trends in variances and covariances over time and obtain the lactation curve for Bos taurus x Bos indicus cows.

Materials and methods

Source of data

The data consisted of lactation records collected from a population of dual-purpose, crossbred cows born in an experimental station of the Colegio de Postgraduados at Cárdenas, Tabasco, Mexico. The region has a humid tropical climate, Aw₁ (García 1988). The cows were: ¾ Holstein x ¼ Zebu (HZ) (n=92) with two generations of selection for milk yield and inter-se mating (F₁ females inseminated with Holstein semen); and an F₂ generation from ½ Holstein x ½ Sahiwal cows (HS) (n=26) brought from New Zealand. For the HS group, F₁ bulls were mated to F₁ cows. Management consisted of milking twice daily with the calf at foot, and keeping the cow and calf together after milking in paddocks of Star grass (Cynodon plectostachyus). Milk yield was recorded biweekly, starting 1 week after calving. During milking, cows received 4 kg/cow/day of a concentrate containing 16% crude protein and 2.3 Mcal ME/kg.

Statistical analysis

The data set comprised 5566 test day milk yield records from 321 lactations (233 HZ and 88 HS lactations) collected between 1992 and 1999. Lactations with fewer than 8 TD records were discarded. Lactations up to 287 days in length were used and were divided into 21 intervals (DIM = days in milk) of 14 days each, starting from day 7 of lactation. Data were excluded because there were few observations for lactations shorter than 105 days or longer than 287 days, and to allow the statistical programs to converge. The number of lactations per genotype and parity is shown in Table 1. The average number of lactations per cow was 2.7. Repeated lactations were assumed to be uncorrelated, as previously suggested by other authors (Stanton et al 1992).
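The recording schedule and the editing limits can be checked with simple arithmetic, sketched here in Python. The sketch assumes that the 21 intervals correspond to nominal test days spaced 14 days apart from day 7; how records falling between nominal days were assigned is not stated, so no assignment rule is shown. All names are illustrative.

```python
FIRST_TEST_DAY = 7   # first recording, 1 week after calving
STEP = 14            # biweekly recording
N_INTERVALS = 21     # number of DIM intervals used in the analysis

# Nominal test days under the assumption stated in the lead-in.
nominal_days = [FIRST_TEST_DAY + STEP * k for k in range(N_INTERVALS)]

last_day = nominal_days[-1]    # day of the 21st nominal test
eighth_day = nominal_days[7]   # day of the 8th nominal test (minimum records kept)

print(last_day, eighth_day)    # prints: 287 105
```

Under this reading, the 21st nominal test day coincides with the 287-day upper limit and the 8th with the 105-day lower limit, so the two editing rules appear mutually consistent.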

Table 1. Distribution of lactations per genotype and parity in a dual-purpose cattle system in Tabasco, Mexico.
Parity	¾ Holstein x ¼ Zebu	½ Holstein x ½ Sahiwal
1	87	26
2	57	20
3	38	16
4	25	11
5	17	10
6	9	5
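As a quick consistency check, the column totals of Table 1 and the average number of lactations per cow can be recomputed from the figures given in the text. The short Python sketch below uses only numbers reported in this paper; the variable names are illustrative.

```python
# Lactation counts for parities 1 to 6, copied from Table 1.
hz = [87, 57, 38, 25, 17, 9]    # 3/4 Holstein x 1/4 Zebu
hs = [26, 20, 16, 11, 10, 5]    # 1/2 Holstein x 1/2 Sahiwal

hz_total = sum(hz)
hs_total = sum(hs)
total_lactations = hz_total + hs_total

# Numbers of cows reported under "Source of data" (92 HZ and 26 HS).
n_cows = 92 + 26
lactations_per_cow = total_lactations / n_cows

print(hz_total, hs_total, total_lactations, round(lactations_per_cow, 1))
# prints: 233 88 321 2.7
```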

Data were classified according to the following factors: year of calving (1992-1999), season of calving (dry, rainy, and windy and rainy), parity (1, 2, 3–4, 5 or greater), genotype (HZ and HS) and 21 DIM intervals. Season of calving was grouped based on temperature and rainfall of the region as: dry (February to May), rainy (June to September) and windy and rainy (October to January).
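The classification rules for season of calving and parity can be written as two small functions. This is a minimal Python sketch of the grouping described above; the function names and the labels "3-4" and "5+" are illustrative choices, not taken from the original analysis.

```python
def season_of_calving(month: int) -> str:
    """Map calendar month (1-12) to the season classes defined in the text."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    if 2 <= month <= 5:
        return "dry"                # February to May
    if 6 <= month <= 9:
        return "rainy"              # June to September
    return "windy and rainy"        # October to January


def parity_class(parity: int) -> str:
    """Map parity number to the classes 1, 2, 3-4, and 5 or greater."""
    if parity < 1:
        raise ValueError("parity must be 1 or greater")
    if parity <= 2:
        return str(parity)
    if parity <= 4:
        return "3-4"
    return "5+"


print(season_of_calving(1), "|", season_of_calving(7), "|", parity_class(6))
```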

Data were analysed with the following linear model

y_ijklmno = M + A_i + E_j + P_k + G_l + DIM_m + L(G)_ln + e_ijklmno

where y_ijklmno = test day milk yield of lactation, M = overall mean; A_i = fixed effect of year of calving, E_j = fixed effect of season of calving, P_k = fixed effect of parity, G_l = fixed effect of genotype, DIM_m = fixed effect of DIM interval, L(G)_ln = random effect of lactation within genotype; and e_ijklmno = random residual.

Three different (co)variance structures were fitted to the TD data. The first, known as compound symmetry (CS), uses the same variance for all measures and the same correlation for all pairs of measures on the same animal. It is the most commonly used (co)variance structure in phenotypic analyses of repeated measures (Stanton et al 1992) because only two parameters need to be estimated: the between-lactation and residual components of variance. The second and third structures were considered in order to take account of two main aspects of correlations between repeated measures. Firstly, two measures can be correlated simply because they share a common contribution from the same lactation. Secondly, measures on the same lactation that are closer in time are usually more highly correlated than measures further apart in time. The second structure, known as unstructured (UN), estimates a separate variance for each measure and a separate covariance for each pair of measures on the same animal. The third structure combined the CS structure with a first-order autoregressive process [CS + AR(1)].
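To make the difference between the structures concrete, the following Python sketch builds small CS, AR(1) and CS + AR(1) matrices and counts the parameters of a UN matrix. It is an illustration only: the numeric values are placeholders, not estimates from this study, and the function names are hypothetical. The CS + AR(1) form shown (a constant between-lactation term added to an AR(1) term) is one common way of writing this structure and is assumed here.

```python
def compound_symmetry(t, var_between, var_resid):
    """CS: equal variances on the diagonal, one common covariance elsewhere."""
    return [[var_between + (var_resid if i == j else 0.0) for j in range(t)]
            for i in range(t)]


def ar1(t, var, rho):
    """AR(1): covariance between two measures decays as var * rho**lag."""
    return [[var * rho ** abs(i - j) for j in range(t)] for i in range(t)]


def cs_plus_ar1(t, var_between, var_ar, rho):
    """Constant between-lactation part plus an AR(1) part."""
    a = ar1(t, var_ar, rho)
    return [[var_between + a[i][j] for j in range(t)] for i in range(t)]


def n_params_unstructured(t):
    """UN: one parameter per distinct variance or covariance."""
    return t * (t + 1) // 2


def correlation(v, i, j):
    """Correlation between measures i and j implied by covariance matrix v."""
    return v[i][j] / (v[i][i] * v[j][j]) ** 0.5


# Placeholder values for four test days (not estimates from this study).
cs = compound_symmetry(4, var_between=3.0, var_resid=3.0)
mix = cs_plus_ar1(4, var_between=2.0, var_ar=4.0, rho=0.5)

print(correlation(cs, 0, 1), correlation(cs, 0, 3))    # same at every lag
print(correlation(mix, 0, 1), correlation(mix, 0, 3))  # decreases with lag
print(n_params_unstructured(4))
```

Under CS the correlation is the same whatever the distance between test days, whereas under CS + AR(1) it falls with increasing lag towards the constant between-lactation part. The UN parameter count grows quadratically with the number of test days.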

The goodness of fit of the (co)variance models was assessed by comparing values of the REML log likelihood, Akaike’s Information Criterion (AIC) and Schwarz Bayesian Criterion (SBC). Larger values of these criteria indicate a better fit of the model. The best (co)variance structure was then used to obtain generalized least squares estimates for TD milk yields, which were then plotted against DIM to represent lactation curves for Holstein x Zebu and Holstein x Sahiwal cows. All analyses were performed using the mixed procedure of SAS (1995).

Results and discussion

The CS structure produced variance, covariance and correlation values of 630, 309 and 0.49, respectively. (Co)variances and correlations between TD at different DIM intervals estimated for UN and CS + AR(1) are reported in Tables 2 and 3, respectively. Correlations between repeated measures showed a clear decreasing pattern with time for the CS + AR(1) structure. This means that environmental effects in the dual-purpose system are important and also that estimation of total milk production based on partial lactations may not be as efficient as desired. In the CS + AR(1) structure, covariance and correlation values were practically zero after the 10th DIM interval. Decreasing trends in variances, covariances and correlations can also be recognized in the UN matrix, although they are partially hidden.

Table 2. Matrix of variances, covariances and correlations (along, above and below the diagonal, respectively) for milk yield (kg²) using a matrix structure that makes no assumptions regarding equal variances and correlations
Days in milking intervals
	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17
1	685.2	427.5	436.5	403.4	418.5	305.4	317.5	263.9	238.9	243.2	238.8	249.4	209.3	190.5	214.8	144.8	211.3
2	0.635	660.6	503.0	462.7	422.3	294.0	337.3	286.4	239.5	240.8	252.6	277.3	219.6	195.6	193.9	158.3	186.6
3	0.574	0.673	845.4	534.8	541.4	375.3	375.2	324.8	284.1	283.5	265.7	243.2	251.9	203.5	251.3	198.2	207.8
4	0.555	0.648	0.662	771.6	566.6	416.3	488.5	405.3	350.6	322.5	349.9	323.6	287.9	253.0	262.9	231.7	244.4
5	0.577	0.593	0.672	0.736	767.8	488.3	449.6	375.6	370.6	319.7	318.8	313.1	323.4	306.3	297.5	280.5	290.2
6	0.472	0.463	0.523	0.609	0.714	609.8	746.2	456.5	455.4	398.5	384.0	374.7	300.7	291.1	247.3	251.4	236.1
7	0.444	0.480	0.472	0.534	0.645	0.666	456.5	619.4	420.2	404.3	355.5	326.5	348.7	330.8	314.6	306.7	256.2
8	0.405	0.448	0.449	0.486	0.578	0.611	0.671	420.2	622.5	405.5	390.6	352.4	333.8	289.0	268.7	295.1	271.5
9	0.366	0.373	0.392	0.421	0.507	0.601	0.668	0.677	405.5	617.9	401.8	372.1	369.8	327.2	287.7	285.4	265.4
10	0.374	0.377	0.392	0.407	0.468	0.521	0.587	0.653	0.654	401.8	536.5	406.9	364.1	312.4	272.9	266.8	238.1
11	0.394	0.424	0.394	0.446	0.545	0.557	0.607	0.617	0.676	0.698	406.9	677.8	367.7	332.3	312.0	282.0	258.2
12	0.366	0.414	0.321	0.394	0.449	0.487	0.527	0.504	0.542	0.575	0.675	400.3	400.3	373.4	333.2	297.1	263.0
13	0.333	0.356	0.361	0.431	0.486	0.507	0.531	0.558	0.617	0.610	0.661	0.640	577.2	412.2	366.2	345.2	293.1
14	0.307	0.321	0.295	0.384	0.466	0.497	0.511	0.490	0.553	0.530	0.605	0.605	0.724	561.9	399.6	389.0	340.6
15	0.354	0.326	0.374	0.409	0.464	0.433	0.498	0.467	0.498	0.474	0.582	0.553	0.659	0.729	535.3	397.4	341.8
16	0.240	0.267	0.295	0.361	0.438	0.441	0.486	0.514	0.496	0.465	0.527	0.494	0.622	0.711	0.744	532.8	391.1
17	0.334	0.300	0.296	0.364	0.434	0.395	0.388	0.452	0.441	0.397	0.462	0.418	0.505	0.595	0.612	0.702	583.0

Table 3. Matrix of variances, covariances and correlations (along, above and below the diagonal, respectively) for milk yield (kg²) using a (co)variance matrix that combined the compound symmetry (CS) structure with a first-order autoregressive process [CS + AR(1)]
	Days in milking interval
	1	2	3	4	5	6	7	8	9	10	11	12	13
1	356.44	155.25	67.62	29.45	12.83	5.59	2.43	1.06	0.46	0.20	0.09	0.04	0.02
2	0.436	356.44	155.25	67.62	29.45	12.83	5.59	2.43	1.06	0.46	0.20	0.09	0.04
3	0.190	0.436	356.44	155.25	67.62	29.45	12.83	5.59	2.43	1.06	0.46	0.20	0.09
4	0.083	0.190	0.436	356.44	155.25	67.62	29.45	12.83	5.59	2.43	1.06	0.46	0.20
5	0.036	0.083	0.190	0.436	356.44	155.25	67.62	29.45	12.83	5.59	2.43	1.06	0.46
6	0.016	0.036	0.083	0.190	0.436	356.44	155.25	67.62	29.45	12.83	5.59	2.43	1.06
7	0.007	0.016	0.036	0.083	0.190	0.436	356.44	155.25	67.62	29.45	12.83	5.59	2.43
8	0.003	0.007	0.016	0.036	0.083	0.190	0.436	356.44	155.25	67.62	29.45	12.83	5.59
9	0.001	0.003	0.007	0.016	0.036	0.083	0.190	0.436	356.44	155.25	67.62	29.45	12.83
10	0	0.001	0.003	0.007	0.016	0.036	0.083	0.190	0.436	356.44	155.25	67.62	29.45
11	0	0	0.001	0.003	0.007	0.016	0.036	0.083	0.190	0.436	356.44	155.25	67.62
12	0	0	0	0.001	0.003	0.007	0.016	0.036	0.083	0.190	0.436	356.44	155.25
13	0	0	0	0	0.001	0.003	0.007	0.016	0.036	0.083	0.190	0.436	356.44

The autoregressive parameter (ρ) for the CS + AR(1) (co)variance structure was 0.44. This value is smaller than the repeatability of milk yield (0.65) reported for a dual-purpose cattle system (Hernández-Reyes et al 2001) but similar to that reported by De Alba and Kennedy (1994) for purebred and crossbred Criollo cattle in Mexico (0.44).
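The geometric decay implied by a first-order autoregressive process can be checked against Table 3. The Python sketch below takes the variance and the lag-1 covariance from the table, derives the autoregressive parameter as their ratio, and compares the predicted covariances at longer lags with the first row of the table. Variable names are illustrative.

```python
variance = 356.44    # diagonal of Table 3
lag1_cov = 155.25    # first off-diagonal of Table 3
rho = lag1_cov / variance

# First row of Table 3 (lags 0 to 12), copied for comparison.
table3_row1 = [356.44, 155.25, 67.62, 29.45, 12.83, 5.59, 2.43,
               1.06, 0.46, 0.20, 0.09, 0.04, 0.02]

# AR(1) prediction: covariance at a given lag is variance * rho**lag.
predicted = [variance * rho ** lag for lag in range(len(table3_row1))]
max_abs_diff = max(abs(p - t) for p, t in zip(predicted, table3_row1))

print(round(rho, 3))             # matches the lag-1 correlation in Table 3
print(max_abs_diff < 0.005)      # differences are within rounding error
```

The tabulated values follow the pattern variance × ρ^lag to within rounding, which is why covariances become negligible after about ten intervals.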

Goodness of fit criteria (Table 4) based on the REML log likelihood and AIC showed that the UN (co)variance structure was more appropriate for TD data analysis. However, the SBC indicated that the CS + AR(1) structure fitted the TD records better. Nevertheless, there were no marked differences among (co)variance structures. There are two major potential problems with using the unstructured covariance. First, it requires estimation of a large number of variance and covariance parameters and can lead to severe computational problems, especially for unbalanced data. Second, it does not exploit the existence of trends in variances and covariances over time, and thus often results in erratic patterns of standard error estimates (Little et al 1998).

Table 4. Goodness of fit statistics for unstructured (UN), compound symmetry (CS) and CS with a first-order autoregressive process [CS + AR(1)] (co)variance structures for test day milk yields of a dual-purpose herd of dairy cattle in Tabasco, Mexico
	UN	CS	CS + AR(1)
REML	-23679	-24403	-23968
AIC	-23889	-24405	-23972
SBC	-24584	-24412	-23985
REML = restricted maximum likelihood log likelihood; AIC = Akaike’s Information Criterion; SBC = Schwarz Bayesian Criterion.
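The relationship among the three criteria in Table 4 can be illustrated with a short Python sketch. It assumes the "larger is better" forms AIC = logL − q and SBC = logL − (q/2) ln n, where q is the number of (co)variance parameters; the paper does not state the formulas, so this is an assumption. The value of q is inferred from the table as logL − AIC, and n is taken as the 5566 TD records, which is also an assumption (software may use a slightly smaller effective n).

```python
import math

# Values copied from Table 4.
reml = {"UN": -23679, "CS": -24403, "CS + AR(1)": -23968}
aic_table = {"UN": -23889, "CS": -24405, "CS + AR(1)": -23972}
sbc_table = {"UN": -24584, "CS": -24412, "CS + AR(1)": -23985}

N_RECORDS = 5566  # assumption: n equals the number of test day records

# Parameter counts implied by Table 4 under the assumed form AIC = logL - q.
q = {name: reml[name] - aic_table[name] for name in reml}


def sbc(loglik, n_params, n):
    """Schwarz criterion in the assumed 'larger is better' form."""
    return loglik - 0.5 * n_params * math.log(n)


sbc_calc = {name: sbc(reml[name], q[name], N_RECORDS) for name in reml}

best_by_aic = max(aic_table, key=aic_table.get)
best_by_sbc = max(sbc_calc, key=sbc_calc.get)

for name in reml:
    print(name, q[name], round(sbc_calc[name]), sbc_table[name])
print(best_by_aic, "|", best_by_sbc)
```

Under these assumptions the recomputed SBC values agree with Table 4 to within about one unit. The heavier SBC penalty on the many parameters of the UN matrix explains why AIC and SBC rank the structures differently.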

The two components of (co)variance that the CS + AR(1) model disentangles for milk yield around the respective average lactation curve (the compound symmetry and the autoregressive components), can be regarded as effects of different random factors. The constant component given by the CS part can be related to factors affecting the whole lactation, such as genetic merit of the animal and permanent environmental factors. On the other hand, the AR(1) part suggests, as mentioned before, the existence of an appreciable component of covariation that decreases rapidly with increasing DIM intervals, which can be seen as an indication of important environmental factors that result in short term effects, which are lost within a few DIM intervals.

Average lactation curves for the HZ and HS cows, estimated by imposing the CS + AR(1) structure on the model, are shown in Figure 1. The HZ cows had higher TD yields than the HS cows. Both curves show the typical ascending phase of milk yield curves, and peak milk yield was reached at 5 weeks of production. However, these shapes differ from those reported by Madalena et al (1979) for HZ cows in Brazil, who fitted Wood models and found an earlier peak (at week one) than in this study.
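For readers unfamiliar with the Wood model mentioned above, the sketch below shows its usual form, y(t) = a t^b exp(−ct), and the time of peak yield it implies, t = b/c. The parameter values are hypothetical, chosen only so that the peak falls at week 5; they are not estimates from this study or from Madalena et al (1979).

```python
import math


def wood(t, a, b, c):
    """Wood's function: milk yield at time t (t > 0)."""
    return a * t ** b * math.exp(-c * t)


def wood_peak_time(b, c):
    """Time of peak yield, where the derivative of the curve is zero."""
    return b / c


# Hypothetical parameters with t in weeks (illustration only).
a, b, c = 10.0, 0.2, 0.04
peak = wood_peak_time(b, c)

print(round(peak, 2))
print(wood(peak, a, b, c) > wood(peak - 1, a, b, c))
print(wood(peak, a, b, c) > wood(peak + 1, a, b, c))
```

In this parameterization a peak at week one, as reported for the Brazilian data, would correspond to b/c = 1.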

Figure 1. Lactation curves of ¾ Holstein x ¼ Zebu and ½ Holstein x ½ Sahiwal cows in Tabasco, Mexico

The use of suitable (co)variance matrices is fundamental to the correct analysis of repeated-measures traits such as weekly or monthly milk yields and other economically important traits. This study, however, did not show a clear benefit of using any one of the (co)variance structures. Nevertheless, the CS + AR(1) structure was able to point out some essential features of the behaviour of individual lactations around the average curve. In particular, it highlighted the differences between factors that cause constant and short-term covariation, and the relative importance that such components can have for milk yield. The results of this study are also of technical interest because they seem to indicate that a biweekly interval between two consecutive records is a valid option for describing the lactation curve of dual-purpose cattle.

Conclusions

The CS + AR(1) matrix structure is recommended for the analysis of TD milk yield data from dual purpose cattle. The mixed model methodology described well the lactation curve of Bos taurus x Bos indicus cows under the management conditions of this study.

References

Brotherstone S, White I M S and Meyer K 2000 Genetic modelling of daily milk yield using orthogonal polynomials and parametric curves. Animal Science 70: 407-415

De Alba J and Kennedy B W 1994 Genetic parameters of purebred and crossbred milking criollos in tropical Mexico. Animal Production 58: 159-165

García E 1988 Modificaciones al Sistema de Clasificación Climática de Koppen. Instituto de Geografía, Universidad Nacional Autónoma de México. México, D F 276 p.

Garcia S C and Holmes W J 2001 Lactation curves of autumn- and spring-calved cows in pasture-based dairy systems. Livestock Production Science 68: 189-203

Hernández-Reyes E, Segura-Correa V M, Segura-Correa J C and Osorio-Arce M M 2001 Calving interval, lactation length and milk production in a dual purpose herd in Yucatan, México. Agrociencia 35: 699-705

Jamrozik J and Schaeffer L R 1997 Estimates of genetic parameters for a test day model with random regression for yield traits of first lactation Holsteins. Journal of Dairy Science 80: 762-770

Little R C, Henry P R and Ammerman C B 1998 Statistical analysis of repeated measures data using SAS procedures. Journal of Animal Science 76: 1216-1231

Madalena F E, Martinez M L and Freitas A F 1979 Lactation curves of Holstein-Friesian and Holstein-Friesian x Gir cows. Animal Production 29: 101-107

Osorio M M and Aranda E 1999 Productividad de vacas Bos taurus x Bos indicus en un sistema de doble propósito en el trópico. Memorias VII Reunión Nacional de Investigación Pecuaria. INIFAP. Querétaro, Querétaro, México. Noviembre 12-14, 1999. p. 123

Ptak E and Schaeffer L R 1993 Use of test day yields for genetic evaluation of dairy sires and cows. Livestock Production Science 34: 23-34

Pool M H and Meuwissen T H E 1999 Prediction of daily milk yields from a limited number of test days using test day models. Journal of Dairy Science 82: 1555-1564

SAS 1995 SAS/STAT User’s Guide, Version 6.11. SAS Institute, Cary, North Carolina

Stanton T L, Jones L R, Everett R V and Kachman S D 1992 Estimating milk, fat and protein lactation curves with a test day model. Journal of Dairy Science 75: 1691-1700

Swalve H H 2000 Theoretical basis and computational methods for different test-day genetic evaluation methods. Journal of Dairy Science 83: 1115-1124

Wiggans G R and Goddard M E 1997 A computationally feasible test day model for genetic evaluation of yield traits in the United States. Journal of Dairy Science 80: 1795-1800