Northeast Fisheries Science Center Reference Document 06-27
Accuracy and precision exercises associated with 2005 TRAC production aging
Sandra J. Sutherland, Nancy J. Munroe, Vaughn Silva, Sarah E. Pregracke, and John M. Burnett
National Marine Fisheries Serv., Woods Hole Lab., 166 Water St., Woods Hole MA 02543-1026
Web version posted January 23, 2007
Citation: Sutherland SJ, Munroe NJ, Silva V, Pregracke SE, Burnett JM. 2006. Accuracy and precision exercises associated with 2005 TRAC production aging. US Dep Commer, Northeast Fish Sci Cent Ref Doc 06-27; 17 p.
Information Quality Act Compliance: In accordance with section 515 of Public Law 106-554, the Northeast Fisheries Science Center completed both technical and policy reviews for this report. These predissemination reviews are on file at the NEFSC Editorial Office.
INTRODUCTION
In production aging programs, age reader accuracy can be thought of as how often the “right” age is obtained, and precision as how often the “same” age is obtained (Campana 2001). It is possible that, over time, an age reader may inadvertently change the criteria that are used for determining ages, thereby introducing a bias into the age data. This bias can be measured with accuracy tests, which consist of the age reader blindly examining known- or consensus-aged fish from established reference collections. An age reader may also make periodic mistakes, which introduces random errors into the data. The degree of this error can be measured with precision tests, which consist of the age reader blindly re-aging fish which they have already aged. Both accuracy and precision must be considered within a quality-control monitoring program.
Acceptable levels of aging accuracy and precision are influenced by factors such as species, age structure, and age reader experience. Although percent agreement is strongly affected by these differences, the staff of the Fishery Biology Program at the Northeast Fisheries Science Center (NEFSC) have long considered levels above 80% to be acceptable. The total coefficient of variation (CV) is less affected by these differences and, thus, is a better measure of aging error. In many aging labs around the world, total CVs of under 5% are considered acceptable among species of moderate longevity and aging complexity (Campana 2001), such as the species considered here.
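The two measures discussed above can be illustrated with a minimal sketch. It assumes that each fish has exactly two paired integer ages, and that the total CV is the mean across fish of the per-fish CV described by Campana (2001). The function names and the handling of fish with a mean age of zero are illustrative assumptions, not part of the NEFSC procedure.

```python
from math import sqrt


def percent_agreement(first, second):
    """Percentage of fish assigned the same age in both readings."""
    if len(first) != len(second) or not first:
        raise ValueError("readings must be paired and non-empty")
    matches = sum(1 for a, b in zip(first, second) if a == b)
    return 100.0 * matches / len(first)


def total_cv(first, second):
    """Mean across fish of the per-fish CV (%), for two readings per fish."""
    if len(first) != len(second) or not first:
        raise ValueError("readings must be paired and non-empty")
    cvs = []
    for a, b in zip(first, second):
        mean_age = (a + b) / 2.0
        if mean_age == 0:
            # The CV is undefined when the mean age is zero; this sketch
            # skips such fish (an assumption, not a documented rule).
            continue
        # Sample standard deviation of the two readings (R - 1 = 1).
        sd = sqrt((a - mean_age) ** 2 + (b - mean_age) ** 2)
        cvs.append(100.0 * sd / mean_age)
    if not cvs:
        raise ValueError("no fish with a non-zero mean age")
    return sum(cvs) / len(cvs)


# Hypothetical paired readings for five fish (not data from this report).
production_ages = [2, 3, 4, 5, 3]
test_ages = [2, 3, 5, 5, 3]
agreement = percent_agreement(production_ages, test_ages)
cv = total_cv(production_ages, test_ages)
```

In this hypothetical example, a single one-year disagreement among five fish gives 80% agreement and a total CV of about 3.1%.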
For over 35 years, scientists at the NEFSC Fishery Biology Program have regularly conducted production aging, determining the ages for large numbers of samples over a short period of time using established methods (Penttila and Dery 1988), for the species assessed by the Transboundary Resources Assessment Committee (TRAC). Historically, our approach to age-data quality control and assurance has been a two-reader system. In this approach, there are both a primary and a secondary age reader for each species. The primary age reader conducts all production aging, and the secondary age reader then ages a portion of those same samples using similar methods. The ages determined by the two readers are compared, and if they agree sufficiently (above 80% agreement), the production ages are considered valid. If not, the sources of disagreement must first be resolved. This interreader approach is still used in the course of training new readers in order to ensure consistency in application of aging criteria and in inter-laboratory sample exchanges. Budgetary and staffing constraints have made this approach less feasible, however, by reducing the number of species for which there are two competent age readers at this laboratory.
In response, the NEFSC Fishery Biology Program has implemented a new approach to quality control and assurance. Intrareader tests of aging accuracy and precision, as described above, allow us to quantify the amount of inherent aging error and bias in the ages determined by each of our staff members. These values provide a measure of the reliability of the production age data used in stock assessments, and they may be directly incorporated into population models as a source of variability.
In conjunction with implementation of these tests, we have begun to establish reference collections of age samples for each species. These collections are necessary to evaluate aging accuracy. Fish of known age are difficult to obtain, so we have focused on assembling collections from age samples which have been included in aging exchanges with other laboratories. From those samples, we have selected those fish for which multiple experienced age readers agree on the age (see Silva et al. 2004 for more details).
In what has become an annual process, exercises were undertaken to estimate the accuracy and/or precision of U.S. production aging for the 2005 TRAC assessments (Hunt et al. 2005; Stone and Legault 2005; Van Eeckhaute and Brodziak 2005) of Georges Bank stocks of cod (Gadus morhua), haddock (Melanogrammus aeglefinus), and yellowtail flounder (Limanda ferruginea). This report lists the results of those exercises.
METHODS
For all species, subsamples were randomly selected to be re-aged in order to test age-reader accuracy (versus the reference collections) or precision (versus samples previously aged by that reader). Some consideration was given to selecting a range of lengths in these random samples to include a wider range of ages. When re-aging fish, the age reader had knowledge of the same data as during production aging (i.e. fish length, date captured, and area captured) but no knowledge of previous age estimates.
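The report does not specify how the range of lengths was taken into account, so the following is only one possible sketch of a blind, length-stratified random subsample. The record fields (`length_cm`, `age`), the 10 cm bin width, and the number of fish per bin are all hypothetical. Previous ages are removed from the returned records so that the re-aging is blind, while the remaining fields (such as length, date, and area) stay available to the reader.

```python
import random
from collections import defaultdict


def blind_subsample(records, per_bin, bin_width_cm=10, seed=None):
    """Randomly draw up to `per_bin` fish from each length bin.

    `records` is a list of dicts with a hypothetical "length_cm" key and an
    optional "age" key holding the previous age estimate.
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for rec in records:
        bins[int(rec["length_cm"] // bin_width_cm)].append(rec)

    chosen = []
    for key in sorted(bins):
        group = bins[key]
        chosen.extend(rng.sample(group, min(per_bin, len(group))))

    # Drop the previous age so the reader cannot see it while re-aging.
    return [{k: v for k, v in rec.items() if k != "age"} for rec in chosen]


# Hypothetical records: 40 fish with lengths from 20 to 59 cm.
fish = [{"id": i, "length_cm": 20 + i, "age": 2 + i // 10} for i in range(40)]
subsample = blind_subsample(fish, per_bin=2, seed=1)
```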
During age-testing exercises, no attempts were made to improve results with repeated readings. There was also no attempt to revise the production ages in cases where differences occurred. Results are presented in terms of percentage agreement, total coefficient of variation (CV), age-bias plots, and age-frequency tables (Campana et al. 1995; Campana 2001).
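As a rough illustration of the last two outputs, the sketch below builds an age-frequency table (counts of fish for each pairing of first and second age) and the per-age means that underlie an age-bias plot in the sense of Campana et al. (1995). The function names and the example ages are hypothetical, and the confidence intervals usually drawn on such plots are omitted.

```python
from collections import Counter, defaultdict


def age_frequency_table(reference, test):
    """Count fish for each (reference age, test age) pair."""
    if len(reference) != len(test):
        raise ValueError("readings must be paired")
    return Counter(zip(reference, test))


def age_bias_summary(reference, test):
    """Mean test age for each reference age (the points of an age-bias plot)."""
    if len(reference) != len(test):
        raise ValueError("readings must be paired")
    by_reference = defaultdict(list)
    for ref_age, test_age in zip(reference, test):
        by_reference[ref_age].append(test_age)
    return {age: sum(vals) / len(vals) for age, vals in sorted(by_reference.items())}


# Hypothetical paired ages for five fish (not data from this report).
reference_ages = [2, 2, 3, 3, 4]
test_ages = [2, 3, 3, 3, 5]
table = age_frequency_table(reference_ages, test_ages)
bias = age_bias_summary(reference_ages, test_ages)
```

Mean test ages that fall consistently above or below the 1:1 line for a given reference age suggest overaging or underaging, respectively, at that age.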
For cod, the current primary age reader was unable to conduct production aging within the available time, and did not resume production aging until late in 2005. Therefore, the previous age reader, who aged cod samples from 1984 to 2003, completed aging of all samples for the 2005 TRAC meeting. Following production aging, the accuracy of this previous age reader was determined from a random subsample drawn from the NEFSC cod otolith reference collection. Because of time constraints, no precision estimates were attempted.
For haddock, age-reader precision was estimated on multiple occasions from blind second readings of subsamples from each NEFSC survey (autumn 2004 and spring 2005) and from each quarter of the 2004 NEFSC commercial port samples. These exercises immediately followed the completion of each cruise or quarter. Following the completion of production aging, age-reader accuracy was also assessed by re-aging a subsample from the NEFSC haddock otolith reference collection.
For yellowtail flounder, age-reader precision was estimated three times from blind second readings of random subsamples from the 2004 Canadian Department of Fisheries and Oceans (DFO) port samples, the 2005 DFO spring survey, and a combination of U.S. samples (autumn 2004 and spring 2005 NEFSC surveys, plus 2004 NEFSC commercial port samples). These latter samples were also re-aged by the person who was then being trained as a yellowtail age reader. This trainee assumed yellowtail aging duties in 2006, after the previous age reader retired. Both age readers worked together during production aging for the above samples, precluding an interreader comparison.
RESULTS AND DISCUSSION
The total sample sizes associated with the accuracy and precision exercises were N = 106, 393, and 367 for cod, haddock, and yellowtail flounder, respectively. Results for cod are presented in Figure 1, for haddock in Figures 2-8, and for yellowtail flounder in Figures 9-12. Table 1 summarizes these results.
The accuracy estimate for cod was high (91% agreement), and the total CV (1.5%) was low. However, there was a tendency toward overaging by one year in the test readings (Figure 1). Even so, the precision level was virtually the same as that obtained last year (91% agreement and 1.9% CV, Sutherland et al. 2004 [unpubl.]), suggesting that the temporary change to the previous age reader was not problematic.
For haddock, precision levels ranged between 91 and 98% agreement, with total CVs of 0.3–0.9% (Figures 2-7), indicating a high level of consistency in age determinations. No disagreement between readings was more than one year. More importantly, no pattern of seasonal bias was present across exercises this year, unlike last year, when such a bias was observed in samples from the 1st and 2nd quarters (Sutherland et al. 2004 [unpubl.]). This year’s results showed an increase in precision from last year (median of 86% agreement and 2.0% CV, Sutherland et al. 2004 [unpubl.]). The relatively high accuracy estimate (94% agreement, 1.3% CV, Figure 8), coupled with consistently high precision results, supports the conclusion that the haddock age reader is performing at a reliable level of aging capability.
Precision levels for yellowtail flounder aging were consistent between Canadian samples from the 2004 DFO port samples (86% agreement, 2.5% CV, Figure 9) and the 2005 DFO spring survey (92% agreement, 1.8% CV, Figure 10). In the port samples, there was a tendency towards higher ages for intermediate-age fish in the second readings. The values obtained for U.S. samples, however, were less precise (71% agreement and 6.6% CV, Figure 11) and revealed a bias towards underaging of older fish (age ≥ 4 years) in the second readings.
When the latter exercise was performed by the trainee, results were comparably precise (73% agreement, 6.1% CV, Figure 12) but did not exhibit a bias. This may indicate that the change in age readers for yellowtail flounder could increase the reliability of age determinations. Nevertheless, the new reader’s progress was closely monitored in the first year of production aging.
Observations of poor scale condition in yellowtail flounder from eastern Georges Bank, which began in 2002, have continued in these samples. The scales were characterized by actual holes and moderate to severe erosion of the anterior scale edges (illustrated in Sutherland et al. 2004 [unpubl.]). This condition remains unexplained.
In summary, U.S. age determinations for cod and haddock appear to be reliable during recent production aging. Yellowtail flounder aging precision was acceptable for Canadian samples, but lower among U.S. samples. This situation may improve among samples aged in 2006, after the new age reader has taken responsibility for production aging in this species.
REFERENCES
Campana SE. 2001. Accuracy, precision, and quality control in age determination, including a review of the use and abuse of age validation methods. J Fish Biol. 59:197-242.
Campana SE, Annand MC, McMillan JI. 1995. Graphical and statistical methods for determining the consistency of age determinations. Trans Am Fish Soc. 124:131-138.
Hunt JJ, O'Brien L, Hatt B. 2005. Population status of eastern Georges Bank cod (unit areas 5Zj,m) for 1978-2006. TRAC Ref Doc. 2005/01; 48 p. Available at http://www.mar.dfo-mpo.gc.ca/science/trac/trac.htm
Penttila J, Dery LM. 1988. Age determination methods for northwest Atlantic species. NOAA Tech Rep NMFS 72; 135 p. Available at http://www.nefsc.noaa.gov/fbi/age-man.html
Silva V, Munroe N, Pregracke SE, Burnett J. 2004. Age structure reference collections: the importance of being earnest. In: Johnson DL, Finneran TW, Phelan BA, Deshpande AD, Noonan CL, Fromm S, Dowds DM, compilers. Current fisheries research and future ecosystems science in the Northeast Center: collected abstracts of the Northeast Fisheries Science Center's Eighth Science Symposium, Atlantic City, New Jersey, February 3-5, 2004. Northeast Fish Sci Cent Ref Doc. 04-01; p. 60.
Stone HH, Legault CM. 2005. Stock assessment of Georges Bank (5Zhjmn) yellowtail flounder for 2005. TRAC Ref Doc. 2005/04; 83 p. Available at http://www.mar.dfo-mpo.gc.ca/science/trac/trac.htm
Van Eeckhaute L, Brodziak J. 2005. Assessment of haddock on eastern Georges Bank. TRAC Ref Doc. 2005/03; 77 p. Available at http://www.mar.dfo-mpo.gc.ca/science/trac/trac.htm