Quality Assurance and Quality Control Estimates
for the Production Ageing of Northwest Atlantic Species
It is important to ensure consistency in fish ages generated by a production ageing laboratory. There are three components to measuring this consistency, namely accuracy, intra-reader precision, and inter-reader precision. Accuracy is determined by how closely the ages generated in production ageing match the known ages for a set of fish; this is a measure of whether the age reader applies ageing criteria correctly. Intra-reader precision is determined by how reliably an age reader will assign the same age to an individual fish; this is a measure of how consistently the ageing criteria are applied from day to day. Finally, inter-reader precision tests whether fish ages are comparable between different people; it is measured by two (or more) age readers examining the same set of fish independently. For all three components, age is determined multiple times for each fish, and a comparison of the resulting ages determines the level of consistency. These aspects of consistency may change over time or between age readers, so it is necessary to measure them regularly throughout the production ageing process.
The three components affect the production age data in different ways, but all may introduce errors into the data. Measurement of ageing error has two primary aspects: quantification of variability, and detection of systematic bias. Any significant ageing bias indicates that ageing criteria are not being applied properly, whether it occurs within an intra-reader precision test (indicative of a drift in how the person applies ageing criteria) or in an accuracy test (indicating that the person is applying incorrect ageing criteria). Either case implies that the most recent production ages may be inconsistent with past years’ data. Variability in intra-reader precision levels will introduce random errors, and may reduce the apparent abundance of strong year-classes while making weak year-classes appear more abundant. Within accuracy tests, high variability indicates that ageing criteria are not being applied consistently. Finally, if two readers differ in their age determinations, it becomes more difficult to utilize data from both of them in one stock assessment.
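The two aspects of ageing error named above, variability and systematic bias, can be illustrated with a minimal Python sketch that compares two sets of ages for the same fish. The page does not state which statistics are reported, so the choices here (percent agreement, mean coefficient of variation, and mean signed difference) are illustrative assumptions, as is the function name `compare_paired_ages`. Fish lacking either age are dropped, as the methods for these tests specify.

```python
from statistics import mean, stdev


def compare_paired_ages(first_ages, second_ages):
    """Summarise agreement between two age readings of the same fish.

    Each list holds one age per fish, in the same fish order.
    A missing reading is represented by None (an assumption of this sketch).
    """
    # Keep only fish that have both ages assigned.
    pairs = [(a, b) for a, b in zip(first_ages, second_ages)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no fish have both ages assigned")

    n = len(pairs)

    # Variability: share of fish given the same age both times.
    percent_agreement = 100.0 * sum(a == b for a, b in pairs) / n

    # Variability: per-fish coefficient of variation, averaged over fish.
    # A fish aged 0 both times has mean 0; its CV is set to 0 here (assumption).
    cvs = []
    for a, b in pairs:
        m = mean((a, b))
        cvs.append(100.0 * stdev((a, b)) / m if m > 0 else 0.0)
    mean_cv = mean(cvs)

    # Bias: mean signed difference (second minus first).
    # A value far from 0 suggests a systematic offset between the readings.
    mean_difference = mean(b - a for a, b in pairs)

    return {
        "n": n,
        "percent_agreement": percent_agreement,
        "mean_cv": mean_cv,
        "mean_difference": mean_difference,
    }
```

The same function could serve all three test types: the first list would hold reference ages in an accuracy test, production ages in an intra-reader test, or the other reader's ages in an inter-reader test. A formal test for bias would need more than the mean difference shown here.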
Providing these measures of consistency allows assessment scientists to consider these sources of variability within stock assessments. These measures are regularly estimated within the Fishery Biology Program at the Northeast Fisheries Science Center (NEFSC). Accuracy tests are conducted for those species that have reference collections already assembled. Intra-reader precision tests are conducted on each set of production ages generated. A test of inter-reader precision is completed when an inter-laboratory exchange is conducted or a change in age reader occurs (due to temporary substitution or when training a permanent replacement). This website is an effort to make the results of those tests easily available to assessment scientists and other interested parties.
All production ageing at the Fishery Biology Program at the NEFSC follows established ageing methods, as described in Penttila and Dery (1988). Tests of the various aspects of ageing consistency are regularly conducted as described below. In all tests, age readers have knowledge of the data normally available during production ageing (i.e., fish length, date captured, and area captured), but do not have knowledge of previous ages given to the fish. If two ages (e.g., test age and production age) are not assigned to a given fish for any reason, that fish is excluded from calculation of statistical measures.
Accuracy
For each accuracy test, age readers are asked to re-age a random subset (N = 50–100 fish) of the reference collection. Most tests are conducted after the completion of production ageing; for haddock, tests are usually conducted both before and after production ageing.
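Drawing such a subset while keeping the reader blind to earlier ages could be sketched as follows. This is only an illustration: the record layout, the field names (`fish_id`, `length_cm`, `date_captured`, `area`), and the function `draw_blind_subset` are hypothetical and not taken from the laboratory's own systems.

```python
import random

# Fields a reader normally sees during production ageing (hypothetical names):
# an identifier plus fish length, date captured, and area captured.
VISIBLE_FIELDS = ("fish_id", "length_cm", "date_captured", "area")


def draw_blind_subset(collection, n, seed=None):
    """Return n randomly chosen fish records with any previous ages removed.

    `collection` is a list of dicts, one per fish in the reference collection.
    """
    if n > len(collection):
        raise ValueError("subset size exceeds collection size")
    rng = random.Random(seed)          # seeded so a draw can be reproduced
    chosen = rng.sample(collection, n)  # sampling without replacement
    # Copy only the visible fields, so earlier ages never reach the reader.
    return [{key: fish[key] for key in VISIBLE_FIELDS} for fish in chosen]
```

Recording the seed alongside the test results would let the same subset be reconstructed later when test ages are matched back to reference ages by `fish_id`.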
A prerequisite to conducting these tests is the establishment of a reference collection, composed of a few hundred fish of known age. However, it is very difficult to obtain a large sample of fish for which the ages are definitively known; therefore, the NEFSC ageing laboratory has selected samples which have been aged by multiple experienced age readers and for which a consensus age has been agreed upon (Silva et al. 2004). For cod and haddock, samples have been assembled from past inter-laboratory exchanges with Canadian age readers; therefore, these reference collections only include fish from the Georges Bank stock. In the case of yellowtail flounder, however, samples from various stocks were chosen, distributed to four age readers experienced in ageing this species, and only fish for which these readers agreed on the age remained in the collection. Reference collections for other species will be assembled in upcoming years.
Calibration
For species without established reference collections, a calibration test may be done. This consists of re-ageing a representative subsample of fish from a previous year, before the current year's production is begun. This determines whether the age reader has a sufficient precision level to generate reliable ages, but does not test the accuracy of these ages.
Intra-reader precision (listed as Precision tests)
After the completion of production ageing for a given set of samples (i.e., a specific survey, quarter, or year), the age reader conducts a precision test on a representative subsample (usually 50–100 fish) taken from that sample set. Subsamples are randomly selected, but include the range of lengths and sampling locations in the production age sample. Stock management areas are combined together in these tests, except when production ageing for each stock area occurs at different times. Although test ages may differ from production ages, no effort is made to improve results by further examination of samples, nor are production ages revised after tests are conducted.
Inter-reader precision (listed as Precision or Exchange)
Precision tests between two readers are conducted less frequently, when a change in age reader occurs either due to temporary substitution or when a new age reader has been trained. They are structured similarly to intra-reader precision tests: one reader first ages all the samples (perhaps while doing production ageing of the fish), and a second reader later re-ages a portion (or all) of the sample set. Results of these tests are presented in terms of one reader's ages. In cases where one reader has been training the other to age a species, the trainee's ages are presented in terms of the established reader's ages. In other cases, either set of ages may be presented on the x-axis, and no assumption is made as to which person's ages are more reliable.
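Presenting one reader's ages in terms of the other's can be illustrated by tabulating, for each age on the x-axis, how many fish received that age and the mean age the other reader gave them. The sketch below is an assumption about how such a summary might be built, not a description of the program's own software; `age_bias_table` is a hypothetical name, and missing ages are represented by `None`.

```python
from collections import defaultdict
from statistics import mean


def age_bias_table(x_ages, y_ages):
    """For each x-axis age, report the count and mean of the paired y ages.

    x_ages: ages from the reader shown on the x-axis
            (e.g., the established reader when a trainee is being tested).
    y_ages: ages from the other reader, in the same fish order.
    Returns a list of (x_age, n_fish, mean_y_age) tuples sorted by x_age.
    """
    groups = defaultdict(list)
    for x, y in zip(x_ages, y_ages):
        if x is None or y is None:
            continue  # fish lacking either age are excluded
        groups[x].append(y)
    return [(x, len(ys), mean(ys)) for x, ys in sorted(groups.items())]
```

Rows where the mean y age departs consistently from the x age would point to a difference in how the two readers apply the ageing criteria at those ages.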
One specific type of inter-reader test is the inter-laboratory exchange. Such tests are conducted annually for cod and haddock in cooperation with Canada's Department of Fisheries and Oceans (DFO). In each exchange, each laboratory ships otolith samples to the other laboratory, whose age reader then determines the ages of those samples.
Historically, this ‘two-reader’ approach was used within the Fishery Biology Program to ensure quality control. The primary age reader for a given species would examine all the samples during production ageing, and then the second age reader would review 5–10% of the samples. This tested whether the primary reader had applied ageing criteria in the same way as the second reader.