TY - JOUR
T1 - Adapting mark-recapture methods to estimating accepted species-level diversity: a case study with terrestrial Gastropoda
AU - Rosenberg, Gary
AU - Auffenberg, Kurt
AU - Bank, Ruud
AU - Bieler, Rüdiger
AU - Bouchet, Philippe
AU - Herbert, David
AU - Köhler, Frank
AU - Neubauer, Thomas A.
AU - Neubert, Eike
AU - Páll-Gergely, Barna
AU - Richling, Ira
AU - Schneider, Simon
N1 - Funding Information:
This work was supported by NSF grants DBI 1902328 (lead PI: N. Yeung, through a subaward to Gary Rosenberg) for Pacific Island land snails, EF-02667 (PIs: P. Sierwald and Rüdiger Bieler) for terrestrial and aquatic North American mollusks, and DBI 2001570 (PI: Gary Rosenberg) and DBI 2001510 (PI: Rüdiger Bieler) for mollusks of the Eastern Seaboard of the United States. The work of the WoRMS Data Management Team is funded by Research Foundation—Flanders (FWO) as part of the Belgian contribution to LifeWatch. In addition, the involved authors have previously received financial support through the Belgian contribution to LifeWatch, to expand the content and enhance the quality of MolluscaBase. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Publisher Copyright:
© 2022 Rosenberg et al.
PY - 2022/6
Y1 - 2022/6
N2 - We introduce a new method of estimating accepted species diversity by adapting mark-recapture methods to comparisons of taxonomic databases. A taxonomic database should become more complete over time, so the error bar on an estimate of its completeness and the known diversity of the taxon it treats will decrease. Independent databases can be correlated, so we use the time course of estimates comparing them to understand the effect of correlation. If a later estimate is significantly larger than an earlier one, the databases are positively correlated; if it is significantly smaller, they are negatively correlated; and if the estimate remains roughly constant, then the correlations have averaged out. We tested this method by estimating how complete MolluscaBase is for accepted names of terrestrial gastropods. Using random samples of names from an independent database, we determined whether each name led to a name accepted in MolluscaBase. A sample tested in August 2020 found that 16.7% of tested names were missing; one in July 2021 found 5.3% missing. MolluscaBase grew by almost 3,000 accepted species during this period, reaching 27,050 species. The estimates ranged from 28,409 ± 365 in 2021 to 29,063 ± 771 in 2020. All estimates had overlapping 95% confidence intervals, indicating that correlations between the databases did not cause significant problems. Uncertainty beyond sampling error added 475 ± 430 species, so our estimate for accepted terrestrial gastropod species at the end of 2021 is 28,895 ± 630 species. This estimate is more than 4,000 species higher than previous ones. The estimate does not account for ongoing flux of species into and out of synonymy, new discoveries, or changing taxonomic methods and concepts. The species naming curve for terrestrial gastropods is still far from reaching an asymptote, and combined with the additional uncertainties, this means that predicting how many more species might ultimately be recognized is presently not feasible. Our methods can be applied to estimate the total number of names of Recent mollusks (as opposed to names currently accepted), the known diversity of fossil mollusks, and known diversity in other phyla.
KW - Biodiversity informatics
KW - Diversity
KW - Global species databases
KW - Mark-recapture
KW - Mollusca
KW - Sources of uncertainty
KW - Species richness
KW - Taxonomic databases
KW - Terrestrial gastropods
U2 - 10.7717/peerj.13139
DO - 10.7717/peerj.13139
M3 - Article
AN - SCOPUS:85132452359
SN - 2167-8359
VL - 10
JO - PeerJ
JF - PeerJ
M1 - e13139
ER -