An assessment of the information content of South African alien species databases

http://www.abcjournal.org doi:10.4102/abc.v45i1.1103

National alien species databases indicate the state of a country’s biodiversity and provide useful data for research on invasion biology and the management of invasions. In South Africa there are several different published alien species databases, but these databases were created for different purposes and vary in completeness and information content. We assessed the information content of published South African alien species databases in the context of other such databases globally, and evaluated how the information content of South African databases varies across taxonomic groups. Although introduction pathway, date of introduction, region of origin and current broad-scale distribution data are available for most taxonomic groups assessed (60% – 90%), data on invasion status, introduction effort and introduction source are available for few taxonomic groups (5% – 18%). South African alien species databases have lower information content than the detailed databases available in other parts of the world and thus cannot be utilised to the same extent. We conclude with 11 recommendations for improving South African alien species databases. In particular, we highlight the data types that should be incorporated in future databases and argue that existing data should be collated in a single, standardised meta-database to facilitate cross-taxon comparisons, highlight gaps in effort, and inform managers and policy makers concerned with alien species.

Authors: Katelyn T. Faulkner, Dian Spear, Mark P. Robertson, Mathieu Rouget, John R.U. Wilson


Introduction
Humans are introducing species to regions beyond their native range; however, few of these species become invasive and have deleterious impacts (Blackburn et al. 2011). National lists of alien species provide the taxonomic identities of introduced species. These data are required to assess the current state of biodiversity; for example, they are used to measure progress towards meeting the Convention on Biological Diversity's (CBD) Strategic Plan for Biodiversity (2011-2020) Aichi target 9 (Butchart et al. 2010; McGeoch et al. 2010, 2012; UNEP 2011). Alien species databases contain much more data than a simple list of introduced species. The valuable data stored in these databases (e.g. on pathways and dates of introduction, distribution and invasion success) can be used to inform the management of invasions and further our understanding of biological invasions (Table 1) (also see Cadotte, Murray & Lovett-Doust 2006; Pyšek et al. 2012). For example, alien species databases are a data source for research on the predictors of invasion success, pathways of introduction and species distribution modelling. Such research underpins invasive species risk assessments and aids in the prioritisation of species, pathways and areas for surveillance and management.
The documented knowledge of introduced organisms varies greatly across countries (Pyšek et al. 2008). Although some databases provide minimal data, others are quite detailed. For example, an alien plant catalogue for the Czech Republic provides 13 fields of data on 1454 species (Pyšek et al. 2012; Pyšek, Sádlo & Mandák 2002). The data provided in this Czech catalogue have been used in studies covering many topics, including range filling, associations with pollinators and the interaction of traits (Pyšek et al. 2012). In contrast, databases that lack detail, or that are incomplete or poorly contextualised, pose a biosecurity risk and may reduce management effectiveness and research quality and scope (McGeoch et al. 2012; Pyšek 2003). Moreover, global research effort on alien species (Pyšek et al. 2008) and alien species databases (Crall et al. 2006; Ricciardi et al. 2000) are taxonomically biased.
The consequences of inadequate databases and taxonomically biased data can be averted by identifying data gaps and working to reduce the detected disparities. However, increasing the amount of data does not necessarily lead to an equal increase in benefits for research, decision making and management (Grantham et al. 2008; Pyšek et al. 2008; Simberloff 2003). For example, detailed data (e.g. on population biology) are often not required to eradicate recently introduced species, but may be vital for the management of established alien species (Simberloff 2003). Additionally, comprehensive data on a limited number of species are often sufficient to generalise and develop theories on biological invasions (Pyšek et al. 2008). Thus, although the data contained in detailed alien species databases are valuable, the types and amount of data required will depend on the research question or management strategy (Table 1).
South Africa has a large number of alien species from a wide variety of taxonomic groups, including the Insecta, Mammalia, Mollusca and Plantae (Henderson 2001; Herbert 2010; Picker & Griffiths 2011; Van Rensburg et al. 2011). For many taxonomic groups recent alien species databases are available, some of which provide many types of data. However, these databases were developed for different purposes and vary in information content. Consequently, it is unknown whether South African alien species databases can be used to the same extent as the detailed databases in other countries. We aimed to assess the overall information content of South African alien species databases in terms of introduction (dates, pathways, effort and source), region of origin, distribution and invasion status (current status and failure). We explore how the information content of these databases varies across taxonomic groups. Finally, we identify knowledge gaps and suggest key areas for future work.

Database identification
Alien species databases published up until December 2012 in peer-reviewed papers, books and reports were identified and assessed. A large number of databases pertain to South African alien species, but many are either poorly integrated or do not focus entirely on alien species. Therefore, we obtained a sample that was of a manageable size and that was representative of all taxonomic groups. These databases were identified using expert opinion and by consulting the references of previously assessed publications. We only assessed databases developed for a national level or databases developed for a regional or global level from which national level data could be extracted. Although comprehensive lists of alien Reptilia in captivity (Van Wilgen et al. 2010) and Plantae under cultivation (Glen 2002) are available, lists of species in the introduction stage of the invasion continuum (Blackburn et al. 2011) are not available for many other taxonomic groups. Furthermore, many of the data types assessed here (e.g. distribution data) are not applicable for species that have not yet spread outside of captivity or cultivation. Thus databases of species in captivity or under cultivation were not evaluated. A total of 34 alien species databases spanning 23 taxonomic groups were assessed, such that an indication of the number of alien taxa and the data content housed in each database was obtained (Tables A1 & A2).
For each taxonomic group we selected (from the sample of 34 databases) recent databases (2000–2012) that list a high number of alien taxa and that provide numerous types of data (Tables A1 & A2). We focussed on more recent alien species databases as such databases collate and update the data found in previous inventories, and should incorporate more recent taxonomic revisions. Additionally, for taxonomic groups that occur in a range of environments (e.g. Mollusca and Crustacea), care was taken to ensure that the selected databases spanned the various environments inhabited (Table A2). Consequently, for some groups, databases that list few species but focus on a specific environment were included. For example, Insecta that are associated with the intertidal zone are discussed in a paper by Mead et al. (2011) on estuarine and marine taxa (Table A2). For taxonomic groups for which multiple, recent alien species databases exist, we used expert opinion to further confirm our selection. For each taxonomic group (i.e. marine invertebrate groups [e.g. Tunicata], Plantae, Aves, Reptilia, Crustacea, Insecta, Actinopterygii [ray-finned fishes], Mollusca and Mammalia), at least one South African expert who has worked on alien species listing was contacted. Each expert was asked to identify, for the taxonomic group of interest, the published alien species database that is currently the most comprehensive with regard to both the listed taxa and information content. Based on the opinion of these experts, two databases (i.e. De Moor & Bruton 1988 and Germishuizen et al. 2006) were added to our selection as they currently contain the most recent, comprehensive lists available for the Actinopterygii and Plantae, despite De Moor and Bruton (1988) being published before 2000. Finally, as an updated version of Germishuizen et al. (2006) is available online (http://posa.sanbi.org), this online database was used in the full analysis. In total, 14 databases spanning 23 taxonomic groups were selected for the full analysis (Tables A1 & A2).

TABLE 1: Research questions or topics that can be addressed using the data in alien species databases, the usefulness of each question or topic for management or policy, the types of data provided by alien species databases required to address each question and examples from literature.

Data extraction
Data on taxon name and taxonomic group were extracted from the 14 selected alien species databases. Taxa were assigned to taxonomic groups based on the taxonomy used by the selected databases. Although such definitions may influence results and lead to groupings at various taxonomic levels, these groupings reflect the taxonomic levels at which alien taxa are often listed and managed.
Taxa listed that are translocated indigenous species (e.g. the Mozambique tilapia Oreochromis mossambicus; see Van Rensburg et al. 2011), suspected to be indigenous or listed as 'dubious records' (e.g. the mollusc Vertigo antivertigo, which has been found only as a subfossil; see Herbert 2010) were not included in the analysis. As the listing of species in captivity or under cultivation is not comprehensive, any species listed that has entered the country but is not found outside of captivity or cultivation was excluded from the analysis. Furthermore, the Brachiopoda, one of the 23 taxonomic groups included in the selected databases, were not included in the analysis as the only introduced species, Discinisca tenuis, is found exclusively within aquaculture facilities (Mead et al. 2011).
Although for each taxonomic group recent alien species databases that list many species were utilised to develop the resultant list of taxa (Figure 1), there may be alien taxa in South Africa, besides those discussed in the paragraph above, that have been excluded. Such exclusions may be a result of listing errors (McGeoch et al. 2012) or the rapid rate at which new species are introduced. However, the aim of this work was not to create a comprehensive list of South African alien taxa but rather to assess the data provided by a representative sample of existing alien species databases. Additionally, our aim can be achieved by using a representative list that contains a large proportion of South African alien taxa.
Date of introduction, pathway of introduction, region of origin, distribution and invasion status data were extracted from the selected alien species databases (Table 2). Notes were also taken on whether data on introduction source (region from which the organism was introduced), introduction effort (number of individuals introduced and/or introduction events) and failure (taxa that failed to establish) were provided (Table 2). Approximate dates of introduction or regions of origin (e.g. continent) and distribution data in descriptive form or point distribution maps were included as available data (Table 2). Invasion status data were only deemed available if the invasion status of the organism as per Richardson et al. (2000) or Blackburn et al. (2011) was stated or the category of the taxon under legislation, namely the Conservation of Agricultural Resources Act (CARA) and National Environmental Management Biodiversity Act (NEMBA), was specified (Table 2). Although various invasion status classifications exist, the classifications of Richardson et al. (2000) and Blackburn et al. (2011) were employed as they are used internationally (e.g. Pyšek et al. 2012) and as the classification of Blackburn et al. (2011) is applicable to all taxa. These classifications divide the invasion continuum into four stages: transport, introduction, establishment and spread (Blackburn et al. 2011; Richardson et al. 2000). Based on the invasion stage occupied, an organism's invasion status is classified as (1) introduced or casual, (2) naturalised or established and (3) invasive (Blackburn et al. 2011; Richardson et al. 2000). Data were classified as unavailable if either no data were available or the characteristics were listed as 'unknown'. The information content of the selected alien species databases for each taxonomic group was determined by calculating the total number of alien taxa in each taxonomic group (Figure 1), and determining the percentage of taxa in each group for which the data of interest were provided. Results were plotted in R version 3.0.0 (R Core Team 2013).

FIGURE 1:
The number of alien taxa listed for each taxonomic group in the selected alien species databases and included in the analysis. (Axes: number of alien taxa; taxonomic group.)
Note: Pycnogonida (sea spiders), Porifera (sponges), Echinodermata (e.g. star fish and sea urchins), Nematoda (round worms), Bryozoa (moss animals), Platyhelminthes (flat worms), Myriopoda (e.g. centipedes), Cnidaria (e.g. jelly fish), Tunicata (ascidians), Actinopterygii (ray-finned fishes), Annelida (e.g. earthworms), Aves (birds).
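The percentage calculation described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names and the helpers `is_available` and `percent_available` are hypothetical, and the original analysis (plotted in R) is not reproduced here. The sketch applies the rule that a value counts as unavailable when it is missing or listed as 'unknown'.

```python
from collections import defaultdict

# Hypothetical records (not from the assessed databases): one dict per alien
# taxon, with None or 'unknown' marking data that a database does not provide.
records = [
    {"group": "Aves", "pathway": "release", "date": "1890", "status": "invasive"},
    {"group": "Aves", "pathway": "escape", "date": None, "status": "unknown"},
    {"group": "Insecta", "pathway": None, "date": None, "status": None},
    {"group": "Insecta", "pathway": "contaminant", "date": "unknown", "status": None},
]

def is_available(value):
    """A value is unavailable if it is missing or listed as 'unknown'."""
    return value is not None and str(value).strip().lower() != "unknown"

def percent_available(records, category):
    """Percentage of taxa in each taxonomic group with data for `category`."""
    totals = defaultdict(int)
    provided = defaultdict(int)
    for rec in records:
        totals[rec["group"]] += 1
        if is_available(rec.get(category)):
            provided[rec["group"]] += 1
    return {group: 100.0 * provided[group] / totals[group] for group in totals}

pathway_pct = percent_available(records, "pathway")
date_pct = percent_available(records, "date")
```

With real data, each record would be one listed taxon and each category one of the data types in Table 2; the resulting per-group percentages are the quantities a plot such as Figure 2 would display.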

Results
For the majority of the taxonomic groups, pathway (64% of taxonomic groups) and date of introduction data (59% of taxonomic groups) are available for over 50% of taxa (Figure 2). These introduction data are available for a large proportion of the vertebrate and invertebrate groups (Figure 2). However, the availability of both pathway and date of introduction data is poor for the two taxonomic groups with the greatest number of recorded taxa, namely the Plantae and Insecta (Figure 2). The availability of other introduction data is, in general, poor: introduction source data are only available for the Actinopterygii, whereas introduction effort data are available for the Aves, Actinopterygii, Mammalia and Reptilia.

FIGURE 2: Percentage of the total number of alien taxa per taxonomic group for which data on (1) pathway of introduction, (2) date of introduction, (3) region of origin, (4) distribution, (5) invasion status and (6) all the aforementioned categories were provided. The number of species in each taxonomic group is given in round brackets and taxonomic groups are arranged according to descending data comprehensiveness (i.e. the number of categories for which data are available for greater than 50% of taxa).

Data on region of origin are available for a large proportion (50% or greater) of taxa from all taxonomic groups except the Plantae, Tunicata and Ciliophora, that is, for 82% of taxonomic groups (Figure 2). For the majority of taxonomic groups, these data are available at a continental scale (Table 2).
Distribution data are available for over 50% of the taxa from all taxonomic groups except the Mammalia, that is, for 91% of taxonomic groups (Figure 2). For most taxa these data are in a descriptive form and point data are only available for the terrestrial Mollusca and some introduced Plantae (Table 2).
Invasion status data are not available for most taxonomic groups (86%), with the exception of the Aves, Reptilia and Mollusca, for which these data are available for more than 50% of taxa (Figure 2). When all taxonomic groups are considered, invasion status data are available for 33% (633 of 1945) of taxa. For those taxa for which invasion status data are available, 14% (88 of 633) were classified as introduced or casual, 23% (145 of 633) as established and 63% (400 of 633) as invasive. Data on introductions that failed to establish are only available for the Actinopterygii (4 taxa), Aves (52 taxa), Mammalia (1 taxon) and Insecta (23 taxa released as biological control agents).
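The percentages in this paragraph follow directly from the reported counts. As a quick check, the short Python sketch below recomputes them; the variable names and the helper `pct` are ours and purely illustrative, while the counts are those given in the text.

```python
# Counts reported in the text for invasion status data.
total_taxa = 1945
with_status = 633
by_status = {"introduced or casual": 88, "established": 145, "invasive": 400}

def pct(part, whole):
    """Percentage rounded to the nearest whole number, as reported in the text."""
    return round(100 * part / whole)

overall = pct(with_status, total_taxa)
shares = {name: pct(count, with_status) for name, count in by_status.items()}
counts_sum = sum(by_status.values())
```

The three status counts sum to the 633 taxa with invasion status data, and the rounded shares match the 14%, 23% and 63% quoted above.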
Across the taxonomic groups few taxa (172 taxa or 9%) had data available for all data categories (Figure 2). Additionally, for only one taxonomic group (Mollusca) data for all categories are available for the majority of taxa (Figure 2). No data are available for 8% of the introduced Insecta. However, across taxonomic groups, data for at least one data category are available for 98% of introduced taxa. Therefore, the level of data provided by South African alien species databases is high for some taxonomic groups (e.g. Mollusca, Reptilia, Aves, Crustacea and some marine invertebrate groups), but low for others (e.g. Plantae and Insecta) (Figure 2).
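Figure 2 arranges taxonomic groups by descending data comprehensiveness, defined as the number of categories for which data are available for more than 50% of taxa. A minimal Python sketch of that ordering is given below. The group labels and percentages are hypothetical placeholders, not values from the assessment, and the function name `comprehensiveness` is our own.

```python
CATEGORIES = ("pathway", "date", "origin", "distribution", "status")

# Hypothetical availability percentages per group and category
# (placeholder values for illustration only, not results from the paper).
availability = {
    "Group A": {"pathway": 80, "date": 75, "origin": 90, "distribution": 95, "status": 60},
    "Group B": {"pathway": 10, "date": 5, "origin": 40, "distribution": 90, "status": 30},
    "Group C": {"pathway": 70, "date": 65, "origin": 100, "distribution": 40, "status": 20},
}

def comprehensiveness(percentages, threshold=50):
    """Number of categories with data for more than `threshold` percent of taxa."""
    return sum(1 for category in CATEGORIES if percentages[category] > threshold)

scores = {group: comprehensiveness(p) for group, p in availability.items()}

# Groups in descending order of comprehensiveness, as in the figure layout.
ranked = sorted(scores, key=scores.get, reverse=True)
```

A strict "greater than 50%" comparison is used here to mirror the figure caption; a group with exactly 50% in a category would not be counted for that category.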

Discussion
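The […] are available for alien taxa in South Africa in comparison to data available for organisms in Europe (Genovesi et al. 2012; Kenis et al. 2007; Pyšek et al. 2012), for vertebrates in Brazil (Rocha, Bergallo & Mazzoni 2011) and for Plantae in Chile (Ugarte et al. 2011). However, although the availability […] data in South Africa is poor in comparison to some nations, for example, alien Plantae of the Czech Republic (Pyšek et al. 2012) and New Zealand (Howell & Sawyer 2006), it is similar to the availability of these data in other countries, for example, aquatic species in Germany (Gollasch & Nehring 2006).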
Consequently, the degree to which South African alien species databases can be used for research and management varies across taxa and depends on the type of data required (Table 1). For instance, pathway of introduction analyses, work on the predictors of invasion success and distribution modelling are possible for the Mollusca, and pathway analyses are feasible for Aves (Table 1). However, as they currently stand, even the most detailed South African alien species databases cannot be utilised to the same degree as the detailed catalogues that are available in other parts of the world. For example, South African alien species databases cannot be used to tackle the wide range of research topics (for example, species invasiveness, habitat invasibility and rates of spread; Table 1) that have been addressed using the alien plant catalogues of the Czech Republic (Pyšek et al. 2012).
The data gaps identified here may be attributed to two main sources, namely a lack of data and data inaccessibility (McGeoch et al. 2012). A lack of data may be ascribed to difficulty recording and collecting data on some organisms. For example, data on intentional introductions (e.g. pathway and date of introduction) may be more easily recorded than for unintentional introductions (Lehan et al. 2013). However, as shown here, the data available for taxonomic groups that are often introduced accidentally (e.g. Mollusca and Crustacea) are comparable to the data available for organisms that are often introduced intentionally (e.g. Aves and Reptilia).
Moreover, the relatively poor data available for the Plantae and Insecta may be ascribed to difficulties in collecting, recording and maintaining data for a large number of organisms. A lack of data can be remedied by directed action. For example, the MammalMAP project will improve distribution data for African Mammalia, including aliens ( ). These data availability problems are not unique (Crall et al. 2006; Ricciardi et al. 2000); for example, only 43% of invasive species databases in the USA are available online (Crall et al. 2006).
Lists of alien species suffer from a wide variety of errors (McGeoch et al. 2012), and any inaccuracies in the taxonomic data contained in the utilised databases would have influenced our conclusions (Pyšek et al. 2013). Alien plants and vertebrates in South Africa are relatively well studied (Richardson et al. 2003); in contrast, as a result of inadequate sampling and poor taxonomic knowledge, data on invertebrates are inadequate (Griffiths, Robinson & Mead 2009; Picker & Griffiths 2011; Richardson et al. 2011). These taxonomic biases may be a result of research needs (plants dominate the alien species pool), the ease with which plants may be recorded and studied (Crall et al. 2006; Pyšek et al. 2008), and the high degree of human assistance required for vertebrate introductions (Van Rensburg et al. 2011; Vitousek et al. 1996). As a consequence, the taxonomic data and related alien species richness estimates for plants and vertebrates may be more reliable than those available for invertebrates. However, determining the number and identity of introduced taxa in a region is difficult and differing definitions, methodologies or years of assessment can lead to disparate results (Bastos et al. 2011; Pyšek et al. 2004; Vitousek et al. 1996).
Finally, a wide range of alien and invasive species definitions exist and the use of disparate definitions may lead to listing differences and confusion (McGeoch et al. 2012; Richardson et al. 2000). In this assessment we only included invasion status designations made using the terminologies of Richardson et al. (2000) and Blackburn et al. (2011). Thus the inclusion of other terminologies and definitions may have increased the number of taxa (particularly for the Plantae) for which invasion status data are available (Richardson et al. 2000). For example, SAPIA designates species into categories that include transformer weeds, special effect weeds and ruderal weeds. However, the classifications of Richardson et al. (2000) and Blackburn et al. (2011) are utilised internationally and it is vital for research and management that such standardised and recognised terminologies and classifications are employed (Pyšek et al. 2004).

Conclusion
We conclude with 11 recommendations for improving South African alien species databases in Box 1. We argue that the last recommendation (that of creating a meta-database) is currently the highest priority. A meta-database should have a standard format that would facilitate analyses within and across taxonomic groups. Currently, the wide variety of data formats in use makes these analyses difficult. The database would potentially resolve issues of accessibility, and could be formally published periodically (Cadotte, Murray & Lovett-Doust 2006; Pyšek, Sádlo & Mandák 2002; Pyšek et al. 2012). A database such as this, which can be rapidly updated, would better manage the rapidly changing nature of alien species data. The database could include known failed introductions, hybrids and taxa in captivity or under cultivation. Additionally, invasive alien taxa that pose an introduction risk because of their presence in neighbouring countries could be included (Hulme et al. 2009a). As it would work across different databases, data quality checks could be developed (Crall et al. 2006) and independent reviews would be easier to undertake (Hulme et al. 2009a). These checks could focus on the various errors that may influence the data quality of alien species databases (McGeoch et al. 2012) and which in turn affect the management and research that rely on these data (Crall et al. 2006; Pyšek 2003). We believe that trying to combine databases into a single meta-database will help resolve, or at least highlight, many of the gaps in our knowledge of alien species in South Africa, and will certainly help work towards regular, detailed biodiversity assessments.
BOX 1: Recommendations on how South African alien species databases can be improved.
1. Future databases should include data on species name, synonyms, family, date of introduction, pathway of introduction (which could be classified according to Hulme et al. (2008) as release, escape, contaminant, stowaway, corridor and unaided), introduction effort, point of introduction, introduction source, region of origin, date of last record, distribution, invasion status, impact and biological data. The collation of such data for individual species would require considerable effort, numerous data sources and consultation with experts.
2. Further surveys, particularly focusing on poorly surveyed organisms, for example soil organisms and other invertebrates (see Spear et al. 2011), should be undertaken and more taxonomists should be trained and funded (Pyšek et al. 2013). Such targeted investments often lead to a large increase in the number of recorded alien taxa (Hulme et al. 2009b; Mead et al. 2011). Additionally, sampling should be focussed on introduction hotspots, for example, harbours for marine organisms (Griffiths, Robinson & Mead 2009).
3. Lists of alien taxa in captivity or under cultivation need to be collated. Such lists are vital to prevent introductions through escapes. The collation of these lists would require information from various sources, for example, lists of terrestrial vertebrates kept in zoos (Spear & Chown 2008, 2009), Actinopterygii in aquaria stores (Semmens et al. 2004), vertebrates in pet stores and Plantae in nurseries (see Van Wilgen et al. 2010).
4. Standardised, internationally recognised terminologies and definitions must be utilised. For the purpose of invasion status designations we recommend the framework of Blackburn et al. (2011). This scheme is applicable across taxa, although the categories might need additional interpretation for particular groups (e.g. Wilson et al. 2014 for introduced trees). For recording current environmental impact we recommend another recent scheme by Blackburn et al. (2014).
5. The metadata for databases need to state the purpose for which the database was developed.
6. Estimates of the effort taken in constructing the databases are needed. For example, which areas of the country were sampled and with what intensity. Additionally, information on the sources of additional data and the effort expended to identify these sources would be useful.
7. Estimates of the error rates in existing databases (e.g. the number of taxonomic misidentifications) are difficult to measure, but crucial if the databases are to be used with confidence, and can have important consequences for management (Paterson et al. 2011; Pyšek et al. 2013). Updated databases could report errors made in previous versions and justifications for changes could be provided (e.g. Pyšek et al. 2012).
8. Existing expertise should be utilised. This could be facilitated through the use of an expertise registry that is regularly updated (e.g. Musil & Macdonald 2007).
9. Taxonomies must be standardised and synonymies avoided. For example, for the Plantae the Angiosperm Phylogeny Group (e.g. APG 2009) can be used to standardise the taxonomy of angiosperm species. See www.theplantlist.org for accepted nomenclature.
10. Data from different sources need to be collated, shared and published (Crall et al. 2006). Various unpublished sources of data exist and to identify these sources the assistance of many experts would be required.
11. Finally, a single meta-database should be developed for the purpose of housing data on all South African alien taxa (see the Conclusion for details).
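To make the idea of a standard record format more concrete, the fields listed in recommendation 1 could be expressed as a single record type. The Python sketch below is one possible layout and not a proposed standard: the class and field names are our own assumptions, the pathway categories follow the six classes attributed to Hulme et al. (2008) in recommendation 1, the status categories follow the three-way classification described in the Data extraction section, and the example record is a placeholder rather than real data.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional

class Pathway(Enum):
    """Pathway classes named in recommendation 1 (after Hulme et al. 2008)."""
    RELEASE = "release"
    ESCAPE = "escape"
    CONTAMINANT = "contaminant"
    STOWAWAY = "stowaway"
    CORRIDOR = "corridor"
    UNAIDED = "unaided"

class InvasionStatus(Enum):
    """Three-way invasion status classification used in the assessment."""
    INTRODUCED = "introduced or casual"
    ESTABLISHED = "naturalised or established"
    INVASIVE = "invasive"

@dataclass
class AlienTaxonRecord:
    """Hypothetical meta-database record; None marks data not yet collated."""
    species_name: str
    family: str
    synonyms: List[str] = field(default_factory=list)
    date_of_introduction: Optional[str] = None
    pathway: Optional[Pathway] = None
    introduction_effort: Optional[str] = None
    point_of_introduction: Optional[str] = None
    introduction_source: Optional[str] = None
    region_of_origin: Optional[str] = None
    date_of_last_record: Optional[str] = None
    distribution: Optional[str] = None
    invasion_status: Optional[InvasionStatus] = None
    impact: Optional[str] = None
    biological_data: Optional[str] = None

    def missing_fields(self):
        """Names of fields for which no data have been provided."""
        return [name for name, value in vars(self).items() if value is None]

# Placeholder record for illustration only.
record = AlienTaxonRecord(
    species_name="Example species",
    family="Example family",
    pathway=Pathway.CONTAMINANT,
    invasion_status=InvasionStatus.INVASIVE,
)
gaps = record.missing_fields()
```

A helper such as `missing_fields` is one simple way a shared format could support the gap analysis and data quality checks discussed in the Conclusion, since the same test applies to every taxonomic group.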

TABLE 2:
Categories of information content used in the analysis of the South African alien species databases and ranked value.
'Possibly extinct, failed to establish', 'Extinct' †, A value of 1 represents a high ranking.
[…] are available for alien taxa in South Africa in comparison to data available for organisms in Europe (Genovesi et al. 2012; Kenis et al. 2007; Pyšek et al. 2012), for vertebrates in Brazil (Rocha, Bergallo & Mazzoni 2011) and for Plantae in Chile (Ugarte et al. 2011). However, although the availability […] data in South Africa is poor in comparison to some nations, for example, alien Plantae of the Czech Republic (Pyšek et al. 2012) and New Zealand […]

TABLE A2:
Results of the assessment of alien species databases.