
Match collection to an enrichment database via taxonomic names
Usage
match_collection_to_enrich_database(
  collection,
  enrich_database,
  taxon_name_column = NA,
  taxon_name_full_column = NA,
  taxon_author_column = NA,
  enrich_taxon_name_column = NA,
  enrich_taxon_authors_column = NA,
  typo_method = "All",
  do_add_split = TRUE,
  do_fix_hybrid = TRUE,
  do_rm_autonym = TRUE,
  do_rm_cultivar_indeterminates = TRUE,
  do_match_single = TRUE,
  do_match_multiple = TRUE,
  do_fix_taxon_name = TRUE,
  matching_criterion = BGSmartR::no_additional_matching,
  ...
)
Arguments
- collection
A data frame containing a collection.
- enrich_database
A data frame of enriching information.
- taxon_name_column
The name of the column in collection corresponding to taxonomic names.
- taxon_name_full_column
The name of the column in collection corresponding to joined taxonomic names and authors.
- taxon_author_column
The name of the column in collection corresponding to the authors of the taxonomic names.
- enrich_taxon_name_column
The name of the column in enrich_database that corresponds to the taxonomic names.
- enrich_taxon_authors_column
The name of the column in enrich_database that corresponds to the authors of taxonomic names.
- typo_method
One of 'All', 'Data frame only', 'Data frame + common' or no, detailing the level of typo finding required.
- do_add_split
Flag (TRUE/FALSE) for whether we search for missing f./var./subsp.
- do_fix_hybrid
Flag (TRUE/FALSE) for whether we search for hybrid issues.
- do_rm_autonym
Flag (TRUE/FALSE) for whether we try removing autonyms.
- do_rm_cultivar_indeterminates
Flag (TRUE/FALSE) for whether we remove cultivars and indeterminates prior to taxonomic name matching.
- do_match_single
Flag (TRUE/FALSE) for whether we match to unique taxonomic names in enrich_database.
- do_match_multiple
Flag (TRUE/FALSE) for whether we match to non-unique taxonomic names in enrich_database.
- do_fix_taxon_name
Flag (TRUE/FALSE) for whether we attempt to fix common issues in taxonomic names to aid matching. Individual groups of fixes can also be turned on/off using the inputs do_add_split, do_fix_hybrid and do_rm_autonym.
- matching_criterion
A function used to choose the best match from extracts of the enrich_database.
- ...
Arguments (i.e., attributes) used in the matching algorithm (passed along to nested functions). Examples include enrich_display_in_message_column and enrich_plant_identifier_column.
Value
A list of length seven containing:
- $match
The index of the record in enrich_database which matches the record in the collection database.
- $details_short
A simplified message detailing the match.
- $match_taxon_name
A longer format message detailing the match.
- $original_authors
The author/s (extracted) from the collection database.
- $match_authors
The author/s of the matched record in enrich_database.
- $author_check
Either Identical, Partial or Different (No Match if a match to enrich_database cannot be found). A message describing the similarity between the collection's taxon authors and the authors found in enrich_database. Author similarity is found using the function author_check().
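Since $match holds row indices into enrich_database, the returned list can be used to pull enrichment columns alongside the collection. The following is a minimal base R sketch of that idea, not code from the package. The data frames, their column names (TaxonName, taxon_name, family) and the mock result list are invented for illustration, and it assumes that unmatched records carry NA or a non-positive value in $match, which this page does not state.

```r
# Hypothetical collection and enrichment database (illustrative columns only).
collection <- data.frame(
  TaxonName = c("Quercus robur", "Rosa sp.", "Acer campestre"),
  stringsAsFactors = FALSE
)
enrich_database <- data.frame(
  taxon_name = c("Acer campestre", "Quercus robur"),
  family = c("Sapindaceae", "Fagaceae"),
  stringsAsFactors = FALSE
)

# Mock of part of the returned list; these values are invented, not real output.
result <- list(
  match = c(2L, NA, 1L),
  author_check = c("Identical", "No Match", "Partial")
)

# Assumption: anything that is NA or below 1 means "no match".
idx <- result$match
idx[is.na(idx) | idx < 1] <- NA

# Indexing a data frame with NA yields a row of NAs, so unmatched records
# stay in the output with empty enrichment columns.
enriched <- cbind(
  collection,
  enrich_database[idx, , drop = FALSE],
  author_check = result$author_check
)
rownames(enriched) <- NULL
print(enriched)
```

Keeping unmatched records as rows of NA preserves the one-to-one alignment with the collection, which makes it easy to review records that failed to match.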
Details
This function allows matching of a collection's database to an enrichment database.
By default the function uses all the steps of our matching algorithm; for details see the vignette Matching.Rmd (Method of Matching taxonomic records). Parts of the algorithm that are not required can be switched off using typo_method, do_add_split, do_fix_hybrid, do_rm_autonym, do_rm_cultivar_indeterminates, do_match_single, do_match_multiple and do_fix_taxon_name. Moreover, by default no custom matching is performed. A user-supplied custom matching criterion (a function) can be added via the input matching_criterion.
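As an illustration of switching parts of the algorithm off, the sketch below wraps a call that restricts typo finding and disables the taxon name fixes. The wrapper name match_with_fewer_steps and the column names "TaxonName" and "taxon_name" are hypothetical; only the argument names and the typo_method value come from the usage above. It assumes BGSmartR is installed and that enrich_database already contains the columns required for matching.

```r
# Hypothetical convenience wrapper; defining it does not require BGSmartR,
# but calling it does.
match_with_fewer_steps <- function(collection, enrich_database) {
  BGSmartR::match_collection_to_enrich_database(
    collection = collection,
    enrich_database = enrich_database,
    taxon_name_column = "TaxonName",          # assumed column in collection
    enrich_taxon_name_column = "taxon_name",  # assumed column in enrich_database
    typo_method = "Data frame only",          # narrower typo search than "All"
    do_fix_taxon_name = FALSE                 # skip the common-issue fixes
  )
}
```

All other switches keep their defaults here, so single and multiple matching still run.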
To perform the matching you must specify the name of the column holding the taxon names in the enrichment database (enrich_taxon_name_column). If author matching is required then the authors column must also be specified for the enrichment database (enrich_taxon_authors_column).
The enrichment database must contain some columns required for matching (single_entry, taxon_length, etc.); we advise using prepare_enrich_database() to add these columns.
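A quick pre-flight check can catch an unprepared enrichment database before matching. This base R sketch checks only for the two required columns named above (single_entry and taxon_length); the full list is not given on this page, so the check is deliberately incomplete. The example data frame and its columns are invented for illustration.

```r
# Hypothetical, unprepared enrichment database.
enrich_database <- data.frame(
  taxon_name = c("Acer campestre", "Quercus robur"),
  taxon_authors = c("L.", "L."),
  stringsAsFactors = FALSE
)

# Only the two columns named in the text; other required columns may exist.
known_required <- c("single_entry", "taxon_length")
missing_columns <- setdiff(known_required, names(enrich_database))

if (length(missing_columns) > 0) {
  message(
    "Missing columns: ", paste(missing_columns, collapse = ", "),
    "; consider running prepare_enrich_database() first."
  )
}
```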
Similarly, you must specify the name of the column holding the taxon names in the collection database (taxon_name_column). If author matching is desired then you have two choices:
- Specify the taxon author column, taxon_author_column.
- Specify the combined taxon name and author column, taxon_name_full_column; the authors are then extracted by removing the words found in the taxon name from the full taxon name.
Note that if both are specified then the authors from taxon_author_column are used.
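The second option above, extracting authors by removing the taxon name's words from the full name, can be sketched in base R as follows. This is an illustration of the idea only and may differ from what the package does internally; the helper name extract_authors_sketch and the example names are hypothetical.

```r
# Hypothetical helper: drop every word of the taxon name from the full name
# and treat whatever remains as the author string.
extract_authors_sketch <- function(taxon_name, taxon_name_full) {
  name_words <- strsplit(taxon_name, " ", fixed = TRUE)
  full_words <- strsplit(taxon_name_full, " ", fixed = TRUE)
  mapply(
    function(full, name) paste(full[!full %in% name], collapse = " "),
    full_words, name_words,
    USE.NAMES = FALSE
  )
}

taxon_name <- c("Quercus robur", "Trochodendron aralioides")
taxon_name_full <- c("Quercus robur L.", "Trochodendron aralioides Siebold & Zucc.")

authors <- extract_authors_sketch(taxon_name, taxon_name_full)
print(authors)
```

Matching on whole words keeps the sketch simple, but it would also drop an author word that happens to equal a word in the taxon name.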