iBet uBet web content aggregator. Adding the entire web to your favor.



Link to original content: http://list.worldfloraonline.org/taxon/wfo-9000000496-2024-06
WFO Plant List: Name Matching

 

WFO Plant List API

Welcome to the WFO Plant List API. The WFO Plant List is the consensus list of names used as the taxonomic backbone of the World Flora Online portal. A new version is released every six months.

This site provides techies with access to WFO Plant List data. It is intended for those who are comfortable working with simple (not pretty) HTML forms or who want to exploit the REST or GraphQL APIs.

Are you in the right place?

  1. WFO Portal: The main entry point for the WFO including all the description, distribution, image and other data. This is where you go to find out more about a plant.
  2. WFO Plant List: Human friendly access to the WFO Plant List in all its versions as soon as they are released. It forms part of 1, the main portal.
  3. WFO Plant List Download [doi:10.5281/zenodo.7460141]: Gives access to the same data as available from 2 above to download in multiple formats and citable via a DOI.
  4. WFO Plant List API: This site. Gives access to the same data as available from 2 above but via APIs and specialist tools.
  5. Rhakhis Taxonomic Editor: A tool for taxonomists preparing the next WFO Plant List data release.

The services here only cover names governed by the International Code of Nomenclature for Algae, Fungi and Plants as curated by the WFO. If your data includes names from the Zoological code, or you wish to query other sources, then you may be better served by the Global Names Verifier.

What is here?

For our purposes, matching/reconciling is the process of binding your data to a WFO Name record (represented by a WFO ID) on the basis of a string of characters you supply. This differentiates between a Name (capital 'N') and a string of characters that represents the Name in a particular context, and thus avoids us getting into a semantic/philosophical tangle.

Data Model

Every six months, on the solstices, a snapshot of the data in the Rhakhis editor is taken and added to this service. The data available here therefore represents multiple classifications of the plant kingdom showing how our understanding has changed through time.

In order to represent multiple classifications in a single dataset it is necessary to adopt the TaxonConcept model. This differentiates between taxa (TaxonConcepts), which vary between classifications, and names (TaxonNames), which do not vary but which may play different roles in different classifications.

Taxon name/concept background: A good analogy for those unfamiliar with the TaxonConcept model is that of polygons and points within a geospatial model. A classification is like a map of contiguous, nested polygons (like counties, regions, countries, continents). These are the taxa. The names are like fixed points on the map. They never move. Each polygon might contain multiple points. The name used for a polygon is based on the oldest point that occurs within it. Other names that fall in the polygon are referred to as synonyms. Different taxonomic classifications are like different maps of the same terrain with different polygons but with the same points. Polygons on two maps might have the same calculated name but different boundaries and different synonyms. It is therefore necessary to refer to taxa in different classifications using unique identifiers rather than just their calculated names.
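The analogy can be made concrete with a small sketch. Everything in it is illustrative: the `Name` class, the placeholder IDs, labels and years, and the taxon keys are made up for the example and are not taken from the WFO data or API. Choosing the accepted name as the oldest name by year is a simplification of the analogy, not the full nomenclatural rules.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Name:
    """A fixed 'point': it never changes between classifications."""
    wfo_id: str  # made-up placeholder ID, not a real record
    label: str   # made-up name string
    year: int    # made-up year of publication


def accepted_name(names: List[Name]) -> Name:
    """The calculated name of a taxon: the oldest name it contains."""
    return min(names, key=lambda n: n.year)


def synonyms(names: List[Name]) -> List[Name]:
    """Every other name that falls inside the same taxon."""
    accepted = accepted_name(names)
    return [n for n in names if n != accepted]


a = Name("wfo-0000000001", "Name a", 1753)
b = Name("wfo-0000000002", "Name b", 1800)
c = Name("wfo-0000000003", "Name c", 1820)

# Two 'maps' of the same terrain: same points, different polygons.
classification_1: Dict[str, List[Name]] = {"c1-taxon-1": [a, b], "c1-taxon-2": [c]}
classification_2: Dict[str, List[Name]] = {"c2-taxon-1": [a], "c2-taxon-2": [b, c]}

for classification in (classification_1, classification_2):
    for taxon_id, names in classification.items():
        print(taxon_id, accepted_name(names).label,
              [s.label for s in synonyms(names)])
```

Both `c1-taxon-1` and `c2-taxon-1` end up with the calculated name "Name a", but only the first contains "Name b" as a synonym. That is why the taxa need their own identifiers rather than just their calculated names.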

Identifiers

All name records have a single, prescribed ID of the form wfo-0000615907: the lowercase letters "wfo", followed by a hyphen, followed by ten digits. A regular expression similar to '/^wfo-[0-9]{10}$/' will match a WFO ID (depending on your precise regex implementation).
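As an illustration of the "depending on your precise regex implementation" caveat, here is a minimal Python sketch. The function name `is_wfo_id` is hypothetical. It uses `fullmatch` rather than `^...$` because in Python `$` also matches just before a trailing newline.

```python
import re

# "wfo-" in lowercase followed by exactly ten ASCII digits.
WFO_ID_PATTERN = re.compile(r"wfo-[0-9]{10}")


def is_wfo_id(candidate: str) -> bool:
    """Return True only if the whole string is a ten digit WFO ID."""
    return WFO_ID_PATTERN.fullmatch(candidate) is not None


print(is_wfo_id("wfo-0000615907"))          # True
print(is_wfo_id("WFO-0000615907"))          # False: letters must be lowercase
print(is_wfo_id("wfo-0000615907-2022-12"))  # False: this is a versioned ID
```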

Once created, WFO IDs are never deleted and will always return data. However, it is possible that two IDs have been created for one real world name. We do our utmost to prevent this happening but we are still dealing with some historical duplication within the data. Where it is decided that multiple WFO IDs apply to a single name, the records are merged and one of the IDs is prescribed as the one that should be used going forward. The other WFO ID becomes a deduplicated ID for that name record. Services will still respond to that ID. It will never be deleted but it won't be presented as the WFO ID that should be used for that name again.

With each data release a new set of IDs is created of the form wfo-0000615907-2022-12. For each name the year and month of the data release are appended. A regular expression similar to '/^wfo-[0-9]{10}-[0-9]{4}-[0-9]{2}$/' will match a versioned WFO ID (depending on your precise regex implementation).
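A versioned ID of the form wfo-0000615907-2022-12 can be split into its name ID, year and month. The sketch below is one way to do that in Python. The names `VersionedWfoId` and `parse_versioned_wfo_id` are hypothetical, and the pattern only checks the shape of the string, not whether the release actually exists.

```python
import re
from typing import NamedTuple, Optional

# Ten digit name ID, then a four digit year and a two digit month.
VERSIONED_PATTERN = re.compile(r"(wfo-[0-9]{10})-([0-9]{4})-([0-9]{2})")


class VersionedWfoId(NamedTuple):
    name_id: str  # the un-versioned, ten digit WFO ID
    year: int     # year of the data release
    month: int    # month of the data release


def parse_versioned_wfo_id(candidate: str) -> Optional[VersionedWfoId]:
    """Split a versioned WFO ID into its parts, or return None."""
    match = VERSIONED_PATTERN.fullmatch(candidate)
    if match is None:
        return None
    return VersionedWfoId(match.group(1), int(match.group(2)), int(match.group(3)))


print(parse_versioned_wfo_id("wfo-0000615907-2022-12"))
print(parse_versioned_wfo_id("wfo-0000615907"))  # None: no release suffix
```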

Within a data release names play one of four roles, and the meaning of the sixteen digit WFO ID depends on the role the name is playing.

  1. Placed as the accepted name of a taxon. The ID refers to the taxon.
  2. Placed as a synonym within a taxon. The ID refers to the name usage.
  3. Unplaced if our experts have yet to express an opinion on placement. The ID refers to the name alone.
  4. Deprecated if our experts conclude it isn't possible to place this name and it should not be used. The ID refers to the name record that shouldn't be used.
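If you are handling these IDs in your own code, the four roles can be written down as a lookup. This is only a sketch: the enum `NameRole`, its string values and the `ID_REFERS_TO` table are hypothetical labels for the list above, and the strings the APIs actually return may differ.

```python
from enum import Enum


class NameRole(Enum):
    """The four roles a name can play within one data release."""
    ACCEPTED = "accepted"      # hypothetical label, not an API value
    SYNONYM = "synonym"        # hypothetical label, not an API value
    UNPLACED = "unplaced"      # hypothetical label, not an API value
    DEPRECATED = "deprecated"  # hypothetical label, not an API value


# What a sixteen digit (versioned) WFO ID refers to for each role.
ID_REFERS_TO = {
    NameRole.ACCEPTED: "the taxon",
    NameRole.SYNONYM: "the name usage",
    NameRole.UNPLACED: "the name alone",
    NameRole.DEPRECATED: "the name record that shouldn't be used",
}

for role in NameRole:
    print(f"{role.value}: the ID refers to {ID_REFERS_TO[role]}")
```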

Un-versioned, ten digit WFO IDs will usually be treated as referring to the current usage of that name. That is, if a data release version isn't specified, the current release will be presumed.
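The same presumption can be mirrored on the client side. The sketch below is an assumption about how you might do it, not a description of what the services do internally. The function `resolve_wfo_id` is hypothetical, and the caller has to supply the current release (for example "2022-12") because nothing here looks it up.

```python
import re

UNVERSIONED = re.compile(r"wfo-[0-9]{10}")
VERSIONED = re.compile(r"wfo-[0-9]{10}-[0-9]{4}-[0-9]{2}")


def resolve_wfo_id(wfo_id: str, current_release: str) -> str:
    """Return a versioned WFO ID, presuming the current release if none is given.

    current_release is a "YYYY-MM" string supplied by the caller.
    """
    if VERSIONED.fullmatch(wfo_id):
        return wfo_id  # already pinned to a specific release
    if UNVERSIONED.fullmatch(wfo_id):
        return f"{wfo_id}-{current_release}"
    raise ValueError(f"Not a WFO ID: {wfo_id!r}")


print(resolve_wfo_id("wfo-0000615907", "2022-12"))
print(resolve_wfo_id("wfo-0000615907-2022-12", "2024-06"))
```

Both calls print wfo-0000615907-2022-12: the first because the release is appended, the second because an ID that is already versioned is left alone.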

WFO IDs are used in their ten digit and sixteen digit forms in different services or as the final parts of Stable HTTP URIs.

To link to a WFO name it is recommended to always use the Stable HTTP URI form and not to reverse engineer the URL that appears in the browser address bar, which isn't guaranteed to be stable.

Scalability and Performance

Currently no API keys are required for these services. They are open for anyone to use. If we find that the service is being degraded we may introduce IP based throttling or access tokens to ensure availability for all.

This whole application can be installed on a server or personal machine by anyone with appropriate technical skills. If you are likely to require heavy use of the service or wish to embed it within a production workflow you are encouraged to install a local copy of the application. The code and instructions are available on GitHub. The data can be downloaded from Zenodo. If you have any questions, please contact Roger Hyam.