The key publication to read and cite is “InChI, the IUPAC International Chemical Identifier” (Stephen R Heller, Igor Pletnev, Stephen Stein and Dmitrii Tchekhovskoi, J. Cheminformatics, 2015, 7:23).
For the best survey of InChI use and publications, please read Wendy Warr’s excellent summary article “Many InChIs and quite some feat” (Warr, W.A., J. Comput. Aided Mol. Des., 2015, 29: 681. https://doi.org/10.1007/s10822-015-9854-3).
How to get involved
A number of InChI Working Groups, run by IUPAC, are extending the standard. For details, see https://www.inchi-trust.org/inchi-working-groups/
We welcome proposals for InChI enhancements as well as technical questions, bug reports and queries, or even offers of help – please use the SourceForge inchi-discuss list as a forum. To register, use the SourceForge web page (https://lists.sourceforge.net/lists/listinfo/inchi-discuss). Messages to the list (at [email protected]) from unregistered participants are moderated. The SourceForge project (see http://sourceforge.net/projects/inchi) aims to enable and encourage development of facilities and applications in an Open Source context.
These few general questions and answers about the InChI standard are supplemented by an extensive technical FAQ.
What is InChI?
Originally developed by the International Union of Pure and Applied Chemistry (IUPAC), the IUPAC International Chemical Identifier (InChI) is a character string generated by computer algorithm. It is a tool to be used in software applications designed and developed by those who choose to use it.
The InChI algorithm turns chemical structures into machine-readable strings of information. InChIs are unique to the compound they describe and can encode absolute stereochemistry, making chemicals and chemistry machine-readable and discoverable. A simple analogy is that InChI is the bar-code for chemistry and chemical structures.
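As a rough illustration of what "machine-readable" means here, the sketch below splits an InChI string into its slash-separated layers using only the Python standard library. The function name `split_inchi` is hypothetical and is not part of the InChI software. The sketch assumes an InChI whose first layer after the version is the chemical formula, and it does not validate the chemistry.

```python
def split_inchi(inchi: str) -> dict:
    """Split an InChI string into version, formula and prefixed layers.

    Illustrative helper only (not part of the official InChI software).
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("expected a string starting with 'InChI='")

    # Layers are separated by '/'; the first segment is the version.
    version, *segments = inchi[len(prefix):].split("/")

    formula = None
    # Assumption: a segment that starts with an upper-case letter
    # directly after the version is the chemical formula layer.
    if segments and segments[0][:1].isupper():
        formula = segments.pop(0)

    # Remaining layers start with a lower-case prefix letter, e.g.
    # 'c' (connections) or 'h' (hydrogens). A list of pairs is used
    # because a prefix may occur more than once in a longer InChI.
    layers = [(seg[:1], seg[1:]) for seg in segments if seg]

    return {"version": version, "formula": formula, "layers": layers}


# Standard InChI for ethanol.
parts = split_inchi("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
print(parts["version"])  # 1S
print(parts["formula"])  # C2H6O
print(parts["layers"])   # [('c', '1-2-3'), ('h', '3H,2H2,1H3')]
```

Because each layer carries its own prefix, software can compare or index individual parts of the identifier (for example, the formula alone) without redrawing the structure.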
The InChI format and algorithm are non-proprietary and the software is open source, with ongoing development done by the community. A number of IUPAC working groups are currently extending the standard for areas of chemistry that are not yet handled by the InChI algorithm.
What is the InChIKey?
The InChIKey was designed so that Internet search engines can search and find the links to a given InChI.
To make the InChIKey, the InChI string is passed through a hashing algorithm (a truncated SHA-256 hash) that produces a fixed-length, 27-character string of upper-case letters and hyphens. Because the InChI-to-InChIKey hashing is irreversible, an InChI cannot be recomputed from its InChIKey; however, a number of InChI resolvers are available to look up an InChI given an InChIKey.
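Because the InChIKey has a fixed length and layout, software can recognise a candidate key with a simple pattern check. The sketch below is a format check only, using the Python standard library. The helper name `looks_like_inchikey` is hypothetical, and a matching string is not guaranteed to correspond to a real compound.

```python
import re

# Layout assumed for an InChIKey: 14 letters, a hyphen,
# 10 letters, a hyphen, and 1 final letter (27 characters in total).
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(text: str) -> bool:
    """Return True if text has the layout of an InChIKey."""
    return INCHIKEY_PATTERN.fullmatch(text) is not None


# Standard InChIKey for ethanol.
ethanol_key = "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"
print(len(ethanol_key))                       # 27
print(looks_like_inchikey(ethanol_key))       # True
print(looks_like_inchikey("InChI=1S/C2H6O"))  # False
```

A check like this is useful for filtering free text before sending candidate keys to a resolver; it cannot recover the InChI itself, since the hash is one-way.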
Where can I download the InChI software?
The most current InChI software release can be downloaded from GitHub.
Why do we need the InChI and what is it used for?
InChIs and InChIKeys are used by scientists, publishers and database providers to enable web-based linking between sources of chemical content whether web-page, journal or magazine, or database.
Laboratory Information Systems
InChIKeys are used to integrate chemical data across other types of data in a research laboratory. This can include safety information, chemical properties, inventory, disposal, toxicity, etc.
Identifier for Chemical Registries
InChIKeys are often used as the primary chemical identifier to create a chemical inventory for a single site. Additionally, InChIKeys are being used for complex inventories spanning multiple sites and research laboratories.
Integrating Chemical and Biological Data
Integrating chemical and biological databases is a critical operation for many organizations. InChIKeys can be used to provide the link between databases which contain information on chemicals as well as biological and many other types of data.
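A minimal sketch of this kind of linking is shown below, using the InChIKey as the shared join key between two sources. The record layouts, the function name `link_by_inchikey` and the assay identifiers are all hypothetical placeholders; only the InChIKeys themselves (ethanol, caffeine and aspirin) are real.

```python
# Hypothetical records from a chemical registry, keyed by InChIKey.
chemical_records = {
    "LFQSCWFLJHTTHZ-UHFFFAOYSA-N": {"name": "ethanol"},
    "RYYVLZVUVIJVGH-UHFFFAOYSA-N": {"name": "caffeine"},
}

# Hypothetical records from a biological database; the assay
# identifiers are placeholders, not real data.
assay_records = [
    {"inchikey": "RYYVLZVUVIJVGH-UHFFFAOYSA-N", "assay_id": "ASSAY-001"},
    {"inchikey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N", "assay_id": "ASSAY-002"},
]


def link_by_inchikey(chemicals: dict, assays: list) -> list:
    """Pair each assay record with the chemical sharing its InChIKey."""
    linked = []
    for assay in assays:
        chemical = chemicals.get(assay["inchikey"])
        if chemical is not None:  # keep only keys present in both sources
            linked.append({**chemical, **assay})
    return linked


linked = link_by_inchikey(chemical_records, assay_records)
print(linked)
```

In this sketch only caffeine appears in both sources, so it is the only linked record. The same idea carries over to a database join on an InChIKey column.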
Search InChIKey
Google has implemented the use of InChIKey for queries. An InChIKey can simply be embedded in a Google query to find all of the information on a chemical. This includes most patents and a broad set of additional related chemical data.
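Embedding an InChIKey in a query can also be done programmatically. The sketch below builds a search URL with the Python standard library. It assumes the usual Google search address with a `q` parameter, and the function name `inchikey_search_url` is hypothetical.

```python
from urllib.parse import urlencode


def inchikey_search_url(inchikey: str) -> str:
    """Build a web search URL whose query is the given InChIKey."""
    # Assumption: the usual Google search address with a 'q' parameter.
    return "https://www.google.com/search?" + urlencode({"q": inchikey})


# Standard InChIKey for ethanol.
print(inchikey_search_url("LFQSCWFLJHTTHZ-UHFFFAOYSA-N"))
# https://www.google.com/search?q=LFQSCWFLJHTTHZ-UHFFFAOYSA-N
```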
InChI in Education (OER)
The InChI Open Education Resource (OER) is devoted to the use of InChI. Chemical nomenclature underpins chemical communication, and InChI supports the advancement of chemical nomenclature into the digital age. InChI is evolving to handle reactions, mixtures and other needs of 21st-century scientific communication, and yet there is little educational material available on the use of InChI. The OER provides InChI-related resources to assist practicing scientists and educators in learning about and benefiting from the use of InChI.
Directions and Priorities
The InChI Membership Program provides direction and priorities for the project. The code is maintained by a broad group of supporters from industry, government and academia and is contributed to the InChI Trust. There are about a dozen Working Groups extending InChI in many different areas of chemistry (nanomolecules, polymers, organometallics, Markush, etc.). We also support some paid software development for integrations across areas (like stereochemistry or tautomers).
History of InChI
1999: Steve Heller initiated a proposal at the National Institute of Standards and Technology (NIST) for a public domain structure representation standard for the NIST databases, along with Steve Stein, who was the initial designer and architect for the project.
2000: at a meeting with a wide range of users of chemical nomenclature, including database providers, patent officials, international trade representatives and others, held under the direction of Alan McNaught, it was decided that InChI would be an IUPAC initiative to meet the needs of the chemical and related communities.
2001: the IUPAC Chemical Identifier project began in collaboration with the US National Institute of Standards and Technology (NIST). The aim was to devise a computer-based algorithm yielding a unique label for any chemical structure, regardless of how it was represented (on screen).
2005 (April): version 1 of the IUPAC International Chemical Identifier (InChI) was launched. The development and associated programming work was predominantly carried out by Dmitrii Tchekhovskoi.
2008: a shorter hash key version of InChI, known as InChIKey, was developed by Igor Pletnev. This alternative format is much more suitable for use in search engines and offers many new possibilities for software applications.
2009 (January): standard versions of InChI and the InChIKey were released, which took the original algorithm with its many variable parameters and fixed them so that interoperability between databases and resources with InChIs could be achieved.
2009 (July): the InChI Trust was formed. The 2009 Board of Directors consisted of eight members: The Royal Society of Chemistry • Nature Publishing Group • Elsevier Properties SA • Thomson Reuters • John Wiley & Sons • FIZ CHEMIE • Taylor & Francis • IUPAC.
2011: version 1.04 of the InChI software was released as well as an InChI Certification Suite which is designed to check the correct installation of the InChI software. Certification logos for correct installations as well as InChI Trust member and supporter logos are provided by the InChI Trust.
2017: version 1.05 of the InChI software was released, along with version 1.00 of the Reaction InChI (RInChI).
2020: version 1.06 of the InChI software was released.
2024: the InChI codebase moved to GitHub, and the new version 1.07 was approved.