GlobalChem: Your Chemical Knowledge Graph

Decoding Fingerprints and SMILES to IUPAC

Decoding a fingerprint back to a SMILES string, and from there to an IUPAC name, requires a well-annotated dictionary of bit vectors that can accurately identify the chemical space present in your molecule.

To accomplish this, GlobalChem keeps a one-to-one mapping between its entries and their Morgan fingerprints, generated with a bit length of 512 and a radius of 2 to capture the chemical environment. From here you can experiment with the parameters to decode fragmented SMILES, long SMILES, or individual bit vector fragments.

Load the Decoder Engine

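# Assumes the global-chem-extensions package is installed.
from global_chem_extensions import GlobalChemExtensions

cheminformatics = GlobalChemExtensions().cheminformatics()
decoder_engine = cheminformatics.get_decoder_engine()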

Generate a Morgan Fingerprint

For ease of use, this example keeps the hyperparameters defined in GlobalChem: a radius of 2 and a bit length of 512.

For benzene:

morgan_fingerprint = decoder_engine.generate_morgan_fingerprint('C1=CC=CC=C1')
print(morgan_fingerprint)
00000000000000000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
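To see which bits are set without reading all 512 characters, a small helper can list the positions of the ones. This is a minimal sketch that assumes the fingerprint is available as a plain string of '0' and '1' characters, as in the output above; `on_bits` and the toy 16-bit string are illustrative names and data, not part of GlobalChem.

```python
from typing import List


def on_bits(bit_string: str) -> List[int]:
    """Return the zero-based indices of the set bits in a '0'/'1' string."""
    if set(bit_string) - {"0", "1"}:
        raise ValueError("expected a string containing only '0' and '1'")
    return [i for i, bit in enumerate(bit_string) if bit == "1"]


# Toy 16-bit fingerprint, used here in place of a real 512-bit one.
toy_fingerprint = "0000100000010001"
print(len(toy_fingerprint), on_bits(toy_fingerprint))
```

The same call should work on a 512-character string. Small molecules typically set only a handful of bits, so the index list is much easier to compare by eye than the raw vector.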

Classify Fingerprints

You can classify a fingerprint against a single node, by passing in its key, or against all of them. The choice depends on how accurate you need the result to be and which chemical space you would like to explore.

print(decoder_engine.classify_fingerprint(
    morgan_fingerprint,
    node='organic_and_inorganic_bronsted_acids'
))
['benzene']
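The idea of restricting a lookup to one node can be illustrated with an ordinary dictionary. The sketch below is not GlobalChem's implementation: the `ANNOTATED_BITS` table, its node names, the 8-bit vectors, and the `classify` function are all hypothetical stand-ins for an annotated one-to-one mapping of bit vectors to names.

```python
from typing import Dict, List, Optional

# Hypothetical annotated dictionary: node name -> {bit string -> name}.
# Toy 8-bit vectors; these are not real GlobalChem entries.
ANNOTATED_BITS: Dict[str, Dict[str, str]] = {
    "node_a": {"10010000": "fragment_one", "00000110": "fragment_two"},
    "node_b": {"10010000": "fragment_one_alias", "01000001": "fragment_three"},
}


def classify(fingerprint: str, node: Optional[str] = None) -> List[str]:
    """Return names whose stored bit vector equals the fingerprint exactly.

    With node=None every node is searched; otherwise only the named one.
    """
    nodes = [node] if node is not None else list(ANNOTATED_BITS)
    matches = []
    for name in nodes:
        hit = ANNOTATED_BITS[name].get(fingerprint)
        if hit is not None:
            matches.append(hit)
    return matches


print(classify("10010000", node="node_a"))  # search one node
print(classify("10010000"))                 # search every node
```

Searching a single node narrows the answer to one chemical space, while searching all nodes can return several names for the same bit vector.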

Classify Bigger SMILES

To classify a larger SMILES, GlobalChem uses the BRICS module to fragment the molecule into smaller pieces whose bit vectors are easier to match, then compares them by Tanimoto similarity to arrive at a viable functional group space. As before, you pass in a node to choose which bit vector chemical space to explore.

print(decoder_engine.classify_smiles_using_bits(
    'CCC(=O)N(C1CCN(CC1)CCC2=CC=CC=C2)C3=CC=CC=C3',
    node='organic_and_inorganic_bronsted_acids'
))
['benzene', 'ammonia']
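The comparison step can be sketched in plain Python. Tanimoto similarity is the number of bits set in both vectors divided by the number set in either. The example below assumes the fragments have already been produced and fingerprinted (it performs no BRICS fragmentation), and the `classify_fragments` function, the 0.5 threshold, and the toy 8-bit reference vectors are assumptions for illustration rather than GlobalChem's actual settings.

```python
from typing import Dict, List


def tanimoto(a: str, b: str) -> float:
    """Tanimoto similarity of two equal-length '0'/'1' strings."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x == "1" and y == "1")
    either = sum(1 for x, y in zip(a, b) if x == "1" or y == "1")
    return both / either if either else 0.0


def classify_fragments(
    fragments: List[str], reference: Dict[str, str], threshold: float = 0.5
) -> List[str]:
    """Name each fragment after its most similar reference vector.

    Fragments whose best score is below the threshold are skipped,
    and each name is reported once.
    """
    names: List[str] = []
    for fragment in fragments:
        best_name, best_score = None, 0.0
        for bits, name in reference.items():
            score = tanimoto(fragment, bits)
            if score > best_score:
                best_name, best_score = name, score
        if best_name is not None and best_score >= threshold and best_name not in names:
            names.append(best_name)
    return names


# Toy reference vectors and fragment fingerprints (hypothetical data).
reference = {"11001000": "ring_like", "00000111": "amine_like"}
fragments = ["11001100", "00000011", "00110000"]
print(classify_fragments(fragments, reference))
```

Raising the threshold makes the matches stricter and may leave some fragments unnamed; lowering it returns more names at the cost of weaker matches.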

