GlobalChem: Your Chemical Knowledge Graph

Quick Start

Here you will learn how to install the initial package, build the initial network, print it out, and perform some analysis.

Install the Package(s)

GlobalChem is the graph network. It has no dependencies, and its functionality is built into the main object.

GlobalChemExtensions carries a large dependency network, so it is distributed separately. It gives cheminformaticians (or anyone, really) additional functionality for analyzing chemical data, including the GlobalChem Graph network itself.

The best way to interact with our API is to use our official library, distributed on PyPI:

# Install via pip
pip install global-chem

# Install the Extension Package (quoted so shells such as zsh do not expand the brackets)
pip install 'global-chem[extensions]'

Additional Dependency Features

Not everyone wants to install everything into their local environment, which can be hefty for something as large as GlobalChem. To combat this, we partitioned the applications' dependencies into separate groups that can be installed using the extras feature of setuptools. Please refer to the master extension list to see which group each app depends on.

  • cheminformatics

  • bioinformatics

  • quantum_chemistry

  • development_operations

  • forcefields

  • graphing

  • all - All the extensions

pip install 'global-chem[graphing]'
pip install 'global-chem[forcefields]'
pip install 'global-chem[bioinformatics]'
pip install 'global-chem[cheminformatics]'
pip install 'global-chem[quantum_chemistry]'
pip install 'global-chem[development_operations]'
pip install 'global-chem[all]'

Good to know: global-chem-extensions dependencies are not pinned to specific versions, in the hope of staying flexible across other development environments.
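After installing, it can help to confirm which of the two packages are importable in the current environment. The snippet below is a minimal sketch using only the standard library; the helper name installed_modules is made up for this illustration, and the module names are the ones imported later on this page.

```python
from importlib.util import find_spec


def installed_modules(names):
    """Return {module_name: bool} telling whether each top-level module can be imported."""
    return {name: find_spec(name) is not None for name in names}


# Module names taken from the import statements used on this page.
status = installed_modules(["global_chem", "global_chem_extensions"])
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

If global_chem_extensions is reported as missing, install one of the extras listed above.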

GlobalChem: Building the GlobalChem Graph Network

To build the GlobalChem Graph network, we first import the package, initialize the class, and call the build function:

from global_chem import GlobalChem
gc = GlobalChem()
gc.build_global_chem_network(print_output=True)
'global_chem': {
    'children': [
        'environment',
        'miscellaneous',
        'organic_synthesis',
        'medicinal_chemistry',
        'narcotics',
        'interstellar_space',
        'proteins',
        'materials'
    ],
    'name': 'global_chem',
    'node_value': <global_chem.global_chem.Node object at 0x10f60eed0>,
    'parents': []
}, etc.
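Each entry in the printed output records a node's name, its children, and its parents. As a rough sketch of how a structure of that shape can be walked, the example below uses a small hand-written dictionary (a hypothetical subset, not the library's real object, with node_value left out) and assumes that children and parents are lists of node names. The helper leaf_paths is made up for this illustration.

```python
# Toy network in the same shape as the printed output above.
# Hypothetical subset; 'node_value' is omitted because it holds library objects.
network = {
    "global_chem": {"name": "global_chem", "children": ["narcotics", "environment"], "parents": []},
    "narcotics": {"name": "narcotics", "children": ["pihkal"], "parents": ["global_chem"]},
    "environment": {"name": "environment", "children": ["emerging_perfluoroalkyls"], "parents": ["global_chem"]},
    "pihkal": {"name": "pihkal", "children": [], "parents": ["narcotics"]},
    "emerging_perfluoroalkyls": {"name": "emerging_perfluoroalkyls", "children": [], "parents": ["environment"]},
}


def leaf_paths(network, root):
    """Depth-first walk that returns every root-to-leaf path as a list of names."""
    paths = []
    stack = [(root, [root])]
    while stack:
        name, path = stack.pop()
        children = network[name]["children"]
        if not children:
            paths.append(path)
        # Push in reverse so children are visited in their listed order.
        for child in reversed(children):
            stack.append((child, path + [child]))
    return paths


for path in leaf_paths(network, "global_chem"):
    print(" -> ".join(path))
```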

To make it pretty:

gc.print_globalchem_network()

                                ┌solvents─common_organic_solvents
             ┌organic_synthesis─└protecting_groups─amino_acid_protecting_groups
             │          ┌polymers─common_monomer_repeating_units
             ├materials─└clay─montmorillonite_adsorption
             │                            ┌privileged_kinase_inhibtors
             │                            ├privileged_scaffolds
             ├proteins─kinases─┌scaffolds─├iupac_blue_book_substituents
             │                 │          └common_r_group_replacements
             │                 └braf─inhibitors
             │              ┌vitamins
             │              ├open_smiles
             ├miscellaneous─├amino_acids
             │              └regex_patterns
global_chem──├environment─emerging_perfluoroalkyls
             │          ┌schedule_one
             │          ├schedule_four
             │          ├schedule_five
             ├narcotics─├pihkal
             │          ├schedule_two
             │          └schedule_three
             ├interstellar_space
             │                    ┌cannabinoids
             │                    │         ┌electrophillic_warheads_for_kinases
             │                    ├warheads─└common_warheads_covalent_inhibitors
             └medicinal_chemistry─│      ┌phase_2_hetereocyclic_rings
                                  └rings─├iupac_blue_book_rings
                                         └rings_in_drugs
                                         

GlobalChem Extensions: Access Nodes and Perform a PCA and Radial Analysis

Let's have some fun. Let's access a node and perform a PCA analysis. We want to test whether the functional groups in a node are similar along some set of features, and then try to determine what those features are. This will help us understand which features are relevant for small molecules.

We are going to look at the list of molecules in pihkal because, according to the Wikipedia page it is drawn from, it is a pretty comprehensive list of what's currently on the drug market. This will help us identify features of relevance within that list.

PCA Analysis

from global_chem import GlobalChem
from global_chem_extensions import GlobalChemExtensions

gc = GlobalChem()
gc.build_global_chem_network(print_output=False, debugger=False)
smiles_list = list(gc.get_node_smiles('pihkal').values())

GlobalChemExtensions().node_pca_analysis(smiles_list, save_file=False)

If we look at the PCA analysis in more detail, we can see that the machine has picked out the fingerprints of the benzodioxole and of the secondary amine with a long alkyl chain as the primary features of this cluster. If you hover over the dots, you can find other clusters that appear to be primary core scaffolds within this dataset.
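To make the idea behind the plot more concrete, here is a minimal standard-library sketch of the first step of a PCA: finding the direction of largest variance in a set of bit-vector fingerprints and projecting each molecule onto it. This is not the code behind node_pca_analysis. The fingerprints are invented toy values, the function names are hypothetical, and power iteration is used here only as a simple stand-in for a full PCA routine.

```python
import math
import random


def first_principal_component(rows, iterations=200, seed=0):
    """Estimate the direction of largest variance by power iteration on the covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)] for i in range(d)]

    # Seeded random start vector, normalized to unit length.
    rng = random.Random(seed)
    vec = [rng.random() + 0.1 for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]

    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # no variance along the current direction
            break
        vec = [x / norm for x in nxt]
    return vec, centered


def project(centered, vec):
    """Score of each centered row along the given direction."""
    return [sum(c * v for c, v in zip(row, vec)) for row in centered]


# Invented 4-bit fingerprints for four hypothetical molecules.
fingerprints = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
pc1, centered = first_principal_component(fingerprints)
scores = project(centered, pc1)
for row, score in zip(fingerprints, scores):
    print(row, round(score, 3))
```

Molecules with similar bit patterns end up with similar scores, which is why related scaffolds land near each other in a PCA plot.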

Radial Analysis

Let's look at how the pihkal list compares to the rest of the nodes in the network using a radial analysis. For more details on the Radial Analysis algorithm, please see its dedicated page in this documentation.

from global_chem import GlobalChem
from global_chem_extensions import GlobalChemExtensions

gc = GlobalChem()
gc.build_global_chem_network(print_output=False, debugger=False)

smiles_list = list(gc.get_node_smiles('pihkal').values())
GlobalChemExtensions().sunburst_chemical_list(smiles_list, save_file=False)

If we take a quick look at the Pihkal list, we can see that its molecules are very similar to the covalent warheads. More nodes can be compared and more inferences can be drawn from the data, and it is up to the user to verify them :).
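The kind of comparison a radial chart summarizes can be sketched as a count of how many molecules in a list fall into each node. The example below is only an illustration under stated assumptions: the node contents are a tiny hand-written subset, the helper node_overlap is hypothetical, and exact SMILES string matching is a stand-in, since this page does not describe how sunburst_chemical_list matches molecules to nodes.

```python
def node_overlap(smiles_list, node_smiles):
    """For each node, count how many query SMILES appear verbatim in that node."""
    query = set(smiles_list)
    return {node: len(query & set(members)) for node, members in node_smiles.items()}


# Hand-written toy nodes (hypothetical subset, not the library's data).
node_smiles = {
    "amino_acids": ["NCC(=O)O", "CC(N)C(=O)O"],          # glycine, alanine
    "common_organic_solvents": ["CCO", "CC(C)=O", "O"],  # ethanol, acetone, water
}
query = ["CCO", "O", "NCC(=O)O"]

counts = node_overlap(query, node_smiles)
total = sum(counts.values())
for node, count in sorted(counts.items(), key=lambda kv: -kv[1]):
    share = count / total if total else 0.0
    print(f"{node}: {count} match(es), {share:.0%} of all matches")
```

Each node's share of the matches corresponds to the size of its slice in a radial chart.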

Enjoy

Read more of the documentation or just start playing around with the data. This data takes some time to digest, so patience is necessary when building your own networks as well. Happy cheminformatics.


Last updated 2 years ago