GlobalChem
First, initialize the class:
from global_chem import GlobalChem
gc = GlobalChem()
Print the Network
gc.print_globalchem_network()
Check the available nodes in GlobalChem:
nodes_list = gc.check_available_nodes()
print(nodes_list)
Retrieve all the Nodes
nodes_list = gc.get_all_nodes()
Get the Tree Depth of GlobalChem
depth = gc.get_depth_of_globalchem()
Get the SMILES of All GlobalChem Nodes
smiles = gc.get_all_smiles()
Get the SMARTS of All GlobalChem Nodes
smarts = gc.get_all_smarts()
Get the Names of All GlobalChem Nodes
names = gc.get_all_names()
Get SMILES Definition by IUPAC Name
This function matches the query against the stored names using the Levenshtein distance and accepts matches that fall within the given distance tolerance. It normalizes the query automatically, stripping grammar and ignoring letter case, then finds the best-fitting names and returns their paths in the network. You can choose to return only the exact definition or the partial definitions as well.
definition = gc.get_smiles_by_iupac(
    'benzene',
    distance_tolerance=7,
    return_partial_definitions=True
)
print(definition)
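As a rough illustration of tolerance-based matching with the Levenshtein distance, here is a minimal, self-contained sketch. It is not GlobalChem's implementation: `normalize`, `levenshtein`, `fuzzy_lookup`, and the toy lookup table are hypothetical names, and treating "grammar" as punctuation and whitespace is an assumption.

```python
import string


def normalize(name: str) -> str:
    """Lowercase a name and drop punctuation and whitespace (assumed meaning of 'grammar')."""
    drop = string.punctuation + string.whitespace
    return name.lower().translate(str.maketrans("", "", drop))


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def fuzzy_lookup(query, definitions, distance_tolerance=2):
    """Return (name, smiles, distance) tuples within the tolerance, closest first."""
    target = normalize(query)
    hits = []
    for name, smiles in definitions.items():
        distance = levenshtein(target, normalize(name))
        if distance <= distance_tolerance:
            hits.append((name, smiles, distance))
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical toy table, not GlobalChem's internal data structure.
toy_definitions = {
    "benzene": "C1=CC=CC=C1",
    "toluene": "CC1=CC=CC=C1",
    "phenol": "OC1=CC=CC=C1",
}
print(fuzzy_lookup("Benzen", toy_definitions, distance_tolerance=2))
```

A larger tolerance admits more distant names, which is why a loose setting such as `distance_tolerance=7` is most useful together with partial definitions.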
You can also request a fuzzy reconstruction of the SMILES from an IUPAC name, built from the name with its grammar stripped and from the functional groups it contains:
definition = gc.get_smiles_by_iupac(
    '(4R,4aR,7S,7aR,12bS)-3-methyl-2,4,4a,7,7a,13-hexahydro-1H-4,12-methanobenzofuro[3,2-e]isoquinoline-7,9-diol',
    distance_tolerance=2,
    return_partial_definitions=False,
    reconstruct_smiles=True,
)
print(definition)
Build the GlobalChem Network and Print it Out
gc.build_global_chem_network(
    print_output=True,  # Print the network out
    debugger=False,     # Mostly for developers, to see all node values
)
The algorithm connects nodes through lists of parents and children instead of the explicit "edges" used in traditional graph networks. This makes the code simpler when the graph database lives in a flat, one-dimensional structure in which each node carries its own list of parents and list of children.
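A minimal sketch of such a flat parent/children store is shown here. It only illustrates the idea and is not GlobalChem's internal code: the `network` dictionary, `add_node`, and `depth_below` are hypothetical helpers, and the node names are used purely as example labels.

```python
# Hypothetical flat node store: each entry keeps lists of parent and child
# names, so no separate edge list is needed.
network = {}


def add_node(parent, child):
    """Register child under parent, creating either entry on first use."""
    network.setdefault(parent, {"parents": [], "children": []})
    network.setdefault(child, {"parents": [], "children": []})
    if child not in network[parent]["children"]:
        network[parent]["children"].append(child)
    if parent not in network[child]["parents"]:
        network[child]["parents"].append(parent)


def depth_below(name):
    """Count the levels in the subtree rooted at name (a lone node has depth 1).

    Assumes the structure is a tree; a cycle would recurse forever.
    """
    children = network[name]["children"]
    return 1 + max((depth_below(child) for child in children), default=0)


add_node("global_chem", "environment")
add_node("environment", "emerging_perfluoroalkyls")
add_node("global_chem", "medicinal_chemistry")

print(network["environment"])
print(depth_below("global_chem"))
```

Because every node stores both directions, walking up to a parent or down to the children is a single dictionary lookup, at the cost of updating two entries on each insertion.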
Fetch a Node
gc = GlobalChem()
gc.build_global_chem_network()
node = gc.get_node('emerging_perfluoroalkyls')
print(node)
Fetch the IUPAC:SMILES/SMARTS Data from the Node
gc = GlobalChem()
gc.build_global_chem_network()
smiles = gc.get_node_smiles('emerging_perfluoroalkyls')
smarts = gc.get_node_smarts('emerging_perfluoroalkyls')
print("Length of Perfluoroalkyls: %s" % len(smiles))
Create Your Own Network and Add Nodes
from global_chem import GlobalChem
gc = GlobalChem(verbose=False)
gc.initiate_network()
gc.add_node('global_chem', 'common_monomer_repeating_units')
gc.add_node('common_monomer_repeating_units','electrophilic_warheads_for_kinases')
values = gc.get_node_smiles('common_monomer_repeating_units')
print(values)
Creating a Deep Layer Chemical Graph Network (DGN) & Printing It Out:
This section is for more advanced users who build their own networks and need to manage sets of layers.
# Create a Deep Layer Network
gc = GlobalChem()
gc.initiate_deep_layer_network()
gc.add_deep_layer(
    [
        'emerging_perfluoroalkyls',
        'montmorillonite_adsorption',
        'common_monomer_repeating_units'
    ]
)
gc.add_deep_layer(
    [
        'common_warheads_covalent_inhibitors',
        'privileged_scaffolds',
        'iupac_blue_book'
    ]
)
gc.print_deep_network()
Compute a Common Score
Common Score Algorithm:
1. Datamine the current graph network of GlobalChem.
2. Get the object weight of each mention.
3. Determine the mention weight.
4. Sum the weights; the total indicates how common the molecule is.
The higher the summed value, the higher the common score associated with the molecule's IUPAC name.
gc = GlobalChem()
gc.build_global_chem_network(print_output=False, debugger=False)
gc.compute_common_score('benzene', verbose=True)
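The summation idea behind the common score can be sketched as follows. This is an illustration under stated assumptions, not GlobalChem's actual scoring code: the toy data, the `common_score` helper, and the rule that a mention's weight equals the node's object weight times the number of mentions are all hypothetical.

```python
# Hypothetical data: node name -> (object weight, IUPAC names mentioned in that node).
toy_network = {
    "node_a": (1.0, ["benzene", "toluene"]),
    "node_b": (2.5, ["benzene"]),
    "node_c": (0.5, ["phenol"]),
}


def common_score(name, network):
    """Sum the weights of every node that mentions name."""
    score = 0.0
    for weight, names in network.values():
        mentions = names.count(name)  # how many times this node mentions the name
        score += weight * mentions    # assumed mention weight: object weight x mention count
    return score


print(common_score("benzene", toy_network))
```

A name that appears in many heavily weighted nodes accumulates a larger total, which matches the reading that a higher value means a more common molecule.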
To TSV
The network can be exported in tab-separated values (TSV) format for interoperability, mostly for web application development, and the exported file can also be searched.
gc = GlobalChem()
gc.to_tsv('global_chem.tsv')
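Searching an exported file can be done with Python's standard `csv` module. The sketch below makes no assumption about the column layout that `to_tsv` writes; `search_tsv` is a hypothetical helper that simply returns every row in which any cell contains the search term.

```python
import csv


def search_tsv(path, term):
    """Return every row of a tab-separated file with a cell containing term (case-insensitive)."""
    needle = term.lower()
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        return [row for row in reader if any(needle in cell.lower() for cell in row)]
```

For example, `search_tsv('global_chem.tsv', 'benzene')` would list the exported rows that mention benzene, assuming the file was written by the call above.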