GlobalChem Molecule

We can use the example located in the forcefield demo:

Load the Package

from global_chem_extensions import GlobalChemExtensions
ff = GlobalChemExtensions().forcefields()

Load the Molecule

As an example, we use the perfluorobutanoic acid located in the repository. We load its SMILES string together with its stream file, in which the rank ordering is preserved.

# Load the Molecule

global_chem_molecule = ff.initialize_globalchem_molecule(
    'FC(F)(C(F)(C(O)=O)F)C(F)(F)F',
    stream_file='global-chem/example_data/forcefield_parameterization/perfluorobutanoic_acid.str',
    # frcmod_file='gaff2.frcmod',
)

Determine the Name

Determine the name of the compound based on the dictionary.

global_chem_molecule.determine_name()
name = global_chem_molecule.name
print(name)

Determine the Attributes of a Molecule

attributes = global_chem_molecule.get_attributes()
for k, v in attributes.items():
    print(f'{k}: {v}')

Convert to Different Interoperable mol or SMILES objects

To allow interoperability with other software, the molecule can be converted into the objects those packages use.

pysmiles_mol = global_chem_molecule.get_pysmiles()
rdkit_mol = global_chem_molecule.get_rdkit_molecule()
partial_smiles_mol = global_chem_molecule.get_partial_smiles()
deep_smiles_mol = global_chem_molecule.encode_deep_smiles()
selfies_mol = global_chem_molecule.encode_selfies()
validate_smiles = global_chem_molecule.validate_molvs()

print("RDKit Mol: %s" % rdkit_mol)
print("PySMILES: %s" % pysmiles_mol)
print("Partial SMILES: %s" % partial_smiles_mol)
print("DeepSMILES: %s" % deep_smiles_mol)
print("Selfies Mol: %s" % selfies_mol)
print("Validation: %s" % validate_smiles)

If a conversion fails, the most likely cause is an invalid SMILES string. Please check the input with the validation module.
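
One way to keep a single failed conversion from stopping the rest is to wrap each call and collect the errors separately. The sketch below is only an illustration: `run_conversions` is a hypothetical helper, not part of GlobalChem, and the two stand-in converters exist only so the example runs without the package installed.

```python
from typing import Any, Callable, Dict, Tuple


def run_conversions(
    converters: Dict[str, Callable[[], Any]],
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Call each converter and collect results and error messages separately."""
    results: Dict[str, Any] = {}
    errors: Dict[str, str] = {}
    for label, convert in converters.items():
        try:
            results[label] = convert()
        except Exception as exc:  # broad on purpose: each backend may raise its own error type
            errors[label] = f"{type(exc).__name__}: {exc}"
    return results, errors


# Stand-in converters (hypothetical) so the sketch is runnable on its own.
def _ok() -> str:
    return "converted"


def _fails() -> str:
    raise ValueError("could not parse SMILES")


results, errors = run_conversions({"good": _ok, "bad": _fails})
print(results)  # {'good': 'converted'}
print(errors)   # {'bad': 'ValueError: could not parse SMILES'}
```

With the molecule loaded above, the dictionary could map a label such as "RDKit Mol" to the bound method `global_chem_molecule.get_rdkit_molecule`. Any label that ends up in `errors` is a hint to check the SMILES with the validation module.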

Draw CGenFF Molecule

global_chem_molecule.draw_cgenff_molecule(
    height=1000,
    width=1000
)

Create a Curly SMILES representation of SMILES and Atom Types

CurlySMILES stores its features inside curly braces and remains a stable SMILES that can be passed through RDKit.

curly_smiles = global_chem_molecule.get_curly_smiles()
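
To make the curly-brace idea concrete, here is a minimal sketch that separates the annotations from the underlying atoms. It assumes each annotation directly follows its atom as `{...}`. The input string is hand-written for illustration, with placeholder labels in the style of CGenFF atom types, and is not output from `get_curly_smiles()`. `split_curly` is a hypothetical helper.

```python
import re
from typing import List, Tuple

# Matches one non-nested {...} annotation and captures its content.
_CURLY = re.compile(r"\{([^{}]*)\}")


def split_curly(annotated: str) -> Tuple[str, List[str]]:
    """Return (plain_string, annotations) for a curly-annotated string."""
    annotations = _CURLY.findall(annotated)
    plain = _CURLY.sub("", annotated)
    return plain, annotations


# Hand-written example string (assumption, not library output).
plain, annotations = split_curly("C{CG331}O{OG311}")
print(plain)        # CO
print(annotations)  # ['CG331', 'OG311']
```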

Get CGenFF CXSMILES

A CXSMILES string consists of the atoms in the SMILES string, followed by a | and then a set of features, or metadata, that accompany each atom. This allows the user to embed whatever information they wish into the molecule.

We decided to embed atom-type information since the atom-type ordering is preserved.

print(global_chem_molecule.get_cgenff_cxsmiles())
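
As a rough illustration of the layout described above, the sketch below splits a CXSMILES-style string into its SMILES part and its per-atom labels. It assumes the labels sit in a `$...$` field separated by semicolons, as in the CXSMILES atom-label convention. The example string is hand-written with placeholder labels and is not output from `get_cgenff_cxsmiles()`. `split_cxsmiles` is a hypothetical helper.

```python
import re
from typing import List, Tuple


def split_cxsmiles(cxsmiles: str) -> Tuple[str, List[str]]:
    """Return (smiles, atom_labels) from a string such as 'CO |$a;b$|'."""
    # Everything before the first '|' is the plain SMILES.
    smiles, _, extension = cxsmiles.partition("|")
    # Atom labels, if present, are the text between the two '$' characters.
    match = re.search(r"\$(.*?)\$", extension)
    labels = match.group(1).split(";") if match else []
    return smiles.strip(), labels


# Hand-written example string (assumption, not library output).
smiles, atom_types = split_cxsmiles("CO |$CG331;OG311$|")
print(smiles)      # CO
print(atom_types)  # ['CG331', 'OG311']
```

Because the labels are listed in atom order, `atom_types[i]` lines up with the i-th atom of the SMILES, which is what makes embedding atom types useful here.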

Get CGenFF CXSMARTS

A CXSMARTS string consists of the atoms in the SMARTS string, followed by a | and then a set of features, or metadata, that accompany each atom. This allows the user to embed whatever information they wish into the molecule.

cx_smarts = global_chem_molecule.get_cgenff_cxsmarts()
