SMILES Validation
To perform validation of the SMILES, we'll be passing it through a series of software to determine what fails and what doesn't.
The full performance of the GlobalChem against these different lists can be found below:
Software
SMILES Passed
RDKit
100%
SELFIES
100%
PySMILES
99.8%
Partial SMILES
85.7%
DeepSMILES
99.25%
MolVS
98.50%
Users can also validate their own SMILES across different software:
Imports
from global_chem import GlobalChem
from global_chem_extensions import GlobalChemExtensions
gc = GlobalChem()
cheminformatics = GlobalChemExtensions().cheminformatics()
Validate a list SMILES across different software
gc.build_global_chem_network()
smiles_list = list(gc.get_node_smiles('emerging_perfluoroalkyls').values())
sucesses, failures = cheminformatics.verify_smiles(
smiles_list,
rdkit=True,
partial_smiles=True,
return_failures=True,
pysmiles=True,
molvs=True
)
total = len(sucesses) + len(failures)
print ("Percantage of Accepted SMILES: %s" % ((len(sucesses) / total) * 100))
Last updated