GlobalChem: Your Chemical Knowledge Graph

SMILES to PDF And Back


Last updated 2 years ago

PDF parsing is handled by a separate package called MolPDF. It uses a largely fixed template and is meant for distributing data quickly, so we can get a general feel for a list of molecules.

The philosophy is simple. We do not need an elaborate template, just a plain PDF document that transfers data quickly, especially when it serves as supplementary information. The SMILES strings are stored in the PDF metadata alongside the corresponding images, which makes it easy to mine the data back into a Python object.
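To make the metadata idea concrete, here is a minimal sketch of packing a SMILES list into a single metadata value and reading it back. It uses only the standard library and does not touch a real PDF. The key name `/Smiles`, the JSON encoding, and the helpers `pack_smiles` and `unpack_smiles` are assumptions made for illustration; MolPDF's actual storage format may differ.

```python
import json

# Hypothetical metadata key; MolPDF's real field name may differ.
SMILES_KEY = "/Smiles"


def pack_smiles(smiles_list):
    """Encode a list of SMILES strings as one metadata entry (assumed JSON)."""
    return {SMILES_KEY: json.dumps(smiles_list)}


def unpack_smiles(metadata):
    """Recover the SMILES list from a metadata dictionary, or [] if absent."""
    return json.loads(metadata.get(SMILES_KEY, "[]"))


# Round trip, including a SMILES with a backslash (cis/trans bond notation).
original = ["CCO", "c1ccccc1", "F/C=C\\F"]
metadata = pack_smiles(original)
recovered = unpack_smiles(metadata)
print(recovered == original)
```

JSON is used in this sketch because SMILES can contain characters such as backslashes and parentheses, and a structured encoding avoids having to choose a delimiter by hand.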

Imports

from global_chem import GlobalChem
from global_chem_extensions import GlobalChemExtensions

gc = GlobalChem()
cheminformatics = GlobalChemExtensions().cheminformatics()

SMILES to PDF

smiles_list = list(gc.get_node_smiles('pihkal').values())

cheminformatics.smiles_to_pdf(
    smiles=smiles_list,
    labels=[],
    file_name='molecules.pdf',
    include_failed_smiles=True,
    title='MY MOLECULES',
)
Method: 'generate' Time: 4.98 seconds

PDF to SMILES

molecules = cheminformatics.pdf_to_smiles(
    'molecules.pdf',
)

print(len(molecules))
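Beyond checking the length, it can be useful to see which entries survived the round trip. The sketch below assumes that `pdf_to_smiles` returns a list of SMILES strings and that those strings come back unchanged; neither is confirmed here, and if the strings are canonicalized along the way a plain string comparison will report false mismatches. The helper `round_trip_report` is hypothetical and not part of GlobalChem or MolPDF.

```python
from collections import Counter


def round_trip_report(original, recovered):
    """Compare two SMILES lists as multisets and report the differences."""
    original_counts = Counter(original)
    recovered_counts = Counter(recovered)
    return {
        # In the input list but not (or fewer times) in the parsed PDF.
        "missing": sorted((original_counts - recovered_counts).elements()),
        # In the parsed PDF but not (or fewer times) in the input list.
        "unexpected": sorted((recovered_counts - original_counts).elements()),
    }


# Stand-in data; in practice pass smiles_list and molecules from above.
report = round_trip_report(["CCO", "CCN", "CCO"], ["CCO", "CCN"])
print(report)
```

Comparing as multisets rather than sets means a duplicated SMILES that is written once but expected twice still shows up as missing.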
GitHub - Sulstice/molpdf: Convert SMILES to PDF 2D Skeletal Diagrams and back again.