# GlobalChem Molecule

We can use the example located in the forcefield demo:

{% embed url="<https://colab.research.google.com/drive/1hW0K6V5zPDFdZvFkYarbOr9wRoof2n4s?usp=sharing>" %}

**Load the Package**

```
from global_chem_extensions import GlobalChemExtensions
ff = GlobalChemExtensions().forcefields()
```

**Load the Molecule**

We can use the perfluorobutanoic acid example from the repository, loading its SMILES string together with its stream file, in which the rank ordering is preserved.

```
# Load the Molecule

global_chem_molecule = ff.initialize_globalchem_molecule(
    'FC(F)(C(F)(C(O)=O)F)C(F)(F)F',
    stream_file='global-chem/example_data/forcefield_parameterization/perfluorobutanoic_acid.str',
    # frcmod_file='gaff2.frcmod',
)

```

**Determine the Name**

Determine the name of the compound based on the dictionary.

{% tabs %}
{% tab title="Code" %}

```
global_chem_molecule.determine_name()
name = global_chem_molecule.name
print(name)
```

{% endtab %}

{% tab title="Output" %}

```
perfluorobutanoic acid
```

{% endtab %}
{% endtabs %}

**Determine the Attributes of a Molecule**

{% tabs %}
{% tab title="Code" %}

```
attributes = global_chem_molecule.get_attributes()
for k, v in attributes.items():
    print(f'{k}: {v}')
```

{% endtab %}

{% tab title="Output" %}

```
Attributes: {
'name': 'cyclopentane', 'smiles': 'C1CCCC1',
'molecular_weight': 70.07825032, 
'logp': 1.9505000000000001,
'h_bond_donor': 0,
'h_bond_acceptors': 0,
'rotatable_bonds': 0, 
'number_of_atoms': 5,
'molar_refractivity': 23.084999999999994,
'topological_surface_area_mapping': 0.0,
'formal_charge': 0,
'heavy_atoms': 5,
'num_of_rings': 1
}

```

{% endtab %}
{% endtabs %}

**Convert to Different Interoperable mol or SMILES objects**

For interoperability with other software, the molecule can be converted into each package's own object or string representation.

{% tabs %}
{% tab title="Code" %}

```
pysmiles_mol = global_chem_molecule.get_pysmiles()
rdkit_mol = global_chem_molecule.get_rdkit_molecule()
partial_smiles_mol = global_chem_molecule.get_partial_smiles()
deep_smiles_mol = global_chem_molecule.encode_deep_smiles()
selfies_mol = global_chem_molecule.encode_selfies()
validate_smiles = global_chem_molecule.validate_molvs()

print("RDKit Mol: %s" % rdkit_mol)
print("PySMILES: %s" % pysmiles_mol)
print("Partial SMILES: %s" % partial_smiles_mol)
print("DeepSMILES: %s" % deep_smiles_mol)
print("Selfies Mol: %s" % selfies_mol)
print("Validation: %s" % validate_smiles)
```

{% endtab %}

{% tab title="Output" %}

```
RDKit Mol: <rdkit.Chem.rdchem.Mol object at 0x7f7259a7cb20>
PySMILES: Graph with 5 nodes and 5 edges
Partial SMILES: Molecule(atoms=['Atom(elem=6,chg=0,idx=0)', 'Atom(elem=6,chg=0,idx=1)', 'Atom(elem=6,chg=0,idx=2)', 'Atom(elem=6,chg=0,idx=3)', 'Atom(elem=6,chg=0,idx=4)'])
DeepSMILES: CCCCC5
Selfies Mol: [C][C][C][C][C][Ring1][Branch1]
Validation: ['INFO: [FragmentValidation] pentane is present']
```

{% endtab %}
{% endtabs %}

{% hint style="info" %}
If a conversion fails, the most likely cause is an invalid SMILES string. Check it with the validation module.
{% endhint %}
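
Before calling the converters, a cheap structural pre-check can catch the most common typos. The sketch below is plain Python and is not part of GlobalChem: `smiles_precheck` is a hypothetical helper that only checks that parentheses, bracket atoms, and single-digit ring-closure labels are paired. It knows nothing about chemistry, so it complements `validate_molvs()` rather than replacing it.

```python
def smiles_precheck(smiles: str) -> list[str]:
    """Return a list of pairing problems; an empty list means none were found.

    Hypothetical helper (not a GlobalChem API). Assumption: ring closures use
    single digits; two-digit '%nn' labels are simply treated digit by digit.
    """
    problems = []
    depth = 0           # number of currently open '(' branches
    in_bracket = False  # True while inside a '[...]' bracket atom
    open_rings = set()  # ring-closure digits seen an odd number of times

    for ch in smiles:
        if in_bracket:
            # Digits inside brackets are isotopes, H counts, or charges.
            if ch == ']':
                in_bracket = False
            continue
        if ch == '[':
            in_bracket = True
        elif ch == ']':
            problems.append("unmatched ']'")
        elif ch == '(':
            depth += 1
        elif ch == ')':
            if depth == 0:
                problems.append("unmatched ')'")
            else:
                depth -= 1
        elif ch.isdigit():
            # First sighting opens the ring bond, second sighting closes it.
            open_rings ^= {ch}

    if in_bracket:
        problems.append("unclosed '['")
    if depth:
        problems.append(f"{depth} unclosed '('")
    if open_rings:
        problems.append("unclosed ring bond(s): " + ", ".join(sorted(open_rings)))
    return problems


print(smiles_precheck('FC(F)(C(F)(C(O)=O)F)C(F)(F)F'))  # input SMILES from this page
print(smiles_precheck('C1CCCC'))                        # ring bond 1 is never closed
```

A string that passes this check can still be chemically invalid, so a failed conversion should still be followed up with the validation module.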

**Draw CGenFF Molecule**

{% tabs %}
{% tab title="Code" %}

```
global_chem_molecule.draw_cgenff_molecule(
    height=1000,
    width=1000
)
```

{% endtab %}

{% tab title="Output" %}
![](/files/vWDjLk0m7OQ7KytXN2RH)
{% endtab %}
{% endtabs %}

**Create a Curly SMILES representation of SMILES and Atom Types**

CurlySMILES stores each atom's features inside curly braces directly after the atom, and the result is a stable SMILES to pass through RDKit.

{% tabs %}
{% tab title="Code" %}

```
curly_smiles = global_chem_molecule.get_curly_smiles()
print(curly_smiles)
```

{% endtab %}

{% tab title="Output" %}

```
F{FGA2}C{CG312}(F{FGA2})(C{CG312}(F{FGA2})(C{CG2O2}(O{OG311})=O{OG2D1})F{FGA2})C{CG302}(F{FGA3})(F{FGA3})F{FGA3}
```

{% endtab %}
{% endtabs %}
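
To make the layout of the string concrete, the following sketch splits the CurlySMILES output above back into a plain SMILES string and a list of (element, atom type) pairs. It uses only the standard library; `split_curly_smiles` is a hypothetical helper, not a GlobalChem function, and it assumes every atom is an unbracketed element symbol followed by exactly one `{...}` block, as in this output.

```python
import re

# Output of get_curly_smiles() for perfluorobutanoic acid, copied from above.
curly_smiles = (
    'F{FGA2}C{CG312}(F{FGA2})(C{CG312}(F{FGA2})(C{CG2O2}(O{OG311})=O{OG2D1})F{FGA2})'
    'C{CG302}(F{FGA3})(F{FGA3})F{FGA3}'
)

# Assumption: an element symbol (one uppercase letter, optional lowercase
# letter) immediately followed by one {...} annotation. Aromatic lowercase
# atoms and bracket atoms are not handled by this sketch.
ATOM_WITH_TYPE = re.compile(r'([A-Z][a-z]?)\{([^{}]*)\}')


def split_curly_smiles(curly):
    """Return (plain_smiles, [(element, atom_type), ...]) in string order."""
    pairs = ATOM_WITH_TYPE.findall(curly)
    plain = re.sub(r'\{[^{}]*\}', '', curly)  # drop every {...} annotation
    return plain, pairs


plain_smiles, typed_atoms = split_curly_smiles(curly_smiles)
print(plain_smiles)
for element, atom_type in typed_atoms:
    print(element, atom_type)
```

Stripping the annotations gives back the SMILES string that was passed to `initialize_globalchem_molecule`, which is a quick way to confirm that the atom order was kept.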

**Get CGenFF CXSMILES**

A CXSMILES string is a SMILES string followed by a `|`-delimited block of features, or metadata, attached to individual atoms. This allows the user to embed whatever information they wish into the molecule.

We decided to embed atom-type information since the atom-type ordering is preserved.

{% tabs %}
{% tab title="Code" %}

```
print(global_chem_molecule.get_cgenff_cxsmiles())
```

{% endtab %}

{% tab title="Output" %}

```
O=C(O)C(F)(F)C(F)(F)C(F)(F)F |atomProp:0.atom_type.OG2D1:0.atom_idx.7:1.atom_type.CG2O2:1.atom_idx.5:2.atom_type.OG311:2.atom_idx.6:3.atom_type.CG312:3.atom_idx.3:4.atom_type.FGA2:4.atom_idx.4:5.atom_type.FGA2:5.atom_idx.8:6.atom_type.CG312:6.atom_idx.1:7.atom_type.FGA2:7.atom_idx.0:8.atom_type.FGA2:8.atom_idx.2:9.atom_type.CG302:9.atom_idx.9:10.atom_type.FGA3:10.atom_idx.10:11.atom_type.FGA3:11.atom_idx.11:12.atom_type.FGA3:12.atom_idx.12|
```

{% endtab %}
{% endtabs %}
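
The SMILES part of this output lists the atoms in a different order than the input string, and the `atom_idx` property appears to record each atom's position in the input. The sketch below reads the `atomProp` block with plain Python and sorts by `atom_idx` to recover the atom types in input order. `parse_atom_props` is a hypothetical helper, not a GlobalChem or RDKit function, and it assumes the extension block contains a single `atomProp` field whose values contain no `.` or `:` characters.

```python
from collections import defaultdict

# Output of get_cgenff_cxsmiles(), copied from above and split for readability.
cxsmiles = (
    'O=C(O)C(F)(F)C(F)(F)C(F)(F)F '
    '|atomProp:0.atom_type.OG2D1:0.atom_idx.7:1.atom_type.CG2O2:1.atom_idx.5:'
    '2.atom_type.OG311:2.atom_idx.6:3.atom_type.CG312:3.atom_idx.3:'
    '4.atom_type.FGA2:4.atom_idx.4:5.atom_type.FGA2:5.atom_idx.8:'
    '6.atom_type.CG312:6.atom_idx.1:7.atom_type.FGA2:7.atom_idx.0:'
    '8.atom_type.FGA2:8.atom_idx.2:9.atom_type.CG302:9.atom_idx.9:'
    '10.atom_type.FGA3:10.atom_idx.10:11.atom_type.FGA3:11.atom_idx.11:'
    '12.atom_type.FGA3:12.atom_idx.12|'
)


def parse_atom_props(cx):
    """Map each atom position in the SMILES part to its {name: value} properties."""
    _smiles, _, extension = cx.partition(' |')
    body = extension.rstrip('|')
    prefix = 'atomProp:'
    # Assumption: atomProp is the only field in the extension block.
    if not body.startswith(prefix):
        raise ValueError('no atomProp field found')
    props = defaultdict(dict)
    for entry in body[len(prefix):].split(':'):
        position, name, value = entry.split('.', 2)
        props[int(position)][name] = value
    return dict(props)


props = parse_atom_props(cxsmiles)

# Sort by the stored atom_idx to list the atom types in input order.
types_in_input_order = [
    p['atom_type']
    for p in sorted(props.values(), key=lambda p: int(p['atom_idx']))
]
print(types_in_input_order)
```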

**Get CGenFF CXSMARTS**

A CXSMARTS string is a SMARTS string followed by a `|`-delimited block of features, or metadata, attached to individual atoms. This allows the user to embed whatever information they wish into the molecule.

{% tabs %}
{% tab title="Code" %}

```
cx_smarts = global_chem_molecule.get_cgenff_cxsmarts()
print(cx_smarts)
```

{% endtab %}

{% tab title="Output" %}

```
[#9]-[#6](-[#9])(-[#6](-[#9])(-[#6](-[#8])=[#8])-[#9])-[#6](-[#9])(-[#9])-[#9] |atomProp:0.atom_type.FGA2:0.atom_idx.0:1.atom_type.CG312:1.atom_idx.1:2.atom_type.FGA2:2.atom_idx.2:3.atom_type.CG312:3.atom_idx.3:4.atom_type.FGA2:4.atom_idx.4:5.atom_type.CG2O2:5.atom_idx.5:6.atom_type.OG311:6.atom_idx.6:7.atom_type.OG2D1:7.atom_idx.7:8.atom_type.FGA2:8.atom_idx.8:9.atom_type.CG302:9.atom_idx.9:10.atom_type.FGA3:10.atom_idx.10:11.atom_type.FGA3:11.atom_idx.11:12.atom_type.FGA3:12.atom_idx.12|
```

{% endtab %}
{% endtabs %}
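
Because the metadata block is plain text, arbitrary per-atom properties can be written in the same layout. The sketch below hand-builds an `atomProp` block from a list of atom types and appends it to the SMARTS pattern shown above. `build_atom_prop_block` is a hypothetical helper for illustration only, not how GlobalChem or RDKit produce the string, and it assumes property names and values contain no `.`, `:` or `|` characters, since it does no escaping.

```python
def build_atom_prop_block(atoms):
    """Serialize a list of per-atom {name: value} dicts as '|atomProp:...|'.

    The list position of each dict is used as the atom position in the pattern.
    """
    entries = []
    for position, properties in enumerate(atoms):
        for name, value in properties.items():
            entries.append(f'{position}.{name}.{value}')
    return '|atomProp:' + ':'.join(entries) + '|'


# CGenFF atom types in the order of the atoms in the SMARTS pattern above.
atom_types = [
    'FGA2', 'CG312', 'FGA2', 'CG312', 'FGA2', 'CG2O2', 'OG311',
    'OG2D1', 'FGA2', 'CG302', 'FGA3', 'FGA3', 'FGA3',
]
atoms = [{'atom_type': t, 'atom_idx': i} for i, t in enumerate(atom_types)]

smarts = '[#9]-[#6](-[#9])(-[#6](-[#9])(-[#6](-[#8])=[#8])-[#9])-[#6](-[#9])(-[#9])-[#9]'
block = build_atom_prop_block(atoms)
print(f'{smarts} {block}')
```

Any other per-atom value, for example a partial charge, could be added as another key in each dictionary, subject to the same escaping caveat.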

