e3fp.conformer.util module

Utilities for handling SMILES strings and RDKit mols and conformers.

Author: Seth Axen E-mail: seth.axen@gmail.com

class MolItemName(mol_name=None, proto_state_num=None, conf_num=None, proto_delim='-', conf_delim='_')[source]

Bases: object

Class for parsing mol item names and converting to various formats.

property conf_name
copy()[source]
classmethod from_str(mol_item_name, mol_item_regex=re.compile('(?P<mol_name>.+?)(?:-(?P<proto_state_num>\\d+))?(?:_(?P<conf_num>\\d+))?$'), mol_item_fields=('mol_name', 'proto_state_num', 'conf_num'), **kwargs)[source]
classmethod from_tuple(fields_tuple)[source]
property mol_item_name
static mol_item_name_to_dict(mol_item_name, mol_item_regex=re.compile('(?P<mol_name>.+?)(?:-(?P<proto_state_num>\\d+))?(?:_(?P<conf_num>\\d+))?$'), mol_item_fields=('mol_name', 'proto_state_num', 'conf_num'))[source]
property mol_name
property proto_name
to_conf_name(conf_num=None, conf_delim='_')[source]
to_mol_name(as_proto=False)[source]
to_proto_name(proto_state_num=None, proto_delim='-')[source]
to_str()[source]
to_tuple()[source]
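
The default mol_item_regex shown in the from_str signature splits a mol item name into a molecule name, an optional protonation state number after "-", and an optional conformer number after "_". The following is a minimal sketch of that parsing using only the standard re module. The helper parse_mol_item_name is hypothetical and not part of e3fp, and the conversion of the numeric fields to int is an assumption made for illustration.

```python
import re

# Default pattern copied from the from_str signature.
MOL_ITEM_REGEX = re.compile(
    r"(?P<mol_name>.+?)(?:-(?P<proto_state_num>\d+))?(?:_(?P<conf_num>\d+))?$"
)


def parse_mol_item_name(mol_item_name):
    """Hypothetical helper: split a mol item name into its three fields."""
    match = MOL_ITEM_REGEX.match(mol_item_name)
    if match is None:
        raise ValueError(f"Cannot parse mol item name: {mol_item_name!r}")
    fields = match.groupdict()
    # Assumption: numeric fields are converted to int; absent fields stay None.
    for key in ("proto_state_num", "conf_num"):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    return fields


print(parse_mol_item_name("caffeine-1_2"))
print(parse_mol_item_name("caffeine"))
```

Because the molecule name group is lazy and both suffixes are optional, a bare name such as "caffeine" parses with proto_state_num and conf_num left as None.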
class MolItemTuple(mol_name, proto_state_num, conf_num)

Bases: tuple

conf_num

Alias for field number 2

mol_name

Alias for field number 0

proto_state_num

Alias for field number 1

add_conformer_energies_to_mol(mol, energies)[source]

Add conformer energies as mol property.

See discussion at https://sourceforge.net/p/rdkit/mailman/message/27547551/

dict_to_smiles(smiles_file, smiles_dict)[source]

Write SMILES dict to file.

get_conformer_energies_from_mol(mol)[source]

Get conformer energies from mol.
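
This page does not describe how the energies are encoded in the mol property. One plausible approach, shown here purely as an assumption, is to serialise the list of floats into a single newline-delimited string and parse it back on read. The function names energies_to_prop and prop_to_energies and the property key "conf_energies" are hypothetical, and a plain dict stands in for the molecule's property store so that the sketch runs without RDKit.

```python
def energies_to_prop(energies):
    """Hypothetical: encode conformer energies as one newline-delimited string."""
    # repr() of a float round-trips exactly through float().
    return "\n".join(repr(float(e)) for e in energies)


def prop_to_energies(prop_value):
    """Hypothetical: decode the string produced by energies_to_prop."""
    return [float(line) for line in prop_value.split("\n") if line]


# A plain dict stands in for the string-valued properties of a mol (assumption).
props = {}
props["conf_energies"] = energies_to_prop([12.5, -3.25, 0.0])
print(prop_to_energies(props["conf_energies"]))
```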

iter_to_smiles(smiles_file, smiles_iter)[source]

Write iterator of (mol_name, SMILES) to file.

mol2_generator(*filenames)[source]

Parse name from mol2 filename and return generator.

Parameters

filenames (iterable object) – mol2 files, where each filename should be the molecule name followed by “.mol2”

Yields

tuple – tuple of the format (file, name).
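
The naming convention described here (molecule name followed by ".mol2") can be sketched with the standard os.path functions. The generator mol2_name_pairs below is a hypothetical stand-in written for illustration, not the e3fp implementation, and it assumes the name is simply the file's basename with its extension removed.

```python
import os


def mol2_name_pairs(*filenames):
    """Hypothetical sketch: yield (file, name) for each mol2 path."""
    for filename in filenames:
        # Assumption: the molecule name is the basename without its extension.
        name = os.path.splitext(os.path.basename(filename))[0]
        yield (filename, name)


for pair in mol2_name_pairs("data/caffeine.mol2", "aspirin.mol2"):
    print(pair)
```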

mol_from_mol2(mol2_file, name=None, standardise=False)[source]

Read a mol2 file into an RDKit PropertyMol.

Parameters
  • mol2_file (str) – path to a mol2 file

  • name (str, optional) – Name of molecule. If not provided, uses file basename as name

  • standardise (bool) – Clean mol through standardisation

Returns

Molecule.

Return type

RDKit PropertyMol

mol_from_sdf(sdf_file, conf_num=None, standardise=False, mode='rb')[source]

Read SDF file into an RDKit Mol object.

Parameters
  • sdf_file (str) – Path to an SDF file

  • conf_num (int or None, optional) – Maximum number of conformers to read from file. Defaults to all.

  • standardise (bool (default False)) – Clean mol through standardisation

  • mode (str (default ‘rb’)) – Mode with which to open file

Returns

Mol object with each molecule in SDF file as a conformer

Return type

RDKit Mol

mol_from_smiles(smiles, name, standardise=False)[source]

Generate an RDKit PropertyMol from a SMILES string.

Parameters
  • smiles (str) – SMILES string

  • name (str) – Name of molecule

  • standardise (bool) – Clean Mol through standardisation

Returns

Molecule.

Return type

RDKit PropertyMol

mol_to_sdf(mol, out_file, conf_num=None)[source]

Write RDKit Mol objects to an SDF file.

Parameters
  • mol (RDKit Mol) – A molecule containing 1 or more conformations to write to file.

  • out_file (str) – Path to save SDF file.

  • conf_num (int or None, optional) – Maximum number of conformers to save to file. Defaults to all.

mol_to_standardised_mol(mol, name=None)[source]

Standardise mol(s).

smiles_generator(*filenames)[source]

Parse SMILES file(s) and yield (smile, name).

Parameters

filenames (iterable object) – Files containing SMILES. Each file must contain one SMILES string per line, followed by a space and then the molecule name.

Yields

tuple – tuple of the format (smile, name).
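
The file layout described here, one SMILES string per line followed by a space and the molecule name, can be parsed with plain string splitting. The sketch below illustrates that format only. The function iter_smiles_lines is hypothetical, and skipping lines with fewer than two fields is an assumption rather than documented e3fp behaviour.

```python
def iter_smiles_lines(lines):
    """Hypothetical sketch: yield (smile, name) from lines of a SMILES file."""
    for line in lines:
        values = line.split()
        # Assumption: lines with fewer than two fields are skipped silently.
        if len(values) >= 2:
            yield (values[0], values[1])


example_lines = ["CCO ethanol\n", "\n", "c1ccccc1 benzene\n"]
for smile, name in iter_smiles_lines(example_lines):
    print(name, smile)
```

Passing an open file object instead of a list works the same way, since both iterate line by line.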

smiles_to_dict(smiles_file, unique=False, has_header=False)[source]

Read SMILES file to dict.