e3fp.fingerprint.generate module

Generate E3FP fingerprints.

Author: Seth Axen
E-mail: seth.axen@gmail.com

fprints_dict_from_mol(mol, bits=4294967296, level=5, radius_multiplier=1.718, first=3, counts=False, stereo=True, include_disconnected=True, rdkit_invariants=False, exclude_floating=True, remove_duplicate_substructs=True, out_dir_base=None, out_ext='.fp.bz2', save=False, all_iters=False, overwrite=False)

Build an E3FP fingerprint from a mol with at least one conformer.

Parameters
  • mol (RDKit Mol) – Input molecule with one or more conformers to be fingerprinted.

  • bits (int) – Set number of bits for final folded fingerprint.

  • level (int, optional) – Level/maximum number of iterations of E3FP. If -1 is provided, it runs until termination, and all_iters is set to False.

  • radius_multiplier (float, optional) – Radius multiplier for spherical shells.

  • first (int, optional) – Number of conformers to fingerprint, counted from the first. If -1, all are fingerprinted.

  • counts (bool, optional) – Generate count-based fingerprints instead of bit-based fingerprints.

  • stereo (bool, optional) – Incorporate stereochemistry in fingerprint.

  • remove_duplicate_substructs (bool, optional) – If a substructure arises that corresponds to an identifier already in the fingerprint, then the identifier for the duplicate substructure is not added to fingerprint.

  • include_disconnected (bool, optional) – Include disconnected atoms when hashing and for stereo calculations. Turn off purely for testing purposes, to make E3FP more like ECFP.

  • rdkit_invariants (bool, optional) – Use the atom invariants used by RDKit for its Morgan fingerprint.

  • exclude_floating (bool, optional) – Mask atoms with no bonds (usually floating ions) from the fingerprint. These are often placed arbitrarily and can confound the fingerprint.

  • out_dir_base (str, optional) – Basename of out directory to save fingerprints. Iteration number is appended.

  • out_ext (str, optional) – Extension on fingerprint pickles, used to determine compression level.

  • save (bool, optional) – Save fingerprints to directory.

  • all_iters (bool, optional) – Save fingerprints from all iterations to file(s).

  • overwrite (bool, optional) – Overwrite pre-existing file.

Deleted Parameters
  • sdf_file (str) – SDF file path.
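
The level, radius_multiplier, and bits parameters can be illustrated with a small standalone sketch. It does not call e3fp and rests on two assumptions that the text above does not state: that the shell radius at iteration i is i times radius_multiplier, and that folding maps each sparse bit index to its remainder modulo bits. The helper names shell_radii and fold_indices are hypothetical.

```python
def shell_radii(level, radius_multiplier=1.718):
    """Return one shell radius per iteration, from iteration 0 to `level`.

    Assumption: the radius grows linearly, radius = iteration * radius_multiplier.
    A level of -1 ("run until termination") has no fixed list of radii,
    so it is rejected here.
    """
    if level < 0:
        raise ValueError("level must be non-negative for a fixed list of radii")
    return [i * radius_multiplier for i in range(level + 1)]


def fold_indices(indices, bits):
    """Fold sparse "on" bit indices into a fingerprint of length `bits`.

    Assumption: folding is a plain modulo, so distinct indices can
    collide on the same folded position.
    """
    if bits <= 0:
        raise ValueError("bits must be positive")
    return sorted({index % bits for index in indices})


radii = shell_radii(level=5)  # six radii, for iterations 0 through 5
folded = fold_indices([5, 1029, 4294967295], bits=1024)  # 5 and 1029 collide
```

Under these assumptions, the default bits=4294967296 (2**32) leaves 32-bit identifiers unfolded, while a smaller value such as 1024 trades collisions for a compact vector.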

fprints_dict_from_sdf(sdf_file, **kwargs)

Build fingerprints dict for conformers encoded in an SDF file.

See fprints_dict_from_mol for description of arguments.
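
Because fprints_dict_from_sdf takes its options as **kwargs, a misspelled option name may only surface when the call is made. The sketch below shows one way to collect and check options beforehand. The defaults are copied from the fprints_dict_from_mol signature documented above. The helper build_fprint_kwargs and the file name molecule.sdf are hypothetical, and the final call is left as a comment because it needs e3fp and an SDF file on disk.

```python
# Defaults copied from the documented fprints_dict_from_mol signature.
FPRINT_DEFAULTS = {
    "bits": 4294967296,
    "level": 5,
    "radius_multiplier": 1.718,
    "first": 3,
    "counts": False,
    "stereo": True,
    "include_disconnected": True,
    "rdkit_invariants": False,
    "exclude_floating": True,
    "remove_duplicate_substructs": True,
    "out_dir_base": None,
    "out_ext": ".fp.bz2",
    "save": False,
    "all_iters": False,
    "overwrite": False,
}


def build_fprint_kwargs(**overrides):
    """Merge overrides into the documented defaults, rejecting unknown names."""
    unknown = sorted(set(overrides) - set(FPRINT_DEFAULTS))
    if unknown:
        raise TypeError("unknown fingerprinting option(s): " + ", ".join(unknown))
    return {**FPRINT_DEFAULTS, **overrides}


# Fold to 1024 bits and fingerprint every conformer (first=-1).
kwargs = build_fprint_kwargs(bits=1024, first=-1)

# With e3fp installed, the options could then be forwarded like this:
# from e3fp.fingerprint.generate import fprints_dict_from_sdf
# fprints = fprints_dict_from_sdf("molecule.sdf", **kwargs)
```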

run(sdf_files, bits=4294967296, first=3, level=5, radius_multiplier=1.718, counts=False, stereo=True, include_disconnected=True, rdkit_invariants=False, exclude_floating=True, remove_duplicate_substructs=True, params=None, out_dir_base=None, out_ext='.fp.bz2', db_file=None, overwrite=False, all_iters=False, log=None, num_proc=None, parallel_mode=None, verbose=False)

Generate E3FP fingerprints from SDF files.
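
Both run and fprints_dict_from_mol default out_ext to '.fp.bz2', and the parameter description says the extension determines compression. The following sketch shows how extension-driven compression of pickles could work using only the standard library. It is an illustration, not e3fp's own implementation: the helpers open_by_extension, save_pickle, and load_pickle are hypothetical, and the mapping of '.bz2' and '.gz' to compressors is an assumption.

```python
import bz2
import gzip
import pickle


def open_by_extension(path, mode="rb"):
    """Open `path` with a compressor chosen from its extension (assumed mapping)."""
    if path.endswith(".bz2"):
        return bz2.open(path, mode)
    if path.endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)  # no recognised extension: plain, uncompressed file


def save_pickle(obj, path):
    """Pickle `obj` to `path`, compressing according to the extension."""
    with open_by_extension(path, "wb") as handle:
        pickle.dump(obj, handle)


def load_pickle(path):
    """Load a pickle written by save_pickle."""
    with open_by_extension(path, "rb") as handle:
        return pickle.load(handle)
```

With this layout, changing the output extension is enough to switch between bzip2, gzip, and uncompressed output, and no separate compression flag is needed.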