Predict

ehreact.predict contains functions and classes for scoring query molecules on a Hasse diagram.

Predict

Classes and functions from ehreact.predict.predict.py.

Entry point for predicting on a calculated Hasse diagram.

ehreact.predict.predict.predict(args)[source]

Scores queries on a precomputed Hasse diagram based on the supplied arguments.

Parameters:

args (Namespace) – Namespace of arguments.

Make Prediction

Classes and functions from ehreact.predict.make_prediction.py.

Score on Hasse diagram.

ehreact.predict.make_prediction.calculate_score(smi, mol, highest_template, d, verbose, predict_mode, params, smiles_dict, stereochemistry, tags_core=None)[source]

Scores a query molecule given the diagram, the highest matching template, and QM information.

Parameters:
  • smi (str) – SMILES string of the query molecule.

  • mol (rdkit.Chem.Mol) – RDKit molecule.

  • highest_template (str) – key of the highest matching template.

  • d (ehreact.diagram.diagram.Diagram) – Hasse diagram.

  • verbose (bool, default False) – Whether to print additional information.

  • predict_mode (Literal["single_reactant", "multi_reactant", "transition_state"], default "transition_state") – Prediction mode: either transition states extracted from reaction SMILES, or single/multi reactants from SMILES.

    In single_reactant mode, reaction partners are added automatically and all possible reaction products are enumerated. In multi_reactant mode, no reaction partners are added, but different reaction outcomes are taken into account.

  • params (dict) – Dictionary of scoring hyperparameters.

  • smiles_dict (dict) – A dictionary of the canonicalized input smiles.

  • stereochemistry (bool) – Whether to use stereochemistry for scoring.

  • tags_core (dict, default None) – Atom mapping numbers of the atoms in the reaction core. Used to verify a correct match.

Returns:

  • best_score (float) – Score.

  • rawnumber (dict) – Dictionary of raw scores.
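
The three predict_mode values expect different input strings: transition_state works on reaction SMILES, while the two reactant modes work on plain molecule SMILES. The following is a minimal sketch of a pre-check, assuming only that reaction SMILES contain the ">" separator. The helper name mode_matches_input is hypothetical and not part of ehreact.

```python
VALID_MODES = ("single_reactant", "multi_reactant", "transition_state")


def mode_matches_input(smiles: str, predict_mode: str = "transition_state") -> bool:
    """Return True if the input string looks suitable for the given mode.

    Hypothetical helper for illustration, not part of ehreact.
    """
    if predict_mode not in VALID_MODES:
        raise ValueError(f"Unknown predict_mode: {predict_mode!r}")
    # Reaction SMILES separate reactants, agents and products with ">".
    is_reaction = ">" in smiles
    if predict_mode == "transition_state":
        return is_reaction
    return not is_reaction


print(mode_matches_input("CCO>>CC=O"))               # reaction SMILES input
print(mode_matches_input("CCO", "single_reactant"))  # molecule SMILES input
```

A check like this only catches the wrong kind of input early; it does not validate the SMILES itself.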

ehreact.predict.make_prediction.find_highest_template(curr_node, mol, d, highest_template)[source]

Recursive function to find the highest matching substructure/reaction rule of a molecule/reaction.

Parameters:
  • curr_node (str) – SMILES string of current node.

  • mol (rdkit.Chem.Mol) – RDKit molecule.

  • d (ehreact.diagram.diagram.Diagram) – Hasse diagram.

  • highest_template (str) – Current highest matching substructure/reaction rule.

Returns:

highest_template – Highest matching (most specific) template in the tree.

Return type:

str
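
The recursive descent described above can be illustrated on a toy tree. This is a sketch under stated assumptions, not the ehreact implementation: the tree is a plain dict mapping each template to its children, a substring test stands in for an RDKit substructure match, and the name find_deepest_match is hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def find_deepest_match(
    node: str,
    children: Dict[str, List[str]],
    matches: Callable[[str], bool],
    depth: int = 0,
) -> Tuple[int, str]:
    """Return (depth, template) of the most specific matching template.

    Assumes `node` itself already matches the query.
    """
    best = (depth, node)
    for child in children.get(node, []):
        # Only descend into children that the query still matches.
        if matches(child):
            candidate = find_deepest_match(child, children, matches, depth + 1)
            if candidate[0] > best[0]:
                best = candidate
    return best


# Toy tree: each child is a more specific pattern than its parent.
children = {"C": ["CC", "CN"], "CC": ["CCO", "CCN"]}
query = "CCOC"

# Stand-in for a substructure match: the template occurs in the query string.
depth, template = find_deepest_match("C", children, lambda t: t in query)
print(depth, template)
```

Pruning on the first non-matching node is valid because every child in the diagram is at least as specific as its parent, so a failed match at a parent rules out its whole subtree.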

ehreact.predict.make_prediction.make_prediction(smiles, d, params=None, verbose=False, quiet=True, compute_aam=False, predict_mode='transition_state', stereochemistry=False)[source]

Scores a list of reaction or molecule SMILES on a Hasse diagram.

Parameters:
  • smiles (List[str]) – List of SMILES or reaction SMILES to score.

  • d (ehreact.diagram.diagram.Diagram) – Diagram to calculate scores on.

  • params (dict, default None) – Dictionary of scoring hyperparameters.

  • verbose (bool, default False) – Whether to print additional information.

  • quiet (bool, default True) – Whether to silence all output.

  • compute_aam (bool, default False) – Whether to compute atom-mappings for reactions.

  • predict_mode (Literal["single_reactant", "multi_reactant", "transition_state"], default "transition_state") – Prediction mode: either transition states extracted from reaction SMILES, or single/multi reactants from SMILES. In single_reactant mode, reaction partners are added automatically and all possible reaction products are enumerated. In multi_reactant mode, no reaction partners are added, but different reaction outcomes are taken into account.

  • stereochemistry (bool, default False) – Whether to use stereochemistry for scoring.

Returns:

  • scores (List[float]) – List of scores.

  • combination (List[str]) – List of multiple reactants.

  • current_smiles (List[str]) – List of SMILES strings.

  • belongs_to (List[int]) – List of integer indices indicating which initial SMILES each reaction belongs to.

  • raw_scores (List[dict]) – List of dictionaries of raw scores.
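
Because one input SMILES can yield several scored reactions, the returned lists usually need to be grouped by belongs_to. The sketch below assumes the lists are aligned element by element, which the descriptions above suggest but do not state. The values are made-up placeholders, not real ehreact output, and best_per_query is a hypothetical helper.

```python
from typing import Dict, List, Tuple

# Placeholder values standing in for the lists returned by make_prediction.
scores = [0.8, 0.3, 0.6]
current_smiles = ["CCO>>CC=O", "CCO>>C=C.O", "CC(C)O>>CC(C)=O"]
belongs_to = [0, 0, 1]


def best_per_query(
    scores: List[float], current_smiles: List[str], belongs_to: List[int]
) -> Dict[int, Tuple[float, str]]:
    """Keep the highest-scoring entry for each original input index."""
    best: Dict[int, Tuple[float, str]] = {}
    for score, smi, idx in zip(scores, current_smiles, belongs_to):
        if idx not in best or score > best[idx][0]:
            best[idx] = (score, smi)
    return best


best = best_per_query(scores, current_smiles, belongs_to)
for idx, (score, smi) in sorted(best.items()):
    print(idx, score, smi)
```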

Score

Classes and functions from ehreact.predict.score.py.

ehreact.predict.score.default_score_generator(rawnumber, params, verbose)[source]

Score generator.

Parameters:
  • rawnumber (dict) – Dictionary of raw scores.

  • params (dict) – Dictionary of hyperparameters.

  • verbose (bool) – Whether to print additional information.

Returns:

best_score – Overall score.

Return type:

float
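
The shape of a score generator (raw scores and hyperparameters in, one float out) can be sketched as a weighted sum. This is only an illustration: the actual formula used by default_score_generator is not given here, and the function name and the keys "similarity" and "specificity" are hypothetical.

```python
from typing import Dict


def combine_raw_scores(
    rawnumber: Dict[str, float], params: Dict[str, float], verbose: bool = False
) -> float:
    """Weighted sum of raw scores; a key without a weight contributes nothing.

    Hypothetical stand-in, not the ehreact scoring formula.
    """
    total = 0.0
    for name, value in rawnumber.items():
        weight = params.get(name, 0.0)
        contribution = weight * value
        if verbose:
            print(f"{name}: {weight} * {value} = {contribution}")
        total += contribution
    return total


# Hypothetical raw scores and weights, chosen only for illustration.
rawnumber = {"similarity": 0.5, "specificity": 2.0}
params = {"similarity": 1.0, "specificity": 0.25}
best_score = combine_raw_scores(rawnumber, params, verbose=True)
print(best_score)
```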

ehreact.predict.score.highest_sim(sims_reac, sims_prod, use_prod)[source]

Finds the highest similarity of a query to a set of leaf nodes.

Parameters:
  • sims_reac (List[float]) – List of similarities of the reactants.

  • sims_prod (List[float]) – List of similarities of the products.

  • use_prod (bool) – Whether to use the product similarities in addition to the reactants.

Returns:

best_score – Highest similarity score.

Return type:

float
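
A minimal sketch of how such a lookup could work. Two assumptions are made here that the description does not confirm: sims_reac[i] and sims_prod[i] refer to the same leaf node, and reactant and product similarities are combined by their arithmetic mean. The name highest_sim_sketch is hypothetical.

```python
from typing import List


def highest_sim_sketch(
    sims_reac: List[float], sims_prod: List[float], use_prod: bool
) -> float:
    """Return the best per-leaf similarity.

    Assumption: with use_prod, each leaf's score is the mean of its
    reactant and product similarity. ehreact may combine them differently.
    """
    if not sims_reac:
        return 0.0
    if use_prod:
        combined = [(r + p) / 2 for r, p in zip(sims_reac, sims_prod)]
    else:
        combined = list(sims_reac)
    return max(combined)


sims_reac = [0.25, 0.75]
sims_prod = [0.75, 0.5]
print(highest_sim_sketch(sims_reac, sims_prod, use_prod=False))
print(highest_sim_sketch(sims_reac, sims_prod, use_prod=True))
```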