Contents

Introduction

There are several related but distinct applications available through the ASKCOS website. These applications and their corresponding use cases are listed in the table below.

ApplicationBrief DescriptionUse Case
One-Step Retrosynthesis Single-step retrosynthesis Identifying precursors (buyable or non-buyable) that can be used to synthesize a target compound via a single reaction
Tree-Builder Multi-step retrosynthesis Identifying synthetic pathways that can be used to synthesize a target compound via a series of reactions (allows for creation of pathways from buyable chemicals to a target)
Context Recommendation Reaction condition recommendations Suggesting conditions (reagent, catalyst, solvents, temperature) under which a specified reaction can be run successfully
Forward Prediction Product distribution Predicting the products that are likely to be generated when two reactants are combined under specified conditions (reagent, catalyst, solvents, temperature)
Reaction Evaluation Probability of reaction success Predicting whether or not a target compound of interest will be generated when specified precursors react with one another, and simultaneously identifying conditions that are most likely to promote creation of the specified target compound
SCScore Evaluator Estimate synthetic complexity Predicting the synthetic complexity of a molecule [1-5 scale] using a trained neural network model
Buyable Look-Up Database of commercially-available compounds Determining whether or not a chemical is listed in our Buyables database and, if so, for what price (in USD per gram)
Drawing Structures from SMILES/SMARTS Converting molecule/reaction SMILES strings or template SMARTS strings to skeletal formulas

Generally, the website takes target compound inputs as SMILES strings, although there are some compounds for which the common name (aspirin, ibuprofen, etc.) can be entered in the input window and converted to a SMILES string automatically. For more information about SMILES strings, click here. Compound SMILES strings can be obtained by first drawing the molecule of interest in ChemDraw, selecting the molecule with the selector tool, right-clicking, and choosing 'Molecule', 'Copy As', and finally, 'SMILES'. Alternatively, much of the website also supports drawing compounds of interest directly. This can be done by clicking the Draw links adjacent to input text fields, where applicable.

One-Step Retrosynthesis

One-Step Retro is used to identify precursor molecules that can be combined to generate a target compound via a single reaction.

Usage

  1. Enter target compound SMILES string into "Target compound" field
  2. Select template prioritization option ("apply relevant templates" preferred)
  3. Select precursor scoring option (Relevance+Heuristic preferred)
  4. Define minimum plausibility [0 to 1]
  5. Click Search

Options

  • Reaction templates have been culled from the Reaxys database. There are two options for template prioritization:
    • Apply relevant templates
      • A neural network has been trained to identify the probability that a template is applicable to a compound in question
      • The maximum number of templates (num. templates) and the maximum cumulative probability (max. cum. prob.) can be specified; the search will terminate either when the maximum number of templates or the maximum cumulative probability is reached, whichever comes first
    • Apply all templates
      • All of the templates that have been collected from the database are applied
      • This is relatively slow
  • Precursor recommendations are selected and sorted based on a scoring function. There are four options for precursor scoring:
    • Heuristic: crude synthetic accessibility score
    • Relevance+Heuristic: combination of heuristic and template relevance
    • SCScore: learned synthetic complexity score
    • Natural ordering: as the name implies
  • In addition, a minimum plausibility may be specified. The plausibility metric represents the likelihood that a pair of candidate precursors can be successfully used to generate the target compound under any set of conditions.

Output

The one-step retrosynthesis tool generates a list of recommended precursor pairs. The pairs are rank-ordered based on the scoring function specified by the user. For each pair, the following information is displayed:

  1. Skeletal structures
  2. SMILES strings
  3. Information about buyability
  4. "Score": the value of the score defined by the chosen scoring function
  5. "# Examples": the number of entries in Reaxys that support the template(s) that was (were) applied to generate the precursors
  6. "Max template relevance": the relevance score (or maximum thereof) assigned by the neural network to the template(s) that was (were) applied to generate the precursors

Adjacent to each pair of precursor SMILES strings is a link, "-->?”, that copies the strings over to the context recommender (see below) for easy acquisition of context recommendations.

Example

To demonstrate the use of the one-step retrosynthesis tool, consider the potential target compound Atropine. To obtain a list of precursors that can be used to generate Atropine via a single reaction, click the “One-step Retro” link on the left-hand side of the ASKCOS homepage. The following screen will appear:

Enter Atropine (by name or SMILES string: CN1C2CCC1CC(C2)OC(=O)C(CO)c3ccccc3) into the “Target compound” field. The default algorithm options generally work well, and may be left as-is. Click Search. Below the Search button, the parsed molecule structure will appear, along with the rank-ordered list of precursor candidates. The first three candidates are shown in the image below. If a candidate is buyable, its price per gram will appear next to its SMILES string; if not, the phrase “cannot buy” is listed:

The SMILES strings of the proposed candidates are links that the user can click to run the one-step retrosynthesis on a candidate. This allows users to design full syntheses manually, in a step-by-step fashion. For example, clicking the SMILES string of the non-buyable precursor that is ranked second in the output above automatically runs the one-step expansion on that candidate:

Alternatively, the user can click the link designated by "-->?"" adjacent to the precursor SMILES strings in the one-step output. This automatically runs the context recommender. For example, clicking the context recommendation link for the first pair of precursors in the one-step retrosynthesis we generated for Atropine gives:

You can also click on the image of the reactants themselves to expand a list of the templates that were used to suggest that transformation. Multiple templates can lead to the same precursor, so there may be many with different degrees of specificity and different numbers of precedents.

Clicking through provides more details about the template, including its SMARTS string. It displays a selection of precedents from Reaxys sorted by decreasing yield.

Tree Builder

The Tree Builder is used to identify precursor molecules that can be combined to generate a target compound via a series of reactions (a “tree”). This is a multi-step retrosynthetic planner.

Usage

  1. Specify target compound:
    • Enter SMILES string of target in “Target compound” field
    • OR
    • Click “Draw” link adjacent to “Target compound” field and draw target compound in the window that appears
  2. Confirm that parsed structure matches intended target
  3. Select expansion settings (see Options, below)
  4. Select stopping criteria (see Options, below)
  5. Select evaluation settings (see Options, below)
  6. Click Start

Options

Expansion settings

  • The maximum depth setting allows you to specify the maximum number of sequential synthetic steps you would like to see in the results
    • The tree-builder roughly follows a “depth-first” search, which means that it prioritizes expanding existing chemical nodes at depth k to depth k+1 over collecting additional option-nodes at depth k. Thus, setting the maximum depth too high may preclude the creation of several good disconnections at a particular depth
    • We recommend setting the maximum depth equal to 1 + the length of the shortest anticipated route
  • The minimum retrosynthetic template count setting allows you to specify the minimum number of instances in Reaxys that support the templates that will be applied in your tree expansion. We recommend leaving this setting at the default of zero (effectively: no minimum), unless you have some motivation to exclusively use popular chemistry
  • The tree search relies on truncated branching: after a chemical node is expanded to generate a list of potential precursors, a heuristic scoring function is used to select the most promising precursors from the list. The maximum branching factor specifies the number of precursors to select during this truncation. The heuristic prioritizes buyability and, to a lesser extent, structural simplicity
  • Expansion time (s): this setting allows you to specify the amount of time for which you would like the expansion to run
  • For information about template prioritization, num. templates, max. cum. prob, and precursor scoring: see One-Step Retro Options

Stop Criteria

  • Maximum chemical price ($/g): this setting allows you to specify the maximum price for chemicals that the algorithm deems buyable
  • Chemical property logic (optional): changing this setting from the default ‘None’ to ‘OR’ or ‘AND’ brings up four new text fields that allow you to specify the maximum number of carbon, nitrogen, oxygen, and hydrogen atoms you would like the starting materials to contain. The ‘OR’ option will allow the algorithm to terminate the expansion of a chemical node either if the compound is known to be buyable or if it satisfies the maximum atom criteria specified. The ‘AND’ option will prevent the algorithm from terminating an expansion unless both the buyability and heavy atom criteria are satisfied. Setting the maximum count for a particular atom type to an arbitrarily large number (e.g., 1000, which is the default for each) effectively places no constraint on the algorithm in relation to that atom type.
  • Chemical popularity logic (optional): changing this setting from the default ‘None’ to ‘OR’ will allow the algorithm to terminate the expansion of a chemical node either if the compound is known to be buyable or if it appears with a high frequency as either a reactant or product in the Reaxys database. If a compound appears frequently in Reaxys, it is likely that a synthetic chemist would know how to make it.
    • Specifying this option can be very useful for lengthy syntheses. After an initial expansion (with this option set to OR) from a target compound is complete, those starting materials that are not buyable can be copied and expanded to buyable chemicals on their own.

Evaluation settings

  • Minimum plausibility: this setting allows you to specify the minimum plausibility of each reaction in your expansion. Reactions that are proposed by the retrosynthetic algorithm but do not meet this minimum will be removed from the expansion.
  • Manual forward prediction: this setting defines the forward prediction approach that will be implemented if a user clicks the 'Evaluate' link on a particular proposed synthetic tree in the output. This does not affect the outcome of the retrosynthetic expansion itself.
  • Pay close attention to chirality: we recommend keeping this box checked. This prevents the application of achiral templates to chiral molecules, or chiral templates to achiral molecules, although it is somewhat more computationally expensive to do so.

Output

The tree-builder produces a rank-ordered list of recommended synthetic routes. Compounds that are buyable are outlined in green; those that are not are outlined in orange. Between each target compound and its retrospective precursor(s) is a number that specifies the number of examples of reactions in Reaxys that support the template(s) that was (were) applied to make the proposed disconnection.

You can store the results of an expansion and review them later. To do this after an expansion is complete, scroll to the top of the website and click the "My Results" tab in the black banner. In the list that appears, click the “Save this page” link to save your results. From this list, you can also access results that you have saved previously (to do this, click "See saved results" instead of "Save this page").

Additionally, once a retrosynthetic expansion is performed, the user has the option to blacklist a chemical or reaction that appears in the tree if it is, e.g., known to be patented. Then, the user may re-run the expansion to obtain recommendations that exclude the chemical(s) or reaction(s) that were deemed undesirable. The list of compounds and reactions that you have blacklisted can be accessed by clicking the "My Banlist" tab in the black banner at the top of the website. For more information, see the Example below.

Example

A set of proposed synthetic routes to the compound Fluconazole is easily created using the default tree-builder settings. To get started, click the Tree Builder module option (accessible from the Modules drop-down at the top of the ASKCOS website). This will bring up a field where you can enter the SMILES string of your target (in this case, Fluconazole):

When you enter Fluconazole’s name or SMILES string in the "Target compound" field, the various algorithm settings appear, along with the parsed structure (which you should confirm is correct):

Click Start. Once the expansion is complete, a summary of the results of the expansion appears, followed by a rank-ordered list of potential synthetic routes. The summary of the expansion results shows the number of chemicals (or, at half-depths, reactions) that were found by the algorithm during the expansion.

The first recommendation in the list of synthetic routes is Fluconazole itself, because it can be purchased for less than the maximum chemical price that was specified as part of the stop criteria. Since Fluconazole is commercially available, it is outlined in green:

The second option is a linear synthesis consisting of two reactions. The epoxide intermediate is outlined in orange because it is not buyable:

Hovering over each of the compounds pulls up a pale-yellow window with additional information, including the compound’s SMILES string, frequency of appearance as a reactant or product in Reaxys, and the options to “blacklist” or “hide all” (see Tree Builder - Output). For compounds that are commercially available, the price per gram is listed (non-purchasable compounds show the words “not buyable”). For example, for the non-purchasable epoxide intermediate:

As described in Tree Builder - Output, compounds and reactions may be blacklisted, if desired. To do this, for example, for the epoxide intermediate in Option 2, mouse over the compound, and click the blacklist link in the pale-yellow window that appears. Another pop-up will appear with a text field where you can specify the reason for blacklisting the chemical (for your records only). After you click OK, another window will appear, confirming that the compound with the specified SMILES string was blacklisted. Click OK again.

To review the compounds and reactions that you’ve blacklisted, scroll to the top of the webpage and click the "My Banlist" tab. A list of options will appear, including the option to “View banned chemicals”:

If you click the “View banned chemicals” link, you will be taken to a new webpage that shows a list of the chemicals you’ve blacklisted. For now, only the epoxide intermediate appears. You have the option to temporarily disable compounds on your blacklist by clicking the True link under the Active column on the left; you may also permanently delete compounds on your blacklist by clicking the X under the Delete column on the right:

Now that we’ve blacklisted this chemical, it is possible to return to the Tree Builder tab and rerun the Fluconazole expansion. This time, none of the trees that the algorithm recommends will contain the blacklisted chemical.

Context Recommendation

The context recommender is useful for cases when you know the chemical transformation you want to make, and you’d like recommendations for the reaction conditions. This might be the case, for example, after you’ve run the Tree Builder on a target compound and you have a series of transformations in-hand: each reaction in the series can then be passed through the context recommender to obtain suggested reaction conditions. At this time, the context recommender is capable of recommending a catalyst, reagent, solvent (up to 2), and temperature.

Usage

  1. Specify reactants:
    1. Enter reactant SMILES strings into “Reactants” field. Separate distinct reactants using a period, “.”
    2. OR
    3. Click “Draw” link adjacent to “Reactants” field and draw reactants in window that appears
  2. Specify products:
    1. Enter product SMILES strings into "Products” field
    2. OR
    3. Click “Draw” link adjacent to "Products” field and draw products in window that appears
  3. Select context recommender option (see Options, below; Neural Network preferred)
  4. Click Get Context Recommendations

Options

Two options for the context recommender are available: a neural network and a nearest neighbor search. The nearest neighbor approach is an early version of the context recommender that relies on computationally expensive similarity calculations. It is slower and less accurate than the neural network.

Output

The context recommender produces a list of recommendations for reaction conditions. Each recommendation contains a subset of the following features: reagent, catalyst, solvent (1 or 2), and temperature. We define reagents as compounds that do not contribute heavy atoms to the product. In some cases, the distinction between a reagent and a catalyst in Reaxys is ambiguous, such that the distinction between these categories in the recommendations may be ambiguous as well, although data cleaning has been performed to help mitigate this. Note that the context recommender will not produce an error if the proposed chemical transformation is not achievable via a single reaction.

Example

Consider the FMOC deprotection in the scheme below. A weak base would effectively remove the FMOC group without altering the other functionality in the molecule. This example will show that the context recommender is sensitive to this type of requirement.

To begin, click the Context Recommendation option on the left-hand side of the ASKCOS homepage. This will bring up a screen where you can enter the SMILES strings of the reactant(s) and product of the reaction of interest (or, alternatively, you can click the Draw link adjacent to the text fields).

In the “Reactants” text field, enter the SMILES strings of the reactant in the FMOC deprotection: Reactant SMILES: O=C([C@H](CO[C@@H]1O[C@@H]([C@@H]([C@@H]([C@H]1NC(C)=O)OC(C)=O)OC(C)=O)COC(C)=O)NC(OCC2C3=C(C4=C2C=CC=C4)C=CC=C3)=O)NCCCCCCCCCCCCCC For reactions that include multiple reactants, distinct reactant SMILES strings should be separated with a period. In the “Products” text field, enter the SMILES string of the product of the FMOC deprotection: Product SMILES: O=C([C@H](CO[C@@H]1O[C@@H]([C@@H]([C@@H]([C@H]1NC(C)=O)OC(C)=O)OC(C)=O)COC(C)=O)N)NCCCCCCCCCCCCCC Once you enter the strings, the parsed structures will appear:

For the context recommender, choose the neural network. Click Get Context Recommendations. Below the Get Context Recommendations button, a list of ten recommendations will appear:

The results, which include weak bases and low temperatures, reflect the need to selectively deprotect the Fmoc group.

Forward Prediction

The forward predictor can be used to predict the outcome of a reaction between any number of compounds.

Usage

  1. 1. Specify reactants. Note that compounds that can contribute atoms to the product should be included as reactants:
    1. Enter SMILES strings of reactants, separated by a period, in the "Reactants" field
    2. OR
    3. Click the "Draw" link adjacent to the “Reactants” field and draw reactant structures in window that appears
  2. Select prediction approach (see Options, below)
  3. Specify reagents (any non-solvent and non-catalyst compounds that do not contribute heavy atoms to the product):
    1. Enter SMILES strings of reagents, separated by a period, in the “Reagents” field
    2. Click the “Draw” link adjacent to the “Reagents” field and draw reagent structures in window that appears
  4. Select solvent
  5. Specify the maximum number of products you would like the algorithm to return (see Options, below)
  6. If using template-based prediction, specify temperature and minimum template count (see Options, below)

Options

Prediction approach – two options exist:

  • Template-free
    • Faster and more accurate than template-based approach
    • May allow for prediction of more inventive chemistries than template-based approach
    • Does not yet take reaction conditions (context) into account, although reagents and solvents may be specified
  • Template-based
    • Traditional approach
    • Somewhat context-dependent
    • Allows user to specify the solvent, reagent, and temperature of interest
    • Using the template-based approach requires specification of the minimum template count – this feature determines the minimum frequency of Reaxys instances that support the templates that are applied to generate predictions. The default value of 25 is the minimum that may be specified, as 25 has been found to be an appropriate minimum for getting reasonable predictions; higher values bias the search toward more popular chemistries

Maximum number of products

  • This feature adjusts the number of candidate outcomes that are returned
  • Large values do not affect computational time, but do affect the amount of time it takes to render the images

Output

The forward predictor produces a list of candidate reaction outcomes that are rank-ordered by a Probability score that reflects the model’s confidence that each outcome will appear. The probabilities for a set of candidates should not be interpreted together to comprise an expected product distribution, nor should an individual probability be interpreted as an expected yield; rather, a probability for a given outcome purely reflects the model’s confidence regarding whether or not that compound will appear at all (the model is not trained on specific yield-related data).

Example

Consider the p-toluenesulfonic acid-promoted etherification of benzhydrol by dimethylaminoethanol that can be used to synthesize diphenhydramine. To predict the outcome of a reaction between benzhydrol and DMAE in the presence of p-toluenesulfonic acid, start by entering the SMILES strings of the two reactants, separated by a period, in the "Reactants" field (or, alternatively, draw the structures in the window that appears after clicking the Draw link adjacent to the Reactants field).

SMILES string of DMAE: CN(C)CCO SMILES string of benzhydrol: OC(c1ccccc1)c1ccccc1 The parsed structures will appear so the user can confirm that their input was interpreted correctly.

Next, choose the Prediction approach. Since we have a reagent that we would like to specify, we can use either approach, although the template-based approach does not make any explicit consideration of the fact that p-toluenesulfonic acid is expected to act as a reagent and not a reactant.

Choosing the template-free expansion method gives the predicted outcomes according to the Template-free forward predictor:

Reaction Evaluation

The reaction evaluator allows you to specify a pair of precursor compounds and a potential target compound, and reports the model’s confidence (probability) that the target compound will be generated via a reaction between the specified precursors.

Usage

  1. Specify reactants:
    1. Enter SMILES strings of reactants, separated by a period, in the “Reactants” field
    2. OR
    3. Click the “Draw” link adjacent to the “Reactants” field and draw reactant structures in window that appears
  2. Specify target compound/product of interest:
    1. Enter SMILES string of target compound in the “Product” field
    2. OR
    3. Click the “Draw” link adjacent to the “Product” field and draw product structure in window that appears
  3. Select prediction approach (see Options, below)
  4. Select context recommender (see Options, below; Neural Network preferred)
  5. Decide whether or not reagents should contribute heavy atoms to the product (see Options, below)
  6. Click Evaluate

Options

Prediction approach – two options exist:

  • Template-free
    • Faster and more accurate than template-based approach
    • May allow for better assessment of relatively inventive chemistries than template-based approach
  • Template-based
    • Traditional approach
    • Somewhat context-dependent
  • Fast filter
    • Does not consider context explicitly
    • Tries to evaluate the likelihood that the reaction could succeed under some context
    • Can also be used during one- or multi-step retrosynthesis to automatically filter results

Context recommender – two options exist when applicable (see Context Recommendation - Options for more information):

  • Neural network
  • Nearest neighbor

Output

The primary output of the reaction evaluator is the plausibility score, which indicates the model’s confidence that the compound specified by the user in the “Product” field will be generated when the specified precursors react with one another.

Example

Consider the simplest one-step retrosynthesis for diphenhydramine. We can enter the reactant and product SMILES strings into the Evaluator:

Using the Template-free method, we can virtually screen 10 different reaction conditions to find which one seems to be most promising, i.e., which one leads to the intended product with the highest confidence.

Alternately, we can ask if this reaction is possible under any set of conditions, which is the question that the Fast filter is designed to answer. Choosing this option gives a very confident prediction that this reaction is possible.