Reaction Space Generation
This section describes how AMLRO constructs the reaction search space and
selects the initial training conditions using the get_reaction_scope
entry-point function.
This is the first step in any AMLRO workflow.
Overview
Reaction space generation serves two purposes:
Construct the full combinatorial reaction space based on user-defined continuous and categorical parameters.
Select an initial subset of reactions for training using a chosen sampling strategy.
The reaction scope is generated using:
get_reaction_scope(
    config=config,
    sampling="sobol",
    training_size=10,
    write_files=True,
    exp_dir=exp_dir,
)
This function generates all reaction combinations and writes them to
full_combo.csv, which serves as the reaction grid for active learning optimization.
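As a rough illustration of what "all reaction combinations" means, the sketch below builds a Cartesian product over a small parameter set using itertools.product. The parameter names and values are hypothetical, and it assumes continuous variables have already been discretized into a list of levels. It is not AMLRO's internal implementation.

```python
from itertools import product

# Hypothetical parameter levels, for illustration only.
temperature = [25.0, 50.0, 75.0]  # continuous variable, assumed already discretized
time_min = [10.0, 20.0]           # continuous variable, assumed already discretized
solvent = ["MeOH", "THF"]         # categorical variable

# One row per combination of one value from each parameter: 3 * 2 * 2 = 12 rows.
full_space = list(product(temperature, time_min, solvent))

print(len(full_space))  # 12
print(full_space[0])    # (25.0, 10.0, 'MeOH')
```

The size of the space is the product of the number of levels per parameter, so it grows quickly as parameters or levels are added.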
Function Arguments
Configuration Dictionary
config defines the reaction parameters and objectives and is described
in detail in configuration.
Only the reaction variables and their values are read from config at this stage.
Sampling Strategy
The sampling argument controls how the initial training reactions are
selected from the full reaction space.
Supported options include:
"lhs" – Latin Hypercube Sampling
"sobol" – Sobol low-discrepancy sequence
"random" – Random sampling
The sampling method affects only the initial training set and does not alter the full reaction space.
Implementation details:
Latin Hypercube Sampling (LHS) is implemented using PyDOE2 with a min–max criterion, 1000 iterations, and a fixed random seed for reproducibility.
Sobol sampling is implemented using scipy.stats.qmc.
Random sampling uses uniform random selection.
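To show the idea behind Latin Hypercube Sampling, the sketch below implements the basic, unoptimized form in pure Python: each dimension is split into as many equal strata as there are samples, and every stratum receives exactly one point. The function name latin_hypercube is hypothetical. This is not the PyDOE2 routine described above and it omits the criterion-based optimization. Mapping the resulting unit-cube points onto the discrete reaction grid is a separate step that is not shown.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=0):
    """Return n_samples points in [0, 1)^n_dims, one point per stratum in each dimension."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    columns = []
    for _ in range(n_dims):
        # Split [0, 1) into n_samples equal strata and draw one value inside each.
        values = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(values)  # decouple the stratum order between dimensions
        columns.append(values)
    # Transpose the per-dimension columns into a list of points.
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


points = latin_hypercube(n_samples=5, n_dims=2)
for dim in range(2):
    # Each dimension covers every stratum exactly once.
    print(sorted(int(p[dim] * 5) for p in points))  # [0, 1, 2, 3, 4]
```

The stratification is what distinguishes this from plain uniform random selection, which may leave parts of a parameter's range unsampled in a small training set.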
Training Size
training_size specifies the number of reaction conditions selected for the
initial training set.
This value should reflect experimental or computational budget constraints. Typical values range from 5 to 50 reactions, depending on problem complexity. Do not use fewer than 5.
Experiment Directory
exp_dir defines the directory where all generated reaction space and
training files are written.
If the directory does not exist, it will be created automatically. The same
directory should be used throughout the full optimization cycle. In particular,
if you add reaction_data.csv manually, place it in this directory.
Example:
exp_dir = "exp_data"
File Generation
When write_files=True, the reaction space generation step produces the
following files inside exp_dir:
full_combo.csv
Encoded representation of the complete reaction space. Categorical variables are stored as integer indices.
full_combo_decoded.csv
Human-readable version of the full reaction space with original categorical values restored. This file is generated only if the reaction space contains 20000 or fewer combinations.
training_combo.csv
Initial subset of reaction conditions selected using the specified sampling strategy. These reactions are intended to be performed experimentally or evaluated via simulation.
These files allow users to inspect, visualize, or modify reaction conditions prior to experimentation.
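The relationship between the encoded and decoded files can be pictured with a small sketch. It assumes that each integer index is the position of a category in the user-supplied list of values. That ordering is an assumption made for illustration and is not confirmed by this section, and the variable names and data are hypothetical.

```python
# Hypothetical categorical variable and its levels, for illustration only.
solvent_levels = ["MeOH", "THF", "DMSO"]

# Encoded rows in the style of full_combo.csv: (temperature, solvent index).
encoded_rows = [(25.0, 0), (50.0, 2), (75.0, 1)]

# Decoding replaces each integer index with the original label
# (assumes index == position in solvent_levels).
decoded_rows = [(temp, solvent_levels[idx]) for temp, idx in encoded_rows]

print(decoded_rows)
# [(25.0, 'MeOH'), (50.0, 'DMSO'), (75.0, 'THF')]
```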
Inspecting the Reaction Space
Users may inspect the generated reaction combinations using standard data analysis tools.
Example:
import pandas as pd
df = pd.read_csv("exp_data/training_combo.csv")
print(df.head())
This is particularly useful for verifying parameter ranges, sampling behavior, and categorical encoding.
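A few quick checks of this kind are sketched below. To stay runnable without the generated files, the example uses the standard csv module on an in-memory stand-in for training_combo.csv. The column names and values are hypothetical, and the same checks can be written with pandas.

```python
import csv
import io

# Stand-in for training_combo.csv; column names and values are hypothetical.
csv_text = """temperature,time,solvent
25.0,10.0,0
75.0,20.0,1
50.0,10.0,1
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Range covered by a numeric column, to compare against the configured bounds.
temps = [float(r["temperature"]) for r in rows]
print(min(temps), max(temps))  # 25.0 75.0

# Distinct encoded categorical values that the sample covers.
solvent_codes = sorted({int(r["solvent"]) for r in rows})
print(solvent_codes)  # [0, 1]
```

If a categorical level or part of a numeric range is missing from the training set, increasing training_size or trying a different sampling option may improve coverage.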
Relationship to the AMLRO Workflow
Reaction space generation is a one-time initialization step. Once completed, users proceed to:
Perform experiments or simulations for conditions listed in training_combo.csv
Invoke active learning to propose new reaction conditions
For details on how experimental feedback is incorporated, see training_data.rst.
For details on batch selection and optimization, see active_learning.rst.