.. _quickstart: Quickstart ========== This guide demonstrates a minimal AMLRO workflow for reaction optimization. It mirrors the three core steps of AMLRO and is intended to introduce the framework with the least possible setup. Each step has a dedicated function that can be called by the user: 1. **Reaction Space Generation** → ``get_reaction_scope()`` 2. **Training Set Generation** → ``generate_training_data()`` 3. **Active Learning Prediction** → ``get_optimized_parameters()`` AMLRO is designed for *iterative* optimization workflows. Between optimization steps, experimental or computational feedback must be provided by the user. Step 0: Import Packages ----------------------- .. code-block:: python from amlro.generate_reaction_conditions import get_reaction_scope from amlro.generate_training_data import generate_training_data from amlro.optimizer import get_optimized_parameters import pandas as pd Step 1: Define the Configuration Dictionary ------------------------------------------- AMLRO is configured using a single dictionary that defines reaction parameters, objectives. The configuration dictionary specifies: - Continuous reaction parameters (bounds, resolution, names) - Categorical reaction parameters (explicit value lists) - Optimization objectives and directions A minimal configuration dictionary **must contain all required keys**, even if some sections (e.g., categorical variables) are empty. The following example shows the **required structure only**. Values are placeholders and should be replaced by user-defined settings. .. code-block:: python config = { "continuous": { "bounds": [[min_value, max_value], ...], "resolutions": [step_size, ...], "feature_names": ["feature_1", "feature_2", ...], }, "categorical": {"feature_names": [], "values": []}, "objectives": ["objective_1", "objective_2", ...], "directions": ["max", "min", ...], } An experiment directory is used to store all intermediate files: .. code-block:: python exp_dir = "exp_data" Step 2: Generate Reaction Space and Initial Training Conditions --------------------------------------------------------------- .. code-block:: python get_reaction_scope( config=config, sampling="lhs", training_size=10, write_files=True, exp_dir=exp_dir ) This step generates the following files: - ``full_combo.csv``: encoded full reaction space - ``full_combo_decoded.csv``: human-readable reaction space - ``training_combo.csv``: initial reaction conditions to evaluate To inspect the initial training conditions: .. code-block:: python df = pd.read_csv(f"{exp_dir}/training_combo.csv") print(df) Step 3: Generate the Training Dataset ------------------------------------ After performing experiments or simulations for the reaction conditions listed in ``training_combo.csv``, objective values must be provided before proceeding. Once objective values are available, generate or update the training dataset: .. code-block:: python generate_training_data( exp_dir=exp_dir, config=config, filename="reactions_data.csv" ) This step creates empty files (if not already present) or updates the following datasets: - ``reactions_data.csv``: encoded dataset used for machine learning model training - ``reactions_data_decoded.csv``: human-readable version of the training dataset This function is designed to be called **iteratively**, where reaction conditions are evaluated round-by-round and experimental feedback is incorporated after each iteration. **Providing Experimental or Computational Feedback: AMLRO supports both manual and programmatic ways of supplying objective values. **Option A: Manual update (recommended for open loop experimental workflows)** If experimental results are obtained outside of AMLRO, users may manually create or update ``reactions_data.csv`` before running the active learning step. To do so: - Open ``reactions_data.csv`` - Copy initial reaction conditions from ``training_combo.csv`` - Add columns corresponding to the objective names defined in ``config["objectives"]`` - Ensure no additional columns are present - For categorical features, use encoded values (``0, 1, 2, ...``) following the order specified in the configuration dictionary. - Also create a ``reactions_data_decoded.csv`` file with actual categorical values. This option is also suitable when users already possess prior experimental data. .. important:: When updating training or reaction data files manually, they must contain **only** the following columns: - Feature columns defined in ``config`` - Objective columns defined in ``config["objectives"]`` **Option B: Programmatic update (simulations or benchmarks)** When objective values can be computed programmatically (e.g., benchmark functions or simulations), users may iteratively update the dataset using a ``for`` loop by calling ``generate_training_data()`` function. In this case, objective values are computed and appended automatically, provided that column names exactly match those defined in the configuration. This approach is demonstrated in the Branin tutorial and Google Colab example. Step 4: Predict Next Optimal Reaction Conditions ------------------------------------------------ .. code-block:: python next_conditions = get_optimized_parameters( exp_dir=exp_dir, config=config, batch_size=5 filename='reactions_data.csv' ) This completes one AMLRO iteration. The workflow can be repeated by adding new experimental feedback and re-running Steps 4 and 5. Next Steps ---------- AMLRO also supports fully automated closed-loop optimization when objective values can be computed programmatically. For an interactive example using the Branin benchmark function, see: - :doc:`tutorials/branin_example` - Google Colab notebook (linked in the tutorial) A web-based interface for interactive use is under development.