DGTLENG 301 · Lesson 3 of 5

Surrogate-Based Optimization at Scale

The Optimization Bottleneck

Engineering optimization requires evaluating many design candidates. Gradient-based methods need function evaluations at each step. Evolutionary algorithms evaluate populations across generations. Bayesian methods evaluate sequentially but still need many evaluations to converge. When each evaluation requires a high-fidelity simulation — finite element analysis, computational fluid dynamics, multi-physics coupled models — the cost per evaluation is measured in hours or days, and optimization becomes impractical.

A design space with 20 parameters, each at 5 levels, contains nearly 100 trillion possible configurations. Even with intelligent sampling, exploring this space with full-fidelity simulation would take centuries of compute time. This is not a theoretical problem — it is the daily reality of engineering design teams who must find good designs within schedule and budget constraints.
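The size of this design space follows from simple counting. The short check below reproduces the figure; the one hour per simulation is an assumption made only for illustration, since no run time is given for this example.

```python
# Full-factorial count: levels ** parameters.
n_parameters = 20
n_levels = 5
n_configurations = n_levels ** n_parameters
print(f"{n_configurations:,} configurations")  # 95,367,431,640,625

# Assumption for illustration only: one hour per high-fidelity run.
hours_per_run = 1.0
hours_per_year = 24 * 365
years_single_machine = n_configurations * hours_per_run / hours_per_year
print(f"{years_single_machine:.2e} machine-years for exhaustive evaluation")
```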

Surrogate-based optimization addresses this bottleneck by replacing the expensive simulation with a fast approximation. The surrogate is trained on a set of simulation results and can evaluate new design points in milliseconds rather than hours. The optimization algorithm calls the surrogate instead of the simulator, enabling thousands or millions of evaluations at a fraction of the cost.

But surrogates introduce approximation error. The question is not "is the surrogate perfect?" — it never is. The question is "is the surrogate accurate enough for the decision being made, and how do we know?"

Direct Optimization: The Expensive Baseline

Direct optimization calls the high-fidelity simulation for every evaluation. No approximation, no surrogate, no shortcuts. The simulation model — validated against test data, calibrated to physical behavior — provides the most trustworthy evaluation available.

When it works: Problems with few parameters (under 10), fast simulations (minutes per run), and sufficient compute budget. A thermal sizing study with 5 parameters and a 3-minute simulation run can complete 1,000 evaluations in about 50 hours, roughly two days, on a single machine. For problems of this scale, direct optimization is both practical and preferable: no surrogate error, no model management, no questions about approximation validity.

When it does not work: Problems with many parameters (20 or more), expensive simulations (hours per run), or the need for many thousands of evaluations (global optimization, uncertainty quantification, multi-objective Pareto mapping). A structural optimization with 30 parameters and a 4-hour simulation cannot practically evaluate even 500 candidates. That would take 2,000 hours of compute, which on a single engineering workstation means nearly three months of continuous wall-clock time.
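The two cost figures above can be reproduced with a small estimator. This is a sketch: the function name `wall_clock_hours` and the `n_parallel` parameter are introduced here for illustration, and the model ignores queueing, licensing limits, and failed runs.

```python
def wall_clock_hours(n_evaluations, hours_per_run, n_parallel=1):
    """Estimate wall-clock hours when runs are spread over n_parallel machines."""
    # Runs execute in waves of n_parallel; ceiling division counts the waves.
    waves = -(-n_evaluations // n_parallel)
    return waves * hours_per_run

# Thermal sizing study: 1,000 runs at 3 minutes each on one machine.
thermal_hours = wall_clock_hours(1000, 3 / 60)
# Structural study: 500 runs at 4 hours each on one machine.
structural_hours = wall_clock_hours(500, 4.0)

print(f"thermal: {thermal_hours:.0f} h ({thermal_hours / 24:.1f} days)")
print(f"structural: {structural_hours:.0f} h ({structural_hours / 24:.0f} days)")
```

Raising `n_parallel` shows how far extra hardware moves the boundary between practical and impractical before a surrogate becomes the better investment.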

The cost of direct optimization is the motivation for every surrogate method. Understanding this cost is essential for deciding when a surrogate is worth the effort of building and validating it.

Surrogate-Assisted Optimization

Surrogate-assisted optimization replaces the expensive simulation with a trained ML model. The process follows a well-established workflow.

Step 1: Design the training set. Use Design of Experiments (DoE) methods to select a set of input parameter combinations that sample the design space efficiently. Latin Hypercube Sampling ensures even coverage of each parameter's range. Space-filling designs (Sobol sequences, maximin distance) ensure that no large regions of the design space are left unsampled. The training set size depends on the dimensionality of the problem and the complexity of the response surface — typically 10 to 20 points per parameter for smooth responses, more for complex or nonlinear surfaces.
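A minimal, dependency-free sketch of Latin Hypercube Sampling follows. The parameter names and bounds are hypothetical, and the sample size of 30 simply applies the lower end of the 10 to 20 points-per-parameter guideline to three parameters. In practice a library implementation (for example the quasi-Monte Carlo samplers in `scipy.stats.qmc`) would normally be preferred.

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Return n_samples points; each parameter range is split into n_samples
    equal strata and every stratum is used exactly once per parameter."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One random position inside each stratum, in unit coordinates.
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(unit)  # decouple the stratum order between parameters
        columns.append([low + u * (high - low) for u in unit])
    # Transpose columns (per parameter) into rows (per design point).
    return [list(point) for point in zip(*columns)]

# Hypothetical 3-parameter design: thickness [mm], width [mm], angle [deg].
bounds = [(1.0, 5.0), (10.0, 40.0), (0.0, 90.0)]
design = latin_hypercube(n_samples=30, bounds=bounds)
print(design[0])
```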

Step 2: Run the simulations. Execute the high-fidelity simulation for each training point. This is the expensive step, but it is a one-time cost. The simulation runs can be parallelized across available compute resources. Each run produces the inputs (design parameters) and outputs (performance metrics) that the surrogate will learn from.

Step 3: Train the surrogate. Fit an ML model to the input-output data. The choice of model depends on the problem:

Gaussian Process Regression (Kriging) provides predictions with uncertainty estimates. It tells you not just what it predicts but how confident it is. Excellent for problems with fewer than 500 training points and moderate dimensionality. The uncertainty estimate is its most valuable feature — it enables active learning (described below) and provides a natural warning when the surrogate is extrapolating.
Neural networks handle larger datasets and higher-dimensional inputs. Their inference cost does not grow with the size of the training set, so they are faster than Gaussian processes when the dataset is large. They require more training data and hyperparameter tuning, and they do not natively provide calibrated uncertainty estimates, though ensemble methods and dropout-based approximations can add this capability.
Radial basis functions are simple, fast, and effective for smooth response surfaces. A good default when the response is expected to be well-behaved and the training data is modest in size.
Gradient boosting and random forests handle mixed continuous and categorical inputs naturally. Good for problems that combine continuous design parameters with discrete choices (material type, manufacturing process, component selection).

Step 4: Validate the surrogate. Test the surrogate against simulation results it was not trained on. Hold out 10-20% of the data, or use cross-validation to assess accuracy across the full dataset. Track RMSE, R-squared, and maximum error. The validation must confirm that accuracy is sufficient for the intended use — a surrogate used for screening (which designs are worth investigating?) can tolerate more error than one used for final performance prediction.
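The three metrics named in Step 4 can be computed directly from held-out results. The stress values below are hypothetical and exist only to show how an acceptable aggregate score can hide a large error at a single point.

```python
import math

def validation_metrics(y_true, y_pred):
    """Return RMSE, R-squared, and maximum absolute error on held-out data."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,  # undefined if y_true is constant
        "max_error": max(abs(r) for r in residuals),
    }

# Hypothetical held-out simulation results and surrogate predictions [MPa].
y_sim = [210.0, 185.0, 240.0, 198.0, 225.0]
y_hat = [205.0, 190.0, 238.0, 215.0, 224.0]
metrics = validation_metrics(y_sim, y_hat)
print(metrics)
```

Here the maximum error comes from one design point that is off by 17 MPa, which is why maximum error is tracked alongside RMSE and R-squared rather than instead of them.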

Step 5: Optimize using the surrogate. Run the optimization algorithm against the surrogate, evaluating thousands of candidates in seconds. The optimizer finds the best design according to the surrogate's predictions. Then validate the best candidates by running the actual high-fidelity simulation to confirm that the surrogate's prediction was accurate.

The critical risk: the optimizer will exploit errors in the surrogate. If the surrogate underpredicts stress in one region of the design space, the optimizer will drive designs into that region because they appear feasible. This is why validation of the optimizer's recommended designs against the full simulation is not optional — it is the step that catches surrogate errors before they become design errors.
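The toy example below illustrates this failure mode and the check that catches it. Everything in it is hypothetical: `simulate` stands in for the expensive model, `surrogate` is deliberately built to be optimistic in one region, a dense grid stands in for the optimizer, and the tolerance is an assumed value.

```python
def simulate(x):
    """Stand-in for the expensive simulation (hypothetical toy objective)."""
    return (x - 0.3) ** 2

def surrogate(x):
    """Toy surrogate: exact for x <= 0.8, increasingly optimistic beyond it."""
    return simulate(x) - 3.0 * max(0.0, x - 0.8)

# Cheap search on the surrogate (a dense grid stands in for the optimizer).
candidates = [i / 100 for i in range(101)]
ranked = sorted(candidates, key=surrogate)

# Re-run the real simulation on the top candidates and compare.
tolerance = 0.05  # assumed acceptable prediction error for this toy problem
for x in ranked[:3]:
    predicted, actual = surrogate(x), simulate(x)
    status = "ok" if abs(predicted - actual) <= tolerance else "REJECT"
    print(f"x={x:.2f} predicted={predicted:.3f} actual={actual:.3f} {status}")
```

The search lands at the edge of the region where the surrogate is wrong, not at the true optimum near x = 0.3, and only the re-simulation reveals the gap.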

Multi-Fidelity Optimization

Multi-fidelity methods combine data from simulations of different accuracy levels. A coarse-mesh finite element model runs in minutes; a fine-mesh model runs in hours. A simplified thermal model runs in seconds; a full conjugate heat transfer model runs in days. Multi-fidelity surrogates learn from all available data, using cheap low-fidelity data to map the general shape of the response surface and expensive high-fidelity data to calibrate the predictions where accuracy matters.

Co-kriging is the classical multi-fidelity surrogate. It models the response at each fidelity level and the correlation between levels. The low-fidelity model provides broad coverage (hundreds or thousands of cheap evaluations). The high-fidelity model provides accuracy anchors (tens of expensive evaluations). The co-kriging model interpolates between the high-fidelity points, guided by the structure learned from the low-fidelity data.
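One common way to write this relationship down is the autoregressive model usually attributed to Kennedy and O'Hagan. It is offered here as a sketch, not necessarily the exact formulation any particular tool uses. The symbols $\rho$ (a scaling factor between fidelity levels) and $\delta$ (a discrepancy term) are notation introduced for this illustration.

```latex
y_{\mathrm{HF}}(x) = \rho \, y_{\mathrm{LF}}(x) + \delta(x)
```

In this form, $y_{\mathrm{LF}}$ and $\delta$ are modeled as independent Gaussian processes. The many cheap runs determine $y_{\mathrm{LF}}$, while the few expensive runs determine $\rho$ and $\delta$.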

The cost advantage is substantial. A multi-fidelity surrogate might achieve the accuracy of a high-fidelity surrogate trained on 200 expensive runs by using 20 expensive runs plus 500 cheap runs. If the cheap runs are 100 times faster than the expensive ones, the 500 cheap runs cost the same as 5 expensive runs, so the total is 25 run-equivalents instead of 200: a reduction by a factor of eight while maintaining prediction quality.
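The cost comparison can be checked by expressing every run in high-fidelity-run equivalents. The helper name `cost_in_hf_runs` is introduced here for illustration.

```python
def cost_in_hf_runs(n_high, n_low, speedup):
    """Total cost in high-fidelity-run equivalents; one low-fidelity run
    costs 1/speedup of a high-fidelity run."""
    return n_high + n_low / speedup

single_fidelity = cost_in_hf_runs(n_high=200, n_low=0, speedup=100)
multi_fidelity = cost_in_hf_runs(n_high=20, n_low=500, speedup=100)
print(single_fidelity / multi_fidelity)  # 8.0
```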

The risk is fidelity correlation. Multi-fidelity methods assume that the low-fidelity model captures the correct trends even if it gets the magnitudes wrong. If the low-fidelity model has qualitatively different behavior — predicting a peak where the high-fidelity model shows a valley — the multi-fidelity surrogate will be misled. Validating the correlation between fidelity levels before building the surrogate is essential.
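A simple first check is to run both models at the same handful of design points and compute the correlation between their outputs. The sketch below uses hypothetical data, and the 0.9 acceptance threshold is an assumed project-specific value, not a standard.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical outputs of both models at the same six design points [MPa].
low_fidelity = [100.0, 120.0, 150.0, 170.0, 200.0, 230.0]
high_fidelity = [118.0, 139.0, 176.0, 195.0, 236.0, 270.0]  # same trend, offset
r = pearson(low_fidelity, high_fidelity)
print(f"correlation: {r:.3f}")

threshold = 0.9  # assumed acceptance level
print("usable" if r >= threshold else "investigate before combining fidelities")
```

A high overall correlation does not rule out local disagreement, so it is worth repeating the check within sub-regions of the design space where the physics may change.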

Practical multi-fidelity sources in engineering:

Coarse mesh vs. fine mesh FEA (same physics, different resolution)
Simplified boundary conditions vs. full boundary conditions (same geometry, reduced complexity)
Reduced-order models vs. full-order models (mathematical simplification)
Empirical correlations vs. simulation (different analysis methods entirely)

Each combination has a different fidelity gap and a different correlation structure. The engineer's job is to select multi-fidelity combinations where the low-fidelity model is cheap, reasonably correlated with high-fidelity, and available across the full design space.

Active Learning: The Surrogate Decides What to Simulate

The most powerful surrogate-based approach is active learning, where the surrogate itself determines which new simulation to run. Instead of designing the entire training set upfront, the process is iterative:

1. Train the surrogate on an initial small sample.
2. Use the surrogate to identify the design point where its prediction is most uncertain or where new information would most improve the optimization.
3. Run the high-fidelity simulation at that point.
4. Update the surrogate with the new data.
5. Repeat until the optimization converges or the simulation budget is exhausted.

Expected Improvement (EI) is the most common acquisition function for active learning in optimization. It balances exploitation (sampling where the surrogate predicts good performance) with exploration (sampling where the surrogate is uncertain). A point with high EI is either predicted to be very good, or very uncertain, or both — all of which are valuable places to invest an expensive simulation.
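The iterative loop and the EI acquisition function can be sketched together in a few dozen lines. This is a dependency-free illustration under stated assumptions: a hypothetical 1-D objective stands in for the simulation, the Gaussian process uses a zero mean and a squared-exponential kernel with a fixed length scale (no hyperparameter fitting), candidates come from a fixed grid, and the objective is minimized. A production workflow would normally rely on a library Gaussian process implementation instead.

```python
import math

def simulate(x):
    """Stand-in for the expensive simulation (hypothetical 1-D objective)."""
    return math.cos(6.0 * x) + 0.5 * x

def kernel(a, b, length=0.2):
    """Squared-exponential covariance with unit prior variance (assumed)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular factor L with L L^T = A (A symmetric positive definite)."""
    L = [[0.0] * len(A) for _ in A]
    for i in range(len(A)):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def forward_solve(L, b):
    """Solve L z = b by forward substitution."""
    z = []
    for i, row in enumerate(L):
        z.append((b[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    return z

def gp_fit(xs, ys, jitter=1e-6):
    """Factorize the training covariance once per surrogate update."""
    K = [[kernel(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    L = cholesky(K)
    return L, forward_solve(L, ys)

def gp_predict(xs, L, z_y, x_new):
    """Zero-mean GP posterior mean and standard deviation at x_new."""
    z_k = forward_solve(L, [kernel(a, x_new) for a in xs])
    mean = sum(p * q for p, q in zip(z_k, z_y))
    var = kernel(x_new, x_new) - sum(p * p for p in z_k)
    return mean, math.sqrt(max(var, 0.0))

def expected_improvement(mean, std, best):
    """EI for minimization: expected amount by which a run beats `best`."""
    if std < 1e-12:
        return max(best - mean, 0.0)
    z = (best - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mean) * cdf + std * pdf

xs = [0.0, 0.25, 1.0]                   # small initial sample
ys = [simulate(x) for x in xs]
grid = [i / 200 for i in range(201)]    # candidate designs on [0, 1]

for _ in range(8):                      # assumed remaining simulation budget
    L, z_y = gp_fit(xs, ys)             # update the surrogate
    best = min(ys)
    candidates = [x for x in grid if x not in xs]
    x_next = max(candidates, key=lambda x: expected_improvement(
        *gp_predict(xs, L, z_y, x), best))
    xs.append(x_next)                   # run the simulation at the chosen point
    ys.append(simulate(x_next))

best_y = min(ys)
print(f"best x={xs[ys.index(best_y)]:.3f}, objective={best_y:.4f}, runs={len(xs)}")
```

The two terms in the return value of `expected_improvement` correspond to the two motives described above: the first rewards a good predicted value, the second rewards uncertainty.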

The advantage is efficiency. Active learning concentrates simulations where they matter most — near the optimum and in regions of high uncertainty. It avoids wasting simulations in well-characterized regions of the design space that are far from optimal. In practice, active learning can find near-optimal designs with 30-50% fewer simulations than a fixed training set approach.

The limitation is sequential computation. Each new simulation must be selected based on the results of all previous simulations. This means simulations cannot be fully parallelized — each batch must wait for the previous batch to complete and the surrogate to update. Batch active learning methods (selecting multiple points per iteration) partially address this at the cost of slightly reduced efficiency.

When Surrogates Mislead

Surrogates are not always helpful. Understanding when they fail prevents costly design errors.

High-dimensional problems with limited data. A 50-parameter problem with 100 training points leaves vast regions of the design space unsampled. The surrogate interpolates wildly between sparse data points, and the optimizer exploits these interpolations.

Discontinuous responses. A system that transitions between operating modes (laminar to turbulent flow, elastic to plastic deformation, contact to separation) has a response surface with jumps and kinks. Smooth surrogates — Gaussian processes, neural networks with smooth activation functions — cannot represent these discontinuities and produce misleading predictions near the transition boundaries.

Extrapolation. Surrogates are interpolation machines. When the optimizer pushes into regions of the design space beyond the training data, the surrogate extrapolates with no reliability guarantee. Some surrogates (Gaussian processes) flag this through high uncertainty. Others (neural networks) extrapolate silently, producing confident but wrong predictions.
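For surrogates that extrapolate silently, one crude guard is to check each candidate against the range covered by the training data before trusting the prediction. The sketch below is an illustration with hypothetical parameter names and values. A bounding-box test is necessary but not sufficient, since a point can sit inside the box and still be far from every training point.

```python
def outside_training_range(point, training_points):
    """Return indices of parameters where `point` falls outside the
    min/max range covered by the training data."""
    flagged = []
    for j, value in enumerate(point):
        column = [p[j] for p in training_points]
        if value < min(column) or value > max(column):
            flagged.append(j)
    return flagged

# Hypothetical training designs: (thickness [mm], width [mm]).
training = [(1.0, 10.0), (2.0, 25.0), (3.5, 18.0), (4.0, 30.0)]
print(outside_training_range((2.5, 20.0), training))  # []
print(outside_training_range((4.8, 20.0), training))  # [0]
```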

Overfitting. A surrogate with too many parameters relative to the training data memorizes the noise in the simulation results rather than learning the underlying trend. The surrogate fits the training data perfectly but predicts poorly on new points. Cross-validation and holdout testing detect overfitting — but only if the validation data covers the relevant region of the design space.

The defense against misleading surrogates is always the same: validate the optimizer's recommended designs against the full-fidelity simulation before making decisions. The surrogate is a search tool, not a decision tool. It narrows the field; the high-fidelity simulation confirms the result.

Optimization Approaches: Cost and Accuracy Trade-offs

Direct optimization
Cost per evaluation: Full simulation cost, hours or days per evaluation depending on model complexity.
Total evaluations practical: Hundreds at most. Budget limits exploration to a small fraction of the design space.
Accuracy of evaluation: Highest available. Uses the validated simulation directly, with no approximation error.
Risk profile: Risk of missing the global optimum because too few candidates can be evaluated. No surrogate error risk.
Best for: Low-dimensional problems (under 10 parameters), fast simulations (minutes per run), or when approximation is unacceptable.

Compare the four approaches across cost, accuracy, and risk dimensions. Notice the progression: each method trades some form of cost for some form of risk. The engineering decision is which trade-off is appropriate for the specific problem and its consequences.

Assessment

Question 1 of 2

A team has built a Gaussian Process surrogate for a structural optimization with 15 parameters, trained on 150 simulation runs. The surrogate shows R-squared of 0.95 on cross-validation. The optimizer finds a design that the surrogate predicts is 25% lighter than any training point. Which of the following are appropriate responses? (Select all that apply)

You are tasked with optimizing a component design with 25 parameters, where each high-fidelity simulation takes 8 hours. Your compute budget allows 200 simulation runs total. Describe your optimization strategy: (1) which approach you would use (direct, surrogate, multi-fidelity, or active learning) and why, (2) how you would allocate your 200-run budget between training and validation, (3) what surrogate model type you would choose and why, and (4) how you would validate the optimizer's final recommended design.