DGTLENG 106 · Lesson 3 of 5

Optimization and Search

Every Design Is a Search

Every engineering design decision is, at its core, a search. You have a space of possible configurations — dimensions, materials, topologies, process parameters — and you are looking for the configuration that best satisfies your objectives while respecting your constraints. Whether the search is conducted with a formal optimizer or with an engineer's intuition and spreadsheets, the underlying structure is the same: objectives, constraints, design variables, and a landscape to navigate.

The difference is scale. An engineer evaluating three candidate materials against two load cases can reason through the trade-offs manually. An engineer searching a 50-dimensional design space with coupled nonlinear constraints and three competing objectives cannot. That is where computational optimization becomes essential — not as a convenience, but as the only viable approach.

Gradient-Based Methods: Fast and Fragile

Gradient-based optimizers compute the derivative of the objective function with respect to each design variable, then take steps in the direction that improves the objective. Steepest descent, conjugate gradient, and quasi-Newton methods all follow this logic. They are the workhorses of optimization when conditions are right.
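As a minimal sketch of that logic, the snippet below runs fixed-step steepest descent on a smooth convex quadratic. The objective, the step size, and the function names are illustrative assumptions chosen for this example, not part of any particular optimizer library.

```python
import numpy as np

def objective(x):
    """Hypothetical convex quadratic bowl with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + 4.0 * (x[1] + 2.0) ** 2

def gradient(x):
    """Analytic gradient of the objective above."""
    return np.array([2.0 * (x[0] - 1.0), 8.0 * (x[1] + 2.0)])

def steepest_descent(x0, step=0.1, tol=1e-8, max_iter=1000):
    """Step against the gradient until the gradient norm falls below tol."""
    x = np.asarray(x0, dtype=float)
    n_steps = 0
    while n_steps < max_iter:
        g = gradient(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - step * g  # move in the direction that lowers the objective
        n_steps += 1
    return x, n_steps

x_opt, n_steps = steepest_descent([5.0, 5.0])
print("optimum estimate:", x_opt, "after", n_steps, "gradient steps")
```

A fixed step is the simplest choice; practical implementations usually add a line search or a quasi-Newton curvature estimate, which is what makes conjugate gradient and quasi-Newton methods faster on poorly scaled problems.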

When they work well. The objective function is smooth, differentiable, and convex (or nearly so). The design space is continuous. Gradients can be computed analytically or via adjoint methods. Under these conditions, gradient methods converge in tens to hundreds of evaluations — orders of magnitude fewer than alternatives.

When they fail. The objective function has discontinuities or kinks (the response changes slope abruptly when a material switches from elastic to plastic behavior), multiple local optima (the optimizer gets trapped in a mediocre solution), integer or categorical variables (you cannot take a gradient with respect to "material A vs. material B"), or noisy evaluations (numerical noise in a simulation produces unreliable gradients). In these cases, a gradient-based optimizer does not just perform poorly: it can converge confidently to a suboptimal solution with no indication that better solutions exist elsewhere.
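The local-optimum failure mode can be illustrated with a one-dimensional sketch. The quartic below is a hypothetical objective with two minima; the same descent routine converges to a different answer depending only on where it starts, and nothing in either run signals that another minimum exists.

```python
def f(x):
    """Hypothetical objective with a global minimum near x = -1.30
    and a shallower local minimum near x = 1.13."""
    return x**4 - 3.0 * x**2 + x

def df(x):
    """Analytic derivative of f."""
    return 4.0 * x**3 - 6.0 * x + 1.0

def descend(x0, step=0.01, n_steps=2000):
    """Plain fixed-step gradient descent from the starting point x0."""
    x = float(x0)
    for _ in range(n_steps):
        x -= step * df(x)
    return x

x_from_right = descend(2.0)   # settles in the shallower local minimum
x_from_left = descend(-2.0)   # settles in the global minimum

print("start at  2.0 -> x =", round(x_from_right, 3), "f =", round(f(x_from_right), 3))
print("start at -2.0 -> x =", round(x_from_left, 3), "f =", round(f(x_from_left), 3))
```

Running the optimizer from several starting points is a cheap diagnostic: if the results disagree, the landscape is not convex enough to trust a single run.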

The practitioner's responsibility: know whether the landscape is smooth enough for gradient methods before choosing them. If you cannot answer that question, you cannot trust the result.

Evolutionary Algorithms: Robust and Expensive

Genetic algorithms, particle swarm optimization, and differential evolution take a fundamentally different approach. They maintain a population of candidate solutions, evaluate each, and use selection, recombination, and mutation (or, in particle swarm optimization, velocity updates that pull candidates toward the best positions found so far) to generate new candidates. There are no gradients, no smoothness assumptions, and no requirement for continuous variables.
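The population loop can be sketched with a small differential evolution routine (the common rand/1/bin variant). The Rastrigin test function, the control parameters F and CR, and the population size are illustrative assumptions, not tuned recommendations.

```python
import numpy as np

def rastrigin(x):
    """Standard multimodal test function; global minimum 0 at the origin."""
    return 10.0 * x.size + np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x))

def differential_evolution(func, bounds, pop_size=30, generations=100,
                           F=0.7, CR=0.9, seed=0):
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    dim = low.size
    pop = rng.uniform(low, high, size=(pop_size, dim))
    fitness = np.array([func(ind) for ind in pop])
    n_evals = pop_size
    history = [fitness.min()]  # best value after each generation
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct members other than i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = np.clip(a + F * (b - c), low, high)
            # Recombination: mix mutant and parent coordinate by coordinate.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True  # keep at least one mutant gene
            trial = np.where(cross, mutant, pop[i])
            # Selection: the trial replaces the parent only if it is no worse.
            f_trial = func(trial)
            n_evals += 1
            if f_trial <= fitness[i]:
                pop[i], fitness[i] = trial, f_trial
        history.append(fitness.min())
    best = int(np.argmin(fitness))
    return pop[best], fitness[best], n_evals, history

best_x, best_f, n_evals, history = differential_evolution(
    rastrigin, bounds=[(-5.12, 5.12)] * 2)
print("best value:", best_f, "after", n_evals, "evaluations")
```

Note that the routine never calls a derivative: it only needs objective values, which is why it also works on black-box or discontinuous objectives.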

What they handle. Nonconvex landscapes with multiple local optima. Mixed integer-continuous design spaces. Discontinuous or noisy objective functions. Black-box evaluations where no gradient information is available. Multi-objective problems where the goal is not a single optimum but a Pareto frontier of non-dominated solutions (NSGA-II is the canonical algorithm here).

What they cost. Evolutionary algorithms need thousands to tens of thousands of objective function evaluations to converge. If each evaluation is a 6-hour FEA run, 10,000 evaluations amount to 60,000 hours, close to seven years of serial compute time. Even with parallelization, this can be prohibitive for high-fidelity simulation-driven optimization. The exploration is thorough but expensive.
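The cost arithmetic is worth making explicit. The sketch below uses the 6-hour evaluation time from the text; the worker count and the assumption of perfect parallel scaling are illustrative simplifications.

```python
HOURS_PER_EVALUATION = 6.0
HOURS_PER_YEAR = 24 * 365

def campaign_years(n_evaluations, parallel_workers=1):
    """Wall-clock years, assuming ideal parallelism with no queueing or overhead."""
    batches = -(-n_evaluations // parallel_workers)  # ceiling division
    return batches * HOURS_PER_EVALUATION / HOURS_PER_YEAR

for n in (1_000, 10_000):
    serial = campaign_years(n)
    parallel = campaign_years(n, parallel_workers=100)
    print(f"{n} evaluations: {serial:.2f} years serial, "
          f"{parallel * 365:.1f} days on 100 workers")
```

Real campaigns scale worse than this, because each generation must finish before the next one starts and license or cluster limits cap the usable worker count.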

The trade-off is clear: evolutionary algorithms trade computational efficiency for robustness. Given enough evaluations they are likely to locate the global region of interest, though without a formal guarantee, and "enough" is a large number.

Bayesian Optimization: Smart Sampling

Bayesian optimization occupies a different niche. It is designed for problems where each objective function evaluation is expensive — a physical experiment, a high-fidelity simulation, a prototype test — and the budget for evaluations is small (typically tens to low hundreds, not thousands).

The approach: maintain a probabilistic surrogate model (typically a Gaussian process) of the objective function. The surrogate provides both a prediction and an uncertainty estimate at every point in the design space. An acquisition function (expected improvement, knowledge gradient, or similar) uses both the prediction and the uncertainty to decide where to sample next. The key innovation: the optimizer does not just seek low predicted values of the objective. It also samples regions of high uncertainty, because a better solution may be hiding where the surrogate knows least.
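A compact sketch of this loop follows, using a Gaussian process with a squared-exponential kernel and the expected improvement acquisition function. The kernel length scale, the noise level, the test objective, and the candidate grid are all assumptions made for illustration; a real study would fit the hyperparameters to the data and would normally use an established GP library.

```python
import math
import numpy as np

LENGTH_SCALE = 1.0  # assumed fixed; normally fitted to the data
PRIOR_VAR = 1.0     # assumed prior variance of the objective

def rbf_kernel(a, b):
    """Squared-exponential covariance between two sets of 1D points."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return PRIOR_VAR * np.exp(-0.5 * sq_dist / LENGTH_SCALE**2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-4):
    """Posterior mean and standard deviation at x_new (constant-mean GP)."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(x_obs.size)
    K_s = rbf_kernel(x_obs, x_new)
    y_mean = y_obs.mean()
    mean = y_mean + K_s.T @ np.linalg.solve(K, y_obs - y_mean)
    var = PRIOR_VAR - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mean, std, best_so_far):
    """Expected improvement for minimization; larger is better."""
    z = (best_so_far - mean) / std
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + np.array([math.erf(v / math.sqrt(2.0)) for v in z]))
    return np.maximum((best_so_far - mean) * cdf + std * pdf, 0.0)

def objective(x):
    """Hypothetical stand-in for an expensive evaluation."""
    return np.sin(x) + 0.1 * (x - 5.0) ** 2

x_obs = np.array([0.5, 3.0, 9.0])
y_obs = objective(x_obs)
initial_best = y_obs.min()
candidates = np.linspace(0.0, 10.0, 201)

for _ in range(10):
    mean, std = gp_posterior(x_obs, y_obs, candidates)
    ei = expected_improvement(mean, std, y_obs.min())
    x_next = candidates[np.argmax(ei)]  # balances low mean and high uncertainty
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, objective(x_next))

print("best observed value:", y_obs.min(), "at x =", x_obs[np.argmin(y_obs)])
```

Expected improvement is zero-bounded and grows both when the predicted mean is below the current best and when the predicted standard deviation is large, which is how a single number encodes the exploration and exploitation balance described above.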

Where it excels. Expensive-to-evaluate functions with moderate dimensionality (typically up to 20 or 30 variables). Physical experiments where each data point costs thousands of dollars. Situations where you need to find a good solution in 50 evaluations, not 5,000.

Where it struggles. High-dimensional spaces (the Gaussian process surrogate degrades in many dimensions). Large evaluation budgets (if you can afford 10,000 evaluations, evolutionary algorithms may be more straightforward). Highly noisy evaluations (the surrogate needs to distinguish signal from noise). Categorical or hierarchical design spaces (standard Gaussian processes assume continuous inputs).

Multi-Objective Optimization: The Pareto Frontier

Engineering rarely has a single objective. Minimize weight and maximize stiffness. Minimize cost and maximize reliability. Reduce drag and maintain lift. These objectives conflict — improving one degrades another.

Multi-objective optimization does not produce a single "best" design. It produces a Pareto frontier: the set of designs where no objective can be improved without worsening at least one other. Every design on the frontier is optimal in the sense that no other design dominates it, meaning no other design is at least as good on every objective and strictly better on at least one.
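The dominance test can be written in a few lines. The sketch below filters a set of evaluated designs down to the non-dominated ones; the weight and stiffness numbers are made-up illustrative values, and stiffness is negated so that every column is minimized.

```python
import numpy as np

def pareto_mask(costs):
    """Boolean mask of non-dominated rows; every column is minimized."""
    costs = np.asarray(costs, dtype=float)
    mask = np.ones(costs.shape[0], dtype=bool)
    for i in range(costs.shape[0]):
        no_worse = np.all(costs <= costs[i], axis=1)
        strictly_better = np.any(costs < costs[i], axis=1)
        # Design i is dominated if some design is no worse everywhere
        # and strictly better on at least one objective.
        if np.any(no_worse & strictly_better):
            mask[i] = False
    return mask

# Hypothetical designs: (weight in kg, stiffness in kN/mm).
weight = np.array([10.0, 12.0, 11.0, 15.0, 15.0])
stiffness = np.array([200.0, 250.0, 170.0, 290.0, 220.0])
costs = np.column_stack([weight, -stiffness])  # maximize stiffness = minimize its negative

mask = pareto_mask(costs)
print("non-dominated designs:", np.flatnonzero(mask))
```

This pairwise check costs on the order of n squared comparisons, which is fine for a few thousand designs; algorithms such as NSGA-II use faster non-dominated sorting internally.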

The frontier is the engineer's decision map. It shows the quantitative trade-offs: "To reduce weight by 5%, stiffness decreases by 12%." The optimizer generates this map. The engineer — using judgment about which trade-offs are acceptable given project priorities, regulatory requirements, and stakeholder preferences — selects a design from it.

This division of labor is fundamental. The optimizer explores. The engineer decides. Attempts to collapse multi-objective problems into a single weighted-sum objective (for example, minimizing 0.6 * weight - 0.4 * stiffness, with both terms normalized to comparable scales) conceal the trade-offs and bias the search toward a single point on the frontier. The full frontier is almost always more informative.
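A small sketch shows how much a weighted sum can hide. The three designs below are hypothetical, with both objectives normalized and minimized. All three are non-dominated, yet sweeping the weight from 0 to 1 only ever selects the two extremes, because the compromise design sits on a non-convex part of the frontier.

```python
import numpy as np

# Hypothetical normalized objectives (both minimized) for three designs.
objectives = np.array([
    [0.0, 1.0],   # design 0: best on objective 1
    [1.0, 0.0],   # design 1: best on objective 2
    [0.6, 0.6],   # design 2: a compromise, also non-dominated
])

def weighted_sum_choice(objectives, w):
    """Index of the design minimizing w * f1 + (1 - w) * f2."""
    scores = w * objectives[:, 0] + (1.0 - w) * objectives[:, 1]
    return int(np.argmin(scores))

chosen = {weighted_sum_choice(objectives, w) for w in np.linspace(0.0, 1.0, 101)}
print("designs ever selected by a weighted sum:", sorted(chosen))
```

The compromise design scores 0.6 for every weight, while one of the extremes always scores 0.5 or less, so no choice of weights can recover it. Generating the frontier directly avoids this blind spot.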

How AI Transforms Optimization

AI has not changed what optimization is. It has changed what is computationally feasible.

Surrogate-based optimization. Replace the expensive objective function (FEA, CFD, physical test) with a trained ML surrogate. The optimizer evaluates the surrogate in milliseconds instead of hours. This is arguably the most impactful application of ML in computational engineering: once the training data exists, it converts optimization problems that would require years of compute into problems that finish overnight. The catch: the surrogate is only accurate within its training domain. Adaptive strategies that alternate between surrogate predictions and full-fidelity validation checks are essential.
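The alternation between surrogate prediction and full-fidelity validation can be sketched as follows. A cubic polynomial stands in for the ML surrogate and a closed-form function stands in for the expensive simulation; both are assumptions chosen to keep the example short.

```python
import numpy as np

def expensive_simulation(x):
    """Hypothetical stand-in for an FEA or CFD run."""
    return (x - 2.0) ** 2 + 0.5 * np.sin(3.0 * x)

# Initial training set: five "expensive" runs across the design range.
x_train = np.linspace(0.0, 4.0, 5)
y_train = expensive_simulation(x_train)
initial_best = y_train.min()

# Search only inside the sampled range, where the surrogate has support.
grid = np.linspace(x_train.min(), x_train.max(), 401)

for iteration in range(5):
    coeffs = np.polyfit(x_train, y_train, deg=3)        # cheap surrogate fit
    x_best = grid[np.argmin(np.polyval(coeffs, grid))]  # optimize the surrogate
    y_pred = np.polyval(coeffs, x_best)
    y_true = expensive_simulation(x_best)               # full-fidelity check
    print(f"iter {iteration}: x = {x_best:.3f}, "
          f"surrogate = {y_pred:.3f}, validated = {y_true:.3f}")
    # Feed the validated point back so the next surrogate is better informed.
    x_train = np.append(x_train, x_best)
    y_train = np.append(y_train, y_true)

print("best validated value:", y_train.min())
```

The gap between the surrogate and validated values in each iteration is the quantity to watch: a large gap means the surrogate is being trusted somewhere it has not earned it.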

Reinforcement learning for sequential decisions. Some engineering design processes are sequential: the choice of material constrains the choice of process, which constrains achievable geometry, which constrains performance. Reinforcement learning agents learn policies for these sequential decisions by exploring the decision tree and receiving rewards for good outcomes. Applications include additive manufacturing process planning, assembly sequence optimization, and control system tuning.

Transfer learning across optimization campaigns. An optimization campaign for version N of a product generates thousands of evaluated designs. When version N+1 arrives with a modified design space, transfer learning leverages the knowledge from the previous campaign to warm-start the search. The optimizer does not start from scratch — it starts from an informed prior, reducing the number of new evaluations needed.
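One simple form of warm-starting is to seed the initial population of a new campaign with the best designs from the previous one. The sketch below assumes the two campaigns share the same design variables and differ only in their bounds; the function name, the keep fraction, and the example data are hypothetical.

```python
import numpy as np

def warm_start_population(prior_designs, prior_scores, new_bounds,
                          pop_size, keep_fraction=0.5, seed=0):
    """Build an initial population from prior bests plus random fill.

    Assumes lower scores are better and that prior designs use the same
    variables as the new problem (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    low, high = np.asarray(new_bounds, dtype=float).T
    prior_designs = np.asarray(prior_designs, dtype=float)
    n_keep = min(int(keep_fraction * pop_size), len(prior_designs))
    best_idx = np.argsort(prior_scores)[:n_keep]
    # Clip prior designs into the new bounds in case the space has shrunk.
    seeded = np.clip(prior_designs[best_idx], low, high)
    random_fill = rng.uniform(low, high, size=(pop_size - n_keep, low.size))
    return np.vstack([seeded, random_fill])

prior_designs = [[1.0, 2.0], [3.0, 9.0], [0.5, 0.5], [2.0, 2.0]]
prior_scores = [5.0, 1.0, 3.0, 2.0]
population = warm_start_population(prior_designs, prior_scores,
                                   new_bounds=[(0.0, 4.0), (0.0, 8.0)],
                                   pop_size=6)
print(population)
```

Keeping some random members alongside the seeded ones is a deliberate hedge: if the new design space has shifted the optimum, a population built only from old designs would inherit their blind spots.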

Automated problem formulation. Perhaps the most speculative application: AI assisting in defining the optimization problem itself. Given natural language specifications ("lightweight, corrosion-resistant, manufacturable by casting"), LLMs could propose candidate objectives, constraints, and design variable ranges. This does not replace engineering judgment, but it could accelerate the often-tedious process of translating requirements into mathematical optimization formulations.

Optimization Methods: Navigating the Cost-Robustness Spectrum

Method: Gradient-based

Evaluations Needed: Tens to hundreds. Efficient when gradients are available and the landscape is smooth.

Robustness: Low. Converges to the nearest local optimum, which may not be the global optimum. No exploration of distant regions.

Variable Types: Continuous only. Cannot handle integer, categorical, or mixed variables.

Best For: Smooth, convex, continuous problems where gradients can be computed analytically or via adjoint methods.

AI Augmentation: Limited, because the method is already efficient. AI adds value by providing differentiable surrogates for non-differentiable objectives.

Move through the four stages to see how the cost-robustness trade-off evolves. Notice that AI augmentation does not replace the earlier methods — it transforms their economics. Gradient methods remain the right choice for smooth problems. Evolutionary algorithms remain the right choice for multi-objective Pareto exploration. AI changes which problems are tractable, not which methods are correct.

Assessment

Question 1 of 3

A team is optimizing the shape of a turbine blade for aerodynamic efficiency. Each CFD evaluation takes 12 hours. The design space has 8 continuous shape parameters. The budget allows for 80 total CFD runs. Which optimization approaches are appropriate for this problem? (Select all that apply)

Implement a simplified Bayesian optimization loop in Python. The objective function is a noisy 1D function. Your task: (1) define an acquisition function that balances exploration (high uncertainty) and exploitation (low predicted value), (2) select the next sample point, and (3) update the model. Use the comments to guide your implementation. This is pseudocode-level — focus on the logic, not the library calls.

import numpy as np

# Simulated objective function (expensive to evaluate)
def objective(x):
  """Noisy 1D function we want to minimize. Each call is 'expensive.'"""
  return (x - 3.5)**2 * np.sin(4 * x) + np.random.normal(0, 0.1)

# Initial samples (pretend these cost us 3 expensive evaluations)
X_observed = np.array([1.0, 5.0, 8.0])
Y_observed = np.array([objective(x) for x in X_observed])

# Candidate points to consider for next evaluation
X_candidates = np.linspace(0, 10, 200)

def surrogate_predict(X_observed, Y_observed, x_new):
  """
  Simplified surrogate: returns (predicted_mean, predicted_std).
  In practice this would be a Gaussian Process. Here, use a simple
  heuristic: predicted mean = weighted average of nearby observations,
  predicted std = higher when far from observed points.
  """
  # YOUR CODE: compute mean prediction and uncertainty estimate
  mean = 0.0   # replace
  std = 1.0    # replace
  return mean, std

def acquisition(mean, std, best_so_far, trade_off=1.0):
  """
  Acquisition function: lower is better (we are minimizing).
  Simplest option, the Lower Confidence Bound:
    LCB = mean - trade_off * std
  Points with low predicted mean AND high uncertainty score well.
  If you implement Expected Improvement instead, return its negative:
  EI is larger for better points, and the loop below picks the minimum.
  """
  # YOUR CODE: return acquisition value
  return 0.0  # replace

# Bayesian optimization loop: select next point
best_so_far = np.min(Y_observed)
acquisition_values = []

for x in X_candidates:
  mean, std = surrogate_predict(X_observed, Y_observed, x)
  acq = acquisition(mean, std, best_so_far)
  acquisition_values.append(acq)

# Select the candidate with the best (lowest) acquisition value
next_x = X_candidates[np.argmin(acquisition_values)]
print(f"Next point to evaluate: x = {next_x:.2f}")

# Evaluate and update
next_y = objective(next_x)
X_observed = np.append(X_observed, next_x)
Y_observed = np.append(Y_observed, next_y)
print(f"Observed value: y = {next_y:.2f}")
print(f"Total evaluations used: {len(X_observed)}")