DGTLENG 106 · Lesson 4 of 5

Uncertainty, Statistics, and Data-Driven Methods

The Answer Without the Confidence Interval Is Not an Answer

Engineering analysis produces numbers. A peak stress of 347 MPa. A drag coefficient of 0.032. A fatigue life of 1.2 million cycles. These numbers are presented in reports, used in trade studies, compared against allowables, and cited in certification documentation. And every single one of them is wrong — in the sense that no model perfectly represents reality, no input parameter is known exactly, and no numerical method produces an exact solution.

The question is not whether the answer is wrong. The question is how wrong it might be. Uncertainty quantification (UQ) is the discipline that answers this question. It is not optional. It is not a nice-to-have that gets added if the schedule permits. It is what separates engineering from guessing. A stress prediction of 347 MPa with no uncertainty characterization is a number. A stress prediction of 347 MPa with a 95% confidence interval of 310 to 395 MPa, given the uncertainty in material properties and load conditions, is engineering information.

Three Sources of Uncertainty

Every engineering prediction carries uncertainty from three distinct sources. Conflating them leads to misdiagnosis and wasted effort.

Model uncertainty (epistemic): Is the physics right? A linear elastic FEA model applied to a material that is actually yielding has model error. A RANS turbulence model applied to massively separated flow has model error. This uncertainty is in the choice of equations, not in their solution. More runs of the same model, or more input data fed into it, do not reduce model uncertainty. Only a better model does.

Parameter uncertainty (aleatory and epistemic): Are the inputs right? Material yield strength is not a single number — it is a distribution reflecting batch-to-batch variation. Applied loads vary with operating conditions. Geometry deviates from nominal due to manufacturing tolerances. Some of this uncertainty is irreducible (true physical variability) and some is reducible (we could measure more precisely if we invested in better characterization).

Numerical uncertainty: Is the discretization fine enough? Is the solver converged? Is the time step small enough? This is the uncertainty introduced by the computational method itself. Mesh convergence studies address this source. It is the most controllable of the three — but it is also the one most frequently ignored once the model "runs."
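A mesh convergence study can be turned into a number with Richardson extrapolation. The sketch below is a minimal illustration, assuming three meshes refined by a constant ratio and monotonic convergence; the function name `richardson` and the three stress values are hypothetical, not taken from any real study.

```python
import math

def richardson(f_coarse, f_medium, f_fine, r):
    """Estimate observed order and mesh-independent value from three meshes.

    r is the constant refinement ratio between successive meshes.
    Assumes the three results converge monotonically.
    """
    # Observed order of convergence from the ratio of successive differences
    p = math.log((f_coarse - f_medium) / (f_medium - f_fine)) / math.log(r)
    # Extrapolated estimate of the zero-spacing result
    f_extrap = f_fine + (f_fine - f_medium) / (r**p - 1.0)
    # Relative discretization error remaining in the fine-mesh result
    rel_error = abs((f_extrap - f_fine) / f_extrap)
    return p, f_extrap, rel_error

# Hypothetical peak stresses (MPa) on coarse, medium, and fine meshes
p, f_extrap, rel_error = richardson(330.0, 343.0, 346.25, r=2.0)
print(f"observed order = {p:.2f}")
print(f"extrapolated stress = {f_extrap:.2f} MPa")
print(f"fine-mesh relative error = {rel_error:.3%}")
```

If the observed order is far from the formal order of the element or scheme, the meshes are probably not yet in the asymptotic range and the extrapolated value should not be trusted.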

The interaction between these sources matters. A mesh-converged solution on a wrong model is precisely wrong. A correct model with uncertain parameters gives a distribution of answers, all of which are physically plausible. An unconverged solution with accurate parameters might accidentally get the right answer for the wrong reason.

Monte Carlo Simulation: Brute Force That Works

Monte Carlo is the conceptually simplest and most widely applicable method for propagating uncertainty. The recipe: define probability distributions for all uncertain inputs, sample from those distributions, run the model for each sample, and collect the outputs. The set of outputs is an empirical approximation of the output distribution.

If you sample 10,000 combinations of material properties, loads, and geometric tolerances, and run your FEA model for each, you get 10,000 stress predictions. The histogram of those predictions is your output distribution. From it you can extract means, standard deviations, percentiles, and failure probabilities.
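The recipe can be sketched in a few lines when the model is cheap. The example below is an illustration only: `stress_model` is a closed-form stand-in for a solver, and every distribution and number in it (loads, dimensions, strength) is an assumption chosen for the demonstration.

```python
import random
import statistics

def stress_model(force_n, width_mm, thickness_mm):
    """Stand-in for an expensive solver: axial stress in MPa (N/mm^2)."""
    return force_n / (width_mm * thickness_mm)

def monte_carlo(n_samples, seed=0):
    """Sample the inputs, run the model, and collect the outputs."""
    rng = random.Random(seed)
    stresses = []
    failures = 0
    for _ in range(n_samples):
        # Assumed input distributions (hypothetical values)
        force = rng.gauss(50_000.0, 5_000.0)   # applied load, N
        width = rng.gauss(20.0, 0.1)           # manufacturing tolerance, mm
        thickness = rng.gauss(5.0, 0.05)       # manufacturing tolerance, mm
        strength = rng.gauss(600.0, 30.0)      # batch-to-batch strength, MPa
        stress = stress_model(force, width, thickness)
        stresses.append(stress)
        if stress > strength:
            failures += 1
    return stresses, failures / n_samples

stresses, p_fail = monte_carlo(10_000)
stresses.sort()
mean_stress = statistics.fmean(stresses)
std_stress = statistics.stdev(stresses)
lower = stresses[int(0.025 * len(stresses))]   # 2.5th percentile
upper = stresses[int(0.975 * len(stresses))]   # 97.5th percentile

print(f"mean = {mean_stress:.1f} MPa, std = {std_stress:.1f} MPa")
print(f"central 95% of outputs: {lower:.1f} to {upper:.1f} MPa")
print(f"estimated failure probability = {p_fail:.4f}")
```

The same loop applies to a real solver by replacing `stress_model` with a call that launches a run, which is exactly where the cost problem comes from.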

Why it works. Monte Carlo makes no assumptions about the relationship between inputs and outputs. It handles nonlinearity, discontinuities, interactions, and arbitrary input distributions. As the sample size increases, the sampled outputs converge to the true output distribution, and the convergence rate does not depend on the number of uncertain inputs. That rate is slow but predictable: the standard error of an estimated mean or probability shrinks in proportion to 1/sqrt(N), where N is the sample count, so halving the error requires four times as many samples.
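The 1/sqrt(N) rate becomes concrete when the quantity of interest is a failure probability. The worked example below uses the standard binomial standard error; the symbols p_f (true failure probability), y_i (sampled outputs), y_allow (allowable), and the 10% target are illustrative assumptions rather than values from this lesson.

```latex
\hat{p}_f = \frac{1}{N}\sum_{i=1}^{N} \mathbf{1}\left[\, y_i > y_{\text{allow}} \,\right],
\qquad
\operatorname{SE}(\hat{p}_f) = \sqrt{\frac{p_f\,(1 - p_f)}{N}}

\frac{\operatorname{SE}(\hat{p}_f)}{p_f} = \sqrt{\frac{1 - p_f}{p_f\,N}} = 0.1
\;\Longrightarrow\;
N = \frac{1 - p_f}{0.01\,p_f} \approx 10^{5}
\quad \text{for } p_f = 10^{-3}
```

By the same formula, 10,000 samples would resolve a failure probability of 10^-3 only to a relative standard error of roughly 32%. Rare events are where the slow rate hurts most.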

Why it is expensive. 10,000 model evaluations means 10,000 solver runs. If each FEA run takes 2 hours, that is 20,000 hours of compute, over two years of serial time. Even on a 100-core cluster running one case per core, that is 200 hours of wall-clock time, more than eight days. For CFD runs that take 24 hours each, Monte Carlo at this scale is simply impractical.

This cost barrier is the central problem. Monte Carlo works for any problem but is affordable only for problems where each evaluation is cheap (minutes or less). For high-fidelity simulation, direct Monte Carlo is rarely feasible.

Design of Experiments: Maximum Information, Minimum Runs

Where Monte Carlo samples randomly (or quasi-randomly), Design of Experiments (DOE) samples strategically. The goal: extract the maximum amount of information about the input-output relationship from the minimum number of runs.

Factorial designs sample all combinations of input levels. A full factorial of 5 inputs at 3 levels each requires 3^5 = 243 runs. This grows exponentially with the number of inputs, making it impractical for high-dimensional problems. Fractional factorials reduce the count by sacrificing the ability to estimate higher-order interactions.
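A full factorial is simply the Cartesian product of the level sets. The sketch below reproduces the 3^5 = 243 count; the input names and level values are hypothetical placeholders.

```python
from itertools import product

# Five hypothetical inputs, each at three levels (low, nominal, high)
levels = {
    "load_kn": (45.0, 50.0, 55.0),
    "thickness_mm": (4.9, 5.0, 5.1),
    "width_mm": (19.8, 20.0, 20.2),
    "modulus_gpa": (68.0, 70.0, 72.0),
    "temperature_c": (20.0, 60.0, 100.0),
}

# Every combination of levels: one dictionary per planned run
design = [dict(zip(levels, combo)) for combo in product(*levels.values())]

print(len(design))   # number of runs in the full factorial
print(design[0])     # first run: all inputs at their low level
```

At 10 inputs the same three-level design would need 3^10 = 59,049 runs, which is why full factorials are reserved for small numbers of inputs.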

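Latin Hypercube Sampling (LHS) stratifies every input. For N samples, each input's range is divided into N equal-probability intervals, and exactly one sample falls in each interval. A 50-sample LHS over 10 inputs therefore guarantees that each input is represented across its full range, something random sampling cannot guarantee with so few samples. LHS controls only the one-dimensional marginals; it does not by itself guarantee good coverage of the joint space.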
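The stratification is easy to see in code. The following is a minimal pure-Python sketch of a basic LHS on the unit hypercube, not a production sampler; the function name `latin_hypercube` is an assumption, and library implementations add refinements such as optimized or orthogonal variants.

```python
import random

def latin_hypercube(n_samples, n_inputs, seed=0):
    """Return n_samples points in [0, 1)^n_inputs with one point per stratum."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_inputs):
        # One random value inside each of n_samples equal-width strata
        column = [(k + rng.random()) / n_samples for k in range(n_samples)]
        # Shuffle so the stratum order is independent between inputs
        rng.shuffle(column)
        columns.append(column)
    # Transpose per-input columns into a list of sample points
    return [list(point) for point in zip(*columns)]

samples = latin_hypercube(n_samples=50, n_inputs=10)

# Check the defining property: every input hits each of its 50 strata once
for j in range(10):
    strata = sorted(int(point[j] * 50) for point in samples)
    assert strata == list(range(50))
print(len(samples), "samples, each input covered stratum by stratum")
```

The unit-cube values are then mapped to physical inputs through each input's inverse cumulative distribution function, so that the strata are equal in probability rather than equal in width.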

Sobol sequences are quasi-random (low-discrepancy) sequences that fill the input space more uniformly than random sampling. For reasonably smooth responses they reduce the error of Monte Carlo estimates at a given sample count by avoiding the clusters and gaps that random sampling produces.

The practitioner's choice depends on the goal. If you need to identify which inputs drive the output (screening), a fractional factorial or Plackett-Burman design is efficient. If you need to build a response surface across the full input space, LHS or Sobol sequences provide better coverage. If you need to estimate sensitivity indices (what fraction of output variance is attributable to each input), Sobol sequences paired with Sobol sensitivity analysis are the standard approach.
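To show what a sensitivity index is, the sketch below estimates first-order Sobol indices with a standard pick-freeze estimator. Two simplifications are assumptions of this example: it draws plain pseudo-random samples rather than a Sobol sequence, and the model is a hypothetical linear function whose exact indices (0.2 and 0.8) are known, so the estimate can be checked.

```python
import random
import statistics

def model(x):
    """Hypothetical cheap model: output is linear in two uniform inputs."""
    return 1.0 * x[0] + 2.0 * x[1]

def first_order_indices(model, n_inputs, n_samples, seed=0):
    """Estimate the fraction of output variance attributable to each input."""
    rng = random.Random(seed)
    # Two independent sample matrices A and B on the unit hypercube
    a = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    y_a = [model(row) for row in a]
    y_b = [model(row) for row in b]
    y_mean = statistics.fmean(y_a + y_b)
    y_var = statistics.pvariance(y_a + y_b)
    indices = []
    for i in range(n_inputs):
        # A with only column i replaced by the values from B
        y_ab = [model(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        # Centering y_b leaves the expectation unchanged and reduces noise
        terms = [(yb - y_mean) * (yab - ya)
                 for yb, yab, ya in zip(y_b, y_ab, y_a)]
        indices.append(statistics.fmean(terms) / y_var)
    return indices

s1, s2 = first_order_indices(model, n_inputs=2, n_samples=20_000)
print(f"S1 = {s1:.2f}, S2 = {s2:.2f}")
```

Each index costs one extra batch of model runs per input, so for an expensive solver this analysis is usually run on a surrogate rather than on the solver itself.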

How AI Changes Uncertainty Quantification

AI transforms UQ from a method that is theoretically sound but practically prohibitive into one that is both sound and affordable.

Surrogate-based Monte Carlo. Train an ML surrogate on a small set of full-fidelity simulations (50 to 500, depending on dimensionality). Then run Monte Carlo on the surrogate. Because the surrogate evaluates in milliseconds, running 100,000 Monte Carlo samples becomes trivial. The total cost is the training database (50 to 500 expensive runs) plus negligible surrogate evaluation cost. Compare this to direct Monte Carlo (10,000+ expensive runs). The cost reduction is one to two orders of magnitude.

The risk: the surrogate's approximation error propagates into the uncertainty estimate. If the surrogate is inaccurate in regions that contribute to tail behavior (extreme outputs), the failure probability estimate will be wrong. Validation of the surrogate against held-out full-fidelity results is essential, with particular attention to tail accuracy.
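The workflow and its main risk can both be shown on a toy problem. The sketch below is a minimal illustration under strong assumptions: one uncertain input, a closed-form `expensive_model` standing in for a solver, a piecewise-linear interpolant standing in for an ML surrogate, and a hypothetical allowable of 420 MPa. It trains on 17 runs, checks held-out error specifically in the tail, and only then runs 100,000 samples on the surrogate.

```python
import bisect
import random

ALLOWABLE_MPA = 420.0  # hypothetical allowable stress

def expensive_model(load_factor):
    """Stand-in for a multi-hour solver run: stress in MPa (hypothetical)."""
    return 300.0 + 40.0 * load_factor + 8.0 * load_factor**2

class LinearInterpSurrogate:
    """Piecewise-linear interpolant; a simple stand-in for an ML surrogate."""

    def __init__(self, xs, ys):
        pairs = sorted(zip(xs, ys))
        self.xs = [p[0] for p in pairs]
        self.ys = [p[1] for p in pairs]

    def __call__(self, x):
        # Locate the segment; outside the training range, reuse the end segment
        i = bisect.bisect_right(self.xs, x)
        i = min(max(i, 1), len(self.xs) - 1)
        x0, x1 = self.xs[i - 1], self.xs[i]
        y0, y1 = self.ys[i - 1], self.ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# 1. Training database: 17 "expensive" runs across the input range
train_x = [-4.0 + 0.5 * k for k in range(17)]
surrogate = LinearInterpSurrogate(train_x, [expensive_model(x) for x in train_x])

# 2. Held-out validation concentrated near the allowable (the tail)
rng = random.Random(0)
holdout_x = [rng.uniform(1.5, 3.0) for _ in range(10)]
max_tail_error = max(abs(surrogate(x) - expensive_model(x)) for x in holdout_x)

# 3. Monte Carlo on the surrogate: load factor assumed standard normal
n_samples = 100_000
failures = sum(
    surrogate(rng.gauss(0.0, 1.0)) > ALLOWABLE_MPA for _ in range(n_samples)
)
p_fail = failures / n_samples

print(f"max held-out tail error = {max_tail_error:.3f} MPa")
print(f"surrogate failure probability = {p_fail:.4f}")
```

The total cost here is 27 model runs instead of 100,000. Whether the tail error is small enough depends on how steeply the failure probability changes with stress near the allowable, which is a judgment the analyst still has to make.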

Active learning for DOE. Classical DOE designs the sampling plan before running any experiments. Active learning designs the plan adaptively: run a batch of experiments, build a surrogate, identify where the surrogate is most uncertain or where the decision boundary (pass/fail) is most ambiguous, and place the next experiments there. This focuses computational effort where it has the most impact on the quantity of interest.

In practice, this means that instead of uniformly sampling a 10-dimensional input space (which requires an enormous number of samples for adequate coverage), active learning concentrates samples in the regions that matter: near failure boundaries, in regions of high sensitivity, or where the surrogate disagrees with physical intuition.
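The adaptive loop can be sketched in one dimension, where it is easy to follow. This is a deliberately simplified illustration, not a general active learning method: the surrogate is a piecewise-linear interpolant, the acquisition rule is "run the model where the surrogate predicts the pass/fail boundary" (which in 1-D amounts to a false-position search), and `expensive_model` and the 420 MPa allowable are hypothetical. Real problems in many dimensions need a surrogate with uncertainty estimates and a proper acquisition function.

```python
ALLOWABLE_MPA = 420.0  # hypothetical allowable stress

def expensive_model(load_factor):
    """Stand-in for a multi-hour solver run: stress in MPa (hypothetical)."""
    return 300.0 + 40.0 * load_factor + 8.0 * load_factor**2

def refine_boundary(model, initial_xs, allowable, n_new_runs):
    """Adaptively add runs at the surrogate's predicted pass/fail boundary."""
    data = sorted((x, model(x)) for x in initial_xs)
    for _ in range(n_new_runs):
        # Find the interval where the margin (output - allowable) changes sign
        for (x0, y0), (x1, y1) in zip(data, data[1:]):
            if (y0 - allowable) * (y1 - allowable) < 0.0:
                break
        else:
            raise ValueError("no pass/fail boundary bracketed by current runs")
        # Piecewise-linear surrogate's estimate of where the boundary lies
        x_new = x0 + (allowable - y0) * (x1 - x0) / (y1 - y0)
        # Spend the next expensive run exactly there, then update the data
        data.append((x_new, model(x_new)))
        data.sort()
    return data

# Coarse initial design of 5 runs, then 6 adaptively placed runs
data = refine_boundary(expensive_model, [-4.0, -2.0, 0.0, 2.0, 4.0],
                       ALLOWABLE_MPA, n_new_runs=6)
boundary = min(data, key=lambda d: abs(d[1] - ALLOWABLE_MPA))[0]
near_boundary = sum(1 for x, _ in data if 2.0 < x < 2.2)

print(f"estimated boundary at load factor {boundary:.4f}")
print(f"{near_boundary} of {len(data)} runs fall near the boundary")
```

All six adaptive runs land next to the failure boundary, whereas six more uniformly spaced runs would have spent most of the budget in regions where the pass/fail outcome was never in doubt.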

Bayesian inference. Bayesian inference provides the mathematical framework for updating beliefs about parameters as new data arrives. Start with a prior distribution (what you believed before seeing data). Observe data. Compute the posterior distribution (what you believe after seeing data). This is the computational instantiation of learning from evidence.

In engineering UQ, Bayesian inference allows models to be calibrated against experimental data, combining simulation predictions with test measurements to produce updated parameter estimates that are consistent with both. When a fleet of turbines generates operational data, Bayesian updating can refine failure probability estimates as that data arrives. Classical Monte Carlo, which propagates fixed input distributions through a fixed model, has no mechanism for this.
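The prior-to-posterior step is simplest in the conjugate Beta-Binomial case, sketched below for a per-demand failure probability. The prior and the fleet data batches are hypothetical. Calibrating simulation parameters against test data generally has no closed form like this and relies on sampling methods instead.

```python
def update_beta(alpha, beta, failures, trials):
    """Conjugate update: Beta prior plus binomial data gives a Beta posterior."""
    return alpha + failures, beta + (trials - failures)

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Prior belief, equivalent to having seen 1 failure in 100 demands (assumed)
alpha, beta = 1.0, 99.0
print(f"prior mean failure probability = {beta_mean(alpha, beta):.5f}")

# Hypothetical batches of fleet data: (failures observed, demands observed)
fleet_batches = [(0, 200), (1, 300), (0, 400)]
for failures, trials in fleet_batches:
    alpha, beta = update_beta(alpha, beta, failures, trials)
    print(f"after {trials} more demands: mean = {beta_mean(alpha, beta):.5f}")

posterior_mean = beta_mean(alpha, beta)
```

Each batch updates the same two numbers, so the estimate can be refreshed whenever new operational data arrives without rerunning the earlier analysis.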

Data-driven models. When physics is too complex, too expensive, or insufficiently understood, data-driven models learn the input-output relationship directly from observations. Gaussian process regression provides not just predictions but prediction intervals. Neural networks can capture complex nonlinearities. Random forests handle mixed variable types and provide feature importance rankings.

The critical limitation: data-driven models are interpolation machines. They perform well only within the distribution of the training data. Outside that distribution (higher temperatures, different materials, novel operating conditions) they extrapolate, and extrapolation from data-driven models is unreliable. Physics-informed approaches, which embed governing equations as constraints in the learning process, are the emerging bridge between purely data-driven and purely physics-based methods.

Uncertainty Quantification: From Deterministic to Adaptive

Deterministic analysis

What You Get: A single output value for a single set of nominal inputs. No information about how the output changes with input variation.

Cost: One model evaluation. Cheapest possible analysis.

What You Miss: Everything about risk. You know the nominal answer but not how sensitive it is to assumptions, how likely failure is, or how much margin you actually have.

When Appropriate: Preliminary screening, feasibility checks, model debugging. Never appropriate as the final basis for a design decision involving safety or reliability.

The deterministic analysis summarized above is the first of four stages and the starting point most engineers are comfortable with. Each later stage (Monte Carlo, surrogate Monte Carlo, and active learning) adds information about uncertainty while managing cost. The progression represents increasing sophistication in how uncertainty is quantified, and each stage requires the practitioner to understand what approximations are being made.

Assessment

Question 1 of 3

A structural analyst runs a deterministic FEA with nominal material properties and reports a safety factor of 1.8. The program manager concludes that the design has adequate margin. What critical information is missing from this assessment? (Select all that apply)

Your team has a validated FEA model for a safety-critical structural component. The model takes 6 hours per run. You need to estimate the probability that peak stress exceeds the material allowable, considering uncertainty in applied loads (3 parameters), material properties (2 parameters), and manufacturing tolerances (2 parameters) — 7 uncertain parameters total. You have budget for 200 full-fidelity FEA runs. Describe your UQ strategy: (1) How would you allocate the 200 runs between training a surrogate, validating the surrogate, and any direct Monte Carlo checks? (2) What DOE strategy would you use for the training set? (3) How would you validate that the surrogate is accurate enough in the tail region where failures occur? (4) What would make you abandon the surrogate approach and demand more full-fidelity runs?