
Advanced Sensitivity Analysis

The built-in SensitivityAnalyzer covers the most common scenario: Sobol sampling over a parameter space, one output file per objective, compared against observed data with a standard indicator.

For cases that go beyond this — sensitivity across multiple channels or locations, custom scoring, or samplers other than Sobol — pySWATPlus exposes the building blocks you need to compose your own sensitivity workflow directly.

Building blocks

| Task | Tool |
| --- | --- |
| Run a SWAT+ simulation with parameter changes | TxtinoutReader.run_swat |
| Read a SWAT+ output file | DataManager.simulated_timeseries_df |
| Compute performance indicators | PerformanceMetrics.compute_from_abbr |
| Generate parameter samples | SALib.sample.sobol.sample (or any other SALib sampler) |
| Compute Sobol indices | SALib.analyze.sobol.analyze |
| Parallel execution | concurrent.futures.ProcessPoolExecutor |

Use the built-in SensitivityAnalyzer first

Before following this guide, check whether the built-in SensitivityAnalyzer covers your needs:

| Scenario | Use |
| --- | --- |
| One output variable compared to observed data | SensitivityAnalyzer |
| Multiple output variables, one per file | SensitivityAnalyzer (multi-output) |
| Multiple channels or locations from the same file | This guide |
| Custom scoring or aggregation across locations | This guide |
| Sampler other than Sobol | This guide |

Example: multi-channel Sobol sensitivity

This example computes Sobol sensitivity indices for two parameters against observed daily flow at two channels simultaneously, using mean NSE across both channels as the scalar metric — a case the built-in SensitivityAnalyzer cannot express directly.

Note

Steps 2–5 all belong in one Python file. They are shown separately only to explain each part.

Step 1 — Configure the model

Set the simulation period, output print options, and any fixed parameters before running simulations. See Configuration Settings for a detailed walkthrough.

Step 2 — Module-level setup

All imports, paths, and data are defined at the top of the file so they are available to every function defined below.

import json
import logging
import pathlib
import shutil
import concurrent.futures

import numpy
import pandas
import SALib.sample.sobol
import SALib.analyze.sobol

from pySWATPlus import TxtinoutReader, DataManager, PerformanceMetrics

logging.basicConfig(level=logging.INFO)

# --- Paths ---
TXTINOUT_DIR = pathlib.Path(r'C:\Users\Username\custom_folder')
SIM_DIR      = pathlib.Path(r'C:\Users\Username\sensitivity_dir')

# --- Channels to analyse ---
CHANNEL_IDS = [561, 562]

# --- Parameter search space ---
PARAM_CONFIGS = [
    {'name': 'esco', 'change_type': 'absval', 'lower_bound': 0,  'upper_bound': 1},
    {'name': 'cn2',  'change_type': 'pctchg', 'lower_bound': 25, 'upper_bound': 75},
]

# --- Observed discharge (one CSV per channel, must contain 'date' and 'discharge' columns) ---
obs = {
    561: pandas.read_csv(r'C:\Users\Username\obs_561.csv', parse_dates=['date']),
    562: pandas.read_csv(r'C:\Users\Username\obs_562.csv', parse_dates=['date']),
}
for df in obs.values():
    df['date'] = df['date'].dt.date
    df.rename(columns={'discharge': 'obs'}, inplace=True)
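For reference, this is the shape each observed DataFrame takes after the loading and renaming above. A toy stand-in with made-up values, independent of any real observed file:

```python
import datetime
import pandas

# Toy stand-in for one channel's observed CSV: after the conversion and
# renaming, each DataFrame has a plain datetime.date 'date' column and
# an 'obs' discharge column.
df = pandas.DataFrame({
    'date': pandas.to_datetime(['2010-01-01', '2010-01-02']),
    'discharge': [3.2, 3.5],
})
df['date'] = df['date'].dt.date
df.rename(columns={'discharge': 'obs'}, inplace=True)

print(list(df.columns))                              # ['date', 'obs']
print(isinstance(df.loc[0, 'date'], datetime.date))  # True
```

Converting to datetime.date matters because the merge in Step 5 joins on equality of the 'date' column, and the simulated and observed sides must use the same type.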

Step 3 — Define the simulation function

_run_sim must be a top-level function (not defined inside a class or another function). ProcessPoolExecutor sends it to worker processes by name, and lambdas or nested functions cannot be transferred that way — they will raise a PicklingError.

def _run_sim(idx, param_values):
    '''Run one SWAT+ simulation and return its channel outputs.'''

    parameters = [
        {'name': cfg['name'], 'change_type': cfg['change_type'], 'value': float(val)}
        for cfg, val in zip(PARAM_CONFIGS, param_values)
    ]

    # Each simulation needs its own empty subdirectory;
    # parents=True also creates SIM_DIR itself on the first run
    sim_path = SIM_DIR / f'sim_{idx}'
    sim_path.mkdir(parents=True)

    try:
        reader   = TxtinoutReader(tio_dir=TXTINOUT_DIR)
        run_path = reader.run_swat(sim_dir=sim_path, parameters=parameters)

        outputs = {}
        for ch_id in CHANNEL_IDS:
            df = DataManager().simulated_timeseries_df(
                sim_file=run_path / 'channel_sd_day.txt',
                has_units=True,
                apply_filter={'gis_id': [ch_id]},
                usecols=['flo_out']
            )
            # Rename 'flo_out' → 'sim' so PerformanceMetrics can find the simulated column
            outputs[ch_id] = df.rename(columns={'flo_out': 'sim'})
    finally:
        shutil.rmtree(sim_path, ignore_errors=True)

    return outputs
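The pickling constraint can be checked directly. A minimal illustration, unrelated to SWAT+, showing why a named importable function works where a lambda fails:

```python
import math
import pickle

# Named, importable functions pickle by reference (module + name), which
# is how ProcessPoolExecutor ships a worker function to subprocesses.
assert isinstance(pickle.dumps(math.sqrt), bytes)

# A lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda x: x * 2)
    print('unexpectedly picklable')
except pickle.PicklingError:
    print('lambda is not picklable')
```

The same failure occurs for functions nested inside other functions or defined in a class body without being importable by name, which is why _run_sim must live at module level.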

Note

If a previous run was interrupted and left stale sim_<n> directories behind, remove them manually before restarting — run_swat requires each simulation directory to be empty.
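One way to clear leftovers before a rerun. A sketch: point SIM_DIR at your own sensitivity directory before running it.

```python
import pathlib
import shutil

SIM_DIR = pathlib.Path('sensitivity_dir')  # adjust to your actual path

# Remove any stale sim_<n> subdirectories left by an interrupted run.
for stale in SIM_DIR.glob('sim_*'):
    shutil.rmtree(stale, ignore_errors=True)
```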

Step 4 — Generate samples and run simulations

Sobol analysis requires a specific sample structure. With the default calc_second_order=True, SALib.sample.sobol.sample generates N × (2 × n_params + 2) parameter sets, where N must be a power of two: for 2 parameters and N = 2³ = 8 that is 48 sets. Larger N gives more accurate indices but requires more simulations.
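The sample-count arithmetic can be written out explicitly. An illustrative helper, not part of SALib or pySWATPlus:

```python
def sobol_sample_count(n_params, N, second_order=True):
    """Number of parameter sets SALib.sample.sobol.sample generates
    for a given base sample size N (a power of two)."""
    return N * (2 * n_params + 2) if second_order else N * (n_params + 2)

print(sobol_sample_count(n_params=2, N=8))   # 48
print(sobol_sample_count(n_params=2, N=64))  # 384
```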

if __name__ == '__main__':
    # Build the SALib problem definition
    problem = {
        'num_vars': len(PARAM_CONFIGS),
        'names':    [cfg['name'] for cfg in PARAM_CONFIGS],
        'bounds':   [[cfg['lower_bound'], cfg['upper_bound']] for cfg in PARAM_CONFIGS],
    }

    # Generate samples: N = 2**3 = 8 gives 48 simulations for 2 parameters
    sample_array = SALib.sample.sobol.sample(problem, N=pow(2, 3))

    # Run all simulations in parallel — results come back in the same order as sample_array
    with concurrent.futures.ProcessPoolExecutor() as executor:
        all_outputs = list(executor.map(_run_sim, range(len(sample_array)), sample_array))

Step 5 — Score each sample and compute indices

Each sample needs a single scalar metric. Here, mean NSE across both channels is used. SALib.analyze.sobol.analyze then decomposes the variance of that scalar across parameters.
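For reference, the Nash-Sutcliffe efficiency that compute_from_abbr returns for 'NSE' is one minus the ratio of squared model error to the variance of the observations. A standalone numpy sketch of the formula, not the PerformanceMetrics implementation:

```python
import numpy

def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit; 0 means the model
    does no better than predicting the observed mean."""
    sim = numpy.asarray(sim, dtype=float)
    obs = numpy.asarray(obs, dtype=float)
    return 1.0 - numpy.sum((sim - obs) ** 2) / numpy.sum((obs - obs.mean()) ** 2)

print(nse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 1.0
```

Because NSE is unbounded below, one badly performing channel can dominate the mean; averaging across channels is a deliberate modelling choice, not a neutral default.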

    # Compute a scalar metric for every sample
    pm = PerformanceMetrics()
    Y  = []

    for outputs in all_outputs:
        nse_vals = []
        for ch_id in CHANNEL_IDS:
            merged = outputs[ch_id].merge(obs[ch_id][['date', 'obs']], on='date', how='inner')
            nse_vals.append(pm.compute_from_abbr(merged, 'sim', 'obs', 'NSE'))
        Y.append(numpy.mean(nse_vals))

    Y = numpy.array(Y)

    # Compute Sobol indices
    indices = SALib.analyze.sobol.analyze(problem, Y)

    # S1: first-order index  — direct effect of each parameter alone
    # ST: total-order index  — direct effect plus all interactions with other parameters
    for name, s1, st in zip(problem['names'], indices['S1'], indices['ST']):
        print(f'{name:10s}  S1={s1:.3f}  ST={st:.3f}')

    # Save results to disk
    with open('sensitivity_indices.json', 'w') as f:
        json.dump({
            'param_names': problem['names'],
            'S1':          indices['S1'].tolist(),
            'ST':          indices['ST'].tolist(),
            'S1_conf':     indices['S1_conf'].tolist(),
            'ST_conf':     indices['ST_conf'].tolist(),
        }, f, indent=4)

sensitivity_indices.json will look like this:

{
    "param_names": ["esco", "cn2"],
    "S1":      [0.42, 0.31],
    "ST":      [0.58, 0.47],
    "S1_conf": [0.05, 0.04],
    "ST_conf": [0.06, 0.05]
}

A high ST with a low S1 for a parameter means it interacts strongly with others. Parameters with both low S1 and low ST have little influence on the output and are candidates for fixing at a default value during calibration.


Tips

How many samples do you need?

Start with N = 8 (48 simulations for 2 parameters) to get a quick picture. Increase to N = 64 or higher for publication-quality indices; the required number of simulations grows as N × (2 × n_params + 2).

Using other samplers

The Sobol sampler is the default choice, but SALib also provides Morris (SALib.sample.morris), FAST (SALib.sample.fast_sampler), and others. Replace the sampling and analysis calls; _run_sim and the scoring loop are the same regardless of the sampler used.

Single scalar vs multiple metrics

SALib.analyze.sobol.analyze requires one scalar per sample. If you want sensitivity indices for NSE and KGE separately, run analyze twice with two different Y arrays — one built from NSE values, one from KGE values.

Tracking progress

For per-simulation progress, pySWATPlus emits messages via Python's standard logging module:

import logging
logging.basicConfig(level=logging.INFO)

Troubleshooting parallel errors

If ProcessPoolExecutor raises an unclear error, close the terminal and restart it. Lingering background processes or locked files from a previous run can interfere with new parallel workloads.