ParaEval

Regression Tests

Each synthetic case defines expected outputs. At build time, the decision engine runs live against each case and the results are compared to expectations.
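The build-time check described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual harness: the `Outcome` type, the `check_golden_cases` function, the `"not_met"` status spelling, and the `stub_engine` stand-in are all hypothetical names introduced here. The case IDs and expected values are taken from the Golden Cases table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Outcome:
    status: str        # e.g. "met" or "not_met" (spelling is an assumption)
    confidence: float  # 0.0 to 1.0


# Expected results, copied from the Golden Cases table.
GOLDEN_CASES: Dict[str, Outcome] = {
    "hong-kong-cold-storage-saola": Outcome("met", 0.70),
    "manila-coastal-logistics": Outcome("met", 0.83),
    "jakarta-residential-asset": Outcome("not_met", 0.17),
}


def check_golden_cases(
    engine: Callable[[str], Outcome], tolerance: float = 1e-9
) -> List[str]:
    """Run the engine on every golden case and return a list of mismatches."""
    failures = []
    for case_id, expected in GOLDEN_CASES.items():
        actual = engine(case_id)
        same_status = actual.status == expected.status
        same_confidence = abs(actual.confidence - expected.confidence) <= tolerance
        if not (same_status and same_confidence):
            failures.append(f"{case_id}: expected {expected}, got {actual}")
    return failures


def stub_engine(case_id: str) -> Outcome:
    """Hypothetical stand-in for the real decision engine."""
    return GOLDEN_CASES[case_id]


if __name__ == "__main__":
    problems = check_golden_cases(stub_engine)
    # A real build step would exit non-zero when problems is non-empty.
    print("all golden cases pass" if not problems else "\n".join(problems))
```

In a real build, `stub_engine` would be replaced by a call into the live decision engine, and a non-empty failure list would fail the build.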

All 3 golden cases pass

Golden Cases

| Case | Expected | Actual | Result |
| Hong Kong Cold-Chain Warehouse — Typhoon Saola (hong-kong-cold-storage-saola) | Trigger Met, 70% | Trigger Met, 70% | Pass |
| Manila Coastal Logistics Facility — Typhoon Karding (manila-coastal-logistics) | Trigger Met, 83% | Trigger Met, 83% | Pass |
| Jakarta Residential Asset — January 2020 Floods (jakarta-residential-asset) | Not Met, 17% | Not Met, 17% | Pass |

Illustrative Failure Example

[fixture] Algorithm drift: confidence threshold change

If the threshold for a "met" status were raised from 0.70 to 0.75, the Hong Kong flagship case (confidence 0.70) would shift from "met" to "borderline" even though its evidence is unchanged. This is why the algorithm is pinned and golden-case expectations are checked at build time.

-status: "met" (expected)
-confidence: 0.700
+status: "borderline" (actual — threshold drift)
+confidence: 0.700
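The drift shown in the diff can be reproduced with a small threshold classifier. This is a sketch under stated assumptions: the function name `classify`, the `"not_met"` label, and the `borderline_floor` value of 0.40 are hypothetical and do not come from the page. Only the 0.70 and 0.75 thresholds, the inclusive comparison implied by a 0.700 confidence counting as met, and the case confidences are taken from the text.

```python
def classify(
    confidence: float,
    met_threshold: float = 0.70,
    borderline_floor: float = 0.40,  # assumed value, not given in the source
) -> str:
    """Map a confidence score to a status using inclusive lower bounds."""
    if confidence >= met_threshold:
        return "met"
    if confidence >= borderline_floor:
        return "borderline"
    return "not_met"


if __name__ == "__main__":
    hong_kong = 0.70
    # Pinned algorithm: threshold 0.70, so the flagship case is met.
    print(classify(hong_kong))
    # Drifted algorithm: same evidence, stricter threshold.
    print(classify(hong_kong, met_threshold=0.75))
```

Under these assumptions the Manila case (0.83) stays met at either threshold and the Jakarta case (0.17) stays not met, so only the case sitting exactly on the boundary exposes the change.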