Men 2026 · Planned synthetic dataset · Men et al. (2026), Algorithms

Men Hurricane Hybrid

Ensemble classification plus regression pipeline

A paper-correct track for hurricane payout-class prediction and partial-loss estimation using XGBoost and GBDT ensembles over catastrophe-model event rows.

Paper focus

A Hybrid Machine Learning Framework for Predicting Hurricane Losses in Parametric Insurance with Highly Imbalanced Data

This page reflects the paper’s actual two-track design: payout-class classification plus partial-loss regression, combined through ensemble voting.

Peril handling

It uses hurricane intensity, pressure, radius, and distance-to-asset features, along with descriptive statistics computed across track observations and interaction terms derived from catastrophe-model outputs.
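As a minimal sketch of that feature engineering, the helper below collapses one hurricane's track observations into a single event row. The column names (`mws`, `mslp`, `rmw`, `dist_km`) and the specific interaction term are illustrative assumptions, not the paper's exact schema.

```python
import numpy as np
import pandas as pd

def engineer_event_features(track: pd.DataFrame) -> dict:
    """Collapse one hurricane's track observations into a single event row.

    Column names (mws, mslp, rmw, dist_km) are illustrative, not the
    paper's exact schema.
    """
    feats = {}
    # Descriptive statistics across all track observations.
    for col in ["mws", "mslp", "rmw", "dist_km"]:
        vals = track[col].to_numpy()
        feats[f"{col}_min"] = vals.min()
        feats[f"{col}_max"] = vals.max()
        feats[f"{col}_mean"] = vals.mean()
        feats[f"{col}_std"] = vals.std()
    # Nearest-observation features: the track point closest to the asset.
    nearest = track.loc[track["dist_km"].idxmin()]
    feats["nearest_mws"] = nearest["mws"]
    feats["nearest_mslp"] = nearest["mslp"]
    # Example interaction term: intensity scaled by proximity to the asset.
    feats["mws_over_dist"] = feats["nearest_mws"] / (1.0 + nearest["dist_km"])
    return feats
```

Each event then becomes one row of min/max/mean/std statistics plus nearest-point and interaction features, ready for the classifier and regression tracks.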

Basis-risk role

Its role is to benchmark how far a complex ensemble can reduce payout error under severe class imbalance, not to masquerade as a simple transparent trigger.

How to read this page
  • Start with Results to see the held-out performance that matters for this model family.
  • Use Live case inference to understand what the model would recommend for one insured event.
  • Read Training data and Local implementation for realism and replication caveats.
Training data

The first executable version will use a Men-shaped synthetic hurricane dataset with payout bounds and severe imbalance.

  • Synthetic data will preserve the classification and regression task split.
  • This path is for API and artifact smoke tests before a real catastrophe-style dataset is wired in.
  • Current outputs are placeholders, not trained ensemble metrics.
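A Men-shaped synthetic table can be sketched as below. The class proportions and the loss model are illustrative assumptions chosen only to reproduce severe imbalance with payout bounds; they are not the paper's generator.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_synthetic_events(n=5000, lb=0.2, ub=0.8):
    """Generate a Men-shaped synthetic event table: most events are
    non-payable, a small minority is partially or fully payable.

    The feature count and loss model are illustrative assumptions.
    """
    X = rng.normal(size=(n, 6))  # six demo features, matching the demo schema
    # Sigmoid shifted toward zero so that non-payable events dominate.
    raw = 1.0 / (1.0 + np.exp(-(X[:, 0] + 0.5 * X[:, 1] - 2.5)))
    loss = np.clip(raw + 0.05 * rng.normal(size=n), 0.0, 1.0)
    # Map continuous loss to payout classes via the lower/upper bounds:
    # 0 = non-payable, 1 = partially payable, 2 = fully payable.
    y_class = np.where(loss < lb, 0, np.where(loss < ub, 1, 2))
    return X, loss, y_class
```

Generating with the defaults yields roughly 80–85% non-payable events, which preserves the classification/regression task split and the imbalance the ensemble must cope with.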
Local implementation
  • A dedicated Men backend trainer now saves a versioned sklearn ensemble artifact through the shared model registry.
  • The adapter serves real calibration and case-level inference from that saved ensemble manifest.
  • This first implementation preserves the paper structure with sklearn GBDT-style models because the repo does not currently ship XGBoost or CatBoost.
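The artifact layout might look like the sketch below: each submodel is dumped separately and tied together by a manifest so the adapter can reload them as one versioned unit. Paths, manifest keys, and the registry layout are assumptions; the repo's shared model registry will differ.

```python
import json
import time
from pathlib import Path

import joblib
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

def save_ensemble(clf_two_stage, clf_three_class, reg, lb, ub,
                  root="artifacts/men_hybrid"):
    """Persist all submodels plus a manifest as one versioned unit.

    Paths and manifest keys are illustrative; the repo's shared model
    registry will use its own conventions.
    """
    version = time.strftime("%Y%m%d-%H%M%S")
    out = Path(root) / version
    out.mkdir(parents=True, exist_ok=True)
    joblib.dump(clf_two_stage, out / "clf_two_stage.joblib")
    joblib.dump(clf_three_class, out / "clf_three_class.joblib")
    joblib.dump(reg, out / "reg.joblib")
    (out / "manifest.json").write_text(json.dumps({
        "version": version,
        "lower_bound": lb,
        "upper_bound": ub,
        "submodels": ["clf_two_stage", "clf_three_class", "reg"],
    }, indent=2))
    return out
```

Versioning the submodels and payout bounds together avoids the main failure mode of multi-model artifacts: serving a classifier from one run against a regressor or bounds from another.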
Model flow
Step 1
Cat-model features
Step 2
Classifier track
Step 3
Regression track
Step 4
Vote ensemble
Step 5
Payout output

Mathematical model

  • Track one classifies events into non-payable, partially payable, and fully payable classes using XGBoost-based classifiers.
  • Track two predicts continuous losses for payable events using gradient-boosting regressors, then maps those predictions back into payout classes using lower and upper bounds.
  • Final class and payout decisions are formed through a voting and payout-mapping ensemble, with total absolute error as a key evaluation target.
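The two tracks and the vote can be sketched end to end as follows, with sklearn GBDTs standing in for XGBoost. The class names, the two-stage vote's mapping of "payable" to the partial class, and the payout rule g are simplifying assumptions, not the paper's exact implementation.

```python
from collections import Counter

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

# Classes: 0 = non-payable, 1 = partially payable, 2 = fully payable.
LB, UB = 0.2, 0.8  # illustrative payout bounds (fractions of the limit)

def loss_to_class(loss, lb=LB, ub=UB):
    """Map a continuous loss prediction back into payout classes."""
    return np.where(loss < lb, 0, np.where(loss < ub, 1, 2))

class MenHybrid:
    """Sketch of the two-track ensemble; sklearn GBDTs stand in for XGBoost."""

    def __init__(self):
        self.c3 = GradientBoostingClassifier()   # three-class track
        self.c2 = GradientBoostingClassifier()   # two-stage: payable vs not
        self.reg = GradientBoostingRegressor()   # regression track

    def fit(self, X, loss):
        y3 = loss_to_class(loss)
        self.c3.fit(X, y3)
        self.c2.fit(X, (y3 > 0).astype(int))
        payable = y3 > 0                         # regressor sees payable events only
        self.reg.fit(X[payable], loss[payable])
        return self

    def predict(self, X):
        reg_loss = self.reg.predict(X)
        votes = np.stack([
            self.c3.predict(X),
            # Simplification: the two-stage "payable" vote counts as partial pay.
            np.where(self.c2.predict(X) == 1, 1, 0),
            loss_to_class(reg_loss),             # regression-derived class vote
        ])
        final = np.array([Counter(col).most_common(1)[0][0] for col in votes.T])
        # g: non-payable -> 0, fully payable -> 1, partial -> regressed loss in [LB, UB].
        payout = np.where(final == 0, 0.0,
                          np.where(final == 2, 1.0, np.clip(reg_loss, LB, UB)))
        return final, payout
```

Total absolute payout error on a held-out split can then be compared across ensemble variants, which is the paper's key evaluation target.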
Equation

final_class = vote(c_two_stage(x), c_three_class(x), class(reg(x)))
payout = g(final_class, reg(x), lb, ub)
Architecture presentation
The classification and regression tracks are trained separately on the same frozen split.
Regression predictions are also converted into a tertiary class vote using payout bounds.
The final decision combines all three views into one payout-class and payout-amount output.

Pros

  • Matches the paper’s actual ensemble structure instead of replacing it with an unrelated neural net.
  • Handles partial-pay events explicitly, which matters for basis-risk analysis.
  • Maps well to ParaEval’s need for class, amount, and rationale in one response.

Cons

  • More complex artifact management because multiple submodels must be versioned together.
  • Depends on richer event engineering than the current six-feature demo schema.
  • Not intended to be presented as a transparent deployable trigger without simplification.

Results

Lower Payout Bound

Minimum modeled payout level treated as a partial-pay event.

Upper Payout Bound

Modeled full-pay boundary used by the ensemble vote.

Payout MAE

Average absolute payout error on the held-out split.

Basis Risk

Mismatch rate between predicted and actual payable vs non-payable outcomes.

Boundary F1

Balanced quality of the payable vs non-payable decision.
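The three headline metrics above can be computed from held-out predictions as sketched below. These definitions are a reasonable reading of the labels on this page, not necessarily the dashboard's exact formulas.

```python
import numpy as np
from sklearn.metrics import f1_score

def evaluate(final_class, payout, true_class, true_payout):
    """Compute the page's headline metrics from held-out predictions.

    Classes: 0 = non-payable, >0 = payable. Definitions here are
    illustrative readings of the metric labels, not the exact formulas.
    """
    pred_payable = final_class > 0
    true_payable = true_class > 0
    return {
        # Payout MAE: average absolute payout error on the held-out split.
        "payout_mae": float(np.mean(np.abs(payout - true_payout))),
        # Basis risk: mismatch rate between predicted and actual payability.
        "basis_risk": float(np.mean(pred_payable != true_payable)),
        # Boundary F1: balanced quality of the payable vs non-payable call.
        "boundary_f1": float(f1_score(true_payable, pred_payable)),
    }
```

Lower is better for payout MAE and basis risk; higher is better for boundary F1.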

Performance Snapshot

End-user summary of how reliable the current model looks on held-out examples.

Trigger precision

When the model says payable, this estimates how often that call is right.

Trigger recall

When a payable event really happens, this estimates how often the model catches it.

Basis-risk exposure

Higher is better for this indicator: it is an inverted score, so a higher value reflects fewer false triggers and fewer missed payable events.

Payout fit

Higher is better here. It compresses payout error into a simple quality indicator for the UI.

Example cases

Live case inference

Hong Kong Cold-Chain Warehouse — Typhoon Saola


Architecture

  • Cat-model event table with nearest-observation and descriptive-stat features
  • Two-stage and three-class XGBoost classification tracks
  • GBDT regression track for partial-pay events
  • Vote-based ensemble and payout mapping

Inputs

  • Hurricane event rows derived from stochastic catalog observations
  • Distance, MWS, MSLP, RMW, descriptive statistics, and interaction features
  • Lower and upper payout bounds for class mapping

Outputs

  • Predicted payout class
  • Predicted payout amount
  • Ensemble vote trace and regression-derived payout rationale