The Question
I’ve been working on replacing static stop rules with a data-driven, regime-aware stop framework.
The core question is simple:
At what point does a trade become statistically unlikely to recover?
Rather than guessing, I modeled the probability of recovery as a function of MAE-to-date (maximum adverse excursion so far in the trade) and market context, and then asked two different models the same question.
What surprised me was not the answer — but how consistent it was.
The Setup
Data
- Instrument: ES (1-minute bars)
- Regime: PA-FIRST
- Sample size: 639 trades
- Features:
  - MAE-to-date (true adverse excursion using high/low)
  - ATR (1m)
  - Distance from EMA
  - EMA slope
  - Minutes in trade
  - Regime (one-hot encoded)
Label
A trade is considered recovered if final PnL > 0.
The models predict:
P(recover | current MAE-to-date, context)
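To make the two definitions concrete, here is a minimal sketch of how MAE-to-date and the recovery label could be computed for one trade. The `Bar` type, the function names, and the sample prices are all hypothetical, and PnL is taken as a plain price difference with no fees or slippage, which is an assumption rather than something stated above.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    """One 1-minute bar (hypothetical container for this sketch)."""
    high: float
    low: float
    close: float


def mae_to_date(entry_price, bars, is_long=True):
    """Return the running maximum adverse excursion, in points, after each bar."""
    worst = 0.0
    series = []
    for bar in bars:
        # Use the bar extreme against the position (low for longs, high for shorts),
        # not the close, so intrabar excursions are counted.
        adverse = entry_price - bar.low if is_long else bar.high - entry_price
        worst = max(worst, adverse)
        series.append(worst)
    return series


def recovered(entry_price, exit_price, is_long=True):
    """Label: a trade counts as recovered if its final PnL is > 0.

    Assumption: PnL is the raw price difference, ignoring costs.
    """
    pnl = exit_price - entry_price if is_long else entry_price - exit_price
    return pnl > 0


# Made-up bars for a long entered at 5000.0 (illustrative only).
bars = [
    Bar(high=5001.0, low=4998.5, close=5000.0),
    Bar(high=5000.5, low=4996.0, close=4997.0),
    Bar(high=5004.0, low=4997.0, close=5003.5),
]
series = mae_to_date(5000.0, bars)
label = recovered(5000.0, bars[-1].close)
```

Each element of `series` would become the MAE-to-date feature for the snapshot taken at that bar, and `label` is shared by every snapshot of the same trade.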
Models Compared
I trained and evaluated two nonlinear models:
1. Random Forest
   - Strong baseline
   - Handles nonlinearity and interactions
   - Often criticized for instability
2. Gradient Boosted Trees (Histogram-based)
   - Faster convergence
   - Strong bias control
   - Often outperforms RF on tabular data
Both models were trained identically:
- Grouped by `trade_id` (no leakage)
- Same features
- Same probability threshold extraction logic
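Grouping matters because every trade contributes many 1-minute snapshots that share one label; if snapshots from the same trade land on both sides of a split, the test score is inflated. The sketch below shows the idea in plain Python. The function name and the toy IDs are hypothetical, and this is not the actual training code; scikit-learn's `GroupKFold` provides the same guarantee if you would rather not roll your own.

```python
import random


def grouped_folds(trade_ids, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs where each trade_id stays on one side.

    trade_ids: one entry per snapshot row, naming the trade it came from.
    """
    unique = sorted(set(trade_ids))
    random.Random(seed).shuffle(unique)
    # Assign whole trades (not individual snapshots) to folds, round-robin.
    fold_of = {tid: i % n_folds for i, tid in enumerate(unique)}
    for fold in range(n_folds):
        train = [i for i, tid in enumerate(trade_ids) if fold_of[tid] != fold]
        test = [i for i, tid in enumerate(trade_ids) if fold_of[tid] == fold]
        yield train, test


# Toy example: six snapshots drawn from three trades.
snapshot_trade_ids = ["t1", "t1", "t2", "t2", "t2", "t3"]
folds = list(grouped_folds(snapshot_trade_ids, n_folds=3))
```

Both models would then be fitted and scored on the same folds, so any difference in the extracted stop level comes from the learner rather than from the split.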
How the Stop Level Is Derived
Instead of using the model output directly, I apply a policy extraction step:
- Predict `P(recover)` at each 1-minute snapshot
- Bin snapshots by MAE-to-date
- Find the first MAE level where:
mean P(recover) < 0.20
- Use that MAE as the model-derived max stop level
This turns a probabilistic model into a deterministic, auditable risk rule.
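The extraction step can be sketched in a few lines. The bin width, the choice to report the lower edge of the bin, the `min_count` guard, and the sample numbers are all assumptions made for illustration; the post only fixes the 0.20 threshold.

```python
from collections import defaultdict


def extract_max_stop(snapshots, bin_width=0.5, threshold=0.20, min_count=1):
    """Return the lower edge of the first MAE bin whose mean P(recover) < threshold.

    snapshots: iterable of (mae_to_date_pts, p_recover) pairs.
    Returns None if no populated bin falls below the threshold.
    """
    bins = defaultdict(list)
    for mae, p_recover in snapshots:
        bins[int(mae // bin_width)].append(p_recover)

    # Walk bins from the smallest MAE upward and stop at the first failure.
    for idx in sorted(bins):
        probs = bins[idx]
        if len(probs) >= min_count and sum(probs) / len(probs) < threshold:
            return idx * bin_width
    return None


# Made-up predictions (not the real model output), binned at 1-point width.
snapshots = [
    (1.2, 0.70), (1.4, 0.60),
    (3.1, 0.45), (3.3, 0.35),
    (5.2, 0.25), (5.4, 0.10),
    (7.0, 0.05),
]
max_stop = extract_max_stop(snapshots, bin_width=1.0)
```

Because the output is a single number per regime, it can be logged, diffed between retraining runs, and reviewed by a human before it goes anywhere near live orders.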
The Result
Both models independently produced the same stop level:
| Regime | Threshold | Max Stop (pts) | Trades (n) |
|---|---|---|---|
| PA-FIRST | P(recover) < 0.20 | **9.9** | 639 |
This is remarkably close to the heuristic I had previously derived by hand:
- Caution zone ≈ 9.5 pts
- Hard failures accelerate ≈ 10–11 pts
- Kill switch ≈ 12 pts
Why This Matters
When two differently built ensembles (bagging versus boosting) agree, it usually means:
- The signal is structural, not model-specific
- MAE-to-date is a sensible axis for the decision
- The decision boundary is stable across learners
- The result is unlikely to be an artifact of one algorithm
In other words:
This stop level is being discovered, not fit.
Design Implications
In live trading, this becomes:
- Model exit: MAE-to-date ≈ 9.9 pts
- Hard kill switch: 12.0 pts (safety backstop)
- Execution floor: small buffer (e.g. 0.5 pts) to avoid noise
The model doesn’t replace discipline — it quantifies it.
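As a rough sketch, the three levels could be combined into one decision function. How the buffer is applied is not spelled out above; this version assumes it is added to the model level so that a brief touch of 9.9 does not trigger an exit. The function name and return values are hypothetical.

```python
def stop_action(mae_pts, model_stop=9.9, kill_switch=12.0, buffer=0.5):
    """Map the current MAE-to-date (points) to an action string.

    Assumption: the buffer widens the model exit (model_stop + buffer),
    while the kill switch is a hard level with no buffer.
    """
    if mae_pts >= kill_switch:
        return "kill"        # safety backstop, always wins
    if mae_pts >= model_stop + buffer:
        return "model_exit"  # model-derived stop, past the noise buffer
    return "hold"


# Illustrative checks at a few MAE levels.
actions = [stop_action(x) for x in (5.0, 10.0, 11.0, 12.0)]
```

Checking the kill switch first keeps the backstop independent of whatever the model-derived level happens to be after a retrain.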
A Subtle but Important Insight
This approach does not require loading models in production.
The models are used offline to learn regime-conditioned stop policies, which are then written to a database and consumed by the live execution engine.
That keeps live systems:
- simpler
- safer
- easier to reason about
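A minimal sketch of that handoff, using SQLite from the standard library as a stand-in. The database engine, the `stop_policy` table, and its columns are all assumptions; the post only says the policies are written to a database and read by the live engine.

```python
import sqlite3


def write_policy(conn, regime, threshold, max_stop_pts, n_trades):
    """Offline side: store (or overwrite) the stop policy for one regime."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS stop_policy (
               regime       TEXT PRIMARY KEY,
               threshold    REAL NOT NULL,
               max_stop_pts REAL NOT NULL,
               n_trades     INTEGER NOT NULL
           )"""
    )
    conn.execute(
        "INSERT OR REPLACE INTO stop_policy VALUES (?, ?, ?, ?)",
        (regime, threshold, max_stop_pts, n_trades),
    )
    conn.commit()


def read_max_stop(conn, regime):
    """Live side: look up the stop level; None if the regime has no policy."""
    row = conn.execute(
        "SELECT max_stop_pts FROM stop_policy WHERE regime = ?", (regime,)
    ).fetchone()
    return None if row is None else row[0]


# In-memory database for illustration; values taken from the result table.
conn = sqlite3.connect(":memory:")
write_policy(conn, "PA-FIRST", 0.20, 9.9, 639)
live_stop = read_max_stop(conn, "PA-FIRST")
```

The live engine only ever runs the lookup, so it needs no ML dependencies, and a missing regime surfaces as `None` rather than as a silent default.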
What’s Next
This was just PA-FIRST.
The real test (and likely divergence) comes with:
- ATM-FIRST trades
- higher volatility regimes
- time-conditioned policies (early vs late trade)
- asymmetric logic (tighten vs exit)
But the takeaway stands:
If Random Forests and Gradient Boosting agree on the same stop level, the market is telling you something worth listening to.
This post is part of an ongoing effort to replace intuition-driven trading rules with observable, testable system behavior.