Trade Observations
Stop Guessing and Start Observing

Adding Random Forest to Machine A: Dual-Model GTO Signal Comparison

April 24, 2026
#Machine A #Random Forest #Keras #Futures Trading #Systematic Trading #Trade Observations

One of the long-term goals of Trade Observations is not just to automate trade decisions, but to continuously improve how those decisions are made.

Machine A is responsible for generating the primary 5-minute GTO signal that drives execution decisions for ES/MES futures. Until now, that signal was produced by a TensorFlow/Keras model that returns a probability representing:

Probability the next close is up

That model has been stable and useful, but it raised an important question:

Is Keras actually the best model for this job?

To answer that properly, I added a second model to Machine A:

Random Forest

Instead of replacing the Keras model immediately, the better engineering decision was to run both models side by side and compare them under identical market conditions.

This creates a much stronger decision framework:

  • same instrument
  • same 5-minute bars
  • same feature set
  • same labels
  • same thresholds
  • same production environment

Only the model changes.

That gives a true apples-to-apples comparison.


Existing Machine A Architecture

Machine A currently follows this flow:

NinjaTrader → MSMQ / RTD → Python → Model Inference → Excel (PyXLL) → Signal Push → Database

The Keras model already produces:

  • probability of next close being up
  • directional action (Long, Short, Flat)
  • Kelly sizing
  • expectancy

That information is written to Excel, logged to SQLite, and pushed into the downstream execution workflow.
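
For reference, here is a minimal sketch of how Kelly sizing and expectancy can be derived from a bullish probability. This is the textbook binary Kelly formula with an assumed payoff ratio b, not necessarily the exact formula Machine A runs:

def kelly_and_expectancy(prob_up: float, b: float = 1.0) -> tuple[float, float]:
    """Textbook binary Kelly: f* = p - (1 - p) / b, floored at zero."""
    kelly = max(0.0, prob_up - (1.0 - prob_up) / b)
    expectancy = prob_up * b - (1.0 - prob_up)  # expected R-multiple per trade
    return kelly, expectancy

print(kelly_and_expectancy(0.62))  # ~ (0.24, 0.24) at an even payoff ratio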

The system supports both execution frameworks:

  • PA-FIRST
  • ATM-FIRST


Why Random Forest

Random Forest takes a very different modeling approach from neural networks.

Keras strengths

  • excellent for nonlinear relationships
  • flexible architecture
  • supports sequence-style modeling
  • strong probability output

Random Forest strengths

  • fast to train
  • highly interpretable
  • robust against noisy features
  • strong feature importance analysis
  • easier to inspect for overfitting

Most importantly:

Random Forest gives visibility into why predictions happen.

That matters when the model is driving real capital decisions.


Training Setup

The goal was not to invent a new problem.

The goal was to compare models on the exact same problem.

Shared Features

Both models use the same core 5-minute RTH (regular trading hours) feature set:

  • Open
  • High
  • Low
  • Close
  • ChopIndex
  • ITD168
  • TRG168
  • ITD-TRG
  • Chopi1BarChg
  • TRG1BarChg
  • EMA
  • PeriodHH
  • PeriodLL
  • ITD-EMA

Shared Label Logic

Labels are generated using ITD and TRG directional agreement:

Label = 1

When:

  • current ITD > previous ITD
  • current TRG > previous TRG

Label = -1

When:

  • current ITD < previous ITD
  • current TRG < previous TRG

Label = 0

Everything else.

For the first comparison, Random Forest was trained as a binary model:

label == 1 → 1
all others → 0

This matches the Keras bullish-probability framework.
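
A minimal sketch of that label logic in pandas, assuming the bars arrive as a DataFrame carrying the ITD168 and TRG168 columns from the feature list:

import pandas as pd

def make_labels(bars: pd.DataFrame) -> pd.Series:
    itd_up = bars["ITD168"] > bars["ITD168"].shift(1)
    trg_up = bars["TRG168"] > bars["TRG168"].shift(1)
    itd_dn = bars["ITD168"] < bars["ITD168"].shift(1)
    trg_dn = bars["TRG168"] < bars["TRG168"].shift(1)

    label = pd.Series(0, index=bars.index)  # default: everything else -> 0
    label[itd_up & trg_up] = 1              # both rising  -> 1
    label[itd_dn & trg_dn] = -1             # both falling -> -1
    return label

bars = pd.DataFrame({"ITD168": [1.0, 1.2, 1.1, 1.3],
                     "TRG168": [0.5, 0.6, 0.4, 0.7]})
bars["label"] = make_labels(bars)
bars["target"] = (bars["label"] == 1).astype(int)  # binary: 1 vs all others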


Threshold Logic

Both models use the same live thresholds:

prob_up >= 0.62 → Long
prob_up <= 0.38 → Short
otherwise → Flat

This is critical.

Without identical thresholds, comparing models becomes misleading.
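
In code, the shared mapping is deliberately trivial. A sketch:

LONG_THRESHOLD = 0.62
SHORT_THRESHOLD = 0.38

def gto_action(prob_up: float) -> str:
    if prob_up >= LONG_THRESHOLD:
        return "Long"
    if prob_up <= SHORT_THRESHOLD:
        return "Short"
    return "Flat"

print(gto_action(0.71), gto_action(0.50), gto_action(0.31))  # Long Flat Short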


Production Integration

The Random Forest model was added to Machine A without disrupting the existing Keras workflow.

What Changed

A second model loader was added to PyXLL:

XL_LOAD_LATEST_RF_FROM_S3()

This loads the following from S3 into Excel process memory (a sketch follows):

  • the RF .joblib model
  • the RF metadata JSON
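
A minimal sketch of what that loader might look like, assuming boto3. The bucket, key layout, and metadata fields are illustrative assumptions, not the production paths:

import io
import json

import boto3
import joblib

_RF_MODEL = None   # held in Excel process memory between macro calls
_RF_META = None

def load_latest_rf_from_s3(bucket="models-bucket", prefix="machine-a/rf/latest/"):
    """Backs a macro like XL_LOAD_LATEST_RF_FROM_S3(); bucket/prefix are hypothetical."""
    global _RF_MODEL, _RF_META
    s3 = boto3.client("s3")
    model_obj = s3.get_object(Bucket=bucket, Key=prefix + "model.joblib")
    meta_obj = s3.get_object(Bucket=bucket, Key=prefix + "metadata.json")
    _RF_MODEL = joblib.load(io.BytesIO(model_obj["Body"].read()))  # RF .joblib model
    _RF_META = json.loads(meta_obj["Body"].read())                 # RF metadata JSON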

Live Inference

A new PyXLL macro was added:

XL_RF_PREDICT_FILL_AND_LOG()

This:

  1. reads the latest feature window from the worksheet
  2. reconstructs the exact feature order used during training
  3. performs RF inference
  4. converts probability to GTO action
  5. writes output to Excel
  6. logs predictions to SQLite
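
A minimal sketch of that flow, reusing gto_action() from the threshold section and _RF_MODEL / _RF_META from the loader sketch. The worksheet and logging helpers (read_feature_window, write_output_row, log_prediction) and the feature_order metadata key are hypothetical stand-ins for the actual plumbing:

import numpy as np

def rf_predict_fill_and_log():
    window = read_feature_window()                     # 1. latest feature window from the sheet (hypothetical helper)
    cols = _RF_META["feature_order"]                   # 2. exact training-time column order (assumed metadata key)
    X = np.array([[window[c] for c in cols]])
    prob_up = _RF_MODEL.predict_proba(X)[0, 1]         # 3. RF inference: probability of class 1 (up)
    action = gto_action(prob_up)                       # 4. shared 0.62 / 0.38 threshold logic
    write_output_row(prob_up, action)                  # 5. write to Excel (hypothetical helper)
    log_prediction("rf_predictions", prob_up, action)  # 6. log to SQLite (hypothetical helper)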

The existing Keras macro remains unchanged.

Both now run side by side.


Comparison Database

A shared SQLite database was created:

model_compare.sqlite3

with two tables:

  • keras_predictions
  • rf_predictions

This allows direct comparison of:

  • probabilities
  • directional actions
  • agreement vs disagreement
  • trade frequency
  • non-flat accuracy
  • performance by regime

This is far better than comparing screenshots or spreadsheet cells manually.

It creates a real research workflow.
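
A minimal schema sketch for that database. The column names are illustrative assumptions; the production tables may carry more fields:

import sqlite3

con = sqlite3.connect("model_compare.sqlite3")
for table in ("keras_predictions", "rf_predictions"):
    con.execute(f"""
        CREATE TABLE IF NOT EXISTS {table} (
            ts      TEXT NOT NULL,  -- bar timestamp
            prob_up REAL NOT NULL,  -- model probability of an up close
            action  TEXT NOT NULL,  -- Long / Short / Flat
            label   INTEGER         -- raw label, filled in once the bar resolves
        )
    """)
con.commit()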


First RF Results

Initial RF training produced:

Binary Accuracy

77.4%

Signal vs Raw Label Accuracy (thresholded signal scored against the raw label)

62.6%

Non-Flat Signal Accuracy

71.9%

That last number matters most.

It means:

When Random Forest commits to a Long or Short signal, it is correct nearly 72% of the time against the raw label.

That is a strong early result.

Even more important:

The probability distribution is healthy and not collapsing around 0.50.

That means the model is actually making directional decisions—not just hiding in uncertainty.
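
A sketch of how those three numbers can be computed from the prediction log, assuming the table layout sketched earlier with resolved labels:

import sqlite3

import pandas as pd

con = sqlite3.connect("model_compare.sqlite3")
df = pd.read_sql("SELECT prob_up, action, label FROM rf_predictions", con)
df["signal"] = df["action"].map({"Long": 1, "Short": -1, "Flat": 0})

binary_acc = ((df["prob_up"] >= 0.5) == (df["label"] == 1)).mean()   # binary accuracy
signal_acc = (df["signal"] == df["label"]).mean()                    # signal vs raw label
non_flat = df[df["signal"] != 0]
non_flat_acc = (non_flat["signal"] == non_flat["label"]).mean()      # non-flat accuracy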


What Will Be Measured Next

The real question is not:

Which model has better accuracy?

The real question is:

Which model produces better trading decisions?

Those are not always the same thing.

The comparison now focuses on:

Signal Frequency

Does one model overtrade?

Does one stay flat too often?

Agreement Cases

When both models agree:

  • are outcomes stronger?
  • is confidence higher?

Disagreement Cases

When models disagree (see the sketch after this list):

  • which model wins more often?
  • is disagreement itself a useful signal?

Regime Performance

Which model performs better in:

  • trends
  • trading ranges
  • breakout environments
  • PA-FIRST vs ATM-FIRST

Risk Quality

Does one model produce better stop behavior and better downstream trade management?

That matters more than raw probability scores.


Final Thought

This is not about proving Random Forest is better than Keras.

It is about building a better decision engine.

Good trading systems are not built by defending old assumptions.

They are built by measuring alternatives honestly.

Machine A is now capable of doing exactly that.

And that is a much bigger upgrade than simply adding another model.

It turns Machine A into a research platform.

That is where real edge comes from.