Sticks V6 — Intelligence Operating System
The Core Principle
Intelligence must be earned. Every behavior change, threshold adjustment, archetype trust level, and timing rule must be backed by structured evidence collected over a meaningful observation period. The system does not self-modify based on small samples, short-term wins, or outcome-driven logic. It earns the right to adapt through disciplined evidence collection, structured review, and stage-gated progression.
PHASE 1 — TRAINING PERIOD ARCHITECTURE
Stage-Gate Model
The system progresses through five formal stages. Each stage has entry requirements, allowed behaviors, forbidden behaviors, and graduation criteria. No stage can be skipped.
Stage 1: OBSERVATION (Bets 0–100)
Purpose: Collect raw decision data. Label everything. Change nothing.
Allowed:
- Place paper bets using the current model (Poisson + 8% edge threshold + Kelly)
- Record all decisions (enter, wait, monitor, reject) with full context
- Label archetypes manually after each bet
- Tag regime conditions for each slate
- Record CLV for every placed bet
- Record timing of entry relative to kickoff
- Record contradiction count (model vs qualitative signals)
- Record confidence score at time of decision
- Score process quality 1-5 after settlement

Forbidden:
- Changing edge thresholds
- Changing Kelly fraction
- Changing market inclusion/exclusion
- Trusting any archetype pattern
- Adjusting timing rules
- Promoting mastery levels beyond "Observation"
- Using archetype performance to influence decisions

Data volume required: 100 decisions minimum (includes enters, waits, monitors, rejects)

Graduation criteria:
- 100+ labeled decisions
- 50+ settled bets with CLV recorded
- 30+ rejection decisions with reasoning documented
- 15+ post-bet reviews completed with process scores
- All archetype tags assigned to settled bets
- All regime tags assigned to slates
Stage 2: STRUCTURED REVIEW (Bets 100–250)
Purpose: Begin pattern recognition. Score process quality systematically. Identify which archetypes have enough data to analyze.
Allowed:
- Continue all Stage 1 collection
- Run archetype performance reports (win rate and CLV, by archetype)
- Run timing analysis (CLV vs entry-time-before-kickoff)
- Run rejection quality analysis (were rejected bets correctly rejected?)
- Classify bets as good-process or bad-process, independent of outcome
- Identify the top 3 and bottom 3 archetypes by CLV
- Score review quality for each review

Forbidden:
- Changing any decision thresholds
- Trusting archetypes as validated
- Adjusting timing behavior
- Unlocking mastery beyond "Review"
- Weighting decisions by archetype performance

Data volume required: 250 total decisions, 150 settled bets

Graduation criteria:
- All Stage 1 criteria maintained
- 150+ settled bets with CLV
- 100+ post-bet reviews completed
- 50+ rejection reviews completed
- Archetype performance report run on a minimum of 5 archetypes with 10+ samples each
- CLV distribution analyzed (positive-CLV % identified)
- Timing analysis completed (optimal entry window identified with data)
Stage 3: CALIBRATION (Bets 250–500)
Purpose: Begin quantifying what the evidence actually says. Calculate confidence intervals. Identify what deserves trust.
Allowed:
- Calculate per-archetype CLV with confidence intervals
- Calculate per-market CLV
- Calculate per-regime performance
- Calculate model calibration (predicted probability vs actual frequency)
- Identify statistically significant patterns (minimum 30 samples per bucket)
- Flag archetypes for "emerging trust" status (positive CLV, 30+ samples, CLV > 0 at 70% confidence)
- Begin scoring timing mastery
- Begin scoring rejection mastery

Forbidden:
- Changing edge thresholds based on calibration alone
- Treating emerging patterns as validated
- Lowering the minimum edge for "promising" archetypes
- Adjusting staking for specific archetypes
- Any aggressive self-modification

Data volume required: 500 decisions, 300 settled bets

Graduation criteria:
- 300+ settled bets
- Model calibration report completed
- Per-archetype CLV calculated for all archetypes with 20+ samples
- Per-market CLV calculated
- Timing analysis shows identifiable optimal windows
- Rejection analysis shows measurable discipline quality
- Overall CLV positive across the full sample (even if barely)
- No evidence of systematic model failure in any major market
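The Stage 3 "emerging trust" flag (positive CLV, 30+ samples, CLV > 0 at 70% confidence) needs a concrete confidence test. A minimal sketch using a normal approximation; function names are illustrative, and a t-test or bootstrap would be more rigorous for small buckets:

```python
import math
import statistics

def clv_confidence_positive(clv_samples):
    """One-sided confidence that an archetype's true mean CLV exceeds zero,
    via a normal approximation. Returns None below the 30-sample minimum."""
    n = len(clv_samples)
    if n < 30:                       # Stage 3 minimum bucket size
        return None
    mean = statistics.mean(clv_samples)
    stdev = statistics.stdev(clv_samples)
    if stdev == 0:
        return 1.0 if mean > 0 else 0.0
    z = mean / (stdev / math.sqrt(n))
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # Phi(z)

def emerging_trust(clv_samples):
    """Positive mean CLV, 30+ samples, and CLV > 0 at 70% confidence."""
    conf = clv_confidence_positive(clv_samples)
    return conf is not None and conf >= 0.70 and statistics.mean(clv_samples) > 0
```

The 70% threshold is deliberately loose for "emerging" status; promotion to "validated" at Stage 4 should demand more samples, not a higher p-value alone.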
Stage 4: CONTROLLED TRUST (Bets 500–1000)
Purpose: Begin making controlled, evidence-backed adjustments. Small changes with monitoring.
Allowed:
- Promote archetypes with 50+ samples and consistent positive CLV to "validated" status
- Adjust the minimum edge threshold by ±1% for validated archetypes only
- Adjust timing rules based on 200+ data points showing clear optimal windows
- Flag "decaying" archetypes (previously positive, trending negative over the last 50 bets)
- Begin unlocking mastery tracks beyond "Calibration"
- Allow archetype-specific confidence adjustments (±5% only)

Forbidden:
- Wholesale threshold changes
- Trusting any archetype with fewer than 50 samples
- Ignoring CLV in favor of win rate
- Adjusting staking beyond ±0.5% of the Kelly fraction
- Treating any single month's performance as evidence

Data volume required: 1000 decisions, 600 settled bets

Graduation criteria:
- 600+ settled bets
- Positive CLV over a rolling 200-bet window
- Model calibration within 3% on all major probability buckets
- At least 3 validated archetypes
- Timing mastery score above threshold
- Rejection mastery score above threshold
- No systemic failure detected in any 100-bet window
Stage 5: MATURE INTELLIGENCE (1000+ Bets)
Purpose: The system has earned the right to adapt. Changes are still evidence-gated but the range of allowed adjustments expands.
Allowed:
- Full archetype trust system (validated, emerging, unvalidated, decaying, invalidated)
- Adaptive edge thresholds per archetype (range: 5%–12%)
- Timing-based entry rules (wait vs enter now)
- Regime-aware confidence adjustments
- Mastery-gated unlocks
- Active rejection as a scored skill
- Full review-driven doctrine updates
- Controlled experimental bets (5% of bankroll max, tagged as experimental)

Still forbidden:
- Ignoring CLV
- Outcome-only optimization
- Removing safety thresholds entirely
- Trusting any pattern with fewer than 30 samples
- Self-modifying without review evidence
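The stage gates above reduce to data. A sketch that checks only the quantitative volume requirements; the dictionary keys are hypothetical, and the qualitative criteria (reports run, tags assigned) still need human sign-off before graduating:

```python
# Volume requirements per stage, mirroring the "data volume required" lines.
# Key names are illustrative, not a fixed API.
STAGE_VOLUME_REQUIREMENTS = {
    1: {"decisions": 100, "settled_bets": 50, "rejections": 30, "reviews": 15},
    2: {"decisions": 250, "settled_bets": 150, "reviews": 100, "rejection_reviews": 50},
    3: {"decisions": 500, "settled_bets": 300},
    4: {"decisions": 1000, "settled_bets": 600},
}

def volume_requirements_met(stage: int, counts: dict) -> bool:
    """True when every volume requirement for graduating `stage` is met.
    Missing counts are treated as zero."""
    reqs = STAGE_VOLUME_REQUIREMENTS.get(stage, {})
    return all(counts.get(key, 0) >= needed for key, needed in reqs.items())
```

Keeping the gates as data rather than scattered `if` statements makes it trivial to audit what each stage actually requires.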
PHASE 2 — GAMIFIED KNOWLEDGE BASE ARCHITECTURE
Structure
1. Core Doctrine
Stores: The unchangeable rules: edge threshold minimums, Kelly cap, risk limits, market exclusions, no-accas rule.
Updated: Only by human operator decision, never by the system.
Used by: RISK, TRADER.
Training vs Mature: Identical in both modes.
2. Decision Doctrine
Stores: How decisions are made: verdict taxonomy (enter/wait/monitor/reject), evidence requirements, confidence scoring rubric, contradiction handling rules.
Updated: After Stage 3 calibration, via review evidence.
Used by: ANALYST, CLAUDE.
Training: Read-only.
Mature: Updatable with review evidence.
3. Timing Doctrine
Stores: When to enter: optimal entry windows by market type, CLV-vs-timing curves, wait-vs-enter rules, stale-edge definitions.
Updated: After Stage 3, with 200+ timing data points.
Used by: TRADER.
Training: Collect data only.
Mature: Active timing rules.
4. Rejection Doctrine
Stores: When and why to say no: rejection classes, pass/monitor/reject criteria, false-edge signatures, contradiction thresholds.
Updated: After Stage 2, with rejection review data.
Used by: ANALYST, RISK.
Training: Collect rejections and reasons.
Mature: Active rejection scoring.
5. Risk Doctrine
Stores: Exposure rules, stop-loss logic, defensive-mode triggers, max stake rules, correlation limits.
Updated: Only by human operator.
Used by: RISK.
Training vs Mature: Identical.
6. Review Doctrine
Stores: How reviews are conducted: process scoring rubric, outcome-bias prevention rules, good-loss/bad-win classification criteria.
Updated: Continuously refined based on review quality scores.
Used by: SETTLER.
Training: Active from Stage 1.
Mature: Full review engine.
7. Archetype Library
Stores: Recurring bet pattern definitions: signature conditions, trust level, evidence count, CLV, invalidation signs.
Updated: Via settled-bet classification and review.
Used by: ANALYST.
Training: Labels assigned, no trust granted.
Mature: Trust levels active.
8. Failure Pattern Library
Stores: Known failure modes: steam-chasing, false-positive signatures, model weak spots, timing traps.
Updated: Via post-bet review when the process score is low.
Used by: ANALYST, RISK.
Training: Collect examples.
Mature: Active warnings.
9. Regime Library
Stores: Slate-level conditions: high-volatility, quiet, public-bias, lineup-chaos, efficient-market.
Updated: Via slate-level tagging and performance analysis.
Used by: ANALYST, RISK.
Training: Tag regimes, no behavior changes.
Mature: Regime-aware adjustments.
10. League Intelligence
Stores: Per-league model accuracy, CLV patterns, market efficiency.
Updated: Via league-bucketed performance analysis.
Used by: ANALYST.
Training: Collect.
Mature: League-specific confidence adjustments.
11. Market Behavior Knowledge
Stores: Per-market-type performance: which markets have positive CLV? Which are efficient?
Updated: Via market-bucketed analysis.
Used by: ANALYST, TRADER.
Training: Collect.
Mature: Market-specific edge thresholds.
12. Calibration Layer
Stores: Model accuracy by probability bucket (predicted 60% → actual 58%? predicted 40% → actual 35%?).
Updated: After Stage 3, with 300+ bets.
Used by: ANALYST.
Training: Calculate but don't act.
Mature: Adjust model confidence.
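The bucket comparison behind the Calibration Layer can be sketched in a few lines. The `(model_prob, won)` pair format and the 0.05 bucket width are assumptions, not the production schema:

```python
from collections import defaultdict

def calibration_report(bets, bucket_width=0.05):
    """Bucket settled bets by model probability and compare predicted vs
    actual win frequency. `bets` is an iterable of (model_prob, won) pairs."""
    buckets = defaultdict(lambda: [0, 0])          # bucket floor -> [wins, total]
    for prob, won in bets:
        floor = round(int(prob / bucket_width) * bucket_width, 2)
        buckets[floor][1] += 1
        if won:
            buckets[floor][0] += 1
    return {
        floor: {"midpoint": round(floor + bucket_width / 2, 3),
                "actual": wins / total,
                "n": total}
        for floor, (wins, total) in sorted(buckets.items())
    }
```

Comparing each bucket's midpoint against its `actual` frequency gives exactly the "predicted 60% → actual 58%?" check described above; buckets with small `n` should be ignored until Stage 3 volumes are reached.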
PHASE 3 — GAMIFICATION MODEL
XP Categories
XP is earned for process quality, not outcomes. Each bet decision generates XP across multiple categories.
| Category | Earns XP | Loses XP |
|---|---|---|
| Edge Quality | Positive CLV | Negative CLV on "strong" confidence bets |
| Timing | Entry CLV > 0 (beat the closing line) | Entered after value eroded (CLV < -0.10) |
| Rejection | Correctly rejected bet that would have lost | False rejection of bet that would have won with good process |
| Discipline | Followed all doctrine rules | Violated stop-loss, exposure limits, or override rules |
| Review | Completed thorough review with process score ≥ 4 | Shallow review (score ≤ 2) or skipped review |
| Archetype | Correctly identified archetype that matches outcome pattern | Misidentified archetype |
| Regime | Correctly tagged regime and adjusted behavior | Ignored regime signals |
| Calibration | Confidence prediction within 5% of actual | Overconfidence (predicted 70%, actual < 50%) |
Anti-Results-Bias Rules
- A bad bet that wins (poor process, negative expected CLV, ignored contradictions) earns ZERO XP for the win and receives a process penalty.
- A good bet that loses (strong process, positive CLV, correct archetype, good timing) earns FULL process XP.
- A great pass (correctly rejected a losing bet) earns rejection XP equal to a winning bet's edge XP.
- XP from outcomes (win/loss) is weighted at 20%. XP from process metrics is weighted at 80%.
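The 80/20 split and the bad-win rule combine into a single scoring function. A sketch: the weights and the bad-win zeroing come from the rules above, but the raw XP values per process score and per outcome are illustrative assumptions:

```python
def bet_xp(process_score: int, won: bool) -> float:
    """Process-weighted XP for one settled bet. The 80/20 weighting and the
    bad-win rule follow the anti-results-bias rules; the per-score XP table
    and the flat outcome XP are placeholder values."""
    process_xp = {1: -10, 2: -5, 3: 3, 4: 10, 5: 15}[process_score]
    outcome_xp = 10 if won else 0
    if won and process_score <= 2:
        # Bad win: zero XP for the outcome; the process penalty still applies.
        return 0.8 * process_xp
    return 0.8 * process_xp + 0.2 * outcome_xp
```

Note that a good loss (score 5, lost) still earns more XP than a bad win (score 2, won), which is the whole point of the weighting.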
Training Period Restrictions
- During Stages 1-2: XP is tracked but no mastery levels unlock.
- During Stage 3: First mastery levels can unlock (Observation → Calibration rank).
- During Stages 4-5: Full mastery progression active.
PHASE 4 — MASTERY SYSTEM
Tracks
1. Edge Detection
Measures: Accuracy of edge identification. Are high-confidence bets actually higher-probability?
Levels: Observer → Detector → Evaluator → Expert → Master
Unlocks at Expert: Archetype-specific edge adjustment (±1%)
Locked until: 200 settled bets with confidence scores
2. Timing
Measures: CLV quality. Are entries beating the closing line?
Levels: Observer → Timer → Strategist → Expert → Master
Unlocks at Expert: Wait-vs-enter recommendations active
Locked until: 150 bets with CLV data
3. Rejection Discipline
Measures: Quality of "no" decisions. Rejection accuracy rate.
Levels: Observer → Filter → Guardian → Expert → Master
Unlocks at Expert: Auto-rejection of invalidated archetypes
Locked until: 100 rejection decisions reviewed
4. Risk Discipline
Measures: Compliance with risk rules. Stop-loss respect, exposure management.
Levels: Observer → Compliant → Disciplined → Expert → Master
Unlocks at Expert: Adaptive exposure limits based on regime
Locked until: 200 decisions with full risk-compliance tracking
5. Review Quality
Measures: Depth and accuracy of post-bet reviews.
Levels: Observer → Reviewer → Analyst → Expert → Master
Unlocks at Expert: Review-driven doctrine updates
Locked until: 100 reviews completed with quality scores
6. Archetype Recognition
Measures: Accuracy of archetype tagging. Do tagged archetypes perform as expected?
Levels: Observer → Tagger → Pattern Reader → Expert → Master
Unlocks at Expert: Archetype trust system active
Locked until: 50 bets per archetype for the top 5 archetypes
7. Regime Awareness
Measures: Accuracy of regime detection. Does the identified regime correlate with performance patterns?
Levels: Observer → Detector → Strategist → Expert → Master
Unlocks at Expert: Regime-aware confidence adjustment
Locked until: 30 slates per regime type
8. Confidence Calibration
Measures: Alignment between predicted probability and actual outcome frequency.
Levels: Observer → Estimator → Calibrator → Expert → Master
Unlocks at Expert: Model confidence bias correction
Locked until: 300 bets with calibration analysis
PHASE 5 — KNOWLEDGE OBJECTS
1. Decision Record
{
id: string,
timestamp: ISO,
fixture_id: number,
match: string,
verdict: "enter" | "wait" | "monitor" | "reject",
market: string,
model_prob: float,
market_odds: float,
edge_pct: float,
confidence: 1-10,
timing_score: 1-5,
archetype_tag: string,
regime_tag: string,
contradiction_count: int,
invalidation_triggers: string[],
supporting_evidence: string[],
contradictory_evidence: string[],
claude_verdict: "approve" | "reject" | null,
claude_confidence: 1-10 | null,
claude_reasoning: string | null,
risk_status: string,
execution_readiness: 1-5,
evidence_sufficiency: 1-5,
trust_status: "unvalidated" | "emerging" | "validated" | "decaying",
training_stage: 1-5,
created_by: "ANALYST"
}
Written: Every time ANALYST evaluates a match. Active: All stages.
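A minimal Python mirror of the schema above, covering a subset of fields with the range checks the pseudo-types imply. Field names follow the JSON keys; everything else (the class itself, the validation style) is an implementation sketch:

```python
from dataclasses import dataclass

VERDICTS = {"enter", "wait", "monitor", "reject"}

@dataclass
class DecisionRecord:
    """Subset of the Decision Record schema with basic range validation."""
    id: str
    verdict: str
    model_prob: float
    market_odds: float
    confidence: int        # 1-10 per the schema
    training_stage: int    # 1-5 per the schema

    def __post_init__(self):
        if self.verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict}")
        if not 1 <= self.confidence <= 10:
            raise ValueError("confidence must be 1-10")
        if not 1 <= self.training_stage <= 5:
            raise ValueError("training_stage must be 1-5")
```

Validating at construction time keeps bad records out of the evidence base, which matters when every later stage gate counts these rows.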
2. Review Record
{
id: string,
decision_id: string,
settled_at: ISO,
result: "WON" | "LOST",
pnl: float,
clv: float,
process_score: 1-5,
thesis_quality: 1-5,
evidence_quality: 1-5,
contradiction_handling: 1-5,
timing_quality: 1-5,
discipline_quality: 1-5,
archetype_fit: 1-5,
stake_appropriateness: 1-5,
classification: "good-win" | "good-loss" | "bad-win" | "bad-loss" | "great-pass" | "correct-wait" | "correct-reject" | "false-reject" | "missed-entry",
lessons: string[],
xp_awarded: { edge: int, timing: int, rejection: int, discipline: int, review: int, archetype: int },
mastery_impact: { track: string, delta: int }[],
reviewed_by: "SETTLER",
reviewed_at: ISO
}
Written: After every settled bet and every reviewed rejection. Active: All stages (Stage 1: simplified scores only).
3. Archetype Record
{
id: string,
name: string,
description: string,
signature_conditions: string[],
trust_status: "unvalidated" | "emerging" | "validated" | "decaying" | "invalidated",
sample_count: int,
win_rate: float,
avg_clv: float,
avg_edge: float,
positive_clv_pct: float,
ideal_timing: string,
invalidation_signs: string[],
risk_flags: string[],
last_updated: ISO,
last_performance_window: { wins: int, losses: int, clv: float, window_size: int },
trust_earned_at_sample: int | null,
trust_decay_trigger: string | null
}
Written: After classification during review. Updated on every new sample. Active: Labeling from Stage 1. Trust from Stage 4.
4. Regime Record
{
id: string,
name: string,
description: string,
detection_indicators: string[],
sample_slates: int,
avg_clv_during: float,
avg_win_rate_during: float,
confidence_adjustment: float, // e.g., -0.05 = reduce confidence by 5%
timing_adjustment: string,
rejection_threshold_adjustment: float,
trust_status: "unvalidated" | "emerging" | "validated",
last_updated: ISO
}
Active: Tagging from Stage 1. Adjustments from Stage 4.
5. Rejection Record
{
id: string,
decision_id: string,
reason_class: "fake-edge" | "stale-edge" | "model-weak" | "contradiction" | "bad-timing" | "low-liquidity" | "low-trust-archetype" | "weak-evidence" | "value-gone" | "regime-caution",
edge_at_rejection: float,
would_have_result: "WON" | "LOST" | "UNKNOWN",
quality_score: 1-5,
xp_awarded: int,
reviewed: boolean
}
Active: All stages.
6. Timing Record
{
id: string,
decision_id: string,
entry_time_before_ko: int, // minutes
odds_at_entry: float,
closing_odds: float,
clv: float,
timing_grade: "excellent" | "good" | "acceptable" | "late" | "stale",
market_type: string,
lesson: string | null
}
Active: All stages.
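The Timing Record's `clv` field needs a fixed formula. The document doesn't pin one down; the sketch below assumes the common convention of fractional odds advantage over the close, where positive means the entry beat the closing line:

```python
def closing_line_value(odds_at_entry: float, closing_odds: float) -> float:
    """CLV as fractional odds advantage over the close (an assumed
    convention): positive when entry odds were better than closing odds."""
    return odds_at_entry / closing_odds - 1
```

Whichever convention is chosen, it must be applied uniformly, since every trust threshold in this document (CLV > +0.03, grade bands, etc.) is expressed on that scale.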
PHASE 6 — DECISION INTELLIGENCE FRAMEWORK
Every candidate match produces a formal Decision Object:
DECISION:
verdict: enter | wait | monitor | reject
EDGE ASSESSMENT:
model_probability: 0.62
market_implied: 0.53
edge: 9.0%
confidence: 7/10
evidence_sufficiency: 4/5
TIMING ASSESSMENT:
timing_score: 4/5
entry_window: "optimal (2h pre-KO)"
line_movement: "stable"
stale_risk: "low"
CONTEXT:
archetype: "strong-home-value"
archetype_trust: "emerging (38 samples, CLV +0.04)"
regime: "normal"
contradiction_count: 1
contradictions: ["Away team unbeaten in last 5 away"]
invalidation_triggers: ["Key home striker injured", "Heavy rain forecast"]
RISK:
exposure_impact: "+3.00 (total: 18/800)"
risk_status: "ACTIVE"
stake: 3.00 (quarter Kelly)
REVIEW:
review_priority: "standard"
learning_value: "medium"
mastery_tracks: ["edge_detection", "archetype_recognition"]
TRUST:
trust_level: "emerging"
training_stage: 3
evidence_count: 38
behavior_unlocked: false
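The edge and stake figures in the object above follow from two small formulas. The edge definition matches the example (0.62 model vs 0.53 implied → 9.0%); the staking function is a quarter-Kelly sketch only, since the production rule also applies the Risk Doctrine's caps:

```python
def edge_pct(model_prob: float, market_odds: float) -> float:
    """Edge as model probability minus market-implied probability, in %."""
    return (model_prob - 1 / market_odds) * 100

def quarter_kelly_stake(bankroll: float, model_prob: float, market_odds: float) -> float:
    """Quarter-Kelly sizing sketch: f* = (b*p - q) / b with b = odds - 1,
    q = 1 - p, staked at one quarter of full Kelly. Never stakes on a
    non-positive Kelly fraction."""
    b = market_odds - 1
    fraction = (b * model_prob - (1 - model_prob)) / b
    return max(0.0, bankroll * fraction / 4)
```

Keeping both as pure functions makes the Decision Object reproducible: given the recorded `model_prob` and `market_odds`, the logged `edge_pct` and stake can be re-derived during review.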
PHASE 7 — BET ARCHETYPES
Archetype Definitions
strong-home-value
Signature: Home team with xG > 1.5, home odds exceed model fair odds by 8%+, strong home form (4W+ in last 6 home)
Why it exists: Markets sometimes undervalue consistent home performers in mid-table matchups
Trustworthy when: Home team not in a congested fixture schedule, no key absences, normal conditions
Dangerous when: Cup hangover, nothing-to-play-for end of season, key defender missing
Ideal timing: 1-3 hours before kickoff (lines stable)
Invalidation: Late team news significantly changing the expected lineup
Training start: Sample count 0. Trust: unvalidated.
Trust threshold: 40 samples, CLV > +0.03, positive CLV in 55%+ of bets
high-xG-totals-edge
Signature: Combined xG > 3.0, Over 2.5 odds > 1.60, both teams averaging 1.3+ goals
Why it exists: The Poisson model occasionally finds genuine over-pricing in total-goals markets
Trustworthy when: Both teams in attacking form, no weather concerns, open-play styles
Dangerous when: One team is a defensive specialist or playing for a draw
Ideal timing: Early (6h+ before KO), when lines haven't corrected
Invalidation: One team announces an ultra-defensive setup or a key attacker is out
inflated-away-price
Signature: Away team odds exceed the model's fair price by 10%+, away team in strong form
Why it exists: Markets historically overvalue home advantage for some team profiles
Trustworthy when: Away team is genuinely strong away from home (check away-specific form)
Dangerous when: Small sample of away games, hostile venue, derby
NOTE: The model is currently weak on away predictions. Extra caution; higher trust threshold.
Trust threshold: 60 samples (higher than standard due to the known model weakness)
draw-trap
Signature: Model shows an edge on the draw, both teams evenly matched, low xG
Why it exists: Draws are the hardest outcome to predict, and bookmakers price them conservatively.
Trustworthy when: Almost never with current model quality
Dangerous when: Always — the model is weakest here
Training behavior: Collect data but flag as high-risk. Do not trust until Stage 5 at minimum.
stale-edge-false-positive
Signature: An edge existed 6h+ ago, but odds have since moved toward the model's predicted probability
Why it exists: The market corrected; the edge is gone. This is a FAILURE pattern, not a bet type.
Action: Reject. Award rejection XP.
qualitative-veto-override
Signature: Model shows an edge but Claude (or a human) rejects on qualitative grounds
Why it exists: Injuries, motivation, and tactical mismatches the model can't see
Trust threshold: Track veto accuracy. If vetoes are correct 60%+ of the time, increase Claude's weight.
PHASE 8 — TIMING INTELLIGENCE
Timing States
| State | Definition | Action |
|---|---|---|
| Early Value | Edge identified 6h+ before KO, line stable | Enter if edge > 10%. Wait if 8-10%. |
| Optimal Window | 1-3h before KO, line stable, edge confirmed | Primary entry zone. |
| Late Confirmation | <1h before KO, line moved toward our position | Enter — market agrees. CLV likely positive. |
| Stale Edge | Edge existed earlier but odds moved against us | Do NOT enter. Value is gone. Reject. |
| Market Ahead | Line moved past our model's price | Pass. Market is smarter on this one. |
| In-Play Early | 15-20 min in, stats confirming thesis | Monitor. Enter only if live data strongly supports. |
| In-Play Late | 60+ min, reduced time value | Generally avoid. Exception: extreme mispricing. |
Timing Grades
| Grade | CLV Range | Meaning |
|---|---|---|
| A | CLV > +0.10 | Excellent timing — significantly beat closing line |
| B | CLV +0.03 to +0.10 | Good timing — beat the line |
| C | CLV -0.03 to +0.03 | Neutral — entered at about fair value |
| D | CLV -0.10 to -0.03 | Late — entered after partial value erosion |
| F | CLV < -0.10 | Stale entry — should have been rejected |
Timing XP
- Grade A: +15 XP
- Grade B: +10 XP
- Grade C: +3 XP
- Grade D: -5 XP
- Grade F: -15 XP
- Correct stale-edge rejection: +12 XP
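The grade bands and XP values above translate directly to code. One assumption: the table leaves the exact band boundaries ambiguous, so ties here resolve toward the better grade:

```python
TIMING_XP = {"A": 15, "B": 10, "C": 3, "D": -5, "F": -15}

def timing_grade(clv: float) -> str:
    """Grade bands from the Phase 8 table; boundary ties resolve upward."""
    if clv > 0.10:
        return "A"
    if clv >= 0.03:
        return "B"
    if clv > -0.03:
        return "C"
    if clv >= -0.10:
        return "D"
    return "F"

def timing_xp(clv: float, correct_stale_rejection: bool = False) -> int:
    """XP for a timed entry, or +12 for a correctly rejected stale edge."""
    return 12 if correct_stale_rejection else TIMING_XP[timing_grade(clv)]
```

Because grades derive mechanically from CLV, the Timing Record's `timing_grade` field is redundant but useful for querying; it should always be recomputable from the stored odds.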
PHASE 9 — REJECTION INTELLIGENCE
Rejection Classes
| Class | Trigger | Priority |
|---|---|---|
| Fake Edge | Model error, data anomaly, odds display error | Critical — log for model review |
| Stale Edge | Value eroded, odds moved | Standard — timing discipline |
| Model Weak | Away win spots, draw spots, known model limitations | Standard — discipline |
| Contradiction | 3+ contradicting factors vs model | Elevated — review carefully |
| Bad Timing | Too late, too early for confidence, in-play decay | Standard |
| Low Liquidity | Thin Betfair book, wide spreads | Standard |
| Low Trust Archetype | Archetype is unvalidated or decaying | Elevated — wait for evidence |
| Weak Evidence | Edge exists but supporting data is thin | Standard |
| Value Gone | Line moved past break-even | Critical — never enter |
| Regime Caution | Current regime is low-confidence period | Elevated |
Great Pass Recognition
A "great pass" is a rejection where:
1. The rejected bet would have lost, AND
2. The process for rejecting it was documented with a clear reason, AND
3. The rejection class was appropriate
Great passes earn edge-level XP (15-25 XP) and advance Rejection Discipline mastery.
PHASE 10 — REVIEW LOOP
Classification Matrix
| Process | Outcome | Classification | XP Impact |
|---|---|---|---|
| Strong process, documented | Won | Good Win | Full XP: edge + timing + process |
| Strong process, documented | Lost | Good Loss | Process XP preserved, edge XP neutral |
| Weak process, undocumented | Won | Bad Win | Zero XP, process penalty |
| Weak process, undocumented | Lost | Bad Loss | Process penalty, learning requirement |
| Strong rejection reasoning | Would have lost | Great Pass | Rejection XP = edge win XP |
| Correct wait decision | Better entry found later | Correct Wait | Timing XP |
| Rejected, would have won | But process was correct | Correct Reject | Neutral (variance) |
| Rejected, would have won | Process was wrong | False Reject | Small penalty, review required |
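The matrix above can be sketched as one function. Two assumptions: for rejections and waits, the "outcome" is the counterfactual would-have-result, and the weak-process/would-have-lost rejection cell is not in the matrix, so mapping it to `correct-reject` here is a placeholder choice:

```python
def classify(kind: str, strong_process: bool, won: bool) -> str:
    """Map process quality and outcome to a review classification.
    `kind` is 'bet', 'reject', or 'wait'; for rejections, `won` is the
    counterfactual result of the bet that was not placed."""
    if kind == "bet":
        if strong_process:
            return "good-win" if won else "good-loss"
        return "bad-win" if won else "bad-loss"
    if kind == "reject":
        if won:  # the rejected bet would have won
            return "correct-reject" if strong_process else "false-reject"
        return "great-pass" if strong_process else "correct-reject"
    return "correct-wait"
```

Note the asymmetry this encodes: a rejection that would have won with correct process is variance, not error, so it carries no penalty.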
Process Score Rubric (1-5)
| Score | Meaning |
|---|---|
| 5 | Perfect: clear thesis, evidence documented, contradictions addressed, timing optimal, stake appropriate, review thorough |
| 4 | Strong: minor gap in one area but overall disciplined |
| 3 | Adequate: followed basic doctrine but missed contextual factors |
| 2 | Weak: skipped contradiction check, poor timing, inadequate evidence |
| 1 | Failure: chased, ignored signals, violated doctrine |
What Review Updates
- Decision Record → marked as reviewed
- Archetype Record → sample count updated, CLV recalculated
- Timing Record → timing grade assigned
- Mastery Tracks → XP awarded per track
- Failure Library → new entry if process score ≤ 2
- Rejection quality → if this was a rejection that was reviewed
During Training (Stages 1-3): Review updates records but does NOT change any decision behavior.
During Mature (Stages 4-5): Review can trigger archetype trust changes, timing rule updates, and doctrine amendments.
PHASE 11 — REGIME AWARENESS
Regime Types
Normal
Indicators: Standard slate, 5-8 matches, no unusual external factors
Adjustments: None — baseline behavior
High Volatility
Indicators: Cup weeks, international breaks ending, early season
Adjustments: Reduce confidence by 5%, increase the minimum edge to 10%, reduce stakes by 20%
Public Bias
Indicators: Marquee matchups (El Clasico, derby days), heavy public betting
Adjustments: Weight contrarian positions more heavily, especially away and under markets
Lineup Chaos
Indicators: Multiple teams with 3+ changes, cup rotation suspected
Adjustments: Reduce confidence, increase the rejection rate, wait for confirmed lineups
Efficient Market
Indicators: CLV consistently near zero across the last 20 bets, no mispricing detected
Adjustments: Reduce activity, raise the minimum edge to 12%, focus on rejection quality
Strong Model Period
Indicators: CLV positive in 60%+ of the last 30 bets, tight model calibration
Adjustments: Moderate confidence boost (max +5%), slightly larger stakes allowed
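The regime adjustments above are easiest to keep honest as a lookup table. Numbers with an explicit value in the text are transcribed; where the text gives no number (e.g. the lineup-chaos confidence reduction) the value is a placeholder, and the 8% baseline minimum edge comes from the core model:

```python
# Field names and the lookup API are illustrative sketches.
REGIME_ADJUSTMENTS = {
    "normal":           {"confidence_delta":  0.00, "min_edge": 0.08, "stake_mult": 1.0},
    "high-volatility":  {"confidence_delta": -0.05, "min_edge": 0.10, "stake_mult": 0.8},
    "lineup-chaos":     {"confidence_delta": -0.05, "min_edge": 0.08, "stake_mult": 1.0},  # delta assumed
    "efficient-market": {"confidence_delta":  0.00, "min_edge": 0.12, "stake_mult": 1.0},
    "strong-model":     {"confidence_delta":  0.05, "min_edge": 0.08, "stake_mult": 1.0},
}

def adjusted_min_edge(regime: str) -> float:
    """Minimum edge for the tagged regime; unknown or untagged regimes
    fall back to baseline behavior."""
    return REGIME_ADJUSTMENTS.get(regime, REGIME_ADJUSTMENTS["normal"])["min_edge"]
```

Falling back to "normal" for unrecognised tags keeps a mis-tagged slate at baseline discipline rather than silently loosening it.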
PHASE 12 — HUD / MOBILE SURFACING
Match Card (Live Screen)
Barcelona v Espanyol 2-1 67'
La Liga · CONTROL
Edge 9.3% · Confidence 7 · Timing: OPTIMAL
Archetype: strong-home-value (emerging, 38 samples)
Regime: normal
Contradictions: 1 · Evidence: 4/5
Trust: EMERGING · Stage: 3
Portfolio Position Card
B Barcelona v Espanyol · Over 2.5 @1.95
Edge 9.3% · CLV +0.08 · Timing A
Process: 4/5 · Archetype: high-xG-totals
Trust: emerging · Review: pending
Profile — Mastery Dashboard
MASTERY TRACKS
Edge Detection ████████░░ Expert (Level 4)
Timing ██████░░░░ Strategist (Level 3)
Rejection █████████░ Expert (Level 4)
Risk Discipline ██████████ Master (Level 5)
Review Quality ███████░░░ Analyst (Level 3)
Archetypes ████░░░░░░ Pattern Reader (Level 3)
TRAINING STATUS: Stage 3 — Calibration
Bets: 287/500 to Stage 4
CLV+: 54% (target: 55%+)
Reviews: 142 completed
PHASE 13 — IMPLEMENTATION ROADMAP
Step 1: Decision Records (Week 1)
Add decision_records table to sportai.db. Record every ANALYST evaluation with verdict, confidence, archetype tag, regime tag, contradiction count. This is the foundation. Everything else builds on having structured decisions.
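A sketch of that table, covering a subset of the Phase 5 Decision Record columns. The column list is illustrative rather than exhaustive; names mirror the JSON keys:

```python
import sqlite3

DECISION_RECORDS_DDL = """
CREATE TABLE IF NOT EXISTS decision_records (
    id                  TEXT PRIMARY KEY,
    timestamp           TEXT NOT NULL,
    fixture_id          INTEGER,
    verdict             TEXT CHECK (verdict IN ('enter','wait','monitor','reject')),
    model_prob          REAL,
    market_odds         REAL,
    edge_pct            REAL,
    confidence          INTEGER,
    archetype_tag       TEXT,
    regime_tag          TEXT,
    contradiction_count INTEGER,
    training_stage      INTEGER
);
"""

def init_decision_records(path: str = "sportai.db") -> sqlite3.Connection:
    """Create the decision_records table if it doesn't exist yet."""
    con = sqlite3.connect(path)
    con.execute(DECISION_RECORDS_DDL)
    con.commit()
    return con
```

The CHECK constraint on `verdict` pushes the schema's enum into the database itself, so malformed rows fail at insert time rather than polluting later stage-gate counts.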
Step 2: Review Records (Week 2)
Add review_records table. After each settled bet, create a review with process score, classification (good-win / bad-win / good-loss / bad-loss), CLV, timing grade. Start with simplified scoring (1-5 overall).
Step 3: Rejection Records (Week 2)
Add rejection_records table. When ANALYST or RISK rejects a bet, log the reason class and the edge that was rejected. Track the would-have-result after settlement for rejection quality scoring.
Step 4: Timing Records (Week 3)
Add timing_records table. For every placed bet, record time-before-kickoff, entry odds, closing odds, CLV. Calculate timing grades.
Step 5: Archetype Tags (Week 3)
Define the initial 10 archetypes (from Phase 7). Add archetype tagging to the auto-bet pipeline. Every bet gets an archetype label based on signature conditions.
Step 6: Process-Weighted XP (Week 4)
Replace the current XP system (win = 15-25, loss = 2) with the process-weighted system from Phase 3. XP = 80% process + 20% outcome.
Step 7: Stage Tracking (Week 4)
Add training_stage to bankroll.json. Start at Stage 1. Track progress toward Stage 2 graduation criteria.
Step 8: Mastery Tracks (Week 5)
Add mastery track scores to bankroll.json. Start all tracks at "Observer" level. Track XP per track. Display in Profile screen.
Step 9: Regime Tagging (Week 6)
Add regime detection to the analysis pipeline. Tag each slate. Start collecting regime performance data.
Step 10: Calibration Reports (Week 8+)
After 200+ bets, run first calibration analysis. Model probability vs actual frequency. Per-archetype CLV. Per-market CLV. Timing analysis.
What Remains Locked
| Feature | Unlocks At |
|---|---|
| Archetype trust adjustments | Stage 4 (500+ bets) |
| Edge threshold changes | Stage 4 (500+ bets) |
| Timing-based entry rules | Stage 3 (250+ bets with timing data) |
| Regime-aware adjustments | Stage 4 (500+ bets) |
| Experimental bets | Stage 5 (1000+ bets) |
| Adaptive Kelly | Stage 5 (1000+ bets) |
| Auto-rejection of invalidated archetypes | Stage 4 (Rejection mastery: Expert) |
How to Validate
The intelligence system is working if:
1. CLV improves over rolling 100-bet windows
2. Rejection accuracy improves (rejected bets lose at a higher rate than entered bets)
3. Timing grades improve (more A/B, fewer D/F)
4. Process scores improve (average trending toward 4+)
5. Great-pass count increases
6. Bad-win count decreases (fewer undisciplined wins)
7. Archetype tagging becomes predictive (tagged archetypes perform as expected)
8. Model calibration improves (predicted probabilities match actual frequencies more closely)
If these metrics are NOT improving after 500 bets, the intelligence system needs human review and potential model redesign — not more automation.
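A coarse sketch of validation check #1, comparing the most recent 100-bet window against the one before it. A trend test over several windows would be more robust than this two-window comparison:

```python
def rolling_clv_improving(clv_series, window=100):
    """Compare mean CLV of the latest window against the previous window.
    Returns None until two full windows of settled bets exist."""
    if len(clv_series) < 2 * window:
        return None
    prev = sum(clv_series[-2 * window:-window]) / window
    last = sum(clv_series[-window:]) / window
    return last > prev
```

Returning None on insufficient data matters here: an "improving" verdict from a partial window is exactly the small-sample self-modification the core principle forbids.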
This document is the brain, memory, progression ladder, and calibration engine for Sticks Ultimate V6. Intelligence must be earned.