Sticks V6 — Intelligence Operating System
The Core Principle
Intelligence must be earned. Every behavior change, threshold adjustment, archetype trust level, and timing rule must be backed by structured evidence collected over a meaningful observation period. The system does not self-modify based on small samples, short-term wins, or outcome-driven logic. It earns the right to adapt through disciplined evidence collection, structured review, and stage-gated progression.
PHASE 1 — TRAINING PERIOD ARCHITECTURE
Stage-Gate Model
The system progresses through five formal stages. Each stage has entry requirements, allowed behaviors, forbidden behaviors, and graduation criteria. No stage can be skipped.
Stage 1: OBSERVATION (Bets 0–100)
Purpose: Collect raw decision data. Label everything. Change nothing.
Allowed:
- Place paper bets using the current model (Poisson + 8% edge threshold + Kelly)
- Record all decisions (enter, wait, monitor, reject) with full context
- Label archetypes manually after each bet
- Tag regime conditions for each slate
- Record CLV for every placed bet
- Record timing of entry relative to kickoff
- Record contradiction count (model vs qualitative signals)
- Record confidence score at time of decision
- Score process quality 1-5 after settlement

Forbidden:
- Changing edge thresholds
- Changing Kelly fraction
- Changing market inclusion/exclusion
- Trusting any archetype pattern
- Adjusting timing rules
- Promoting mastery levels beyond "Observation"
- Using archetype performance to influence decisions

Data volume required: 100 decisions minimum (includes enters, waits, monitors, rejects)

Graduation criteria:
- 100+ labeled decisions
- 50+ settled bets with CLV recorded
- 30+ rejection decisions with reasoning documented
- 15+ post-bet reviews completed with process scores
- All archetype tags assigned to settled bets
- All regime tags assigned to slates
Stage 2: STRUCTURED REVIEW (Bets 100–250)
Purpose: Begin pattern recognition. Score process quality systematically. Identify which archetypes have enough data to analyze.
Allowed:
- Continue all Stage 1 collection
- Run archetype performance reports (win rate and CLV, by archetype)
- Run timing analysis (CLV vs entry-time-before-kickoff)
- Run rejection quality analysis (were rejected bets correctly rejected?)
- Classify bets as good-process or bad-process, independent of outcome
- Identify the top 3 and bottom 3 archetypes by CLV
- Score review quality for each review

Forbidden:
- Changing any decision thresholds
- Trusting archetypes as validated
- Adjusting timing behavior
- Unlocking mastery beyond "Review"
- Weighting decisions by archetype performance

Data volume required: 250 total decisions, 150 settled bets

Graduation criteria:
- All Stage 1 criteria maintained
- 150+ settled bets with CLV
- 100+ post-bet reviews completed
- 50+ rejection reviews completed
- Archetype performance report run on a minimum of 5 archetypes with 10+ samples each
- CLV distribution analyzed (positive-CLV % identified)
- Timing analysis completed (optimal entry window identified with data)
Stage 3: CALIBRATION (Bets 250–500)
Purpose: Begin quantifying what the evidence actually says. Calculate confidence intervals. Identify what deserves trust.
Allowed:
- Calculate per-archetype CLV with confidence intervals
- Calculate per-market CLV
- Calculate per-regime performance
- Calculate model calibration (predicted probability vs actual frequency)
- Identify statistically significant patterns (minimum 30 samples per bucket)
- Flag archetypes for "emerging trust" status (positive CLV, 30+ samples, CLV > 0 at 70% confidence)
- Begin scoring timing mastery
- Begin scoring rejection mastery

Forbidden:
- Changing edge thresholds based on calibration alone
- Treating emerging patterns as validated
- Lowering the minimum edge for "promising" archetypes
- Adjusting staking for specific archetypes
- Any aggressive self-modification

Data volume required: 500 decisions, 300 settled bets

Graduation criteria:
- 300+ settled bets
- Model calibration report completed
- Per-archetype CLV calculated for all archetypes with 20+ samples
- Per-market CLV calculated
- Timing analysis shows identifiable optimal windows
- Rejection analysis shows measurable discipline quality
- Overall CLV positive across the full sample (even if barely)
- No evidence of systematic model failure in any major market
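The Stage 3 "emerging trust" flag (positive CLV, 30+ samples, CLV > 0 at 70% confidence) needs a concrete confidence test. A minimal sketch using a normal approximation; function names are illustrative, and a t-test or bootstrap would be more rigorous for small buckets:

```python
import math
import statistics

def clv_confidence_positive(clv_samples):
    """One-sided confidence that an archetype's true mean CLV exceeds zero,
    via a normal approximation. Returns None below the 30-sample minimum."""
    n = len(clv_samples)
    if n < 30:                       # Stage 3 minimum bucket size
        return None
    mean = statistics.mean(clv_samples)
    stdev = statistics.stdev(clv_samples)
    if stdev == 0:
        return 1.0 if mean > 0 else 0.0
    z = mean / (stdev / math.sqrt(n))
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # Phi(z)

def emerging_trust(clv_samples):
    """Positive mean CLV, 30+ samples, and CLV > 0 at 70% confidence."""
    conf = clv_confidence_positive(clv_samples)
    return conf is not None and conf >= 0.70 and statistics.mean(clv_samples) > 0
```

The 70% threshold is deliberately loose for "emerging" status; promotion to "validated" at Stage 4 should demand more samples, not a higher p-value alone.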
Stage 4: CONTROLLED TRUST (Bets 500–1000)
Purpose: Begin making controlled, evidence-backed adjustments. Small changes with monitoring.
Allowed:
- Promote archetypes with 50+ samples and consistent positive CLV to "validated" status
- Adjust the minimum edge threshold by ±1% for validated archetypes only
- Adjust timing rules based on 200+ data points showing clear optimal windows
- Flag "decaying" archetypes (previously positive, trending negative over the last 50 bets)
- Begin unlocking mastery tracks beyond "Calibration"
- Allow archetype-specific confidence adjustments (±5% only)

Forbidden:
- Wholesale threshold changes
- Trusting any archetype with fewer than 50 samples
- Ignoring CLV in favor of win rate
- Adjusting staking beyond ±0.5% of the Kelly fraction
- Treating any single month's performance as evidence

Data volume required: 1000 decisions, 600 settled bets

Graduation criteria:
- 600+ settled bets
- Positive CLV over a rolling 200-bet window
- Model calibration within 3% on all major probability buckets
- At least 3 validated archetypes
- Timing mastery score above threshold
- Rejection mastery score above threshold
- No systemic failure detected in any 100-bet window
Stage 5: MATURE INTELLIGENCE (1000+ Bets)
Purpose: The system has earned the right to adapt. Changes are still evidence-gated but the range of allowed adjustments expands.
Allowed:
- Full archetype trust system (validated, emerging, unvalidated, decaying, invalidated)
- Adaptive edge thresholds per archetype (range: 5%–12%)
- Timing-based entry rules (wait vs enter now)
- Regime-aware confidence adjustments
- Mastery-gated unlocks
- Active rejection as a scored skill
- Full review-driven doctrine updates
- Controlled experimental bets (5% of bankroll max, tagged as experimental)

Still forbidden:
- Ignoring CLV
- Outcome-only optimization
- Removing safety thresholds entirely
- Trusting any pattern with fewer than 30 samples
- Self-modifying without review evidence
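The stage gates above reduce to data. A sketch that checks only the quantitative volume requirements; the dictionary keys are hypothetical, and the qualitative criteria (reports run, tags assigned) still need human sign-off before graduating:

```python
# Volume requirements per stage, mirroring the "data volume required" lines.
# Key names are illustrative, not a fixed API.
STAGE_VOLUME_REQUIREMENTS = {
    1: {"decisions": 100, "settled_bets": 50, "rejections": 30, "reviews": 15},
    2: {"decisions": 250, "settled_bets": 150, "reviews": 100, "rejection_reviews": 50},
    3: {"decisions": 500, "settled_bets": 300},
    4: {"decisions": 1000, "settled_bets": 600},
}

def volume_requirements_met(stage: int, counts: dict) -> bool:
    """True when every volume requirement for graduating `stage` is met.
    Missing counts are treated as zero."""
    reqs = STAGE_VOLUME_REQUIREMENTS.get(stage, {})
    return all(counts.get(key, 0) >= needed for key, needed in reqs.items())
```

Keeping the gates as data rather than scattered `if` statements makes it trivial to audit what each stage actually requires.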
PHASE 2 — GAMIFIED KNOWLEDGE BASE ARCHITECTURE
Structure
1. Core Doctrine
Stores: The unchangeable rules: edge threshold minimums, Kelly cap, risk limits, market exclusions, no-accas rule.
Updated: Only by human operator decision, never by the system.
Used by: RISK, TRADER.
Training vs Mature: Identical in both modes.
2. Decision Doctrine
Stores: How decisions are made: verdict taxonomy (enter/wait/monitor/reject), evidence requirements, confidence scoring rubric, contradiction handling rules.
Updated: After Stage 3 calibration, via review evidence.
Used by: ANALYST, CLAUDE.
Training: Read-only.
Mature: Updatable with review evidence.
3. Timing Doctrine
Stores: When to enter: optimal entry windows by market type, CLV-vs-timing curves, wait-vs-enter rules, stale-edge definitions.
Updated: After Stage 3, with 200+ timing data points.
Used by: TRADER.
Training: Collect data only.
Mature: Active timing rules.
4. Rejection Doctrine
Stores: When and why to say no: rejection classes, pass/monitor/reject criteria, false-edge signatures, contradiction thresholds.
Updated: After Stage 2, with rejection review data.
Used by: ANALYST, RISK.
Training: Collect rejections and reasons.
Mature: Active rejection scoring.
5. Risk Doctrine
Stores: Exposure rules, stop-loss logic, defensive-mode triggers, max stake rules, correlation limits.
Updated: Only by human operator.
Used by: RISK.
Training vs Mature: Identical.
6. Review Doctrine
Stores: How reviews are conducted: process scoring rubric, outcome-bias prevention rules, good-loss/bad-win classification criteria.
Updated: Continuously refined based on review quality scores.
Used by: SETTLER.
Training: Active from Stage 1.
Mature: Full review engine.
7. Archetype Library
Stores: Recurring bet pattern definitions: signature conditions, trust level, evidence count, CLV, invalidation signs.
Updated: Via settled-bet classification and review.
Used by: ANALYST.
Training: Labels assigned, no trust granted.
Mature: Trust levels active.
8. Failure Pattern Library
Stores: Known failure modes: steam-chasing, false-positive signatures, model weak spots, timing traps.
Updated: Via post-bet review when the process score is low.
Used by: ANALYST, RISK.
Training: Collect examples.
Mature: Active warnings.
9. Regime Library
Stores: Slate-level conditions: high-volatility, quiet, public-bias, lineup-chaos, efficient-market.
Updated: Via slate-level tagging and performance analysis.
Used by: ANALYST, RISK.
Training: Tag regimes, no behavior changes.
Mature: Regime-aware adjustments.
10. League Intelligence
Stores: Per-league model accuracy, CLV patterns, market efficiency.
Updated: Via league-bucketed performance analysis.
Used by: ANALYST.
Training: Collect.
Mature: League-specific confidence adjustments.
11. Market Behavior Knowledge
Stores: Per-market-type performance: which markets have positive CLV? Which are efficient?
Updated: Via market-bucketed analysis.
Used by: ANALYST, TRADER.
Training: Collect.
Mature: Market-specific edge thresholds.
12. Calibration Layer
Stores: Model accuracy by probability bucket (predicted 60% → actual 58%? predicted 40% → actual 35%?).
Updated: After Stage 3, with 300+ bets.
Used by: ANALYST.
Training: Calculate but don't act.
Mature: Adjust model confidence.
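The bucket comparison behind the Calibration Layer can be sketched in a few lines. The `(model_prob, won)` pair format and the 0.05 bucket width are assumptions, not the production schema:

```python
from collections import defaultdict

def calibration_report(bets, bucket_width=0.05):
    """Bucket settled bets by model probability and compare predicted vs
    actual win frequency. `bets` is an iterable of (model_prob, won) pairs."""
    buckets = defaultdict(lambda: [0, 0])          # bucket floor -> [wins, total]
    for prob, won in bets:
        floor = round(int(prob / bucket_width) * bucket_width, 2)
        buckets[floor][1] += 1
        if won:
            buckets[floor][0] += 1
    return {
        floor: {"midpoint": round(floor + bucket_width / 2, 3),
                "actual": wins / total,
                "n": total}
        for floor, (wins, total) in sorted(buckets.items())
    }
```

Comparing each bucket's midpoint against its `actual` frequency gives exactly the "predicted 60% → actual 58%?" check described above; buckets with small `n` should be ignored until Stage 3 volumes are reached.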
PHASE 3 — GAMIFICATION MODEL
XP Categories
XP is earned for process quality, not outcomes. Each bet decision generates XP across multiple categories.
| Category | Earns XP | Loses XP |
|---|---|---|
| Edge Quality | Positive CLV | Negative CLV on "strong" confidence bets |
| Timing | Entry CLV > 0 (beat the closing line) | Entered after value eroded (CLV < -0.10) |
| Rejection | Correctly rejected bet that would have lost | False rejection of bet that would have won with good process |
| Discipline | Followed all doctrine rules | Violated stop-loss, exposure limits, or override rules |
| Review | Completed thorough review with process score ≥ 4 | Shallow review (score ≤ 2) or skipped review |
| Archetype | Correctly identified archetype that matches outcome pattern | Misidentified archetype |
| Regime | Correctly tagged regime and adjusted behavior | Ignored regime signals |
| Calibration | Confidence prediction within 5% of actual | Overconfidence (predicted 70%, actual < 50%) |
Anti-Results-Bias Rules
- A bad bet that wins (poor process, negative expected CLV, ignored contradictions) earns ZERO XP for the win and receives a process penalty.
- A good bet that loses (strong process, positive CLV, correct archetype, good timing) earns FULL process XP.
- A great pass (correctly rejected a losing bet) earns rejection XP equal to a winning bet's edge XP.
- XP from outcomes (win/loss) is weighted at 20%. XP from process metrics is weighted at 80%.
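The 80/20 split and the bad-win rule combine into a single scoring function. A sketch: the weights and the bad-win zeroing come from the rules above, but the raw XP values per process score and per outcome are illustrative assumptions:

```python
def bet_xp(process_score: int, won: bool) -> float:
    """Process-weighted XP for one settled bet. The 80/20 weighting and the
    bad-win rule follow the anti-results-bias rules; the per-score XP table
    and the flat outcome XP are placeholder values."""
    process_xp = {1: -10, 2: -5, 3: 3, 4: 10, 5: 15}[process_score]
    outcome_xp = 10 if won else 0
    if won and process_score <= 2:
        # Bad win: zero XP for the outcome; the process penalty still applies.
        return 0.8 * process_xp
    return 0.8 * process_xp + 0.2 * outcome_xp
```

Note that a good loss (score 5, lost) still earns more XP than a bad win (score 2, won), which is the whole point of the weighting.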
Training Period Restrictions
- During Stages 1-2: XP is tracked but no mastery levels unlock.
- During Stage 3: First mastery levels can unlock (Observation → Calibration rank).
- During Stages 4-5: Full mastery progression active.
PHASE 4 — MASTERY SYSTEM
Tracks
1. Edge Detection
Measures: Accuracy of edge identification. Are high-confidence bets actually higher-probability?
Levels: Observer → Detector → Evaluator → Expert → Master
Unlocks at Expert: Archetype-specific edge adjustment (±1%)
Locked until: 200 settled bets with confidence scores
2. Timing
Measures: CLV quality. Are entries beating the closing line?
Levels: Observer → Timer → Strategist → Expert → Master
Unlocks at Expert: Wait-vs-enter recommendations active
Locked until: 150 bets with CLV data
3. Rejection Discipline
Measures: Quality of "no" decisions. Rejection accuracy rate.
Levels: Observer → Filter → Guardian → Expert → Master
Unlocks at Expert: Auto-rejection of invalidated archetypes
Locked until: 100 rejection decisions reviewed
4. Risk Discipline
Measures: Compliance with risk rules. Stop-loss respect, exposure management.
Levels: Observer → Compliant → Disciplined → Expert → Master
Unlocks at Expert: Adaptive exposure limits based on regime
Locked until: 200 decisions with full risk-compliance tracking
5. Review Quality
Measures: Depth and accuracy of post-bet reviews.
Levels: Observer → Reviewer → Analyst → Expert → Master
Unlocks at Expert: Review-driven doctrine updates
Locked until: 100 reviews completed with quality scores
6. Archetype Recognition
Measures: Accuracy of archetype tagging. Do tagged archetypes perform as expected?
Levels: Observer → Tagger → Pattern Reader → Expert → Master
Unlocks at Expert: Archetype trust system active
Locked until: 50 bets per archetype for the top 5 archetypes
7. Regime Awareness
Measures: Accuracy of regime detection. Does the identified regime correlate with performance patterns?
Levels: Observer → Detector → Strategist → Expert → Master
Unlocks at Expert: Regime-aware confidence adjustment
Locked until: 30 slates per regime type
8. Confidence Calibration
Measures: Alignment between predicted probability and actual outcome frequency.
Levels: Observer → Estimator → Calibrator → Expert → Master
Unlocks at Expert: Model confidence bias correction
Locked until: 300 bets with calibration analysis
PHASE 5 — KNOWLEDGE OBJECTS
1. Decision Record
{
id: string,
timestamp: ISO,
fixture_id: number,
match: string,
verdict: "enter" | "wait" | "monitor" | "reject",
market: string,
model_prob: float,
market_odds: float,
edge_pct: float,
confidence: 1-10,
timing_score: 1-5,
archetype_tag: string,
regime_tag: string,
contradiction_count: int,
invalidation_triggers: string[],
supporting_evidence: string[],
contradictory_evidence: string[],
claude_verdict: "approve" | "reject" | null,
claude_confidence: 1-10 | null,
claude_reasoning: string | null,
risk_status: string,
execution_readiness: 1-5,
evidence_sufficiency: 1-5,
trust_status: "unvalidated" | "emerging" | "validated" | "decaying",
training_stage: 1-5,
created_by: "ANALYST"
}
Written: Every time ANALYST evaluates a match. Active: All stages.
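A minimal Python mirror of the schema above, covering a subset of fields with the range checks the pseudo-types imply. Field names follow the JSON keys; everything else (the class itself, the validation style) is an implementation sketch:

```python
from dataclasses import dataclass

VERDICTS = {"enter", "wait", "monitor", "reject"}

@dataclass
class DecisionRecord:
    """Subset of the Decision Record schema with basic range validation."""
    id: str
    verdict: str
    model_prob: float
    market_odds: float
    confidence: int        # 1-10 per the schema
    training_stage: int    # 1-5 per the schema

    def __post_init__(self):
        if self.verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict}")
        if not 1 <= self.confidence <= 10:
            raise ValueError("confidence must be 1-10")
        if not 1 <= self.training_stage <= 5:
            raise ValueError("training_stage must be 1-5")
```

Validating at construction time keeps bad records out of the evidence base, which matters when every later stage gate counts these rows.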
2. Review Record
{
id: string,
decision_id: string,
settled_at: ISO,
result: "WON" | "LOST",
pnl: float,
clv: float,
process_score: 1-5,
thesis_quality: 1-5,
evidence_quality: 1-5,
contradiction_handling: 1-5,
timing_quality: 1-5,
discipline_quality: 1-5,
archetype_fit: 1-5,
stake_appropriateness: 1-5,
classification: "good-win" | "good-loss" | "bad-win" | "bad-loss" | "great-pass" | "correct-wait" | "correct-reject" | "false-reject" | "missed-entry",
lessons: string[],
xp_awarded: { edge: int, timing: int, rejection: int, discipline: int, review: int, archetype: int },
mastery_impact: { track: string, delta: int }[],
reviewed_by: "SETTLER",
reviewed_at: ISO
}
Written: After every settled bet and every reviewed rejection. Active: All stages (Stage 1: simplified scores only).
3. Archetype Record
{
id: string,
name: string,
description: string,
signature_conditions: string[],
trust_status: "unvalidated" | "emerging" | "validated" | "decaying" | "invalidated",
sample_count: int,
win_rate: float,
avg_clv: float,
avg_edge: float,
positive_clv_pct: float,
ideal_timing: string,
invalidation_signs: string[],
risk_flags: string[],
last_updated: ISO,
last_performance_window: { wins: int, losses: int, clv: float, window_size: int },
trust_earned_at_sample: int | null,
trust_decay_trigger: string | null
}
Written: After classification during review. Updated on every new sample. Active: Labeling from Stage 1. Trust from Stage 4.
4. Regime Record
{
id: string,
name: string,
description: string,
detection_indicators: string[],
sample_slates: int,
avg_clv_during: float,
avg_win_rate_during: float,
confidence_adjustment: float, // e.g., -0.05 = reduce confidence by 5%
timing_adjustment: string,
rejection_threshold_adjustment: float,
trust_status: "unvalidated" | "emerging" | "validated",
last_updated: ISO
}
Active: Tagging from Stage 1. Adjustments from Stage 4.
5. Rejection Record
{
id: string,
decision_id: string,
reason_class: "fake-edge" | "stale-edge" | "model-weak" | "contradiction" | "bad-timing" | "low-liquidity" | "low-trust-archetype" | "weak-evidence" | "value-gone" | "regime-caution",
edge_at_rejection: float,
would_have_result: "WON" | "LOST" | "UNKNOWN",
quality_score: 1-5,
xp_awarded: int,
reviewed: boolean
}
Active: All stages.
6. Timing Record
{
id: string,
decision_id: string,
entry_time_before_ko: int, // minutes
odds_at_entry: float,
closing_odds: float,
clv: float,
timing_grade: "excellent" | "good" | "acceptable" | "late" | "stale",
market_type: string,
lesson: string | null
}
Active: All stages.
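The Timing Record's `clv` field needs a fixed formula. The document doesn't pin one down; the sketch below assumes the common convention of fractional odds advantage over the close, where positive means the entry beat the closing line:

```python
def closing_line_value(odds_at_entry: float, closing_odds: float) -> float:
    """CLV as fractional odds advantage over the close (an assumed
    convention): positive when entry odds were better than closing odds."""
    return odds_at_entry / closing_odds - 1
```

Whichever convention is chosen, it must be applied uniformly, since every trust threshold in this document (CLV > +0.03, grade bands, etc.) is expressed on that scale.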
PHASE 6 — DECISION INTELLIGENCE FRAMEWORK
Every candidate match produces a formal Decision Object:
DECISION:
verdict: enter | wait | monitor | reject
EDGE ASSESSMENT:
model_probability: 0.62
market_implied: 0.53
edge: 9.0%
confidence: 7/10
evidence_sufficiency: 4/5
TIMING ASSESSMENT:
timing_score: 4/5
entry_window: "optimal (2h pre-KO)"
line_movement: "stable"
stale_risk: "low"
CONTEXT:
archetype: "strong-home-value"
archetype_trust: "emerging (38 samples, CLV +0.04)"
regime: "normal"
contradiction_count: 1
contradictions: ["Away team unbeaten in last 5 away"]
invalidation_triggers: ["Key home striker injured", "Heavy rain forecast"]
RISK:
exposure_impact: "+3.00 (total: 18/800)"
risk_status: "ACTIVE"
stake: 3.00 (quarter Kelly)
REVIEW:
review_priority: "standard"
learning_value: "medium"
mastery_tracks: ["edge_detection", "archetype_recognition"]
TRUST:
trust_level: "emerging"
training_stage: 3
evidence_count: 38
behavior_unlocked: false
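The edge and stake figures in the object above follow from two small formulas. The edge definition matches the example (0.62 model vs 0.53 implied → 9.0%); the staking function is a quarter-Kelly sketch only, since the production rule also applies the Risk Doctrine's caps:

```python
def edge_pct(model_prob: float, market_odds: float) -> float:
    """Edge as model probability minus market-implied probability, in %."""
    return (model_prob - 1 / market_odds) * 100

def quarter_kelly_stake(bankroll: float, model_prob: float, market_odds: float) -> float:
    """Quarter-Kelly sizing sketch: f* = (b*p - q) / b with b = odds - 1,
    q = 1 - p, staked at one quarter of full Kelly. Never stakes on a
    non-positive Kelly fraction."""
    b = market_odds - 1
    fraction = (b * model_prob - (1 - model_prob)) / b
    return max(0.0, bankroll * fraction / 4)
```

Keeping both as pure functions makes the Decision Object reproducible: given the recorded `model_prob` and `market_odds`, the logged `edge_pct` and stake can be re-derived during review.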
PHASE 7 — BET ARCHETYPES
Archetype Definitions
strong-home-value
Signature: Home team with xG > 1.5, home odds exceed model fair odds by 8%+, strong home form (4W+ in last 6 home)
Why it exists: Markets sometimes undervalue consistent home performers in mid-table matchups
Trustworthy when: Home team not in a congested fixture schedule, no key absences, normal conditions
Dangerous when: Cup hangover, nothing-to-play-for end of season, key defender missing
Ideal timing: 1-3 hours before kickoff (lines stable)
Invalidation: Late team news significantly changing the expected lineup
Training start: Sample count 0. Trust: unvalidated.
Trust threshold: 40 samples, CLV > +0.03, positive CLV in 55%+ of bets
high-xG-totals-edge
Signature: Combined xG > 3.0, Over 2.5 odds > 1.60, both teams averaging 1.3+ goals
Why it exists: The Poisson model occasionally finds genuine over-pricing in total-goals markets
Trustworthy when: Both teams in attacking form, no weather concerns, open-play styles
Dangerous when: One team is a defensive specialist or playing for a draw
Ideal timing: Early (6h+ before KO), when lines haven't corrected
Invalidation: One team announces an ultra-defensive setup or a key attacker is out
inflated-away-price
Signature: Away team odds exceed the model's fair price by 10%+, away team in strong form
Why it exists: Markets historically overvalue home advantage for some team profiles
Trustworthy when: Away team is genuinely strong away from home (check away-specific form)
Dangerous when: Small sample of away games, hostile venue, derby
NOTE: The model is currently weak on away predictions. Extra caution; higher trust threshold.
Trust threshold: 60 samples (higher than standard due to the known model weakness)
draw-trap
Signature: Model shows an edge on the draw, both teams evenly matched, low xG
Why it exists: Draws are the hardest outcome to predict, and bookmakers price them conservatively.
Trustworthy when: Almost never with current model quality
Dangerous when: Always — the model is weakest here
Training behavior: Collect data but flag as high-risk. Do not trust until Stage 5 at minimum.
stale-edge-false-positive
Signature: An edge existed 6h+ ago, but odds have since moved toward the model's predicted probability
Why it exists: The market corrected; the edge is gone. This is a FAILURE pattern, not a bet type.
Action: Reject. Award rejection XP.
qualitative-veto-override
Signature: Model shows an edge but Claude (or a human) rejects on qualitative grounds
Why it exists: Injuries, motivation, and tactical mismatches the model can't see
Trust threshold: Track veto accuracy. If vetoes are correct 60%+ of the time, increase Claude's weight.
PHASE 8 — TIMING INTELLIGENCE
Timing States
| State | Definition | Action |
|---|---|---|
| Early Value | Edge identified 6h+ before KO, line stable | Enter if edge > 10%. Wait if 8-10%. |
| Optimal Window | 1-3h before KO, line stable, edge confirmed | Primary entry zone. |
| Late Confirmation | <1h before KO, line moved toward our position | Enter — market agrees. CLV likely positive. |
| Stale Edge | Edge existed earlier but odds moved against us | Do NOT enter. Value is gone. Reject. |
| Market Ahead | Line moved past our model's price | Pass. Market is smarter on this one. |
| In-Play Early | 15-20 min in, stats confirming thesis | Monitor. Enter only if live data strongly supports. |
| In-Play Late | 60+ min, reduced time value | Generally avoid. Exception: extreme mispricing. |
Timing Grades
| Grade | CLV Range | Meaning |
|---|---|---|
| A | CLV > +0.10 | Excellent timing — significantly beat closing line |
| B | CLV +0.03 to +0.10 | Good timing — beat the line |
| C | CLV -0.03 to +0.03 | Neutral — entered at about fair value |
| D | CLV -0.10 to -0.03 | Late — entered after partial value erosion |
| F | CLV < -0.10 | Stale entry — should have been rejected |
Timing XP
- Grade A: +15 XP
- Grade B: +10 XP
- Grade C: +3 XP
- Grade D: -5 XP
- Grade F: -15 XP
- Correct stale-edge rejection: +12 XP
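The grade bands and XP values above translate directly to code. One assumption: the table leaves the exact band boundaries ambiguous, so ties here resolve toward the better grade:

```python
TIMING_XP = {"A": 15, "B": 10, "C": 3, "D": -5, "F": -15}

def timing_grade(clv: float) -> str:
    """Grade bands from the Phase 8 table; boundary ties resolve upward."""
    if clv > 0.10:
        return "A"
    if clv >= 0.03:
        return "B"
    if clv > -0.03:
        return "C"
    if clv >= -0.10:
        return "D"
    return "F"

def timing_xp(clv: float, correct_stale_rejection: bool = False) -> int:
    """XP for a timed entry, or +12 for a correctly rejected stale edge."""
    return 12 if correct_stale_rejection else TIMING_XP[timing_grade(clv)]
```

Because grades derive mechanically from CLV, the Timing Record's `timing_grade` field is redundant but useful for querying; it should always be recomputable from the stored odds.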
PHASE 9 — REJECTION INTELLIGENCE
Rejection Classes
| Class | Trigger | Priority |
|---|---|---|
| Fake Edge | Model error, data anomaly, odds display error | Critical — log for model review |
| Stale Edge | Value eroded, odds moved | Standard — timing discipline |
| Model Weak | Away win spots, draw spots, known model limitations | Standard — discipline |
| Contradiction | 3+ contradicting factors vs model | Elevated — review carefully |
| Bad Timing | Too late, too early for confidence, in-play decay | Standard |
| Low Liquidity | Thin Betfair book, wide spreads | Standard |
| Low Trust Archetype | Archetype is unvalidated or decaying | Elevated — wait for evidence |
| Weak Evidence | Edge exists but supporting data is thin | Standard |
| Value Gone | Line moved past break-even | Critical — never enter |
| Regime Caution | Current regime is low-confidence period | Elevated |
Great Pass Recognition
A "great pass" is a rejection where:
1. The rejected bet would have lost, AND
2. The process for rejecting it was documented with a clear reason, AND
3. The rejection class was appropriate
Great passes earn edge-level XP (15-25 XP) and advance Rejection Discipline mastery.
PHASE 10 — REVIEW LOOP
Classification Matrix
| Process | Outcome | Classification | XP Impact |
|---|---|---|---|
| Strong process, documented | Won | Good Win | Full XP: edge + timing + process |
| Strong process, documented | Lost | Good Loss | Process XP preserved, edge XP neutral |
| Weak process, undocumented | Won | Bad Win | Zero XP, process penalty |
| Weak process, undocumented | Lost | Bad Loss | Process penalty, learning requirement |
| Strong rejection reasoning | Would have lost | Great Pass | Rejection XP = edge win XP |
| Correct wait decision | Better entry found later | Correct Wait | Timing XP |
| Rejected, would have won | But process was correct | Correct Reject | Neutral (variance) |
| Rejected, would have won | Process was wrong | False Reject | Small penalty, review required |
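The matrix above can be sketched as one function. Two assumptions: for rejections and waits, the "outcome" is the counterfactual would-have-result, and the weak-process/would-have-lost rejection cell is not in the matrix, so mapping it to `correct-reject` here is a placeholder choice:

```python
def classify(kind: str, strong_process: bool, won: bool) -> str:
    """Map process quality and outcome to a review classification.
    `kind` is 'bet', 'reject', or 'wait'; for rejections, `won` is the
    counterfactual result of the bet that was not placed."""
    if kind == "bet":
        if strong_process:
            return "good-win" if won else "good-loss"
        return "bad-win" if won else "bad-loss"
    if kind == "reject":
        if won:  # the rejected bet would have won
            return "correct-reject" if strong_process else "false-reject"
        return "great-pass" if strong_process else "correct-reject"
    return "correct-wait"
```

Note the asymmetry this encodes: a rejection that would have won with correct process is variance, not error, so it carries no penalty.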
Process Score Rubric (1-5)
| Score | Meaning |
|---|---|
| 5 | Perfect: clear thesis, evidence documented, contradictions addressed, timing optimal, stake appropriate, review thorough |
| 4 | Strong: minor gap in one area but overall disciplined |
| 3 | Adequate: followed basic doctrine but missed contextual factors |
| 2 | Weak: skipped contradiction check, poor timing, inadequate evidence |
| 1 | Failure: chased, ignored signals, violated doctrine |
What Review Updates
- Decision Record → marked as reviewed
- Archetype Record → sample count updated, CLV recalculated
- Timing Record → timing grade assigned
- Mastery Tracks → XP awarded per track
- Failure Library → new entry if process score ≤ 2
- Rejection quality → if this was a rejection that was reviewed
During Training (Stages 1-3): Review updates records but does NOT change any decision behavior.
During Mature (Stages 4-5): Review can trigger archetype trust changes, timing rule updates, and doctrine amendments.
PHASE 11 — REGIME AWARENESS
Regime Types
Normal
Indicators: Standard slate, 5-8 matches, no unusual external factors
Adjustments: None — baseline behavior
High Volatility
Indicators: Cup weeks, international breaks ending, early season
Adjustments: Reduce confidence by 5%, increase the minimum edge to 10%, reduce stakes by 20%
Public Bias
Indicators: Marquee matchups (El Clasico, derby days), heavy public betting
Adjustments: Weight contrarian positions more heavily, especially away and under markets
Lineup Chaos
Indicators: Multiple teams with 3+ changes, cup rotation suspected
Adjustments: Reduce confidence, increase the rejection rate, wait for confirmed lineups
Efficient Market
Indicators: CLV consistently near zero across the last 20 bets, no mispricing detected
Adjustments: Reduce activity, raise the minimum edge to 12%, focus on rejection quality
Strong Model Period
Indicators: CLV positive in 60%+ of the last 30 bets, tight model calibration
Adjustments: Moderate confidence boost (max +5%), slightly larger stakes allowed
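The regime adjustments above are easiest to keep honest as a lookup table. Numbers with an explicit value in the text are transcribed; where the text gives no number (e.g. the lineup-chaos confidence reduction) the value is a placeholder, and the 8% baseline minimum edge comes from the core model:

```python
# Field names and the lookup API are illustrative sketches.
REGIME_ADJUSTMENTS = {
    "normal":           {"confidence_delta":  0.00, "min_edge": 0.08, "stake_mult": 1.0},
    "high-volatility":  {"confidence_delta": -0.05, "min_edge": 0.10, "stake_mult": 0.8},
    "lineup-chaos":     {"confidence_delta": -0.05, "min_edge": 0.08, "stake_mult": 1.0},  # delta assumed
    "efficient-market": {"confidence_delta":  0.00, "min_edge": 0.12, "stake_mult": 1.0},
    "strong-model":     {"confidence_delta":  0.05, "min_edge": 0.08, "stake_mult": 1.0},
}

def adjusted_min_edge(regime: str) -> float:
    """Minimum edge for the tagged regime; unknown or untagged regimes
    fall back to baseline behavior."""
    return REGIME_ADJUSTMENTS.get(regime, REGIME_ADJUSTMENTS["normal"])["min_edge"]
```

Falling back to "normal" for unrecognised tags keeps a mis-tagged slate at baseline discipline rather than silently loosening it.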
PHASE 12 — HUD / MOBILE SURFACING
Match Card (Live Screen)
Barcelona v Espanyol 2-1 67'
La Liga · CONTROL
Edge 9.3% · Confidence 7 · Timing: OPTIMAL
Archetype: strong-home-value (emerging, 38 samples)
Regime: normal
Contradictions: 1 · Evidence: 4/5
Trust: EMERGING · Stage: 3
Portfolio Position Card
B Barcelona v Espanyol · Over 2.5 @1.95
Edge 9.3% · CLV +0.08 · Timing A
Process: 4/5 · Archetype: high-xG-totals
Trust: emerging · Review: pending
Profile — Mastery Dashboard
MASTERY TRACKS
Edge Detection ████████░░ Expert (Level 4)
Timing ██████░░░░ Strategist (Level 3)
Rejection █████████░ Expert (Level 4)
Risk Discipline ██████████ Master (Level 5)
Review Quality ███████░░░ Analyst (Level 3)
Archetypes ████░░░░░░ Pattern Reader (Level 3)
TRAINING STATUS: Stage 3 — Calibration
Bets: 287/500 to Stage 4
CLV+: 54% (target: 55%+)
Reviews: 142 completed
PHASE 13 — IMPLEMENTATION ROADMAP
Step 1: Decision Records (Week 1)
Add decision_records table to sportai.db. Record every ANALYST evaluation with verdict, confidence, archetype tag, regime tag, contradiction count. This is the foundation. Everything else builds on having structured decisions.
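A sketch of that table, covering a subset of the Phase 5 Decision Record columns. The column list is illustrative rather than exhaustive; names mirror the JSON keys:

```python
import sqlite3

DECISION_RECORDS_DDL = """
CREATE TABLE IF NOT EXISTS decision_records (
    id                  TEXT PRIMARY KEY,
    timestamp           TEXT NOT NULL,
    fixture_id          INTEGER,
    verdict             TEXT CHECK (verdict IN ('enter','wait','monitor','reject')),
    model_prob          REAL,
    market_odds         REAL,
    edge_pct            REAL,
    confidence          INTEGER,
    archetype_tag       TEXT,
    regime_tag          TEXT,
    contradiction_count INTEGER,
    training_stage      INTEGER
);
"""

def init_decision_records(path: str = "sportai.db") -> sqlite3.Connection:
    """Create the decision_records table if it doesn't exist yet."""
    con = sqlite3.connect(path)
    con.execute(DECISION_RECORDS_DDL)
    con.commit()
    return con
```

The CHECK constraint on `verdict` pushes the schema's enum into the database itself, so malformed rows fail at insert time rather than polluting later stage-gate counts.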
Step 2: Review Records (Week 2)
Add review_records table. After each settled bet, create a review with process score, classification (good-win / bad-win / good-loss / bad-loss), CLV, timing grade. Start with simplified scoring (1-5 overall).
Step 3: Rejection Records (Week 2)
Add rejection_records table. When ANALYST or RISK rejects a bet, log the reason class and the edge that was rejected. Track the would-have-result after settlement for rejection quality scoring.
Step 4: Timing Records (Week 3)
Add timing_records table. For every placed bet, record time-before-kickoff, entry odds, closing odds, CLV. Calculate timing grades.
Step 5: Archetype Tags (Week 3)
Define the initial 10 archetypes (from Phase 7). Add archetype tagging to the auto-bet pipeline. Every bet gets an archetype label based on signature conditions.
Step 6: Process-Weighted XP (Week 4)
Replace the current XP system (win = 15-25, loss = 2) with the process-weighted system from Phase 3. XP = 80% process + 20% outcome.
Step 7: Stage Tracking (Week 4)
Add training_stage to bankroll.json. Start at Stage 1. Track progress toward Stage 2 graduation criteria.
Step 8: Mastery Tracks (Week 5)
Add mastery track scores to bankroll.json. Start all tracks at "Observer" level. Track XP per track. Display in Profile screen.
Step 9: Regime Tagging (Week 6)
Add regime detection to the analysis pipeline. Tag each slate. Start collecting regime performance data.
Step 10: Calibration Reports (Week 8+)
After 200+ bets, run first calibration analysis. Model probability vs actual frequency. Per-archetype CLV. Per-market CLV. Timing analysis.
What Remains Locked
| Feature | Unlocks At |
|---|---|
| Archetype trust adjustments | Stage 4 (500+ bets) |
| Edge threshold changes | Stage 4 (500+ bets) |
| Timing-based entry rules | Stage 3 (250+ bets with timing data) |
| Regime-aware adjustments | Stage 4 (500+ bets) |
| Experimental bets | Stage 5 (1000+ bets) |
| Adaptive Kelly | Stage 5 (1000+ bets) |
| Auto-rejection of invalidated archetypes | Stage 4 (Rejection mastery: Expert) |
How to Validate
The intelligence system is working if:
1. CLV improves over rolling 100-bet windows
2. Rejection accuracy improves (rejected bets lose at a higher rate than entered bets)
3. Timing grades improve (more A/B, fewer D/F)
4. Process scores improve (average trending toward 4+)
5. Great-pass count increases
6. Bad-win count decreases (fewer undisciplined wins)
7. Archetype tagging becomes predictive (tagged archetypes perform as expected)
8. Model calibration improves (predicted probabilities match actual frequencies more closely)
If these metrics are NOT improving after 500 bets, the intelligence system needs human review and potential model redesign — not more automation.
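A coarse sketch of validation check #1, comparing the most recent 100-bet window against the one before it. A trend test over several windows would be more robust than this two-window comparison:

```python
def rolling_clv_improving(clv_series, window=100):
    """Compare mean CLV of the latest window against the previous window.
    Returns None until two full windows of settled bets exist."""
    if len(clv_series) < 2 * window:
        return None
    prev = sum(clv_series[-2 * window:-window]) / window
    last = sum(clv_series[-window:]) / window
    return last > prev
```

Returning None on insufficient data matters here: an "improving" verdict from a partial window is exactly the small-sample self-modification the core principle forbids.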
This document is the brain, memory, progression ladder, and calibration engine for Sticks Ultimate V6. Intelligence must be earned.