Methodology — orbyd

The pipeline, in one paragraph

Each trading morning, a five-stage pipeline ingests 30 days of news and four quarters of earnings transcripts for held positions, watchlist names, and the top-100 momentum universe. A frontier language model synthesises a strategic grade per candidate, then reasons across all candidates side-by-side under sizing and risk constraints. The output is a dossier per ticker, a regime-tagged journal entry, a macro view, sector rotation, and a curated watchlist. The reasoning is what gets published.

The five stages

Liquidity screen. ~400 US-listed names filtered by spread, average daily volume (ADV), market cap, and tradable shape. Deterministic and cheap; eliminates ~75% of the universe before the LLM touches it.
Momentum + narrative scoring. Survivors ranked by a composite of price structure × volume × news density × theme-cluster strength. Themes are treated as primary, not afterthoughts — the system reads narrative basket behaviour, not isolated ticker action.
The model's quality read. Claude Opus reads news, earnings transcripts, and filings for each candidate and produces an internal quality grade, ranked from strongest to weakest, along with the dossier you see on this site: thesis, invalidation, bull case, bear case, setup, catalysts, what-would-change-our-mind, correlations.
Portfolio composition. Opus reasons across all candidates side-by-side using the 1M-token context — every dossier in one pass — then proposes a portfolio under sizing caps, archetype rules, regime tiers, and concentration constraints.
How the model learns from closed trades. Every closed trade triggers a review. Recurring patterns are promoted to a playbook, weightings update weekly from 90-day outcomes, and the review notes feed back into the methodology.
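
As a rough illustration of the first two stages, the sketch below applies a deterministic liquidity filter and then ranks the survivors by a multiplicative composite. The field names, the thresholds, and the assumption that each factor arrives pre-normalised to the range 0 to 1 are all hypothetical; the actual cut-offs and factor definitions are not published here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    ticker: str
    spread_bps: float       # quoted bid/ask spread in basis points
    adv_usd: float          # average daily dollar volume
    market_cap_usd: float
    # Factor scores, assumed pre-normalised to [0, 1] (hypothetical).
    price_structure: float
    volume: float
    news_density: float
    theme_strength: float


# Hypothetical thresholds, for illustration only.
MAX_SPREAD_BPS = 20.0
MIN_ADV_USD = 20e6
MIN_MARKET_CAP_USD = 1e9


def passes_liquidity_screen(c: Candidate) -> bool:
    """Stage 1: a deterministic filter with no model call involved."""
    return (
        c.spread_bps <= MAX_SPREAD_BPS
        and c.adv_usd >= MIN_ADV_USD
        and c.market_cap_usd >= MIN_MARKET_CAP_USD
    )


def composite_score(c: Candidate) -> float:
    """Stage 2: product of the four factors, so one weak factor drags the whole score down."""
    return c.price_structure * c.volume * c.news_density * c.theme_strength


def rank_survivors(universe: list[Candidate]) -> list[Candidate]:
    """Drop names that fail the screen, then sort the rest best-first."""
    survivors = [c for c in universe if passes_liquidity_screen(c)]
    return sorted(survivors, key=composite_score, reverse=True)
```

A product rather than a weighted sum mirrors the "×" notation in the stage description: a name with strong price structure but no theme support scores near zero instead of averaging out.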

The daily schedule

Trading windows are computed off the US market calendar in New-York-relative minutes, so they stay DST-correct year-round, including the roughly four weeks a year (split across spring and autumn) when US and EU clocks change on different dates and naive Berlin-time schedules silently fail. Daily windows:

Premarket — universe scan + overnight-gap check on held names.
Decision window — 20 min pre-close, the main entry/exit pass.
Midday + opportunistic — 11:30 + 14:00 ET, fire only on material new info.
Post-close — reconcile the day's record, EOD journal, regime confirmation.
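
One way to anchor a window to New York time is sketched below using Python's standard zoneinfo module. It assumes a regular 16:00 New York close and ignores holidays and early closes, which a real market calendar would supply; the function names are illustrative.

```python
from datetime import date, datetime, time, timedelta
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")
BERLIN = ZoneInfo("Europe/Berlin")

# Assumption: a regular session closing at 16:00 New York time.
# Holidays and early closes would need a real market calendar.
REGULAR_CLOSE = time(16, 0)


def decision_window_start(session: date, minutes_before_close: int = 20) -> datetime:
    """Return the decision-window start as a timezone-aware New York datetime."""
    close = datetime.combine(session, REGULAR_CLOSE, tzinfo=NEW_YORK)
    return close - timedelta(minutes=minutes_before_close)


def as_berlin(moment: datetime) -> datetime:
    """Convert an aware datetime to Berlin wall-clock time."""
    return moment.astimezone(BERLIN)


if __name__ == "__main__":
    # One date inside the spring DST gap, one date outside it.
    for session in (date(2024, 3, 15), date(2024, 4, 15)):
        start = decision_window_start(session)
        print(session, start.strftime("%H:%M"), "New York ->",
              as_berlin(start).strftime("%H:%M"), "Berlin")
```

On 15 March 2024 the US had already moved to daylight time while the EU had not, so 15:40 in New York falls at 20:40 in Berlin; a month later the same window falls at 21:40. A schedule hard-coded in Berlin time would be an hour off during the gap.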

What's in a dossier

Each dossier follows the same skeleton — designed so a reader can audit the reasoning, not just the conclusion:

Current thesis — one paragraph, what the bet is.
Invalidation trigger — the explicit kill criterion.
Bull case — sourced bullets, dated, no hand-waving.
Bear case — same construction, equal weight.
Setup & price structure — MAs, RSI, levels, basing pattern.
Catalyst calendar — next 30 days, dated.
What would change our mind — explicit conditions for higher / lower conviction.
Correlation notes — how the name moves with its basket.
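
For readers who want to consume dossiers programmatically, the skeleton above could be modelled roughly as follows. This is a hypothetical shape, not the site's actual schema: every class and field name is an assumption derived from the section titles.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SourcedPoint:
    claim: str
    source: str
    as_of: date


@dataclass
class Dossier:
    ticker: str
    thesis: str
    invalidation_trigger: str
    bull_case: list[SourcedPoint] = field(default_factory=list)
    bear_case: list[SourcedPoint] = field(default_factory=list)
    setup: str = ""
    catalysts: dict[date, str] = field(default_factory=dict)  # dated, next 30 days
    change_our_mind: list[str] = field(default_factory=list)
    correlation_notes: str = ""

    def missing_sections(self) -> list[str]:
        """Names of sections that are still empty, so gaps are visible at a glance."""
        return [name for name, value in vars(self).items() if not value]
```

A check like missing_sections() reflects the audit intent: a dossier with a thesis but no bear case or invalidation trigger is incomplete by construction.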

Archetype taxonomy

Every name is tagged with an archetype that drives sizing discipline. Archetypes are not labels; they are behaviour profiles that the sizing rules read. Full definitions are in the glossary.

Compounder. Quality balance sheet, secular tailwind, multi-year hold candidate.
Cyclical recovery. Mean-reverting earnings, regime-sensitive.
Theme leader. Highest-conviction name within an active narrative.
Special situation. M&A, spin-off, restructuring, regulatory event.
Earnings inflection. Pre/post-print setup with explicit binary.
Retail squeeze. High-beta, short-interest-driven, hard sizing cap.
Defensive. Cash-flow durability, low-beta, regime hedge.
Macro hedge. Cross-asset proxy for thematic risk (XLE / GLD / TLT / …).

Regime classification

Each journal entry records the system's regime call. Regimes are not picks — they're a filter that gates how aggressively the system reads candidate setups. The macro view is the long-form version of the same read.

Published regime labels: RISK-ON, CHOPPY, RISK-OFF, a binary-event stagflation scare, an escalating stagflation scare, a healthy-but-unconfirmed recovery, and variants. Each is a defined rule that maps to buy-threshold, size-multiplier, max-exposure, and cash-floor settings.
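
The mapping from regime label to settings might look something like the sketch below. The four setting names come from the paragraph above; every number is a placeholder invented for illustration, and the interpretation of buy-threshold as a minimum candidate score is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegimeSettings:
    buy_threshold: float    # assumed: minimum candidate score needed to open a position
    size_multiplier: float  # scales the base position size
    max_exposure: float     # cap on the invested fraction of the book
    cash_floor: float       # minimum cash fraction to keep


# Placeholder numbers for illustration only; these are not the published rules.
REGIMES = {
    "RISK-ON": RegimeSettings(0.60, 1.00, 0.90, 0.10),
    "CHOPPY": RegimeSettings(0.70, 0.60, 0.60, 0.40),
    "RISK-OFF": RegimeSettings(0.85, 0.30, 0.30, 0.70),
}


def gated_size(regime: str, score: float, base_size: float, current_exposure: float) -> float:
    """Return the position size the regime allows, or 0.0 if the setup is filtered out."""
    s = REGIMES[regime]
    if score < s.buy_threshold:
        return 0.0
    # Remaining room under both the exposure cap and the cash floor.
    ceiling = min(s.max_exposure, 1.0 - s.cash_floor)
    room = max(0.0, ceiling - current_exposure)
    return min(base_size * s.size_multiplier, room)
```

The same candidate score can therefore produce a full-size entry in one regime and no entry at all in another, which is the sense in which a regime is a filter rather than a pick.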

Conviction levels

Conviction is the model's calibrated confidence that the setup will play out, not a price target or return forecast. Four levels: SUPREME, HIGH, MEDIUM, LOW. Each calibrates sizing and stop discipline, and each carries an explicit invalidation trigger that strips the conviction if breached.

How outcomes are scored

Open methodology applies to the scoring too — here is the exact, reproducible method behind the track record. Every thesis ships a falsifiable invalidation trigger. When it resolves, the pipeline marks it "played out" or "invalidated", dated, with a flag for whether the published trigger fired first. Strictly non-monetary — outcomes are binary: the claim held, or it was falsified.

Each conviction tier is treated as a probabilistic claim, published in advance, that the thesis plays out: SUPREME = 0.90, HIGH = 0.75, MEDIUM = 0.60, LOW = 0.50. Names the model held no conviction on are listed in the resolved ledger for transparency but are not scored — you can only be graded on a call you actually made. Against the binary outcome (played-out = 1, invalidated = 0) we compute:

Brier score — the mean squared error between the stated probability and the outcome. 0 is perfect, 0.25 is a coin flip, 1 is maximally wrong. (Reference: expert Superforecasters ≈ 0.08, the best LLMs ≈ 0.10.)
Murphy decomposition — Brier = calibration − resolution + uncertainty. Calibration (reliability) is how far each tier's observed play-out rate sits from its stated probability; resolution is how much the tiers separate from the base rate; uncertainty is the irreducible base-rate variance.
Brier skill score — skill versus a naive always-the-base-rate forecast (1 − Brier ∕ uncertainty).
Reliability by tier, archetype, and regime — the observed play-out rate broken out by conviction tier, by archetype, and by the macro regime in force when each thesis resolved. This is the falsifiable test of whether SUPREME actually beats LOW.
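
The metrics above can be reproduced with a few lines of Python. The tier probabilities are the published ones; the function name, the input shape (a list of tier and outcome pairs), and the sample ledger in the note below are assumptions for illustration.

```python
from collections import defaultdict

# Tier probabilities as published in the text above.
TIER_PROB = {"SUPREME": 0.90, "HIGH": 0.75, "MEDIUM": 0.60, "LOW": 0.50}


def score_ledger(resolved: list[tuple[str, int]]) -> dict[str, float]:
    """Score (tier, outcome) pairs, where outcome 1 = played out and 0 = invalidated."""
    if not resolved:
        raise ValueError("nothing to score until at least one thesis has resolved")

    n = len(resolved)
    base_rate = sum(outcome for _, outcome in resolved) / n
    brier = sum((TIER_PROB[tier] - outcome) ** 2 for tier, outcome in resolved) / n

    # Group outcomes by tier; each tier carries one stated probability.
    by_tier = defaultdict(list)
    for tier, outcome in resolved:
        by_tier[tier].append(outcome)

    calibration = 0.0
    resolution = 0.0
    for tier, outcomes in by_tier.items():
        observed = sum(outcomes) / len(outcomes)
        weight = len(outcomes) / n
        calibration += weight * (TIER_PROB[tier] - observed) ** 2
        resolution += weight * (observed - base_rate) ** 2

    uncertainty = base_rate * (1.0 - base_rate)
    # Skill is undefined when every thesis resolved the same way.
    skill = 1.0 - brier / uncertainty if uncertainty > 0 else float("nan")

    return {
        "brier": brier,
        "calibration": calibration,
        "resolution": resolution,
        "uncertainty": uncertainty,
        "skill": skill,
    }
```

For a made-up ledger of ten SUPREME calls with nine played out and ten LOW calls with five played out, this gives a Brier score of 0.17, calibration 0.00, resolution 0.04, uncertainty 0.21, and a skill score of about 0.19. Because each tier maps to a single probability, calibration minus resolution plus uncertainty reassembles the Brier score exactly.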

The board is hidden until at least one thesis has resolved (no faked scorecard), and the full record is machine-readable at /track-record.json for agents and independent verification.

Where the model is wrong

Three classes of failure are recurring and worth naming:

Stale facts. A model snapshot of a balance sheet can lag the latest 10-Q. Flagged in dossier notes when caught — not always caught.
Confident-but-wrong setup reads. A "clean higher-low" can become a failed reclaim within hours. Dossiers age fast; recency-of-write is on every page.
Theme misclassification. A name gets bucketed in a basket whose actual price driver is different; the correlation logic then over-fits.

This is why nothing here is a recommendation. The dossier is the reasoning behind every call.

Open by default

Every thesis, the names we hold and the ones we're watching, and every regime, macro and sector call — published the day it's made, dated, and scored against the trigger that would prove it wrong. The methodology is open, and so is the reasoning behind every call on the book.

Common questions

What is orbyd?

orbyd is a continuous market-intelligence layer built on frontier language models. A multi-stage pipeline reads US equity news, earnings transcripts, and price structure every trading day, then publishes per-ticker dossiers, regime-tagged journal entries, a weekly macro view, sector rotation, and a curated watchlist.

Which language models power the pipeline?

Anthropic's Claude Opus and Claude Sonnet. The portfolio composition stage uses Opus's 1M-token context window to compare hundreds of candidates side-by-side in a single pass.

What is an archetype?

Archetypes are behaviour profiles assigned to each name (Compounder, Cyclical recovery, Theme leader, Special situation, Earnings inflection, Retail squeeze, Defensive, Macro hedge). They drive sizing discipline and stop logic. See the glossary for full definitions.

What is an invalidation trigger?

The explicit kill criterion published with every dossier. If the trigger fires, the conviction is stripped and the thesis is treated as broken. It's a published commitment, not a soft warning.

How often is the site updated?

Daily for the journal and dossiers; weekly for the macro view and sector rotation. Each surface carries its own dateModified meta and an updated-at line. Subscribe via JSON Feed or RSS.

Is this investment advice?

No. We publish educational content under the BaFin and EU regulatory framework. No personalised advice is given and no orders are accepted.