Zheyuan Yu · Project notes

2026 · Python · private repository

quantnews

Turning news and filings into ranked, risk-gated equity trade ideas.

  • 8 yrs of SEC Form 4 history (2018–2026)
  • ~5,000 insider events tracked
  • ~14.5k events, walk-forward calibrated
  • 5 signal sources, S&P 100

The problem

Most market-moving information arrives as unstructured text: news headlines, SEC filings, earnings calls. I wanted a system that reads that stream, turns it into ranked trade ideas with explicit risk controls, and then tracks which ideas were taken and how they resolved, so the loop from signal to review is closed rather than anecdotal.

How it works

A daily pipeline collects RSS, SEC EDGAR 8-K items and Form 4 insider transactions, earnings-call transcripts, and analyst revisions into one store. Each event is scored by an LLM, and the raw conviction is calibrated against empirical hit rates with Bayesian shrinkage before anything is ranked.

Ranking is more than a sort on conviction. Sizing is gated by market regime (SPY and QQQ breadth) and by risk-off rules: an earnings window or a cluster of insider selling stands a position down rather than trusting a stale signal. Ideas that turn into trades become tracked theses, each with an entry, a stop, a target, and a written review on close.
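The gating described above can be sketched roughly as follows. This is a minimal illustration, not the private implementation: the `Idea` fields, the breadth thresholds, the three-day earnings lockout, the three-sale cluster rule, and the 1% base risk are all hypothetical values chosen for the example, and breadth is assumed to be a fraction between 0 and 1.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# All constants below are illustrative assumptions, not values from quantnews.
EARNINGS_LOCKOUT_DAYS = 3   # stand down this many days before earnings
INSIDER_SELL_CLUSTER = 3    # this many recent insider sales counts as a cluster
BASE_RISK = 0.01            # fraction of the book risked at full conviction


@dataclass
class Idea:
    ticker: str
    calibrated_hit_rate: float            # output of the calibration layer, 0..1
    next_earnings: Optional[date] = None
    insider_sells_30d: int = 0


def regime_multiplier(spy_breadth: float, qqq_breadth: float) -> float:
    """Scale exposure with average breadth (assumed to lie in [0, 1])."""
    breadth = (spy_breadth + qqq_breadth) / 2
    if breadth < 0.3:
        return 0.25
    if breadth < 0.5:
        return 0.5
    return 1.0


def locked_out(idea: Idea, today: date) -> bool:
    """Risk-off rules: an earnings window or an insider-selling cluster."""
    if idea.next_earnings is not None:
        days_to_earnings = (idea.next_earnings - today).days
        if 0 <= days_to_earnings <= EARNINGS_LOCKOUT_DAYS:
            return True
    return idea.insider_sells_30d >= INSIDER_SELL_CLUSTER


def position_size(idea: Idea, today: date,
                  spy_breadth: float, qqq_breadth: float) -> float:
    """Return the fraction of the book to risk; zero when a lockout applies."""
    if locked_out(idea, today):
        return 0.0
    # Edge over a coin flip, floored at zero so weak ideas get no size.
    edge = max(0.0, 2 * idea.calibrated_hit_rate - 1)
    return BASE_RISK * edge * regime_multiplier(spy_breadth, qqq_breadth)
```

The lockout is checked before any sizing, so a risk-off rule overrides conviction outright instead of merely shrinking the position.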

From headline to sized position
  • RSS
  • SEC 8-K
  • Form 4
  • Transcripts
  • Analyst revisions
  1. LLM scoring: one signal per event
  2. Calibration: conviction → empirical hit rate
  3. Conviction stack: regime gate · lockouts · nudges
  4. Ranked, risk-sized ideas: entry / stop / target
  5. Tracked theses: recorded, then reviewed on close

  ↺ Reviews feed back into calibration, closing the loop.
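As an illustration of the last step, a tracked thesis could be modeled as a small record that refuses to close without a written review. The `Thesis` class, its field names, and the reward-to-risk helper are hypothetical and only show the shape of the data (entry, stop, target, review on close) described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Thesis:
    """Hypothetical record for one taken trade idea."""
    ticker: str
    direction: int                 # +1 for long, -1 for short
    entry: float
    stop: float
    target: float
    exit_price: Optional[float] = None
    review: str = ""

    def reward_to_risk(self) -> float:
        """Planned reward divided by planned risk, both measured from entry."""
        risk = (self.entry - self.stop) * self.direction
        reward = (self.target - self.entry) * self.direction
        if risk <= 0:
            raise ValueError("stop must sit on the losing side of entry")
        return reward / risk

    def close(self, exit_price: float, review: str) -> None:
        """Close the thesis; a written review is mandatory."""
        if not review.strip():
            raise ValueError("a written review is required on close")
        self.exit_price = exit_price
        self.review = review

    def hit(self) -> Optional[bool]:
        """True or False once closed; None while the thesis is still open."""
        if self.exit_price is None:
            return None
        return (self.exit_price - self.entry) * self.direction > 0
```

The boolean from `hit()` is the kind of outcome that could be fed back into the calibration counts once a thesis resolves.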

Worked example. A news event scores 0.78 raw conviction. Rather than trust that number, the system looks up the empirical hit rate for that event type and conviction bucket from history, sizes the position on the calibrated figure instead, and lets the regime gate scale it with market breadth.

Conviction you can size on

The point of the calibration layer is that a raw model score is not a probability. quantnews maps each (event type, direction, conviction bucket) to an empirical hit rate, shrunk toward the base rate so thin samples can’t shout. Position size follows the calibrated number, not the headline, and the regime gate scales the whole book up or down with market breadth. Conviction is earned from data, then sized by risk.
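One common way to implement this kind of shrinkage is a pseudo-count estimate: the bucket's hits are blended with `prior_strength` imaginary observations at the base rate. The sketch below assumes that form, assumes the base rate is the pooled hit rate across all recorded events, and uses ten equal-width conviction buckets. The `Calibrator` class and every parameter name are hypothetical, and the actual shrinkage scheme in quantnews may differ.

```python
from collections import defaultdict


def bucket(conviction: float, n_buckets: int = 10) -> int:
    """Map a raw conviction in [0, 1] to an equal-width bucket index."""
    return min(int(conviction * n_buckets), n_buckets - 1)


class Calibrator:
    """Empirical hit rates per (event type, direction, bucket), shrunk to a base rate."""

    def __init__(self, prior_strength: float = 20.0):
        # prior_strength is the number of pseudo-observations at the base rate.
        self.prior_strength = prior_strength
        self.counts = defaultdict(lambda: [0, 0])  # key -> [hits, total]
        self.total_hits = 0
        self.total_events = 0

    def record(self, event_type: str, direction: str,
               conviction: float, hit: bool) -> None:
        key = (event_type, direction, bucket(conviction))
        self.counts[key][0] += int(hit)
        self.counts[key][1] += 1
        self.total_hits += int(hit)
        self.total_events += 1

    def base_rate(self) -> float:
        # Fall back to a coin flip before any history exists.
        if self.total_events == 0:
            return 0.5
        return self.total_hits / self.total_events

    def hit_rate(self, event_type: str, direction: str, conviction: float) -> float:
        key = (event_type, direction, bucket(conviction))
        hits, total = self.counts.get(key, (0, 0))
        k = self.prior_strength
        return (hits + k * self.base_rate()) / (total + k)
```

With only a handful of observations in a bucket, the estimate stays close to the base rate; as the sample grows, the bucket's own hit rate dominates. A lookup for the 0.78 news event from the worked example would be `cal.hit_rate("news", "long", 0.78)`, where the direction label is an assumed input.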

Validated before it ships

A signal earns its place through event-study backtests and walk-forward calibration, measured on a broad S&P 100 sample and across pre-COVID, COVID, and post-COVID regimes, rather than on a handful of cherry-picked names. The same machinery runs as a live monitor, so a signal that decays gets caught and down-weighted automatically.
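Walk-forward calibration means each test period is scored only with statistics fitted on earlier events. A minimal sketch of the split logic, assuming calendar-year test windows and an expanding training set (both are assumptions; the actual window scheme is not stated here):

```python
from datetime import date
from typing import Iterator, List, Sequence, Tuple


def walk_forward_splits(
    event_dates: Sequence[date],
    first_test_year: int,
    last_test_year: int,
) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (test_year, train_indices, test_indices) with no look-ahead.

    Training uses every event strictly before the test year; testing uses
    only events that fall inside the test year.
    """
    for year in range(first_test_year, last_test_year + 1):
        train = [i for i, d in enumerate(event_dates) if d.year < year]
        test = [i for i, d in enumerate(event_dates) if d.year == year]
        if train and test:  # skip years with nothing to fit or nothing to score
            yield year, train, test
```

Fitting the calibration table on `train` and scoring it on `test` for each year gives an out-of-sample hit rate per period, and the per-year results can then be grouped by regime.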

The code is private (it’s a personal research system). Happy to give a deeper walkthrough on request.

Get in touch →