Merge pull request 'feat: detect a hard-crashed diagnostic + analyze the crash boot — 0.15.0' (#10) from feat/m6-steam-detection into main

Reviewed-on: #10
feat: detect a hard-crashed diagnostic + analyze the crash boot — 0.15.0
2026-05-22 06:53:13 +00:00 · 2026-05-22 08:52:59 +02:00 · 2026-05-22 06:51:06 +00:00 · 2026-05-22 08:45:20 +02:00 · 2026-05-22 06:43:57 +00:00 · 2026-05-22 08:40:50 +02:00
16 changed files with 912 additions and 29 deletions
@@ -5,6 +5,58 @@ All notable changes to RigDoctor are recorded here. Format follows
 (`MAJOR.MINOR.PATCH`, pre-1.0). `__version__` and `pyproject.toml` must match the git
 release tag (so the auto-updater, D18, can compare versions).

+## [0.15.0] - 2026-05-22
+### Added
+- **Hard-crash detection & recovery for the guided diagnostic.** If a focused capture ends
+  without a clean stop (the recorder never wrote `session-stop` and isn't running), RigDoctor
+  treats it as a likely hard freeze. On launch the **Games** page shows a warning banner —
+  *"Your last diagnostic for <game> ended unexpectedly…"* — with **Analyze crash** / **Dismiss**.
+- **Deeper crash analysis.** *Analyze crash* combines the captured window (final readings before
+  the freeze + any GPU-lost event) with a focused scan of the **previous (crashed) boot's kernel
+  log** (`journalctl -k -b -1`: Xid/panic/OOM/MCE/AER/thermal) plus SMART/driver/persistence/
+  live-temp checks — the full "what happened" picture. `core/diagnostic.py` gains
+  `pending_crash()` / `analyze_crash()`; `health.check_previous_boot()` +
+  `run_health_checks(include_journal=False)` back it.
+
+## [0.14.0] - 2026-05-22
+### Changed
+- **Dashboard headline tiles are now history trend graphs** instead of single-value gauges —
+  GPU temp, GPU load, CPU temp, and memory each plot their recent history (with the current
+  value, window min/max, and a dashed warning-threshold line), so you can see changes over time
+  rather than only the instantaneous reading. New `HistoryGraph` widget (QPainter, no new deps).
+
+## [0.13.0] - 2026-05-22
+### Added
+- **Run Diagnostic now explains itself and can launch the game.** Clicking Run Diagnostic shows
+  what to do — *play the game, reproduce the crash, then Finish & analyze* (and that data
+  survives a hard freeze + reboot) — and offers **Launch game & start** (asks Steam to run it by
+  appid) or **Start without launching**. The recording banner now spells out the next step
+  instead of just showing a sample count.
+### Fixed
+- Button labels containing "&" (e.g. "Finish & analyze") rendered as "Finish _analyze" because
+  Qt treated the "&" as a keyboard mnemonic — now escaped so the ampersand shows literally.
+
+## [0.12.0] - 2026-05-22
+### Added
+- **Guided diagnostic in the GUI.** Each game on the **Games** page now has a **Run Diagnostic**
+  button → a focused, game-tagged capture starts and a recording banner appears (live sample
+  count, GPU-lost indicator) with **Finish & analyze** / **Discard**. Finishing opens a results
+  dialog: the window-scoped capture summary (peak temps/power, events, last samples) plus the
+  health findings as cards. The banner persists/restores if you navigate away and back while a
+  capture is running. Shares `core/diagnostic.py` with the CLI (one flow; the tray entry point is pending).
+
+## [0.11.0] - 2026-05-22
+### Added
+- **Guided diagnostic session (CLI) — the seed use case, end to end.** `rigdoctor diagnose
+  start --game "<name>"` runs a **focused crash-capture tagged with that game** (its own
+  diagnostic log, so the report is scoped to just that session), `diagnose status` shows
+  progress, and `diagnose finish` stops it and prints a combined report: the **capture
+  summary** (peak temps/power, GPU-lost events, last samples — M3) plus the **health findings**
+  (Xid/SMART/driver/etc. — M4). The game can be given by `--game` or `--appid` (resolved from
+  the Steam scan), and is recorded as a log event so it survives a crash + reboot.
+- Shared orchestration lives in `core/diagnostic.py` (one callable for CLI/GUI/tray, per
+  ARCHITECTURE §7.1); the recorder/`record run` gained an optional `--game` tag.
+
 ## [0.10.2] - 2026-05-22
 ### Changed
 - When an Environment **Apply**/**Install** fails, the status now shows the **real reason**
@@ -40,8 +40,13 @@ Ubuntu + NVIDIA first; `.deb` distribution (see `DECISIONS.md`).
 - [ ] M10 desktop GUI (PySide6: dashboard, log browser, report viewer, logger controls)
 - [ ] M11 tray / menu-bar applet (QSystemTrayIcon: live M1 readouts + Run Diagnostic +
      supporting actions — D13)
-- [ ] Guided diagnostic session (pick game → focused M3 capture → M4 scan → findings),
-      shared by tray/GUI/CLI
+- [~] Guided diagnostic session (pick game → focused M3 capture → M4 scan → findings),
+      shared by tray/GUI/CLI — *core + CLI + GUI done* (`core/diagnostic.py`, `rigdoctor
+      diagnose start/status/finish`, and a **Run Diagnostic** button per game on the GUI Games
+      page → recording banner → results dialog with the capture summary + findings). Tags a
+      focused capture with the chosen game (own diagnostic log, window-scoped report) and
+      combines the capture summary with the M4 findings. *Pending:* the tray (M11) entry point,
+      and auto start/stop via the D12 wrapper/watcher.
 - [ ] Logger trigger modes: always-on + game-launch (D12 — wrapper first:
      `rigdoctor wrap %command%` + global Steam compat-tool; zero-config watcher
      (Steam RunningAppID + /proc) and GameMode hook follow)
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "rigdoctor"
-version = "0.10.2"
+version = "0.15.0"
 description = "Modular hardware monitoring & crash diagnostics for Linux gamers."
 readme = "README.md"
 requires-python = ">=3.11"
@@ -1,3 +1,3 @@
 """RigDoctor — modular hardware monitoring & crash diagnostics for Linux gamers."""

-__version__ = "0.10.2"
+__version__ = "0.15.0"
@@ -86,6 +86,7 @@ def cmd_record_run(args) -> int:
        max_bytes=cfg["log_max_bytes"],
        backups=cfg["log_backups"],
        status_path=config.STATUS_FILE,
+        game=getattr(args, "game", None),
    )

    def _handle(_sig, _frame):
@@ -345,6 +346,77 @@ def cmd_report(args) -> int:
    return 0


+def _resolve_game(args) -> str | None:
+    """Game name from --game, or looked up from --appid via the Steam scan."""
+    if getattr(args, "game", None):
+        return args.game
+    if getattr(args, "appid", None):
+        from .core import steam
+
+        for g in steam.scan_games(steam.selected_library_paths()):
+            if g.appid == str(args.appid):
+                return g.name
+        return None
+    return None
+
+
+def cmd_diagnose(args) -> int:
+    from .core import diagnostic, reccontrol, steam
+
+    sub = args.diagnose_cmd or "status"
+
+    if sub == "start":
+        if reccontrol.running_pid():
+            print("A capture is already running — finish it with: rigdoctor diagnose finish")
+            return 1
+        game = _resolve_game(args)
+        if game is None and (args.game or args.appid):
+            print("Couldn't match that game in your selected Steam libraries.")
+            return 1
+        if game is None:
+            games = steam.cached_games() or steam.scan_games(steam.selected_library_paths())
+            if games:
+                print("Pick a game to focus on, then re-run with --game:")
+                for g in games:
+                    print(f"  --game {g.name!r}")
+            else:
+                print("No games detected. Select a library: rigdoctor games libraries --all")
+            return 1
+        pid = diagnostic.start(game=game, interval=args.interval)
+        time.sleep(1.0)
+        if pid and reccontrol.pid_alive(pid):
+            print(f"Diagnostic capture started for {game!r} (pid {pid}).")
+            print("  Play your game. When you're done (or after a crash + reboot):")
+            print("    rigdoctor diagnose finish")
+            return 0
+        print(f"Capture failed to start; see {config.SPAWN_LOG}")
+        return 1
+
+    if sub == "status":
+        status = diagnostic.active()
+        if not status:
+            print("No diagnostic capture is running.")
+            return 0
+        game = status.get("game") or "—"
+        print(f"Capturing for {game!r}: {status.get('samples', 0)} samples"
+              + (" · GPU-lost seen" if status.get("gpu_lost") else ""))
+        return 0
+
+    # finish
+    if not reccontrol.running_pid() and not config.DIAG_LOG.exists():
+        print("No diagnostic to analyze. Start one with: rigdoctor diagnose start --game <name>")
+        return 1
+    print("Stopping capture and analyzing…\n")
+    result = diagnostic.finish(last_n=args.last)
+    from .render import render_health, render_summary
+
+    if result.game:
+        print(f"Diagnostic — {result.game}\n")
+    print(render_summary(result.summary, log_path=config.DIAG_LOG))
+    print("\n" + render_health(result.findings, title="Findings"))
+    return 0
+
+
 def cmd_gameenv(args) -> int:
    from dataclasses import asdict

@@ -470,6 +542,7 @@ def build_parser() -> argparse.ArgumentParser:
    run_p = rec_sub.add_parser("run", help="run the capture loop in the foreground (systemd-friendly)")
    run_p.add_argument("-n", "--interval", type=float, default=None, help="sampling interval (s)")
    run_p.add_argument("-o", "--out", default=None, help="log file path")
+    run_p.add_argument("--game", default=None, help="tag the capture with a game name (M6/diagnose)")
    run_p.set_defaults(func=cmd_record_run)

    start_p = rec_sub.add_parser("start", help="start recording in the background")
@@ -519,6 +592,19 @@ def build_parser() -> argparse.ArgumentParser:
    env_p = sub.add_parser("gameenv", help="gaming environment checks (M6): flag stability/perf settings")
    env_p.add_argument("--json", action="store_true", help="output JSON instead of text")
    env_p.set_defaults(func=cmd_gameenv)
+
+    diag_p = sub.add_parser("diagnose", help="guided diagnostic: capture while gaming, then analyze")
+    diag_sub = diag_p.add_subparsers(dest="diagnose_cmd")
+    diag_start = diag_sub.add_parser("start", help="start a focused capture for a game")
+    diag_start.add_argument("--game", default=None, help="game name to focus on")
+    diag_start.add_argument("--appid", default=None, help="Steam appid to focus on (resolved to a name)")
+    diag_start.add_argument("-n", "--interval", type=float, default=None, help="sampling interval (s)")
+    diag_start.set_defaults(func=cmd_diagnose)
+    diag_sub.add_parser("status", help="show the in-progress diagnostic").set_defaults(func=cmd_diagnose)
+    diag_finish = diag_sub.add_parser("finish", help="stop the capture and analyze it")
+    diag_finish.add_argument("--last", type=int, default=10, help="recent samples to show")
+    diag_finish.set_defaults(func=cmd_diagnose)
+    diag_p.set_defaults(func=cmd_diagnose, diagnose_cmd=None, last=10)
    return p
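
Reviewer sketch (not part of this PR): the `diagnose` parser above relies on two argparse behaviours, namely that sub-commands are optional by default and that `set_defaults` on the parent supplies `last` when a sub-command does not define it. The stand-alone reduction below reproduces only that wiring; `build_diagnose_parser` is a hypothetical name and the handler functions are omitted.

```python
import argparse


def build_diagnose_parser() -> argparse.ArgumentParser:
    """Reduced copy of the `diagnose` sub-command wiring, without handlers."""
    diag = argparse.ArgumentParser(prog="rigdoctor diagnose")
    diag_sub = diag.add_subparsers(dest="diagnose_cmd")  # optional: bare `diagnose` is allowed

    start = diag_sub.add_parser("start")
    start.add_argument("--game", default=None)
    start.add_argument("--appid", default=None)

    diag_sub.add_parser("status")

    finish = diag_sub.add_parser("finish")
    finish.add_argument("--last", type=int, default=10)

    # Parent-level defaults cover the case where no sub-command was given.
    diag.set_defaults(diagnose_cmd=None, last=10)
    return diag


def chosen_subcommand(argv: list[str]) -> str:
    """Mirror `sub = args.diagnose_cmd or "status"` from cmd_diagnose."""
    args = build_diagnose_parser().parse_args(argv)
    return args.diagnose_cmd or "status"
```

With this wiring, an empty argument list resolves to `status`, which matches the fallback in `cmd_diagnose`.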


@@ -23,6 +23,9 @@ CONFIG_FILE = CONFIG_DIR / "config.toml"

 # Crash-capture logger (M3)
 LOG_FILE = LOG_DIR / "capture.jsonl"
+# Guided diagnostic (M6/D12): a focused capture writes here, separate from the always-on
+# crash log, so its report covers only that session's window.
+DIAG_LOG = LOG_DIR / "diagnostic.jsonl"
 STATUS_FILE = STATE_DIR / "recorder.json"
 PID_FILE = STATE_DIR / "recorder.pid"
 SPAWN_LOG = STATE_DIR / "recorder.out"
@@ -0,0 +1,162 @@
+"""Guided diagnostic session (SPEC §4 / ARCHITECTURE §7.1): orchestrate M3 + M4.
+
+The seed use case, one flow: **pick a game** → **focused crash-capture** scoped to that
+session (M3, tagged with the game) → on **finish**, **scan & analyze** (M4 health report)
+over the captured window + system logs → return a prioritized result. This is not a new
+module — it's a single shared callable so the CLI, GUI, and tray run the identical flow.
+
+The capture is **manually bracketed** (start/finish) for now; auto start/stop on game launch
+(the D12 wrapper/watcher) plugs in here later without changing the result shape.
+"""
+
+from __future__ import annotations
+
+import json
+import time
+from dataclasses import dataclass
+
+from .. import config
+from . import reccontrol
+from .crashlog import Summary, summarize
+from .health import CRITICAL, OK, WARNING, Finding
+
+_SEV_ORDER = {CRITICAL: 0, WARNING: 1, "info": 2, OK: 3}
+
+
+@dataclass
+class DiagnosticResult:
+    game: str | None
+    summary: Summary           # capture window: peak temps/power, events, last samples (M3)
+    findings: list[Finding]    # health findings: Xid/SMART/driver/etc. (M4)
+
+
+@dataclass
+class CrashInfo:
+    game: str | None
+    samples: int
+    when: float | None         # ts of the last captured sample (≈ when the freeze hit)
+    gpu_lost: bool
+
+
+def _clear_diag_log() -> None:
+    """Each diagnostic is a fresh focused capture — drop any previous session + segments."""
+    base = config.DIAG_LOG
+    for p in [base, *base.parent.glob(base.name + ".*")]:
+        try:
+            p.unlink()
+        except OSError:
+            pass
+
+
+def start(game: str | None = None, interval: float | None = None) -> int | None:
+    """Begin a focused capture, tagged with the game, into the dedicated diagnostic log.
+    Returns the pid, or None if a capture is already running."""
+    if reccontrol.running_pid():
+        return None
+    _clear_diag_log()
+    return reccontrol.start_background(interval=interval, out=str(config.DIAG_LOG), game=game)
+
+
+def is_running() -> bool:
+    return reccontrol.running_pid() is not None
+
+
+def active() -> dict | None:
+    """Status of the in-progress session (running flag, game, samples), or None if idle."""
+    if not is_running():
+        return None
+    return reccontrol.read_status()
+
+
+def _await_stopped(timeout: float = 6.0) -> None:
+    deadline = time.monotonic() + timeout
+    while reccontrol.running_pid() and time.monotonic() < deadline:
+        time.sleep(0.1)
+
+
+def _game_from_summary(summary: Summary) -> str | None:
+    """Recover the focused game from the log's 'game' event (survives a crash + reboot)."""
+    for _ts, kind, detail in reversed(summary.events):
+        if kind == "game" and detail:
+            return detail
+    return None
+
+
+def finish(last_n: int = 10, log_path=None) -> DiagnosticResult:
+    """Stop the capture (if running), summarize the window, and run the health report."""
+    from .health import run_health_checks
+
+    reccontrol.stop_background()
+    _await_stopped()
+    path = log_path or config.DIAG_LOG
+    summary = summarize(path, last_n=last_n)
+    game = _game_from_summary(summary) or (reccontrol.read_status() or {}).get("game")
+    findings = run_health_checks()
+    return DiagnosticResult(game=game, summary=summary, findings=findings)
+
+
+# --- hard-crash detection & post-crash analysis -----------------------------------
+
+def pending_crash() -> CrashInfo | None:
+    """Detect a diagnostic that ended abnormally (no clean stop, no live recorder).
+
+    A focused capture writes `session-start` (+ `game`) and, on a clean stop, `session-stop`.
+    A hard freeze kills the recorder before it writes `session-stop`, so the log has a start
+    with no stop and no recorder is alive: that is the hard-crash signal. Returns None if a
+    capture is running, none is recorded, it stopped cleanly, or the user acknowledged it.
+    """
+    if is_running() or not config.DIAG_LOG.exists():
+        return None
+    summary = summarize(config.DIAG_LOG)
+    kinds = {kind for _ts, kind, _detail in summary.events}
+    if "session-start" not in kinds:
+        return None
+    if "session-stop" in kinds or "diagnostic-acknowledged" in kinds:
+        return None
+    return CrashInfo(
+        game=_game_from_summary(summary),
+        samples=summary.samples,
+        when=summary.end,
+        gpu_lost="gpu-lost" in kinds,
+    )
+
+
+def acknowledge_crash() -> None:
+    """Mark the recorded crash as seen so it stops prompting (appends a marker event)."""
+    try:
+        config.DIAG_LOG.parent.mkdir(parents=True, exist_ok=True)
+        with open(config.DIAG_LOG, "a", encoding="utf-8") as fh:
+            fh.write(json.dumps({"ts": time.time(), "event": "diagnostic-acknowledged", "detail": ""}) + "\n")
+    except OSError:
+        pass
+
+
+def _crash_headline(summary: Summary) -> Finding:
+    gpu_lost = any(kind == "gpu-lost" for _ts, kind, _detail in summary.events)
+    when = time.strftime("%H:%M:%S", time.localtime(summary.end)) if summary.end else "?"
+    detail = (
+        f"The capture stopped abruptly at {when} after {summary.samples} samples, with no clean "
+        "shutdown recorded — consistent with a hard freeze or power loss."
+    )
+    if gpu_lost:
+        detail += " A GPU-lost event was captured during the session."
+    return Finding(
+        CRITICAL if gpu_lost else WARNING,
+        "Diagnostic",
+        "Session ended without a clean stop (likely a hard crash)",
+        detail,
+        "Review the last readings (Capture, above) and the crash-boot findings below.",
+    )
+
+
+def analyze_crash(last_n: int = 15) -> DiagnosticResult:
+    """Analyze a recorded hard crash: the captured window + the previous boot's kernel log
+    + the rest of the health report (SMART/driver/persistence/temps)."""
+    from .health import check_previous_boot, run_health_checks
+
+    summary = summarize(config.DIAG_LOG, last_n=last_n)
+    findings: list[Finding] = [_crash_headline(summary)]
+    findings += check_previous_boot()                       # the crashed boot's kernel log
+    findings += run_health_checks(include_journal=False)    # SMART/driver/persistence/temps
+    findings.sort(key=lambda f: _SEV_ORDER.get(f.severity, 9))
+    return DiagnosticResult(game=_game_from_summary(summary), summary=summary, findings=findings)
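
Reviewer sketch (not part of this PR): the rule in `pending_crash()` can be reproduced directly on the raw JSONL lines, which may help when writing tests for it. The record shape `{"ts", "event", "detail"}` is taken from `acknowledge_crash()` above; the assumptions are that sample records carry no `event` key and that a freeze may leave a truncated final line. `ended_without_clean_stop` is a hypothetical name, and the real code goes through `summarize()` instead.

```python
import json
from collections.abc import Iterable


def ended_without_clean_stop(lines: Iterable[str], recorder_running: bool = False) -> bool:
    """True when the log has a session-start but neither a stop nor an acknowledgement."""
    kinds: set[str] = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: a hard freeze can cut the last line short
        if isinstance(record, dict) and record.get("event"):
            kinds.add(record["event"])

    if recorder_running or "session-start" not in kinds:
        return False
    return not ({"session-stop", "diagnostic-acknowledged"} & kinds)
```

Appending a `diagnostic-acknowledged` record, as **Dismiss** does, is enough to make the same log stop prompting.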
@@ -146,6 +146,22 @@ def check_journal() -> list[Finding]:
    return findings


+def check_previous_boot() -> list[Finding]:
+    """Scan the previous boot's kernel log — the boot that crashed — for fault signatures.
+
+    Needs persistent journald (else the crashed boot's logs were lost on reboot, which the
+    persistence check flags separately). Findings are framed as coming from that boot.
+    """
+    out = _journalctl(["-k", "-b", "-1", "--no-pager", "-o", "cat"])
+    if not out or not out.strip():
+        return []
+    tagged = []
+    for f in scan_journal_text(out):
+        detail = ("Logged during the previous (crashed) boot. " + (f.detail or "")).strip()
+        tagged.append(Finding(f.severity, f.category, f.title, detail, f.suggestion))
+    return tagged
+
+
 def check_journal_persistence() -> list[Finding]:
    if Path("/var/log/journal").is_dir():
        return []
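
Reviewer sketch (not part of this PR): `check_previous_boot()` delegates the matching to `scan_journal_text()`, which is not shown in this diff. The block below is only an illustration of how a signature scan over `journalctl -k -b -1` output could look. The pattern table, the labels, and the name `scan_kernel_log` are assumptions and are not the project's actual rules.

```python
import re

# Hypothetical signature table: (label, pattern). The real scan_journal_text()
# may use different patterns, severities, and suggestions.
SIGNATURES = [
    ("gpu-xid", re.compile(r"NVRM: Xid", re.IGNORECASE)),
    ("kernel-panic", re.compile(r"Kernel panic", re.IGNORECASE)),
    ("oom-kill", re.compile(r"Out of memory: Killed process", re.IGNORECASE)),
    ("machine-check", re.compile(r"\bmce:|Machine check", re.IGNORECASE)),
    ("pcie-aer", re.compile(r"\bAER:")),
    ("thermal", re.compile(r"temperature above threshold", re.IGNORECASE)),
]


def scan_kernel_log(text: str) -> list[tuple[str, str]]:
    """Return (label, line) for each kernel-log line matching a known signature."""
    hits: list[tuple[str, str]] = []
    for line in text.splitlines():
        for label, pattern in SIGNATURES:
            if pattern.search(line):
                hits.append((label, line.strip()))
                break  # report each line once, under the first matching label
    return hits
```

An empty or missing journal yields no hits, which mirrors the early `return []` in `check_previous_boot()` when persistent journald is not enabled.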
@@ -235,16 +251,20 @@ def check_live_temps() -> list[Finding]:
    )]


-def run_health_checks() -> list[Finding]:
+def run_health_checks(include_journal: bool = True) -> list[Finding]:
    """Run all checks and return findings sorted by severity (worst first).

    SMART needs root; if the session collected it via launch elevation, use that
    instead of re-running smartctl (which would just report "needs root").
+
+    `include_journal=False` skips the 7-day kernel-journal scan — used by the crash
+    analysis, which scans the previous (crashed) boot specifically instead.
    """
    from . import elevation

    findings: list[Finding] = []
    findings += check_nvidia_driver()
+    if include_journal:
        findings += check_journal()
    findings += check_journal_persistence()
    priv = elevation.privileged()
@@ -38,7 +38,9 @@ def read_status() -> dict | None:
        return None


-def start_background(interval: float | None = None, out: str | None = None) -> int | None:
+def start_background(
+    interval: float | None = None, out: str | None = None, game: str | None = None
+) -> int | None:
    """Spawn a detached `record run`. Returns the child pid, or None if already running."""
    if running_pid():
        return None
@@ -48,6 +50,8 @@ def start_background(interval: float | None = None, out: str | None = None) -> i
        cmd += ["--interval", str(interval)]
    if out:
        cmd += ["--out", out]
+    if game:
+        cmd += ["--game", game]
    out_fh = open(config.SPAWN_LOG, "a")
    proc = subprocess.Popen(
        cmd,
@@ -27,12 +27,14 @@ class Recorder:
        backups: int = 10,
        status_path=None,
        sampler: Sampler | None = None,
+        game: str | None = None,
    ) -> None:
        self.interval = interval
        self.sampler = sampler or Sampler(available_sources())
        self.writer = CrashLogWriter(log_path, max_bytes, backups)
        self.log_path = Path(log_path)
        self.status_path = Path(status_path) if status_path else None
+        self.game = game or None
        self.samples = 0
        self._stop = threading.Event()
        self._gpu_lost = False
@@ -43,6 +45,8 @@ class Recorder:

    def run(self) -> None:
        self.writer.write_event("session-start", f"interval={self.interval:g}s")
+        if self.game:
+            self.writer.write_event("game", self.game)  # tag the focused-diagnostic target
        self._write_status(running=True)
        try:
            while not self._stop.is_set():
@@ -81,6 +85,7 @@ class Recorder:
            "samples": self.samples,
            "updated": time.time(),
            "gpu_lost": self._gpu_lost,
+            "game": self.game,
        }
        if sample is not None:
            data["latest"] = headline(sample)
@@ -15,6 +15,8 @@ from __future__ import annotations

 import json
 import os
+import shutil
+import subprocess
 import time
 from dataclasses import asdict, dataclass
 from pathlib import Path
@@ -351,6 +353,24 @@ def acknowledge_new() -> None:

 # --- formatting -----------------------------------------------------------------------

+def launch_game(appid: str) -> bool:
+    """Best-effort: ask Steam to launch a game by appid (steam:// URL). Non-blocking."""
+    if not appid:
+        return False
+    url = f"steam://rungameid/{appid}"
+    for cmd in (["steam", url], ["xdg-open", url]):
+        if shutil.which(cmd[0]):
+            try:
+                subprocess.Popen(
+                    cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
+                    stdin=subprocess.DEVNULL, start_new_session=True,
+                )
+                return True
+            except (OSError, subprocess.SubprocessError):
+                continue
+    return False
+
+
 def human_size(num_bytes: int) -> str:
    if num_bytes <= 0:
        return "—"
@@ -17,19 +17,19 @@ from PySide6.QtWidgets import (

 from ..core.sample import Sample
 from ..render import metric_label
-from .widgets import Card, MetricBar, MetricRow, StatGauge
+from .widgets import Card, HistoryGraph, MetricBar, MetricRow

 _GROUP_ORDER = ["gpu", "cpu", "memory", "storage"]
 _GROUP_TITLES = {"gpu": "GPU", "cpu": "CPU", "memory": "Memory", "storage": "Storage"}
 _BAR_METRICS = {"util", "mem_util", "fan", "used_pct"}


-def _gauge_card(gauge: StatGauge) -> QFrame:
+def _tile_card(widget: QWidget) -> QFrame:
    card = QFrame()
    card.setObjectName("Card")
    layout = QVBoxLayout(card)
-    layout.setContentsMargins(6, 14, 6, 8)
-    layout.addWidget(gauge)
+    layout.setContentsMargins(6, 10, 6, 8)
+    layout.addWidget(widget)
    return card


@@ -54,16 +54,16 @@ class Dashboard(QWidget):
        header.addWidget(self._updated)
        root.addLayout(header)

-        # Headline gauges
-        self._g_gpu_temp = StatGauge("GPU Temp", "°C", 100, "temp")
-        self._g_gpu_load = StatGauge("GPU Load", "%", 100, "accent")
-        self._g_cpu_temp = StatGauge("CPU Temp", "°C", 100, "temp")
-        self._g_mem = StatGauge("Memory", "%", 100, "usage")
-        gauges = QHBoxLayout()
-        gauges.setSpacing(14)
+        # Headline trend graphs (history over the session, not just the live value)
+        self._g_gpu_temp = HistoryGraph("GPU Temp", "°C", 30, 100, "temp")
+        self._g_gpu_load = HistoryGraph("GPU Load", "%", 0, 100, "accent")
+        self._g_cpu_temp = HistoryGraph("CPU Temp", "°C", 30, 100, "temp")
+        self._g_mem = HistoryGraph("Memory", "%", 0, 100, "usage")
+        graphs = QHBoxLayout()
+        graphs.setSpacing(14)
        for g in (self._g_gpu_temp, self._g_gpu_load, self._g_cpu_temp, self._g_mem):
-            gauges.addWidget(_gauge_card(g))
-        root.addLayout(gauges)
+            graphs.addWidget(_tile_card(g))
+        root.addLayout(graphs)

        # Per-subsystem cards (scrollable, 2-column grid)
        scroll = QScrollArea()
@@ -81,10 +81,10 @@ class Dashboard(QWidget):
        root.addWidget(scroll, 1)

    def update_sample(self, sample: Sample) -> None:
-        self._g_gpu_temp.set_value(self._val(sample, "gpu", "temp", ""))
-        self._g_gpu_load.set_value(self._val(sample, "gpu", "util"))
-        self._g_cpu_temp.set_value(self._cpu_temp(sample))
-        self._g_mem.set_value(self._val(sample, "memory", "used_pct"))
+        self._g_gpu_temp.add_value(self._val(sample, "gpu", "temp", ""))
+        self._g_gpu_load.add_value(self._val(sample, "gpu", "util"))
+        self._g_cpu_temp.add_value(self._cpu_temp(sample))
+        self._g_mem.add_value(self._val(sample, "memory", "used_pct"))

        keys = [r.key for r in sample.readings]
        if keys != self._built_keys:  # sources appeared/disappeared
@@ -0,0 +1,81 @@
+"""Results view for a guided diagnostic session (M6/D12): capture summary + findings."""
+
+from __future__ import annotations
+
+from PySide6.QtCore import Qt
+from PySide6.QtGui import QFont
+from PySide6.QtWidgets import (
+    QDialog,
+    QFrame,
+    QHBoxLayout,
+    QLabel,
+    QPushButton,
+    QScrollArea,
+    QVBoxLayout,
+    QWidget,
+)
+
+from ..render import render_summary
+from .widgets import finding_card
+
+
+class DiagnosticDialog(QDialog):
+    def __init__(self, result, parent=None) -> None:
+        super().__init__(parent)
+        self.setWindowTitle(f"Diagnostic — {result.game}" if result.game else "Diagnostic")
+        self.resize(660, 680)
+
+        root = QVBoxLayout(self)
+        root.setContentsMargins(20, 18, 20, 16)
+        root.setSpacing(14)
+
+        title = QLabel(f"Diagnostic — {result.game}" if result.game else "Diagnostic")
+        title.setObjectName("PageTitle")
+        root.addWidget(title)
+
+        scroll = QScrollArea()
+        scroll.setWidgetResizable(True)
+        scroll.setFrameShape(QFrame.Shape.NoFrame)
+        scroll.setStyleSheet("background: transparent;")
+        body = QWidget()
+        col = QVBoxLayout(body)
+        col.setContentsMargins(0, 0, 0, 0)
+        col.setSpacing(10)
+        col.setAlignment(Qt.AlignmentFlag.AlignTop)
+
+        # Capture window summary (peaks / events / last samples) — monospace for the columns.
+        cap_head = QLabel("Capture")
+        cap_head.setStyleSheet("font-weight: 700; background: transparent;")
+        col.addWidget(cap_head)
+        summary = QLabel(render_summary(result.summary))
+        summary.setObjectName("Report")
+        summary.setFont(QFont("monospace"))
+        summary.setTextInteractionFlags(Qt.TextInteractionFlag.TextSelectableByMouse)
+        summary.setWordWrap(False)
+        summary.setStyleSheet(
+            "background: #0d0f13; color: #cfd3da; border: 1px solid #2a2f39; "
+            "border-radius: 8px; padding: 10px;"
+        )
+        col.addWidget(summary)
+
+        find_head = QLabel(f"Findings ({len(result.findings)})")
+        find_head.setStyleSheet("font-weight: 700; background: transparent;")
+        col.addWidget(find_head)
+        if result.findings:
+            for finding in result.findings:
+                col.addWidget(finding_card(finding))
+        else:
+            none = QLabel("No findings.")
+            none.setObjectName("Muted")
+            col.addWidget(none)
+
+        scroll.setWidget(body)
+        root.addWidget(scroll, 1)
+
+        buttons = QHBoxLayout()
+        buttons.addStretch(1)
+        close = QPushButton("Close")
+        close.setObjectName("PrimaryButton")
+        close.clicked.connect(self.accept)
+        buttons.addWidget(close)
+        root.addLayout(buttons)
@@ -17,6 +17,7 @@ from PySide6.QtWidgets import (
    QFrame,
    QHBoxLayout,
    QLabel,
+    QMessageBox,
    QPushButton,
    QScrollArea,
    QVBoxLayout,
@@ -24,10 +25,11 @@ from PySide6.QtWidgets import (
 )

 from ..config import load_config, update_config
-from .theme import ACCENT, GOOD, MUTED
+from .diagnostic_dialog import DiagnosticDialog
+from .theme import ACCENT, GOOD, MUTED, WARN


-def _game_row(name: str, sublabel: str, size: str, is_new: bool) -> QFrame:
+def _game_row(name: str, sublabel: str, size: str, is_new: bool, appid: str = "", on_diagnose=None) -> QFrame:
    card = QFrame()
    card.setObjectName("Card")
    h = QHBoxLayout(card)
@@ -59,6 +61,13 @@ def _game_row(name: str, sublabel: str, size: str, is_new: bool) -> QFrame:
    size_label.setMinimumWidth(80)
    size_label.setAlignment(Qt.AlignmentFlag.AlignRight | Qt.AlignmentFlag.AlignVCenter)
    h.addWidget(size_label, 0)
+
+    if on_diagnose is not None:
+        diag_btn = QPushButton("Run Diagnostic")
+        diag_btn.setObjectName("ActionButton")
+        diag_btn.setCursor(Qt.CursorShape.PointingHandCursor)
+        diag_btn.clicked.connect(lambda: on_diagnose(name, appid))
+        h.addWidget(diag_btn, 0)
    return card


@@ -66,14 +75,17 @@ class GamesPage(QWidget):
    _libraries_ready = Signal(object)  # list[dict(path, label, count, selected)]
    _scanned = Signal(object)          # steam.ScanResult
    new_count_changed = Signal(int)    # newly-installed game count (for the nav badge)
+    _diag_done = Signal(object)        # DiagnosticResult — focused capture analyzed

    def __init__(self) -> None:
        super().__init__()
        self.setObjectName("Page")
        self._libraries_ready.connect(self._render_libraries)
        self._scanned.connect(self._render_games)
+        self._diag_done.connect(self._on_diag_done)
        self._busy = False
        self._new_appids: set[str] = set()
+        self._diag_game: str | None = None

        root = QVBoxLayout(self)
        root.setContentsMargins(20, 18, 20, 18)
@@ -93,6 +105,52 @@ class GamesPage(QWidget):
        header.addWidget(self._rescan_btn)
        root.addLayout(header)

+        # In-progress diagnostic banner (hidden until a focused capture is running).
+        self._banner = QFrame()
+        self._banner.setObjectName("Card")
+        self._banner.setStyleSheet(f"#Card {{ border: 1px solid {ACCENT}; }}")
+        banner_h = QHBoxLayout(self._banner)
+        banner_h.setContentsMargins(16, 10, 16, 10)
+        banner_h.setSpacing(10)
+        self._banner_label = QLabel("")
+        self._banner_label.setWordWrap(True)
+        self._banner_label.setStyleSheet(f"color: {ACCENT}; font-weight: 700; background: transparent;")
+        banner_h.addWidget(self._banner_label, 1)
+        self._finish_btn = QPushButton("Finish && analyze")  # && → literal & (not a mnemonic)
+        self._finish_btn.setObjectName("ActionButton")
+        self._finish_btn.clicked.connect(self._finish_diagnostic)
+        banner_h.addWidget(self._finish_btn)
+        self._discard_btn = QPushButton("Discard")
+        self._discard_btn.clicked.connect(self._discard_diagnostic)
+        banner_h.addWidget(self._discard_btn)
+        self._banner.hide()
+        root.addWidget(self._banner)
+
+        # Hard-crash banner: a previous diagnostic ended without a clean stop.
+        self._crash_banner = QFrame()
+        self._crash_banner.setObjectName("Card")
+        self._crash_banner.setStyleSheet(f"#Card {{ border: 1px solid {WARN}; }}")
+        crash_h = QHBoxLayout(self._crash_banner)
+        crash_h.setContentsMargins(16, 10, 16, 10)
+        crash_h.setSpacing(10)
+        self._crash_label = QLabel("")
+        self._crash_label.setWordWrap(True)
+        self._crash_label.setStyleSheet(f"color: {WARN}; font-weight: 700; background: transparent;")
+        crash_h.addWidget(self._crash_label, 1)
+        self._analyze_btn = QPushButton("Analyze crash")
+        self._analyze_btn.setObjectName("ActionButton")
+        self._analyze_btn.clicked.connect(self._analyze_crash)
+        crash_h.addWidget(self._analyze_btn)
+        self._dismiss_btn = QPushButton("Dismiss")
+        self._dismiss_btn.clicked.connect(self._dismiss_crash)
+        crash_h.addWidget(self._dismiss_btn)
+        self._crash_banner.hide()
+        root.addWidget(self._crash_banner)
+
+        self._diag_timer = QTimer(self)
+        self._diag_timer.setInterval(1000)
+        self._diag_timer.timeout.connect(self._poll_diag)
+
        # Libraries (opt-in checkboxes)
        lib_card = QFrame()
        lib_card.setObjectName("Card")
@@ -126,6 +184,7 @@ class GamesPage(QWidget):

        self._load_cached()                       # instant display from the last scan
        QTimer.singleShot(400, self.refresh)      # then rescan in the background on launch
+        self._check_crash()                       # surface an interrupted (crashed) diagnostic

    # --- loading ----------------------------------------------------------------------

@@ -233,9 +292,151 @@ class GamesPage(QWidget):
                os.path.basename(g.library.rstrip("/")) or g.library,
                steam.human_size(g.size_bytes),
                g.appid in new_appids,
+                appid=g.appid,
+                on_diagnose=self._start_diagnostic,
            ))
        self._list.addStretch(1)

+    # --- guided diagnostic (M6/D12) ---------------------------------------------------
+
+    def _start_diagnostic(self, name: str, appid: str = "") -> None:
+        from ..core import diagnostic, steam
+
+        if diagnostic.is_running():
+            QMessageBox.information(
+                self, "RigDoctor",
+                "A capture is already running — finish or discard it first.")
+            return
+
+        # Tell the user what the flow actually is, and offer to launch the game for them.
+        box = QMessageBox(self)
+        box.setIcon(QMessageBox.Icon.Information)
+        box.setWindowTitle(f"Run Diagnostic — {name}")
+        box.setText(f"Record a focused diagnostic while you play {name}?")
+        box.setInformativeText(
+            "RigDoctor will capture sensors in the background. Then:\n\n"
+            "1.  Play the game and try to reproduce the freeze / black screen / crash.\n"
+            "2.  When you're done — or after a hard freeze and reboot — come back here and "
+            "click “Finish & analyze”.\n\n"
+            "Your readings are saved continuously, so even a hard lock won't lose them."
+        )
+        launch_btn = box.addButton("Launch game && start", QMessageBox.ButtonRole.AcceptRole)
+        start_btn = box.addButton("Start without launching", QMessageBox.ButtonRole.ActionRole)
+        box.addButton("Cancel", QMessageBox.ButtonRole.RejectRole)
+        if not appid:
+            launch_btn.setEnabled(False)  # no appid → can't ask Steam to launch it
+        box.exec()
+        clicked = box.clickedButton()
+        if clicked not in (launch_btn, start_btn):
+            return
+
+        if diagnostic.start(game=name) is None:
+            QMessageBox.warning(self, "RigDoctor", "Couldn't start the capture.")
+            return
+        launched = steam.launch_game(appid) if clicked is launch_btn else False
+        self._diag_game = name
+        self._finish_btn.setEnabled(True)
+        self._discard_btn.setEnabled(True)
+        self._banner.show()
+        self._diag_timer.start()
+        self._poll_diag()
+        if clicked is launch_btn and not launched:
+            QMessageBox.information(
+                self, "RigDoctor",
+                "Recording started, but couldn't launch the game automatically — "
+                "launch it yourself, then click “Finish & analyze” when you're done.")
+
+    def _poll_diag(self) -> None:
+        from ..core import diagnostic
+
+        status = diagnostic.active()
+        if not status:
+            self._diag_timer.stop()  # recorder exited on its own
+            return
+        samples = status.get("samples", 0)
+        lost = "  ·  ⚠ GPU-lost detected" if status.get("gpu_lost") else ""
+        game = status.get("game") or self._diag_game or "your game"
+        self._banner_label.setText(
+            f"● Recording {game} — play it and reproduce the problem, then click "
+            f"“Finish & analyze”.   ({samples} samples{lost})"
+        )
+
+    def _finish_diagnostic(self) -> None:
+        self._diag_timer.stop()
+        self._finish_btn.setEnabled(False)
+        self._discard_btn.setEnabled(False)
+        self._banner_label.setText("Analyzing… (running the health report)")
+        threading.Thread(target=self._work_finish, daemon=True).start()
+
+    def _work_finish(self) -> None:
+        from ..core import diagnostic
+
+        try:
+            result = diagnostic.finish()
+        except Exception:  # any failure becomes None, which _on_diag_done reports as a warning
+            result = None
+        self._diag_done.emit(result)
+
+    def _on_diag_done(self, result) -> None:
+        self._banner.hide()
+        self._crash_banner.hide()
+        self._finish_btn.setEnabled(True)
+        self._discard_btn.setEnabled(True)
+        self._analyze_btn.setEnabled(True)
+        if result is None:
+            QMessageBox.warning(self, "RigDoctor", "The diagnostic couldn't be analyzed.")
+            return
+        DiagnosticDialog(result, self).exec()
+
+    def _discard_diagnostic(self) -> None:
+        from ..core import reccontrol
+
+        self._diag_timer.stop()
+        reccontrol.stop_background()
+        self._banner.hide()
+
+    # --- hard-crash recovery ----------------------------------------------------------
+
+    def _check_crash(self) -> None:
+        from ..core import diagnostic
+
+        info = diagnostic.pending_crash()
+        if info is None:
+            self._crash_banner.hide()
+            return
+        game = info.game or "your last game"
+        extra = "  ·  ⚠ GPU-lost was captured" if info.gpu_lost else ""
+        self._crash_label.setText(
+            f"⚠ Your last diagnostic for {game} ended unexpectedly — likely a hard crash "
+            f"({info.samples} samples{extra}). Analyze it to see the final readings and the "
+            f"likely cause from the system logs."
+        )
+        self._analyze_btn.setEnabled(True)
+        self._crash_banner.show()
+
+    def _analyze_crash(self) -> None:
+        from ..core import diagnostic
+
+        diagnostic.acknowledge_crash()  # don't prompt again for this one
+        self._analyze_btn.setEnabled(False)
+        self._crash_label.setText("Analyzing the crash (final readings + system logs)…")
+        threading.Thread(target=self._work_analyze_crash, daemon=True).start()
+
+    def _work_analyze_crash(self) -> None:
+        from ..core import diagnostic
+
+        try:
+            result = diagnostic.analyze_crash()
+        except Exception:  # any failure becomes None, which _on_diag_done reports as a warning
+            result = None
+        self._diag_done.emit(result)
+
+    def _dismiss_crash(self) -> None:
+        from ..core import diagnostic
+
+        diagnostic.acknowledge_crash()
+        self._crash_banner.hide()
+
    # --- nav badge integration --------------------------------------------------------

    def showEvent(self, event) -> None:  # noqa: N802 (Qt override)
@@ -247,3 +448,15 @@ class GamesPage(QWidget):

            threading.Thread(target=steam.acknowledge_new, daemon=True).start()
            self.new_count_changed.emit(0)
+
+        # Reflect a capture that's still running (e.g. started earlier, navigated back).
+        from ..core import diagnostic
+
+        if diagnostic.is_running():
+            status = diagnostic.active() or {}
+            self._diag_game = status.get("game") or self._diag_game
+            self._banner.show()
+            if not self._diag_timer.isActive():
+                self._diag_timer.start()
+        else:
+            self._check_crash()  # re-surface an interrupted diagnostic if one is pending
@@ -2,8 +2,10 @@

 from __future__ import annotations

-from PySide6.QtCore import QRectF, Qt
-from PySide6.QtGui import QColor, QFont, QPainter, QPen
+from collections import deque
+
+from PySide6.QtCore import QPointF, QRectF, Qt
+from PySide6.QtGui import QColor, QFont, QPainter, QPainterPath, QPen
 from PySide6.QtWidgets import (
    QComboBox,
    QFrame,
@@ -17,7 +19,19 @@ from PySide6.QtWidgets import (

 from ..core.sample import Reading
 from ..render import format_value
-from .theme import ACCENT, CRIT, GOOD, MUTED, TEXT, TRACK, WARN, gauge_color, temp_color
+from .theme import (
+    ACCENT,
+    CRIT,
+    GOOD,
+    MUTED,
+    TEMP_WARN,
+    TEXT,
+    TRACK,
+    USAGE_WARN,
+    WARN,
+    gauge_color,
+    temp_color,
+)

 _SEV = {
    "critical": ("CRITICAL", CRIT),
@@ -248,6 +262,117 @@ class StatGauge(QWidget):
        p.end()


+class HistoryGraph(QWidget):
+    """A headline metric as a trend: current value + window min/max + a history line.
+
+    Replaces the at-a-glance gauge with changes-over-time. `kind` drives the color
+    (temp band / usage / accent), matching StatGauge so the dashboard stays consistent.
+    """
+
+    def __init__(self, title: str, unit: str = "", vmin: float = 0.0, vmax: float = 100.0,
+                 kind: str = "accent", history: int = 180) -> None:
+        super().__init__()
+        self._title = title
+        self._unit = unit
+        self._min = vmin
+        self._max = vmax
+        self._kind = kind  # "temp" | "usage" | "accent"
+        self._values: deque[float | None] = deque(maxlen=history)
+        self.setMinimumSize(160, 132)
+
+    def add_value(self, value: float | None) -> None:
+        self._values.append(value)
+        self.update()
+
+    def _fmt(self, value: float | None) -> str:
+        if value is None:
+            return "—"
+        if self._unit == "°C":
+            return f"{value:.0f}°"
+        if self._unit == "%":
+            return f"{value:.0f}%"
+        return f"{value:.0f}{self._unit}"
+
+    def paintEvent(self, event) -> None:  # noqa: N802 (Qt override)
+        p = QPainter(self)
+        p.setRenderHint(QPainter.RenderHint.Antialiasing)
+        w, h = self.width(), self.height()
+        pad = 10.0
+        present = [v for v in self._values if v is not None]
+        current = next((v for v in reversed(self._values) if v is not None), None)
+        color = QColor(gauge_color(self._kind, current))
+
+        ftitle = QFont()
+        ftitle.setPointSizeF(10.0)
+        ftitle.setBold(True)
+        p.setFont(ftitle)
+        p.setPen(QColor(MUTED))
+        p.drawText(QRectF(pad, 6, w - 2 * pad, 18),
+                   Qt.AlignmentFlag.AlignLeft | Qt.AlignmentFlag.AlignVCenter, self._title)
+
+        fval = QFont()
+        fval.setPointSizeF(21.0)
+        fval.setBold(True)
+        p.setFont(fval)
+        p.setPen(color if current is not None else QColor(MUTED))
+        p.drawText(QRectF(pad, 2, w - 2 * pad, 28),
+                   Qt.AlignmentFlag.AlignRight | Qt.AlignmentFlag.AlignTop, self._fmt(current))
+
+        if present:
+            fsm = QFont()
+            fsm.setPointSizeF(8.5)
+            p.setFont(fsm)
+            p.setPen(QColor(MUTED))
+            p.drawText(QRectF(pad, 27, w - 2 * pad, 14), Qt.AlignmentFlag.AlignLeft,
+                       f"min {self._fmt(min(present))}   max {self._fmt(max(present))}")
+
+        g_top, g_bot = 48.0, h - pad
+        g_left, g_right = pad, w - pad
+        span = self._max - self._min
+        if g_bot - g_top < 12 or g_right - g_left < 12 or span <= 0:
+            p.end()
+            return
+
+        def y_of(v: float) -> float:
+            frac = (max(self._min, min(self._max, v)) - self._min) / span
+            return g_bot - frac * (g_bot - g_top)
+
+        warn = TEMP_WARN if self._kind == "temp" else (USAGE_WARN if self._kind == "usage" else None)
+        if warn is not None and self._min <= warn <= self._max:
+            pen = QPen(QColor(TRACK))
+            pen.setWidthF(1.0)
+            pen.setStyle(Qt.PenStyle.DashLine)
+            p.setPen(pen)
+            yw = y_of(warn)
+            p.drawLine(QPointF(g_left, yw), QPointF(g_right, yw))
+
+        maxlen = self._values.maxlen or 1
+        step = (g_right - g_left) / max(1, maxlen - 1)
+        n = len(self._values)
+        # Build the line newest-at-right; break it where readings are missing.
+        path = QPainterPath()
+        drawing = False
+        for i, v in enumerate(self._values):
+            if v is None:
+                drawing = False
+                continue
+            x = g_right - (n - 1 - i) * step
+            y = y_of(v)
+            if drawing:
+                path.lineTo(x, y)
+            else:
+                path.moveTo(x, y)
+                drawing = True
+        if not path.isEmpty():
+            pen = QPen(color)
+            pen.setWidthF(2.0)
+            pen.setCapStyle(Qt.PenCapStyle.RoundCap)
+            pen.setJoinStyle(Qt.PenJoinStyle.RoundJoin)
+            p.setPen(pen)
+            p.drawPath(path)
+        p.end()
+
+
 class MetricBar(QWidget):
    """A label + value with a thin progress bar (for 0–100% metrics)."""

@@ -0,0 +1,107 @@
+"""Tests for the guided diagnostic orchestration (M3+M4 glue)."""
+
+import tempfile
+import time
+import unittest
+from pathlib import Path
+from unittest import mock
+
+from rigdoctor.core import diagnostic
+from rigdoctor.core.crashlog import CrashLogWriter, summarize
+from rigdoctor.core.health import Finding
+from rigdoctor.core.sample import Reading, Sample
+
+
+def _write_log(path: str, game: str) -> None:
+    w = CrashLogWriter(path)
+    w.write_event("session-start", "interval=1s")
+    w.write_event("game", game)
+    for temp in (60.0, 72.0, 81.0):
+        w.write_sample(Sample(ts=time.time(), readings=[Reading("gpu", "temp", temp, "°C", "")]))
+    w.write_event("gpu-lost", "nvidia-smi query timed out")
+    w.close()
+
+
+class GameRecoveryTests(unittest.TestCase):
+    def test_game_recovered_from_log_event(self):
+        with tempfile.TemporaryDirectory() as d:
+            log = str(Path(d) / "capture.jsonl")
+            _write_log(log, "Path of Exile 2")
+            summary = summarize(log)
+            self.assertEqual(diagnostic._game_from_summary(summary), "Path of Exile 2")
+
+    def test_no_game_event_returns_none(self):
+        with tempfile.TemporaryDirectory() as d:
+            log = str(Path(d) / "capture.jsonl")
+            w = CrashLogWriter(log)
+            w.write_event("session-start")
+            w.close()
+            self.assertIsNone(diagnostic._game_from_summary(summarize(log)))
+
+
+class FinishTests(unittest.TestCase):
+    def test_finish_combines_summary_and_findings(self):
+        with tempfile.TemporaryDirectory() as d:
+            log = Path(d) / "capture.jsonl"
+            _write_log(str(log), "Satisfactory")
+            fake = [Finding("warning", "GPU", "NVIDIA Xid 79 ×1", "fell off the bus")]
+            with mock.patch("rigdoctor.core.health.run_health_checks", return_value=fake), \
+                 mock.patch.object(diagnostic.reccontrol, "stop_background", return_value=False), \
+                 mock.patch.object(diagnostic.reccontrol, "running_pid", return_value=None):
+                result = diagnostic.finish(log_path=log)
+            self.assertEqual(result.game, "Satisfactory")
+            self.assertEqual(result.summary.samples, 3)
+            self.assertEqual(result.findings, fake)
+            # peak GPU temp captured in the window, GPU-lost event recorded
+            self.assertEqual(result.summary.maxima["gpu.temp"][0], 81.0)
+            self.assertTrue(any(kind == "gpu-lost" for _ts, kind, _d in result.summary.events))
+
+
+class CrashDetectionTests(unittest.TestCase):
+    def _diag_log(self, d) -> Path:
+        return Path(d) / "diagnostic.jsonl"
+
+    def test_unterminated_session_is_a_pending_crash(self):
+        with tempfile.TemporaryDirectory() as d:
+            log = self._diag_log(d)
+            _write_log(str(log), "Tarkov")  # has session-start + game, no session-stop
+            with mock.patch.object(diagnostic.config, "DIAG_LOG", log), \
+                 mock.patch.object(diagnostic.reccontrol, "running_pid", return_value=None):
+                info = diagnostic.pending_crash()
+            self.assertIsNotNone(info)
+            self.assertEqual(info.game, "Tarkov")
+            self.assertTrue(info.gpu_lost)  # _write_log writes a gpu-lost event
+
+    def test_clean_stop_is_not_a_crash(self):
+        with tempfile.TemporaryDirectory() as d:
+            log = self._diag_log(d)
+            w = CrashLogWriter(str(log))
+            w.write_event("session-start"); w.write_event("game", "X")
+            w.write_sample(Sample(time.time(), [Reading("gpu", "temp", 60.0, "°C", "")]))
+            w.write_event("session-stop", "samples=1")
+            w.close()
+            with mock.patch.object(diagnostic.config, "DIAG_LOG", log), \
+                 mock.patch.object(diagnostic.reccontrol, "running_pid", return_value=None):
+                self.assertIsNone(diagnostic.pending_crash())
+
+    def test_acknowledge_clears_pending_crash(self):
+        with tempfile.TemporaryDirectory() as d:
+            log = self._diag_log(d)
+            _write_log(str(log), "Tarkov")
+            with mock.patch.object(diagnostic.config, "DIAG_LOG", log), \
+                 mock.patch.object(diagnostic.reccontrol, "running_pid", return_value=None):
+                self.assertIsNotNone(diagnostic.pending_crash())
+                diagnostic.acknowledge_crash()
+                self.assertIsNone(diagnostic.pending_crash())
+
+    def test_running_capture_is_not_a_crash(self):
+        with tempfile.TemporaryDirectory() as d:
+            log = self._diag_log(d)
+            _write_log(str(log), "Tarkov")
+            with mock.patch.object(diagnostic.config, "DIAG_LOG", log), \
+                 mock.patch.object(diagnostic.reccontrol, "running_pid", return_value=4321):
+                self.assertIsNone(diagnostic.pending_crash())  # it's in-progress, not crashed
+
+
+if __name__ == "__main__":
+    unittest.main()
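
The four `CrashDetectionTests` cases above pin down one rule: a capture counts as a pending crash only when its last `session-start` has no later `session-stop` and no recorder is running. The sketch below restates that rule in isolation. It is not the code in `core/diagnostic.py`: the JSON-lines layout, the `"event"` field name, and both function names are assumptions made for illustration, and the acknowledge step is left out.

```python
import json
from typing import Iterable, Optional


def last_session_unterminated(lines: Iterable[str]) -> bool:
    """Return True if the most recent session-start has no later session-stop."""
    open_session = False
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # a hard freeze can leave a truncated final line
        if not isinstance(record, dict):
            continue
        event = record.get("event")  # hypothetical field name
        if event == "session-start":
            open_session = True
        elif event == "session-stop":
            open_session = False
    return open_session


def is_pending_crash(lines: Iterable[str], recorder_pid: Optional[int]) -> bool:
    """A live recorder means the capture is in progress, not crashed."""
    if recorder_pid is not None:
        return False
    return last_session_unterminated(lines)
```

Ignoring a line that fails to parse matters here, since the last write before a hard lock may be incomplete; the session is still reported as unterminated.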
Author	SHA1	Message	Date
jessey	ab89dda0b4	Merge pull request 'feat: detect a hard-crashed diagnostic + analyze the crash boot — 0.15.0' (#10) from feat/m6-steam-detection into main. Reviewed-on: #10	2026-05-22 06:53:13 +00:00
jessey	305c88ba09	feat: detect a hard-crashed diagnostic + analyze the crash boot — 0.15.0 A focused capture that ends without a clean stop (no session-stop, no live recorder) is treated as a likely hard freeze. - core/diagnostic.py: pending_crash() detects the unterminated session; acknowledge_crash() dismisses it; analyze_crash() combines the captured window (final readings + GPU-lost) with a focused scan of the PREVIOUS (crashed) boot + SMART/driver/persistence/temps. - health.check_previous_boot() scans `journalctl -k -b -1`; run_health_checks gained include_journal to avoid double-scanning for the crash path. - GUI: Games page shows a warning banner on launch for an interrupted diagnostic with Analyze crash / Dismiss → results dialog. - Tests for crash detection / clean-stop / acknowledge / in-progress. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 08:52:59 +02:00
jessey	82f3ea49de	Merge pull request 'feat(gui): dashboard history graphs for headline metrics — 0.14.0' (#9) from feat/m6-steam-detection into main. Reviewed-on: #9	2026-05-22 06:51:06 +00:00
jessey	8d695227bc	feat(gui): dashboard history graphs for headline metrics — 0.14.0 Replace the four headline gauges (GPU temp, GPU load, CPU temp, memory) with HistoryGraph trend tiles: each plots its session history with the current value, window min/max, a dashed warn-threshold line, and a kind-colored line (temp band / usage / accent). QPainter-drawn, no new dependency. Seeing changes over time is more useful than the live-only snapshot. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 08:45:20 +02:00
jessey	82bef0a08c	Merge pull request 'feat(gui): explain Run Diagnostic + offer to launch the game — 0.13.0' (#8) from feat/m6-steam-detection into main. Reviewed-on: #8	2026-05-22 06:43:57 +00:00
jessey	73f347449e	feat(gui): explain Run Diagnostic + offer to launch the game — 0.13.0 The recording banner gave no guidance, so it wasn't clear what to do after clicking Run Diagnostic. - Start dialog now spells out the flow: play the game, reproduce the crash, then Finish & analyze (data survives a hard freeze + reboot), with "Launch game & start" (steam.launch_game via steam:// appid URL) or "Start without launching". - Recording banner now states the next step, not just a sample count. - steam.launch_game(appid): best-effort Steam launch (steam / xdg-open). - Fix: escape "&" in button labels (Qt mnemonic) so "Finish & analyze" shows correctly instead of "Finish _analyze". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 08:40:50 +02:00
jessey	5cd51beadf	Merge pull request 'feat(gui): Run Diagnostic flow on the Games page — 0.12.0' (#7) from feat/m6-steam-detection into main. Reviewed-on: #7	2026-05-22 06:32:30 +00:00
jessey	934b489fec	feat(gui): Run Diagnostic flow on the Games page — 0.12.0 Brings the guided diagnostic (0.11.0 core/CLI) into the GUI: - Each game row gets a "Run Diagnostic" button → starts a focused, game-tagged capture and shows a recording banner (live sample count + GPU-lost indicator) with Finish & analyze / Discard. - Finishing runs core.diagnostic.finish() off the UI thread and opens a results dialog (gui/diagnostic_dialog.py): window-scoped capture summary + findings cards (reusing render_summary + finding_card). - Banner restores on showEvent if a capture is still running (navigate away/back). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 08:32:04 +02:00
jessey	7a283dc338	Merge pull request 'feat: guided diagnostic session (CLI) — pick a game, capture, analyze — 0.11.0' (#6) from feat/m6-steam-detection into main. Reviewed-on: #6	2026-05-22 06:28:21 +00:00
jessey	5682878f22	feat: guided diagnostic session (CLI) — pick a game, capture, analyze — 0.11.0 The seed use case end to end, orchestrating M3 + M4 (ARCHITECTURE §7.1). - core/diagnostic.py: start(game) runs a focused, game-tagged capture into a dedicated diagnostic log (window-scoped report, separate from the always-on crash log); finish() stops it and combines the capture summary (M3) with the health findings (M4). Game recorded as a log event so it survives crash+reboot. - CLI: rigdoctor diagnose start --game/--appid \| status \| finish. - recorder/record run gained an optional --game tag; reccontrol passes it through. - Tests for game recovery + the finish() combination. GUI/tray "Run Diagnostic" button and auto start/stop (D12 wrapper) come next. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 08:27:53 +02:00
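
The 0.15.0 commit above says `health.check_previous_boot()` scans `journalctl -k -b -1` for Xid/panic/OOM/MCE/AER/thermal lines. That function's body is not part of this diff, so the following is only a rough sketch of what such a scan could look like. The regular expressions, the `classify` and `previous_boot_kernel_log` names, and the timeout value are all assumptions, not RigDoctor's actual patterns or API.

```python
import re
import subprocess
from collections import Counter
from typing import Iterable, List

# Hypothetical patterns, one per category named in the commit message.
PATTERNS = {
    "xid": re.compile(r"NVRM: Xid"),
    "panic": re.compile(r"Kernel panic", re.IGNORECASE),
    "oom": re.compile(r"Out of memory|oom-kill", re.IGNORECASE),
    "mce": re.compile(r"\bmce:|Machine check", re.IGNORECASE),
    "aer": re.compile(r"\bAER:"),
    "thermal": re.compile(r"temperature above threshold|thermal", re.IGNORECASE),
}


def classify(lines: Iterable[str]) -> Counter:
    """Count kernel-log lines per category; each line counts at most once."""
    counts: Counter = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break
    return counts


def previous_boot_kernel_log(timeout: float = 15.0) -> List[str]:
    """Best-effort read of the previous boot's kernel log; empty list on failure."""
    try:
        proc = subprocess.run(
            ["journalctl", "-k", "-b", "-1", "--no-pager"],
            capture_output=True, text=True, timeout=timeout, check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []  # journalctl missing, not permitted, or too slow
    return proc.stdout.splitlines() if proc.returncode == 0 else []
```

A caller would run `classify(previous_boot_kernel_log())` and turn non-zero counts into findings. Keeping the subprocess call separate from the classifier lets the matching logic be tested against canned log lines without a journal.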