fix(cli): correct the missing-PySide6 hint to the real apt packages — 0.36.1

rigdoctor gui suggested 'apt install python3-pyside6' (no such package on Debian/Ubuntu). Point to the split modules instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs: document the proper (GPG-verified, deb822) apt setup
2026-05-22 15:48:20 +02:00 · 2026-05-22 15:44:41 +02:00 · 2026-05-22 13:39:01 +00:00 · 2026-05-22 15:38:12 +02:00 · 2026-05-22 13:32:41 +00:00 · 2026-05-22 15:31:36 +02:00
32 changed files with 1755 additions and 149 deletions
@@ -11,7 +11,20 @@ on:
    branches: [main]
 jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install (core only)
        run: python -m pip install -e .
      - name: Run tests
        run: python -m unittest discover -s tests -v
  release:
    needs: test          # don't publish a release if the tests fail
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
@@ -30,6 +43,9 @@ jobs:
      - name: Build self-extracting installer (.run)
        run: python packaging/make_run.py
      - name: Build .deb
        run: python packaging/make_deb.py
      - name: Read version
        id: ver
        run: |
@@ -90,3 +106,26 @@ jobs:
              "${API}/releases/${rid}/assets?name=$(basename "$f")" >/dev/null
          done
          echo "Published ${TAG}."
      - name: Publish .deb to the Gitea apt registry (optional — needs REGISTRY_TOKEN)
        env:
          PKG_TOKEN: ${{ secrets.REGISTRY_TOKEN }}
        run: |
          set -euo pipefail
          if [ -z "${PKG_TOKEN:-}" ]; then
            echo "REGISTRY_TOKEN not set — skipping apt publish (the .deb is still a release asset)."
            exit 0
          fi
          OWNER="${{ github.repository_owner }}"
          URL="${{ github.server_url }}/api/packages/${OWNER}/debian/pool/stable/main/upload"
          for f in dist/*.deb; do
            echo "Uploading $(basename "$f") to the apt registry…"
            code=$(curl -sS -o /tmp/apt_upload.txt -w '%{http_code}' \
              --user "${OWNER}:${PKG_TOKEN}" --upload-file "$f" "$URL" || true)
            case "$code" in
              2*)  echo "  uploaded ($code)";;
              409) echo "  already published ($code) — skipping (registry versions are immutable)";;
              *)   echo "  upload failed ($code):"; cat /tmp/apt_upload.txt || true; exit 1;;
            esac
          done
          echo "apt source: deb ${{ github.server_url }}/api/packages/${OWNER}/debian stable main"
@@ -0,0 +1,44 @@
 name: tests
 run-name: Run test suite
 # Runs the unittest suite on pull requests (once per PR). Pushes to main are covered by the
 # `test` job in release.yml, so we don't trigger on push here — that would double every run.
 # Two jobs:
 #   core      — stdlib-only install; the GUI tests skip (@skipUnless HAVE_QT). Bulletproof.
 #   gui-smoke — installs the GUI extra + offscreen Qt libs and runs the same suite headless,
 #               exercising the MainWindow/SetupWizard/DiagnosticDialog construction tests.
 # Make `tests / core (pull_request)` a required status check on `main` so a PR can't merge red.
 on:
  pull_request:
 jobs:
  core:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install (core only — no PySide6)
        run: python -m pip install -e .
      - name: Run tests (GUI tests skip without PySide6)
        run: python -m unittest discover -s tests -v
  gui-smoke:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: System libraries for offscreen Qt
        run: |
          sudo apt-get update
          sudo apt-get install -y libegl1 libgl1 libxkbcommon0 libdbus-1-3 libglib2.0-0
      - name: Install (with GUI extra)
        run: python -m pip install -e ".[gui]"
      - name: Run tests (headless)
        env:
          QT_QPA_PLATFORM: offscreen
        run: python -m unittest discover -s tests -v
@@ -5,6 +5,116 @@ All notable changes to RigDoctor are recorded here. Format follows
 (`MAJOR.MINOR.PATCH`, pre-1.0). `__version__` and `pyproject.toml` must match the git
 release tag (so the auto-updater, D18, can compare versions).
 ## [0.36.1] - 2026-05-22
 ### Fixed
 - `rigdoctor gui` printed the wrong fix when PySide6 is missing — it suggested the non-existent
  `python3-pyside6` package. Now it names the real split modules
  (`python3-pyside6.qt{widgets,gui,websockets,svg}` + `python3-pyte`).
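As a rough illustration of the corrected hint, the brace shorthand above expands to five concrete package names. The helper name `missing_gui_hint` and the exact wording of the message are assumptions for this sketch, not taken from the RigDoctor source:

```python
# Hypothetical helper: builds the apt command a "PySide6 missing" hint could print.
# The package names come from the changelog entry; everything else is illustrative.
QT_MODULES = ("widgets", "gui", "websockets", "svg")


def missing_gui_hint() -> str:
    """Return an apt command naming the split PySide6 modules plus pyte."""
    pkgs = [f"python3-pyside6.qt{name}" for name in QT_MODULES]
    pkgs.append("python3-pyte")
    return "sudo apt install " + " ".join(pkgs)


if __name__ == "__main__":
    print(missing_gui_hint())
```

Keeping the module list in one tuple means the hint and any packaging metadata could share a single source of truth, which is the kind of drift this fix addresses.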
 ## [0.36.0] - 2026-05-22
 ### Fixed
 - **`.deb` now installs all dependencies automatically — no manual tool install.** The previous
  `Recommends: python3-pyside6` named a package that doesn't exist on Debian/Ubuntu (PySide6 is
  split per module), so apt silently skipped it and the GUI wouldn't start. Now it Recommends the
  actual modules the GUI imports — `python3-pyside6.qt{widgets,gui,websockets,svg}` + `python3-pyte`.
 ### Changed
 - **`apt install rigdoctor` sets up the whole toolset.** The `.deb` also Recommends the optional
  diagnostic/gaming tools (smartmontools, lm-sensors, dmidecode, pciutils, libnotify-bin,
  libsecret-tools, gamemode, mangohud) so they install by default — users never hand-install
  tools. `cpupower` is a Suggests (kernel-tied); `--no-install-recommends` still gives CLI-only.
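A minimal sketch of how the dependency fields described above might be laid out in a `DEBIAN/control` file. The package lists come from this changelog and the description from the README; the `control_text` helper and its parameters are hypothetical and are not the actual `packaging/make_deb.py` code:

```python
# Hypothetical sketch of a control-file generator; not the real make_deb.py.
GUI_RECOMMENDS = [f"python3-pyside6.qt{m}" for m in ("widgets", "gui", "websockets", "svg")]
GUI_RECOMMENDS.append("python3-pyte")
TOOL_RECOMMENDS = [
    "smartmontools", "lm-sensors", "dmidecode", "pciutils",
    "libnotify-bin", "libsecret-tools", "gamemode", "mangohud",
]


def control_text(version: str, maintainer: str) -> str:
    """Render the control fields, one 'Name: value' line per field."""
    fields = {
        "Package": "rigdoctor",
        "Version": version,
        "Architecture": "all",          # pure Python, so one build fits every arch
        "Maintainer": maintainer,
        "Depends": "python3",           # hard requirement: the stdlib-only core
        "Recommends": ", ".join(GUI_RECOMMENDS + TOOL_RECOMMENDS),
        "Suggests": "cpupower",         # named as a Suggests in the changelog
        "Description": "Hardware monitoring & crash diagnostics for Linux gamers",
    }
    return "".join(f"{key}: {value}\n" for key, value in fields.items())
```

Because apt installs Recommends by default, this layout gives the full toolset on a plain install, while `--no-install-recommends` leaves only the `Depends` line in effect.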
 ## [0.35.0] - 2026-05-22
 ### Added
 - **`.deb` package (M9 / D8)** — `packaging/make_deb.py` builds a `rigdoctor_<version>_all.deb`
  (pure-Python, `Architecture: all`) via `dpkg-deb`: `Depends: python3`, with the GUI deps
  (`python3-pyside6`, `python3-pyte`) as **Recommends** so `sudo apt install ./rigdoctor_*.deb`
  gives the full app and `--no-install-recommends` gives CLI-only. Installs the package, both
  launchers, the desktop entry, and the icon. CI (`release.yml`) builds it as a **release asset**
  every release, and optionally publishes it to the Gitea **apt registry** (set a `REGISTRY_TOKEN`
  secret) for `sudo apt install rigdoctor`. **M9 is now complete.**
 ## [0.34.0] - 2026-05-22
 ### Added
 - **Event-based alerts (M8).** Beyond temperature + GPU-lost, RigDoctor now notifies on
  **critical kernel events** — Xid (GPU error), out-of-memory kills, CPU machine-checks, PCIe
  AER errors, and disk I/O errors — scanned from the kernel log every ~30s while monitoring and
  fired one-shot (cooldown-gated, so no spam). A proactive warning the moment something goes
  wrong, not just on a temperature threshold. Included whenever desktop notifications are on.
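The cooldown-gated, one-shot behaviour described above could look roughly like this. The regular expressions, the `EventAlerter` class, and the 300-second default are all assumptions made for illustration; the real matcher in RigDoctor may differ:

```python
import re

# Illustrative patterns only; actual kernel-log wording varies by driver and kernel.
PATTERNS = {
    "xid": re.compile(r"NVRM: Xid"),
    "oom": re.compile(r"Out of memory: Killed process"),
    "mce": re.compile(r"Machine check", re.IGNORECASE),
    "aer": re.compile(r"AER:"),
    "disk_io": re.compile(r"I/O error"),
}


class EventAlerter:
    """Fire each event kind at most once per cooldown window."""

    def __init__(self, cooldown_s: float = 300.0) -> None:
        self.cooldown_s = cooldown_s
        self._last_fired: dict[str, float] = {}

    def scan(self, lines, now: float) -> list[str]:
        """Return the event kinds that should notify for this batch of log lines."""
        fired = []
        for line in lines:
            for kind, pattern in PATTERNS.items():
                if not pattern.search(line):
                    continue
                last = self._last_fired.get(kind)
                if last is None or now - last >= self.cooldown_s:
                    self._last_fired[kind] = now
                    fired.append(kind)
        return fired
```

Passing `now` in explicitly (rather than calling a clock inside `scan`) keeps the gate easy to unit-test; a caller polling every ~30s would supply `time.monotonic()`.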
 ## [0.33.0] - 2026-05-22
 ### Added
 - **AI explanations stream live.** "Explain with AI" now fills token-by-token as the model
  generates (Ollama NDJSON + Claude SSE, both via stdlib `urllib`) instead of a multi-second
  freeze, then re-renders the finished answer as Markdown. `core/ai.explain_stream()`.
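To make the NDJSON half of the streaming concrete, here is a small sketch that turns a stream of newline-delimited JSON objects into text fragments. It assumes each object carries a `response` string and a `done` flag, which matches Ollama's generate-style stream as far as I know; the function name `iter_ndjson_tokens` is hypothetical and this is not the body of `core/ai.explain_stream()`:

```python
import json
from typing import Iterable, Iterator


def iter_ndjson_tokens(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from an NDJSON stream, stopping at the 'done' object.

    `lines` can be any iterable of byte lines, for example an open
    urllib response iterated line by line.
    """
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue                      # tolerate blank keep-alive lines
        obj = json.loads(raw)
        text = obj.get("response", "")    # assumed field name
        if text:
            yield text
        if obj.get("done"):               # assumed end-of-stream marker
            break
```

A UI would append each yielded fragment to the text widget as it arrives, then re-render the joined string as Markdown once the generator is exhausted.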
 ## [0.32.0] - 2026-05-22
 ### Added
 - **More for diagnostics & reports:**
  - **`nvidia-smi -q` snapshot** — driver, throttle/clock-event reasons, clocks, power, temps,
    PCIe link, ECC + retired pages (point-in-time at diagnostic time).
  - **Display-server log** — auto-detected: `Xorg.0.log` on X11, or the compositor's user-journal
    slice (gnome-shell/kwin/sway/gamescope) on Wayland.
  - **Full system inventory** (M5 hardware/OS) is now included in each stored diagnostic and the
    **Report** bundle — invaluable for larger/shared debugging.
  These join the kernel log + coredump records in `syslogs.txt`/`inventory.*`, are saved per
  diagnostic, included in the Report zip, and (logs) fed to the AI on "Explain".
 ## [0.31.0] - 2026-05-22
 ### Added
 - **Diagnostics now collect session-scoped system logs** (`core/syslogs.py`): a kernel-log
  slice (`journalctl -k` — Xid, OOM-killer, MCE, PCIe AER, thermal, hung tasks) and
  **crashed-process records** (`coredumpctl` — which executable, signal, and when). They're saved
  to the diagnostic directory (`syslogs.txt`), included in the **Report** bundle, and fed to the
  AI on "Explain" alongside the game logs. Best-effort — degrades quietly if the tools are
  missing or access is denied; scoped to the session window so it doesn't drag in old noise.
 ## [0.30.0] - 2026-05-22
 ### Added
 - **Logging & report bundles (M15, D25)** — opt-in via one **Settings → Logging** toggle
  (default off). When on: the app logs to a rotating `app.log`, and **each diagnostic is stored
  in its own folder** (`~/.local/share/rigdoctor/diagnostics/<id>/`) with the capture log, a
  structured `result.json`, a readable `report.txt`, a session-scoped game-log snapshot, and an
  `ai/` record of every AI interaction — **the exact data sent, which model, and its reply**.
 - **Report** — a button on the diagnostic dialog (and `rigdoctor bundle`) zips a diagnostic's
  folder plus `app.log` into `~/.local/share/rigdoctor/reports/<id>.zip` for sharing. Everything
  stays local; the zip only leaves your machine if you share it. Available only when logging is on.
 ## [0.29.0] - 2026-05-22
 ### Added
 - **AI now resolves Steam app IDs from your library instead of guessing.** When app IDs appear
  in the logs/findings, RigDoctor looks them up in your scanned games (`steam.appid_names()`) and
  injects an "App IDs (resolved from your installed games)" glossary into the prompt — so the
  model names games correctly (e.g. `2694490 = Path of Exile 2`) rather than hallucinating. Only
  IDs it can resolve locally are listed; no network, no model "training" needed.
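A simplified sketch of the glossary step: find digit runs in the text, keep only those present in a locally built ID-to-name map, and format them for the prompt. The `appid_glossary` function and its plain digit-matching heuristic are assumptions for illustration; only the glossary heading and the `2694490 = Path of Exile 2` example come from the entry above:

```python
import re


def appid_glossary(text: str, names: dict[str, str]) -> str:
    """Build a prompt glossary for app IDs in `text` that `names` can resolve.

    `names` maps app-ID strings to game names, e.g. the result of a local
    library scan. Unknown IDs are skipped rather than guessed.
    """
    resolved: list[str] = []
    for token in re.findall(r"\d+", text):
        if token in names and token not in resolved:
            resolved.append(token)        # keep first-seen order, no duplicates
    if not resolved:
        return ""
    lines = ["App IDs (resolved from your installed games):"]
    lines += [f"{appid} = {names[appid]}" for appid in resolved]
    return "\n".join(lines)
```

Returning an empty string when nothing resolves lets the caller omit the section entirely instead of sending the model an empty glossary.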
 ## [0.28.1] - 2026-05-22
 ### Fixed
 - **AI explanations were misreading stale/benign logs.** Three fixes so the model analyses the
  *actual* session: (1) the prompt now states the **real game name, capture duration, and
  outcome** (clean vs. crash) so the model stops guessing the game from log paths; (2) game logs
  are **scoped to the session window** (Steam-console lines filtered by timestamp; a stale
  per-app Proton log from an earlier game is skipped); (3) the reference KB flags common
  **benign** Steam/Proton lines (`libnvidia-ml.so.1` assertion, routine minidump uploads, "fork
  without exec") so they aren't reported as the cause. The system prompt also forbids
  Windows-only advice (no "run as administrator") and tells the model not to invent a problem
  when the run was clean.
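The timestamp filter in fix (2) can be sketched as below. It assumes log lines begin with a bracketed `[YYYY-MM-DD HH:MM:SS]` stamp; that format, the `scope_to_session` name, and the choice to drop unstamped lines are assumptions for this example rather than details confirmed by the changelog:

```python
import re
from datetime import datetime

# Assumed line prefix: "[2026-05-22 15:35:00] message"
_STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")


def scope_to_session(lines, start: datetime, end: datetime) -> list[str]:
    """Keep only lines whose timestamp falls inside [start, end]."""
    kept = []
    for line in lines:
        match = _STAMP.match(line)
        if not match:
            continue                      # no timestamp: cannot place it in the window
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if start <= stamp <= end:
            kept.append(line)
    return kept
```

Dropping unstamped lines is the conservative choice here; attaching them to the preceding stamped line would preserve multi-line errors at the cost of a little more bookkeeping.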
 ## [0.28.0] - 2026-05-22
 ### Added
 - **AI explanations now include recent game logs.** When you press "Explain with AI" on a
  diagnostic, RigDoctor also gathers recent **Proton** (`~/steam-<appid>.log`) and **Steam**
  console logs (`core/gamelogs.py`, tail-read + size-bounded) and passes them to the model, so
  it can correlate log errors with the sensor findings and pinpoint *when* something went wrong.
 ### Fixed
 - The AI explanation popup now **renders Markdown** (headings, bold, lists) instead of showing
  raw `###`/`**` — `QTextEdit.setMarkdown`, and the model is told to answer in Markdown.
 ## [0.27.1] - 2026-05-22
 ### Changed
 - AI assistant: selecting **Ollama** now pre-fills the model field with **`qwen2.5:7b`** (a
  strong 7B that fits an 8 GB GPU; our grounding makes a 7B sufficient). It won't overwrite a
  model you've already entered, and you can change it freely.
 ## [0.27.0] - 2026-05-22
 ### Added
 - **AI assistant (M14, D24)** — optional, **strictly opt-in, never automatic**. Explains your
@@ -1,132 +1,146 @@
 # RigDoctor
-A **modular diagnostics, monitoring, and health-check toolkit for Linux gamers.**
+**Hardware monitoring & crash diagnostics for Linux gamers.** Live sensors, crash-safe
 logging, plain-language health reports, per-game diagnostics, and optional AI explanations —
 in a desktop app, a tray applet, or the terminal. Ubuntu/Debian + NVIDIA first.
-> **Status:** 🟢 Phase 1 (MVP) complete. The **sensor core (M1)**, **crash-capture logger
+Linux gaming faults are hard to pin down — GPUs falling off the PCIe bus, black screens
-> (M3)**, and **health report (M4)** all work — live `snapshot`/`monitor`, crash-safe `record`
+mid-game, silent thermal/VRAM throttling, driver/Proton mismatches. The useful data is
-> with a post-crash report, and `report` to scan logs/SMART/driver for likely causes. A
+scattered across `nvidia-smi`, `/sys`, `journalctl`, and SMART, and the readings right before a
-> desktop GUI (M10) ties them together (dashboard, recording, health). See `docs/ROADMAP.md`.
+freeze are usually lost. RigDoctor pulls it together and keeps the evidence.
-## Why this exists
+## Features
-Linux gaming hardware faults are hard to diagnose: GPUs falling off the PCIe bus, the screen
+- **Live monitoring** — a dark desktop **dashboard** (history graphs + per-subsystem cards), a
-suddenly going black mid-game, silent thermal/VRAM throttling, power transients,
+  **tray applet** with at-a-glance status, and a terminal view (`rigdoctor monitor`).
-driver/library mismatches, Proton quirks, and CPU governor / power-profile misconfiguration.
+- **Crash-safe recording** — background logger that `fsync`s every sample, so the state right
-The data needed to diagnose them is scattered across `nvidia-smi`, `/sys/class/hwmon`,
+  before a hard freeze survives. Manual, always-on, or auto-start when a game launches.
-`journalctl`, SMART, and more — and the most useful readings (the ones right before a hard
+- **Health report** — scans `journalctl`/SMART/driver for likely causes (Xid, OOM, disk
-freeze) are usually lost because nothing flushed them to disk.
+  errors, throttling…) and explains them with suggested fixes.
 - **Per-game diagnostics** — pick a game, capture while you play, get a focused report; hard
  crashes are detected and analysed on next launch.
 - **Gaming tune-ups** — flags risky settings (CPU governor, PCIe ASPM, persistence mode…) with
  **one-click, reversible fixes**.
 - **Proactive alerts** — desktop notifications on overheating and critical kernel events
  (GPU-lost, Xid, out-of-memory, disk I/O).
 - **AI explanations** *(optional, opt-in)* — explain a diagnostic in plain language with a
  **local model (Ollama)** or **Claude**. Never automatic; only when you press the button.
 - **Shareable reports** — zip a diagnostic (logs, inventory, AI transcript) to hand to someone,
  or share a live **terminal session** for remote help.
 - **Self-updating** — `apt upgrade`, or the in-app updater.
-RigDoctor pulls all of that into one modular tool: live monitoring, crash-safe logging, a
+## Install
 one-shot health report, and an interactive installer that only sets up the modules a given
 user actually needs for their hardware.
-**Seed use cases:** an RTX 3070 that intermittently "falls off the bus" under heavy GPU load
+### Debian / Ubuntu — `.deb`
 (Path of Exile on Linux, Escape from Tarkov on Windows), and a monitor going black mid-game.
 See `docs/SPEC.md` §1.
-## How you run it
+The simplest path: grab the latest **`rigdoctor_<version>_all.deb`** from the
-
+[releases page](https://git.jesseyvanofferen.com/jessey/rigdoctor/releases) and install it —
-RigDoctor is **GUI-first** — the desktop app is the primary way in — but every feature is
+apt pulls the GUI dependencies (PySide6, pyte) automatically:
 also available headless:
 - **Desktop GUI** — graphical dashboard, recording controls, log browser, reports. The
  default interface for most users.
 - **Tray applet** — a small top-menu-bar applet with quick actions and at-a-glance status.
 - **CLI** — full functionality from the terminal; works over SSH and in scripts.
 The GUI/tray are optional modules; a headless (CLI-only) install loses no capability.
 ## Key decisions (settled)
 | Topic | Decision |
 |-------|----------|
 | Name | **RigDoctor** |
 | Language / stack | **Python 3 + Qt (PySide6)** — core/CLI/daemon stdlib-only; Qt only for GUI/tray |
 | Primary distro | **Ubuntu** (Debian via apt); others best-effort later |
 | Primary GPU | **NVIDIA** first; AMD, then Intel later |
 | MVP | **Sensor core + crash logger + health report** (NVIDIA-only, CLI-first) |
 | Distribution | **User-local install** (self-updating from the public repo, no root); **`.deb`** optional |
 | Scope of action | **Read-only + suggestions** (no auto-apply yet) |
 | Stress tests | **Out of scope** |
 Full rationale and the still-open questions are in `docs/DECISIONS.md`.
 ## Repo layout
 | Path | Purpose |
 |------|---------|
 | `docs/SPEC.md` | Product specification — vision, requirements, modules (the main planning doc) |
 | `docs/ARCHITECTURE.md` | Technical design — core engine, front-ends, daemon, installer |
 | `docs/MODULES.md` | Catalog of modules with scope, dependencies, status |
 | `docs/ROADMAP.md` | Phased milestones |
 | `docs/DECISIONS.md` | Decision log + remaining open questions |
 | `src/rigdoctor/` | Source code — `core/` engine + sources, `cli.py`, `render.py` |
 | `installer/` | Installer / `.deb` packaging (empty until Phase 4) |
 | `tests/` | Tests (stdlib `unittest`) |
 ## Install (user-local, no root)
 RigDoctor installs into a private venv under `~/.local` — no root, self-updating:
 ```bash
-./install.sh                 # from a source checkout or the self-extracting .run
+sudo apt install ./rigdoctor_*_all.deb        # CLI only: add --no-install-recommends
 ./install.sh --ref v0.0.6    # install a specific released tag (needs a token)
 ./install.sh --uninstall     # remove it
 ```
-This adds `rigdoctor` / `rigdoctor-gui` to `~/.local/bin` and a desktop entry. Each release
+**Or add the apt repository** for `apt install` + automatic updates. The registry is private and
-also ships a one-file **`.run`** installer (download, `chmod +x`, run). Updates are gated to
+GPG-signed, so you need a Gitea token with **`read:package`**, the signing key, and the deb822
-accounts on the Git server (a Personal Access Token); save one via the GUI **Setup → Update
+source (`read -rsp` keeps the token out of your shell history):
 access** panel or `rigdoctor login`, then `rigdoctor update` (or the sidebar button).
 ## Run it (dev)
 Stdlib-only, no install needed (target is Python ≥ 3.11; tested on 3.14):
 ```bash
-PYTHONPATH=src python3 -m rigdoctor snapshot     # one-shot sensor read
+read -rsp 'Gitea read:package token: ' TOKEN; echo
-PYTHONPATH=src python3 -m rigdoctor snapshot --json
+
-PYTHONPATH=src python3 -m rigdoctor monitor -n 1 # live view (Ctrl-C to quit)
+# signing key → dearmored into the keyring (the key endpoint requires the token too)
-PYTHONPATH=src python3 -m rigdoctor sources       # list detected sensor sources
+sudo install -d -m 0755 /etc/apt/keyrings
-PYTHONPATH=src python3 -m unittest discover -s tests
+curl -fsSL --user <user>:"$TOKEN" \
  https://git.jesseyvanofferen.com/api/packages/jessey/debian/repository.key \
  | sudo gpg --dearmor -o /etc/apt/keyrings/gitea-jessey.gpg
 # download credentials, kept out of the sources file
 printf 'machine git.jesseyvanofferen.com login <user> password %s\n' "$TOKEN" \
  | sudo tee /etc/apt/auth.conf.d/rigdoctor.conf >/dev/null
 sudo chmod 0600 /etc/apt/auth.conf.d/rigdoctor.conf
 # the source (modern deb822 format, GPG-verified, all-arch)
 sudo tee /etc/apt/sources.list.d/rigdoctor.sources >/dev/null <<'EOF'
 Types: deb
 URIs: https://git.jesseyvanofferen.com/api/packages/jessey/debian
 Suites: stable
 Components: main
 Architectures: all
 Signed-By: /etc/apt/keyrings/gitea-jessey.gpg
 EOF
 sudo apt update && sudo apt install rigdoctor
 ```
-### Crash-capture logger (M3)
+Then `sudo apt upgrade` keeps it current. *(Quick-and-dirty alternative if the registry isn't
 signed: skip the key and use a one-line `deb [arch=all trusted=yes] …/debian stable main` source.)*
-A crash-safe background logger (JSONL, `fsync` per sample, bounded by rotation) for catching
+### Any distro — self-extracting `.run` (no root)
-the state right before a freeze:
+
 Download **`rigdoctor-<version>-installer.run`** from the releases page and run it. It installs
 into a private virtualenv under `~/.local` (no root), adds the launchers + desktop entry, and
 opens the first-run setup wizard:
 ```bash
-rigdoctor record start          # start logging in the background
+sh rigdoctor-*-installer.run
 rigdoctor record status         # is it running? latest readings, sample count
 rigdoctor record stop           # stop it
 rigdoctor record report         # post-crash summary: peaks, events, last samples
 rigdoctor record run            # run in the foreground (the systemd-ready entrypoint)
 ```
-Logs live in `~/.local/share/rigdoctor/logs/`. It detects GPU "lost"/hang (nvidia-smi query
+### Updating & removing
 timeout) and writes an event marker. Trigger modes (always-on / game-launch) and the
 `systemd --user` service arrive in Phase 4.
-### Desktop GUI (M10)
+- **`.deb`:** `sudo apt upgrade` (or reinstall a newer `.deb`).
 - **`.run` / user-local:** the in-app **Update** button, or `rigdoctor update`.
 - **Remove:** `sudo apt remove rigdoctor`, or `rigdoctor uninstall` for the user-local install.
-The GUI uses PySide6 (Qt) — the only part of RigDoctor that needs a non-stdlib dep:
+## Using it
 Launch **RigDoctor** from your app menu, or:
 ```bash
-pip install -e '.[gui]'   # core + PySide6, gives `rigdoctor` and `rigdoctor-gui`
+rigdoctor-gui          # desktop app (+ tray)
-rigdoctor gui             # or: rigdoctor-gui
+rigdoctor --help       # everything from the terminal (works over SSH)
 ```
-It opens a dark-themed window with sidebar navigation and a **live dashboard** over the
+Handy CLI commands:
 same sensor core — circular gauges for the headline metrics plus collapsible per-subsystem
 cards (GPU/CPU/memory/storage) with temperature-colored values (icy-blue → green → red).
 The **Logs** and **Health** sections are full pages (recording controls + post-crash report;
 and the kernel-log / SMART / driver scan). **Inventory** is a placeholder until M5 lands.
-Without the GUI extra, `pip install -e .` gives just the stdlib-only CLI.
+```bash
 rigdoctor snapshot              # one-shot reading of every sensor
 rigdoctor monitor              # live terminal dashboard
 rigdoctor report               # health report (logs / SMART / driver)
 rigdoctor diagnose start|finish # capture while gaming, then analyse
 rigdoctor gameenv              # flag risky gaming settings + fixes
 rigdoctor inventory            # hardware/OS inventory
 rigdoctor ai explain           # AI explanation of the current findings (opt-in)
 rigdoctor bundle               # zip the latest diagnostic into a shareable report
 ```
-## Start here
+## Requirements
-1. Read `docs/SPEC.md` for what we're building.
+- **Linux** — Ubuntu/Debian first-class (the `.deb`); the `.run` works on any distro with
-2. Read `docs/ROADMAP.md` for the build order (Phase 1 = the MVP).
+  Python ≥ 3.11.
-3. Read `docs/DECISIONS.md` for the settled decisions (D1–D15).
+- **GPU** — NVIDIA fully supported (via `nvidia-smi`); AMD/Intel sensors are best-effort.
-</content>
+- **CLI/daemon** need only Python 3 (stdlib). The **GUI/tray** add **PySide6** (`python3-pyside6`).
 - Optional tools unlock more: `smartmontools`, `lm-sensors`, `gamemode`, `mangohud`. The setup
  wizard offers to install them.
 ## Privacy
 Everything stays on your machine — no telemetry, no phone-home. The AI assistant is **off by
 default** and runs only when you explicitly trigger it; with Ollama nothing leaves the machine,
 and the Claude option asks before sending. Reports are local files; they leave only if you share
 the zip.
 ## Development
 RigDoctor's core is stdlib-only Python; the GUI/tray use PySide6.
 ```bash
 git clone https://git.jesseyvanofferen.com/jessey/rigdoctor && cd rigdoctor
 pip install -e ".[gui]"                    # core + GUI; omit [gui] for CLI-only
 python -m unittest discover -s tests       # run the test suite
 PYTHONPATH=src python3 -m rigdoctor snapshot   # run without installing
 ```
 Design docs live in `docs/` — `SPEC.md` (vision/requirements), `ARCHITECTURE.md`,
 `MODULES.md` (module catalog), `ROADMAP.md`, and `DECISIONS.md` (the decision log).
 Contributions: branch off `main`, keep tests green (CI runs them on PRs), and bump the version
 + `CHANGELOG.md` for shipped changes.
@@ -0,0 +1,17 @@
 <svg xmlns="http://www.w3.org/2000/svg" width="512" height="512" viewBox="0 0 512 512">
  <defs>
    <radialGradient id="bg" cx="50%" cy="42%" r="78%">
      <stop offset="0%" stop-color="#1b2230"/>
      <stop offset="100%" stop-color="#0d0f13"/>
    </radialGradient>
  </defs>
  <rect width="512" height="512" fill="url(#bg)"/>
  <!-- gauge ring -->
  <circle cx="256" cy="256" r="168" fill="none" stroke="#2a2f39" stroke-width="28"/>
  <!-- accent sweep -->
  <path d="M256 88 a168 168 0 1 1 -118.8 49.2" fill="none" stroke="#38bdf8"
        stroke-width="28" stroke-linecap="round"/>
  <!-- heartbeat / monitoring trace -->
  <path d="M120 264 H200 L232 192 L280 336 L312 264 H392" fill="none" stroke="#e6e8eb"
        stroke-width="28" stroke-linecap="round" stroke-linejoin="round"/>
 </svg>
@@ -264,9 +264,21 @@ root cause + suggested next steps). Adds M14 to the D14 set.
  as suggestions (consistent with D9 — it explains/recommends, applying fixes stays
  consent-gated). No new runtime dependency (HTTP via stdlib).
 ### D25 — Logging & report bundles (M15) — *DECIDED 2026-05-22*
 Opt-in logging + shareable diagnostic reports.
 - **One combined `logging_enabled` toggle** (default off) controls both application logging
  (rotating `app.log`) and per-diagnostic storage. Kept as a single switch for simplicity.
 - **Each diagnostic is stored in its own directory** (`DATA_DIR/diagnostics/<id>/`): capture
  log, structured `result.json`, human-readable `report.txt`, a scoped game-log snapshot, and an
  `ai/` folder recording each AI interaction (**exact data sent, provider+model, and the reply**).
 - **"Report"** zips one diagnostic directory (plus `app.log`) into `DATA_DIR/reports/` —
  auto-saved there (no save dialog), shown with its path. Available only when logging is on
  (nothing is stored otherwise). CLI: `rigdoctor bundle`.
 - Everything stays local; the report only leaves the machine if the user shares the zip.
 ## Open
-None currently — all tracked decisions (D1–D24) are resolved. New questions will be added
+None currently — all tracked decisions (D1–D25) are resolved. New questions will be added
 here as they arise. Remaining detail to flesh out during build: the tray's supporting-action
 set (D13), per-module apt package names, M12's tunnel/token specifics, and M13's
 update mechanism (APT repo vs. self-installed `.deb`).
@@ -2,7 +2,8 @@
 Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done
-> Module set per D14, plus **M12 (session sharing, D16)** and **M13 (auto-update, D18)**.
+> Module set per D14, plus **M12 (session sharing, D16)**, **M13 (auto-update, D18)**,
 > **M14 (AI assistant, D24)**, and **M15 (logging & reports, D25)**.
 > **M7 (stress/repro) was dropped (D7).** M10/M11 are the GUI and tray modules (D10/D11).
 > GPU scope reads "all (NVIDIA first)" — NVIDIA first, others via the vendor abstraction (D4).
@@ -17,10 +18,11 @@ Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done
 | M6 | Gaming env checks | Diagnostics | none | all | P2 | 🟨 |
 | M10 | Desktop GUI | Desktop UI | **python3-pyside6** | all | P2 | ✅ |
 | M11 | Tray / menu-bar applet | Desktop UI | **python3-pyside6** (+ AppIndicator on GNOME) | all | P2 | ✅ |
-| M9 | Installer | (meta) | none | all | P1 | 🟨 |
+| M9 | Installer (+ `.deb`) | (meta) | none | all | P1 | ✅ |
 | M12 | Session sharing (shared terminal) | Sharing | none (relay) | all | P3 | ✅ |
 | M13 | Auto-update | (core) | none (stdlib; user-local file swap) | all | P3 | ✅ |
 | M14 | AI assistant (explain diagnostics) | (optional) | none (stdlib urllib; Ollama or Claude) | all | P3 | ✅ |
 | M15 | Logging & report bundles | (core) | none (stdlib logging + zip) | all | P3 | ✅ |
 | ~~M7~~ | ~~Stress / repro~~ | — | — | — | — | ❌ dropped (D7) |
 ## Notes per module
@@ -128,6 +130,16 @@ Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done
  which lifts a small local model and sharpens Claude. Stdlib `urllib` (no pip deps); output is
  advisory (D9). Configure in **Settings → AI assistant**.
 - **M15 Logging & report bundles** (D25) — opt-in via one `logging_enabled` toggle (default off):
  application logging to a rotating `app.log` (`core/applog.py`) and **per-diagnostic storage**
  (`core/diagstore.py`) — each diagnostic gets its own `DATA_DIR/diagnostics/<id>/`: capture,
  `result.json`, `report.txt`, the full **inventory** (M5: hardware/OS), scoped **game logs**
  (`core/gamelogs.py`), scoped **system logs** (`core/syslogs.py` — `journalctl -k`,
  `coredumpctl`, an `nvidia-smi -q` snapshot, and the X11/Wayland display-server log), and an
  `ai/` record of every AI interaction (exact data sent, model, reply). **"Report"** zips one
  into `DATA_DIR/reports/` (GUI button on the diagnostic dialog; CLI `rigdoctor bundle`). Logs
  are session-scoped and fed to the AI on "Explain". Stays local; shareable on demand.
 ## Bundles (final — D14)
 - **Essential:** M1 + M3 + M4  *(the MVP, NVIDIA-only — D5)*
 - **Monitoring:** M2 + M8
@@ -67,9 +67,12 @@ Ubuntu + NVIDIA first; `.deb` distribution (see `DECISIONS.md`).
      Settings "Recording trigger") incl. the zero-config **game-launch watcher**
      (`core/watcher.py`, `rigdoctor watch`); and a **graphical first-run setup wizard**
      (`gui/setup_wizard.py`): environment → dependency-bundle selection → install → recording
-      trigger → readiness, auto-launched by install.sh and re-runnable from Settings.
+      trigger → readiness, auto-launched by install.sh and re-runnable from Settings; and a
-      *Pending:* `.deb` packaging (next bullet).
+      **`.deb`** (`packaging/make_deb.py`, `Architecture: all`, `Depends: python3`,
- [ ] `.deb` packaging (D8) declaring per-bundle deps incl. python3-pyside6 for Desktop UI
+      `Recommends: python3-pyside6/pyte`) built + published in CI (release asset + optional
      Gitea apt registry). **M9 complete.**
 - [x] `.deb` packaging (D8) — built via `dpkg-deb` (no debhelper); GUI deps as Recommends so
      `apt install rigdoctor` includes the Desktop UI, `--no-install-recommends` = CLI only.
 ## Phase 5 — Breadth (later)
 - [ ] AMD GPU support in M1 (Steam Deck / Radeon)
@@ -97,6 +100,13 @@ Ubuntu + NVIDIA first; `.deb` distribution (see `DECISIONS.md`).
 - [ ] *Possible follow-ups:* interactive chat grounded in the data; more reference-KB entries;
      an "Explain" button on the System Health page.
 ## Phase 8 — Logging & report bundles  (M15, D25)
 - [x] **Opt-in logging** (one `logging_enabled` toggle): rotating `app.log` (`core/applog.py`)
      + **per-diagnostic storage** in its own directory (`core/diagstore.py`) — capture,
      result, report, scoped game logs, and AI-interaction records.
 - [x] **Report** bundle — zip a diagnostic (incl. exactly what was sent to the AI, the model,
      and its reply) into the reports folder. GUI button + `rigdoctor bundle`.
 > **Out of scope:** stress/repro module (D7); multi-distro support and packaging beyond
 > Ubuntu/apt + `.deb` (D15) — a thin seam is kept but not built out.
@@ -162,6 +162,18 @@ the actual findings plus matched reference facts from a curated, exact-match kno
 ("RAG-lite" — no embeddings/vector store, stdlib only); no fine-tuning. HTTP via stdlib `urllib`
 (no new core dependency); output is advisory (consistent with D9).
 ### M15 — Logging & report bundles (D25)
 Opt-in (one `logging_enabled` toggle, default off). When on: the application logs to a rotating
 `app.log`, and **each diagnostic is stored in its own directory** (capture log, structured
 result, human-readable report, the full **inventory** (M5 hardware/OS), session-scoped **game
 logs** (Proton/Steam) and **system logs** (`journalctl -k`, `coredumpctl`, an `nvidia-smi -q`
 snapshot, and the X11/Wayland display-server log), and a record of every AI interaction — the
 exact data sent, the model, and its reply). The collected logs are also fed to the AI on
 "Explain". Collection is best-effort (degrades if tools are missing/denied). A **Report** action zips one diagnostic's directory
 (plus the app log) into a shareable bundle saved under the reports folder (GUI button; CLI
 `rigdoctor bundle`). Everything stays local — a report only leaves the machine if the user
 shares the zip. Stdlib only (`logging` + `zipfile`).
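
A minimal sketch of the Report step described above, assuming a diagnostic directory already exists on disk. The names `bundle_directory` and `demo`, and the sample paths, are hypothetical and not RigDoctor's API; only stdlib `zipfile` is used, in line with the stdlib-only constraint.

```python
from __future__ import annotations

import tempfile
import zipfile
from pathlib import Path


def bundle_directory(diag_dir: Path, reports_dir: Path, app_log: Path | None = None) -> Path:
    """Zip every file under diag_dir (plus an optional app log) into reports_dir."""
    reports_dir.mkdir(parents=True, exist_ok=True)
    out = reports_dir / f"report-{diag_dir.name}.zip"
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(diag_dir.rglob("*")):
            if path.is_file():
                # Keep the diagnostic's directory name as the top-level folder in the zip.
                arcname = (Path(diag_dir.name) / path.relative_to(diag_dir)).as_posix()
                zf.write(path, arcname=arcname)
        if app_log is not None and app_log.exists():
            zf.write(app_log, arcname=f"{diag_dir.name}/app.log")
    return out


def demo() -> list[str]:
    """Build a throwaway diagnostic directory and list the names inside its bundle."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        diag = root / "diagnostics" / "20260522-154820-example"  # hypothetical id
        (diag / "ai").mkdir(parents=True)
        (diag / "report.txt").write_text("Game: example\n", encoding="utf-8")
        (diag / "ai" / "explain.txt").write_text("reply\n", encoding="utf-8")
        bundle = bundle_directory(diag, root / "reports")
        with zipfile.ZipFile(bundle) as zf:
            return sorted(zf.namelist())
```

Nothing in this sketch touches the network, which matches the "stays local" statement: the zip is just a file until the user shares it.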
 ## 5. Non-functional requirements
 - **Zero hard deps for the core/CLI/daemon** — Python stdlib + tools already present. **Qt
  (PySide6) is required only by the GUI (M10) and tray (M11) modules**, declared in the
@@ -0,0 +1,121 @@
 """Build a `.deb` for RigDoctor (M9 / D8) — dependency-light, no debhelper.
 Pure-Python app, so it's `Architecture: all`: we stage the package into dist-packages, drop the
 two launchers in /usr/bin, install the desktop entry + icon, write a DEBIAN/control, and call
 `dpkg-deb`. The core is stdlib (`Depends: python3`); everything else is **Recommends** so a
 plain `apt install rigdoctor` sets up the whole toolset automatically (users never hand-install
 deps) — the GUI modules (Debian/Ubuntu split PySide6 per module, so we name
 `python3-pyside6.qt{widgets,gui,websockets,svg}`) + `python3-pyte`, plus the diagnostic/gaming
 tools (smartmontools, lm-sensors, dmidecode, pciutils, libnotify-bin, libsecret-tools, gamemode,
 mangohud). `--no-install-recommends` still yields a CLI-only install; `cpupower` is a Suggests
 (kernel-tied/heavy).
 Run: `python packaging/make_deb.py` → `dist/rigdoctor_<version>_all.deb`.
 """
 from __future__ import annotations
 import shutil
 import subprocess
 import sys
 from pathlib import Path
 ROOT = Path(__file__).resolve().parents[1]
 DIST = ROOT / "dist"
 MAINTAINER = "Jessey van Offeren <jjvanofferen@gmail.com>"
 HOMEPAGE = "https://git.jesseyvanofferen.com/jessey/rigdoctor"
 def _version() -> str:
    text = (ROOT / "src" / "rigdoctor" / "__init__.py").read_text(encoding="utf-8")
    for line in text.splitlines():
        if line.startswith("__version__"):
            return line.split('"')[1]
    raise SystemExit("could not read __version__")
 _LAUNCHER = """\
 #!/usr/bin/python3
 import sys
 from {module} import main
 sys.exit(main())
 """
 _DESKTOP = """\
 [Desktop Entry]
 Type=Application
 Name=RigDoctor
 Comment=Hardware monitoring & crash diagnostics for Linux gamers
 Exec=rigdoctor-gui
 Icon=rigdoctor
 Terminal=false
 Categories=System;Monitor;Utility;
 StartupWMClass=rigdoctor
 """
 _CONTROL = """\
 Package: rigdoctor
 Version: {version}
 Architecture: all
 Maintainer: {maintainer}
 Section: utils
 Priority: optional
 Depends: python3 (>= 3.11)
 Recommends: python3-pyside6.qtwidgets, python3-pyside6.qtgui, python3-pyside6.qtwebsockets, python3-pyside6.qtsvg, python3-pyte, smartmontools, lm-sensors, dmidecode, pciutils, libnotify-bin, libsecret-tools, gamemode, mangohud
 Suggests: linux-tools-generic
 Homepage: {homepage}
 Description: Hardware monitoring & crash diagnostics for Linux gamers
 RigDoctor monitors GPU/CPU temperatures, load, and sensors, captures crash
 diagnostics while gaming, scans logs (Xid/SMART/kernel) for problems, and can
 explain them in plain language. The CLI and background daemon are pure Python
 (stdlib only); the optional desktop GUI and system-tray applet use PySide6,
 pulled in via Recommends. Install with --no-install-recommends for CLI only.
 """
 def _write(path: Path, text: str, mode: int = 0o644) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    path.chmod(mode)
 def build() -> Path:
    version = _version()
    DIST.mkdir(exist_ok=True)
    stage = DIST / f"rigdoctor_{version}_all"
    if stage.exists():
        shutil.rmtree(stage)
    # Python package → dist-packages (importable system-wide), minus bytecode.
    pkg_dst = stage / "usr/lib/python3/dist-packages/rigdoctor"
    shutil.copytree(ROOT / "src" / "rigdoctor", pkg_dst,
                    ignore=shutil.ignore_patterns("__pycache__", "*.pyc"))
    # Launchers.
    _write(stage / "usr/bin/rigdoctor", _LAUNCHER.format(module="rigdoctor.cli"), 0o755)
    _write(stage / "usr/bin/rigdoctor-gui", _LAUNCHER.format(module="rigdoctor.gui.app"), 0o755)
    # Desktop entry + icon.
    _write(stage / "usr/share/applications/rigdoctor.desktop", _DESKTOP)
    icon = ROOT / "src" / "rigdoctor" / "gui" / "assets" / "rigdoctor.svg"
    _write(stage / "usr/share/icons/hicolor/scalable/apps/rigdoctor.svg",
           icon.read_text(encoding="utf-8"))
    # Refresh the desktop database on install/remove (best-effort).
    _write(stage / "DEBIAN/postinst",
           "#!/bin/sh\nset -e\nupdate-desktop-database -q 2>/dev/null || true\n", 0o755)
    _write(stage / "DEBIAN/postrm",
           "#!/bin/sh\nset -e\nupdate-desktop-database -q 2>/dev/null || true\n", 0o755)
    _write(stage / "DEBIAN/control",
           _CONTROL.format(version=version, maintainer=MAINTAINER, homepage=HOMEPAGE))
    out = DIST / f"rigdoctor_{version}_all.deb"
    subprocess.run(["dpkg-deb", "--root-owner-group", "--build", str(stage), str(out)], check=True)
    shutil.rmtree(stage)
    return out
 if __name__ == "__main__":
    path = build()
    print(f"built {path}")
    sys.exit(0)
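
Review note: one way to sanity-check the rendered `DEBIAN/control` text before calling `dpkg-deb` is to parse it back and inspect the relationship fields. The sketch below is an assumption, not part of `make_deb.py`; `parse_control`, `package_list`, and `SAMPLE` are hypothetical names, and the parser is deliberately simplified (one stanza, no comments) rather than a full Debian Policy implementation.

```python
from __future__ import annotations


def parse_control(text: str) -> dict[str, str]:
    """Parse one control stanza into {field: value}; continuation lines are joined."""
    fields: dict[str, str] = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and current is not None:
            # A line starting with whitespace continues the previous field.
            fields[current] += "\n" + line.strip()
            continue
        name, _, value = line.partition(":")
        current = name.strip()
        fields[current] = value.strip()
    return fields


def package_list(value: str) -> list[str]:
    """Split a comma-separated relationship field and drop version constraints."""
    return [item.split("(")[0].strip() for item in value.split(",") if item.strip()]


# Shortened sample stanza for illustration only.
SAMPLE = """\
Package: rigdoctor
Version: 0.36.1
Architecture: all
Depends: python3 (>= 3.11)
Recommends: python3-pyside6.qtwidgets, python3-pyte, smartmontools
Description: Hardware monitoring & crash diagnostics for Linux gamers
 RigDoctor monitors GPU/CPU temperatures, load, and sensors.
"""
```

A check like `package_list(parse_control(text)["Depends"]) == ["python3"]` would catch a GUI package slipping from Recommends into Depends, which would break the CLI-only install path.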
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "rigdoctor"
-version = "0.27.0"
+version = "0.36.1"
 description = "Modular hardware monitoring & crash diagnostics for Linux gamers."
 readme = "README.md"
 requires-python = ">=3.11"
@@ -1,3 +1,3 @@
 """RigDoctor — modular hardware monitoring & crash diagnostics for Linux gamers."""
-__version__ = "0.27.0"
+__version__ = "0.36.1"
@@ -55,8 +55,9 @@ def cmd_gui(args) -> int:
        from .gui.app import main as gui_main
    except ImportError as exc:
        print("The GUI needs PySide6, which isn't installed.")
-        print("  Install it with:  pip install 'rigdoctor[gui]'")
-        print("  or on Ubuntu:     sudo apt install python3-pyside6")
+        print("  Ubuntu/Debian:  sudo apt install python3-pyside6.qtwidgets "
+              "python3-pyside6.qtgui python3-pyside6.qtwebsockets python3-pyside6.qtsvg python3-pyte")
        print("  pip:            pip install 'rigdoctor[gui]'")
        print(f"  ({exc})")
        return 2
    return gui_main([sys.argv[0]])
@@ -472,6 +473,23 @@ def cmd_ai(args) -> int:
    return 0 if ok else 1
 def cmd_bundle(args) -> int:
    """Zip the latest stored diagnostic into a report bundle (M15) — needs logging enabled."""
    from .core import diagstore
    if not diagstore.enabled():
        print("Logging is off. Enable it (Settings → Logging, or set logging_enabled) so "
              "diagnostics are stored and can be reported.")
        return 1
    directory = diagstore.latest_dir()
    if directory is None:
        print("No stored diagnostics yet — run a diagnostic first.")
        return 1
    out = diagstore.make_report(directory)
    print(f"Report written: {out}")
    return 0
 def cmd_gameenv(args) -> int:
    from dataclasses import asdict
@@ -686,10 +704,16 @@ def build_parser() -> argparse.ArgumentParser:
    ai_sub.add_parser("test", help="send a tiny probe to verify connectivity").set_defaults(func=cmd_ai)
    ai_sub.add_parser("explain", help="explain the current health findings with AI").set_defaults(func=cmd_ai)
    ai_p.set_defaults(func=cmd_ai, ai_cmd=None)
    bundle_p = sub.add_parser("bundle", help="zip the latest stored diagnostic into a report bundle (M15)")
    bundle_p.set_defaults(func=cmd_bundle)
    return p
 def main(argv: list[str] | None = None) -> int:
    from .core import applog
    applog.setup()  # opt-in app logging (M15); no-op unless logging_enabled
    args = build_parser().parse_args(argv)
    return args.func(args)
@@ -37,6 +37,12 @@ SPAWN_LOG = STATE_DIR / "recorder.out"
 # not config: refreshed by the background scan on every launch).
 GAMES_FILE = STATE_DIR / "games.json"
 # Logging & reports (opt-in via `logging_enabled`). App log: rotating file of app events.
 # Each diagnostic is stored under DIAGNOSTICS_DIR/<id>/; "Report" zips one into REPORTS_DIR.
 APP_LOG = STATE_DIR / "app.log"
 DIAGNOSTICS_DIR = DATA_DIR / "diagnostics"
 REPORTS_DIR = DATA_DIR / "reports"
 # Update access token (M13) — gates updates to Gitea account holders (D18).
 # Stored in the OS keyring (Secret Service / GNOME Keyring) via `secret-tool` when
 # available — encrypted at rest, unlocked with the login session — else a 0600 file.
@@ -190,6 +196,7 @@ DEFAULTS: dict = {
    "ai_provider": "",            # AI assistant (M14, D24): "" (unset) | "ollama" | "claude"
    "ai_model": "",               # model name (e.g. "llama3.1" for Ollama; blank = Claude default)
    "ai_endpoint": "http://localhost:11434",  # Ollama server base URL (Claude uses a fixed endpoint)
    "logging_enabled": False,     # opt-in: app logging + per-diagnostic storage + Report (M15)
 }
@@ -16,27 +16,40 @@ Answers are *grounded*: we pass the actual findings plus matched reference facts
 from __future__ import annotations
 import json
 import re
 import urllib.error
 import urllib.request
 from .. import config
 from . import ai_knowledge
 _APPID_RE = re.compile(r"\b\d{5,7}\b")  # Steam app IDs are 5–7 digits
 PROVIDERS = ("ollama", "claude")
 OLLAMA_DEFAULT_ENDPOINT = "http://localhost:11434"
 # Suggested Ollama model — strong instruction-following that fits an 8 GB GPU at Q4. Because we
 # ground the prompt with reference facts, a 7B model is sufficient here.
 OLLAMA_SUGGESTED_MODEL = "qwen2.5:7b"
 CLAUDE_ENDPOINT = "https://api.anthropic.com/v1/messages"
 CLAUDE_DEFAULT_MODEL = "claude-opus-4-7"
 CLAUDE_MAX_TOKENS = 2000
 ANTHROPIC_VERSION = "2023-06-01"
 SYSTEM_PROMPT = (
-    "You are RigDoctor's hardware-diagnostics assistant for Linux gamers. You are given the "
-    "structured findings RigDoctor collected from this machine, and a set of reference facts. "
-    "Explain in plain language what the findings mean, identify the most likely root cause of "
-    "any problem, and give concrete, ordered next steps (exact commands where useful). Base "
-    "your reasoning ONLY on the findings and reference facts provided — do not invent readings, "
-    "hardware, or log lines. Be concise and practical. Present fixes as suggestions, and clearly "
-    "warn before any step that could cause data loss or instability."
+    "You are RigDoctor's hardware-diagnostics assistant for Linux gamers (Ubuntu + NVIDIA, games "
+    "via Steam/Proton). You are given session context, the structured findings RigDoctor "
+    "collected — which may include recent game/Proton/system log excerpts scoped to this session "
+    "— plus reference facts. Use the GAME NAME from the session context; never guess the game "
+    "from log paths or app IDs. Correlate log errors with the findings to pinpoint WHEN and WHY "
+    "things went wrong, identify the most likely root cause, and give concrete, ordered next "
+    "steps with exact Linux commands where useful.\n"
    "Rules: Base your reasoning ONLY on the data and reference facts provided — never invent "
    "readings, hardware, or log lines. This is LINUX: never suggest Windows-only steps (e.g. "
    "'run as administrator', registry edits, toggling antivirus). Treat log lines flagged BENIGN "
    "in the reference facts as non-causal. If no crash was recorded and there are no warning or "
    "critical findings, say plainly that the session looks healthy and do NOT manufacture a "
    "problem. Be concise. Present fixes as suggestions and warn before anything that risks data "
    "loss or instability. Format your answer in Markdown."
 )
@@ -79,10 +92,35 @@ def provider_label() -> str:
    return "not configured"
 def appid_glossary(text: str) -> str:
    """Resolve Steam app IDs that appear in `text` against the user's scanned library.
    We don't teach the model app IDs — we look them up locally and hand it the mapping, so it
    names games correctly instead of guessing. Only IDs we can resolve are listed.
    """
    candidates = set(_APPID_RE.findall(text))
    if not candidates:
        return ""
    try:
        from . import steam
        names = steam.appid_names()
    except Exception:  # never let a glossary lookup break an explanation
        return ""
    known = sorted((i, names[i]) for i in candidates if i in names)
    if not known:
        return ""
    return "App IDs (resolved from your installed games):\n" + "\n".join(
        f"- {appid} = {name}" for appid, name in known)
 def build_prompt(findings_text: str) -> str:
-    """The user-message content: matched reference facts + the collected findings."""
+    """The user-message content: app-ID glossary + matched reference facts + the findings."""
    parts = []
    glossary = appid_glossary(findings_text)
    if glossary:
        parts.append(glossary)
        parts.append("")
    facts = ai_knowledge.relevant(findings_text)
    if facts:
        parts.append("Reference facts (use these to interpret the findings):")
        parts += [f"- {f}" for f in facts]
@@ -112,6 +150,24 @@ def explain(findings_text: str, timeout: float = 120.0) -> tuple[bool, str]:
        return False, f"Unexpected response from the AI provider: {exc}"
 def explain_stream(findings_text: str, on_chunk, timeout: float = 180.0) -> tuple[bool, str]:
    """Like :func:`explain`, but calls ``on_chunk(text_delta)`` as tokens arrive and returns
    ``(ok, full_text)`` at the end. Caller MUST be a direct user action (D24)."""
    content = build_prompt(findings_text)
    try:
        if provider() == "claude":
            return _claude_stream(content, on_chunk, timeout)
        if provider() == "ollama":
            return _ollama_stream(content, on_chunk, timeout)
        return False, "No AI provider is configured (Settings → AI assistant)."
    except urllib.error.HTTPError as exc:
        return False, _http_error(exc)
    except (urllib.error.URLError, OSError, TimeoutError) as exc:
        return False, f"Couldn't reach the AI provider: {exc}"
    except (ValueError, KeyError, IndexError) as exc:
        return False, f"Unexpected response from the AI provider: {exc}"
 def _post(url: str, payload: dict, headers: dict, timeout: float) -> dict:
    req = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"),
@@ -147,6 +203,65 @@ def _claude(content: str, timeout: float) -> tuple[bool, str]:
    return True, text.strip() or "(the model returned no text)"
 def _stream_request(url: str, payload: dict, headers: dict, timeout: float):
    req = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", **headers})
    return urllib.request.urlopen(req, timeout=timeout)
 def _ollama_stream(content: str, on_chunk, timeout: float) -> tuple[bool, str]:
    if not model():
        return False, "No Ollama model is set (Settings → AI assistant)."
    payload = {"model": model(), "system": SYSTEM_PROMPT, "prompt": content, "stream": True}
    parts: list[str] = []
    with _stream_request(endpoint().rstrip("/") + "/api/generate", payload, {}, timeout) as resp:
        for raw in resp:  # newline-delimited JSON objects
            line = raw.decode("utf-8", "replace").strip()
            if not line:
                continue
            obj = json.loads(line)
            chunk = obj.get("response", "")
            if chunk:
                parts.append(chunk)
                on_chunk(chunk)
            if obj.get("done"):
                break
    return True, "".join(parts).strip() or "(the model returned an empty response)"
 def _claude_stream(content: str, on_chunk, timeout: float) -> tuple[bool, str]:
    key = config.load_ai_key()
    if not key:
        return False, "No Claude API key is set (Settings → AI assistant)."
    payload = {
        "model": model(), "max_tokens": CLAUDE_MAX_TOKENS, "system": SYSTEM_PROMPT,
        "messages": [{"role": "user", "content": content}], "stream": True,
    }
    headers = {"x-api-key": key, "anthropic-version": ANTHROPIC_VERSION}
    parts: list[str] = []
    with _stream_request(CLAUDE_ENDPOINT, payload, headers, timeout) as resp:
        for raw in resp:  # SSE: parse `data:` lines, accumulate text deltas
            line = raw.decode("utf-8", "replace").strip()
            if not line.startswith("data:"):
                continue
            try:
                event = json.loads(line[5:].strip())
            except ValueError:
                continue
            etype = event.get("type")
            if etype == "content_block_delta" and event.get("delta", {}).get("type") == "text_delta":
                chunk = event["delta"].get("text", "")
                if chunk:
                    parts.append(chunk)
                    on_chunk(chunk)
            elif etype == "error":
                return False, event.get("error", {}).get("message", "stream error")
            elif etype == "message_stop":
                break
    return True, "".join(parts).strip() or "(the model returned no text)"
 def _http_error(exc: urllib.error.HTTPError) -> str:
    detail = ""
    try:
@@ -64,6 +64,18 @@ ENTRIES: list[tuple[tuple[str, ...], str]] = [
    (("nvidia persistence", "persistence mode"),
     "NVIDIA persistence mode keeps the driver loaded when no app is using the GPU, avoiding "
     "re-init stalls — harmless to enable."),
    (("libnvidia-ml.so", "interface.h", "failed to load \"libnvidia-ml"),
     "BENIGN: a Steam log assertion 'Failed to load libnvidia-ml.so.1' (from interface.h) is "
     "logged on many normal launches — the Steam runtime sandbox can't see the host NVML library. "
     "It is NOT by itself a crash cause. Only investigate the driver if the GPU is genuinely "
     "undetected (nvidia-smi fails)."),
    (("minidump", ".dmp", "uploading minidump"),
     "BENIGN-by-default: a minidump upload line means a crash handler ran AND that the game/engine "
     "routinely uploads dumps; it is not proof that THIS session crashed unless a hard freeze or "
     "non-zero exit was also recorded. Don't treat a routine minidump line as the root cause."),
    (("fork without exec", "skipping destruction"),
     "BENIGN: 'pid X != Y, skipping destruction (fork without exec?)' is routine Steam/Proton "
     "process bookkeeping, not an error."),
 ]
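
Review note: `ai_knowledge.relevant` is not part of this hunk, so how these `(keywords, fact)` entries get matched is not visible here. The sketch below is an assumption about how a case-insensitive substring match over such entries could work; `SAMPLE_ENTRIES` and `relevant_facts` are hypothetical names, and the fact strings are shortened.

```python
from __future__ import annotations

# Hypothetical two-entry knowledge base in the same (keywords, fact) shape as ENTRIES.
SAMPLE_ENTRIES: list[tuple[tuple[str, ...], str]] = [
    (("libnvidia-ml.so", "interface.h"),
     "BENIGN: the NVML load assertion is logged on many normal launches."),
    (("fork without exec", "skipping destruction"),
     "BENIGN: routine Steam/Proton process bookkeeping."),
]


def relevant_facts(text: str, entries=SAMPLE_ENTRIES) -> list[str]:
    """Return each fact that has at least one keyword occurring in `text` (case-insensitive)."""
    lowered = text.lower()
    facts: list[str] = []
    for keywords, fact in entries:
        if any(k.lower() in lowered for k in keywords) and fact not in facts:
            facts.append(fact)
    return facts
```

Under this assumption, a BENIGN entry only reaches the prompt when its trigger text is actually present in the findings or logs, so the reference section stays short.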
@@ -1,8 +1,9 @@
-"""Desktop alerts (M8): notify on overheat / GPU-lost / new version via notify-send.
-Edge-triggered: an alert fires when a condition becomes true (not every sample), and
-can fire again only after it has cleared and a cooldown has passed — so a hot GPU or a
-1-Hz sample loop doesn't spam notifications. Degrades to a no-op if notify-send is absent.
+"""Desktop alerts (M8): notify on overheat / GPU-lost / critical kernel events / new version.
+Edge-triggered: a sustained condition (hot GPU, GPU-lost) fires once when it becomes true and
+can re-fire only after it clears + a cooldown; momentary **kernel events** (Xid, OOM-kill, MCE,
+PCIe AER, disk I/O errors) are scanned from the kernel log every `event_interval` seconds and
 fire one-shot (cooldown-gated). So a 1-Hz sample loop never spams. No-op if notify-send absent.
 """
 from __future__ import annotations
@@ -57,13 +58,16 @@ def notify(title: str, message: str, urgency: str = "normal") -> bool:
 class AlertMonitor:
    """Evaluate samples and raise edge-triggered desktop alerts."""
-    def __init__(self, gpu_temp: float = 90.0, cpu_temp: float = 95.0, cooldown: float = 300.0):
+    def __init__(self, gpu_temp: float = 90.0, cpu_temp: float = 95.0, cooldown: float = 300.0,
                 event_interval: float = 30.0):
        self.gpu_temp = gpu_temp
        self.cpu_temp = cpu_temp
        self.cooldown = cooldown
        self.event_interval = event_interval     # how often to scan the kernel log
        self.enabled = True
        self._active: dict[str, bool] = {}
        self._last: dict[str, float] = {}
        self._last_kernel_scan = time.time()     # only alert on events after the monitor starts
    def _fire(self, key: str, title: str, message: str, urgency: str = "critical") -> None:
        if self._active.get(key):
@@ -75,9 +79,39 @@ class AlertMonitor:
        self._last[key] = now
        notify(title, message, urgency)
    def _notify_once(self, key: str, title: str, message: str, urgency: str = "critical") -> None:
        """One-shot alert for a momentary event (cooldown-gated, no active latch)."""
        now = time.time()
        if now - self._last.get(key, 0.0) < self.cooldown:
            return
        self._last[key] = now
        notify(title, message, urgency)
    def _clear(self, key: str) -> None:
        self._active[key] = False
    def _scan_kernel_events(self) -> None:
        """Periodically scan the kernel log for new critical events (Xid/OOM/MCE/PCIe/disk)."""
        now = time.time()
        if now - self._last_kernel_scan < self.event_interval:
            return
        since = self._last_kernel_scan
        self._last_kernel_scan = now
        try:
            from . import syslogs
            text = syslogs.kernel_log(since=since)
        except Exception:  # alerting must never crash the sample loop
            return
        if not text:
            return
        seen: set[str] = set()
        for label, line in syslogs.scan_critical(text):
            if label in seen:  # one alert per category per scan
                continue
            seen.add(label)
            self._notify_once(f"kernel:{label}", label, line[:180])
    def check(self, sample: Sample) -> None:
        if not self.enabled:
            return
@@ -107,3 +141,5 @@ class AlertMonitor:
            self._fire("gpu_lost", "GPU not responding", "nvidia-smi query timed out — the GPU may have dropped")
        else:
            self._clear("gpu_lost")
        self._scan_kernel_events()  # Xid / OOM / MCE / PCIe / disk I/O from the kernel log
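
Review note: the cooldown gating used by `_notify_once` can be exercised in isolation if the clock is injectable. The sketch below is a stand-alone illustration under that assumption; `CooldownGate` is a hypothetical class, not part of `AlertMonitor`, and it tracks "never fired" explicitly so a fake clock may start at zero.

```python
from __future__ import annotations

import time
from typing import Callable


class CooldownGate:
    """Allow an event key through at most once per `cooldown` seconds."""

    def __init__(self, cooldown: float = 300.0, clock: Callable[[], float] = time.time):
        self.cooldown = cooldown
        self._clock = clock              # injectable so tests can control time
        self._last: dict[str, float] = {}

    def allow(self, key: str) -> bool:
        now = self._clock()
        last = self._last.get(key)
        if last is not None and now - last < self.cooldown:
            return False                 # still cooling down: suppress
        self._last[key] = now
        return True
```

Keys are independent, so a repeated Xid within the cooldown is suppressed while a first OOM-kill in the same scan still gets through.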
@@ -0,0 +1,63 @@
 """Application logging (M15) — opt-in via the `logging_enabled` setting.
 When enabled, app events/errors are written to a rotating file (`config.APP_LOG`); when
 disabled, nothing is written (no file is created). All RigDoctor code logs through
 ``applog.get_logger(__name__)``; the handler is attached once at startup by :func:`setup`.
 Stdlib ``logging`` only.
 """
 from __future__ import annotations
 import logging
 from logging.handlers import RotatingFileHandler
 from .. import config
 _ROOT = "rigdoctor"
 _configured = False
 def setup(force: bool = False) -> bool:
    """Attach the file handler if logging is enabled. Idempotent. Returns whether it's on."""
    global _configured
    logger = logging.getLogger(_ROOT)
    enabled = bool(config.load_config().get("logging_enabled", False))
    if not enabled:
        if force:  # toggled off at runtime — detach so we stop writing
            for h in list(logger.handlers):
                logger.removeHandler(h)
                h.close()
            _configured = False
        return False
    if _configured and not force:
        return True
    for h in list(logger.handlers):  # avoid duplicate handlers on re-setup
        logger.removeHandler(h)
        h.close()
    try:
        config.STATE_DIR.mkdir(parents=True, exist_ok=True)
        handler = RotatingFileHandler(config.APP_LOG, maxBytes=2_000_000, backupCount=3,
                                      encoding="utf-8")
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)-7s %(name)s: %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False
        _configured = True
        logger.info("logging started (rigdoctor %s)", _version())
    except OSError:
        return False
    return True
 def get_logger(name: str) -> logging.Logger:
    """A child logger. Safe to call before setup — it just won't write until enabled."""
    short = name.split(".")[-1]
    return logging.getLogger(f"{_ROOT}.{short}")
 def _version() -> str:
    from .. import __version__
    return __version__
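
Review note: the "no file is created when disabled" property is easy to check with a temporary directory. The sketch below is a simplified stand-in for `setup`, assuming the toggle is passed as a plain boolean instead of being read from config; `attach_rotating_log`, `demo`, and the `demo_app` logger name are hypothetical.

```python
from __future__ import annotations

import logging
import tempfile
from logging.handlers import RotatingFileHandler
from pathlib import Path


def attach_rotating_log(logger_name: str, path: Path, enabled: bool) -> bool:
    """Attach a rotating file handler only when enabled; create no file otherwise."""
    logger = logging.getLogger(logger_name)
    for handler in list(logger.handlers):   # drop old handlers so repeat calls don't duplicate
        logger.removeHandler(handler)
        handler.close()
    if not enabled:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(path, maxBytes=2_000_000, backupCount=3, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)-7s %(name)s: %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return True


def demo() -> tuple[bool, bool]:
    """Return (file exists while disabled, message was written once enabled)."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = Path(tmp) / "app.log"
        attach_rotating_log("demo_app", log_path, enabled=False)
        logging.getLogger("demo_app.worker").info("dropped")
        exists_when_off = log_path.exists()
        attach_rotating_log("demo_app", log_path, enabled=True)
        logging.getLogger("demo_app.worker").info("hello")
        text = log_path.read_text(encoding="utf-8")
        attach_rotating_log("demo_app", log_path, enabled=False)  # close the handler before cleanup
        return exists_when_off, "hello" in text
```

Child loggers such as `demo_app.worker` need no handler of their own: records propagate up to the one logger that owns the file handler.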
@@ -28,6 +28,7 @@ class DiagnosticResult:
    game: str | None
    summary: Summary           # capture window: peak temps/power, events, last samples (M3)
    findings: list[Finding]    # health findings: Xid/SMART/driver/etc. (M4)
    dir: str | None = None     # storage directory when logging is on (M15); else None
@dataclass
@@ -97,7 +98,22 @@ def finish(last_n: int = 10, log_path=None) -> DiagnosticResult:
    summary = summarize(path, last_n=last_n)
    game = _game_from_summary(summary) or (reccontrol.read_status() or {}).get("game")
    findings = run_health_checks()
-    return DiagnosticResult(game=game, summary=summary, findings=findings)
+    result = DiagnosticResult(game=game, summary=summary, findings=findings)
    _store(result, path, summary)
    return result
 def _store(result: DiagnosticResult, capture_path, summary: Summary) -> None:
    """Persist the diagnostic to its own directory when logging is enabled (M15)."""
    try:
        from . import diagstore
        since = (summary.start - 60) if summary.start else None
        directory = diagstore.store(result, capture_path, since=since)
        if directory:
            result.dir = str(directory)
    except Exception:  # storage must never break a diagnostic
        pass
 # --- hard-crash detection & post-crash analysis -----------------------------------
@@ -184,4 +200,6 @@ def analyze_crash(last_n: int = 15) -> DiagnosticResult:
    findings += check_previous_boot()                       # the crashed boot's kernel log
    findings += run_health_checks(include_journal=False)    # SMART/driver/persistence/temps
    findings.sort(key=lambda f: _SEV_ORDER.get(f.severity, 9))
-    return DiagnosticResult(game=_game_from_summary(summary), summary=summary, findings=findings)
+    result = DiagnosticResult(game=_game_from_summary(summary), summary=summary, findings=findings)
    _store(result, _crash_path(), summary)
    return result
@@ -0,0 +1,152 @@
 """Per-diagnostic storage + Report bundles (M15) — opt-in via `logging_enabled`.
 When logging is on, each finished diagnostic is persisted to its own directory under
 ``config.DIAGNOSTICS_DIR/<id>/`` (capture log, structured result, human-readable report, a
 game-log snapshot, and any AI interactions). "Report" zips one directory — including exactly
 **what was sent to the AI, which model, and its reply** — into ``config.REPORTS_DIR``.
 """
 from __future__ import annotations
 import json
 import shutil
 import time
 import zipfile
 from dataclasses import asdict, is_dataclass
 from pathlib import Path
 from .. import config
 def enabled() -> bool:
    return bool(config.load_config().get("logging_enabled", False))
 def _slug(name: str | None) -> str:
    s = "".join(c if c.isalnum() else "-" for c in (name or "session").lower())
    return s.strip("-")[:40] or "session"
 def _new_dir(game: str | None) -> Path:
    base = config.DIAGNOSTICS_DIR
    stamp = time.strftime("%Y%m%d-%H%M%S")
    name = f"{stamp}-{_slug(game)}"
    target = base / name
    n = 1
    while target.exists():
        target = base / f"{name}-{n}"
        n += 1
    target.mkdir(parents=True, exist_ok=True)
    return target
 def _as_dict(obj):
    if is_dataclass(obj):
        return asdict(obj)
    return getattr(obj, "__dict__", {}) or str(obj)
 def store(result, capture_path=None, since: float | None = None) -> Path | None:
    """Persist a finished diagnostic to its own directory. Returns the dir, or None if off."""
    if not enabled():
        return None
    from ..render import render_summary
    from . import ai, gamelogs, syslogs
    target = _new_dir(getattr(result, "game", None))
    if capture_path and Path(capture_path).exists():
        try:
            shutil.copyfile(capture_path, target / "capture.jsonl")
        except OSError:
            pass
    payload = {
        "game": getattr(result, "game", None),
        "stored_at": time.time(),
        "summary": _as_dict(result.summary),
        "findings": [_as_dict(f) for f in result.findings],
    }
    _write(target / "result.json", json.dumps(payload, indent=2, default=str))
    report = [f"Game: {getattr(result, 'game', None) or 'unknown'}", "",
              render_summary(result.summary), "",
              ai.format_findings(result.findings, header="Findings:")]
    _write(target / "report.txt", "\n".join(report))
    try:
        logs = gamelogs.collect(since=since)
        if logs:
            _write(target / "gamelogs.txt", logs)
    except OSError:
        pass
    try:
        sys_logs = syslogs.collect(since=since)
        if sys_logs:
            _write(target / "syslogs.txt", sys_logs)
    except OSError:
        pass
    try:  # full hardware/OS inventory (M5) — invaluable for larger debugging in a shared report
        from . import inventory
        sections = inventory.collect()
        _write(target / "inventory.txt", inventory.render_text(sections))
        _write(target / "inventory.json", inventory.render_json(sections))
    except Exception:  # inventory probes vary by machine; never let it break storage
        pass
    return target
 def record_ai(diag_dir, *, provider: str, model: str, system: str, prompt: str, response: str) -> None:
    """Save one AI interaction (exact data sent, model, reply) into the diagnostic's `ai/` dir."""
    if not diag_dir:
        return
    out = Path(diag_dir) / "ai"
    try:
        out.mkdir(parents=True, exist_ok=True)
    except OSError:
        return
    stamp = time.strftime("%Y%m%d-%H%M%S")
    record = {
        "timestamp": time.time(), "provider": provider, "model": model,
        "system_prompt": system, "data_sent_to_model": prompt, "model_reply": response,
    }
    _write(out / f"explain-{stamp}.json", json.dumps(record, indent=2, default=str))
    readable = (
        f"Provider: {provider}\nModel:    {model}\n\n"
        f"=== System prompt ===\n{system}\n\n"
        f"=== Data sent to the model ===\n{prompt}\n\n"
        f"=== Model reply ===\n{response}\n"
    )
    _write(out / f"explain-{stamp}.txt", readable)
 def make_report(diag_dir) -> Path:
    """Zip a diagnostic directory (plus the app log) into REPORTS_DIR; return the zip path."""
    diag_dir = Path(diag_dir)
    config.REPORTS_DIR.mkdir(parents=True, exist_ok=True)
    out = config.REPORTS_DIR / f"report-{diag_dir.name}.zip"
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(diag_dir.rglob("*")):
            if path.is_file():
                zf.write(path, arcname=str(Path(diag_dir.name) / path.relative_to(diag_dir)))
        if config.APP_LOG.exists():  # the application log, for context around the session
            zf.write(config.APP_LOG, arcname=str(Path(diag_dir.name) / "app.log"))
    return out
 def latest_dir() -> Path | None:
    try:
        dirs = [d for d in config.DIAGNOSTICS_DIR.iterdir() if d.is_dir()]
        # stat() can fail if a directory vanishes mid-scan, so keep it inside the try.
        return max(dirs, key=lambda d: d.stat().st_mtime) if dirs else None
    except OSError:
        return None
 def _write(path: Path, text: str) -> None:
    try:
        path.write_text(text, encoding="utf-8")
    except OSError:
        pass
@@ -0,0 +1,116 @@
 """Collect recent game / Proton / Steam logs to enrich an AI diagnostic (M14).
 Reads logs that already exist on disk — no change to how the game is launched. Two reliable
 sources: Proton's per-app log (``~/steam-<appid>.log``, written when ``PROTON_LOG=1``) and
 Steam's own console log. Each is tail-read and size-bounded so the AI prompt stays small. The
 text is fed to the AI alongside the findings so it can see *when* something went wrong (a
 vkd3d/DXVK error, a crash line, the exit code) rather than only the sensor summary.
 """
 from __future__ import annotations
 import os
 import re
 import time
 from pathlib import Path
 # Steam keeps logs under its install root; ~/.steam/steam usually symlinks to the real one.
 _STEAM_LOG_DIRS = ("~/.steam/steam/logs", "~/.local/share/Steam/logs", "~/.steam/root/logs")
 _STEAM_LOG_FILES = ("console-linux.txt", "console_log.txt", "stderr.txt")
 _TS = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")
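 def _line_epoch(line: str) -> float | None:
    """Epoch seconds of a leading ``[YYYY-MM-DD HH:MM:SS]`` stamp (local time), else None."""
    m = _TS.match(line)
    if not m:
        return None
    try:
        return time.mktime(time.strptime(m.group(1), "%Y-%m-%d %H:%M:%S"))
    except (ValueError, OverflowError):  # mktime raises OverflowError for unrepresentable times
        return None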
 def _since_filter(text: str, since: float) -> str:
    """Keep lines from the first timestamp >= `since` onward (logs are chronological).
    Untimestamped lines before the window are dropped; once inside the window every line is
    kept (so multi-line entries survive). This scopes a long-lived Steam log to one session.
    """
    out: list[str] = []
    including = False
    for line in text.splitlines():
        epoch = _line_epoch(line)
        if epoch is not None and epoch >= since:
            including = True
        if including:
            out.append(line)
    return "\n".join(out)
 def _tail(path: Path, max_bytes: int) -> str:
    """Last ``max_bytes`` of a file, decoded leniently (empty string on error)."""
    try:
        size = path.stat().st_size
        with path.open("rb") as fh:
            if size > max_bytes:
                fh.seek(size - max_bytes)
            return fh.read().decode("utf-8", "replace")
    except OSError:
        return ""
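 def _proton_logs() -> list[Path]:
    """Proton per-app logs in the home directory, newest first."""
    try:
        logs = list(Path.home().glob("steam-*.log"))
    except OSError:
        return []
    # _mtime() swallows OSError, so a log deleted mid-scan sorts last instead of raising.
    return sorted(logs, key=_mtime, reverse=True)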
 def _steam_console() -> Path | None:
    for directory in _STEAM_LOG_DIRS:
        base = Path(os.path.expanduser(directory))
        for name in _STEAM_LOG_FILES:
            candidate = base / name
            if candidate.exists():
                return candidate
    return None
 def available() -> bool:
    return bool(_proton_logs() or _steam_console())
 def collect(since: float | None = None, max_bytes: int = 8000) -> str:
    """Recent Proton + Steam log tails as one labelled text block ('' if none).
    With ``since`` (epoch), scope to that session: skip a Proton log not written during/after
    the session (a stale per-app log from an earlier game), and keep only Steam-console lines
    timestamped at/after ``since`` — so we don't feed the model an unrelated past session.
    """
    sections: list[str] = []
    protons = _proton_logs()
    if protons:
        log = protons[0]
        fresh = since is None or _mtime(log) >= since
        tail = _tail(log, max_bytes).strip() if fresh else ""
        if tail:
            sections.append(f"--- Proton log ({log.name}) ---\n{tail}")
    console = _steam_console()
    if console:
        raw = _tail(console, 40000 if since is not None else max_bytes)
        if since is not None:
            raw = _since_filter(raw, since)
        raw = raw.strip()[-max_bytes:].strip()
        if raw:
            sections.append(f"--- Steam log ({console.name}) ---\n{raw}")
    return "\n\n".join(sections)
 def _mtime(path: Path) -> float:
    try:
        return path.stat().st_mtime
    except OSError:
        return 0.0
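The session scoping described in `_since_filter` can be sketched on its own as follows. This is a stand-alone copy for illustration, not the module itself: `line_epoch`, `since_filter`, and the sample log text are hypothetical names and made-up data. It shows that untimestamped continuation lines survive once the window has opened, while everything before the first in-window timestamp is dropped.

```python
import re
import time

TS = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")
FMT = "%Y-%m-%d %H:%M:%S"


def line_epoch(line):
    """Epoch of a leading [YYYY-MM-DD HH:MM:SS] stamp in local time, else None."""
    m = TS.match(line)
    if not m:
        return None
    try:
        return time.mktime(time.strptime(m.group(1), FMT))
    except (ValueError, OverflowError):
        return None


def since_filter(text, since):
    """Keep every line from the first timestamp >= since onward."""
    out = []
    including = False
    for line in text.splitlines():
        epoch = line_epoch(line)
        if epoch is not None and epoch >= since:
            including = True
        if including:
            out.append(line)
    return "\n".join(out)


# Made-up sample: one old entry, one new entry, each with a continuation line.
sample = (
    "[2026-05-22 13:00:00] old session\n"
    "  continuation of the old entry\n"
    "[2026-05-22 14:30:00] new session launch\n"
    "  stack frame without a timestamp\n"
)
since = time.mktime(time.strptime("2026-05-22 14:00:00", FMT))
scoped = since_filter(sample, since)
print(scoped)
```

Because both the cutoff and the log stamps go through `time.mktime`, the comparison is made in local time on both sides.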
@@ -318,6 +318,11 @@ def cached_games() -> list[Game]:
    return [Game(**{k: g[k] for k in Game.__dataclass_fields__ if k in g}) for g in cache.get("games", [])]
 def appid_names() -> dict[str, str]:
    """{appid: name} for the user's scanned games — lets us resolve IDs seen in logs (M14)."""
    return {g.appid: g.name for g in cached_games() if g.appid and g.name}
 def rescan(cfg: dict | None = None) -> ScanResult:
    """Scan the selected libraries, diff against the cache, and persist the result.
@@ -0,0 +1,165 @@
 """Session-scoped system logs for diagnostics (M15): kernel, coredumps, NVIDIA, display.
 Covers what the *system* logged when something went wrong, so the report bundle and the AI both
 see it:
  * kernel ring-buffer slice (`journalctl -k`) — Xid, OOM-killer, MCE, PCIe AER, thermal, hung tasks
  * systemd-coredump records (`coredumpctl`) — did the game/wine dump core (SIGSEGV/ABRT), when
  * an `nvidia-smi -q` snapshot — driver, throttle/clock-event reasons, clocks, power, temps, PCIe,
    ECC + retired pages (point-in-time at diagnostic time)
  * the display-server log — `Xorg.0.log` on X11, or the compositor's user-journal slice on Wayland
 Best-effort and size-bounded: degrades silently if a tool is missing or access is denied. Stdlib only.
 """
 from __future__ import annotations
 import os
 import re
 import shutil
 import subprocess
 import time
 from pathlib import Path
 _MAX = 8000      # cap each log section so the prompt/report stays small
 _NV_MAX = 10000  # nvidia-smi -q is structured + valuable; allow a bit more (head-truncated)
 # Compositors whose user-journal entries are the "Wayland log" (OR-matched by journalctl).
 _COMPOSITORS = ("gnome-shell", "mutter", "kwin_wayland", "Xwayland", "sway", "gamescope")
 _XORG_LOGS = ("~/.local/share/xorg/Xorg.0.log", "/var/log/Xorg.0.log")
 def _since_arg(since: float | None) -> str | None:
    """Format an epoch as local ``YYYY-MM-DD HH:MM:SS`` for ``--since``; None when unset."""
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(since)) if since else None
 def _run(cmd: list[str], timeout: float = 15.0) -> str:
    """Run ``cmd`` and return its stripped stdout ('' on launch failure or timeout)."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.SubprocessError):
        return ""
    return (proc.stdout or "").strip()
 def kernel_log(since: float | None = None, max_bytes: int = _MAX) -> str:
    """Tail of the kernel journal (`journalctl -k`), optionally from ``since`` ('' if none)."""
    if not shutil.which("journalctl"):
        return ""
    cmd = ["journalctl", "-k", "--no-pager"]
    since_arg = _since_arg(since)
    if since_arg:
        cmd += ["--since", since_arg]
    out = _run(cmd)
    if not out or out.strip().lower() == "-- no entries --":  # journalctl's empty marker
        return ""
    return out[-max_bytes:]
 def coredumps(since: float | None = None, max_bytes: int = _MAX) -> str:
    """Tail of `coredumpctl list`, optionally from ``since`` ('' if none or unavailable)."""
    if not shutil.which("coredumpctl"):
        return ""
    cmd = ["coredumpctl", "list", "--no-pager"]
    since_arg = _since_arg(since)
    if since_arg:
        cmd += ["--since", since_arg]
    out = _run(cmd)
    if not out or "no coredumps" in out.lower():
        return ""
    return out[-max_bytes:]
 def nvidia_snapshot(max_bytes: int = _NV_MAX) -> str:
    """Point-in-time `nvidia-smi -q` (head-truncated — driver/temps/clocks/ECC sit near the top)."""
    if not shutil.which("nvidia-smi"):
        return ""
    out = _run(["nvidia-smi", "-q"])
    return out[:max_bytes] if out else ""
 def _xorg_log() -> Path | None:
    for cand in _XORG_LOGS:
        path = Path(os.path.expanduser(cand))
        if path.exists():
            return path
    return None
 def _session_type() -> str:
    declared = os.environ.get("XDG_SESSION_TYPE", "").lower()
    if declared in ("x11", "wayland"):
        return declared
    if os.environ.get("WAYLAND_DISPLAY"):
        return "wayland"
    return "x11" if _xorg_log() else "unknown"
 def _tail_file(path: Path, max_bytes: int) -> str:
    try:
        size = path.stat().st_size
        with path.open("rb") as fh:
            if size > max_bytes:
                fh.seek(size - max_bytes)
            return fh.read().decode("utf-8", "replace")
    except OSError:
        return ""
 def display_log(since: float | None = None, max_bytes: int = _MAX) -> str:
    """Xorg.0.log on X11, or the compositor's user-journal slice on Wayland ('' if none)."""
    if _session_type() == "wayland":
        if not shutil.which("journalctl"):
            return ""
        cmd = ["journalctl", "--user", "--no-pager"]
        since_arg = _since_arg(since)
        if since_arg:
            cmd += ["--since", since_arg]
        cmd += [f"_COMM={comp}" for comp in _COMPOSITORS]  # OR-matched
        out = _run(cmd)
        if not out or out.strip().lower() == "-- no entries --":
            return ""
        return out[-max_bytes:]
    log = _xorg_log()  # X11: Xorg log isn't wall-clock-timestamped, so tail rather than scope
    return _tail_file(log, max_bytes) if log else ""
 # Kernel-log patterns worth alerting on in real time (M8 event alerts). (label, regex).
 _CRITICAL = [
    ("GPU error (Xid)", re.compile(r"NVRM:\s*Xid", re.I)),
    ("Out of memory", re.compile(r"out of memory|oom-kill|killed process \d+", re.I)),
    ("CPU machine-check", re.compile(r"\bmce:|machine check", re.I)),
    ("PCIe error", re.compile(r"\bAER:|pcie bus error", re.I)),
    ("Disk I/O error", re.compile(
        r"buffer i/o error|\bi/o error\b|critical medium error|ext4-fs error|"
        r"blk_update_request:.*error|ata\d+.*(?:failed|error)", re.I)),
 ]
 def scan_critical(text: str) -> list[tuple[str, str]]:
    """(label, line) for kernel lines matching a critical pattern (first match per line)."""
    events: list[tuple[str, str]] = []
    for line in text.splitlines():
        for label, pat in _CRITICAL:
            if pat.search(line):
                events.append((label, line.strip()))
                break
    return events
 def available() -> bool:
    return bool(shutil.which("journalctl") or shutil.which("coredumpctl")
                or shutil.which("nvidia-smi") or _xorg_log())
 def collect(since: float | None = None) -> str:
    """Kernel + coredumps + NVIDIA snapshot + display log as one labelled block ('' if none)."""
    sections: list[str] = []
    kern = kernel_log(since)
    if kern:
        sections.append(f"--- Kernel log (journalctl -k) ---\n{kern}")
    cores = coredumps(since)
    if cores:
        sections.append(f"--- Crashed processes (coredumpctl) ---\n{cores}")
    nvidia = nvidia_snapshot()
    if nvidia:
        sections.append(f"--- NVIDIA snapshot (nvidia-smi -q) ---\n{nvidia}")
    display = display_log(since)
    if display:
        sections.append(f"--- Display server log ({_session_type()}) ---\n{display}")
    return "\n\n".join(sections)
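To make the first-match-per-line behaviour of `scan_critical` concrete, here is a small stand-alone sketch. `PATTERNS` is an illustrative subset of the table above, `scan` is a hypothetical copy of the loop, and the sample kernel lines are made up for the example (the Xid line mirrors the one used in the alert tests).

```python
import re

# Illustrative subset of (label, regex) pairs; order decides which label wins.
PATTERNS = [
    ("GPU error (Xid)", re.compile(r"NVRM:\s*Xid", re.I)),
    ("Out of memory", re.compile(r"out of memory|oom-kill|killed process \d+", re.I)),
    ("PCIe error", re.compile(r"\bAER:|pcie bus error", re.I)),
]


def scan(text):
    """Return (label, line) for each line matching a pattern, first match only."""
    events = []
    for line in text.splitlines():
        for label, pat in PATTERNS:
            if pat.search(line):
                events.append((label, line.strip()))
                break  # one event per line, even if several patterns match
    return events


# Made-up sample lines for illustration.
sample = """\
kernel: usb 1-2: new high-speed USB device
kernel: NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus
kernel: Out of memory: Killed process 4242 (game.exe)
"""
events = scan(sample)
for label, line in events:
    print(f"{label}: {line}")
```

The third sample line matches two alternatives of the same pattern ("out of memory" and "killed process"), yet it yields a single event because the inner loop breaks on the first hit.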
@@ -17,6 +17,10 @@ ICON = Path(__file__).parent / "assets" / "rigdoctor.svg"
 def main(argv: list[str] | None = None) -> int:
    from ..core import applog
    applog.setup()  # opt-in app logging (M15); no-op unless logging_enabled
    applog.get_logger(__name__).info("GUI starting")
    desktop.ensure()  # self-register icon + .desktop so updates show it without re-installing
    app = QApplication(argv if argv is not None else sys.argv)
    app.setApplicationName("RigDoctor")
@@ -5,7 +5,7 @@ from __future__ import annotations
 import threading
 from PySide6.QtCore import Qt, Signal
-from PySide6.QtGui import QFont
+from PySide6.QtGui import QFont, QTextCursor
 from PySide6.QtWidgets import (
    QDialog,
    QFrame,
@@ -24,11 +24,15 @@ from .widgets import finding_card
 class DiagnosticDialog(QDialog):
-    _explained = Signal(object)  # (ok, text) from a user-triggered AI explanation
+    _chunk = Signal(str)         # streamed token delta (worker thread -> GUI)
    _explained = Signal(object)  # (ok, full_text) when the AI stream finishes
    def __init__(self, result, parent=None) -> None:
        super().__init__(parent)
        self._result = result
        self._stream_view = None
        self._stream_status = None
        self._chunk.connect(self._on_chunk)
        self._explained.connect(self._on_explained)
        self.setWindowTitle(f"Diagnostic — {result.game}" if result.game else "Diagnostic")
        self.resize(660, 680)
@@ -86,6 +90,10 @@ class DiagnosticDialog(QDialog):
        from ..core import ai
        self._explain_btn.setVisible(ai.is_configured())  # opt-in only; hidden if not set up
        buttons.addWidget(self._explain_btn)
        self._report_btn = QPushButton("Report")  # zip this diagnostic's logs (M15)
        self._report_btn.clicked.connect(self._make_report)
        self._report_btn.setVisible(bool(result.dir))  # only when logging stored the session
        buttons.addWidget(self._report_btn)
        buttons.addStretch(1)
        close = QPushButton("Close")
        close.setObjectName("PrimaryButton")
@@ -93,7 +101,7 @@ class DiagnosticDialog(QDialog):
        buttons.addWidget(close)
        root.addLayout(buttons)
-    # --- AI explanation (M14, D24) — runs only on this button press ----------------
+    # --- AI explanation (M14, D24) — streamed; runs only on this button press ----------
    def _explain_with_ai(self) -> None:
        from ..core import ai
@@ -107,23 +115,97 @@ class DiagnosticDialog(QDialog):
            if confirm != QMessageBox.StandardButton.Yes:
                return
        self._explain_btn.setEnabled(False)
-        self._explain_btn.setText("Asking the AI…")
+        dialog = self._open_stream_dialog()
        threading.Thread(target=self._work_explain, daemon=True).start()
        dialog.exec()  # streaming fills the view live via signals during this nested loop
        self._stream_view = self._stream_status = None
        self._explain_btn.setEnabled(True)
    def _work_explain(self) -> None:
-        from ..core import ai
+        from ..core import ai, gamelogs, syslogs
-        text = ai.format_findings(self._result.findings, header="Diagnostic findings:")
+        result = self._result
-        text += "\n\nCapture summary:\n" + render_summary(self._result.summary)
+        summary = result.summary
-        self._explained.emit(ai.explain(text))
+        events = {kind for _ts, kind, _detail in summary.events}
        clean = "session-stop" in events
        gpu_lost = "gpu-lost" in events
        lines = [f"Game: {result.game or 'unknown'}"]
        if summary.start and summary.end:
            lines.append(f"Capture duration: ~{int(summary.end - summary.start)}s")
        outcome = "ended cleanly (no crash detected)" if clean else \
            "ended without a clean stop (possible crash/freeze)"
        if gpu_lost:
            outcome += "; a GPU-lost event was recorded"
        lines.append(f"Outcome: {outcome}")
        lines.append("")
        lines.append(ai.format_findings(result.findings, header="Findings:"))
        lines.append("\nCapture summary:\n" + render_summary(summary))
        since = (summary.start - 60) if summary.start else None
        logs = gamelogs.collect(since=since)  # scoped to this session
        if logs:
            lines.append("\nGame/Proton/Steam logs for this session:\n" + logs)
        sys_logs = syslogs.collect(since=since)  # kernel log + crashed-process records
        if sys_logs:
            lines.append("\nSystem logs for this session (kernel + crashed processes):\n" + sys_logs)
        text = "\n".join(lines)
        ok, reply = ai.explain_stream(text, on_chunk=lambda d: self._chunk.emit(d))
        if result.dir:  # record exactly what was sent, the model, and the reply (M15)
            from ..core import diagstore
            diagstore.record_ai(
                result.dir, provider=ai.provider(), model=ai.model(),
                system=ai.SYSTEM_PROMPT, prompt=ai.build_prompt(text),
                response=reply if ok else f"[error] {reply}")
        self._explained.emit((ok, reply))
    def _on_chunk(self, delta: str) -> None:
        if self._stream_view is None:
            return
        self._stream_view.moveCursor(QTextCursor.MoveOperation.End)
        self._stream_view.insertPlainText(delta)  # live plain text as tokens arrive
        self._stream_view.ensureCursorVisible()
    def _on_explained(self, result) -> None:
        ok, text = result
-        self._explain_btn.setEnabled(True)
+        if self._stream_view is not None:
-        self._explain_btn.setText("Explain with AI")
+            if ok:
-        self._show_explanation(text if ok else f"AI explanation failed:\n\n{text}")
+                self._stream_view.setMarkdown(text)  # re-render the finished answer as Markdown
            else:
                self._stream_view.setPlainText(f"AI explanation failed:\n\n{text}")
        if self._stream_status is not None:
            self._stream_status.setText(
                "AI-generated suggestions — verify before acting, especially anything that changes "
                "settings or data." if ok else "The request failed.")
-    def _show_explanation(self, text: str) -> None:
+    # --- Report bundle (M15) ------------------------------------------------------
    def _make_report(self) -> None:
        from PySide6.QtCore import QUrl
        from PySide6.QtGui import QDesktopServices
        from ..core import diagstore
        self._report_btn.setEnabled(False)
        try:
            out = diagstore.make_report(self._result.dir)
        except OSError as exc:
            self._report_btn.setEnabled(True)
            QMessageBox.warning(self, "Report failed", str(exc))
            return
        self._report_btn.setEnabled(True)
        box = QMessageBox(self)
        box.setWindowTitle("Report created")
        box.setText(f"Saved report:\n{out}\n\nIt contains this diagnostic's logs and any AI "
                    "interaction (data sent, model, and reply).")
        open_btn = box.addButton("Open folder", QMessageBox.ButtonRole.ActionRole)
        box.addButton("OK", QMessageBox.ButtonRole.AcceptRole)
        box.exec()
        if box.clickedButton() is open_btn:
            QDesktopServices.openUrl(QUrl.fromLocalFile(str(out.parent)))
    def _open_stream_dialog(self) -> QDialog:
        """A live dialog the AI streams into; finalized to rendered Markdown when done."""
        from ..core import ai
        dlg = QDialog(self)
@@ -133,14 +215,15 @@ class DiagnosticDialog(QDialog):
        view = QTextEdit()
        view.setObjectName("Report")
        view.setReadOnly(True)
-        view.setPlainText(text)
        lay.addWidget(view)
-        note = QLabel("AI-generated suggestions — verify before acting, especially anything that changes settings or data.")
+        status = QLabel("Streaming from the model…")
-        note.setObjectName("Muted")
+        status.setObjectName("Muted")
-        note.setWordWrap(True)
+        status.setWordWrap(True)
-        lay.addWidget(note)
+        lay.addWidget(status)
        close = QPushButton("Close")
        close.setObjectName("PrimaryButton")
        close.clicked.connect(dlg.accept)
        lay.addWidget(close, alignment=Qt.AlignmentFlag.AlignRight)
-        dlg.exec()
+        self._stream_view = view
        self._stream_status = status
        return dlg
@@ -27,7 +27,7 @@ from PySide6.QtWidgets import (
 )
 from .. import config
-from ..core import alerts, installer, service, sysenv, uninstall, updates
+from ..core import ai, alerts, installer, service, sysenv, uninstall, updates
 from .theme import GOOD, MUTED, WARN
@@ -114,7 +114,8 @@ class SetupPage(QWidget):
        grid.addWidget(QLabel("CPU temperature alert"), 1, 0)
        grid.addWidget(self._cpu_alert, 1, 1)
        alerts_layout.addLayout(grid)
-        alerts_note = QLabel("GPU-lost and new-version alerts are included whenever notifications are enabled.")
+        alerts_note = QLabel("GPU-lost, critical kernel events (Xid, out-of-memory, disk I/O, PCIe), "
                             "and new-version alerts are included whenever notifications are enabled.")
        alerts_note.setObjectName("Muted")
        alerts_note.setWordWrap(True)
        alerts_layout.addWidget(alerts_note)
@@ -188,7 +189,8 @@ class SetupPage(QWidget):
        ai_layout.addLayout(prov_row)
        self._ai_model = QLineEdit()
-        self._ai_model.setPlaceholderText("Model (e.g. llama3.1 for Ollama; blank = Claude default)")
+        self._ai_model.setPlaceholderText(
            f"Model (e.g. {ai.OLLAMA_SUGGESTED_MODEL} for Ollama; blank = Claude default)")
        ai_layout.addWidget(self._ai_model)
        self._ai_endpoint = QLineEdit()
        self._ai_endpoint.setPlaceholderText("Ollama server URL (default http://localhost:11434)")
@@ -214,6 +216,23 @@ class SetupPage(QWidget):
        ai_layout.addWidget(self._ai_status)
        root.addWidget(ai_card)
        # Logging (M15): opt-in app logging + per-diagnostic storage (enables the Report bundle).
        log_card, log_layout = _panel("Logging")
        log_desc = QLabel(
            # The <b> tag makes Qt auto-detect rich text, where "\n" collapses; use <br> instead.
            "Save application logs and store each diagnostic in its own folder so you can review "
            "or <b>Report</b> it. Off by default; everything stays on your machine.<br>"
            f"•  Diagnostics: {config.DIAGNOSTICS_DIR}<br>"
            f"•  Reports: {config.REPORTS_DIR}"
        )
        log_desc.setObjectName("Muted")
        log_desc.setWordWrap(True)
        log_layout.addWidget(log_desc)
        self._logging = QCheckBox("Enable logging (application + diagnostics)")
        self._logging.setChecked(config.load_config().get("logging_enabled", False))
        self._logging.toggled.connect(self._toggle_logging)
        log_layout.addWidget(self._logging)
        root.addWidget(log_card)
        # Account access (M13/M12): one Gitea token gates updates and session sharing.
        upd_card, upd_layout = _panel("Account access")
        hint = QLabel("A Gitea access token unlocks updates and session sharing. "
@@ -286,6 +305,8 @@ class SetupPage(QWidget):
        self._ai_endpoint.setVisible(prov == "ollama")
        self._ai_key.setVisible(prov == "claude")
        self._ai_test_btn.setEnabled(prov != "")
        if prov == "ollama" and not self._ai_model.text().strip():
            self._ai_model.setText(ai.OLLAMA_SUGGESTED_MODEL)  # suggested default; user can change
    def _save_ai(self) -> None:
        prov = self._ai_provider()
@@ -317,6 +338,12 @@ class SetupPage(QWidget):
        self._ai_test_btn.setEnabled(True)
        self._ai_status.setText(("✓ " if ok else "✗ ") + (msg[:200] if msg else ""))
    def _toggle_logging(self, on: bool) -> None:
        from ..core import applog
        config.update_config(logging_enabled=on)
        applog.setup(force=True)  # attach/detach the file handler immediately
    def _run_wizard(self) -> None:
        from .setup_wizard import SetupWizard
@@ -62,6 +62,23 @@ class PromptTests(unittest.TestCase):
        text = ai.format_findings([F()])
        self.assertIn("[WARN] GPU: Hot — 92C", text)
    def test_appid_glossary_resolves_known_ids(self):
        from rigdoctor.core import steam
        with mock.patch.object(steam, "appid_names", return_value={"2694490": "Path of Exile 2"}):
            glossary = ai.appid_glossary("Steam log: removed AppID 2694490 ... pid 130544")
        self.assertIn("2694490 = Path of Exile 2", glossary)
    def test_appid_glossary_ignores_unknown_ids(self):
        from rigdoctor.core import steam
        with mock.patch.object(steam, "appid_names", return_value={"570": "Dota 2"}):
            self.assertEqual(ai.appid_glossary("pid 130544 used 8192 MiB"), "")  # not in library
    def test_build_prompt_includes_glossary(self):
        from rigdoctor.core import steam
        with mock.patch.object(steam, "appid_names", return_value={"2694490": "Path of Exile 2"}):
            prompt = ai.build_prompt("AppID 2694490 launched")
        self.assertIn("Path of Exile 2", prompt)
 class ExplainTests(unittest.TestCase):
    def _cfg(self, **over):
@@ -97,5 +114,51 @@ class ExplainTests(unittest.TestCase):
        self.assertEqual(headers["x-api-key"], "sk-ant-x")
 class _FakeResp:
    """A context-managed iterable of byte lines, like urlopen() returns."""
    def __init__(self, lines):
        self._lines = [line.encode("utf-8") for line in lines]
    def __enter__(self):
        return iter(self._lines)
    def __exit__(self, *a):
        return False
 class StreamTests(unittest.TestCase):
    def _cfg(self, **over):
        base = {"ai_provider": "", "ai_model": "", "ai_endpoint": "http://localhost:11434"}
        base.update(over)
        return base
    def test_ollama_stream_accumulates_and_callbacks(self):
        lines = ['{"response": "It is ", "done": false}',
                 '{"response": "the PSU.", "done": false}',
                 '{"response": "", "done": true}']
        chunks = []
        with mock.patch.object(ai.config, "load_config",
                               return_value=self._cfg(ai_provider="ollama", ai_model="qwen2.5:7b")), \
             mock.patch.object(ai, "_stream_request", return_value=_FakeResp(lines)):
            ok, full = ai.explain_stream("Xid 79", on_chunk=chunks.append)
        self.assertTrue(ok)
        self.assertEqual(full, "It is the PSU.")
        self.assertEqual(chunks, ["It is ", "the PSU."])
    def test_claude_stream_parses_sse(self):
        lines = [
            'event: content_block_delta',
            'data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"Failing "}}',
            'data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"disk."}}',
            'data: {"type":"message_stop"}',
        ]
        chunks = []
        with mock.patch.object(ai.config, "load_config", return_value=self._cfg(ai_provider="claude")), \
             mock.patch.object(ai.config, "load_ai_key", return_value="sk-ant-x"), \
             mock.patch.object(ai, "_stream_request", return_value=_FakeResp(lines)):
            ok, full = ai.explain_stream("SMART 197", on_chunk=chunks.append)
        self.assertTrue(ok)
        self.assertEqual(full, "Failing disk.")
        self.assertEqual(chunks, ["Failing ", "disk."])
 if __name__ == "__main__":
    unittest.main()
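The two stream fixtures in `StreamTests` imply two wire formats: newline-delimited JSON objects with a `response` field (Ollama) and server-sent `data:` lines carrying `content_block_delta` events (Claude). The sketch below parses both. It is inferred from the fixtures only and is not the project's `ai.explain_stream`; `iter_ollama` and `iter_sse` are hypothetical helper names, and the input is assumed to be already-decoded text lines.

```python
import json


def iter_ollama(lines):
    """Yield text deltas from newline-delimited JSON objects with a "response" field."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        obj = json.loads(raw)
        if obj.get("response"):
            yield obj["response"]
        if obj.get("done"):
            break


def iter_sse(lines):
    """Yield text deltas from `data:` lines that carry content_block_delta events."""
    for raw in lines:
        raw = raw.strip()
        if not raw.startswith("data:"):
            continue  # skip `event:` lines, comments, and blank keep-alives
        obj = json.loads(raw[len("data:"):])
        if obj.get("type") == "content_block_delta":
            text = obj.get("delta", {}).get("text", "")
            if text:
                yield text
        elif obj.get("type") == "message_stop":
            break


ollama_lines = [
    '{"response": "It is ", "done": false}',
    '{"response": "the PSU.", "done": false}',
    '{"response": "", "done": true}',
]
sse_lines = [
    "event: content_block_delta",
    'data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"Failing "}}',
    'data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"disk."}}',
    'data: {"type":"message_stop"}',
]
print("".join(iter_ollama(ollama_lines)))
print("".join(iter_sse(sse_lines)))
```

Generators fit here because each delta can be handed to an `on_chunk` callback as it arrives while the caller accumulates the full reply.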
@@ -34,5 +34,35 @@ class AlertTests(unittest.TestCase):
        m.assert_called_once()
 class KernelEventAlertTests(unittest.TestCase):
    @mock.patch.object(alerts, "notify")
    def test_kernel_event_fires_once_within_cooldown(self, m):
        mon = alerts.AlertMonitor(cooldown=300.0, event_interval=0.0)
        mon._last_kernel_scan = 0.0  # force a scan
        with mock.patch("rigdoctor.core.syslogs.kernel_log",
                        return_value="NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus"):
            mon._scan_kernel_events()
            mon._last_kernel_scan = 0.0  # force another scan — cooldown must suppress it
            mon._scan_kernel_events()
        self.assertEqual(m.call_count, 1)
        self.assertIn("Xid", m.call_args[0][0])
    @mock.patch.object(alerts, "notify")
    def test_no_alert_when_kernel_log_empty(self, m):
        mon = alerts.AlertMonitor(event_interval=0.0)
        mon._last_kernel_scan = 0.0
        with mock.patch("rigdoctor.core.syslogs.kernel_log", return_value=""):
            mon._scan_kernel_events()
        m.assert_not_called()
    @mock.patch.object(alerts, "notify")
    def test_scan_gated_by_interval(self, m):
        mon = alerts.AlertMonitor(event_interval=9999.0)  # just constructed → not due yet
        with mock.patch("rigdoctor.core.syslogs.kernel_log", return_value="NVRM: Xid 79") as kl:
            mon._scan_kernel_events()
        kl.assert_not_called()
        m.assert_not_called()
 if __name__ == "__main__":
    unittest.main()
@@ -0,0 +1,104 @@
 """Tests for M15 per-diagnostic storage + Report bundles + app logging."""
 import json
 import tempfile
 import unittest
 import zipfile
 from dataclasses import dataclass, field
 from pathlib import Path
 from unittest import mock
 from rigdoctor.core import applog, diagstore
@dataclass
 class FakeSummary:
    start: float = 1.0
    end: float = 2.0
    samples: int = 3
    events: list = field(default_factory=list)
@dataclass
 class FakeFinding:
    severity: str = "ok"
    category: str = "GPU"
    title: str = "Looks fine"
    detail: str = "no issues"
@dataclass
 class FakeResult:
    game: str = "Path of Exile 2"
    summary: FakeSummary = field(default_factory=FakeSummary)
    findings: list = field(default_factory=lambda: [FakeFinding()])
    dir: str | None = None
 class StoreTests(unittest.TestCase):
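    def setUp(self):
        tmp = tempfile.TemporaryDirectory()
        self.addCleanup(tmp.cleanup)  # remove the scratch directory after each test
        self.tmp = Path(tmp.name)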
    def test_disabled_returns_none(self):
        with mock.patch.object(diagstore, "enabled", return_value=False):
            self.assertIsNone(diagstore.store(FakeResult()))
    def test_store_writes_artifacts(self):
        with mock.patch.object(diagstore, "enabled", return_value=True), \
             mock.patch("rigdoctor.render.render_summary", return_value="SUMMARY-TEXT"), \
             mock.patch("rigdoctor.core.gamelogs.collect", return_value="LOG-TEXT"), \
             mock.patch("rigdoctor.core.syslogs.collect", return_value="SYS-LOG"), \
             mock.patch("rigdoctor.core.inventory.collect", return_value=[]), \
             mock.patch.object(diagstore.config, "DIAGNOSTICS_DIR", self.tmp / "diagnostics"):
            directory = diagstore.store(FakeResult())
        self.assertTrue((directory / "result.json").exists())
        self.assertTrue((directory / "report.txt").exists())
        self.assertEqual((directory / "gamelogs.txt").read_text(), "LOG-TEXT")
        self.assertEqual((directory / "syslogs.txt").read_text(), "SYS-LOG")
        self.assertTrue((directory / "inventory.txt").exists())  # inventory included for debugging
        data = json.loads((directory / "result.json").read_text())
        self.assertEqual(data["game"], "Path of Exile 2")
        self.assertEqual(len(data["findings"]), 1)
    def test_record_ai_then_report_includes_ai_and_applog(self):
        diag = self.tmp / "20260522-poe2"
        diag.mkdir()
        diagstore.record_ai(diag, provider="claude", model="claude-opus-4-7",
                            system="SYS", prompt="EXACT DATA SENT", response="THE REPLY")
        ai_files = list((diag / "ai").glob("explain-*.json"))
        self.assertTrue(ai_files)
        record = json.loads(ai_files[0].read_text())
        self.assertEqual(record["model"], "claude-opus-4-7")
        self.assertEqual(record["data_sent_to_model"], "EXACT DATA SENT")
        self.assertEqual(record["model_reply"], "THE REPLY")
        app_log = self.tmp / "app.log"
        app_log.write_text("app log line")
        with mock.patch.object(diagstore.config, "REPORTS_DIR", self.tmp / "reports"), \
             mock.patch.object(diagstore.config, "APP_LOG", app_log):
            out = diagstore.make_report(diag)
        self.assertTrue(out.exists())
        with zipfile.ZipFile(out) as zf:
            names = zf.namelist()
        self.assertTrue(any(n.endswith("app.log") for n in names))
        self.assertTrue(any("/ai/explain-" in n for n in names))
 class AppLogTests(unittest.TestCase):
    def test_disabled_is_noop(self):
        with mock.patch.object(applog.config, "load_config", return_value={"logging_enabled": False}):
            self.assertFalse(applog.setup(force=True))
    def test_enabled_writes_file(self):
        tmp = Path(tempfile.mkdtemp())
        with mock.patch.object(applog.config, "load_config", return_value={"logging_enabled": True}), \
             mock.patch.object(applog.config, "STATE_DIR", tmp), \
             mock.patch.object(applog.config, "APP_LOG", tmp / "app.log"):
            self.assertTrue(applog.setup(force=True))
            applog.get_logger("test").info("hello world")
            applog.setup(force=True)  # cleanup path: re-run detaches/reattaches cleanly
        self.assertTrue((tmp / "app.log").exists())
 if __name__ == "__main__":
    unittest.main()
@@ -0,0 +1,77 @@
"""Tests for M14 game/Proton/Steam log collection."""

import os
import tempfile
import time
import unittest
from pathlib import Path
from unittest import mock

from rigdoctor.core import gamelogs


class TailTests(unittest.TestCase):
    def test_tail_returns_last_bytes(self):
        path = Path(tempfile.mkdtemp()) / "x.log"
        path.write_text("A" * 100 + "TAIL")
        out = gamelogs._tail(path, 4)
        self.assertEqual(out, "TAIL")

    def test_tail_short_file(self):
        path = Path(tempfile.mkdtemp()) / "x.log"
        path.write_text("short")
        self.assertEqual(gamelogs._tail(path, 9999), "short")

    def test_tail_missing(self):
        self.assertEqual(gamelogs._tail(Path("/nope/x.log"), 10), "")


class CollectTests(unittest.TestCase):
    def test_collect_includes_proton_and_steam(self):
        tmp = Path(tempfile.mkdtemp())
        proton = tmp / "steam-570.log"
        proton.write_text("err: vkd3d device lost")
        console = tmp / "console-linux.txt"
        console.write_text("Game removed AppID 570 ... exit")
        with mock.patch.object(gamelogs, "_proton_logs", return_value=[proton]), \
             mock.patch.object(gamelogs, "_steam_console", return_value=console):
            out = gamelogs.collect()
        self.assertIn("Proton log", out)
        self.assertIn("vkd3d", out)
        self.assertIn("Steam log", out)
        self.assertIn("exit", out)

    def test_collect_empty_when_none(self):
        with mock.patch.object(gamelogs, "_proton_logs", return_value=[]), \
             mock.patch.object(gamelogs, "_steam_console", return_value=None):
            self.assertEqual(gamelogs.collect(), "")


class SinceScopingTests(unittest.TestCase):
    def test_since_filter_keeps_window_only(self):
        text = (
            "[2026-05-22 13:00:00] old session line\n"
            "[2026-05-22 13:00:01] another old line\n"
            "[2026-05-22 14:30:00] new session launch\n"
            "[2026-05-22 14:30:05] new session error\n"
        )
        since = time.mktime(time.strptime("2026-05-22 14:00:00", "%Y-%m-%d %H:%M:%S"))
        out = gamelogs._since_filter(text, since)
        self.assertIn("new session launch", out)
        self.assertIn("new session error", out)
        self.assertNotIn("old session", out)

    def test_collect_skips_stale_proton_log(self):
        tmp = Path(tempfile.mkdtemp())
        proton = tmp / "steam-9999.log"
        proton.write_text("stale proton output from an earlier game")
        old_mtime = time.time() - 3600
        os.utime(proton, (old_mtime, old_mtime))
        since = time.time() - 60  # session started a minute ago
        with mock.patch.object(gamelogs, "_proton_logs", return_value=[proton]), \
             mock.patch.object(gamelogs, "_steam_console", return_value=None):
            self.assertEqual(gamelogs.collect(since=since), "")  # stale log excluded


if __name__ == "__main__":
    unittest.main()
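
The tests above describe two behaviours: a size-bounded tail read that returns an empty string for a missing file, and a timestamp filter that keeps only lines stamped at or after the session start. A minimal sketch of both follows. The names `tail_text` and `since_filter` are hypothetical stand-ins, the `[YYYY-MM-DD HH:MM:SS]` prefix is taken from the test fixture, and two details are assumptions of this sketch rather than rigdoctor's behaviour: stamps are read as local time, and unstamped lines follow the decision made for the last stamped line.

```python
import re
import time
from pathlib import Path

# Timestamp prefix as it appears in the test fixture: "[2026-05-22 14:30:00] ..."
_STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")


def tail_text(path: Path, max_bytes: int) -> str:
    """Return at most the last max_bytes bytes of a file; "" if it cannot be read."""
    try:
        with open(path, "rb") as fh:
            fh.seek(0, 2)  # jump to end of file to learn its size
            size = fh.tell()
            fh.seek(max(0, size - max_bytes))
            return fh.read().decode("utf-8", errors="replace")
    except OSError:
        return ""


def since_filter(text: str, since: float) -> str:
    """Keep lines whose timestamp is at or after `since` (epoch seconds)."""
    kept = []
    keep = False  # assumption: unstamped lines inherit the last stamped line's verdict
    for line in text.splitlines():
        match = _STAMP.match(line)
        if match:
            stamp = time.mktime(time.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
            keep = stamp >= since
        if keep:
            kept.append(line)
    return "\n".join(kept)
```

Reading in binary mode and seeking from the end keeps memory use bounded by `max_bytes` even for very large logs. Decoding with `errors="replace"` tolerates a cut that lands in the middle of a multi-byte character.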
@@ -0,0 +1,114 @@
"""Tests for M15 session-scoped system-log collection (kernel + coredumps)."""

import unittest
from unittest import mock

from rigdoctor.core import syslogs


class KernelLogTests(unittest.TestCase):
    def test_passes_since_and_tails(self):
        with mock.patch("shutil.which", return_value="/usr/bin/journalctl"), \
             mock.patch.object(syslogs, "_run", return_value="X" * 100 + "TAILLINE") as run:
            out = syslogs.kernel_log(since=1_000_000_000, max_bytes=8)
        self.assertEqual(out, "TAILLINE")
        cmd = run.call_args[0][0]
        self.assertIn("-k", cmd)
        self.assertIn("--since", cmd)

    def test_missing_tool_returns_empty(self):
        with mock.patch("shutil.which", return_value=None):
            self.assertEqual(syslogs.kernel_log(), "")


class CoredumpTests(unittest.TestCase):
    def test_empty_when_no_coredumps(self):
        with mock.patch("shutil.which", return_value="/usr/bin/coredumpctl"), \
             mock.patch.object(syslogs, "_run", return_value="No coredumps found."):
            self.assertEqual(syslogs.coredumps(), "")

    def test_returns_list(self):
        with mock.patch("shutil.which", return_value="/usr/bin/coredumpctl"), \
             mock.patch.object(syslogs, "_run", return_value="TIME PID SIG EXE\n... SEGV PathOfExile"):
            out = syslogs.coredumps()
        self.assertIn("PathOfExile", out)


class NvidiaTests(unittest.TestCase):
    def test_missing_tool(self):
        with mock.patch("shutil.which", return_value=None):
            self.assertEqual(syslogs.nvidia_snapshot(), "")

    def test_snapshot_head_truncated(self):
        with mock.patch("shutil.which", return_value="/usr/bin/nvidia-smi"), \
             mock.patch.object(syslogs, "_run", return_value="DRIVER\n" + "x" * 99999):
            out = syslogs.nvidia_snapshot(max_bytes=10)
        self.assertEqual(out, "DRIVER\nxxx")  # head, not tail


class DisplayTests(unittest.TestCase):
    def test_session_type_env(self):
        with mock.patch.dict("os.environ", {"XDG_SESSION_TYPE": "wayland"}):
            self.assertEqual(syslogs._session_type(), "wayland")

    def test_x11_tails_xorg_log(self):
        import tempfile
        from pathlib import Path
        log = Path(tempfile.mkdtemp()) / "Xorg.0.log"
        log.write_text("(EE) NVIDIA(GPU-0): something failed")
        with mock.patch.object(syslogs, "_session_type", return_value="x11"), \
             mock.patch.object(syslogs, "_xorg_log", return_value=log):
            out = syslogs.display_log()
        self.assertIn("(EE) NVIDIA", out)

    def test_wayland_uses_user_journal(self):
        with mock.patch.object(syslogs, "_session_type", return_value="wayland"), \
             mock.patch("shutil.which", return_value="/usr/bin/journalctl"), \
             mock.patch.object(syslogs, "_run", return_value="gnome-shell: GPU error") as run:
            out = syslogs.display_log(since=1_000_000_000)
        self.assertIn("GPU error", out)
        cmd = run.call_args[0][0]
        self.assertIn("--user", cmd)
        self.assertTrue(any(a.startswith("_COMM=") for a in cmd))


class ScanCriticalTests(unittest.TestCase):
    def test_matches_each_category(self):
        text = "\n".join([
            "NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus",
            "Out of memory: Killed process 1234 (PathOfExile)",
            "mce: [Hardware Error]: CPU 0",
            "pcieport 0000:00:01.0: AER: Corrected error received",
            "blk_update_request: I/O error, dev sda, sector 99",
            "this is a perfectly normal line",
        ])
        labels = {label for label, _ in syslogs.scan_critical(text)}
        self.assertEqual(labels, {
            "GPU error (Xid)", "Out of memory", "CPU machine-check",
            "PCIe error", "Disk I/O error"})

    def test_clean_log_no_events(self):
        self.assertEqual(syslogs.scan_critical("usb 1-2: new high-speed device\nsystemd: started"), [])


class CollectTests(unittest.TestCase):
    def test_collect_combines_sections(self):
        with mock.patch.object(syslogs, "kernel_log", return_value="NVRM: Xid 79"), \
             mock.patch.object(syslogs, "coredumps", return_value="game SIGSEGV"), \
             mock.patch.object(syslogs, "nvidia_snapshot", return_value="Driver Version 595"), \
             mock.patch.object(syslogs, "display_log", return_value="(EE) NVIDIA"):
            out = syslogs.collect()
        for needle in ("Kernel log", "Xid 79", "Crashed processes", "SIGSEGV",
                       "NVIDIA snapshot", "595", "Display server log"):
            self.assertIn(needle, out)

    def test_collect_empty_when_nothing(self):
        with mock.patch.object(syslogs, "kernel_log", return_value=""), \
             mock.patch.object(syslogs, "coredumps", return_value=""), \
             mock.patch.object(syslogs, "nvidia_snapshot", return_value=""), \
             mock.patch.object(syslogs, "display_log", return_value=""):
            self.assertEqual(syslogs.collect(), "")


if __name__ == "__main__":
    unittest.main()
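
`ScanCriticalTests` shows the shape of the matcher: kernel-log text goes in, a list of `(label, ...)` pairs comes out, and ordinary lines produce nothing. One way such a matcher could look is sketched below. The five labels come from the test; the regular expressions, the name `scan_critical_sketch`, and the choice to return the matching line as the second tuple element are assumptions, not rigdoctor's actual patterns.

```python
import re

# Label -> pattern. The labels match the test above; the patterns are illustrative guesses.
_PATTERNS = [
    ("GPU error (Xid)", re.compile(r"NVRM: Xid")),
    ("Out of memory", re.compile(r"Out of memory: Killed process")),
    ("CPU machine-check", re.compile(r"\bmce:")),
    ("PCIe error", re.compile(r"\bAER:")),
    ("Disk I/O error", re.compile(r"I/O error, dev \w+")),
]


def scan_critical_sketch(text: str) -> list[tuple[str, str]]:
    """Return (label, line) for each line that matches a critical-event pattern."""
    events = []
    for line in text.splitlines():
        for label, pattern in _PATTERNS:
            if pattern.search(line):
                events.append((label, line.strip()))
                break  # report each line once, under the first category it matches
    return events
```

Keeping the patterns in an ordered list makes the first-match rule explicit, so a line that could fit two categories is reported under only one of them.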
Author	SHA1	Message	Date
jessey	87fa678ccb	fix(cli): correct the missing-PySide6 hint to the real apt packages — 0.36.1 tests / core (pull_request) Successful in 13s Details tests / gui-smoke (pull_request) Successful in 26s Details rigdoctor gui suggested 'apt install python3-pyside6' (no such package on Debian/Ubuntu). Point to the split modules instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 15:48:20 +02:00
jessey	21cc6a4813	docs: document the proper (GPG-verified, deb822) apt setup tests / core (pull_request) Successful in 13s Details tests / gui-smoke (pull_request) Successful in 27s Details Replace the trusted=yes apt instructions with the proper method: read:package token, registry signing key dearmored into /etc/apt/keyrings, credentials in auth.conf.d, and a modern deb822 .sources file with Signed-By + Architectures: all. Keeps the trusted=yes one-liner as a noted fallback for unsigned registries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 15:44:41 +02:00
jessey	ee73049248	Merge pull request 'fix(deb): auto-install all deps — correct PySide6 names + bundle tools — 0.36.0' (#33 ) from fix/deb-pyside6-deps into main release / test (push) Successful in 12s Details release / release (push) Successful in 16s Details Reviewed-on: #33	2026-05-22 13:39:01 +00:00
jessey	3a8ad5bd5d	fix(deb): auto-install all deps — correct PySide6 names + bundle tools — 0.36.0 tests / core (pull_request) Successful in 12s Details tests / gui-smoke (pull_request) Successful in 29s Details The old Recommends named python3-pyside6 (no such package on Debian/Ubuntu — PySide6 is split per module), so apt skipped it and the GUI couldn't start. Now Recommends the real modules (python3-pyside6.qt{widgets,gui,websockets,svg} + python3-pyte) AND the optional diagnostic/gaming tools (smartmontools, lm-sensors, dmidecode, pciutils, libnotify-bin, libsecret-tools, gamemode, mangohud), so 'apt install rigdoctor' sets up the whole toolset automatically — no manual installs. cpupower -> Suggests. Verified all candidates resolve in apt. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 15:38:12 +02:00
jessey	e8b84bf046	Merge pull request 'docs: rewrite README to be user-first (install + use)' (#32 ) from docs/readme-users into main release / test (push) Successful in 12s Details release / release (push) Successful in 16s Details Reviewed-on: #32	2026-05-22 13:32:41 +00:00
jessey	2342dd83aa	docs: rewrite README to be user-first (install + use) tests / core (pull_request) Successful in 12s Details tests / gui-smoke (pull_request) Successful in 29s Details Lead with what RigDoctor does, then install (.deb/apt incl. the private-registry auth.conf.d + trusted=yes notes, and the .run), then usage (GUI/tray/CLI), requirements, and privacy. Move the dev content (from-source, tests, docs links) into a short Development section at the end. Drops the stale status/decisions/ repo-layout planning sections from the top. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 15:31:36 +02:00
jessey	a028fe6d38	Merge pull request 'ci: make apt registry upload idempotent (tolerate 409)' (#31 ) from fix/apt-409 into main release / test (push) Successful in 12s Details release / release (push) Successful in 16s Details Reviewed-on: #31	2026-05-22 13:26:47 +00:00
jessey	a6453335e9	ci: make apt registry upload idempotent (tolerate 409) tests / core (pull_request) Successful in 12s Details tests / gui-smoke (pull_request) Successful in 28s Details Gitea's Debian registry is immutable, so re-uploading an existing version returns 409. With --fail that aborted the release on any re-run / repeat push at the same version. Now we capture the HTTP code: 2xx = uploaded, 409 = already published (skip), anything else = fail with the body. Also fixed the stale skip message (REGISTRY_TOKEN, not PACKAGES_TOKEN). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 15:21:27 +02:00
jessey	baec47dd4e	Merge pull request 'assets: project avatar (gauge + heartbeat) for Gitea' (#30 ) from chore/avatar into main release / test (push) Successful in 12s Details release / release (push) Failing after 15s Details Reviewed-on: #30	2026-05-22 13:18:59 +00:00
jessey	47ecb702e7	Merge branch 'main' into chore/avatar tests / core (pull_request) Successful in 12s Details tests / gui-smoke (pull_request) Successful in 28s Details	2026-05-22 13:17:28 +00:00
jessey	944945ce72	Merge pull request 'feat(m9): .deb package + CI build/publish — 0.35.0' (#29 ) from feat/deb-packaging into main release / test (push) Successful in 13s Details release / release (push) Successful in 19s Details Reviewed-on: #29	2026-05-22 13:17:19 +00:00
jessey	dc719f6a89	assets: project avatar (gauge + heartbeat) for Gitea tests / core (pull_request) Successful in 13s Details tests / gui-smoke (pull_request) Successful in 27s Details 512x512 PNG (assets/avatar.png) rendered from assets/avatar.svg, matching the app icon's gauge-ring + heartbeat motif on a dark gradient. Upload as the repo avatar. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 15:16:58 +02:00
jessey	78cd417d0b	feat(m9): .deb package + CI build/publish — 0.35.0 tests / core (pull_request) Successful in 13s Details tests / gui-smoke (pull_request) Successful in 28s Details packaging/make_deb.py builds rigdoctor_<ver>_all.deb (Architecture: all) via dpkg-deb, no debhelper: Depends python3; Recommends python3-pyside6/pyte (GUI by default, --no-install-recommends = CLI only). Installs the package, both launchers, desktop entry + icon; postinst refreshes the desktop database. release.yml builds it as a release asset and optionally pushes to the Gitea apt registry (REGISTRY_TOKEN). Verified locally: valid .deb, packaged launcher runs 'rigdoctor --version'. Docs/README/ROADMAP/MODULES updated; M9 complete. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 15:15:33 +02:00
jessey	856a3305ad	Merge pull request 'feat(m8): event-based alerts — Xid/OOM/MCE/PCIe/disk from the kernel log — 0.34.0' (#28 ) from feat/event-alerts into main release / test (push) Successful in 13s Details release / release (push) Successful in 15s Details Reviewed-on: #28	2026-05-22 12:48:41 +00:00
jessey	3b1a2e7393	Merge branch 'feat/event-alerts' of ssh://jesseyvanofferen.com:2222/jessey/rigdoctor into feat/event-alerts tests / core (pull_request) Successful in 11s Details tests / gui-smoke (pull_request) Successful in 26s Details	2026-05-22 14:42:53 +02:00
jessey	2989e8e23e	ci: run tests.yml on pull_request only (no push) to avoid double runs A branch with an open PR triggered both the push and pull_request events, running every job twice. Trigger on pull_request only; pushes to main are already tested by release.yml's `test` job. No version bump (CI config only). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 14:42:41 +02:00
jessey	670df23e06	Merge branch 'main' into feat/event-alerts tests / core (push) Successful in 12s Details tests / gui-smoke (push) Successful in 26s Details tests / core (pull_request) Successful in 12s Details tests / gui-smoke (pull_request) Successful in 26s Details	2026-05-22 12:41:34 +00:00
jessey	2ee7763d00	feat(m8): event-based alerts — Xid/OOM/MCE/PCIe/disk from the kernel log — 0.34.0 tests / core (push) Successful in 12s Details tests / gui-smoke (push) Successful in 27s Details tests / core (pull_request) Successful in 12s Details tests / gui-smoke (pull_request) Successful in 26s Details AlertMonitor now scans the kernel log (journalctl -k) every ~30s and fires one-shot, cooldown-gated desktop alerts on critical events: NVIDIA Xid, OOM kills, CPU machine-checks, PCIe AER, and disk I/O errors — so users are warned the moment something goes wrong, not only on a temperature threshold. Disk I/O errors come from the kernel log (no root needed, unlike smartctl). Edge/spam protection reuses the existing cooldown model. syslogs.scan_critical() does the matching; init seeds last-scan to "now" so old boot logs don't alert on launch. Tests for the matcher + monitor gating/cooldown; Settings note updated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 14:41:13 +02:00
jessey	bd6cad5a42	Merge pull request 'feat(ai): stream explanations live (Ollama NDJSON + Claude SSE) — 0.33.0' (#27 ) from feat/syslogs into main release / test (push) Successful in 12s Details tests / core (push) Successful in 12s Details tests / gui-smoke (push) Successful in 25s Details release / release (push) Successful in 15s Details Reviewed-on: #27	2026-05-22 12:35:11 +00:00
jessey	7fa9b63661	Merge branch 'main' into feat/syslogs tests / core (push) Successful in 12s Details tests / gui-smoke (push) Successful in 25s Details tests / core (pull_request) Successful in 11s Details tests / gui-smoke (pull_request) Successful in 28s Details	2026-05-22 12:28:59 +00:00
jessey	c443a8b9f8	ci: add tests workflow + gate releases on tests passing tests / core (push) Successful in 12s Details tests / gui-smoke (push) Successful in 38s Details tests / core (pull_request) Successful in 13s Details tests / gui-smoke (pull_request) Successful in 27s Details - .gitea/workflows/tests.yml: run `unittest discover` on push + pull_request. `core` job (stdlib install, GUI tests skip) is bulletproof; `gui-smoke` job installs the GUI extra + offscreen Qt libs and runs the suite headless. - release.yml: add a `test` job and `release: needs: test` so a push to main can't publish if the tests fail. No version bump — CI config only; nothing in the shipped app changed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 14:26:47 +02:00
jessey	bbc22fa288	feat(ai): stream explanations live (Ollama NDJSON + Claude SSE) — 0.33.0 ai.explain_stream(findings_text, on_chunk) streams token deltas and returns (ok, full_text). Ollama: stream=True NDJSON; Claude: stream=True SSE (parse content_block_delta text deltas). The diagnostic dialog opens an explanation window immediately and fills it token-by-token via a _chunk signal, then re-renders the finished answer as Markdown — no more multi-second freeze on a local model. Non-streaming explain() kept for the CLI. Tests for both parsers; verified live against qwen2.5:7b. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 14:23:15 +02:00
jessey	5502251789	Merge pull request 'feat(m15): collect session-scoped system logs (kernel + coredumps) — 0.31.0' (#26 ) from feat/syslogs into main release / release (push) Successful in 15s Details Reviewed-on: #26	2026-05-22 12:16:52 +00:00
jessey	4bd51a40c3	feat(m15): nvidia-smi snapshot + display logs + inventory in reports — 0.32.0 Expand diagnostic/report collection (all stored per-diagnostic, in the Report zip; logs also fed to the AI on "Explain"): - syslogs: nvidia-smi -q snapshot (driver/throttle/clocks/power/temps/PCIe/ECC/ retired pages) + display-server log auto-detected — Xorg.0.log on X11, or the compositor user-journal slice (gnome-shell/kwin/sway/gamescope) on Wayland. - diagstore: include the full M5 inventory (inventory.txt + .json) — invaluable for larger/shared debugging. inventory.collect() degrades gracefully (no root prompt). Best-effort throughout. - Tests for nvidia/display + inventory in store; docs (M15/SPEC). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 14:16:23 +02:00
jessey	984292c368	feat(m15): collect session-scoped system logs (kernel + coredumps) — 0.31.0 core/syslogs.py gathers, scoped to the diagnostic window: - kernel-log slice (journalctl -k): Xid, OOM, MCE, PCIe AER, thermal, hung tasks - crashed-process records (coredumpctl): exe, signal, when Stored as syslogs.txt in the diagnostic dir, included in the Report bundle, and fed to the AI on "Explain" alongside the game logs. Best-effort (degrades if the tools are missing/denied); treats journalctl's "-- No entries --" as empty. Tests + docs (M15/SPEC). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 14:10:30 +02:00
jessey	bffaf73ad4	Merge pull request 'fix(ai): analyse the actual session, not stale/benign logs — 0.28.1' (#25 ) from feat/m14-ai into main release / release (push) Successful in 15s Details Reviewed-on: #25	2026-05-22 11:57:03 +00:00
jessey	7f0ab9a635	feat(m15): opt-in logging + per-diagnostic storage + Report bundles — 0.30.0 One `logging_enabled` toggle (default off) gates everything (D25): - core/applog.py: rotating app.log (no-op unless enabled); setup() at GUI/CLI start. - core/diagstore.py: each diagnostic stored in DATA_DIR/diagnostics/<id>/ (capture, result.json, report.txt, scoped gamelogs, ai/ records of exactly what was sent to the model + which model + the reply). make_report() zips a diagnostic (+ app.log) into DATA_DIR/reports/. - diagnostic.finish()/analyze_crash() store when enabled; DiagnosticResult.dir. - GUI: Settings → Logging toggle; "Report" button on the diagnostic dialog; AI interactions recorded into the diagnostic dir on "Explain with AI". - CLI: `rigdoctor bundle` (report is taken by the M4 health report). - Tests for store/record_ai/make_report + applog gating; docs (D25, M15, Phase 8). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 13:56:31 +02:00
jessey	12339c3282	feat(ai): resolve Steam app IDs from the library, don't make the model guess — 0.29.0 The model guessed "Rainbow Six Siege" for appID 2694490 (Path of Exile 2). We already know the names locally, so ground it: steam.appid_names() maps appid→name from the scanned library, and ai.build_prompt scans the text for app IDs and injects a resolved glossary. Only locally-known IDs are listed; no network, no fine-tuning. Tests + verified live (2694490 = Path of Exile 2). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 13:40:34 +02:00
jessey	c7e50ba4cb	fix(ai): analyse the actual session, not stale/benign logs — 0.28.1 The user ran a game ~20s with no crash but the AI dredged up old log lines, guessed the wrong game, and gave Windows advice. Fixes: - Prompt now includes the real game name + capture duration + outcome (clean vs crash), so the model uses the known game instead of guessing from log paths. - gamelogs.collect(since=…): scope Steam-console lines by timestamp and skip a stale per-app Proton log (mtime before the session) — no unrelated past run. - ai_knowledge: flag benign Steam/Proton lines (libnvidia-ml.so.1 assertion, routine minidumps, "fork without exec") as non-causal. - System prompt: Linux-only steps (no "run as administrator"); don't manufacture a problem on a clean run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 13:38:19 +02:00
jessey	a3caabc0d5	Merge pull request 'feat(ai): pre-fill qwen2.5:7b when Ollama is selected — 0.27.1' (#24 ) from feat/m14-ai into main release / release (push) Successful in 14s Details Reviewed-on: #24	2026-05-22 11:32:59 +00:00
jessey	b59f202891	feat(ai): render Markdown + feed game/Proton/Steam logs to the AI — 0.28.0 1) The explanation popup rendered raw Markdown (### / *). Switched to QTextEdit.setMarkdown and told the model to answer in Markdown. 2) On "Explain with AI", also collect recent Proton (~/steam-.log) and Steam console logs (core/gamelogs.py — tail-read, size-bounded) and include them in the prompt so the model can correlate log errors with findings and pinpoint when things went wrong. Reference-fact matching runs over the logs too. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 13:32:51 +02:00
jessey	e6d94fbd59	feat(ai): pre-fill qwen2.5:7b when Ollama is selected — 0.27.1 Selecting the Ollama provider pre-fills the model field with the suggested qwen2.5:7b (fits an 8 GB GPU at Q4; grounding makes a 7B sufficient). Won't overwrite a model the user already typed. Constant ai.OLLAMA_SUGGESTED_MODEL. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 13:25:04 +02:00
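
Commit bbc22fa288 in the table above describes streaming explanations from two wire formats: Ollama's NDJSON (one JSON object per line) and Claude's SSE stream, where text arrives in `content_block_delta` events. A minimal offline sketch of extracting text deltas from both is given here. The helpers `ollama_deltas`, `claude_deltas`, and `collect_stream` are hypothetical names that work on already-received lines; they are not rigdoctor's `ai.explain_stream`, and which Ollama endpoint rigdoctor calls is not stated in the source, so both common field layouts are handled.

```python
import json
from typing import Callable, Iterable, Iterator


def ollama_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from Ollama-style NDJSON lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        # /api/generate carries text in "response"; /api/chat in message.content.
        text = obj.get("response") or obj.get("message", {}).get("content", "")
        if text:
            yield text
        if obj.get("done"):
            break


def claude_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from Anthropic-style SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip "event:" lines and blank separators
        try:
            obj = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue
        if obj.get("type") != "content_block_delta":
            continue
        delta = obj.get("delta", {})
        if delta.get("type") == "text_delta":
            yield delta.get("text", "")


def collect_stream(lines: Iterable[str],
                   parse: Callable[[Iterable[str]], Iterator[str]],
                   on_chunk: Callable[[str], None]) -> tuple[bool, str]:
    """Forward each delta to on_chunk and return (ok, full_text)."""
    parts = []
    for chunk in parse(lines):
        on_chunk(chunk)
        parts.append(chunk)
    return True, "".join(parts)
```

Separating parsing from transport means each parser can be tested with plain lists of strings, with no network or model needed.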
@@ -1,3 +1,3 @@
 """RigDoctor — modular hardware monitoring & crash diagnostics for Linux gamers."""
 
-__version__ = "0.27.0"
+__version__ = "0.36.1"