rigdoctor/docs/ROADMAP.md
jessey 934b489fec feat(gui): Run Diagnostic flow on the Games page — 0.12.0
Brings the guided diagnostic (0.11.0 core/CLI) into the GUI:
- Each game row gets a "Run Diagnostic" button → starts a focused, game-tagged
  capture and shows a recording banner (live sample count + GPU-lost indicator)
  with Finish & analyze / Discard.
- Finishing runs core.diagnostic.finish() off the UI thread and opens a results
  dialog (gui/diagnostic_dialog.py): window-scoped capture summary + findings
  cards (reusing render_summary + finding_card).
- Banner restores on showEvent if a capture is still running (navigate away/back).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 08:32:04 +02:00

# RigDoctor — Roadmap (DRAFT v0.2)
Phased so the seed use case (capturing the RTX 3070 crash / black-screen events) is solved
early, before the broader "tool for all Linux gamers" work. Stack: Python 3 + Qt/PySide6;
Ubuntu + NVIDIA first; `.deb` distribution (see `DECISIONS.md`).
## Phase 0 — Workspace & spec *(done)*
- [x] Create repo + docs scaffold
- [x] Settle the foundational decisions D1–D11 (name, language, platform/GPU priority, MVP
scope, trigger model, packaging, scope-of-action, GUI/tray)
- [x] Lock the MVP scope (M1 + M3 + M4, NVIDIA-only)
## Phase 1 — MVP: capture *this* crash (Essential bundle, NVIDIA-only, CLI)
- [x] M1 sensor core (NVIDIA via nvidia-smi + hwmon for CPU/RAM/NVMe), stdlib-only
- [x] M3 crash-capture logger (JSONL, fsync per sample, GPU-lost detection, size rotation)
- [x] Manual trigger mode (`rigdoctor record run/start/stop/status`); the `systemd --user`
service and the other trigger modes follow in Phase 4 (`run` is already the service entrypoint)
- [x] M4 health report (Xid/panic/OOM/MCE/AER/thermal scan + SMART + driver-mismatch +
journald-persistence + live temps, suggested fixes only — D9; GPU-firmware verify deferred)
- [x] `record report` post-crash summary (peak temps/power per subsystem, events, last N samples)
- **Exit criteria:** user can run it during gaming and, after a freeze/black-screen, see the
last readings + a plausible cause.
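
The M3 logger behaviour listed above (JSONL, fsync per sample, GPU-lost detection, size rotation) can be illustrated with a minimal stdlib-only sketch. This is not the project's implementation: the class name `CrashLogger`, the field `gpu_temp_c`, the default limits, and the rule "a missing GPU reading means GPU lost" are all assumptions made for the example.

```python
import json
import os
import time
from pathlib import Path


class CrashLogger:
    """Append-only JSONL logger that fsyncs every sample and rotates by size."""

    def __init__(self, path, max_bytes=5_000_000, backups=3):
        self.path = Path(path)
        self.max_bytes = max_bytes
        self.backups = backups
        self._fh = open(self.path, "a", encoding="utf-8")

    def _backup(self, index):
        # capture.jsonl -> capture.jsonl.<index>
        return self.path.with_name(f"{self.path.name}.{index}")

    def _rotate(self):
        # Shift .1 -> .2, .2 -> .3, ...; the oldest backup is overwritten.
        self._fh.close()
        for i in range(self.backups - 1, 0, -1):
            if self._backup(i).exists():
                os.replace(self._backup(i), self._backup(i + 1))
        os.replace(self.path, self._backup(1))
        self._fh = open(self.path, "a", encoding="utf-8")

    def write(self, sample):
        record = {"ts": time.time(), **sample}
        # Assumed GPU-lost rule: the GPU reading is missing from the sample.
        record["gpu_lost"] = sample.get("gpu_temp_c") is None
        self._fh.write(json.dumps(record) + "\n")
        self._fh.flush()
        os.fsync(self._fh.fileno())  # push the line to disk before the next sample
        if os.fstat(self._fh.fileno()).st_size >= self.max_bytes:
            self._rotate()

    def close(self):
        self._fh.close()
```

Syncing after every line costs throughput, but it means the last sample before a hard freeze is likely to survive on disk, which is the point of the Phase 1 exit criteria.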
## Phase 2 — Live monitor (terminal)
- [ ] M2 TUI dashboard (current/min/max, grouped, throttle highlighting)
- [ ] M8 basic alerting (overheat/throttle/GPU-lost notifications)
## Phase 3 — Diagnostics breadth
- [ ] M5 system inventory + exportable report
- [~] M6 gaming environment checks (suggest-only) — *Steam game/library detection done*
(multi-library `libraryfolders.vdf` discovery + `appmanifest` scan, opt-in libraries,
launch-time background rescan with new-game badge; CLI `rigdoctor games`, GUI Games page).
This is also the D12 "pick a game" foundation. *Env-check engine done* (`rigdoctor gameenv`
+ GUI Environment page): PCIe ASPM, NVIDIA persistence, CPU governor, GameMode, MangoHud,
swappiness, shader cache, THP, mitigations, Proton versions — read-only with fix commands.
*Pending:* non-Steam launchers (Lutris/Heroic) + GPU power-profile (PowerMizer) checks.
- [ ] SMART integration (smartmontools if present)
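
The Steam detection described under M6 (multi-library `libraryfolders.vdf` discovery plus an `appmanifest` scan) could look roughly like the sketch below. It is a regex-based approximation rather than a full VDF parser, and the function names are hypothetical; it only relies on the layout Steam uses on disk, where each library holds `steamapps/appmanifest_<appid>.acf` files with `"appid"` and `"name"` keys.

```python
import re
from pathlib import Path

# One `"key"  "value"` line; section headers carry no value and are skipped.
_PAIR = re.compile(r'^[ \t]*"([^"]+)"[ \t]+"([^"]*)"[ \t]*$', re.MULTILINE)


def library_paths(vdf_text):
    """Return every "path" entry found in libraryfolders.vdf text."""
    return [Path(value) for key, value in _PAIR.findall(vdf_text) if key == "path"]


def scan_library(library):
    """Map appid -> name for each appmanifest_*.acf under <library>/steamapps."""
    games = {}
    steamapps = Path(library) / "steamapps"
    if not steamapps.is_dir():
        return games
    for manifest in sorted(steamapps.glob("appmanifest_*.acf")):
        text = manifest.read_text(encoding="utf-8", errors="replace")
        fields = {}
        for key, value in _PAIR.findall(text):
            fields.setdefault(key, value)  # keep the first (top-level) occurrence
        if "appid" in fields:
            games[fields["appid"]] = fields.get("name", "(unknown)")
    return games


def discover_games(vdf_file):
    """Scan every library listed in the given libraryfolders.vdf."""
    text = Path(vdf_file).read_text(encoding="utf-8", errors="replace")
    games = {}
    for library in library_paths(text):
        games.update(scan_library(library))
    return games
```

A caller would pass the path of the user's `libraryfolders.vdf`; where that file lives depends on how Steam was installed, so the sketch leaves the location to the caller.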
## Phase 4 — Desktop UI & installer
- [ ] M10 desktop GUI (PySide6: dashboard, log browser, report viewer, logger controls)
- [ ] M11 tray / menu-bar applet (QSystemTrayIcon: live M1 readouts + Run Diagnostic +
supporting actions — D13)
- [~] Guided diagnostic session (pick game → focused M3 capture → M4 scan → findings),
shared by tray/GUI/CLI — *core + CLI + GUI done* (`core/diagnostic.py`, `rigdoctor
diagnose start/status/finish`, and a **Run Diagnostic** button per game on the GUI Games
page → recording banner → results dialog with the capture summary + findings). Tags a
focused capture with the chosen game (own diagnostic log, window-scoped report) and
combines the capture summary with the M4 findings. *Pending:* the tray (M11) entry point,
and auto start/stop via the D12 wrapper/watcher.
- [ ] Logger trigger modes: always-on + game-launch (D12 — wrapper first:
`rigdoctor wrap %command%` + global Steam compat-tool; zero-config watcher
(Steam RunningAppID + /proc) and GameMode hook follow)
- [~] M9 interactive installer — *done:* distro/GPU detection + optional-dependency install
(`rigdoctor install`, GUI Setup tab); **user-local `install.sh` + self-extracting `.run`**
(no-root venv install, handles python3-venv prereq, CI-built). *Pending:* module-selection
config + `systemd --user` service enable + trigger-mode pick.
- [ ] `.deb` packaging (D8) declaring per-bundle deps incl. python3-pyside6 for Desktop UI
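
The "wrapper first" trigger mode (`rigdoctor wrap %command%`) amounts to bracketing the game process with a capture start and stop. The sketch below shows that shape only; `wrap`, `on_start` and `on_stop` are hypothetical names, and how the real wrapper talks to the logger is not specified in this roadmap.

```python
import subprocess
import sys


def wrap(command, on_start, on_stop):
    """Run `command`, bracketing it with capture start/stop callbacks.

    Returns the child's exit code so the launcher sees the game's own status.
    """
    on_start()
    try:
        return subprocess.run(command).returncode
    finally:
        on_stop()  # runs even if the launch fails or the wrapper is interrupted


if __name__ == "__main__":
    # Steam replaces %command% with the game's argv, which arrives here as sys.argv[1:].
    if len(sys.argv) > 1:
        code = wrap(
            sys.argv[1:],
            on_start=lambda: print("capture started", file=sys.stderr),
            on_stop=lambda: print("capture stopped", file=sys.stderr),
        )
        sys.exit(code)
```

Passing the exit code through keeps the wrapper transparent to the launcher, and the `finally` block ensures the capture is closed even when the game dies abnormally.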
## Phase 5 — Breadth (later)
- [ ] AMD GPU support in M1 (Steam Deck / Radeon)
- [ ] Intel GPU best-effort
- [x] M13 auto-update (D18) — launch-time version check (GUI sidebar) + no-root self-update
apply (`rigdoctor update` / sidebar button → authenticated pip upgrade), token-gated.
Restart-after-update is manual for now.
- [~] Optional auto-apply of suggested fixes behind explicit consent (D9 milestone) — *first
cut shipped for M6 (D22):* one-click apply of runtime-reversible tunables (CPU governor,
NVIDIA persistence, PCIe ASPM, swappiness, THP) via a single pkexec prompt, no reboot.
GRUB-based fixes + CPU mitigations remain suggestion-only.
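
One way to get the "single pkexec prompt" behaviour for the runtime tunables is to join the selected fixes into one shell script and elevate once. The sketch below is an assumption about the approach, not the shipped code: the target values (`performance`, `10`, `madvise`) and the names `TUNABLES`, `build_script` and `apply_fixes` are illustrative, while the sysfs/procfs paths and `nvidia-smi -pm 1` are standard Linux and NVIDIA interfaces.

```python
import subprocess

# Tunable name -> shell snippet. Values are example targets, not recommendations.
TUNABLES = {
    "cpu_governor": (
        "for f in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; "
        'do echo performance > "$f"; done'
    ),
    "nvidia_persistence": "nvidia-smi -pm 1",
    "pcie_aspm": "echo performance > /sys/module/pcie_aspm/parameters/policy",
    "swappiness": "sysctl -w vm.swappiness=10",
    "thp": "echo madvise > /sys/kernel/mm/transparent_hugepage/enabled",
}


def build_script(selected):
    """Join the chosen fixes into one script so a single elevation covers them all."""
    unknown = [name for name in selected if name not in TUNABLES]
    if unknown:
        raise ValueError(f"unknown tunables: {unknown}")
    return "set -e\n" + "\n".join(TUNABLES[name] for name in selected)


def apply_fixes(selected, dry_run=True):
    """Return the pkexec argv (dry run) or execute it and return the exit code."""
    argv = ["pkexec", "/bin/sh", "-c", build_script(selected)]
    if dry_run:
        return argv
    return subprocess.run(argv).returncode
```

Because none of these writes persist across a reboot, they match the "runtime-reversible" constraint; GRUB edits would not, which is consistent with keeping those suggestion-only.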
## Phase 6 — Session sharing / remote assist (M12, D16)
Escalating ladder, built in order:
- [ ] Tier 1: `share export`, a diagnostic bundle (inventory + recent log + report) that the
helper ("B") opens in RigDoctor. One-way, safest.
- [x] Tier 2: live read-only view — `rigdoctor share serve` (stdlib HTTP, token-gated:
sensors + health + inventory). Remote access goes through a user-chosen tunnel; GUI controls are still to be added.
- [x] Tier 3: host-consented interactive terminal — a real PTY shell shared over the relay
(own `pty`, pyte-rendered guest), off by default; host reads along + can type (sudo).
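
For Tier 2, a token-gated read-only endpoint can be built from the standard library alone. The following is a minimal sketch under stated assumptions: the `Authorization: Bearer` header, the `make_server` factory and the `snapshot` callback are hypothetical choices for the example and are not taken from `rigdoctor share serve`.

```python
import hmac
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_server(token, snapshot, host="127.0.0.1", port=0):
    """Build a read-only HTTP server that returns snapshot() as JSON to token holders."""
    expected = f"Bearer {token}".encode()

    class Handler(BaseHTTPRequestHandler):
        # Only GET is defined, so every other method is rejected by the base class.
        def do_GET(self):
            supplied = self.headers.get("Authorization", "").encode()
            # Constant-time comparison avoids leaking the token through timing.
            if not hmac.compare_digest(supplied, expected):
                self.send_error(401, "invalid or missing token")
                return
            body = json.dumps(snapshot()).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the sketch quiet; a real tool would log access attempts

    return ThreadingHTTPServer((host, port), Handler)
```

Binding to `127.0.0.1` by default fits the "user-chosen tunnel" model: the server is never exposed directly, and the user decides how a remote helper reaches it.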
> **Out of scope:** multi-distro support and packaging beyond Ubuntu/apt + `.deb` (D15); a thin
> seam is kept but not built out.
>
> **Dropped:** stress/repro module (D7); it is not on the roadmap.