305b6c4497
Planning docs (SPEC, ARCHITECTURE, MODULES, ROADMAP, DECISIONS) with decisions D1-D15 settled: RigDoctor name, Python 3 + Qt/PySide6 stack (core/CLI/daemon stdlib-only), Ubuntu + NVIDIA first, .deb packaging, read-only + suggestions, GUI + tray modules, stress module dropped.

First code: the M1 sensor core (stdlib-only) and a CLI.

- core engine: Reading/Sample model, Sampler, hwmon reader
- self-probing sources (NVIDIA first): nvidia-smi GPU, coretemp/k10temp CPU, /proc/meminfo + DDR5 SPD memory, NVMe storage
- CLI: snapshot (text/JSON), monitor, sources; record/report stubbed
- stdlib unittest smoke tests

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# RigDoctor — Roadmap (DRAFT v0.2)
Phased so the seed use case (capturing the RTX 3070 crash / black-screen events) is solved
early, before the broader "tool for all Linux gamers" work. Stack: Python 3 + Qt/PySide6;
Ubuntu + NVIDIA first; `.deb` distribution (see `DECISIONS.md`).
## Phase 0 — Workspace & spec *(done)*
- [x] Create repo + docs scaffold
- [x] Settle the foundational decisions D1–D11 (name, language, platform/GPU priority, MVP
  scope, trigger model, packaging, scope-of-action, GUI/tray)
- [x] Lock the MVP scope (M1 + M3 + M4, NVIDIA-only)
## Phase 1 — MVP: capture *this* crash (Essential bundle, NVIDIA-only, CLI)
- [ ] M1 sensor core (NVIDIA via nvidia-smi + hwmon for CPU/RAM/NVMe), stdlib-only
- [ ] M3 crash-capture logger (CSV, fsync per sample, GPU-lost detection, rotation,
  `systemd --user` service)
- [ ] Manual trigger mode first (`rigdoctor record start/stop`); other modes in Phase 4
- [ ] M4 health report (Xid/panic/OOM/MCE/AER/thermal scan + driver-mismatch + snapshot,
suggested fixes only — D9)
- [ ] `--report` post-crash summary (max temps/power, throttle events, last N samples)
- **Exit criteria:** user can run it during gaming and, after a freeze/black-screen, see the
  last readings + a plausible cause.
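
The "fsync per sample" and post-crash summary items above can be illustrated with a minimal, stdlib-only sketch. It is not the project's actual M3 implementation: the column names in `FIELDS` and the helpers `append_sample` and `summarize` are hypothetical, chosen only to show why each row is flushed and synced before the next sample is taken.

```python
import csv
import os
from collections import deque

# Hypothetical column set; the real M3 CSV schema is not defined in this roadmap.
FIELDS = ["timestamp", "gpu_temp_c", "gpu_power_w"]


def append_sample(path, sample):
    """Append one row and force it to storage before returning."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(sample)
        fh.flush()             # Python buffer -> OS page cache
        os.fsync(fh.fileno())  # OS page cache -> storage device


def summarize(path, last_n=5):
    """Return max temperature/power and the last N rows of a capture."""
    max_temp = None
    max_power = None
    tail = deque(maxlen=last_n)  # keeps only the most recent rows
    with open(path, newline="") as fh:
        for row in csv.DictReader(fh):
            temp = float(row["gpu_temp_c"])
            power = float(row["gpu_power_w"])
            max_temp = temp if max_temp is None else max(max_temp, temp)
            max_power = power if max_power is None else max(max_power, power)
            tail.append(row)
    return {
        "max_gpu_temp_c": max_temp,
        "max_gpu_power_w": max_power,
        "last_samples": list(tail),
    }
```

Syncing every row costs some write throughput, but it means the samples taken just before a hard freeze are likely to be on disk when the machine comes back up, which is the point of the exit criteria above.
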
## Phase 2 — Live monitor (terminal)
- [ ] M2 TUI dashboard (current/min/max, grouped, throttle highlighting)
- [ ] M8 basic alerting (overheat/throttle/GPU-lost notifications)
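
The current/min/max columns and the overheat alerts listed above could be backed by a small tracker along these lines. This is only a sketch: `Stat`, `MinMaxTracker`, and the reading names are hypothetical, and any limits passed to `over_limit` are illustrative values supplied by the caller, not vendor thresholds.

```python
from dataclasses import dataclass


@dataclass
class Stat:
    current: float
    minimum: float
    maximum: float


class MinMaxTracker:
    """Tracks current/min/max per named reading (hypothetical helper)."""

    def __init__(self):
        self._stats = {}

    def update(self, name, value):
        """Record a new value and return the updated statistics."""
        stat = self._stats.get(name)
        if stat is None:
            stat = Stat(value, value, value)
            self._stats[name] = stat
        else:
            stat.current = value
            stat.minimum = min(stat.minimum, value)
            stat.maximum = max(stat.maximum, value)
        return stat

    def over_limit(self, limits):
        """Return sorted names whose current value is at or above its limit."""
        return sorted(
            name
            for name, stat in self._stats.items()
            if name in limits and stat.current >= limits[name]
        )
```

A dashboard could call `update` once per sample and highlight whatever `over_limit` returns; the same list could feed notifications.
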
## Phase 3 — Diagnostics breadth
- [ ] M5 system inventory + exportable report
- [ ] M6 gaming environment checks (suggest-only)
- [ ] SMART integration (smartmontools if present)
## Phase 4 — Desktop UI & installer
- [ ] M10 desktop GUI (PySide6: dashboard, log browser, report viewer, logger controls)
- [ ] M11 tray / menu-bar applet (QSystemTrayIcon: live M1 readouts + Run Diagnostic +
supporting actions — D13)
- [ ] Guided diagnostic session (pick game → focused M3 capture → M4 scan → findings),
  shared by tray/GUI/CLI
- [ ] Logger trigger modes: always-on + game-launch (D12 — wrapper first:
  `rigdoctor wrap %command%` + global Steam compat-tool; zero-config watcher
  (Steam RunningAppID + /proc) and GameMode hook follow)
- [ ] M9 interactive installer (GPU detection, module menu, apt dependency resolution,
  service enable + trigger-mode pick)
- [ ] `.deb` packaging (D8) declaring per-bundle deps incl. python3-pyside6 for Desktop UI
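
The wrapper-first trigger mode in the list above (`rigdoctor wrap %command%`) amounts to starting a capture, running the game command that Steam substitutes for `%command%`, and stopping the capture when it exits. A minimal sketch of that control flow follows; the function `wrap` and its `on_start` / `on_stop` callbacks are hypothetical stand-ins for whatever the logger actually exposes.

```python
import subprocess


def wrap(command, on_start=None, on_stop=None):
    """Run `command` between two hooks and return its exit code.

    `command` is an argv list, e.g. the game command line passed by the
    launcher. The hooks are hypothetical placeholders for starting and
    stopping an M3 capture.
    """
    if on_start is not None:
        on_start()
    try:
        # Blocks until the wrapped process exits.
        return subprocess.run(command).returncode
    finally:
        # Runs even if the child could not be launched or was interrupted.
        if on_stop is not None:
            on_stop()
```

Returning the child's exit code unchanged keeps the wrapper transparent to the launcher, and the `finally` clause ensures the capture is closed even when the game fails to start.
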
## Phase 5 — Breadth (later)
- [ ] AMD GPU support in M1 (Steam Deck / Radeon)
- [ ] Intel GPU best-effort
- [ ] (Later, separate milestone) Optional auto-apply of suggested fixes behind explicit
consent — currently out of scope (D9)
> **Out of scope:** multi-distro support and packaging beyond Ubuntu/apt + `.deb` (D15);
> a thin seam is kept but not built out.
>
> **Dropped:** stress / repro module (D7); not on the roadmap.