jessey ce5f830393
Release 0.0.2: M3 logger (CLI + GUI), GUI-first, CI release workflow
Crash-capture logger (M3):
- crash-safe JSONL (fsync per sample), size-based rotation, GPU-lost/recovered
  markers, atomic status file
- CLI: record run/start/stop/status/report (run = systemd-ready entrypoint)
- shared core.reccontrol so CLI + GUI drive the same recorder
- crashlog tests (writer, rotation, reader, summary, recorder)

GUI:
- Recording/Logs page: start/stop/interval controls, live status, post-crash report
- shared render helpers (format_raw/headline, render_summary)

Docs/decisions:
- GUI-first (D17); CLI keeps full parity
- D8 revised: user-local self-updating install primary, .deb optional
- planned: M12 session sharing (D16), M13 no-root auto-update from public repo (D18)
- versioning + CHANGELOG convention (D19)

Infra:
- .gitea/workflows/release.yml: build wheel+sdist and publish a Gitea release
  v<version> on push to main
- align version to the 0.0.x release line; bump to 0.0.2

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 17:16:41 +02:00

# RigDoctor — Module Catalog (DRAFT v0.2)
Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done · ❌ dropped
> Module set per D14, plus **M12 (session sharing, D16)** and **M13 (auto-update, D18)**.
> **M7 (stress/repro) was dropped (D7).** M10/M11 are the GUI and tray modules (D10/D11).
> GPU scope "all (NVIDIA first)" means the NVIDIA path ships first; other vendors follow via the vendor abstraction (D4).

| ID | Module | Bundle | Key deps | GPU scope | Priority | Status |
|----|--------|--------|----------|-----------|----------|--------|
| M1 | Sensor core | Essential | none (nvidia-smi, sysfs) | all (NVIDIA first) | P0 | ⬜ |
| M3 | Crash-capture logger | Essential | none (opt: smartmontools) | all (NVIDIA first) | P0 | 🟨 |
| M4 | Health report (log scan) | Essential | none (opt: smartmontools) | all (NVIDIA first) | P0 | ⬜ |
| M2 | Live monitor (TUI) | Monitoring | none (stdlib curses) | all | P1 | ⬜ |
| M8 | Alerting | Monitoring | libnotify (opt) | all | P2 | ⬜ |
| M5 | System inventory | Diagnostics | none (opt: lm-sensors, dmidecode) | all | P1 | ⬜ |
| M6 | Gaming env checks | Diagnostics | none | all | P2 | ⬜ |
| M10 | Desktop GUI | Desktop UI | **python3-pyside6** | all | P2 | 🟨 |
| M11 | Tray / menu-bar applet | Desktop UI | **python3-pyside6** (+ AppIndicator on GNOME) | all | P2 | ⬜ |
| M9 | Installer | (meta) | none | all | P1 | ⬜ |
| M12 | Session sharing / remote assist | Sharing | none (Tier 3: tmate/sshx) | all | P3 | ⬜ |
| M13 | Auto-update | (core) | none (stdlib; user-local file swap) | all | P3 | ⬜ |
| ~~M7~~ | ~~Stress / repro~~ | — | — | — | — | ❌ dropped (D7) |
## Notes per module
- **M1 Sensor core** — the foundation everything else samples from. Stdlib-only. Abstracts
NVIDIA/AMD/Intel + hwmon behind one interface; **ship the NVIDIA + hwmon path first**.
- **M3 Crash-capture logger** — the highest-value piece for the seed use case. `fsync` per
sample; GPU-lost detection via query timeout; bounded rotation; `systemd --user` service
with a **user-selectable trigger mode** (always-on / game-launch / manual — D6).
*Implemented (manual trigger):* JSONL log with fsync-per-sample, size-based rotation
(`log_max_bytes`/`log_backups`), GPU-lost/recovered event markers, atomic status file, and
`rigdoctor record run|start|stop|status|report`. The foreground `run` is the systemd-ready
entrypoint; the service unit + always-on/game-launch triggers (D6/D12) land in Phase 4.
Also fully driven from the GUI's Recording/Logs page (M10) via shared `core.reccontrol`.
- **M4 Health report** — turns scattered logs into a prioritized, plain-language findings
list with **suggested** fixes (read-only, D9). Reuses M1 for a live snapshot. Also powers
the **guided diagnostic session** (with M3): pick a game → focused capture → scan →
findings (see SPEC §4).
- **M2 Live monitor** — depends on M1; the terminal "HWMonitor for Linux" face. Stdlib-only.
- **M5 / M6 Diagnostics** — inventory export + gaming-env checks; M6 flags risky settings and
suggests the fix command but does not apply it (D9).
- **M8 Alerting** — threshold/event notifications; integrates with the tray applet (M11).
- **M10 Desktop GUI** — PySide6 graphical front-end over the core engine (dashboard, log
browser, report viewer, logger controls). Optional; adds the Qt dependency. *Bootstrapped
early (ahead of its Phase 4 slot) at the user's request:* dark-themed window with sidebar
nav, a live dashboard (circular gauges + collapsible per-subsystem cards,
temperature-colored values), and a **Recording/Logs page** with full M3 controls (start/stop/status +
post-crash report). Health/Inventory remain placeholders until M4/M5. GUI-first per D17.
- **M11 Tray applet** — `QSystemTrayIcon` menu-bar applet. Dropdown shows live M1 readouts
(CPU temp, GPU temp, memory used/total, status dot) and is led by a **Run Diagnostic**
action (the guided diagnostic session), plus Open dashboard / Start-Stop recording /
Snapshot / Quit (D13). Optional; shares the Qt dependency with M10.
- **M9 Installer** — interactive wizard layered on the `.deb` (D8); apt-first dependency
resolution; enables the logger service and trigger mode.
- **M12 Session sharing / remote assist** (D16) — let a helper inspect a user's machine, in
an escalating ladder: (1) **diagnostic bundle export** (inventory + recent log + report,
one-way), (2) **live read-only view** over a user-chosen tunnel (Tailscale/cloudflared/SSH,
no hosted relay), (3) **gated interactive terminal** wrapping tmate/sshx (read-only by
default; read-write only on explicit consent — a deliberate exception to D9). Per-session
consent, ephemeral revocable tokens, audit log.
- **M13 Auto-update** (D18) — *planned.* On launch, check the public Gitea releases API and
**self-update a user-local install with no root** (download → verify checksum/signature →
atomic symlink swap → restart, including the daemon). HTTPS-only, version-check-only (no
telemetry), and can be disabled. Surfaced in the GUI; `rigdoctor update` in the CLI. (`.deb` users
update via apt instead.)
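
The M3 write path described above (fsync per sample, size-based rotation, atomic status file) could look roughly like the following. This is a minimal sketch under stated assumptions, not RigDoctor's actual code: the names `CrashSafeJsonlWriter` and `write_status_atomic`, the sample fields, and the default values are hypothetical. Only `log_max_bytes` and `log_backups` come from this document.

```python
import json
import os
import tempfile
from pathlib import Path


class CrashSafeJsonlWriter:
    """Hypothetical append-only JSONL writer: fsync per sample, size-based rotation."""

    def __init__(self, path, log_max_bytes=5_000_000, log_backups=3):
        # Default values are assumptions chosen for illustration.
        self.path = Path(path)
        self.log_max_bytes = log_max_bytes
        self.log_backups = log_backups
        self._open()

    def _open(self):
        # Binary append mode so sizes are counted in bytes.
        self._fh = open(self.path, "ab")
        self._size = os.fstat(self._fh.fileno()).st_size

    def _backup(self, index):
        return Path(f"{self.path}.{index}")

    def _rotate(self):
        self._fh.close()
        if self.log_backups > 0:
            # Shift log.N-1 -> log.N, ..., log.1 -> log.2; the oldest is overwritten.
            for i in range(self.log_backups - 1, 0, -1):
                if self._backup(i).exists():
                    os.replace(self._backup(i), self._backup(i + 1))
            os.replace(self.path, self._backup(1))
        else:
            self.path.unlink()
        self._open()

    def write_sample(self, sample):
        data = (json.dumps(sample, separators=(",", ":")) + "\n").encode("utf-8")
        # Rotate first if this line would push a non-empty file past the cap.
        if self._size > 0 and self._size + len(data) > self.log_max_bytes:
            self._rotate()
        self._fh.write(data)
        self._fh.flush()
        # Force the sample to disk so it survives a hard crash right after.
        os.fsync(self._fh.fileno())
        self._size += len(data)

    def close(self):
        self._fh.close()


def write_status_atomic(path, status):
    """Write the status JSON to a temp file, then rename it over the target."""
    path = Path(path)
    fd, tmp = tempfile.mkstemp(dir=str(path.parent), prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(status, fh)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace is atomic within one filesystem: readers see old or new, never partial.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

In this sketch a GPU-lost or GPU-recovered marker would just be another JSONL line, for example `writer.write_sample({"event": "gpu_lost"})`, so it gets the same fsync guarantee as a regular sample. Syncing on every sample costs write throughput, but it means the last sample before a hard lockup is on disk.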
## Bundles (final — D14)
- **Essential:** M1 + M3 + M4 *(the MVP, NVIDIA-only — D5)*
- **Monitoring:** M2 + M8
- **Diagnostics:** M5 + M6
- **Desktop UI:** M10 + M11 *(adds PySide6)*
- **Sharing:** M12 *(session sharing / remote assist — D16)*
## MVP candidate — *confirmed (D5)*
**M1 + M3 + M4 (Essential), NVIDIA-only, CLI-first.** Gives a working tool that captures the
GPU crash and explains the logs — deliverable before the installer, GUI/tray, or multi-vendor
work.