# RigDoctor — Module Catalog (DRAFT v0.2)

Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done · ❌ dropped

> Module set per D14, plus **M12 (session sharing, D16)** and **M13 (auto-update, D18)**.
> **M7 (stress/repro) was dropped (D7).** M10/M11 are the GUI and tray modules (D10/D11).
> In the GPU scope column, "all (NVIDIA first)" means the NVIDIA path ships first; other vendors follow via the vendor abstraction (D4).

| ID | Module | Bundle | Key deps | GPU scope | Priority | Status |
|----|--------|--------|----------|-----------|----------|--------|
| M1 | Sensor core | Essential | none (nvidia-smi, sysfs) | all (NVIDIA first) | P0 | ⬜ |
| M3 | Crash-capture logger | Essential | none (opt: smartmontools) | all (NVIDIA first) | P0 | 🟨 |
| M4 | Health report (log scan) | Essential | none (opt: smartmontools) | all (NVIDIA first) | P0 | ⬜ |
| M2 | Live monitor (TUI) | Monitoring | none (stdlib curses) | all | P1 | ⬜ |
| M8 | Alerting | Monitoring | libnotify (opt) | all | P2 | ⬜ |
| M5 | System inventory | Diagnostics | none (opt: lm-sensors, dmidecode) | all | P1 | ⬜ |
| M6 | Gaming env checks | Diagnostics | none | all | P2 | ⬜ |
| M10 | Desktop GUI | Desktop UI | **python3-pyside6** | all | P2 | 🟨 |
| M11 | Tray / menu-bar applet | Desktop UI | **python3-pyside6** (+ AppIndicator on GNOME) | all | P2 | ⬜ |
| M9 | Installer | (meta) | none | all | P1 | ⬜ |
| M12 | Session sharing / remote assist | Sharing | none (Tier 3: tmate/sshx) | all | P3 | ⬜ |
| M13 | Auto-update | (core) | none (stdlib; user-local file swap) | all | P3 | ⬜ |
| ~~M7~~ | ~~Stress / repro~~ | — | — | — | — | ❌ dropped (D7) |

## Notes per module
- **M1 Sensor core** — the foundation everything else samples from. Stdlib-only. Abstracts
  NVIDIA/AMD/Intel + hwmon behind one interface; **ship the NVIDIA + hwmon path first**.
- **M3 Crash-capture logger** — the highest-value piece for the seed use case. `fsync` per
  sample; GPU-lost detection via query timeout; bounded rotation; `systemd --user` service
  with a **user-selectable trigger mode** (always-on / game-launch / manual — D6).
  *Implemented (manual trigger):* JSONL log with fsync-per-sample, size-based rotation
  (`log_max_bytes`/`log_backups`), GPU-lost/recovered event markers, atomic status file, and
  `rigdoctor record run|start|stop|status|report`. The foreground `run` is the systemd-ready
  entrypoint; the service unit + always-on/game-launch triggers (D6/D12) land in Phase 4.
  Also fully driven from the GUI's Recording/Logs page (M10) via shared `core.reccontrol`.
- **M4 Health report** — turns scattered logs into a prioritized, plain-language findings
  list with **suggested** fixes (read-only, D9). Reuses M1 for a live snapshot. Also powers
  the **guided diagnostic session** (with M3): pick a game → focused capture → scan →
  findings (see SPEC §4).
- **M2 Live monitor** — depends on M1; the terminal "HWMonitor for Linux" face. Stdlib-only.
- **M5 / M6 Diagnostics** — inventory export + gaming-env checks; M6 flags risky settings and
  suggests the fix command but does not apply it (D9).
- **M8 Alerting** — threshold/event notifications; integrates with the tray applet (M11).
- **M10 Desktop GUI** — PySide6 graphical front-end over the core engine (dashboard, log
  browser, report viewer, logger controls). Optional; adds the Qt dependency. *Bootstrapped
  early (ahead of its Phase 4 slot) at the user's request:* dark-themed window with sidebar
  nav, a live dashboard (circular gauges + collapsible per-subsystem cards,
  temperature-colored values), and a **Recording/Logs page** with full M3 controls
  (start/stop/status + post-crash report). Health/Inventory remain placeholders until M4/M5.
  GUI-first per D17.
- **M11 Tray applet** — `QSystemTrayIcon` menu-bar applet. Dropdown shows live M1 readouts
  (CPU temp, GPU temp, memory used/total, status dot) and is led by a **Run Diagnostic**
  action (the guided diagnostic session), plus Open dashboard / Start-Stop recording /
  Snapshot / Quit (D13). Optional; shares the Qt dependency with M10.
- **M9 Installer** — interactive wizard layered on the `.deb` (D8); apt-first dependency
  resolution; enables the logger service and trigger mode.
- **M12 Session sharing / remote assist** (D16) — let a helper inspect a user's machine, in
  an escalating ladder: (1) **diagnostic bundle export** (inventory + recent log + report,
  one-way), (2) **live read-only view** over a user-chosen tunnel (Tailscale/cloudflared/SSH,
  no hosted relay), (3) **gated interactive terminal** wrapping tmate/sshx (read-only by
  default; read-write only on explicit consent — a deliberate exception to D9). Per-session
  consent, ephemeral revocable tokens, audit log.
- **M13 Auto-update** (D18) — *planned.* On launch, check the public Gitea releases API and
  **self-update a user-local install with no root** (download → verify checksum/signature →
  atomic symlink swap → restart, incl. the daemon). HTTPS-only, version-check-only (no
  telemetry), opt-out-able. Surfaced in the GUI; `rigdoctor update` in the CLI. (`.deb` users
  update via apt instead.)
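The M3 note above names three persistence techniques: fsync per sample, size-based rotation, and an atomic status file. The following is a minimal sketch of how they could fit together, not the project's actual writer. `JsonlWriter` and `write_status_atomic` are hypothetical names; only `log_max_bytes` and `log_backups` come from the catalog, and the default values and the `.1`, `.2` backup suffixes are assumptions.

```python
import json
import os
from pathlib import Path


class JsonlWriter:
    """Append-only JSONL log: one fsync per sample, size-based rotation."""

    def __init__(self, path, log_max_bytes=1_000_000, log_backups=3):
        self.path = Path(path)
        self.log_max_bytes = log_max_bytes
        self.log_backups = log_backups
        self._fh = open(self.path, "ab")
        self._size = os.fstat(self._fh.fileno()).st_size

    def write(self, sample):
        line = (json.dumps(sample, separators=(",", ":")) + "\n").encode("utf-8")
        # Rotate before writing so the active file stays under the cap.
        if self._size > 0 and self._size + len(line) > self.log_max_bytes:
            self._rotate()
        self._fh.write(line)
        self._fh.flush()              # Python buffer -> kernel
        os.fsync(self._fh.fileno())   # kernel -> storage
        self._size += len(line)

    def _backup(self, n):
        return self.path.with_name(f"{self.path.name}.{n}")

    def _rotate(self):
        self._fh.close()
        # Shift log.(n-1) -> log.n, newest last; the oldest backup is overwritten.
        for n in range(self.log_backups, 0, -1):
            src = self.path if n == 1 else self._backup(n - 1)
            if src.exists():
                os.replace(src, self._backup(n))
        self._fh = open(self.path, "wb")  # fresh (or truncated) active file
        self._size = 0

    def close(self):
        self._fh.close()


def write_status_atomic(path, status):
    """Write JSON to a temp file, fsync it, then rename over the target."""
    tmp = f"{path}.tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(status, fh)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp, path)  # readers see the old or the new file, never a partial one
```

Because each sample is flushed and synced before the next one is taken, a hard crash should lose at most the sample in flight. A reader would still need to tolerate a truncated final line.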
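The M3 note also mentions GPU-lost detection via query timeout and GPU-lost/recovered event markers. One way this could work is sketched below, assuming the sensor query is an `nvidia-smi` subprocess call. The function names, the marker strings `gpu_lost` and `gpu_recovered`, the two-second timeout, and treating a non-zero exit code as "lost" are all assumptions made for illustration.

```python
import subprocess

# Standard nvidia-smi query flags; the selected fields are illustrative.
NVIDIA_QUERY = (
    "nvidia-smi",
    "--query-gpu=temperature.gpu,utilization.gpu,power.draw",
    "--format=csv,noheader,nounits",
)


def query_gpu(cmd=NVIDIA_QUERY, timeout_s=2.0):
    """Run one sensor query; return ("ok", output) or ("lost", reason)."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # A hung driver typically shows up as a query that never returns.
        return "lost", "query timed out"
    if proc.returncode != 0:
        return "lost", proc.stderr.strip()
    return "ok", proc.stdout.strip()


def transition_event(prev_state, state):
    """Return the marker to log on a state change, or None if unchanged."""
    if prev_state == "ok" and state == "lost":
        return "gpu_lost"
    if prev_state == "lost" and state == "ok":
        return "gpu_recovered"
    return None
```

A recorder loop would call `query_gpu()` once per interval and log a marker only when `transition_event()` returns one. One caveat: on POSIX, `subprocess.run` kills the child and waits for it after a timeout, so a child stuck in uninterruptible sleep could still stall the loop. A real recorder may need a non-blocking approach.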
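The "verify checksum → atomic symlink swap" steps in the M13 note can be sketched with the standard library alone. This is an illustration, not the planned updater. It assumes a layout in which each release is unpacked into its own directory and a `current` symlink selects the active one, and that the expected SHA-256 arrives alongside the release. All function names are hypothetical; download, signature verification, and restart are omitted.

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def activate_release(release_dir, current_link):
    """Point `current_link` at `release_dir` in one atomic step (POSIX)."""
    current_link = Path(current_link)
    tmp_link = current_link.with_name(current_link.name + ".new")
    if tmp_link.is_symlink() or tmp_link.exists():
        tmp_link.unlink()  # leftover from an interrupted update
    os.symlink(release_dir, tmp_link)
    # Renaming the temp link over the old one replaces the link itself,
    # so there is no moment at which `current_link` is missing.
    os.replace(tmp_link, current_link)


def install_if_verified(artifact, expected_sha256, release_dir, current_link):
    """Swap to the new release only if the artifact checksum matches."""
    if sha256_of(artifact) != expected_sha256.lower():
        return False
    activate_release(release_dir, current_link)
    return True
```

Creating the new link under a temporary name and renaming it over the old one means an interrupted update leaves the previous release active. This also needs no root, provided the install lives in a user-writable directory.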
## Bundles (final — D14)
- **Essential:** M1 + M3 + M4 *(the MVP, NVIDIA-only — D5)*
- **Monitoring:** M2 + M8
- **Diagnostics:** M5 + M6
- **Desktop UI:** M10 + M11 *(adds PySide6)*
- **Sharing:** M12 *(session sharing / remote assist — D16)*

## MVP candidate — *confirmed (D5)*
**M1 + M3 + M4 (Essential), NVIDIA-only, CLI-first.** Gives a working tool that captures the
GPU crash and explains the logs — deliverable before the installer, GUI/tray, or multi-vendor
work.