RigDoctor — Module Catalog (DRAFT v0.2)

Status: not started · 🟦 designing · 🟨 in progress · done

Module set per D14, plus M12 (session sharing, D16) and M13 (auto-update, D18). M7 (stress/repro) was dropped (D7). M10/M11 are the GUI and tray modules (D10/D11). GPU scope reads "all (NVIDIA first)" — NVIDIA first, others via the vendor abstraction (D4).

| ID | Module | Bundle | Key deps | GPU scope | Priority | Status |
|----|--------|--------|----------|-----------|----------|--------|
| M1 | Sensor core | Essential | none (nvidia-smi, sysfs) | all (NVIDIA first) | P0 | |
| M3 | Crash-capture logger | Essential | none (opt: smartmontools) | all (NVIDIA first) | P0 | 🟨 |
| M4 | Health report (log scan) | Essential | none (opt: smartmontools) | all (NVIDIA first) | P0 | 🟨 |
| M2 | Live monitor (TUI) | Monitoring | none (stdlib curses) | all | P1 | |
| M8 | Alerting | Monitoring | libnotify (opt) | all | P2 | |
| M5 | System inventory | Diagnostics | none (opt: lm-sensors, dmidecode) | all | P1 | |
| M6 | Gaming env checks | Diagnostics | none | all | P2 | |
| M10 | Desktop GUI | Desktop UI | python3-pyside6 | all | P2 | 🟨 |
| M11 | Tray / menu-bar applet | Desktop UI | python3-pyside6 (+ AppIndicator on GNOME) | all | P2 | |
| M9 | Installer | (meta) | none | all | P1 | 🟨 |
| M12 | Session sharing / remote assist | Sharing | none (Tier 3: tmate/sshx) | all | P3 | |
| M13 | Auto-update | (core) | none (stdlib; user-local file swap) | all | P3 | 🟨 |
| M7 | Stress / repro | | | | | dropped (D7) |

Notes per module

  • M1 Sensor core — the foundation everything else samples from. Stdlib-only. Abstracts NVIDIA/AMD/Intel + hwmon behind one interface; ship the NVIDIA + hwmon path first.
  • M3 Crash-capture logger — the highest-value piece for the seed use case. fsync per sample; GPU-lost detection via query timeout; bounded rotation; systemd --user service with a user-selectable trigger mode (always-on / game-launch / manual — D6). Implemented (manual trigger): JSONL log with fsync-per-sample, size-based rotation (log_max_bytes/log_backups), GPU-lost/recovered event markers, atomic status file, and rigdoctor record run|start|stop|status|report. The foreground run is the systemd-ready entrypoint; the service unit + always-on/game-launch triggers (D6/D12) land in Phase 4. Also fully driven from the GUI's Recording/Logs page (M10) via shared core.reccontrol.
  • M4 Health report — turns scattered logs into a prioritized, plain-language findings list with suggested fixes (read-only, D9). Reuses M1 for a live snapshot. Also powers the guided diagnostic session (with M3): pick a game → focused capture → scan → findings (see SPEC §4). Implemented: journalctl scan (Xid/panic/OOM/MCE/AER/thermal/amdgpu), SMART, NVIDIA driver-mismatch, journald-persistence + live-temp checks; rigdoctor report (text/JSON) + GUI Health tab. GPU-firmware verification deferred.
  • M2 Live monitor — depends on M1; the terminal "HWMonitor for Linux" face. Stdlib-only.
  • M5 / M6 Diagnostics — inventory export + gaming-env checks; M6 flags risky settings and suggests the fix command but does not apply it (D9).
  • M8 Alerting — threshold/event notifications; integrates with the tray applet (M11).
  • M10 Desktop GUI: PySide6 graphical front-end over the core engine (dashboard, log browser, report viewer, logger controls). Optional; adds the Qt dependency. Bootstrapped early (ahead of its Phase 4 slot) at the user's request: dark-themed window with sidebar nav, a live dashboard (circular gauges + collapsible per-subsystem cards, temperature-colored values), and a Recording/Logs page with full M3 controls (start/stop/status + post-crash report). Health/Inventory remain placeholders until M4/M5. GUI-first per D17.
  • M11 Tray applet: QSystemTrayIcon menu-bar applet. Dropdown shows live M1 readouts (CPU temp, GPU temp, memory used/total, status dot) and is led by a Run Diagnostic action (the guided diagnostic session), plus Open dashboard / Start-Stop recording / Snapshot / Quit (D13). Optional; shares the Qt dependency with M10.
  • M9 Installer: interactive wizard layered on the .deb (D8); apt-first dependency resolution; enables the logger service and trigger mode. Implemented (first cut): distro/package-manager/GPU detection (core/sysenv), an optional-component catalog (core/catalog), and dependency install via pkexec/sudo, exposed as rigdoctor install [--check] [-y] and the GUI Setup tab. The user-local app install is install.sh (private venv + ~/.local/bin launchers + desktop entry, no root; handles the python3-venv prerequisite) plus a self-extracting .run (makeself, built by CI). Pending: config/module selection + systemd --user service enable.
  • M12 Session sharing / remote assist (D16) — let a helper inspect a user's machine, in an escalating ladder: (1) diagnostic bundle export (inventory + recent log + report, one-way), (2) live read-only view over a user-chosen tunnel (Tailscale/cloudflared/SSH, no hosted relay), (3) gated interactive terminal wrapping tmate/sshx (read-only by default; read-write only on explicit consent — a deliberate exception to D9). Per-session consent, ephemeral revocable tokens, audit log.
  • M13 Auto-update (D18) — check + auth implemented: updates are gated to Gitea account holders via a Personal Access Token, stored encrypted in the OS keyring (secret-tool) with a 0600-file fallback (config.load_token/save_token/token_backend). core/updates queries the releases API with the token; CLI login/logout/update; GUI Setup "Update access" panel + sidebar states. The no-root self-update apply is implemented: rigdoctor update runs an authenticated pip install --upgrade "rigdoctor[gui] @ git+https://oauth2:<token>@…@<tag>" into the user-local venv (GUI "Update to v…" button + restart prompt; token scrubbed). Installed via the user-local install.sh / self-extracting .run (M9). Original plan: On launch, check the public Gitea releases API and self-update a user-local install with no root (download → verify checksum/signature → atomic symlink swap → restart, incl. the daemon). HTTPS-only, version-check-only (no telemetry), opt-out-able. Surfaced in the GUI; rigdoctor update in the CLI. (.deb users update via apt instead.)
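
The M3 note above describes a JSONL log with fsync-per-sample, size-based rotation, and an atomic status file. The sketch below illustrates one way those three pieces could fit together using only the standard library. It is not RigDoctor's actual code: the function names `rotate`, `append_sample`, and `write_status` are hypothetical, and only the parameter names `log_max_bytes` and `log_backups` come from the note. The backup naming scheme (`name.1`, `name.2`, ...) is also an assumption.

```python
import json
import os
import tempfile
from pathlib import Path


def rotate(path: Path, log_backups: int) -> None:
    """Shift path -> path.1 -> path.2 ..., overwriting the oldest backup."""
    if log_backups <= 0:
        path.unlink(missing_ok=True)
        return
    for i in range(log_backups - 1, 0, -1):
        src = path.with_name(f"{path.name}.{i}")
        if src.exists():
            # Path.replace overwrites an existing target.
            src.replace(path.with_name(f"{path.name}.{i + 1}"))
    if path.exists():
        path.replace(path.with_name(f"{path.name}.1"))


def append_sample(path: Path, sample: dict,
                  log_max_bytes: int, log_backups: int) -> None:
    """Append one JSON line and fsync it so it survives a hard crash."""
    data = (json.dumps(sample, separators=(",", ":")) + "\n").encode("utf-8")
    # Rotate before writing if this sample would push the file past the cap.
    if path.exists() and path.stat().st_size + len(data) > log_max_bytes:
        rotate(path, log_backups)
    with open(path, "ab") as fh:
        fh.write(data)
        fh.flush()             # Python buffer -> kernel
        os.fsync(fh.fileno())  # kernel -> disk


def write_status(path: Path, status: dict) -> None:
    """Atomic status file: write a temp file in the same dir, then rename."""
    fd, tmp = tempfile.mkstemp(dir=str(path.parent), prefix=path.name + ".")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(status, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # readers see the old or the new file, never half
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

The fsync on every sample trades throughput for durability, which matches the stated goal: if the machine hard-locks, the last sample written before the crash is already on disk. Writing the status file through a same-directory temp file keeps the final rename on one filesystem, which is what makes the swap atomic.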

Bundles (final — D14)

  • Essential: M1 + M3 + M4 (the MVP, NVIDIA-only — D5)
  • Monitoring: M2 + M8
  • Diagnostics: M5 + M6
  • Desktop UI: M10 + M11 (adds PySide6)
  • Sharing: M12 (session sharing / remote assist — D16)

MVP candidate — confirmed (D5)

M1 + M3 + M4 (Essential), NVIDIA-only, CLI-first. Gives a working tool that captures the GPU crash and explains the logs — deliverable before the installer, GUI/tray, or multi-vendor work.
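
Capturing the GPU crash rests on M3's "GPU-lost detection via query timeout" and its GPU-lost/recovered event markers. A minimal sketch of that idea follows, under assumptions: the names `query_gpu` and `GpuLostTracker`, the event strings, and the 2-second default are hypothetical, and the exact `nvidia-smi` query RigDoctor issues is not stated in this document. The flags shown are standard `nvidia-smi` options used purely for illustration.

```python
import subprocess
from typing import Optional, Sequence

# Illustrative query only; the fields RigDoctor actually samples may differ.
NVIDIA_QUERY = [
    "nvidia-smi",
    "--query-gpu=temperature.gpu",
    "--format=csv,noheader,nounits",
]


def query_gpu(cmd: Sequence[str] = NVIDIA_QUERY,
              timeout_s: float = 2.0) -> Optional[str]:
    """Return the command's stdout, or None if it hangs, fails, or is missing."""
    try:
        proc = subprocess.run(
            list(cmd), capture_output=True, text=True,
            timeout=timeout_s, check=False,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hung driver typically shows up as a query that never returns.
        return None
    if proc.returncode != 0:
        return None
    return proc.stdout.strip()


class GpuLostTracker:
    """Turn a stream of readings into lost/recovered transition events."""

    def __init__(self) -> None:
        self.lost = False

    def update(self, reading: Optional[str]) -> Optional[str]:
        if reading is None and not self.lost:
            self.lost = True
            return "gpu_lost"
        if reading is not None and self.lost:
            self.lost = False
            return "gpu_recovered"
        return None  # no state change, so no event marker
```

Emitting an event only on transitions keeps the log compact: a GPU that stays unreachable for minutes produces one `gpu_lost` marker rather than one per sample. This sketch treats a missing binary the same as a timeout; a real implementation would probably want to distinguish the two.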