# RigDoctor — Architecture (DRAFT v0.2)

> Tech stack and key structural decisions are now settled (see `DECISIONS.md` D2, D6, D8,
> D10, D11). Items still marked **[OPEN]** are tracked there.

## 1. Principles

- **Modular core + plugins.** A small engine; every capability is a module that can be
  installed/omitted independently.
- **Capability detection over assumption.** Probe what hardware/tools exist; degrade gracefully.
- **Vendor & distro abstraction.** GPU and package-manager differences live behind interfaces,
  not scattered through the code (NVIDIA + apt are the first concrete impls).
- **One engine, many front-ends.** CLI, TUI, GUI, and tray are all thin front-ends over the same
  core engine. Anything the GUI/tray can do is reachable headless from the CLI.

## 2. Tech stack — *DECIDED (D2)*

- **Language:** Python 3 (target machine has Python 3.14).
- **Core / CLI / daemon:** **stdlib only** — no `pip` deps. Easy log/JSON/subprocess handling,
  tiny footprint, runs headless/over SSH.
- **TUI (M2):** stdlib `curses` / plain ANSI redraw (no deps).
- **GUI (M10) + tray (M11):** **Qt via PySide6** — one toolkit for both the desktop window and
  the `QSystemTrayIcon` menu-bar applet. PySide6 is a dependency of *only* these two modules,
  declared in the `.deb`; the core/daemon never import Qt.
- **Installer bootstrap (M9):** the `.deb`'s maintainer scripts ensure Python is present, then
  hand off to the Python installer for module selection.

## 3. Component layout

```
                      +--------------------------+
                      |       core engine        |  (stdlib only)
                      | sources → sampler → bus  |
                      +--------------------------+
                         ^      ^      ^      ^
     +-------------------+      |      |      +--------------------+
     |              +-----------+      +-----------+               |
+---------+    +----------+                  +-----------+  +--------------+
|   CLI   |    |  daemon  |                  |    GUI    |  | tray applet  |
| (stdlib)|    | (M3,     |                  | (M10,Qt)  |  | (M11, Qt)    |
| TUI(M2) |    | systemd) |                  |           |  |              |
+---------+    +----------+                  +-----------+  +--------------+
```

- The **core engine** is a stdlib-only library: sources → sampler loop → an internal bus that
  fans samples out to sinks (TUI renderer, CSV/JSON logger, alert engine, report builder).
- The **daemon** (M3) is a long-running, stdlib-only process managed by `systemd --user`.
- The **GUI** and **tray** import PySide6 and talk to the same engine; for live status they can
  read the daemon's output / a small status file or socket rather than re-sampling.

## 4. Core engine

```
+-------------------+      +------------------+      +-------------------+
| Sources (probe)   | ---> | Sampler loop     | ---> | Sinks             |
| nvidia-smi/NVML   |      | (interval, Hz)   |      | - TUI renderer    |
| amdgpu sysfs      |      | normalizes into  |      | - CSV/JSON logger |
| hwmon/lm-sensors  |      | Sample records   |      | - Alert engine    |
| journalctl/SMART  |      |                  |      | - Report builder  |
+-------------------+      +------------------+      | - GUI/tray feed   |
                                                     +-------------------+
```

- **Sample record:** `{ ts, source, metric, value, unit }` flattened per tick into a row.
- **Sources** are pluggable; each declares which metrics it can provide and self-checks
  availability at startup. NVIDIA (`nvidia-smi`/NVML) + hwmon are the first implementations.

## 5. Module contract

Each module declares a manifest so the installer and engine can reason about it:

```
module:
  id: crash-logger
  name: "Crash-capture logger"
  provides: [logging]
  requires_sources: [gpu, cpu_temp]   # capabilities, not packages
  system_packages:                    # per package manager, optional
    apt: []                           # uses nvidia-smi + sysfs only
    pacman: []
    dnf: []
  python_deps: []                     # e.g. GUI/tray modules → [pyside6]
  optional_packages:
    apt: [smartmontools]              # enriches if present
  gpu_vendors: [nvidia, amd, intel]
  default_in_bundles: [essential]
```

Lifecycle hooks a module may implement: `probe()`, `collect(sample)`, `render(view)`,
`report()`, `install_hint()`. GUI/tray modules additionally declare `python_deps: [pyside6]`.

## 6. Crash-logger daemon & trigger model — *DECIDED (D6)*

The logger (M3) runs as a `systemd --user` service. Three user-selectable trigger modes:

1. **Always-on** — service enabled at login, samples continuously (bounded by rotation).
2. **Game-launch-triggered** — starts when a game/Steam session begins, stops after.
   Detection is layered (D12), no root: a precise **wrapper** (`rigdoctor wrap %command%` +
   global Steam compat-tool) as primary; a zero-config **watcher** (Steam `RunningAppID` +
   `/proc` heuristic) as fallback; **GameMode** D-Bus signals if `gamemoded` is present.
3. **Manual** — started/stopped via the CLI (`rigdoctor record start/stop`) or the tray
   applet's quick action.

The selected mode is written to config by the installer and changeable later via CLI/GUI.

## 7. GUI & tray — *DECIDED (D10/D11)*

- **GUI (M10):** a PySide6 desktop app — live dashboard (graphs/gauges), crash-log browser,
  health-report viewer, inventory view, logger controls. Works under X11 and Wayland.
- **Tray (M11):** `QSystemTrayIcon` applet in the top menu bar (StatusNotifierItem; on
  Ubuntu/GNOME surfaced via the AppIndicator extension). Dropdown shows live M1 readouts
  (CPU temp, GPU temp, memory used/total, status dot) and actions led by **Run Diagnostic**
  (the guided diagnostic session, §7.1), plus Open dashboard / Start-Stop recording /
  Snapshot / Quit (D13).
- Both are **optional** — a headless/server install omits them and loses no diagnostic
  capability (everything is in the CLI).

### 7.1 Guided diagnostic session (orchestration)

The "Run Diagnostic" flow (exposed in tray, GUI, and CLI) is not a new module — it
orchestrates existing ones: **pick a game** (D12 detection: Steam library / recently played /
running process) → **focused capture** (M3 scoped to that game's session via the D12
wrapper/watcher) → **scan & analyze** (M4 over the captured window + system logs) →
**present prioritized findings** with suggested fixes (read-only, D9). The engine exposes it
as a single callable so all three front-ends share one implementation.

## 8. Installer design (M9)

1. **Detect** GPU vendor via `lspci` (NVIDIA first) and the package manager (apt first).
2. **Present** a module menu grouped into bundles:
   - *Essential* (sensor core + crash logger + health report) — the MVP, NVIDIA-only.
   - *Monitoring* (live TUI + alerts)
   - *Diagnostics* (inventory + gaming-env checks + SMART)
   - *Desktop UI* (GUI + tray applet — adds the PySide6 dependency)
   - *Custom* (pick individual modules)

   For each selection, show the exact packages that will be installed.
3. **Resolve** dependencies: union of selected modules' `system_packages` + `python_deps` for
   the detected package manager; report-only if a package is missing and sudo unavailable.
4. **Install** (with explicit confirmation), **write config** (`~/.config/rigdoctor/`),
   optionally **enable** the `systemd --user` logger service and choose its trigger mode (D6).
5. **Verify** each installed module's `probe()` and print a readiness summary.

Module list/bundling is final (D14). Packaging: a **user-local install is primary**
(self-updating from the public repo, no root — D8/D18), with an **optional `.deb`** system
package; the wizard layers module selection on top of either.

## 9. GPU vendor abstraction

| Capability | NVIDIA (first) | AMD (later) | Intel (later) |
|------------|----------------|-------------|---------------|
| Temps/clocks/power | `nvidia-smi`/NVML | `/sys/class/drm/.../device` + `rocm-smi` | `/sys` + `intel_gpu_top` |
| VRAM temp | mem-junction (often N/A on GeForce) | sysfs `mem` hwmon | n/a |
| Crash signature | Xid in dmesg | `amdgpu: GPU reset` / ring timeouts | i915 GPU hang |
| Power limit (read-only, D9) | `nvidia-smi -pl` (suggested, not applied) | sysfs `power_dpm` / `pp_*` | n/a |

## 10. Data & config layout

```
~/.config/rigdoctor/config.toml     # enabled modules, thresholds, interval, trigger mode
~/.local/share/rigdoctor/logs/      # rotated crash logs (CSV/JSON)
~/.local/state/rigdoctor/           # session/min-max state, daemon status feed
```

## 11. Dependency package names — apt-only (D15)

We maintain package names for **Ubuntu/apt only**; no cross-distro mapping is built or
maintained. The set is small (filled in per module as they land):

| Logical dep | apt package |
|-------------|-------------|
| SMART | `smartmontools` |
| lm-sensors | `lm-sensors` |
| DMI/inventory | `dmidecode` |
| GUI/tray (Qt) | `python3-pyside6` |
| Tray on GNOME | `gir1.2-appindicator3-0.1` (AppIndicator) |
| Desktop notifications | `libnotify-bin` |

Module manifests still declare deps under a `system_packages.apt` / `python_deps` key, so a
thin seam remains if another package manager is ever added — but multi-distro support is
**not a planned deliverable** (D15).
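
To make the manifest seam concrete, here is a minimal stdlib-only sketch of how the installer might take the union of `system_packages` and `python_deps` across the selected modules. It is an illustration under assumptions, not the project's implementation: the function name `resolve_deps`, the `EXAMPLE_MANIFESTS` dict, and the module ids `smart-check` and `gui` are hypothetical; only `crash-logger` and the manifest keys come from the manifest example in §5, and the apt package names come from the table above.

```python
from typing import Dict, Iterable, List

# Hypothetical manifests, already parsed into dicts with the keys from §5.
# Only "crash-logger" mirrors the document; the other two ids are made up.
EXAMPLE_MANIFESTS: Dict[str, dict] = {
    "crash-logger": {
        "system_packages": {"apt": []},
        "optional_packages": {"apt": ["smartmontools"]},
        "python_deps": [],
    },
    "smart-check": {
        "system_packages": {"apt": ["smartmontools"]},
        "python_deps": [],
    },
    "gui": {
        "system_packages": {"apt": ["python3-pyside6"]},
        "python_deps": ["pyside6"],
    },
}


def resolve_deps(
    selected: Iterable[str],
    manifests: Dict[str, dict],
    package_manager: str = "apt",
) -> Dict[str, List[str]]:
    """Union the declared dependencies of the selected modules.

    Returns sorted lists so the installer can print a stable,
    reviewable plan before asking for confirmation.
    """
    required, optional, python_deps = set(), set(), set()
    for module_id in selected:
        manifest = manifests[module_id]
        required.update(manifest.get("system_packages", {}).get(package_manager, []))
        optional.update(manifest.get("optional_packages", {}).get(package_manager, []))
        python_deps.update(manifest.get("python_deps", []))
    # A package that some module requires is no longer merely optional.
    optional -= required
    return {
        "required": sorted(required),
        "optional": sorted(optional),
        "python_deps": sorted(python_deps),
    }


if __name__ == "__main__":
    plan = resolve_deps(["crash-logger", "smart-check", "gui"], EXAMPLE_MANIFESTS)
    for kind, names in plan.items():
        print(f"{kind}: {', '.join(names) or '(none)'}")
```

Keying the lookup on `package_manager` keeps the apt-only decision (D15) in one argument: manifests that list nothing for another package manager simply resolve to an empty set, with no mapping table to maintain.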