From 2342dd83aa2972ad972bdb03680cd55345ce9226 Mon Sep 17 00:00:00 2001
From: Jessey van Offeren
Date: Fri, 22 May 2026 15:31:36 +0200
Subject: [PATCH] docs: rewrite README to be user-first (install + use)

Lead with what RigDoctor does, then install (.deb/apt incl. the
private-registry auth.conf.d + trusted=yes notes, and the .run), then usage
(GUI/tray/CLI), requirements, and privacy. Move the dev content (from-source,
tests, docs links) into a short Development section at the end. Drops the
stale status/decisions/
repo-layout planning sections from the top.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 README.md | 211 ++++++++++++++++++++++++------------------------------
 1 file changed, 94 insertions(+), 117 deletions(-)

diff --git a/README.md b/README.md
index d59663d..34f2cb0 100644
--- a/README.md
+++ b/README.md
@@ -1,152 +1,129 @@
 # RigDoctor

-A **modular diagnostics, monitoring, and health-check toolkit for Linux gamers.**
+**Hardware monitoring & crash diagnostics for Linux gamers.** Live sensors, crash-safe
+logging, plain-language health reports, per-game diagnostics, and optional AI explanations —
+in a desktop app, a tray applet, or the terminal. Ubuntu/Debian + NVIDIA first.

-> **Status:** 🟢 Phase 1 (MVP) complete. The **sensor core (M1)**, **crash-capture logger
-> (M3)**, and **health report (M4)** all work — live `snapshot`/`monitor`, crash-safe `record`
-> with a post-crash report, and `report` to scan logs/SMART/driver for likely causes. A
-> desktop GUI (M10) ties them together (dashboard, recording, health). See `docs/ROADMAP.md`.
+Linux gaming faults are hard to pin down — GPUs falling off the PCIe bus, black screens
+mid-game, silent thermal/VRAM throttling, driver/Proton mismatches. The useful data is
+scattered across `nvidia-smi`, `/sys`, `journalctl`, and SMART, and the readings right before a
+freeze are usually lost. RigDoctor pulls it together and keeps the evidence.

-## Why this exists
+## Features

-Linux gaming hardware faults are hard to diagnose: GPUs falling off the PCIe bus, the screen
-suddenly going black mid-game, silent thermal/VRAM throttling, power transients,
-driver/library mismatches, Proton quirks, and CPU governor / power-profile misconfiguration.
-The data needed to diagnose them is scattered across `nvidia-smi`, `/sys/class/hwmon`,
-`journalctl`, SMART, and more — and the most useful readings (the ones right before a hard
-freeze) are usually lost because nothing flushed them to disk.
+- **Live monitoring** — a dark desktop **dashboard** (history graphs + per-subsystem cards), a
+  **tray applet** with at-a-glance status, and a terminal view (`rigdoctor monitor`).
+- **Crash-safe recording** — background logger that `fsync`s every sample, so the state right
+  before a hard freeze survives. Manual, always-on, or auto-start when a game launches.
+- **Health report** — scans `journalctl`/SMART/driver for likely causes (Xid, OOM, disk
+  errors, throttling…) and explains them with suggested fixes.
+- **Per-game diagnostics** — pick a game, capture while you play, get a focused report; hard
+  crashes are detected and analysed on next launch.
+- **Gaming tune-ups** — flags risky settings (CPU governor, PCIe ASPM, persistence mode…) with
+  **one-click, reversible fixes**.
+- **Proactive alerts** — desktop notifications on overheating and critical kernel events
+  (GPU-lost, Xid, out-of-memory, disk I/O).
+- **AI explanations** *(optional, opt-in)* — explain a diagnostic in plain language with a
+  **local model (Ollama)** or **Claude**. Never automatic; only when you press the button.
+- **Shareable reports** — zip a diagnostic (logs, inventory, AI transcript) to hand to someone,
+  or share a live **terminal session** for remote help.
+- **Self-updating** — `apt upgrade`, or the in-app updater.

-RigDoctor pulls all of that into one modular tool: live monitoring, crash-safe logging, a
-one-shot health report, and an interactive installer that only sets up the modules a given
-user actually needs for their hardware.
+## Install

-**Seed use cases:** an RTX 3070 that intermittently "falls off the bus" under heavy GPU load
-(Path of Exile on Linux, Escape from Tarkov on Windows), and a monitor going black mid-game.
-See `docs/SPEC.md` §1.
+### Debian / Ubuntu — `.deb`

-## How you run it
-
-RigDoctor is **GUI-first** — the desktop app is the primary way in — but every feature is
-also available headless:
-- **Desktop GUI** — graphical dashboard, recording controls, log browser, reports. The
-  default interface for most users.
-- **Tray applet** — a small top-menu-bar applet with quick actions and at-a-glance status.
-- **CLI** — full functionality from the terminal; works over SSH and in scripts.
-
-The GUI/tray are optional modules; a headless (CLI-only) install loses no capability.
-
-## Key decisions (settled)
-
-| Topic | Decision |
-|-------|----------|
-| Name | **RigDoctor** |
-| Language / stack | **Python 3 + Qt (PySide6)** — core/CLI/daemon stdlib-only; Qt only for GUI/tray |
-| Primary distro | **Ubuntu** (Debian via apt); others best-effort later |
-| Primary GPU | **NVIDIA** first; AMD, then Intel later |
-| MVP | **Sensor core + crash logger + health report** (NVIDIA-only, CLI-first) |
-| Distribution | **User-local install** (self-updating from the public repo, no root); **`.deb`** optional |
-| Scope of action | **Read-only + suggestions** (no auto-apply yet) |
-| Stress tests | **Out of scope** |
-
-Full rationale and the still-open questions are in `docs/DECISIONS.md`.
-
-## Repo layout
-
-| Path | Purpose |
-|------|---------|
-| `docs/SPEC.md` | Product specification — vision, requirements, modules (the main planning doc) |
-| `docs/ARCHITECTURE.md` | Technical design — core engine, front-ends, daemon, installer |
-| `docs/MODULES.md` | Catalog of modules with scope, dependencies, status |
-| `docs/ROADMAP.md` | Phased milestones |
-| `docs/DECISIONS.md` | Decision log + remaining open questions |
-| `src/rigdoctor/` | Source code — `core/` engine + sources, `cli.py`, `render.py` |
-| `installer/` | Installer / `.deb` packaging (empty until Phase 4) |
-| `tests/` | Tests (stdlib `unittest`) |
-
-## Install (user-local, no root)
-
-RigDoctor installs into a private venv under `~/.local` — no root, self-updating:
+The simplest path: grab the latest **`rigdoctor__all.deb`** from the
+[releases page](https://git.jesseyvanofferen.com/jessey/rigdoctor/releases) and install it —
+apt pulls the GUI dependencies (PySide6, pyte) automatically:

 ```bash
-./install.sh  # from a source checkout or the self-extracting .run
-./install.sh --ref v0.0.6  # install a specific released tag (needs a token)
-./install.sh --uninstall  # remove it
+sudo apt install ./rigdoctor_*_all.deb  # CLI only: add --no-install-recommends
 ```

-This adds `rigdoctor` / `rigdoctor-gui` to `~/.local/bin` and a desktop entry. Each release
-also ships a one-file **`.run`** installer (download, `chmod +x`, run). Updates are gated to
-accounts on the Git server (a Personal Access Token); save one via the GUI **Setup → Update
-access** panel or `rigdoctor login`, then `rigdoctor update` (or the sidebar button).
-
-## Install (`.deb`, system-wide)
-
-Each release also ships a **`.deb`** (`Architecture: all`, M9/D8). Download it from the release
-and install with apt (pulls the GUI deps — PySide6/pyte — via Recommends):
+**Or add the apt repository** for `apt install` + automatic updates:

 ```bash
-sudo apt install ./rigdoctor__all.deb  # CLI-only: add --no-install-recommends
-```
+# the registry is private, so give apt a token (a Gitea PAT with read:package)
+echo "machine git.jesseyvanofferen.com login password " \
+  | sudo tee /etc/apt/auth.conf.d/rigdoctor.conf
+sudo chmod 600 /etc/apt/auth.conf.d/rigdoctor.conf

-When the apt registry is enabled on the server, you can instead add it as a source and
-`sudo apt update && sudo apt install rigdoctor` (with `apt upgrade` for updates):
-
-```bash
-curl -fsSL https://git.jesseyvanofferen.com/api/packages/jessey/debian/repository.key \
-  | sudo tee /etc/apt/keyrings/gitea-rigdoctor.asc > /dev/null
-echo "deb [signed-by=/etc/apt/keyrings/gitea-rigdoctor.asc] \
-  https://git.jesseyvanofferen.com/api/packages/jessey/debian stable main" \
+echo "deb [trusted=yes] https://git.jesseyvanofferen.com/api/packages/jessey/debian stable main" \
   | sudo tee /etc/apt/sources.list.d/rigdoctor.list
+
+sudo apt update && sudo apt install rigdoctor
 ```

-## Run it (dev)
+Then `sudo apt upgrade` keeps it current. *(If your server serves a signed registry, drop the
+`auth.conf.d` file and replace `[trusted=yes]` with `[signed-by=…]` + the `repository.key`.)*

-Stdlib-only, no install needed (target is Python ≥ 3.11; tested on 3.14):
+### Any distro — self-extracting `.run` (no root)
+
+Download **`rigdoctor--installer.run`** from the releases page and run it. It installs
+into a private virtualenv under `~/.local` (no root), adds the launchers + desktop entry, and
+opens the first-run setup wizard:

 ```bash
-PYTHONPATH=src python3 -m rigdoctor snapshot  # one-shot sensor read
-PYTHONPATH=src python3 -m rigdoctor snapshot --json
-PYTHONPATH=src python3 -m rigdoctor monitor -n 1  # live view (Ctrl-C to quit)
-PYTHONPATH=src python3 -m rigdoctor sources  # list detected sensor sources
-PYTHONPATH=src python3 -m unittest discover -s tests
+sh rigdoctor-*-installer.run
 ```

-### Crash-capture logger (M3)
+### Updating & removing

-A crash-safe background logger (JSONL, `fsync` per sample, bounded by rotation) for catching
-the state right before a freeze:
+- **`.deb`:** `sudo apt upgrade` (or reinstall a newer `.deb`).
+- **`.run` / user-local:** the in-app **Update** button, or `rigdoctor update`.
+- **Remove:** `sudo apt remove rigdoctor`, or `rigdoctor uninstall` for the user-local install.
+
+## Using it
+
+Launch **RigDoctor** from your app menu, or:

 ```bash
-rigdoctor record start  # start logging in the background
-rigdoctor record status  # is it running? latest readings, sample count
-rigdoctor record stop  # stop it
-rigdoctor record report  # post-crash summary: peaks, events, last samples
-rigdoctor record run  # run in the foreground (the systemd-ready entrypoint)
+rigdoctor-gui  # desktop app (+ tray)
+rigdoctor --help  # everything from the terminal (works over SSH)
 ```

-Logs live in `~/.local/share/rigdoctor/logs/`. It detects GPU "lost"/hang (nvidia-smi query
-timeout) and writes an event marker. Trigger modes (always-on / game-launch) and the
-`systemd --user` service arrive in Phase 4.
-
-### Desktop GUI (M10)
-
-The GUI uses PySide6 (Qt) — the only part of RigDoctor that needs a non-stdlib dep:
+Handy CLI commands:

 ```bash
-pip install -e '.[gui]'  # core + PySide6, gives `rigdoctor` and `rigdoctor-gui`
-rigdoctor gui  # or: rigdoctor-gui
+rigdoctor snapshot  # one-shot reading of every sensor
+rigdoctor monitor  # live terminal dashboard
+rigdoctor report  # health report (logs / SMART / driver)
+rigdoctor diagnose start|finish  # capture while gaming, then analyse
+rigdoctor gameenv  # flag risky gaming settings + fixes
+rigdoctor inventory  # hardware/OS inventory
+rigdoctor ai explain  # AI explanation of the current findings (opt-in)
+rigdoctor bundle  # zip the latest diagnostic into a shareable report
 ```

-It opens a dark-themed window with sidebar navigation and a **live dashboard** over the
-same sensor core — circular gauges for the headline metrics plus collapsible per-subsystem
-cards (GPU/CPU/memory/storage) with temperature-colored values (icey-blue → green → red).
-The **Logs** and **Health** sections are full pages (recording controls + post-crash report;
-and the kernel-log / SMART / driver scan). **Inventory** is a placeholder until M5 lands.
+## Requirements

-Without the GUI extra, `pip install -e .` gives just the stdlib-only CLI.
+- **Linux** — Ubuntu/Debian first-class (the `.deb`); the `.run` works on any distro with
+  Python ≥ 3.11.
+- **GPU** — NVIDIA fully supported (via `nvidia-smi`); AMD/Intel sensors are best-effort.
+- **CLI/daemon** need only Python 3 (stdlib). The **GUI/tray** add **PySide6** (`python3-pyside6`).
+- Optional tools unlock more: `smartmontools`, `lm-sensors`, `gamemode`, `mangohud`. The setup
+  wizard offers to install them.

-## Start here
+## Privacy

-1. Read `docs/SPEC.md` for what we're building.
-2. Read `docs/ROADMAP.md` for the build order (Phase 1 = the MVP).
-3. Read `docs/DECISIONS.md` for the settled decisions (D1–D15).
-
+Everything stays on your machine — no telemetry, no phone-home. The AI assistant is **off by
+default** and runs only when you explicitly trigger it; with Ollama nothing leaves the machine,
+and the Claude option asks before sending. Reports are local files; they leave only if you share
+the zip.
+
+## Development
+
+RigDoctor's core is stdlib-only Python; the GUI/tray use PySide6.
+
+```bash
+git clone https://git.jesseyvanofferen.com/jessey/rigdoctor && cd rigdoctor
+pip install -e ".[gui]"  # core + GUI; omit [gui] for CLI-only
+python -m unittest discover -s tests  # run the test suite
+PYTHONPATH=src python3 -m rigdoctor snapshot  # run without installing
+```
+
+Design docs live in `docs/` — `SPEC.md` (vision/requirements), `ARCHITECTURE.md`,
+`MODULES.md` (module catalog), `ROADMAP.md`, and `DECISIONS.md` (the decision log).
+Contributions: branch off `main`, keep tests green (CI runs them on PRs), and bump the version
++ `CHANGELOG.md` for shipped changes.