docs: rewrite README to be user-first (install + use) #32

Merged
jessey merged 1 commit from docs/readme-users into main 2026-05-22 13:32:41 +00:00
Showing only changes of commit 2342dd83aa
+94 -117
@@ -1,152 +1,129 @@
# RigDoctor
A **modular diagnostics, monitoring, and health-check toolkit for Linux gamers.**
**Hardware monitoring & crash diagnostics for Linux gamers.** Live sensors, crash-safe
logging, plain-language health reports, per-game diagnostics, and optional AI explanations —
in a desktop app, a tray applet, or the terminal. Ubuntu/Debian + NVIDIA first.
> **Status:** 🟢 Phase 1 (MVP) complete. The **sensor core (M1)**, **crash-capture logger
> (M3)**, and **health report (M4)** all work — live `snapshot`/`monitor`, crash-safe `record`
> with a post-crash report, and `report` to scan logs/SMART/driver for likely causes. A
> desktop GUI (M10) ties them together (dashboard, recording, health). See `docs/ROADMAP.md`.
Linux gaming faults are hard to pin down — GPUs falling off the PCIe bus, black screens
mid-game, silent thermal/VRAM throttling, driver/Proton mismatches. The useful data is
scattered across `nvidia-smi`, `/sys`, `journalctl`, and SMART, and the readings right before a
freeze are usually lost. RigDoctor pulls it together and keeps the evidence.
## Why this exists
## Features
Linux gaming hardware faults are hard to diagnose: GPUs falling off the PCIe bus, the screen
suddenly going black mid-game, silent thermal/VRAM throttling, power transients,
driver/library mismatches, Proton quirks, and CPU governor / power-profile misconfiguration.
The data needed to diagnose them is scattered across `nvidia-smi`, `/sys/class/hwmon`,
`journalctl`, SMART, and more — and the most useful readings (the ones right before a hard
freeze) are usually lost because nothing flushed them to disk.
- **Live monitoring** — a dark desktop **dashboard** (history graphs + per-subsystem cards), a
**tray applet** with at-a-glance status, and a terminal view (`rigdoctor monitor`).
- **Crash-safe recording** — background logger that `fsync`s every sample, so the state right
before a hard freeze survives. Manual, always-on, or auto-start when a game launches.
- **Health report** — scans `journalctl`/SMART/driver for likely causes (Xid, OOM, disk
errors, throttling…) and explains them with suggested fixes.
- **Per-game diagnostics** — pick a game, capture while you play, get a focused report; hard
crashes are detected and analysed on next launch.
- **Gaming tune-ups** — flags risky settings (CPU governor, PCIe ASPM, persistence mode…) with
**one-click, reversible fixes**.
- **Proactive alerts** — desktop notifications on overheating and critical kernel events
(GPU-lost, Xid, out-of-memory, disk I/O).
- **AI explanations** *(optional, opt-in)* — explain a diagnostic in plain language with a
**local model (Ollama)** or **Claude**. Never automatic; only when you press the button.
- **Shareable reports** — zip a diagnostic (logs, inventory, AI transcript) to hand to someone,
or share a live **terminal session** for remote help.
- **Self-updating** — `apt upgrade`, or the in-app updater.
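
As an illustration of the CPU-governor check mentioned under **Gaming tune-ups**, here is a minimal sketch that reads the standard Linux cpufreq sysfs files. The function names, the `sysfs_root` parameter, and the "preferred" governor set are assumptions made for this example; RigDoctor's own check may be implemented differently.

```python
from pathlib import Path


def read_governors(sysfs_root="/sys"):
    """Map each CPU name (e.g. 'cpu0') to its current cpufreq governor."""
    base = Path(sysfs_root) / "devices" / "system" / "cpu"
    governors = {}
    for gov_file in sorted(base.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        try:
            # .../cpuN/cpufreq/scaling_governor -> "cpuN"
            governors[gov_file.parent.parent.name] = gov_file.read_text().strip()
        except OSError:
            continue  # CPU offline or file unreadable: skip it
    return governors


def risky_cpus(governors, preferred=("performance", "schedutil")):
    """Return the CPUs whose governor is outside the preferred set.

    The preferred set is an assumption for this sketch, not RigDoctor policy.
    """
    return sorted(cpu for cpu, gov in governors.items() if gov not in preferred)


if __name__ == "__main__":
    found = read_governors()
    print("governors:", found)
    print("flagged:", risky_cpus(found))
```

Passing the sysfs root as a parameter keeps the check read-only and lets it run against a fake directory tree in tests.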
RigDoctor pulls all of that into one modular tool: live monitoring, crash-safe logging, a
one-shot health report, and an interactive installer that only sets up the modules a given
user actually needs for their hardware.
## Install
**Seed use cases:** an RTX 3070 that intermittently "falls off the bus" under heavy GPU load
(Path of Exile on Linux, Escape from Tarkov on Windows), and a monitor going black mid-game.
See `docs/SPEC.md` §1.
### Debian / Ubuntu — `.deb`
## How you run it
RigDoctor is **GUI-first** — the desktop app is the primary way in — but every feature is
also available headless:
- **Desktop GUI** — graphical dashboard, recording controls, log browser, reports. The
default interface for most users.
- **Tray applet** — a small top-menu-bar applet with quick actions and at-a-glance status.
- **CLI** — full functionality from the terminal; works over SSH and in scripts.
The GUI/tray are optional modules; a headless (CLI-only) install loses no capability.
## Key decisions (settled)
| Topic | Decision |
|-------|----------|
| Name | **RigDoctor** |
| Language / stack | **Python 3 + Qt (PySide6)** — core/CLI/daemon stdlib-only; Qt only for GUI/tray |
| Primary distro | **Ubuntu** (Debian via apt); others best-effort later |
| Primary GPU | **NVIDIA** first; AMD, then Intel later |
| MVP | **Sensor core + crash logger + health report** (NVIDIA-only, CLI-first) |
| Distribution | **User-local install** (self-updating from the public repo, no root); **`.deb`** optional |
| Scope of action | **Read-only + suggestions** (no auto-apply yet) |
| Stress tests | **Out of scope** |
Full rationale and the still-open questions are in `docs/DECISIONS.md`.
## Repo layout
| Path | Purpose |
|------|---------|
| `docs/SPEC.md` | Product specification — vision, requirements, modules (the main planning doc) |
| `docs/ARCHITECTURE.md` | Technical design — core engine, front-ends, daemon, installer |
| `docs/MODULES.md` | Catalog of modules with scope, dependencies, status |
| `docs/ROADMAP.md` | Phased milestones |
| `docs/DECISIONS.md` | Decision log + remaining open questions |
| `src/rigdoctor/` | Source code — `core/` engine + sources, `cli.py`, `render.py` |
| `installer/` | Installer / `.deb` packaging (empty until Phase 4) |
| `tests/` | Tests (stdlib `unittest`) |
## Install (user-local, no root)
RigDoctor installs into a private venv under `~/.local` — no root, self-updating:
The simplest path: grab the latest **`rigdoctor_<version>_all.deb`** from the
[releases page](https://git.jesseyvanofferen.com/jessey/rigdoctor/releases) and install it —
apt pulls the GUI dependencies (PySide6, pyte) automatically:
```bash
./install.sh # from a source checkout or the self-extracting .run
./install.sh --ref v0.0.6 # install a specific released tag (needs a token)
./install.sh --uninstall # remove it
sudo apt install ./rigdoctor_*_all.deb # CLI only: add --no-install-recommends
```
This adds `rigdoctor` / `rigdoctor-gui` to `~/.local/bin` and a desktop entry. Each release
also ships a one-file **`.run`** installer (download, `chmod +x`, run). Updates are gated to
accounts on the Git server (a Personal Access Token); save one via the GUI **Setup → Update
access** panel or `rigdoctor login`, then `rigdoctor update` (or the sidebar button).
## Install (`.deb`, system-wide)
Each release also ships a **`.deb`** (`Architecture: all`, M9/D8). Download it from the release
and install with apt (pulls the GUI deps — PySide6/pyte — via Recommends):
**Or add the apt repository** for `apt install` + automatic updates:
```bash
sudo apt install ./rigdoctor_<version>_all.deb # CLI-only: add --no-install-recommends
```
# the registry is private, so give apt a token (a Gitea PAT with read:package)
echo "machine git.jesseyvanofferen.com login <user> password <token>" \
| sudo tee /etc/apt/auth.conf.d/rigdoctor.conf
sudo chmod 600 /etc/apt/auth.conf.d/rigdoctor.conf
When the apt registry is enabled on the server, you can instead add it as a source and
`sudo apt update && sudo apt install rigdoctor` (with `apt upgrade` for updates):
```bash
curl -fsSL https://git.jesseyvanofferen.com/api/packages/jessey/debian/repository.key \
| sudo tee /etc/apt/keyrings/gitea-rigdoctor.asc > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/gitea-rigdoctor.asc] \
https://git.jesseyvanofferen.com/api/packages/jessey/debian stable main" \
echo "deb [trusted=yes] https://git.jesseyvanofferen.com/api/packages/jessey/debian stable main" \
| sudo tee /etc/apt/sources.list.d/rigdoctor.list
sudo apt update && sudo apt install rigdoctor
```
## Run it (dev)
Then `sudo apt upgrade` keeps it current. *(If your server serves a signed registry, drop the
`auth.conf.d` file and replace `[trusted=yes]` with `[signed-by=…]` + the `repository.key`.)*
Stdlib-only, no install needed (target is Python ≥ 3.11; tested on 3.14):
### Any distro — self-extracting `.run` (no root)
Download **`rigdoctor-<version>-installer.run`** from the releases page and run it. It installs
into a private virtualenv under `~/.local` (no root), adds the launchers + desktop entry, and
opens the first-run setup wizard:
```bash
PYTHONPATH=src python3 -m rigdoctor snapshot # one-shot sensor read
PYTHONPATH=src python3 -m rigdoctor snapshot --json
PYTHONPATH=src python3 -m rigdoctor monitor -n 1 # live view (Ctrl-C to quit)
PYTHONPATH=src python3 -m rigdoctor sources # list detected sensor sources
PYTHONPATH=src python3 -m unittest discover -s tests
sh rigdoctor-*-installer.run
```
### Crash-capture logger (M3)
### Updating & removing
A crash-safe background logger (JSONL, `fsync` per sample, bounded by rotation) for catching
the state right before a freeze:
- **`.deb`:** `sudo apt upgrade` (or reinstall a newer `.deb`).
- **`.run` / user-local:** the in-app **Update** button, or `rigdoctor update`.
- **Remove:** `sudo apt remove rigdoctor`, or `rigdoctor uninstall` for the user-local install.
## Using it
Launch **RigDoctor** from your app menu, or:
```bash
rigdoctor record start # start logging in the background
rigdoctor record status # is it running? latest readings, sample count
rigdoctor record stop # stop it
rigdoctor record report # post-crash summary: peaks, events, last samples
rigdoctor record run # run in the foreground (the systemd-ready entrypoint)
rigdoctor-gui # desktop app (+ tray)
rigdoctor --help # everything from the terminal (works over SSH)
```
Logs live in `~/.local/share/rigdoctor/logs/`. It detects GPU "lost"/hang (nvidia-smi query
timeout) and writes an event marker. Trigger modes (always-on / game-launch) and the
`systemd --user` service arrive in Phase 4.
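
The two ideas in this section, an `fsync` per sample and an event marker when the `nvidia-smi` query times out, can be sketched in a few lines of stdlib Python. This is an illustration only: the record fields (`ts`, `type`, `event`, `raw`), the function names, and the 5-second timeout are assumptions, not RigDoctor's actual log schema or code.

```python
import json
import os
import subprocess
import time


def append_record(path, record):
    """Append one JSON line and force it to disk before returning."""
    line = json.dumps(record, separators=(",", ":")) + "\n"
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line)
        fh.flush()             # Python buffer -> kernel
        os.fsync(fh.fileno())  # kernel page cache -> storage device


def query_gpu(cmd, timeout_s=5.0):
    """Run a sensor query; return its stdout, or None if it timed out."""
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None  # treated as a possible GPU hang
    return done.stdout.strip()


def sample_once(log_path, cmd, timeout_s=5.0):
    """Log one sample, or an event marker when the query hangs."""
    out = query_gpu(cmd, timeout_s)
    if out is None:
        record = {"ts": time.time(), "type": "event", "event": "gpu_query_timeout"}
    else:
        record = {"ts": time.time(), "type": "sample", "raw": out}
    append_record(log_path, record)
    return record


if __name__ == "__main__":
    # Raises FileNotFoundError if nvidia-smi is not installed.
    nvidia_cmd = ["nvidia-smi", "--query-gpu=temperature.gpu,power.draw",
                  "--format=csv,noheader"]
    print(sample_once("samples.jsonl", nvidia_cmd))
```

Syncing every line costs some write throughput, but it means the last record written before a hard freeze has already been handed to the storage device.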
### Desktop GUI (M10)
The GUI uses PySide6 (Qt) — the only part of RigDoctor that needs a non-stdlib dep:
Handy CLI commands:
```bash
pip install -e '.[gui]' # core + PySide6, gives `rigdoctor` and `rigdoctor-gui`
rigdoctor gui # or: rigdoctor-gui
rigdoctor snapshot # one-shot reading of every sensor
rigdoctor monitor # live terminal dashboard
rigdoctor report # health report (logs / SMART / driver)
rigdoctor diagnose start|finish # capture while gaming, then analyse
rigdoctor gameenv # flag risky gaming settings + fixes
rigdoctor inventory # hardware/OS inventory
rigdoctor ai explain # AI explanation of the current findings (opt-in)
rigdoctor bundle # zip the latest diagnostic into a shareable report
```
It opens a dark-themed window with sidebar navigation and a **live dashboard** over the
same sensor core — circular gauges for the headline metrics plus collapsible per-subsystem
cards (GPU/CPU/memory/storage) with temperature-colored values (icy-blue → green → red).
The **Logs** and **Health** sections are full pages (recording controls + post-crash report;
and the kernel-log / SMART / driver scan). **Inventory** is a placeholder until M5 lands.
## Requirements
Without the GUI extra, `pip install -e .` gives just the stdlib-only CLI.
- **Linux** — Ubuntu/Debian first-class (the `.deb`); the `.run` works on any distro with
Python ≥ 3.11.
- **GPU** — NVIDIA fully supported (via `nvidia-smi`); AMD/Intel sensors are best-effort.
- **CLI/daemon** need only Python 3 (stdlib). The **GUI/tray** add **PySide6** (`python3-pyside6`).
- Optional tools unlock more: `smartmontools`, `lm-sensors`, `gamemode`, `mangohud`. The setup
wizard offers to install them.
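
To see which of these requirements a machine already meets, a quick stdlib check along these lines may help. It is a sketch, not part of RigDoctor: the package-to-executable mapping is an assumption about what each package installs on Debian/Ubuntu, and `check_environment` is a hypothetical helper name.

```python
import shutil
import sys

# Package -> executable it is assumed to provide (an assumption for this sketch).
OPTIONAL_TOOLS = {
    "smartmontools": "smartctl",
    "lm-sensors": "sensors",
    "gamemode": "gamemoderun",
    "mangohud": "mangohud",
}


def check_environment(version_info=sys.version_info, which=shutil.which):
    """Return (python_ok, missing_packages) for the requirements listed above."""
    python_ok = tuple(version_info[:2]) >= (3, 11)
    missing = sorted(pkg for pkg, exe in OPTIONAL_TOOLS.items() if which(exe) is None)
    return python_ok, missing


if __name__ == "__main__":
    ok, missing = check_environment()
    print("Python >= 3.11:", ok)
    print("Missing optional tools:", ", ".join(missing) or "none")
```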
## Start here
## Privacy
1. Read `docs/SPEC.md` for what we're building.
2. Read `docs/ROADMAP.md` for the build order (Phase 1 = the MVP).
3. Read `docs/DECISIONS.md` for the settled decisions (D1–D15).
Everything stays on your machine — no telemetry, no phone-home. The AI assistant is **off by
default** and runs only when you explicitly trigger it; with Ollama nothing leaves the machine,
and the Claude option asks before sending. Reports are local files; they leave only if you share
the zip.
## Development
RigDoctor's core is stdlib-only Python; the GUI/tray use PySide6.
```bash
git clone https://git.jesseyvanofferen.com/jessey/rigdoctor && cd rigdoctor
pip install -e ".[gui]" # core + GUI; omit [gui] for CLI-only
python -m unittest discover -s tests # run the test suite
PYTHONPATH=src python3 -m rigdoctor snapshot # run without installing
```
Design docs live in `docs/`: `SPEC.md` (vision/requirements), `ARCHITECTURE.md`,
`MODULES.md` (module catalog), `ROADMAP.md`, and `DECISIONS.md` (the decision log).
Contributions: branch off `main`, keep tests green (CI runs them on PRs), and bump the version
and update `CHANGELOG.md` for shipped changes.