Release 0.0.2: M3 logger (CLI + GUI), GUI-first, CI release workflow

Crash-capture logger (M3):
- crash-safe JSONL (fsync per sample), size-based rotation, GPU-lost/recovered
  markers, atomic status file
- CLI: record run/start/stop/status/report (run = systemd-ready entrypoint)
- shared core.reccontrol so CLI + GUI drive the same recorder
- crashlog tests (writer, rotation, reader, summary, recorder)

GUI:
- Recording/Logs page: start/stop/interval controls, live status, post-crash report
- shared render helpers (format_raw/headline, render_summary)

Docs/decisions:
- GUI-first (D17); CLI keeps full parity
- D8 revised: user-local self-updating install primary, .deb optional
- planned: M12 session sharing (D16), M13 no-root auto-update from public repo (D18)
- versioning + CHANGELOG convention (D19)

Infra:
- .gitea/workflows/release.yml: build wheel+sdist and publish a Gitea release
  v<version> on push to main
- align version to the 0.0.x release line; bump to 0.0.2

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 17:16:41 +02:00
parent 2ccf7ca50c
commit ce5f830393
20 changed files with 1157 additions and 60 deletions
+31 -11
@@ -2,9 +2,10 @@
A **modular diagnostics, monitoring, and health-check toolkit for Linux gamers.**
> **Status:** 🟢 Phase 1 (MVP) in progress. Foundational decisions are settled and the
> **sensor core (M1)** works — `snapshot` / `monitor` read NVIDIA GPU, CPU, memory, and
> NVMe live. Crash logger (M3) and health report (M4) are next. See `docs/ROADMAP.md`.
> **Status:** 🟢 Phase 1 (MVP) in progress. The **sensor core (M1)** and **crash-capture
> logger (M3)** work — `snapshot`/`monitor` read NVIDIA GPU, CPU, memory, and NVMe live, and
> `record` captures a crash-safe log with a post-crash report. A desktop GUI (M10) is also
> up. Health report (M4) is next. See `docs/ROADMAP.md`.
## Why this exists
@@ -25,13 +26,14 @@ See `docs/SPEC.md` §1.
## How you run it
Three front-ends over one shared engine; pick what fits:
- **CLI / headless** — full functionality from the terminal, works over SSH.
- **Desktop GUI** — graphical dashboard, log browser, and health-report viewer.
- **Tray applet** — a small applet in the top menu bar with quick actions (e.g. start
recording) and at-a-glance status.
RigDoctor is **GUI-first** — the desktop app is the primary way in — but every feature is
also available headless:
- **Desktop GUI** — graphical dashboard, recording controls, log browser, reports. The
default interface for most users.
- **Tray applet** — a small top-menu-bar applet with quick actions and at-a-glance status.
- **CLI** — full functionality from the terminal; works over SSH and in scripts.
The GUI and tray are optional modules; a headless install loses no diagnostic capability.
The GUI/tray are optional modules; a headless (CLI-only) install loses no capability.
## Key decisions (settled)
@@ -42,7 +44,7 @@ The GUI and tray are optional modules; a headless install loses no diagnostic ca
| Primary distro | **Ubuntu** (Debian via apt); others best-effort later |
| Primary GPU | **NVIDIA** first; AMD, then Intel later |
| MVP | **Sensor core + crash logger + health report** (NVIDIA-only, CLI-first) |
| Distribution | **`.deb`** + interactive module installer |
| Distribution | **User-local install** (self-updating from the public repo, no root); **`.deb`** optional |
| Scope of action | **Read-only + suggestions** (no auto-apply yet) |
| Stress tests | **Out of scope** |
@@ -73,6 +75,23 @@ PYTHONPATH=src python3 -m rigdoctor sources # list detected sensor sources
PYTHONPATH=src python3 -m unittest discover -s tests
```
### Crash-capture logger (M3)
A crash-safe background logger (JSONL, `fsync` per sample, bounded by rotation) for catching
the state right before a freeze:
```bash
rigdoctor record start # start logging in the background
rigdoctor record status # is it running? latest readings, sample count
rigdoctor record stop # stop it
rigdoctor record report # post-crash summary: peaks, events, last samples
rigdoctor record run # run in the foreground (the systemd-ready entrypoint)
```
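The "crash-safe JSONL, `fsync` per sample, bounded by rotation" idea can be sketched roughly as follows. This is a minimal illustration, not RigDoctor's actual writer: the class name `JsonlWriter`, its parameters (`max_bytes`, `backups`), and the `.1`, `.2` backup naming are all assumptions made for the example.

```python
import json
import os
from pathlib import Path


class JsonlWriter:
    """Illustrative crash-safe JSONL writer (hypothetical, not RigDoctor's class)."""

    def __init__(self, path, max_bytes=1_000_000, backups=3):
        self.path = Path(path)
        self.max_bytes = max_bytes   # rotate before the file would exceed this size
        self.backups = backups       # number of rotated files to keep (assumed >= 1)
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self._fh = open(self.path, "ab")
        self._size = self.path.stat().st_size

    def write(self, sample):
        # One JSON object per line; a torn final line after a crash is easy to skip.
        data = (json.dumps(sample, separators=(",", ":")) + "\n").encode("utf-8")
        if self._size > 0 and self._size + len(data) > self.max_bytes:
            self._rotate()
        self._fh.write(data)
        self._fh.flush()              # push Python's buffer to the OS
        os.fsync(self._fh.fileno())   # ask the OS to push it to the disk
        self._size += len(data)

    def _rotate(self):
        # Shift log.N -> log.N+1 (oldest is overwritten), then current -> log.1.
        self._fh.close()
        for i in range(self.backups - 1, 0, -1):
            src = self.path.with_name(f"{self.path.name}.{i}")
            if src.exists():
                os.replace(src, self.path.with_name(f"{self.path.name}.{i + 1}"))
        os.replace(self.path, self.path.with_name(f"{self.path.name}.1"))
        self._fh = open(self.path, "ab")
        self._size = 0

    def close(self):
        self._fh.close()
```

Calling `os.fsync` after every sample costs throughput, but it means the last sample written before a freeze has the best chance of being on disk; the size cap keeps an always-on logger from filling the drive.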
Logs live in `~/.local/share/rigdoctor/logs/`. It detects GPU "lost"/hang (nvidia-smi query
timeout) and writes an event marker. Trigger modes (always-on / game-launch) and the
`systemd --user` service arrive in Phase 4.
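The GPU "lost" detection described above (a query that times out) might look roughly like this. It is a sketch under stated assumptions: the function names `probe_gpu` and `gpu_event`, the five-second timeout, and the event names `gpu_lost` / `gpu_recovered` are hypothetical, and the real recorder may query different fields or handle errors differently.

```python
import subprocess
import time

# A standard nvidia-smi query; the fields RigDoctor actually reads are not shown here.
NVIDIA_SMI_CMD = [
    "nvidia-smi",
    "--query-gpu=temperature.gpu",
    "--format=csv,noheader,nounits",
]


def probe_gpu(cmd=NVIDIA_SMI_CMD, timeout_s=5.0):
    """Return the query output as text, or None if the query timed out."""
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # A hung driver typically makes nvidia-smi block; treat that as "GPU lost".
        return None
    return done.stdout.strip()


def gpu_event(was_ok, reading):
    """Return an event marker dict on a state change, else None (hypothetical names)."""
    is_ok = reading is not None
    if was_ok and not is_ok:
        return {"event": "gpu_lost", "ts": time.time()}
    if not was_ok and is_ok:
        return {"event": "gpu_recovered", "ts": time.time()}
    return None
```

In a sampling loop, the returned marker would be written to the same JSONL log as the regular samples, so a post-crash report can show when the GPU stopped responding relative to the last good readings.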
### Desktop GUI (M10)
The GUI uses PySide6 (Qt) — the only part of RigDoctor that needs a non-stdlib dep:
@@ -85,7 +104,8 @@ rigdoctor gui # or: rigdoctor-gui
It opens a dark-themed window with sidebar navigation and a **live dashboard** over the
same sensor core — circular gauges for the headline metrics plus collapsible per-subsystem
cards (GPU/CPU/memory/storage) with temperature-colored values (icy-blue → green → red).
The Logs / Health / Inventory sections are placeholders until M3–M5 land.
The **Logs** section is a full recording page (start/stop, live status, and the post-crash
report); Health / Inventory are placeholders until M4 / M5 land.
Without the GUI extra, `pip install -e .` gives just the stdlib-only CLI.