Compare commits
24 Commits
| SHA1 |
|---|
| f95387c5b8 |
| 1dc86121f6 |
| cd54e5f2c5 |
| 1b24d1b032 |
| 7ac14416b5 |
| b22a2f5593 |
| f45d8c9b34 |
| 8d6ce47e87 |
| 03b2dd8363 |
| ab89dda0b4 |
| 305c88ba09 |
| 82f3ea49de |
| 8d695227bc |
| 82bef0a08c |
| 73f347449e |
| 5cd51beadf |
| 934b489fec |
| 7a283dc338 |
| 5682878f22 |
| 5a584c08d5 |
| 8b1083a29b |
| 25b7a58e3c |
| 1ec8675fa0 |
| 9c30c9824e |
+126
@@ -5,6 +5,132 @@ All notable changes to RigDoctor are recorded here. Format follows
 (`MAJOR.MINOR.PATCH`, pre-1.0). `__version__` and `pyproject.toml` must match the git
 release tag (so the auto-updater, D18, can compare versions).
 
+## [0.18.2] - 2026-05-22
+### Fixed
+- **GUI wouldn't start** (0.18.0 regression): the recording indicator used a wrong relative
+  import (`from .core` → `rigdoctor.gui.core`, which doesn't exist), crashing `MainWindow` on
+  launch. Corrected to `from ..core`.
+
+## [0.18.1] - 2026-05-22
+### Changed
+- Recording badge: dropped the sample count (not useful at a glance) — it now shows just
+  **● Recording** + the game, plus a **⚠ GPU-lost** line if one is detected.
+
+## [0.18.0] - 2026-05-22
+### Added
+- **Global recording indicator.** While a capture is running, the sidebar shows a red
+  **● Recording** badge on every page — with the **game** being captured and the live sample
+  count (and a GPU-lost flag if seen). It polls the recorder, so it reflects captures started
+  any way: manual `record`, a guided diagnostic, or the Steam launch wrapper.
+
+## [0.17.0] - 2026-05-22
+### Added
+- **Inventory page is back in the GUI** (it was removed in 0.7.2 in favor of the CLI). Sidebar
+  **Inventory** → System / CPU / Firmware / Memory / GPU / Storage / Display as cards, with
+  **Copy Markdown** and **Save…** for pasting into forum/bug reports, and **Refresh**. Root-only
+  details (motherboard/BIOS/RAM modules via dmidecode) fill in after the launch password prompt.
+  Backed by the existing M5 `core/inventory.py` — the CLI `rigdoctor inventory` is unchanged.
+
+## [0.16.0] - 2026-05-22
+### Added
+- **Automatic crash-capture via a Steam launch wrapper (M6/D12).** Set `rigdoctor wrap
+  %command%` as a game's Steam launch option (or in Lutris/Heroic's wrapper field) and RigDoctor
+  starts a focused, game-tagged capture when the game launches and stops it cleanly on exit — no
+  manual Run Diagnostic / Finish. A hard freeze leaves the capture unterminated, so it's flagged
+  as a crash next launch. The wrapper resolves the game name from Steam's `SteamAppId`, doesn't
+  disturb an existing capture, and returns the game's exit code. (`core/wrap.py`, `rigdoctor wrap`.)
+- GUI **Auto-capture…** helper on the Games page: shows the exact launch-option line (absolute
+  path, copy button) and how to set it in Steam.
+- Auto-capture preserves an unanalyzed crash (`diagnostic-crash.jsonl`) before starting a new
+  capture, so relaunching the game can't wipe a crash report you haven't seen yet.
+### Fixed
+- `docs/MODULES.md` status column was stale — M1, M3, M4, M5, M8, M10, and M13 are done and now
+  marked ✅ (only M2 and M11 remain not-started; M6/M9/M12 in progress).
+
+## [0.15.0] - 2026-05-22
+### Added
+- **Hard-crash detection & recovery for the guided diagnostic.** If a focused capture ends
+  without a clean stop (the recorder never wrote `session-stop` and isn't running), RigDoctor
+  treats it as a likely hard freeze. On launch the **Games** page shows a warning banner —
+  *"Your last diagnostic for <game> ended unexpectedly…"* — with **Analyze crash** / **Dismiss**.
+- **Deeper crash analysis.** *Analyze crash* combines the captured window (final readings before
+  the freeze + any GPU-lost event) with a focused scan of the **previous (crashed) boot's kernel
+  log** (`journalctl -k -b -1`: Xid/panic/OOM/MCE/AER/thermal) plus SMART/driver/persistence/
+  live-temp checks — the full "what happened" picture. `core/diagnostic.py` gains
+  `pending_crash()` / `analyze_crash()`; `health.check_previous_boot()` +
+  `run_health_checks(include_journal=False)` back it.
+
+## [0.14.0] - 2026-05-22
+### Changed
+- **Dashboard headline tiles are now history trend graphs** instead of single-value gauges —
+  GPU temp, GPU load, CPU temp, and memory each plot their recent history (with the current
+  value, window min/max, and a dashed warning-threshold line), so you can see changes over time
+  rather than only the instantaneous reading. New `HistoryGraph` widget (QPainter, no new deps).
+
+## [0.13.0] - 2026-05-22
+### Added
+- **Run Diagnostic now explains itself and can launch the game.** Clicking Run Diagnostic shows
+  what to do — *play the game, reproduce the crash, then Finish & analyze* (and that data
+  survives a hard freeze + reboot) — and offers **Launch game & start** (asks Steam to run it by
+  appid) or **Start without launching**. The recording banner now spells out the next step
+  instead of just showing a sample count.
+### Fixed
+- Button labels containing "&" (e.g. "Finish & analyze") rendered as "Finish _analyze" because
+  Qt treated the "&" as a keyboard mnemonic — now escaped so the ampersand shows literally.
+
+## [0.12.0] - 2026-05-22
+### Added
+- **Guided diagnostic in the GUI.** Each game on the **Games** page now has a **Run Diagnostic**
+  button → a focused, game-tagged capture starts and a recording banner appears (live sample
+  count, GPU-lost indicator) with **Finish & analyze** / **Discard**. Finishing opens a results
+  dialog: the window-scoped capture summary (peak temps/power, events, last samples) plus the
+  health findings as cards. The banner persists/restores if you navigate away and back while a
+  capture is running. Shares `core/diagnostic.py` with the CLI (one flow, three front-ends).
+
+## [0.11.0] - 2026-05-22
+### Added
+- **Guided diagnostic session (CLI) — the seed use case, end to end.** `rigdoctor diagnose
+  start --game "<name>"` runs a **focused crash-capture tagged with that game** (its own
+  diagnostic log, so the report is scoped to just that session), `diagnose status` shows
+  progress, and `diagnose finish` stops it and prints a combined report: the **capture
+  summary** (peak temps/power, GPU-lost events, last samples — M3) plus the **health findings**
+  (Xid/SMART/driver/etc. — M4). The game can be given by `--game` or `--appid` (resolved from
+  the Steam scan), and is recorded as a log event so it survives a crash + reboot.
+- Shared orchestration lives in `core/diagnostic.py` (one callable for CLI/GUI/tray, per
+  ARCHITECTURE §7.1); the recorder/`record run` gained an optional `--game` tag.
+
+## [0.10.2] - 2026-05-22
+### Changed
+- When an Environment **Apply**/**Install** fails, the status now shows the **real reason**
+  (cancelled at the password prompt vs. the system rejecting the change, e.g. a BIOS/kernel-
+  locked PCIe ASPM policy) instead of a vague "cancelled, or needs privileges".
+
+## [0.10.1] - 2026-05-22
+### Fixed
+- **Environment-page contrast.** The combo-box **drop-down list** was rendering light-on-light
+  (the popup view is a separate widget the theme didn't cover) — now dark with readable text.
+- The **Install / Apply** buttons on findings were hard to read (the accent fill didn't paint
+  reliably inside the finding cards, leaving dim dark-on-dark text). They're now an outlined
+  style — bright accent text on the dark card, filling accent on hover — readable regardless,
+  and given a minimum height so the row can't crush them.
+
+## [0.10.0] - 2026-05-22
+### Added
+- **Actionable Environment page (M6) — install & apply, not just advice.** Findings that
+  recommend a tool or a setting are now one-click:
+  - **Install buttons** for GameMode, MangoHud, and cpupower (added to the M9 component catalog,
+    so they also appear on the **Setup** page with the existing installer).
+  - **Apply controls** for runtime-reversible tunables — a dropdown of the live options + Apply,
+    via a single pkexec prompt, no reboot: **CPU governor**, **NVIDIA persistence mode**,
+    **PCIe ASPM policy**, **vm.swappiness**, **Transparent HugePages** (`core/fixes.py`). The
+    chosen value is validated against the live options before anything runs.
+- This is the consent-gated apply milestone D9 anticipated, scoped to safe settings (**D22**).
+  GRUB-based fixes and CPU mitigations stay suggestion-only; `rigdoctor gameenv` still prints
+  the exact commands for headless use.
+### Changed
+- The `Finding` model gained optional `action` (installable component) and `fix` (applyable
+  tunable) fields; the shared `finding_card` widget renders the matching control.
+
 ## [0.9.0] - 2026-05-22
 ### Added
 - **Gaming environment checks (M6) — the evaluate-and-suggest engine.** A new read-only report
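Reviewer note on the 0.18.2 entry: the difference between `from .core` and `from ..core` can be checked with the standard library's `importlib.util.resolve_name`, without importing RigDoctor at all. This is a minimal sketch. It assumes the failing import sat in a module directly inside the `rigdoctor.gui` package; the diff does not show which file that was.

```python
from importlib.util import resolve_name

# Assumption: the importing module belongs to the package "rigdoctor.gui".
PACKAGE = "rigdoctor.gui"


def resolve(relative: str, package: str = PACKAGE) -> str:
    """Return the absolute module name that a relative import would target."""
    return resolve_name(relative, package)


wrong = resolve(".core")   # one dot: stays inside rigdoctor.gui
right = resolve("..core")  # two dots: climbs one level to the rigdoctor package

print(wrong)
print(right)
```

Each leading dot after the first climbs one package level, which is why the single-dot form pointed at a module under `rigdoctor.gui` rather than at the top-level `core` package.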
+17
-1
@@ -223,9 +223,25 @@ The next version is **determined by the Conventional Commit types** since the la
 `packaging/bump.sh` writes it into `__init__.py` + `pyproject.toml`. Rules live in
 `cliff.toml [bump]` (pre-1.0: `breaking_always_bump_major = false`).
 
+### D22 — Limited live apply of fixes (M6) — *DECIDED 2026-05-22; realizes the D9 milestone*
+D9 deferred auto-applying fixes to "a deliberate later milestone, gated behind explicit user
+consent." That milestone lands here, **scoped tightly to stay safe**:
+- **Only runtime-reversible settings** are applyable from the gaming-environment report (M6):
+  **CPU governor, NVIDIA persistence mode, PCIe ASPM policy, vm.swappiness, Transparent
+  HugePages.** Each takes effect immediately, needs **no reboot**, and reverts on reboot.
+- **How:** a dropdown of the live options + an Apply button per finding (`core/fixes.py`).
+  Applying runs a **single pkexec-elevated command** (one auth prompt); the chosen value is
+  validated against the live options first; writes target **sysfs/procfs or `nvidia-smi`** —
+  never the GRUB cmdline or a persistent config file.
+- **Still suggestion-only** (the read-only stance holds for these): GRUB-based `pcie_aspm=off`,
+  CPU **mitigations** changes (security-sensitive, need a reboot), and the shader-cache env var.
+- Everything remains **CLI-discoverable** (`rigdoctor gameenv` still prints the exact commands);
+  the apply UI is an additive convenience in the GUI, not the only path. Installing optional
+  tools (GameMode/MangoHud/cpupower) reuses the M9 installer and is likewise one-click.
+
 ## Open
 
-None currently — all tracked decisions (D1–D21) are resolved. New questions will be added
+None currently — all tracked decisions (D1–D22) are resolved. New questions will be added
 here as they arise. Remaining detail to flesh out during build: the tray's supporting-action
 set (D13), per-module apt package names, M12's tunnel/token specifics, and M13's
 update mechanism (APT repo vs. self-installed `.deb`).
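Reviewer note on D22: the rule that the chosen value is validated against the live options before anything runs could look roughly like the sketch below. The helper names are hypothetical and are not taken from `core/fixes.py`. The sysfs path is the standard cpufreq file for CPU 0, used here as one example of a live option list.

```python
from pathlib import Path

# Standard cpufreq sysfs file listing the governors the kernel currently offers.
GOVERNORS_FILE = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors")


def parse_options(text: str) -> list:
    """Split a whitespace-separated sysfs option list into individual values."""
    return text.split()


def live_governors(path: Path = GOVERNORS_FILE) -> list:
    """Read the live options; return an empty list if the file is unavailable."""
    try:
        return parse_options(path.read_text())
    except OSError:
        return []


def validate_choice(choice: str, options: list) -> str:
    """Accept only a value the system itself advertised (hypothetical helper)."""
    if choice not in options:
        raise ValueError(f"{choice!r} is not one of the live options: {options}")
    return choice


# Usage with a literal option list, so the example runs on any machine.
options = parse_options("performance powersave\n")
print(validate_choice("performance", options))
```

Checking membership in the advertised list means free-form input never reaches the elevated command, which suits a design with a single pkexec prompt.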
+24
-12
@@ -8,18 +8,18 @@ Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done
 
 | ID | Module | Bundle | Key deps | GPU scope | Priority | Status |
 |----|--------|--------|----------|-----------|----------|--------|
-| M1 | Sensor core | Essential | none (nvidia-smi, sysfs) | all (NVIDIA first) | P0 | ⬜ |
-| M3 | Crash-capture logger | Essential | none (opt: smartmontools) | all (NVIDIA first) | P0 | 🟨 |
-| M4 | Health report (log scan) | Essential | none (opt: smartmontools) | all (NVIDIA first) | P0 | 🟨 |
+| M1 | Sensor core | Essential | none (nvidia-smi, sysfs) | all (NVIDIA first) | P0 | ✅ |
+| M3 | Crash-capture logger | Essential | none (opt: smartmontools) | all (NVIDIA first) | P0 | ✅ |
+| M4 | Health report (log scan) | Essential | none (opt: smartmontools) | all (NVIDIA first) | P0 | ✅ |
 | M2 | Live monitor (TUI) | Monitoring | none (stdlib curses) | all | P1 | ⬜ |
-| M8 | Alerting | Monitoring | libnotify (opt) | all | P2 | 🟨 |
-| M5 | System inventory | Diagnostics | none (opt: lm-sensors, dmidecode) | all | P1 | 🟨 |
+| M8 | Alerting | Monitoring | libnotify (opt) | all | P2 | ✅ |
+| M5 | System inventory | Diagnostics | none (opt: lm-sensors, dmidecode) | all | P1 | ✅ |
 | M6 | Gaming env checks | Diagnostics | none | all | P2 | 🟨 |
-| M10 | Desktop GUI | Desktop UI | **python3-pyside6** | all | P2 | 🟨 |
+| M10 | Desktop GUI | Desktop UI | **python3-pyside6** | all | P2 | ✅ |
 | M11 | Tray / menu-bar applet | Desktop UI | **python3-pyside6** (+ AppIndicator on GNOME) | all | P2 | ⬜ |
 | M9 | Installer | (meta) | none | all | P1 | 🟨 |
 | M12 | Session sharing / remote assist | Sharing | none (Tier 3: tmate/sshx) | all | P3 | 🟨 |
-| M13 | Auto-update | (core) | none (stdlib; user-local file swap) | all | P3 | 🟨 |
+| M13 | Auto-update | (core) | none (stdlib; user-local file swap) | all | P3 | ✅ |
 | ~~M7~~ | ~~Stress / repro~~ | — | — | — | — | ❌ dropped (D7) |
 
 ## Notes per module
@@ -31,8 +31,10 @@ Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done
   *Implemented (manual trigger):* JSONL log with fsync-per-sample, size-based rotation
   (`log_max_bytes`/`log_backups`), GPU-lost/recovered event markers, atomic status file, and
   `rigdoctor record run|start|stop|status|report`. The foreground `run` is the systemd-ready
-  entrypoint; the service unit + always-on/game-launch triggers (D6/D12) land in Phase 4.
-  Also fully driven from the GUI's Recording/Logs page (M10) via shared `core.reccontrol`.
+  entrypoint. The **game-launch trigger** is implemented via the D12 wrapper (`rigdoctor wrap
+  %command%`, see M6/below); the `systemd --user` service unit + always-on trigger (D6) and the
+  zero-config watcher (D12) are still pending. Also fully driven from the GUI's Recording/Logs
+  page (M10) via shared `core.reccontrol`.
 - **M4 Health report** — turns scattered logs into a prioritized, plain-language findings
   list with **suggested** fixes (read-only, D9). Reuses M1 for a live snapshot. Also powers
   the **guided diagnostic session** (with M3): pick a game → focused capture → scan →
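Reviewer note on the M3 entry: "JSONL log with fsync-per-sample" refers to a general pattern that can be sketched as below. The function names and the sample fields are hypothetical. The real recorder also handles rotation and event markers, which this sketch leaves out.

```python
import json
import os
import tempfile


def append_sample(path: str, sample: dict) -> None:
    """Append one JSON line and force it to disk before returning."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(sample) + "\n")
        fh.flush()             # push Python's buffer to the OS
        os.fsync(fh.fileno())  # ask the OS to commit the data to storage


def read_samples(path: str) -> list:
    """Read every complete JSON line back from the log."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Usage: write two made-up samples to a temporary log and read them back.
log_path = os.path.join(tempfile.mkdtemp(), "capture.jsonl")
append_sample(log_path, {"gpu_temp_c": 71})
append_sample(log_path, {"gpu_temp_c": 74})
print(read_samples(log_path))
```

Syncing after every line costs some write throughput, but it limits what a hard freeze can lose to roughly the sample that was in flight.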
@@ -51,9 +53,19 @@ Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done
   *Env-check engine implemented* (`core/gameenv.py`): a read-only findings report (reusing the
   M4 `Finding` model) over PCIe ASPM, NVIDIA persistence mode, CPU governor (the three seed-case
   contributors to GPU bus-drop / Xid 79), GameMode, MangoHud, swappiness, shader cache, THP, CPU
-  mitigations, and installed Proton versions — each with the suggested fix command (D9). CLI
-  `rigdoctor gameenv`; GUI **Environment** page. *Pending:* non-Steam launchers (Lutris/Heroic)
-  and per-GPU power-profile (PowerMizer) checks.
+  mitigations, and installed Proton versions — each with the suggested fix command. CLI
+  `rigdoctor gameenv`; GUI **Environment** page. Per **D22**, the GUI adds **one-click apply**
+  for the runtime-reversible tunables (governor / NVIDIA persistence / PCIe ASPM / swappiness /
+  THP — dropdown + Apply via a single pkexec prompt, `core/fixes.py`) and **one-click install**
+  of optional tools (GameMode / MangoHud / cpupower, now in the M9 catalog). GRUB/mitigations
+  stay suggestion-only. *Guided diagnostic (D12 "pick a game", `core/diagnostic.py`):* a focused
+  capture tagged with a game → window-scoped report (capture summary + M4 findings), in the CLI
+  (`rigdoctor diagnose start/status/finish`) and GUI (per-game **Run Diagnostic** → recording
+  banner → results dialog). **Auto-capture** via the D12 wrapper (`rigdoctor wrap %command%`,
+  `core/wrap.py`; GUI "Auto-capture…" helper). **Hard crashes are detected** (capture left
+  without a clean stop) and flagged on next launch with a crash-boot kernel-log analysis
+  (`pending_crash`/`analyze_crash` + `health.check_previous_boot`). *Pending:* non-Steam
+  launchers (Lutris/Heroic), GPU power-profile (PowerMizer) checks, and the zero-config watcher.
 - **M8 Alerting** — threshold/event notifications; integrates with the tray applet (M11).
 - **M10 Desktop GUI** — PySide6 graphical front-end over the core engine (dashboard, log
   browser, report viewer, logger controls). Optional; adds the Qt dependency. *Bootstrapped
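Reviewer note on hard-crash detection: the rule described here (a capture with no clean stop, and a recorder that is no longer running) can be sketched as below. This is not the `pending_crash()` implementation. The `"event"` key and the function names are assumptions for illustration; only the `session-stop` marker name comes from the changelog.

```python
import json
from typing import Iterable


def has_clean_stop(lines: Iterable[str], stop_event: str = "session-stop") -> bool:
    """True if any complete JSON record carries the stop marker."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # a freeze can leave a torn final line; skip it
        # Assumption: events are stored under an "event" key.
        if isinstance(record, dict) and record.get("event") == stop_event:
            return True
    return False


def looks_like_hard_crash(lines: Iterable[str], recorder_running: bool) -> bool:
    """Flag a capture that has data, no stop marker, and no live recorder."""
    lines = list(lines)
    if recorder_running or not any(line.strip() for line in lines):
        return False
    return not has_clean_stop(lines)


# Usage with made-up log lines; the last line is deliberately truncated.
crashed_log = ['{"event": "session-start"}', '{"gpu_temp_c": 80}', '{"gpu_te']
print(looks_like_hard_crash(crashed_log, recorder_running=False))
```

Treating an unparseable line as ordinary damage rather than an error matters here, because the final write before a freeze is the one most likely to be incomplete.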
+19
-7
@@ -40,11 +40,21 @@ Ubuntu + NVIDIA first; `.deb` distribution (see `DECISIONS.md`).
 - [ ] M10 desktop GUI (PySide6: dashboard, log browser, report viewer, logger controls)
 - [ ] M11 tray / menu-bar applet (QSystemTrayIcon: live M1 readouts + Run Diagnostic +
       supporting actions — D13)
-- [ ] Guided diagnostic session (pick game → focused M3 capture → M4 scan → findings),
-      shared by tray/GUI/CLI
-- [ ] Logger trigger modes: always-on + game-launch (D12 — wrapper first:
-      `rigdoctor wrap %command%` + global Steam compat-tool; zero-config watcher
-      (Steam RunningAppID + /proc) and GameMode hook follow)
+- [~] Guided diagnostic session (pick game → focused M3 capture → M4 scan → findings),
+      shared by tray/GUI/CLI — *core + CLI + GUI done* (`core/diagnostic.py`, `rigdoctor
+      diagnose start/status/finish`, and a **Run Diagnostic** button per game on the GUI Games
+      page → recording banner → results dialog with the capture summary + findings). Tags a
+      focused capture with the chosen game (own diagnostic log, window-scoped report) and
+      combines the capture summary with the M4 findings. **Auto start/stop** via the D12
+      wrapper is wired in, and a **hard-crash is detected** (capture left without a clean stop)
+      → flagged on next launch with a deeper crash-boot log analysis. *Pending:* the tray (M11)
+      entry point and the zero-config watcher.
+- [~] Logger trigger modes: always-on + game-launch (D12) — *game-launch **wrapper** done:*
+      `rigdoctor wrap %command%` (per-game Steam launch option / Lutris/Heroic wrapper field)
+      auto-brackets a focused capture around the game; GUI "Auto-capture…" helper shows the
+      launch-option string. *Pending:* global Steam compat-tool registration, the zero-config
+      watcher (Steam RunningAppID + /proc), GameMode hook, and the always-on `systemd --user`
+      service.
 - [~] M9 interactive installer — *done:* distro/GPU detection + optional-dependency install
       (`rigdoctor install`, GUI Setup tab); **user-local `install.sh` + self-extracting `.run`**
       (no-root venv install, handles python3-venv prereq, CI-built). *Pending:* module-selection
@@ -57,8 +67,10 @@ Ubuntu + NVIDIA first; `.deb` distribution (see `DECISIONS.md`).
 - [x] M13 auto-update (D18) — launch-time version check (GUI sidebar) + no-root self-update
       apply (`rigdoctor update` / sidebar button → authenticated pip upgrade), token-gated.
       Restart-after-update is manual for now.
-- [ ] (Later, separate milestone) Optional auto-apply of suggested fixes behind explicit
-      consent — currently out of scope (D9)
+- [~] Optional auto-apply of suggested fixes behind explicit consent (D9 milestone) — *first
+      cut shipped for M6 (D22):* one-click apply of runtime-reversible tunables (CPU governor,
+      NVIDIA persistence, PCIe ASPM, swappiness, THP) via a single pkexec prompt, no reboot.
+      GRUB-based fixes + CPU mitigations remain suggestion-only.
 
 ## Phase 6 — Session sharing / remote assist (M12, D16)
 Escalating ladder, built in order:
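Reviewer note on the D12 wrapper: "auto-brackets a focused capture around the game" and "returns the game's exit code" describe a start / run / stop shape that can be sketched as below. The callback parameters are hypothetical stand-ins for whatever `core/wrap.py` actually calls, and the child process here is a Python one-liner rather than a game.

```python
import subprocess
import sys
from typing import Callable, Sequence


def run_wrapped(
    command: Sequence[str],
    start_capture: Callable[[], None],
    stop_capture: Callable[[], None],
) -> int:
    """Start a capture, run the command, stop the capture, return its exit code."""
    start_capture()
    try:
        return subprocess.run(list(command)).returncode
    finally:
        # Runs on normal exit and when launching the command raises.
        stop_capture()


# Usage: a child process that exits with status 3 stands in for the game.
events = []
code = run_wrapped(
    [sys.executable, "-c", "import sys; sys.exit(3)"],
    start_capture=lambda: events.append("start"),
    stop_capture=lambda: events.append("stop"),
)
print(code, events)
```

A `finally` block only covers exits the wrapper survives. A hard freeze kills the wrapper too, which is why the unterminated capture is handled separately on the next launch.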
+10
-5
@@ -43,9 +43,12 @@ RigDoctor's crash-safe logger is designed to fix exactly that.
 - **Not a stress-test / load-generator** — explicitly out of scope (D7). Users can run
   existing tools (gpu-burn, vkmark, stress-ng) alongside the logger if they want.
 - Not an overclocking utility.
-- **Not (yet) an auto-fixer.** RigDoctor is **read-only**: it diagnoses and *suggests*
-  actions (with the exact command where possible) but does not apply changes itself in this
-  stage. Auto-apply is a deliberate later milestone behind explicit consent. (D9)
+- **Read-only by default, with a narrow consent-gated exception.** RigDoctor diagnoses and
+  *suggests* actions (with the exact command where possible). It does **not** apply changes
+  itself — **except** a small set of **runtime-reversible** gaming tunables (M6: CPU governor,
+  NVIDIA persistence, PCIe ASPM policy, swappiness, THP) that can be applied from the GUI via a
+  single pkexec prompt, no reboot, revert on reboot (D22, realizing the D9 milestone). Risky/
+  persistent fixes (GRUB cmdline, CPU mitigations) remain suggestion-only.
 
 ## 3. Target users & platforms
 
@@ -96,8 +99,10 @@ PCIe topology. Exportable (Markdown/JSON) to paste into forum/bug reports.
 ### M6 — Gaming environment checks
 Detects & evaluates: GPU power profile / persistence mode, CPU governor, Proton/Wine/Steam
 versions, GameMode, MangoHud, shader cache, swappiness, hugepages, CPU mitigations,
-PCIe ASPM. Flags settings that hurt stability/performance and **suggests** the fix command
-(read-only per D9).
+PCIe ASPM. Flags settings that hurt stability/performance and **suggests** the fix command.
+Also includes Steam library/game detection (the D12 "pick a game" foundation) and, per D22,
+a **one-click apply** for the runtime-reversible tunables (governor, persistence, ASPM,
+swappiness, THP) plus one-click install of optional tools (GameMode/MangoHud/cpupower).
 
 ### M8 — Alerting
 Threshold + event alerts (desktop notification / sound / log) on overheat, throttle,
+1
-1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "rigdoctor"
-version = "0.9.0"
+version = "0.18.2"
 description = "Modular hardware monitoring & crash diagnostics for Linux gamers."
 readme = "README.md"
 requires-python = ">=3.11"
@@ -1,3 +1,3 @@
 """RigDoctor — modular hardware monitoring & crash diagnostics for Linux gamers."""
 
-__version__ = "0.9.0"
+__version__ = "0.18.2"
@@ -86,6 +86,7 @@ def cmd_record_run(args) -> int:
         max_bytes=cfg["log_max_bytes"],
         backups=cfg["log_backups"],
         status_path=config.STATUS_FILE,
+        game=getattr(args, "game", None),
     )
 
     def _handle(_sig, _frame):
@@ -345,6 +346,83 @@ def cmd_report(args) -> int:
     return 0
 
 
+def _resolve_game(args) -> str | None:
+    """Game name from --game, or looked up from --appid via the Steam scan."""
+    if getattr(args, "game", None):
+        return args.game
+    if getattr(args, "appid", None):
+        from .core import steam
+
+        for g in steam.scan_games(steam.selected_library_paths()):
+            if g.appid == str(args.appid):
+                return g.name
+        return None
+    return None
+
+
+def cmd_diagnose(args) -> int:
+    from .core import diagnostic, reccontrol, steam
+
+    sub = args.diagnose_cmd or "status"
+
+    if sub == "start":
+        if reccontrol.running_pid():
+            print("A capture is already running — finish it with: rigdoctor diagnose finish")
+            return 1
+        game = _resolve_game(args)
+        if game is None and (args.game or args.appid):
+            print("Couldn't match that game in your selected Steam libraries.")
+            return 1
+        if game is None:
+            games = steam.cached_games() or steam.scan_games(steam.selected_library_paths())
+            if games:
+                print("Pick a game to focus on, then re-run with --game:")
+                for g in games:
+                    print(f" --game {g.name!r}")
+            else:
+                print("No games detected. Select a library: rigdoctor games libraries --all")
+            return 1
+        pid = diagnostic.start(game=game, interval=args.interval)
+        time.sleep(1.0)
+        if pid and reccontrol.pid_alive(pid):
+            print(f"Diagnostic capture started for {game!r} (pid {pid}).")
+            print(" Play your game. When you're done (or after a crash + reboot):")
+            print(" rigdoctor diagnose finish")
+            return 0
+        print(f"Capture failed to start; see {config.SPAWN_LOG}")
+        return 1
+
+    if sub == "status":
+        status = diagnostic.active()
+        if not status:
+            print("No diagnostic capture is running.")
+            return 0
+        game = status.get("game") or "—"
+        print(f"Capturing for {game!r}: {status.get('samples', 0)} samples"
+              + (" · GPU-lost seen" if status.get("gpu_lost") else ""))
+        return 0
+
+    # finish
+    if not reccontrol.running_pid() and not config.DIAG_LOG.exists():
+        print("No diagnostic to analyze. Start one with: rigdoctor diagnose start --game <name>")
+        return 1
+    print("Stopping capture and analyzing…\n")
+    result = diagnostic.finish(last_n=args.last)
+    from .render import render_health, render_summary
+
+    if result.game:
+        print(f"Diagnostic — {result.game}\n")
+    print(render_summary(result.summary, log_path=config.DIAG_LOG))
+    print("\n" + render_health(result.findings, title="Findings"))
+    return 0
+
+
+def cmd_wrap(args) -> int:
+    from .core import wrap
+
+    return wrap.run(args.command)
+
+
 def cmd_gameenv(args) -> int:
     from dataclasses import asdict
 
@@ -470,6 +548,7 @@ def build_parser() -> argparse.ArgumentParser:
     run_p = rec_sub.add_parser("run", help="run the capture loop in the foreground (systemd-friendly)")
     run_p.add_argument("-n", "--interval", type=float, default=None, help="sampling interval (s)")
     run_p.add_argument("-o", "--out", default=None, help="log file path")
+    run_p.add_argument("--game", default=None, help="tag the capture with a game name (M6/diagnose)")
     run_p.set_defaults(func=cmd_record_run)
 
     start_p = rec_sub.add_parser("start", help="start recording in the background")
@@ -519,6 +598,25 @@ def build_parser() -> argparse.ArgumentParser:
     env_p = sub.add_parser("gameenv", help="gaming environment checks (M6): flag stability/perf settings")
     env_p.add_argument("--json", action="store_true", help="output JSON instead of text")
     env_p.set_defaults(func=cmd_gameenv)
+
+    diag_p = sub.add_parser("diagnose", help="guided diagnostic: capture while gaming, then analyze")
+    diag_sub = diag_p.add_subparsers(dest="diagnose_cmd")
+    diag_start = diag_sub.add_parser("start", help="start a focused capture for a game")
+    diag_start.add_argument("--game", default=None, help="game name to focus on")
+    diag_start.add_argument("--appid", default=None, help="Steam appid to focus on (resolved to a name)")
+    diag_start.add_argument("-n", "--interval", type=float, default=None, help="sampling interval (s)")
+    diag_start.set_defaults(func=cmd_diagnose)
+    diag_sub.add_parser("status", help="show the in-progress diagnostic").set_defaults(func=cmd_diagnose)
+    diag_finish = diag_sub.add_parser("finish", help="stop the capture and analyze it")
+    diag_finish.add_argument("--last", type=int, default=10, help="recent samples to show")
+    diag_finish.set_defaults(func=cmd_diagnose)
+    diag_p.set_defaults(func=cmd_diagnose, diagnose_cmd=None, last=10)
+
+    wrap_p = sub.add_parser(
+        "wrap", help="run a game with automatic crash-capture (Steam launch option, D12)")
+    wrap_p.add_argument("command", nargs=argparse.REMAINDER,
+                        help="the game command — use `rigdoctor wrap %%command%%` in Steam")
+    wrap_p.set_defaults(func=cmd_wrap)
     return p
 
 
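Reviewer note: the parser shape added in this hunk (a nested `diagnose start|status|finish` group plus a `wrap` command that swallows the rest of the command line) can be tried in isolation. The sketch below is a trimmed, stand-alone approximation; `build_demo_parser` is a hypothetical name, and the handlers, `--appid`, and help strings are omitted on purpose.

```python
import argparse


def build_demo_parser() -> argparse.ArgumentParser:
    """Stand-alone approximation of the diagnose/wrap sub-commands (illustrative only)."""
    p = argparse.ArgumentParser(prog="rigdoctor")
    sub = p.add_subparsers(dest="cmd")

    # Nested group: rigdoctor diagnose {start,finish}
    diag = sub.add_parser("diagnose")
    diag_sub = diag.add_subparsers(dest="diagnose_cmd")
    start = diag_sub.add_parser("start")
    start.add_argument("--game", default=None)
    finish = diag_sub.add_parser("finish")
    finish.add_argument("--last", type=int, default=10)

    # REMAINDER hands everything after `wrap` to the wrapped game untouched,
    # including the game's own dash-prefixed options.
    wrap = sub.add_parser("wrap")
    wrap.add_argument("command", nargs=argparse.REMAINDER)
    return p


if __name__ == "__main__":
    args = build_demo_parser().parse_args(["wrap", "game.exe", "--fullscreen"])
    print(args.command)
```

Because `REMAINDER` is the only positional on `wrap`, the game's own flags are not interpreted by the wrapper's parser, which is what a Steam launch option of the form `rigdoctor wrap %command%` relies on.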
@@ -23,6 +23,12 @@ CONFIG_FILE = CONFIG_DIR / "config.toml"
 
 # Crash-capture logger (M3)
 LOG_FILE = LOG_DIR / "capture.jsonl"
+# Guided diagnostic (M6/D12): a focused capture writes here, separate from the always-on
+# crash log, so its report covers only that session's window.
+DIAG_LOG = LOG_DIR / "diagnostic.jsonl"
+# A crashed (unterminated, unacknowledged) diagnostic is preserved here when a new capture
+# starts, so auto-capture (the Steam wrapper) relaunching the game doesn't wipe it first.
+DIAG_CRASH = LOG_DIR / "diagnostic-crash.jsonl"
 STATUS_FILE = STATE_DIR / "recorder.json"
 PID_FILE = STATE_DIR / "recorder.pid"
 SPAWN_LOG = STATE_DIR / "recorder.out"
@@ -45,4 +45,23 @@ COMPONENTS: tuple[Component, ...] = (
         "libsecret", "Encrypted token storage", "Updates",
         "Store the update token in the OS keyring, encrypted", ("libsecret-tools",), "secret-tool",
     ),
+    Component(
+        "gamemode", "Feral GameMode", "Gaming",
+        "Auto-applies performance tweaks (CPU governor, scheduling) while a game runs",
+        ("gamemode",), "gamemoderun",
+    ),
+    Component(
+        "mangohud", "MangoHud", "Gaming",
+        "In-game overlay for FPS, frame times, and temperatures", ("mangohud",), "mangohud",
+    ),
+    Component(
+        "cpupower", "cpupower", "Gaming",
+        "Read/set the CPU frequency governor (e.g. performance for gaming)",
+        ("linux-tools-common", "linux-tools-generic"), "cpupower",
+    ),
 )
+
+
+def by_id(component_id: str) -> Component | None:
+    """Look up a catalog component by its id (None if unknown)."""
+    return next((c for c in COMPONENTS if c.id == component_id), None)
@@ -0,0 +1,187 @@
+"""Guided diagnostic session (SPEC §4 / ARCHITECTURE §7.1): orchestrate M3 + M4.
+
+The seed use case, one flow: **pick a game** → **focused crash-capture** scoped to that
+session (M3, tagged with the game) → on **finish**, **scan & analyze** (M4 health report)
+over the captured window + system logs → return a prioritized result. This is not a new
+module — it's a single shared callable so the CLI, GUI, and tray run the identical flow.
+
+The capture is **manually bracketed** (start/finish) for now; auto start/stop on game launch
+(the D12 wrapper/watcher) plugs in here later without changing the result shape.
+"""
+
+from __future__ import annotations
+
+import json
+import time
+from dataclasses import dataclass
+
+from .. import config
+from . import reccontrol
+from .crashlog import Summary, summarize
+from .health import CRITICAL, OK, WARNING, Finding
+
+_SEV_ORDER = {CRITICAL: 0, WARNING: 1, "info": 2, OK: 3}
+
+
+@dataclass
+class DiagnosticResult:
+    game: str | None
+    summary: Summary  # capture window: peak temps/power, events, last samples (M3)
+    findings: list[Finding]  # health findings: Xid/SMART/driver/etc. (M4)
+
+
+@dataclass
+class CrashInfo:
+    game: str | None
+    samples: int
+    when: float | None  # ts of the last captured sample (≈ when the freeze hit)
+    gpu_lost: bool
+
+
+def _clear_diag_log() -> None:
+    """Each diagnostic is a fresh focused capture — drop any previous session + segments."""
+    base = config.DIAG_LOG
+    for p in [base, *base.parent.glob(base.name + ".*")]:
+        try:
+            p.unlink()
+        except OSError:
+            pass
+
+
+def start(game: str | None = None, interval: float | None = None) -> int | None:
+    """Begin a focused capture, tagged with the game, into the dedicated diagnostic log.
+    Returns the pid, or None if a capture is already running."""
+    if reccontrol.running_pid():
+        return None
+    if _crash_from_log(config.DIAG_LOG):  # preserve an unanalyzed crash before overwriting it
+        try:
+            config.DIAG_LOG.replace(config.DIAG_CRASH)
+        except OSError:
+            pass
+    _clear_diag_log()
+    return reccontrol.start_background(interval=interval, out=str(config.DIAG_LOG), game=game)
+
+
+def is_running() -> bool:
+    return reccontrol.running_pid() is not None
+
+
+def active() -> dict | None:
+    """Status of the in-progress session (running flag, game, samples), or None if idle."""
+    if not is_running():
+        return None
+    return reccontrol.read_status()
+
+
+def _await_stopped(timeout: float = 6.0) -> None:
+    deadline = time.monotonic() + timeout
+    while reccontrol.running_pid() and time.monotonic() < deadline:
+        time.sleep(0.1)
+
+
+def _game_from_summary(summary: Summary) -> str | None:
+    """Recover the focused game from the log's 'game' event (survives a crash + reboot)."""
+    for _ts, kind, detail in reversed(summary.events):
+        if kind == "game" and detail:
+            return detail
+    return None
+
+
+def finish(last_n: int = 10, log_path=None) -> DiagnosticResult:
+    """Stop the capture (if running), summarize the window, and run the health report."""
+    from .health import run_health_checks
+
+    reccontrol.stop_background()
+    _await_stopped()
+    path = log_path or config.DIAG_LOG
+    summary = summarize(path, last_n=last_n)
+    game = _game_from_summary(summary) or (reccontrol.read_status() or {}).get("game")
+    findings = run_health_checks()
+    return DiagnosticResult(game=game, summary=summary, findings=findings)
+
+
+# --- hard-crash detection & post-crash analysis -----------------------------------
+
+def _crash_from_log(path) -> CrashInfo | None:
+    """CrashInfo if `path` holds an abnormally-ended session (start, no stop, not acked)."""
+    if not path.exists():
+        return None
+    summary = summarize(path)
+    kinds = {kind for _ts, kind, _detail in summary.events}
+    if "session-start" not in kinds:
+        return None
+    if "session-stop" in kinds or "diagnostic-acknowledged" in kinds:
+        return None
+    return CrashInfo(
+        game=_game_from_summary(summary),
+        samples=summary.samples,
+        when=summary.end,
+        gpu_lost="gpu-lost" in kinds,
+    )
+
+
+def _crash_path():
+    """Where the pending crash lives: the preserved archive if present, else the live log."""
+    return config.DIAG_CRASH if config.DIAG_CRASH.exists() else config.DIAG_LOG
+
+
+def pending_crash() -> CrashInfo | None:
+    """Detect a diagnostic that ended abnormally (no clean stop, no live recorder).
+
+    A focused capture writes `session-start` (+ `game`) and, on a clean stop, `session-stop`.
+    After a hard freeze that block never runs, so the log has a start with no stop and no
+    live recorder — that's our hard-crash signal. A crash preserved across an auto-relaunch
+    (`DIAG_CRASH`) is checked first. Returns None if a capture is running, none is recorded,
+    it stopped cleanly, or the user already acknowledged it.
+    """
+    info = _crash_from_log(config.DIAG_CRASH)  # preserved across a relaunch (wrapper)
+    if info is not None:
+        return info
+    if is_running():
+        return None
+    return _crash_from_log(config.DIAG_LOG)
+
+
+def acknowledge_crash() -> None:
+    """Mark the recorded crash as seen so it stops prompting."""
+    try:
+        config.DIAG_CRASH.unlink()  # drop the preserved archive, if any
+    except OSError:
+        pass
+    try:
+        config.DIAG_LOG.parent.mkdir(parents=True, exist_ok=True)
+        with open(config.DIAG_LOG, "a", encoding="utf-8") as fh:
+            fh.write(json.dumps({"ts": time.time(), "event": "diagnostic-acknowledged", "detail": ""}) + "\n")
+    except OSError:
+        pass
+
+
+def _crash_headline(summary: Summary) -> Finding:
+    gpu_lost = any(kind == "gpu-lost" for _ts, kind, _detail in summary.events)
+    when = time.strftime("%H:%M:%S", time.localtime(summary.end)) if summary.end else "?"
+    detail = (
+        f"The capture stopped abruptly at {when} after {summary.samples} samples, with no clean "
+        "shutdown recorded — consistent with a hard freeze or power loss."
+    )
+    if gpu_lost:
+        detail += " A GPU-lost event was captured during the session."
+    return Finding(
+        CRITICAL if gpu_lost else WARNING,
+        "Diagnostic",
+        "Session ended without a clean stop (likely a hard crash)",
+        detail,
+        "Review the last readings (Capture, above) and the crash-boot findings below.",
+    )
+
+
+def analyze_crash(last_n: int = 15) -> DiagnosticResult:
+    """Analyze a recorded hard crash: the captured window + the previous boot's kernel log
+    + the rest of the health report (SMART/driver/persistence/temps)."""
+    from .health import check_previous_boot, run_health_checks
+
+    summary = summarize(_crash_path(), last_n=last_n)
+    findings: list[Finding] = [_crash_headline(summary)]
+    findings += check_previous_boot()  # the crashed boot's kernel log
+    findings += run_health_checks(include_journal=False)  # SMART/driver/persistence/temps
+    findings.sort(key=lambda f: _SEV_ORDER.get(f.severity, 9))
+    return DiagnosticResult(game=_game_from_summary(summary), summary=summary, findings=findings)
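Reviewer note: the hard-crash rule in `_crash_from_log` (a `session-start` with neither a `session-stop` nor a `diagnostic-acknowledged` event) can be restated as a stand-alone sketch. It assumes each log line is a JSON object carrying an `event` key, which is the shape `acknowledge_crash` writes; the helper name `looks_like_crash`, the sample-row fields in the demo, and reading raw lines directly are illustrative assumptions, since the diff itself goes through `summarize`.

```python
import json


def looks_like_crash(lines) -> bool:
    """True if the log has a session-start but no session-stop or acknowledgement."""
    kinds = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # a torn final line is plausible after a hard freeze
        if isinstance(record, dict) and record.get("event"):
            kinds.add(record["event"])
    if "session-start" not in kinds:
        return False
    # Either marker means the session ended cleanly or was already reviewed.
    return not ({"session-stop", "diagnostic-acknowledged"} & kinds)


if __name__ == "__main__":
    log = [
        '{"ts": 1.0, "event": "session-start", "detail": ""}',
        '{"ts": 2.0, "gpu_temp": 70}',  # hypothetical sample row
        '{"ts": 3.0, "gpu_te',          # truncated write
    ]
    print(looks_like_crash(log))
```

Appending the acknowledgement event to the same log, as `acknowledge_crash` does, flips the result to `False`, so no separate state file is needed to stop the prompt.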
@@ -0,0 +1,177 @@
+"""Apply runtime-reversible system tunables (M6) — a limited, consent-gated exception to
+the read-only stance (D9, amended by D22).
+
+Only safe settings that take effect immediately, need no reboot, and revert on reboot are
+applyable here: CPU governor, NVIDIA persistence mode, PCIe ASPM policy, vm.swappiness, and
+Transparent HugePages. Each is set by a single privileged command (one pkexec prompt). The
+chosen value is validated against the live options before building the command, and writes go
+to sysfs / procfs (or `nvidia-smi`) — never the GRUB cmdline or a persistent config file.
+Riskier fixes (GRUB-based PCIe ASPM-off, CPU mitigations) stay suggestion-only.
+"""
+
+from __future__ import annotations
+
+import os
+import shlex
+import shutil
+import subprocess
+from collections.abc import Callable
+from dataclasses import dataclass
+from pathlib import Path
+
+
+@dataclass
+class Tunable:
+    id: str
+    label: str  # e.g. "CPU governor"
+    options: list[str]  # selectable values (live, from the system)
+    current: str | None  # the value in effect now (preselect this in the dropdown)
+    note: str = ""  # caveat shown by the control, e.g. "resets on reboot"
+
+
+def _read(path: str) -> str | None:
+    try:
+        return Path(path).read_text()
+    except OSError:
+        return None
+
+
+def _bracketed(text: str) -> tuple[list[str], str | None]:
+    """Parse a sysfs 'a [b] c' enum into (options, active)."""
+    options = [tok.strip("[]") for tok in text.split()]
+    active = next((tok.strip("[]") for tok in text.split() if tok.startswith("[")), None)
+    return options, active
+
+
+# --- individual tunables: a state reader + a command builder per id -------------------
+
+_GOV = "/sys/devices/system/cpu"
+
+
+def _cpu_governor() -> Tunable | None:
+    cur = _read(f"{_GOV}/cpu0/cpufreq/scaling_governor")
+    if cur is None:
+        return None
+    avail = _read(f"{_GOV}/cpu0/cpufreq/scaling_available_governors")
+    options = avail.split() if avail and avail.strip() else ["performance", "powersave", "schedutil"]
+    return Tunable("cpu_governor", "CPU governor", options, cur.strip(), "applies now; resets on reboot")
+
+
+def _cpu_governor_cmd(value: str) -> list[str]:
+    return ["/bin/sh", "-c",
+            f'for f in {_GOV}/cpu*/cpufreq/scaling_governor; do echo {shlex.quote(value)} > "$f"; done']
+
+
+def _nvidia_persistence() -> Tunable | None:
+    if shutil.which("nvidia-smi") is None:
+        return None
+    try:
+        proc = subprocess.run(
+            ["nvidia-smi", "--query-gpu=persistence_mode", "--format=csv,noheader"],
+            capture_output=True, text=True, timeout=10,
+        )
+    except (subprocess.SubprocessError, OSError):
+        return None
+    state = proc.stdout.strip().splitlines()[0].strip().lower() if proc.stdout.strip() else ""
+    current = "Enabled" if state.startswith("enabled") else ("Disabled" if state.startswith("disabled") else None)
+    return Tunable("nvidia_persistence", "NVIDIA persistence mode", ["Enabled", "Disabled"], current,
+                   "resets on reboot (enable nvidia-persistenced to persist)")
+
+
+def _nvidia_persistence_cmd(value: str) -> list[str]:
+    return ["nvidia-smi", "-pm", "1" if value == "Enabled" else "0"]
+
+
+def _pcie_aspm() -> Tunable | None:
+    text = _read("/sys/module/pcie_aspm/parameters/policy")
+    if not text:
+        return None
+    options, active = _bracketed(text)
+    return Tunable("pcie_aspm", "PCIe ASPM policy", options, active, "applies now; resets on reboot")
+
+
+def _pcie_aspm_cmd(value: str) -> list[str]:
+    return ["/bin/sh", "-c", f'echo {shlex.quote(value)} > /sys/module/pcie_aspm/parameters/policy']
+
+
+def _swappiness() -> Tunable | None:
+    text = _read("/proc/sys/vm/swappiness")
+    if text is None or not text.strip().isdigit():
+        return None
+    cur = text.strip()
+    options = ["0", "10", "30", "60", "100"]
+    if cur not in options:
+        options = sorted(set(options) | {cur}, key=int)
+    return Tunable("swappiness", "vm.swappiness", options, cur, "applies now; resets on reboot")
+
+
+def _swappiness_cmd(value: str) -> list[str]:
+    return ["/bin/sh", "-c", f'echo {shlex.quote(value)} > /proc/sys/vm/swappiness']
+
+
+def _thp() -> Tunable | None:
+    text = _read("/sys/kernel/mm/transparent_hugepage/enabled")
+    if not text:
+        return None
+    options, active = _bracketed(text)
+    return Tunable("thp", "Transparent HugePages", options, active, "applies now; resets on reboot")
+
+
+def _thp_cmd(value: str) -> list[str]:
+    return ["/bin/sh", "-c", f'echo {shlex.quote(value)} > /sys/kernel/mm/transparent_hugepage/enabled']
+
+
+_TUNABLES: dict[str, tuple[Callable[[], Tunable | None], Callable[[str], list[str]]]] = {
+    "cpu_governor": (_cpu_governor, _cpu_governor_cmd),
+    "nvidia_persistence": (_nvidia_persistence, _nvidia_persistence_cmd),
+    "pcie_aspm": (_pcie_aspm, _pcie_aspm_cmd),
+    "swappiness": (_swappiness, _swappiness_cmd),
+    "thp": (_thp, _thp_cmd),
+}
+
+
+# --- public API -----------------------------------------------------------------------
+
+def get_tunable(fix_id: str) -> Tunable | None:
+    """Live state (options + current value) for a fix id, or None if not applicable here."""
+    fns = _TUNABLES.get(fix_id)
+    return fns[0]() if fns else None
+
+
+def apply_command(fix_id: str, value: str) -> list[str] | None:
+    """The privileged command to set fix_id=value, or None if unknown/invalid.
+
+    The value is validated against the *live* options, so only a real, currently-available
+    setting can ever be turned into a command.
+    """
+    fns = _TUNABLES.get(fix_id)
+    if not fns:
+        return None
+    state = fns[0]()
+    if state is None or value not in state.options:
+        return None
+    return fns[1](value)
+
+
+def _elevate(cmd: list[str]) -> list[str]:
+    prog = shutil.which(cmd[0]) or cmd[0]  # pkexec needs an absolute program path
+    cmd = [prog, *cmd[1:]]
+    if os.geteuid() == 0:
+        return cmd
+    if shutil.which("pkexec"):
+        return ["pkexec", *cmd]
+    if shutil.which("sudo"):
+        return ["sudo", *cmd]
+    return cmd  # no escalation available — will likely fail, surfaced to the caller
+
+
+def apply(fix_id: str, value: str) -> tuple[int, str]:
+    """Apply fix_id=value via a single elevated command. Returns (exit_code, output)."""
+    cmd = apply_command(fix_id, value)
+    if cmd is None:
+        return (1, f"Unknown or unavailable setting: {fix_id}={value}")
+    try:
+        proc = subprocess.run(_elevate(cmd), capture_output=True, text=True, timeout=120)
+        return (proc.returncode, proc.stdout + proc.stderr)
+    except (subprocess.SubprocessError, OSError) as exc:
+        return (1, str(exc))
@@ -49,15 +49,18 @@ def evaluate_aspm(policy_text: str | None) -> Finding | None:
             WARNING, "PCIe", f"PCIe ASPM is in power-saving mode ({active})",
             "Aggressive PCIe Active-State Power Management can cause the GPU to drop off the "
             "bus under load (Xid 79) or stutter — the seed-case failure mode.",
-            "Disable ASPM via the kernel cmdline: add `pcie_aspm=off` (and optionally "
-            "`pcie_aspm.policy=performance`) in GRUB, then `sudo update-grub` and reboot.",
+            "Set the policy to performance below (live), or for a permanent change add "
+            "`pcie_aspm=off` in GRUB, then `sudo update-grub` and reboot.",
+            fix="pcie_aspm",
         )
     if active == "performance":
-        return Finding(OK, "PCIe", "PCIe ASPM set to performance", "ASPM power-saving is disabled.")
+        return Finding(OK, "PCIe", "PCIe ASPM set to performance", "ASPM power-saving is disabled.",
+                       fix="pcie_aspm")
     return Finding(
         INFO, "PCIe", f"PCIe ASPM policy: {active}",
         "ASPM is left to the kernel/BIOS default.",
-        "If you see GPU bus-drop events (Xid 79), try `pcie_aspm=off` on the kernel cmdline.",
+        "If you see GPU bus-drop events (Xid 79), set the policy to performance below.",
+        fix="pcie_aspm",
     )
 
 
@@ -84,11 +87,13 @@ def check_gpu_persistence() -> list[Finding]:
             INFO, "GPU", "NVIDIA persistence mode is off",
             "The driver unloads when no client is attached, adding latency on first GPU "
             "access and churning state between game launches.",
-            "Enable it: `sudo nvidia-smi -pm 1` (per-boot), or enable the "
-            "`nvidia-persistenced` service to make it permanent.",
+            "Enable it below (per-boot), or enable the `nvidia-persistenced` service to "
+            "make it permanent.",
+            fix="nvidia_persistence",
         )]
     if state.lower().startswith("enabled"):
-        return [Finding(OK, "GPU", "NVIDIA persistence mode on", "The driver stays resident.")]
+        return [Finding(OK, "GPU", "NVIDIA persistence mode on", "The driver stays resident.",
+                        fix="nvidia_persistence")]
     return []
 
 
@@ -99,18 +104,20 @@ def evaluate_governor(governors: set[str]) -> Finding | None:
         return None
     shown = ", ".join(sorted(governors))
     if governors == {"performance"}:
-        return Finding(OK, "CPU", "CPU governor: performance", "CPUs run at full clocks under load.")
+        return Finding(OK, "CPU", "CPU governor: performance", "CPUs run at full clocks under load.",
+                       fix="cpu_governor")
     if "powersave" in governors:
         return Finding(
             WARNING, "CPU", f"CPU governor set to power-saving ({shown})",
             "A powersave governor caps CPU frequency and can bottleneck frame times.",
-            "Set performance: `sudo cpupower frequency-set -g performance` "
-            "(install `linux-tools-common`/`cpupower`), or install GameMode to switch it per-game.",
+            "Set it to performance below (or install GameMode to switch it per-game).",
+            fix="cpu_governor",
         )
     return Finding(
         INFO, "CPU", f"CPU governor: {shown}",
         "A dynamic governor scales with load; usually fine.",
-        "For the most consistent frame pacing, `performance` (or GameMode) avoids ramp-up lag.",
+        "For the most consistent frame pacing, set performance below (or use GameMode).",
+        fix="cpu_governor",
     )
 
 
@@ -137,6 +144,7 @@ def check_gamemode() -> list[Finding]:
         "GameMode auto-applies performance tweaks (governor, scheduling) for the duration of a game.",
         "Install it: `sudo apt install gamemode`, then launch games with `gamemoderun %command%` "
         "(or use a global Steam launch option).",
+        action="gamemode",
     )]
 
 
@@ -147,6 +155,7 @@ def check_mangohud() -> list[Finding]:
         INFO, "Tools", "MangoHud not installed",
         "MangoHud overlays live FPS, frame times, and temps in-game — handy for spotting stutter.",
         "Install it: `sudo apt install mangohud`, then launch with `mangohud %command%`.",
+        action="mangohud",
     )]
 
 
@@ -158,9 +167,11 @@ def evaluate_swappiness(value: int) -> Finding:
             INFO, "Memory", f"vm.swappiness is high ({value})",
             "A high swappiness lets the kernel swap out memory eagerly, which can cause "
             "hitching during gaming on systems with ample RAM.",
-            "Lower it: `sudo sysctl vm.swappiness=10` (persist in /etc/sysctl.d/99-rigdoctor.conf).",
+            "Lower it below (e.g. 10); applies immediately.",
+            fix="swappiness",
         )
-    return Finding(OK, "Memory", f"vm.swappiness is {value}", "Swapping is conservative.")
+    return Finding(OK, "Memory", f"vm.swappiness is {value}", "Swapping is conservative.",
+                   fix="swappiness")
 
 
 def check_swappiness() -> list[Finding]:
@@ -204,7 +215,8 @@ def check_thp() -> list[Finding]:
         return [Finding(
             INFO, "Memory", "Transparent HugePages disabled (never)",
             "Some workloads benefit from THP; 'madvise' lets apps opt in without the downsides of 'always'.",
-            "Optional: `echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled`.",
+            "Optional: set 'madvise' below; applies immediately.",
+            fix="thp",
         )]
     return []
 
@@ -27,6 +27,8 @@ class Finding:
     title: str
     detail: str = ""
     suggestion: str = ""
+    action: str = ""  # optional: id of an installable catalog component (for an Install button)
+    fix: str = ""  # optional: id of an applyable runtime tunable (for an Apply dropdown, M6)
 
 
 # --- NVIDIA Xid knowledge (the seed crash is Xid 79) --------------------------
@@ -144,6 +146,22 @@ def check_journal() -> list[Finding]:
     return findings
 
 
+def check_previous_boot() -> list[Finding]:
+    """Scan the previous boot's kernel log — the boot that crashed — for fault signatures.
+
+    Needs persistent journald (else the crashed boot's logs were lost on reboot, which the
+    persistence check flags separately). Findings are framed as coming from that boot.
+    """
+    out = _journalctl(["-k", "-b", "-1", "--no-pager", "-o", "cat"])
+    if not out or not out.strip():
+        return []
+    tagged = []
+    for f in scan_journal_text(out):
+        detail = ("Logged during the previous (crashed) boot. " + (f.detail or "")).strip()
+        tagged.append(Finding(f.severity, f.category, f.title, detail, f.suggestion))
+    return tagged
+
+
 def check_journal_persistence() -> list[Finding]:
     if Path("/var/log/journal").is_dir():
         return []
@@ -233,16 +251,20 @@ def check_live_temps() -> list[Finding]:
     )]
 
 
-def run_health_checks() -> list[Finding]:
+def run_health_checks(include_journal: bool = True) -> list[Finding]:
     """Run all checks and return findings sorted by severity (worst first).
 
     SMART needs root; if the session collected it via launch elevation, use that
     instead of re-running smartctl (which would just report "needs root").
+
+    `include_journal=False` skips the 7-day kernel-journal scan — used by the crash
+    analysis, which scans the previous (crashed) boot specifically instead.
     """
     from . import elevation
 
     findings: list[Finding] = []
     findings += check_nvidia_driver()
-    findings += check_journal()
+    if include_journal:
+        findings += check_journal()
     findings += check_journal_persistence()
     priv = elevation.privileged()
|
|||||||
@@ -38,7 +38,9 @@ def read_status() -> dict | None:
         return None


-def start_background(interval: float | None = None, out: str | None = None) -> int | None:
+def start_background(
+    interval: float | None = None, out: str | None = None, game: str | None = None
+) -> int | None:
     """Spawn a detached `record run`. Returns the child pid, or None if already running."""
     if running_pid():
         return None
@@ -48,6 +50,8 @@ def start_background(interval: float | None = None, out: str | None = None) -> i
         cmd += ["--interval", str(interval)]
     if out:
         cmd += ["--out", out]
+    if game:
+        cmd += ["--game", game]
     out_fh = open(config.SPAWN_LOG, "a")
     proc = subprocess.Popen(
         cmd,
@@ -27,12 +27,14 @@ class Recorder:
         backups: int = 10,
         status_path=None,
         sampler: Sampler | None = None,
+        game: str | None = None,
     ) -> None:
         self.interval = interval
         self.sampler = sampler or Sampler(available_sources())
         self.writer = CrashLogWriter(log_path, max_bytes, backups)
         self.log_path = Path(log_path)
         self.status_path = Path(status_path) if status_path else None
+        self.game = game or None
         self.samples = 0
         self._stop = threading.Event()
         self._gpu_lost = False
@@ -43,6 +45,8 @@ class Recorder:

     def run(self) -> None:
         self.writer.write_event("session-start", f"interval={self.interval:g}s")
+        if self.game:
+            self.writer.write_event("game", self.game)  # tag the focused-diagnostic target
         self._write_status(running=True)
         try:
             while not self._stop.is_set():
@@ -81,6 +85,7 @@ class Recorder:
             "samples": self.samples,
             "updated": time.time(),
             "gpu_lost": self._gpu_lost,
+            "game": self.game,
         }
         if sample is not None:
             data["latest"] = headline(sample)
@@ -15,6 +15,8 @@ from __future__ import annotations

 import json
 import os
+import shutil
+import subprocess
 import time
 from dataclasses import asdict, dataclass
 from pathlib import Path
@@ -351,6 +353,24 @@ def acknowledge_new() -> None:

 # --- formatting -----------------------------------------------------------------------

+def launch_game(appid: str) -> bool:
+    """Best-effort: ask Steam to launch a game by appid (steam:// URL). Non-blocking."""
+    if not appid:
+        return False
+    url = f"steam://rungameid/{appid}"
+    for cmd in (["steam", url], ["xdg-open", url]):
+        if shutil.which(cmd[0]):
+            try:
+                subprocess.Popen(
+                    cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
+                    stdin=subprocess.DEVNULL, start_new_session=True,
+                )
+                return True
+            except (OSError, subprocess.SubprocessError):
+                continue
+    return False
+
+
 def human_size(num_bytes: int) -> str:
     if num_bytes <= 0:
         return "—"
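The fallback order in `launch_game` (try the `steam` binary first, then `xdg-open`) can be separated from the process spawn so it is easy to check on a machine without Steam. The sketch below is only an illustration of that selection step: `pick_launch_command` and its `which` parameter are hypothetical names, and the appid `570` in the usage note is just an example value.

```python
import shutil
from typing import Callable, List, Optional


def pick_launch_command(
    appid: str, which: Callable[[str], Optional[str]] = shutil.which
) -> Optional[List[str]]:
    """Return the first usable command for a steam:// launch URL, or None."""
    if not appid:
        return None
    url = f"steam://rungameid/{appid}"
    for cmd in (["steam", url], ["xdg-open", url]):
        if which(cmd[0]):  # first launcher found on PATH wins
            return cmd
    return None
```

For instance, `pick_launch_command("570", which=lambda name: None)` returns `None` because neither launcher is found, which corresponds to the `return False` path in the hunk above.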
@@ -0,0 +1,78 @@
+"""Steam-launch wrapper (D12): auto-bracket a focused diagnostic around a game.
+
+Set as a per-game Steam launch option — `rigdoctor wrap %command%` — or in Lutris/Heroic's
+wrapper field. Steam expands `%command%` to the real game command; we start a focused capture
+(tagged with the game), run the game, and stop the capture cleanly when it exits. A hard
+freeze means the game (and this wrapper) never returns, so the capture is left without a clean
+stop — which RigDoctor then flags as a crash on next launch.
+
+Deterministic and daemonless (D12 "build first"): no polling, and it knows the title.
+"""
+
+from __future__ import annotations
+
+import os
+import signal
+import subprocess
+import sys
+from pathlib import Path
+
+
+def game_name_from_env() -> str | None:
+    """The launching game's name, resolved from Steam's SteamAppId env var via the scan."""
+    appid = os.environ.get("SteamAppId") or os.environ.get("SteamGameId")
+    if not appid:
+        return None
+    from . import steam
+
+    games = steam.cached_games() or steam.scan_games(steam.selected_library_paths())
+    for game in games:
+        if game.appid == str(appid):
+            return game.name
+    return f"Steam app {appid}"
+
+
+def launch_option() -> str:
+    """The exact string to paste into Steam's Launch Options (absolute path → PATH-proof)."""
+    exe = Path(sys.executable).with_name("rigdoctor")
+    prog = str(exe) if exe.exists() else "rigdoctor"
+    quoted = f'"{prog}"' if " " in prog else prog
+    return f"{quoted} wrap %command%"
+
+
+def run(command: list[str]) -> int:
+    """Start a focused capture (unless one's already running), run the game, then stop it.
+    Returns the game's exit code so Steam sees the right status."""
+    from . import diagnostic, reccontrol
+
+    if not command:
+        print("usage: rigdoctor wrap %command% (set as a Steam launch option)", file=sys.stderr)
+        return 2
+
+    game = game_name_from_env() or os.path.basename(command[0])
+    started = False
+    if not reccontrol.running_pid():  # don't disturb an existing capture
+        started = diagnostic.start(game=game) is not None
+
+    proc: subprocess.Popen | None = None
+
+    def _forward(signum, _frame):  # pass Steam's stop signal to the game
+        if proc is not None and proc.poll() is None:
+            try:
+                proc.send_signal(signum)
+            except OSError:
+                pass
+
+    previous = {sig: signal.signal(sig, _forward) for sig in (signal.SIGTERM, signal.SIGINT)}
+    try:
+        proc = subprocess.Popen(command)
+        rc = proc.wait()
+    except (OSError, ValueError, subprocess.SubprocessError) as exc:
+        print(f"rigdoctor wrap: couldn't launch the game: {exc}", file=sys.stderr)
+        rc = 1
+    finally:
+        for sig, handler in previous.items():
+            signal.signal(sig, handler)
+        if started:
+            reccontrol.stop_background()  # clean stop → no false crash flag
+    return rc
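The core contract of `run` above (run the child, always perform the cleanup, hand back the child's exit status) can be tried in isolation. The sketch below is a simplified stand-in rather than the wrapper itself: `run_wrapped` and its `on_exit` callback are hypothetical names, and signal forwarding is deliberately left out.

```python
import subprocess
import sys
from typing import Callable, Sequence


def run_wrapped(command: Sequence[str], on_exit: Callable[[], None]) -> int:
    """Run `command`, call `on_exit` afterwards, and return the child's exit code."""
    try:
        rc = subprocess.Popen(list(command)).wait()
    except (OSError, ValueError, subprocess.SubprocessError) as exc:
        print(f"couldn't launch: {exc}", file=sys.stderr)
        rc = 1
    finally:
        on_exit()  # runs whether the launch succeeded or not
    return rc
```

Calling `run_wrapped([sys.executable, "-c", "import sys; sys.exit(3)"], cleanup)` returns `3` after `cleanup` has run, which mirrors how the wrapper passes the game's status back to Steam.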
@@ -17,19 +17,19 @@ from PySide6.QtWidgets import (

 from ..core.sample import Sample
 from ..render import metric_label
-from .widgets import Card, MetricBar, MetricRow, StatGauge
+from .widgets import Card, HistoryGraph, MetricBar, MetricRow

 _GROUP_ORDER = ["gpu", "cpu", "memory", "storage"]
 _GROUP_TITLES = {"gpu": "GPU", "cpu": "CPU", "memory": "Memory", "storage": "Storage"}
 _BAR_METRICS = {"util", "mem_util", "fan", "used_pct"}


-def _gauge_card(gauge: StatGauge) -> QFrame:
+def _tile_card(widget: QWidget) -> QFrame:
     card = QFrame()
     card.setObjectName("Card")
     layout = QVBoxLayout(card)
-    layout.setContentsMargins(6, 14, 6, 8)
-    layout.addWidget(gauge)
+    layout.setContentsMargins(6, 10, 6, 8)
+    layout.addWidget(widget)
     return card


@@ -54,16 +54,16 @@ class Dashboard(QWidget):
         header.addWidget(self._updated)
         root.addLayout(header)

-        # Headline gauges
-        self._g_gpu_temp = StatGauge("GPU Temp", "°C", 100, "temp")
-        self._g_gpu_load = StatGauge("GPU Load", "%", 100, "accent")
-        self._g_cpu_temp = StatGauge("CPU Temp", "°C", 100, "temp")
-        self._g_mem = StatGauge("Memory", "%", 100, "usage")
-        gauges = QHBoxLayout()
-        gauges.setSpacing(14)
+        # Headline trend graphs (history over the session, not just the live value)
+        self._g_gpu_temp = HistoryGraph("GPU Temp", "°C", 30, 100, "temp")
+        self._g_gpu_load = HistoryGraph("GPU Load", "%", 0, 100, "accent")
+        self._g_cpu_temp = HistoryGraph("CPU Temp", "°C", 30, 100, "temp")
+        self._g_mem = HistoryGraph("Memory", "%", 0, 100, "usage")
+        graphs = QHBoxLayout()
+        graphs.setSpacing(14)
         for g in (self._g_gpu_temp, self._g_gpu_load, self._g_cpu_temp, self._g_mem):
-            gauges.addWidget(_gauge_card(g))
-        root.addLayout(gauges)
+            graphs.addWidget(_tile_card(g))
+        root.addLayout(graphs)

         # Per-subsystem cards (scrollable, 2-column grid)
         scroll = QScrollArea()
@@ -81,10 +81,10 @@ class Dashboard(QWidget):
         root.addWidget(scroll, 1)

     def update_sample(self, sample: Sample) -> None:
-        self._g_gpu_temp.set_value(self._val(sample, "gpu", "temp", ""))
-        self._g_gpu_load.set_value(self._val(sample, "gpu", "util"))
-        self._g_cpu_temp.set_value(self._cpu_temp(sample))
-        self._g_mem.set_value(self._val(sample, "memory", "used_pct"))
+        self._g_gpu_temp.add_value(self._val(sample, "gpu", "temp", ""))
+        self._g_gpu_load.add_value(self._val(sample, "gpu", "util"))
+        self._g_cpu_temp.add_value(self._cpu_temp(sample))
+        self._g_mem.add_value(self._val(sample, "memory", "used_pct"))

         keys = [r.key for r in sample.readings]
         if keys != self._built_keys:  # sources appeared/disappeared
@@ -0,0 +1,81 @@
+"""Results view for a guided diagnostic session (M6/D12): capture summary + findings."""
+
+from __future__ import annotations
+
+from PySide6.QtCore import Qt
+from PySide6.QtGui import QFont
+from PySide6.QtWidgets import (
+    QDialog,
+    QFrame,
+    QHBoxLayout,
+    QLabel,
+    QPushButton,
+    QScrollArea,
+    QVBoxLayout,
+    QWidget,
+)
+
+from ..render import render_summary
+from .widgets import finding_card
+
+
+class DiagnosticDialog(QDialog):
+    def __init__(self, result, parent=None) -> None:
+        super().__init__(parent)
+        self.setWindowTitle(f"Diagnostic — {result.game}" if result.game else "Diagnostic")
+        self.resize(660, 680)
+
+        root = QVBoxLayout(self)
+        root.setContentsMargins(20, 18, 20, 16)
+        root.setSpacing(14)
+
+        title = QLabel(f"Diagnostic — {result.game}" if result.game else "Diagnostic")
+        title.setObjectName("PageTitle")
+        root.addWidget(title)
+
+        scroll = QScrollArea()
+        scroll.setWidgetResizable(True)
+        scroll.setFrameShape(QFrame.Shape.NoFrame)
+        scroll.setStyleSheet("background: transparent;")
+        body = QWidget()
+        col = QVBoxLayout(body)
+        col.setContentsMargins(0, 0, 0, 0)
+        col.setSpacing(10)
+        col.setAlignment(Qt.AlignmentFlag.AlignTop)
+
+        # Capture window summary (peaks / events / last samples) — monospace for the columns.
+        cap_head = QLabel("Capture")
+        cap_head.setStyleSheet("font-weight: 700; background: transparent;")
+        col.addWidget(cap_head)
+        summary = QLabel(render_summary(result.summary))
+        summary.setObjectName("Report")
+        summary.setFont(QFont("monospace"))
+        summary.setTextInteractionFlags(Qt.TextInteractionFlag.TextSelectableByMouse)
+        summary.setWordWrap(False)
+        summary.setStyleSheet(
+            "background: #0d0f13; color: #cfd3da; border: 1px solid #2a2f39; "
+            "border-radius: 8px; padding: 10px;"
+        )
+        col.addWidget(summary)
+
+        find_head = QLabel(f"Findings ({len(result.findings)})")
+        find_head.setStyleSheet("font-weight: 700; background: transparent;")
+        col.addWidget(find_head)
+        if result.findings:
+            for finding in result.findings:
+                col.addWidget(finding_card(finding))
+        else:
+            none = QLabel("No findings.")
+            none.setObjectName("Muted")
+            col.addWidget(none)
+
+        scroll.setWidget(body)
+        root.addWidget(scroll, 1)
+
+        buttons = QHBoxLayout()
+        buttons.addStretch(1)
+        close = QPushButton("Close")
+        close.setObjectName("PrimaryButton")
+        close.clicked.connect(self.accept)
+        buttons.addWidget(close)
+        root.addLayout(buttons)
@@ -19,13 +19,27 @@ from PySide6.QtWidgets import (
 from .widgets import finding_card


+def _fail_reason(out: str) -> str:
+    """Turn the failed command's output into a short, human reason."""
+    low = (out or "").lower()
+    if "not authorized" in low or "dismissed" in low or "authentication" in low:
+        return "cancelled at the password prompt"
+    if "operation not permitted" in low or "invalid argument" in low or "permission denied" in low:
+        return "the system rejected the change (it may be locked by BIOS/kernel)"
+    last = next((ln.strip() for ln in reversed((out or "").splitlines()) if ln.strip()), "")
+    return (last[:80] or "no privileges, or cancelled")
+
+
 class EnvironmentPage(QWidget):
     _result = Signal(object)  # list[Finding]
+    _action_done = Signal(object)  # (label, rc, output) — install or apply finished

     def __init__(self) -> None:
         super().__init__()
         self.setObjectName("Page")
         self._result.connect(self._render_findings)
+        self._action_done.connect(self._on_action_done)
+        self._busy = False

         root = QVBoxLayout(self)
         root.setContentsMargins(20, 18, 20, 18)
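To see how `_fail_reason` classifies command output, the function can be exercised on its own. The block below is a standalone copy (renamed `fail_reason` so it does not shadow the real one), and the sample strings in the usage note are illustrative inputs chosen for this sketch, not captured logs.

```python
def fail_reason(out: str) -> str:
    """Standalone copy of `_fail_reason` from the hunk above, for experimentation."""
    low = (out or "").lower()
    # Polkit-style refusals: the user closed or failed the password prompt.
    if "not authorized" in low or "dismissed" in low or "authentication" in low:
        return "cancelled at the password prompt"
    # The write itself was refused by the kernel or firmware.
    if "operation not permitted" in low or "invalid argument" in low or "permission denied" in low:
        return "the system rejected the change (it may be locked by BIOS/kernel)"
    # Otherwise fall back to the last non-empty output line, capped at 80 characters.
    last = next((ln.strip() for ln in reversed((out or "").splitlines()) if ln.strip()), "")
    return last[:80] or "no privileges, or cancelled"
```

With these rules, `fail_reason("tee: /sys/x: Operation not permitted")` maps to the "system rejected the change" message, while an empty string falls through to "no privileges, or cancelled".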
@@ -100,5 +114,43 @@ class EnvironmentPage(QWidget):
             f"{time.strftime('%H:%M:%S')}"
         )
         for finding in findings:
-            self._list.addWidget(finding_card(finding))
+            self._list.addWidget(finding_card(finding, on_install=self._install, on_apply=self._apply))
         self._list.addStretch(1)
+
+    def _install(self, component) -> None:
+        if self._busy:
+            return
+        self._busy = True
+        self._run_btn.setEnabled(False)
+        self._status.setText(f"Installing {component.name}… (may prompt for your password)")
+        threading.Thread(target=self._work_install, args=(component,), daemon=True).start()
+
+    def _work_install(self, component) -> None:
+        from ..core import installer
+
+        rc, out = installer.install_packages(list(component.apt))
+        self._action_done.emit((component.name, rc, out))
+
+    def _apply(self, fix_id: str, value: str) -> None:
+        if self._busy:
+            return
+        self._busy = True
+        self._run_btn.setEnabled(False)
+        self._status.setText(f"Applying {value}… (may prompt for your password)")
+        threading.Thread(target=self._work_apply, args=(fix_id, value), daemon=True).start()
+
+    def _work_apply(self, fix_id: str, value: str) -> None:
+        from ..core import fixes
+
+        rc, out = fixes.apply(fix_id, value)
+        self._action_done.emit((value, rc, out))
+
+    def _on_action_done(self, result) -> None:
+        label, rc, out = result
+        self._busy = False
+        if rc == 0:
+            self._status.setText(f"{label} applied — re-checking…")
+            self._run()  # re-run so the finding reflects the new state
+        else:
+            self._run_btn.setEnabled(True)
+            self._status.setText(f"'{label}' failed — {_fail_reason(out)}")
@@ -13,10 +13,14 @@ import time

 from PySide6.QtCore import Qt, QTimer, Signal
 from PySide6.QtWidgets import (
+    QApplication,
     QCheckBox,
+    QDialog,
     QFrame,
     QHBoxLayout,
     QLabel,
+    QLineEdit,
+    QMessageBox,
     QPushButton,
     QScrollArea,
     QVBoxLayout,
@@ -24,10 +28,11 @@ from PySide6.QtWidgets import (
 )

 from ..config import load_config, update_config
-from .theme import ACCENT, GOOD, MUTED
+from .diagnostic_dialog import DiagnosticDialog
+from .theme import ACCENT, GOOD, MUTED, WARN


-def _game_row(name: str, sublabel: str, size: str, is_new: bool) -> QFrame:
+def _game_row(name: str, sublabel: str, size: str, is_new: bool, appid: str = "", on_diagnose=None) -> QFrame:
     card = QFrame()
     card.setObjectName("Card")
     h = QHBoxLayout(card)
@@ -59,6 +64,13 @@ def _game_row(name: str, sublabel: str, size: str, is_new: bool) -> QFrame:
     size_label.setMinimumWidth(80)
     size_label.setAlignment(Qt.AlignmentFlag.AlignRight | Qt.AlignmentFlag.AlignVCenter)
     h.addWidget(size_label, 0)
+
+    if on_diagnose is not None:
+        diag_btn = QPushButton("Run Diagnostic")
+        diag_btn.setObjectName("ActionButton")
+        diag_btn.setCursor(Qt.CursorShape.PointingHandCursor)
+        diag_btn.clicked.connect(lambda: on_diagnose(name, appid))
+        h.addWidget(diag_btn, 0)
     return card


@@ -66,14 +78,17 @@ class GamesPage(QWidget):
     _libraries_ready = Signal(object)  # list[dict(path, label, count, selected)]
     _scanned = Signal(object)  # steam.ScanResult
     new_count_changed = Signal(int)  # newly-installed game count (for the nav badge)
+    _diag_done = Signal(object)  # DiagnosticResult — focused capture analyzed

     def __init__(self) -> None:
         super().__init__()
         self.setObjectName("Page")
         self._libraries_ready.connect(self._render_libraries)
         self._scanned.connect(self._render_games)
+        self._diag_done.connect(self._on_diag_done)
         self._busy = False
         self._new_appids: set[str] = set()
+        self._diag_game: str | None = None

         root = QVBoxLayout(self)
         root.setContentsMargins(20, 18, 20, 18)
@@ -87,12 +102,61 @@ class GamesPage(QWidget):
         self._status = QLabel("")
         self._status.setObjectName("Muted")
         header.addWidget(self._status)
+        self._autocap_btn = QPushButton("Auto-capture…")
+        self._autocap_btn.clicked.connect(self._show_autocapture)
+        header.addWidget(self._autocap_btn)
         self._rescan_btn = QPushButton("Rescan")
         self._rescan_btn.setObjectName("PrimaryButton")
         self._rescan_btn.clicked.connect(self.refresh)
         header.addWidget(self._rescan_btn)
         root.addLayout(header)

+        # In-progress diagnostic banner (hidden until a focused capture is running).
+        self._banner = QFrame()
+        self._banner.setObjectName("Card")
+        self._banner.setStyleSheet(f"#Card {{ border: 1px solid {ACCENT}; }}")
+        banner_h = QHBoxLayout(self._banner)
+        banner_h.setContentsMargins(16, 10, 16, 10)
+        banner_h.setSpacing(10)
+        self._banner_label = QLabel("")
+        self._banner_label.setWordWrap(True)
+        self._banner_label.setStyleSheet(f"color: {ACCENT}; font-weight: 700; background: transparent;")
+        banner_h.addWidget(self._banner_label, 1)
+        self._finish_btn = QPushButton("Finish && analyze")  # && → literal & (not a mnemonic)
+        self._finish_btn.setObjectName("ActionButton")
+        self._finish_btn.clicked.connect(self._finish_diagnostic)
+        banner_h.addWidget(self._finish_btn)
+        self._discard_btn = QPushButton("Discard")
+        self._discard_btn.clicked.connect(self._discard_diagnostic)
+        banner_h.addWidget(self._discard_btn)
+        self._banner.hide()
+        root.addWidget(self._banner)
+
+        # Hard-crash banner: a previous diagnostic ended without a clean stop.
+        self._crash_banner = QFrame()
+        self._crash_banner.setObjectName("Card")
+        self._crash_banner.setStyleSheet(f"#Card {{ border: 1px solid {WARN}; }}")
+        crash_h = QHBoxLayout(self._crash_banner)
+        crash_h.setContentsMargins(16, 10, 16, 10)
+        crash_h.setSpacing(10)
+        self._crash_label = QLabel("")
+        self._crash_label.setWordWrap(True)
+        self._crash_label.setStyleSheet(f"color: {WARN}; font-weight: 700; background: transparent;")
+        crash_h.addWidget(self._crash_label, 1)
+        self._analyze_btn = QPushButton("Analyze crash")
+        self._analyze_btn.setObjectName("ActionButton")
+        self._analyze_btn.clicked.connect(self._analyze_crash)
+        crash_h.addWidget(self._analyze_btn)
+        self._dismiss_btn = QPushButton("Dismiss")
+        self._dismiss_btn.clicked.connect(self._dismiss_crash)
+        crash_h.addWidget(self._dismiss_btn)
+        self._crash_banner.hide()
+        root.addWidget(self._crash_banner)
+
+        self._diag_timer = QTimer(self)
+        self._diag_timer.setInterval(1000)
+        self._diag_timer.timeout.connect(self._poll_diag)
+
         # Libraries (opt-in checkboxes)
         lib_card = QFrame()
         lib_card.setObjectName("Card")
@@ -126,6 +190,7 @@ class GamesPage(QWidget):

         self._load_cached()  # instant display from the last scan
         QTimer.singleShot(400, self.refresh)  # then rescan in the background on launch
+        self._check_crash()  # surface an interrupted (crashed) diagnostic

     # --- loading ----------------------------------------------------------------------

@@ -233,9 +298,188 @@ class GamesPage(QWidget):
                 os.path.basename(g.library.rstrip("/")) or g.library,
                 steam.human_size(g.size_bytes),
                 g.appid in new_appids,
+                appid=g.appid,
+                on_diagnose=self._start_diagnostic,
             ))
         self._list.addStretch(1)
+
+    # --- guided diagnostic (M6/D12) ---------------------------------------------------
+
+    def _start_diagnostic(self, name: str, appid: str = "") -> None:
+        from ..core import diagnostic, steam
+
+        if diagnostic.is_running():
+            QMessageBox.information(
+                self, "RigDoctor",
+                "A capture is already running — finish or discard it first.")
+            return
+
+        # Tell the user what the flow actually is, and offer to launch the game for them.
+        box = QMessageBox(self)
+        box.setIcon(QMessageBox.Icon.Information)
+        box.setWindowTitle(f"Run Diagnostic — {name}")
+        box.setText(f"Record a focused diagnostic while you play {name}?")
+        box.setInformativeText(
+            "RigDoctor will capture sensors in the background. Then:\n\n"
+            "1. Play the game and try to reproduce the freeze / black screen / crash.\n"
+            "2. When you're done — or after a hard freeze and reboot — come back here and "
+            "click “Finish & analyze”.\n\n"
+            "Your readings are saved continuously, so even a hard lock won't lose them."
+        )
+        launch_btn = box.addButton("Launch game && start", QMessageBox.ButtonRole.AcceptRole)
+        start_btn = box.addButton("Start without launching", QMessageBox.ButtonRole.ActionRole)
+        box.addButton("Cancel", QMessageBox.ButtonRole.RejectRole)
+        if not appid:
+            launch_btn.setEnabled(False)  # no appid → can't ask Steam to launch it
+        box.exec()
+        clicked = box.clickedButton()
+        if clicked not in (launch_btn, start_btn):
+            return
+
+        if diagnostic.start(game=name) is None:
+            QMessageBox.warning(self, "RigDoctor", "Couldn't start the capture.")
+            return
+        launched = steam.launch_game(appid) if clicked is launch_btn else False
+        self._diag_game = name
+        self._finish_btn.setEnabled(True)
+        self._discard_btn.setEnabled(True)
+        self._banner.show()
+        self._diag_timer.start()
+        self._poll_diag()
+        if clicked is launch_btn and not launched:
+            QMessageBox.information(
+                self, "RigDoctor",
+                "Recording started, but couldn't launch the game automatically — "
+                "launch it yourself, then click “Finish & analyze” when you're done.")
+
+    def _poll_diag(self) -> None:
+        from ..core import diagnostic
+
+        status = diagnostic.active()
+        if not status:
+            self._diag_timer.stop()  # recorder exited on its own
+            return
+        samples = status.get("samples", 0)
+        lost = " · ⚠ GPU-lost detected" if status.get("gpu_lost") else ""
+        game = status.get("game") or self._diag_game or "your game"
+        self._banner_label.setText(
+            f"● Recording {game} — play it and reproduce the problem, then click "
+            f"“Finish & analyze”. ({samples} samples{lost})"
+        )
+
+    def _finish_diagnostic(self) -> None:
+        self._diag_timer.stop()
+        self._finish_btn.setEnabled(False)
+        self._discard_btn.setEnabled(False)
+        self._banner_label.setText("Analyzing… (running the health report)")
+        threading.Thread(target=self._work_finish, daemon=True).start()
+
+    def _work_finish(self) -> None:
+        from ..core import diagnostic
+
+        try:
+            result = diagnostic.finish()
+        except Exception:
+            result = None
+        self._diag_done.emit(result)
+
+    def _on_diag_done(self, result) -> None:
+        self._banner.hide()
+        self._crash_banner.hide()
+        self._finish_btn.setEnabled(True)
+        self._discard_btn.setEnabled(True)
+        self._analyze_btn.setEnabled(True)
+        if result is None:
+            QMessageBox.warning(self, "RigDoctor", "The diagnostic couldn't be analyzed.")
+            return
+        DiagnosticDialog(result, self).exec()
+
+    def _discard_diagnostic(self) -> None:
+        from ..core import reccontrol
+
+        self._diag_timer.stop()
+        reccontrol.stop_background()
+        self._banner.hide()
+
+    def _show_autocapture(self) -> None:
+        from ..core import wrap
+
+        option = wrap.launch_option()
+        dlg = QDialog(self)
+        dlg.setWindowTitle("Auto-capture in Steam")
+        dlg.resize(580, 250)
+        v = QVBoxLayout(dlg)
+        v.setContentsMargins(20, 18, 20, 16)
+        v.setSpacing(12)
+        info = QLabel(
+            "Capture automatically every time you launch a game — no need to click "
+            "Run Diagnostic.\n\n"
+            "1. In Steam, right-click the game → Properties → Launch Options.\n"
+            "2. Paste the line below.\n\n"
+            "RigDoctor starts a focused capture when the game launches and stops it on exit. "
+            "If the game hard-freezes, you'll get a crash report next time you open RigDoctor."
+        )
+        info.setWordWrap(True)
+        v.addWidget(info)
+        row = QHBoxLayout()
+        field = QLineEdit(option)
+        field.setReadOnly(True)
+        row.addWidget(field, 1)
+        copy = QPushButton("Copy")
+        copy.setObjectName("PrimaryButton")
+        copy.clicked.connect(lambda: QApplication.clipboard().setText(option))
+        row.addWidget(copy)
+        v.addLayout(row)
+        buttons = QHBoxLayout()
+        buttons.addStretch(1)
+        close = QPushButton("Close")
+        close.clicked.connect(dlg.accept)
+        buttons.addWidget(close)
+        v.addLayout(buttons)
+        dlg.exec()
+
+    # --- hard-crash recovery ----------------------------------------------------------
+
+    def _check_crash(self) -> None:
+        from ..core import diagnostic
+
+        info = diagnostic.pending_crash()
+        if info is None:
+            self._crash_banner.hide()
+            return
+        game = info.game or "your last game"
+        extra = " · ⚠ GPU-lost was captured" if info.gpu_lost else ""
+        self._crash_label.setText(
+            f"⚠ Your last diagnostic for {game} ended unexpectedly — likely a hard crash "
+            f"({info.samples} samples{extra}). Analyze it to see the final readings and the "
+            f"likely cause from the system logs."
+        )
+        self._analyze_btn.setEnabled(True)
+        self._crash_banner.show()
+
|
def _analyze_crash(self) -> None:
|
||||||
|
from ..core import diagnostic
|
||||||
|
|
||||||
|
diagnostic.acknowledge_crash() # don't prompt again for this one
|
||||||
|
self._analyze_btn.setEnabled(False)
|
||||||
|
self._crash_label.setText("Analyzing the crash (final readings + system logs)…")
|
||||||
|
threading.Thread(target=self._work_analyze_crash, daemon=True).start()
|
||||||
|
|
||||||
|
def _work_analyze_crash(self) -> None:
|
||||||
|
from ..core import diagnostic
|
||||||
|
|
||||||
|
try:
|
||||||
|
result = diagnostic.analyze_crash()
|
||||||
|
except Exception:
|
||||||
|
result = None
|
||||||
|
self._diag_done.emit(result)
|
||||||
|
|
||||||
|
def _dismiss_crash(self) -> None:
|
||||||
|
from ..core import diagnostic
|
||||||
|
|
||||||
|
diagnostic.acknowledge_crash()
|
||||||
|
self._crash_banner.hide()
|
||||||
|
|
||||||
# --- nav badge integration --------------------------------------------------------
|
# --- nav badge integration --------------------------------------------------------
|
||||||
|
|
||||||
def showEvent(self, event) -> None: # noqa: N802 (Qt override)
|
def showEvent(self, event) -> None: # noqa: N802 (Qt override)
|
||||||
@@ -247,3 +491,15 @@ class GamesPage(QWidget):
|
|||||||
|
|
||||||
threading.Thread(target=steam.acknowledge_new, daemon=True).start()
|
threading.Thread(target=steam.acknowledge_new, daemon=True).start()
|
||||||
self.new_count_changed.emit(0)
|
self.new_count_changed.emit(0)
|
||||||
|
|
||||||
|
# Reflect a capture that's still running (e.g. started earlier, navigated back).
|
||||||
|
from ..core import diagnostic
|
||||||
|
|
||||||
|
if diagnostic.is_running():
|
||||||
|
status = diagnostic.active() or {}
|
||||||
|
self._diag_game = status.get("game") or self._diag_game
|
||||||
|
self._banner.show()
|
||||||
|
if not self._diag_timer.isActive():
|
||||||
|
self._diag_timer.start()
|
||||||
|
else:
|
||||||
|
self._check_crash() # re-surface an interrupted diagnostic if one is pending
|
||||||
|
|||||||
@@ -0,0 +1,150 @@
|
|||||||
|
"""Inventory page (M5 in the GUI): system inventory with copy/save + admin re-collect."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import os
|
||||||
|
import threading
|
||||||
|
|
||||||
|
from PySide6.QtCore import Qt, QTimer, Signal
|
||||||
|
from PySide6.QtWidgets import (
|
||||||
|
QApplication,
|
||||||
|
QFileDialog,
|
||||||
|
QFrame,
|
||||||
|
QGridLayout,
|
||||||
|
QHBoxLayout,
|
||||||
|
QLabel,
|
||||||
|
QPushButton,
|
||||||
|
QScrollArea,
|
||||||
|
QVBoxLayout,
|
||||||
|
QWidget,
|
||||||
|
)
|
||||||
|
|
||||||
|
from ..core import inventory
|
||||||
|
|
||||||
|
|
||||||
|
def _section_card(section) -> QFrame:
|
||||||
|
card = QFrame()
|
||||||
|
card.setObjectName("Card")
|
||||||
|
layout = QVBoxLayout(card)
|
||||||
|
layout.setContentsMargins(16, 12, 16, 12)
|
||||||
|
layout.setSpacing(6)
|
||||||
|
title = QLabel(section.title)
|
||||||
|
title.setStyleSheet("font-weight: 700; background: transparent;")
|
||||||
|
layout.addWidget(title)
|
||||||
|
grid = QGridLayout()
|
||||||
|
grid.setColumnStretch(1, 1)
|
||||||
|
grid.setHorizontalSpacing(14)
|
||||||
|
grid.setVerticalSpacing(4)
|
||||||
|
for row, (key, value) in enumerate(section.items):
|
||||||
|
k = QLabel(key)
|
||||||
|
k.setObjectName("Muted")
|
||||||
|
v = QLabel(value)
|
||||||
|
v.setWordWrap(True)
|
||||||
|
v.setStyleSheet("background: transparent;")
|
||||||
|
grid.addWidget(k, row, 0)
|
||||||
|
grid.addWidget(v, row, 1)
|
||||||
|
layout.addLayout(grid)
|
||||||
|
return card
|
||||||
|
|
||||||
|
|
||||||
|
class InventoryPage(QWidget):
|
||||||
|
_result = Signal(object) # list[Section]
|
||||||
|
|
||||||
|
def __init__(self) -> None:
|
||||||
|
super().__init__()
|
||||||
|
self.setObjectName("Page")
|
||||||
|
self._sections: list = []
|
||||||
|
self._result.connect(self._render)
|
||||||
|
|
||||||
|
root = QVBoxLayout(self)
|
||||||
|
root.setContentsMargins(20, 18, 20, 18)
|
||||||
|
root.setSpacing(16)
|
||||||
|
|
||||||
|
header = QHBoxLayout()
|
||||||
|
title = QLabel("Inventory")
|
||||||
|
title.setObjectName("PageTitle")
|
||||||
|
header.addWidget(title)
|
||||||
|
header.addStretch(1)
|
||||||
|
self._status = QLabel("")
|
||||||
|
self._status.setObjectName("Muted")
|
||||||
|
header.addWidget(self._status)
|
||||||
|
self._copy_btn = QPushButton("Copy Markdown")
|
||||||
|
self._copy_btn.clicked.connect(self._copy)
|
||||||
|
header.addWidget(self._copy_btn)
|
||||||
|
self._save_btn = QPushButton("Save…")
|
||||||
|
self._save_btn.clicked.connect(self._save)
|
||||||
|
header.addWidget(self._save_btn)
|
||||||
|
self._refresh_btn = QPushButton("Refresh")
|
||||||
|
self._refresh_btn.setObjectName("PrimaryButton")
|
||||||
|
self._refresh_btn.clicked.connect(self._run)
|
||||||
|
header.addWidget(self._refresh_btn)
|
||||||
|
root.addLayout(header)
|
||||||
|
|
||||||
|
self._scroll = scroll = QScrollArea()
|
||||||
|
scroll.setWidgetResizable(True)
|
||||||
|
scroll.setFrameShape(QFrame.Shape.NoFrame)
|
||||||
|
scroll.setStyleSheet("background: transparent;")
|
||||||
|
self._container = QWidget()
|
||||||
|
self._list = QVBoxLayout(self._container)
|
||||||
|
self._list.setContentsMargins(0, 0, 0, 0)
|
||||||
|
self._list.setSpacing(12)
|
||||||
|
self._list.setAlignment(Qt.AlignmentFlag.AlignTop)
|
||||||
|
scroll.setWidget(self._container)
|
||||||
|
root.addWidget(scroll, 1)
|
||||||
|
|
||||||
|
QTimer.singleShot(300, self._run)
|
||||||
|
|
||||||
|
def _run(self) -> None:
|
||||||
|
self._busy("Collecting…")
|
||||||
|
threading.Thread(target=self._work, daemon=True).start()
|
||||||
|
|
||||||
|
def _work(self) -> None:
|
||||||
|
try:
|
||||||
|
sections = inventory.collect()
|
||||||
|
except Exception:
|
||||||
|
sections = None  # collection failed: _render() checks for None and keeps the current view
|
||||||
|
self._result.emit(sections)
|
||||||
|
|
||||||
|
def _busy(self, text: str) -> None:
|
||||||
|
self._status.setText(text)
|
||||||
|
for b in (self._refresh_btn, self._copy_btn, self._save_btn):
|
||||||
|
b.setEnabled(False)
|
||||||
|
|
||||||
|
def _render(self, sections) -> None:
|
||||||
|
self._refresh_btn.setEnabled(True)
|
||||||
|
self._copy_btn.setEnabled(True)
|
||||||
|
self._save_btn.setEnabled(True)
|
||||||
|
if sections is None: # collection failed — keep current
|
||||||
|
self._status.setText("collection failed")
|
||||||
|
return
|
||||||
|
if sections == self._sections: # unchanged — don't rebuild (would jump scroll)
|
||||||
|
self._status.setText("")
|
||||||
|
return
|
||||||
|
|
||||||
|
scroll_pos = self._scroll.verticalScrollBar().value()
|
||||||
|
self._sections = sections
|
||||||
|
while self._list.count():
|
||||||
|
item = self._list.takeAt(0)
|
||||||
|
w = item.widget()
|
||||||
|
if w is not None:
|
||||||
|
w.deleteLater()
|
||||||
|
for section in sections:
|
||||||
|
self._list.addWidget(_section_card(section))
|
||||||
|
self._list.addStretch(1)
|
||||||
|
self._status.setText("")
|
||||||
|
# restore scroll after the layout settles so re-renders don't yank to the top
|
||||||
|
QTimer.singleShot(0, lambda: self._scroll.verticalScrollBar().setValue(scroll_pos))
|
||||||
|
|
||||||
|
def _copy(self) -> None:
|
||||||
|
if self._sections:
|
||||||
|
QApplication.clipboard().setText(inventory.render_markdown(self._sections))
|
||||||
|
self._status.setText("copied as Markdown")
|
||||||
|
|
||||||
|
def _save(self) -> None:
|
||||||
|
if not self._sections:
|
||||||
|
return
|
||||||
|
path, _ = QFileDialog.getSaveFileName(self, "Save inventory", "rigdoctor-inventory.md", "Markdown (*.md)")
|
||||||
|
if path:
|
||||||
|
with open(path, "w", encoding="utf-8") as f:
|
||||||
|
f.write(inventory.render_markdown(self._sections))
|
||||||
|
self._status.setText(f"saved {os.path.basename(path)}")
|
||||||
@@ -2,6 +2,7 @@
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import html
|
||||||
import os
|
import os
|
||||||
import sys
|
import sys
|
||||||
import threading
|
import threading
|
||||||
@@ -31,14 +32,15 @@ from .dashboard import Dashboard
|
|||||||
from .environment_page import EnvironmentPage
|
from .environment_page import EnvironmentPage
|
||||||
from .games_page import GamesPage
|
from .games_page import GamesPage
|
||||||
from .health_page import HealthPage
|
from .health_page import HealthPage
|
||||||
|
from .inventory_page import InventoryPage
|
||||||
from .notifications_page import NotificationsPage
|
from .notifications_page import NotificationsPage
|
||||||
from .recorder_page import RecorderPage
|
from .recorder_page import RecorderPage
|
||||||
from .setup_page import SetupPage
|
from .setup_page import SetupPage
|
||||||
from .share_page import SharePage
|
from .share_page import SharePage
|
||||||
from .theme import ACCENT, GOOD, MUTED
|
from .theme import ACCENT, CRIT, GOOD, MUTED, TEXT
|
||||||
from .worker import SamplerWorker
|
from .worker import SamplerWorker
|
||||||
|
|
||||||
_NAV_ITEMS = ["Dashboard", "Logs", "Health", "Games", "Environment", "Setup", "Notifications", "Share"]
|
_NAV_ITEMS = ["Dashboard", "Logs", "Health", "Games", "Environment", "Inventory", "Setup", "Notifications", "Share"]
|
||||||
|
|
||||||
|
|
||||||
class MainWindow(QMainWindow):
|
class MainWindow(QMainWindow):
|
||||||
@@ -71,6 +73,7 @@ class MainWindow(QMainWindow):
|
|||||||
self.games_page = GamesPage()
|
self.games_page = GamesPage()
|
||||||
self.games_page.new_count_changed.connect(self._set_games_badge)
|
self.games_page.new_count_changed.connect(self._set_games_badge)
|
||||||
self.environment_page = EnvironmentPage()
|
self.environment_page = EnvironmentPage()
|
||||||
|
self.inventory_page = InventoryPage()
|
||||||
self.setup_page = SetupPage()
|
self.setup_page = SetupPage()
|
||||||
self.notifications_page = NotificationsPage()
|
self.notifications_page = NotificationsPage()
|
||||||
self.notifications_page.changed.connect(self._apply_alert_settings)
|
self.notifications_page.changed.connect(self._apply_alert_settings)
|
||||||
@@ -80,9 +83,10 @@ class MainWindow(QMainWindow):
|
|||||||
self._stack.addWidget(self.health_page) # 2 Health
|
self._stack.addWidget(self.health_page) # 2 Health
|
||||||
self._stack.addWidget(self.games_page) # 3 Games
|
self._stack.addWidget(self.games_page) # 3 Games
|
||||||
self._stack.addWidget(self.environment_page) # 4 Environment
|
self._stack.addWidget(self.environment_page) # 4 Environment
|
||||||
self._stack.addWidget(self.setup_page) # 5 Setup
|
self._stack.addWidget(self.inventory_page) # 5 Inventory
|
||||||
self._stack.addWidget(self.notifications_page) # 6 Notifications
|
self._stack.addWidget(self.setup_page) # 6 Setup
|
||||||
self._stack.addWidget(self.share_page) # 7 Share
|
self._stack.addWidget(self.notifications_page) # 7 Notifications
|
||||||
|
self._stack.addWidget(self.share_page) # 8 Share
|
||||||
content_layout.addWidget(self._stack)
|
content_layout.addWidget(self._stack)
|
||||||
|
|
||||||
layout.addWidget(self._build_sidebar())
|
layout.addWidget(self._build_sidebar())
|
||||||
@@ -124,6 +128,14 @@ class MainWindow(QMainWindow):
|
|||||||
self._update_timer.timeout.connect(self._start_update_check)
|
self._update_timer.timeout.connect(self._start_update_check)
|
||||||
self._update_timer.start()
|
self._update_timer.start()
|
||||||
|
|
||||||
|
# Reflect any capture (manual, diagnostic, or the Steam wrapper) in the sidebar on
|
||||||
|
# every page, so it's always clear when RigDoctor is recording and for which game.
|
||||||
|
self._rec_timer = QTimer(self)
|
||||||
|
self._rec_timer.setInterval(1500)
|
||||||
|
self._rec_timer.timeout.connect(self._update_recording)
|
||||||
|
self._rec_timer.start()
|
||||||
|
self._update_recording()
|
||||||
|
|
||||||
def _build_sidebar(self) -> QFrame:
|
def _build_sidebar(self) -> QFrame:
|
||||||
bar = QFrame()
|
bar = QFrame()
|
||||||
bar.setObjectName("Sidebar")
|
bar.setObjectName("Sidebar")
|
||||||
@@ -138,6 +150,17 @@ class MainWindow(QMainWindow):
|
|||||||
subtitle.setObjectName("AppSubtitle")
|
subtitle.setObjectName("AppSubtitle")
|
||||||
v.addWidget(title)
|
v.addWidget(title)
|
||||||
v.addWidget(subtitle)
|
v.addWidget(subtitle)
|
||||||
|
|
||||||
|
# Global recording indicator — visible on every page while a capture runs.
|
||||||
|
self._rec_indicator = QLabel()
|
||||||
|
self._rec_indicator.setWordWrap(True)
|
||||||
|
self._rec_indicator.setTextFormat(Qt.TextFormat.RichText)
|
||||||
|
self._rec_indicator.setStyleSheet(
|
||||||
|
f"background: #241316; border: 1px solid {CRIT}; border-radius: 8px; padding: 8px 10px;"
|
||||||
|
)
|
||||||
|
self._rec_indicator.hide()
|
||||||
|
v.addSpacing(12)
|
||||||
|
v.addWidget(self._rec_indicator)
|
||||||
v.addSpacing(18)
|
v.addSpacing(18)
|
||||||
|
|
||||||
group = QButtonGroup(self)
|
group = QButtonGroup(self)
|
||||||
@@ -234,9 +257,26 @@ class MainWindow(QMainWindow):
|
|||||||
self._elevated.emit()
|
self._elevated.emit()
|
||||||
|
|
||||||
def _on_elevated(self) -> None:
|
def _on_elevated(self) -> None:
|
||||||
# Re-run Health now that root-only SMART data is available. (dmidecode is still
|
# Re-run Health + Inventory now that root-only data is available (SMART for Health,
|
||||||
# collected and used by the relay guest view + the CLI `rigdoctor inventory`.)
|
# dmidecode motherboard/BIOS/RAM for Inventory).
|
||||||
self.health_page._run()
|
self.health_page._run()
|
||||||
|
self.inventory_page._run()
|
||||||
|
|
||||||
|
def _update_recording(self) -> None:
|
||||||
|
from ..core import diagnostic
|
||||||
|
|
||||||
|
status = diagnostic.active()
|
||||||
|
if not status:
|
||||||
|
self._rec_indicator.hide()
|
||||||
|
return
|
||||||
|
game = status.get("game")
|
||||||
|
lines = [f"<span style='color:{CRIT};'>●</span> <b style='color:{TEXT};'>Recording</b>"]
|
||||||
|
if game:
|
||||||
|
lines.append(f"<span style='color:{TEXT};'>{html.escape(str(game))}</span>")
|
||||||
|
if status.get("gpu_lost"):
|
||||||
|
lines.append(f"<span style='color:{CRIT};'>⚠ GPU-lost</span>")
|
||||||
|
self._rec_indicator.setText("<br>".join(lines))
|
||||||
|
self._rec_indicator.show()
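The badge is rendered as rich text, so the game name is passed through `html.escape` before interpolation. The following is a minimal standalone sketch of that assembly, assuming a plain `dict` status. `badge_html` is a hypothetical helper name, and the `CRIT` and `TEXT` values are placeholder hex strings, not the theme's real colors.

```python
from __future__ import annotations

import html

CRIT = "#ff5c5c"  # placeholder color, not the real theme value
TEXT = "#e6e9ef"  # placeholder color, not the real theme value


def badge_html(status: dict | None) -> str | None:
    """Return the rich-text badge for an active capture, or None when idle."""
    if not status:
        return None
    lines = [f"<span style='color:{CRIT};'>●</span> <b style='color:{TEXT};'>Recording</b>"]
    game = status.get("game")
    if game:
        # Escape so a title containing '<' or '&' cannot inject markup into the label.
        lines.append(f"<span style='color:{TEXT};'>{html.escape(str(game))}</span>")
    if status.get("gpu_lost"):
        lines.append(f"<span style='color:{CRIT};'>⚠ GPU-lost</span>")
    return "<br>".join(lines)
```

Returning `None` for the idle case mirrors the early `hide()` branch, so a caller could decide between hiding and showing the label from the return value alone.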
|
||||||
|
|
||||||
def _set_games_badge(self, count: int) -> None:
|
def _set_games_badge(self, count: int) -> None:
|
||||||
btn = self._nav_buttons.get("Games")
|
btn = self._nav_buttons.get("Games")
|
||||||
|
|||||||
@@ -104,6 +104,15 @@ QPushButton#PrimaryButton {{ background: {ACCENT}; color: #06222e; border: none;
|
|||||||
QPushButton#PrimaryButton:hover {{ background: #5cc8fb; }}
|
QPushButton#PrimaryButton:hover {{ background: #5cc8fb; }}
|
||||||
QPushButton#PrimaryButton:disabled {{ background: #27424f; color: #5f7c8a; }}
|
QPushButton#PrimaryButton:disabled {{ background: #27424f; color: #5f7c8a; }}
|
||||||
|
|
||||||
|
/* Inline per-finding action buttons (Install / Apply). Outlined: bright accent text on the
|
||||||
|
dark card so it stays readable regardless of fill painting; fills accent on hover. */
|
||||||
|
QPushButton#ActionButton {{
|
||||||
|
background: transparent; color: {ACCENT}; border: 1px solid {ACCENT};
|
||||||
|
border-radius: 8px; padding: 6px 16px; font-weight: 700; min-height: 18px;
|
||||||
|
}}
|
||||||
|
QPushButton#ActionButton:hover {{ background: {ACCENT}; color: #06222e; }}
|
||||||
|
QPushButton#ActionButton:disabled {{ color: {MUTED}; border-color: {CARD_BORDER}; }}
|
||||||
|
|
||||||
QDoubleSpinBox, QSpinBox {{
|
QDoubleSpinBox, QSpinBox {{
|
||||||
background: #262b34; color: {TEXT}; border: 1px solid {CARD_BORDER};
|
background: #262b34; color: {TEXT}; border: 1px solid {CARD_BORDER};
|
||||||
border-radius: 6px; padding: 4px 6px;
|
border-radius: 6px; padding: 4px 6px;
|
||||||
@@ -150,4 +159,13 @@ QLineEdit:focus, QPlainTextEdit:focus, QAbstractSpinBox:focus, QComboBox:focus {
|
|||||||
border: 1px solid {ACCENT};
|
border: 1px solid {ACCENT};
|
||||||
}}
|
}}
|
||||||
QLineEdit:disabled, QPlainTextEdit:disabled, QAbstractSpinBox:disabled {{ color: {MUTED}; }}
|
QLineEdit:disabled, QPlainTextEdit:disabled, QAbstractSpinBox:disabled {{ color: {MUTED}; }}
|
||||||
|
|
||||||
|
/* The combo-box drop-down list is a separate popup view — unstyled it renders
|
||||||
|
light-on-light (same Fusion trap as the closed control above). */
|
||||||
|
QComboBox QAbstractItemView {{
|
||||||
|
background: {CARD}; color: {TEXT};
|
||||||
|
border: 1px solid {CARD_BORDER}; outline: 0;
|
||||||
|
selection-background-color: {ACCENT}; selection-color: #06222e;
|
||||||
|
}}
|
||||||
|
QComboBox QAbstractItemView::item {{ padding: 5px 8px; min-height: 22px; }}
|
||||||
"""
|
"""
|
||||||
|
|||||||
@@ -2,9 +2,12 @@
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
from PySide6.QtCore import QRectF, Qt
|
from collections import deque
|
||||||
from PySide6.QtGui import QColor, QFont, QPainter, QPen
|
|
||||||
|
from PySide6.QtCore import QPointF, QRectF, Qt
|
||||||
|
from PySide6.QtGui import QColor, QFont, QPainter, QPainterPath, QPen
|
||||||
from PySide6.QtWidgets import (
|
from PySide6.QtWidgets import (
|
||||||
|
QComboBox,
|
||||||
QFrame,
|
QFrame,
|
||||||
QHBoxLayout,
|
QHBoxLayout,
|
||||||
QLabel,
|
QLabel,
|
||||||
@@ -16,7 +19,19 @@ from PySide6.QtWidgets import (
|
|||||||
|
|
||||||
from ..core.sample import Reading
|
from ..core.sample import Reading
|
||||||
from ..render import format_value
|
from ..render import format_value
|
||||||
from .theme import ACCENT, CRIT, GOOD, MUTED, TEXT, TRACK, WARN, gauge_color, temp_color
|
from .theme import (
|
||||||
|
ACCENT,
|
||||||
|
CRIT,
|
||||||
|
GOOD,
|
||||||
|
MUTED,
|
||||||
|
TEMP_WARN,
|
||||||
|
TEXT,
|
||||||
|
TRACK,
|
||||||
|
USAGE_WARN,
|
||||||
|
WARN,
|
||||||
|
gauge_color,
|
||||||
|
temp_color,
|
||||||
|
)
|
||||||
|
|
||||||
_SEV = {
|
_SEV = {
|
||||||
"critical": ("CRITICAL", CRIT),
|
"critical": ("CRITICAL", CRIT),
|
||||||
@@ -26,8 +41,17 @@ _SEV = {
|
|||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
def finding_card(finding) -> QFrame:
|
def finding_card(finding, on_install=None, on_apply=None) -> QFrame:
|
||||||
"""A card for one M4/M6 Finding (severity-colored title, detail, suggested fix)."""
|
"""A card for one M4/M6 Finding (severity-colored title, detail, suggested fix).
|
||||||
|
|
||||||
|
If the finding names an installable catalog component (``finding.action``) and an
|
||||||
|
``on_install(component)`` callback is given, an "Install" button is shown — so a
|
||||||
|
"tool not installed" finding becomes one click instead of a copy-pasted apt command.
|
||||||
|
|
||||||
|
If the finding names a runtime tunable (``finding.fix``) and an ``on_apply(fix_id,
|
||||||
|
value)`` callback is given, a dropdown of the live options + an Apply button is shown
|
||||||
|
(M6 live fixes — D22).
|
||||||
|
"""
|
||||||
label, color = _SEV.get(finding.severity, ("?", MUTED))
|
label, color = _SEV.get(finding.severity, ("?", MUTED))
|
||||||
card = QFrame()
|
card = QFrame()
|
||||||
card.setObjectName("Card")
|
card.setObjectName("Card")
|
||||||
@@ -50,9 +74,65 @@ def finding_card(finding) -> QFrame:
|
|||||||
suggestion.setStyleSheet(f"color: {ACCENT}; background: transparent;")
|
suggestion.setStyleSheet(f"color: {ACCENT}; background: transparent;")
|
||||||
suggestion.setWordWrap(True)
|
suggestion.setWordWrap(True)
|
||||||
v.addWidget(suggestion)
|
v.addWidget(suggestion)
|
||||||
|
|
||||||
|
component = _installable_component(finding) if on_install else None
|
||||||
|
if component is not None:
|
||||||
|
row = QHBoxLayout()
|
||||||
|
row.addStretch(1)
|
||||||
|
btn = QPushButton(f"Install {component.name}")
|
||||||
|
btn.setObjectName("ActionButton")
|
||||||
|
btn.setCursor(Qt.CursorShape.PointingHandCursor)
|
||||||
|
btn.clicked.connect(lambda: on_install(component))
|
||||||
|
row.addWidget(btn)
|
||||||
|
v.addLayout(row)
|
||||||
|
|
||||||
|
tunable = _tunable(finding) if on_apply else None
|
||||||
|
if tunable is not None and tunable.options:
|
||||||
|
row = QHBoxLayout()
|
||||||
|
name = QLabel(f"{tunable.label}:")
|
||||||
|
name.setObjectName("Muted")
|
||||||
|
combo = QComboBox()
|
||||||
|
combo.addItems(tunable.options)
|
||||||
|
if tunable.current in tunable.options:
|
||||||
|
combo.setCurrentText(tunable.current)
|
||||||
|
combo.setCursor(Qt.CursorShape.PointingHandCursor)
|
||||||
|
apply_btn = QPushButton("Apply")
|
||||||
|
apply_btn.setObjectName("ActionButton")
|
||||||
|
apply_btn.setCursor(Qt.CursorShape.PointingHandCursor)
|
||||||
|
apply_btn.clicked.connect(lambda: on_apply(tunable.id, combo.currentText()))
|
||||||
|
row.addWidget(name)
|
||||||
|
row.addWidget(combo, 1)
|
||||||
|
row.addWidget(apply_btn)
|
||||||
|
v.addLayout(row)
|
||||||
|
if tunable.note:
|
||||||
|
note = QLabel(tunable.note)
|
||||||
|
note.setObjectName("Muted")
|
||||||
|
v.addWidget(note)
|
||||||
return card
|
return card
|
||||||
|
|
||||||
|
|
||||||
|
def _tunable(finding):
|
||||||
|
"""The runtime tunable a finding can apply, if any."""
|
||||||
|
fix = getattr(finding, "fix", "")
|
||||||
|
if not fix:
|
||||||
|
return None
|
||||||
|
from ..core import fixes
|
||||||
|
|
||||||
|
return fixes.get_tunable(fix)
|
||||||
|
|
||||||
|
|
||||||
|
def _installable_component(finding):
|
||||||
|
"""The catalog component a finding offers to install, if any and if apt is usable."""
|
||||||
|
action = getattr(finding, "action", "")
|
||||||
|
if not action:
|
||||||
|
return None
|
||||||
|
from ..core import catalog, sysenv
|
||||||
|
|
||||||
|
if sysenv.package_manager() != "apt":
|
||||||
|
return None # apt-only (D15) — no one-click install elsewhere
|
||||||
|
return catalog.by_id(action)
|
||||||
|
|
||||||
|
|
||||||
class Card(QFrame):
|
class Card(QFrame):
|
||||||
"""A titled panel whose body collapses when the header is clicked."""
|
"""A titled panel whose body collapses when the header is clicked."""
|
||||||
|
|
||||||
@@ -182,6 +262,117 @@ class StatGauge(QWidget):
|
|||||||
p.end()
|
p.end()
|
||||||
|
|
||||||
|
|
||||||
|
class HistoryGraph(QWidget):
|
||||||
|
"""A headline metric as a trend: current value + window min/max + a history line.
|
||||||
|
|
||||||
|
Replaces the at-a-glance gauge with changes-over-time. `kind` drives the color
|
||||||
|
(temp band / usage / accent), matching StatGauge so the dashboard stays consistent.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, title: str, unit: str = "", vmin: float = 0.0, vmax: float = 100.0,
|
||||||
|
kind: str = "accent", history: int = 180) -> None:
|
||||||
|
super().__init__()
|
||||||
|
self._title = title
|
||||||
|
self._unit = unit
|
||||||
|
self._min = vmin
|
||||||
|
self._max = vmax
|
||||||
|
self._kind = kind # "temp" | "usage" | "accent"
|
||||||
|
self._values: deque[float | None] = deque(maxlen=history)
|
||||||
|
self.setMinimumSize(160, 132)
|
||||||
|
|
||||||
|
def add_value(self, value: float | None) -> None:
|
||||||
|
self._values.append(value)
|
||||||
|
self.update()
|
||||||
|
|
||||||
|
def _fmt(self, value: float | None) -> str:
|
||||||
|
if value is None:
|
||||||
|
return "—"
|
||||||
|
if self._unit == "°C":
|
||||||
|
return f"{value:.0f}°"
|
||||||
|
if self._unit == "%":
|
||||||
|
return f"{value:.0f}%"
|
||||||
|
return f"{value:.0f}{self._unit}"
|
||||||
|
|
||||||
|
def paintEvent(self, event) -> None: # noqa: N802 (Qt override)
|
||||||
|
p = QPainter(self)
|
||||||
|
p.setRenderHint(QPainter.RenderHint.Antialiasing)
|
||||||
|
w, h = self.width(), self.height()
|
||||||
|
pad = 10.0
|
||||||
|
present = [v for v in self._values if v is not None]
|
||||||
|
current = next((v for v in reversed(self._values) if v is not None), None)
|
||||||
|
color = QColor(gauge_color(self._kind, current))
|
||||||
|
|
||||||
|
ftitle = QFont()
|
||||||
|
ftitle.setPointSizeF(10.0)
|
||||||
|
ftitle.setBold(True)
|
||||||
|
p.setFont(ftitle)
|
||||||
|
p.setPen(QColor(MUTED))
|
||||||
|
p.drawText(QRectF(pad, 6, w - 2 * pad, 18),
|
||||||
|
Qt.AlignmentFlag.AlignLeft | Qt.AlignmentFlag.AlignVCenter, self._title)
|
||||||
|
|
||||||
|
fval = QFont()
|
||||||
|
fval.setPointSizeF(21.0)
|
||||||
|
fval.setBold(True)
|
||||||
|
p.setFont(fval)
|
||||||
|
p.setPen(color if current is not None else QColor(MUTED))
|
||||||
|
p.drawText(QRectF(pad, 2, w - 2 * pad, 28),
|
||||||
|
Qt.AlignmentFlag.AlignRight | Qt.AlignmentFlag.AlignTop, self._fmt(current))
|
||||||
|
|
||||||
|
if present:
|
||||||
|
fsm = QFont()
|
||||||
|
fsm.setPointSizeF(8.5)
|
||||||
|
p.setFont(fsm)
|
||||||
|
p.setPen(QColor(MUTED))
|
||||||
|
p.drawText(QRectF(pad, 27, w - 2 * pad, 14), Qt.AlignmentFlag.AlignLeft,
|
||||||
|
f"min {self._fmt(min(present))} max {self._fmt(max(present))}")
|
||||||
|
|
||||||
|
g_top, g_bot = 48.0, h - pad
|
||||||
|
g_left, g_right = pad, w - pad
|
||||||
|
span = self._max - self._min
|
||||||
|
if g_bot - g_top < 12 or g_right - g_left < 12 or span <= 0:
|
||||||
|
p.end()
|
||||||
|
return
|
||||||
|
|
||||||
|
def y_of(v: float) -> float:
|
||||||
|
frac = (max(self._min, min(self._max, v)) - self._min) / span
|
||||||
|
return g_bot - frac * (g_bot - g_top)
|
||||||
|
|
||||||
|
warn = TEMP_WARN if self._kind == "temp" else (USAGE_WARN if self._kind == "usage" else None)
|
||||||
|
if warn is not None and self._min <= warn <= self._max:
|
||||||
|
pen = QPen(QColor(TRACK))
|
||||||
|
pen.setWidthF(1.0)
|
||||||
|
pen.setStyle(Qt.PenStyle.DashLine)
|
||||||
|
p.setPen(pen)
|
||||||
|
yw = y_of(warn)
|
||||||
|
p.drawLine(QPointF(g_left, yw), QPointF(g_right, yw))
|
||||||
|
|
||||||
|
maxlen = self._values.maxlen or 1
|
||||||
|
step = (g_right - g_left) / max(1, maxlen - 1)
|
||||||
|
n = len(self._values)
|
||||||
|
# Build the line newest-at-right; break it where readings are missing.
|
||||||
|
path = QPainterPath()
|
||||||
|
drawing = False
|
||||||
|
for i, v in enumerate(self._values):
|
||||||
|
if v is None:
|
||||||
|
drawing = False
|
||||||
|
continue
|
||||||
|
x = g_right - (n - 1 - i) * step
|
||||||
|
y = y_of(v)
|
||||||
|
if drawing:
|
||||||
|
path.lineTo(x, y)
|
||||||
|
else:
|
||||||
|
path.moveTo(x, y)
|
||||||
|
drawing = True
|
||||||
|
if not path.isEmpty():
|
||||||
|
pen = QPen(color)
|
||||||
|
pen.setWidthF(2.0)
|
||||||
|
pen.setCapStyle(Qt.PenCapStyle.RoundCap)
|
||||||
|
pen.setJoinStyle(Qt.PenJoinStyle.RoundJoin)
|
||||||
|
p.setPen(pen)
|
||||||
|
p.drawPath(path)
|
||||||
|
p.end()
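The geometry used in `paintEvent` can be checked without Qt. The sketch below follows the same conventions: values are clamped to `[vmin, vmax]`, the newest sample is pinned to the right edge, the x step is derived from `maxlen`, and a `None` reading breaks the line. `plot_segments` and its parameters are hypothetical names introduced for illustration only.

```python
from __future__ import annotations

from collections import deque


def plot_segments(values, vmin, vmax, left, right, top, bottom):
    """Map a history deque to polyline segments; a None reading starts a new segment."""
    span = vmax - vmin
    if span <= 0:
        return []
    maxlen = values.maxlen or 1
    step = (right - left) / max(1, maxlen - 1)  # one slot per possible sample
    n = len(values)
    segments, current = [], []
    for i, v in enumerate(values):
        if v is None:
            if current:
                segments.append(current)
            current = []
            continue
        frac = (max(vmin, min(vmax, v)) - vmin) / span  # clamp, then normalize to 0..1
        x = right - (n - 1 - i) * step  # newest sample sits at the right edge
        y = bottom - frac * (bottom - top)  # larger values are drawn higher
        current.append((x, y))
    if current:
        segments.append(current)
    return segments
```

Because the step depends on `maxlen` rather than on the number of samples collected so far, a partially filled history grows in from the right instead of being stretched across the full width.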
|
||||||
|
|
||||||
|
|
||||||
class MetricBar(QWidget):
|
class MetricBar(QWidget):
|
||||||
"""A label + value with a thin progress bar (for 0–100% metrics)."""
|
"""A label + value with a thin progress bar (for 0–100% metrics)."""
|
||||||
|
|
||||||
|
|||||||
@@ -0,0 +1,111 @@
|
|||||||
|
"""Tests for the guided diagnostic orchestration (M3+M4 glue)."""
|
||||||
|
|
||||||
|
import tempfile
|
||||||
|
import time
|
||||||
|
import unittest
|
||||||
|
from pathlib import Path
|
||||||
|
from unittest import mock
|
||||||
|
|
||||||
|
from rigdoctor.core import diagnostic
|
||||||
|
from rigdoctor.core.crashlog import CrashLogWriter, summarize
|
||||||
|
from rigdoctor.core.health import Finding
|
||||||
|
from rigdoctor.core.sample import Reading, Sample
|
||||||
|
|
||||||
|
|
||||||
|
def _write_log(path: str, game: str) -> None:
|
||||||
|
w = CrashLogWriter(path)
|
||||||
|
w.write_event("session-start", "interval=1s")
|
||||||
|
w.write_event("game", game)
|
||||||
|
for temp in (60.0, 72.0, 81.0):
|
||||||
|
w.write_sample(Sample(ts=time.time(), readings=[Reading("gpu", "temp", temp, "°C", "")]))
|
||||||
|
w.write_event("gpu-lost", "nvidia-smi query timed out")
|
||||||
|
w.close()
|
||||||
|
|
||||||
|
|
||||||
|
class GameRecoveryTests(unittest.TestCase):
|
||||||
|
def test_game_recovered_from_log_event(self):
|
||||||
|
with tempfile.TemporaryDirectory() as d:
|
||||||
|
log = str(Path(d) / "capture.jsonl")
|
||||||
|
_write_log(log, "Path of Exile 2")
|
||||||
|
summary = summarize(log)
|
||||||
|
self.assertEqual(diagnostic._game_from_summary(summary), "Path of Exile 2")
|
||||||
|
|
||||||
|
def test_no_game_event_returns_none(self):
|
||||||
|
with tempfile.TemporaryDirectory() as d:
|
||||||
|
log = str(Path(d) / "capture.jsonl")
|
||||||
|
w = CrashLogWriter(log)
|
||||||
|
w.write_event("session-start")
|
||||||
|
w.close()
|
||||||
|
self.assertIsNone(diagnostic._game_from_summary(summarize(log)))
|
||||||
|
|
||||||
|
|
||||||
|
class FinishTests(unittest.TestCase):
|
||||||
|
def test_finish_combines_summary_and_findings(self):
|
||||||
|
with tempfile.TemporaryDirectory() as d:
|
||||||
|
log = Path(d) / "capture.jsonl"
|
||||||
|
_write_log(str(log), "Satisfactory")
|
||||||
|
fake = [Finding("warning", "GPU", "NVIDIA Xid 79 ×1", "fell off the bus")]
|
||||||
|
with mock.patch("rigdoctor.core.health.run_health_checks", return_value=fake), \
|
||||||
|
mock.patch.object(diagnostic.reccontrol, "stop_background", return_value=False), \
|
||||||
|
mock.patch.object(diagnostic.reccontrol, "running_pid", return_value=None):
|
||||||
|
result = diagnostic.finish(log_path=log)
|
||||||
|
self.assertEqual(result.game, "Satisfactory")
|
||||||
|
self.assertEqual(result.summary.samples, 3)
|
||||||
|
self.assertEqual(result.findings, fake)
|
||||||
|
# peak GPU temp captured in the window, GPU-lost event recorded
|
||||||
|
self.assertEqual(result.summary.maxima["gpu.temp"][0], 81.0)
|
||||||
|
self.assertTrue(any(kind == "gpu-lost" for _ts, kind, _d in result.summary.events))
|
||||||
|
|
||||||
|
|
||||||
|
class CrashDetectionTests(unittest.TestCase):
|
||||||
|
def _diag_log(self, d) -> Path:
|
||||||
|
return Path(d) / "diagnostic.jsonl"
|
||||||
|
|
||||||
|
def test_unterminated_session_is_a_pending_crash(self):
|
||||||
|
with tempfile.TemporaryDirectory() as d:
|
||||||
|
log = self._diag_log(d)
|
||||||
|
_write_log(str(log), "Tarkov") # has session-start + game, no session-stop
|
||||||
|
with mock.patch.object(diagnostic.config, "DIAG_LOG", log), \
|
||||||
|
mock.patch.object(diagnostic.config, "DIAG_CRASH", log.with_suffix(".crash")), \
|
||||||
|
mock.patch.object(diagnostic.reccontrol, "running_pid", return_value=None):
|
||||||
|
info = diagnostic.pending_crash()
|
||||||
|
self.assertIsNotNone(info)
|
||||||
|
self.assertEqual(info.game, "Tarkov")
|
||||||
|
self.assertTrue(info.gpu_lost) # _write_log writes a gpu-lost event
|
||||||
|
|
||||||
|
def test_clean_stop_is_not_a_crash(self):
|
||||||
|
with tempfile.TemporaryDirectory() as d:
|
||||||
|
log = self._diag_log(d)
|
||||||
|
w = CrashLogWriter(str(log))
|
||||||
|
w.write_event("session-start"); w.write_event("game", "X")
|
||||||
|
w.write_sample(Sample(time.time(), [Reading("gpu", "temp", 60.0, "°C", "")]))
|
||||||
|
w.write_event("session-stop", "samples=1")
|
||||||
|
w.close()
|
||||||
|
with mock.patch.object(diagnostic.config, "DIAG_LOG", log), \
|
||||||
|
mock.patch.object(diagnostic.config, "DIAG_CRASH", log.with_suffix(".crash")), \
|
||||||
|
mock.patch.object(diagnostic.reccontrol, "running_pid", return_value=None):
|
||||||
|
self.assertIsNone(diagnostic.pending_crash())
|
||||||
|
|
||||||
|
def test_acknowledge_clears_pending_crash(self):
|
||||||
|
with tempfile.TemporaryDirectory() as d:
|
||||||
|
log = self._diag_log(d)
|
||||||
|
_write_log(str(log), "Tarkov")
|
||||||
|
with mock.patch.object(diagnostic.config, "DIAG_LOG", log), \
|
||||||
|
mock.patch.object(diagnostic.config, "DIAG_CRASH", log.with_suffix(".crash")), \
|
||||||
|
mock.patch.object(diagnostic.reccontrol, "running_pid", return_value=None):
|
||||||
|
self.assertIsNotNone(diagnostic.pending_crash())
|
||||||
|
diagnostic.acknowledge_crash()
|
||||||
|
self.assertIsNone(diagnostic.pending_crash())
|
||||||
|
|
||||||
|
def test_running_capture_is_not_a_crash(self):
|
||||||
|
with tempfile.TemporaryDirectory() as d:
|
||||||
|
log = self._diag_log(d)
|
||||||
|
_write_log(str(log), "Tarkov")
|
||||||
|
with mock.patch.object(diagnostic.config, "DIAG_LOG", log), \
|
||||||
|
mock.patch.object(diagnostic.config, "DIAG_CRASH", log.with_suffix(".crash")), \
|
||||||
|
mock.patch.object(diagnostic.reccontrol, "running_pid", return_value=4321):
|
||||||
|
self.assertIsNone(diagnostic.pending_crash()) # it's in-progress, not crashed
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
unittest.main()
|
||||||
@@ -0,0 +1,63 @@
|
|||||||
|
"""Tests for M6 runtime tunables (parse, command builders, value validation)."""
|
||||||
|
|
||||||
|
import unittest
|
||||||
|
from unittest import mock
|
||||||
|
|
||||||
|
from rigdoctor.core import fixes
|
||||||
|
from rigdoctor.core.fixes import Tunable
|
||||||
|
|
||||||
|
|
||||||
|
class ParseTests(unittest.TestCase):
|
||||||
|
def test_bracketed(self):
|
||||||
|
self.assertEqual(fixes._bracketed("always [madvise] never"), (["always", "madvise", "never"], "madvise"))
|
||||||
|
|
||||||
|
def test_bracketed_none_active(self):
|
||||||
|
self.assertEqual(fixes._bracketed("a b c"), (["a", "b", "c"], None))
|
||||||
|
|
||||||
|
|
||||||
|
class CommandBuilderTests(unittest.TestCase):
|
||||||
|
def test_governor_cmd_writes_value_to_sysfs(self):
|
||||||
|
cmd = fixes._cpu_governor_cmd("performance")
|
||||||
|
self.assertEqual(cmd[:2], ["/bin/sh", "-c"])
|
||||||
|
self.assertIn("performance", cmd[2])
|
||||||
|
self.assertIn("scaling_governor", cmd[2])
|
||||||
|
|
||||||
|
def test_persistence_cmd(self):
|
||||||
|
self.assertEqual(fixes._nvidia_persistence_cmd("Enabled"), ["nvidia-smi", "-pm", "1"])
|
||||||
|
self.assertEqual(fixes._nvidia_persistence_cmd("Disabled"), ["nvidia-smi", "-pm", "0"])
|
||||||
|
|
||||||
|
def test_swappiness_cmd_targets_procfs(self):
|
||||||
|
self.assertIn("/proc/sys/vm/swappiness", fixes._swappiness_cmd("10")[2])
|
||||||
|
|
||||||
|
def test_quoting_is_safe(self):
|
||||||
|
# A value that would be dangerous unquoted stays a single quoted token.
|
||||||
|
cmd = fixes._pcie_aspm_cmd("performance; rm -rf /")
|
||||||
|
self.assertIn("'performance; rm -rf /'", cmd[2])
|
||||||
|
|
||||||
|
|
||||||
|
class ApplyValidationTests(unittest.TestCase):
|
||||||
|
def test_unknown_fix_returns_none(self):
|
||||||
|
self.assertIsNone(fixes.apply_command("does_not_exist", "x"))
|
||||||
|
|
||||||
|
def test_value_validated_against_live_options(self):
|
||||||
|
fake = Tunable("x", "X", ["a", "b"], "a")
|
||||||
|
with mock.patch.dict(fixes._TUNABLES, {"x": (lambda: fake, lambda v: ["echo", v])}, clear=False):
|
||||||
|
self.assertEqual(fixes.apply_command("x", "a"), ["echo", "a"])
|
||||||
|
self.assertIsNone(fixes.apply_command("x", "not-an-option"))
|
||||||
|
|
||||||
|
def test_apply_unknown_is_error(self):
|
||||||
|
rc, _ = fixes.apply("nope", "x")
|
||||||
|
self.assertEqual(rc, 1)
|
||||||
|
|
||||||
|
|
||||||
|
class GameenvWiringTests(unittest.TestCase):
|
||||||
|
def test_findings_reference_known_fix_ids(self):
|
||||||
|
from rigdoctor.core import gameenv
|
||||||
|
|
||||||
|
fix_ids = {f.fix for f in gameenv.run_gameenv_checks() if f.fix}
|
||||||
|
# Whatever fixes the live system surfaces, each must be a real tunable id.
|
||||||
|
self.assertTrue(fix_ids.issubset(set(fixes._TUNABLES)))
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
unittest.main()
|
||||||
@@ -30,7 +30,7 @@ class GovernorTests(unittest.TestCase):
|
|||||||
def test_powersave_is_warning(self):
|
def test_powersave_is_warning(self):
|
||||||
f = gameenv.evaluate_governor({"powersave"})
|
f = gameenv.evaluate_governor({"powersave"})
|
||||||
self.assertEqual(f.severity, "warning")
|
self.assertEqual(f.severity, "warning")
|
||||||
self.assertIn("cpupower", f.suggestion)
|
self.assertEqual(f.fix, "cpu_governor") # offers the live Apply dropdown
|
||||||
|
|
||||||
def test_dynamic_is_info(self):
|
def test_dynamic_is_info(self):
|
||||||
self.assertEqual(gameenv.evaluate_governor({"schedutil"}).severity, "info")
|
self.assertEqual(gameenv.evaluate_governor({"schedutil"}).severity, "info")
|
||||||
@@ -43,7 +43,7 @@ class SwappinessTests(unittest.TestCase):
|
|||||||
def test_high_is_info_with_suggestion(self):
|
def test_high_is_info_with_suggestion(self):
|
||||||
f = gameenv.evaluate_swappiness(60)
|
f = gameenv.evaluate_swappiness(60)
|
||||||
self.assertEqual(f.severity, "info")
|
self.assertEqual(f.severity, "info")
|
||||||
self.assertIn("swappiness", f.suggestion)
|
self.assertEqual(f.fix, "swappiness") # offers the live Apply dropdown
|
||||||
|
|
||||||
def test_low_is_ok(self):
|
def test_low_is_ok(self):
|
||||||
self.assertEqual(gameenv.evaluate_swappiness(10).severity, "ok")
|
self.assertEqual(gameenv.evaluate_swappiness(10).severity, "ok")
|
||||||
|
|||||||
@@ -0,0 +1,68 @@
|
|||||||
|
"""Tests for the D12 Steam-launch wrapper (rigdoctor wrap %command%)."""
|
||||||
|
|
||||||
|
import unittest
|
||||||
|
from unittest import mock
|
||||||
|
|
||||||
|
from rigdoctor.core import wrap
|
||||||
|
from rigdoctor.core.steam import Game
|
||||||
|
|
||||||
|
|
||||||
|
class LaunchOptionTests(unittest.TestCase):
|
||||||
|
def test_format(self):
|
||||||
|
opt = wrap.launch_option()
|
||||||
|
self.assertTrue(opt.endswith("wrap %command%"))
|
||||||
|
self.assertIn("rigdoctor", opt)
|
||||||
|
|
||||||
|
|
||||||
|
class GameNameTests(unittest.TestCase):
|
||||||
|
def test_resolves_from_steam_appid(self):
|
||||||
|
g = Game(appid="570", name="Dota 2", library="/x", installdir="dota")
|
||||||
|
with mock.patch.dict("os.environ", {"SteamAppId": "570"}), \
|
||||||
|
mock.patch("rigdoctor.core.steam.cached_games", return_value=[g]):
|
||||||
|
self.assertEqual(wrap.game_name_from_env(), "Dota 2")
|
||||||
|
|
||||||
|
def test_unknown_appid_falls_back(self):
|
||||||
|
with mock.patch.dict("os.environ", {"SteamAppId": "999"}), \
|
||||||
|
mock.patch("rigdoctor.core.steam.cached_games", return_value=[]), \
|
||||||
|
mock.patch("rigdoctor.core.steam.scan_games", return_value=[]):
|
||||||
|
self.assertEqual(wrap.game_name_from_env(), "Steam app 999")
|
||||||
|
|
||||||
|
def test_none_without_steam_env(self):
|
||||||
|
with mock.patch.dict("os.environ", {}, clear=True):
|
||||||
|
self.assertIsNone(wrap.game_name_from_env())
|
||||||
|
|
||||||
|
|
||||||
|
class RunTests(unittest.TestCase):
|
||||||
|
def test_brackets_capture_and_returns_exit_code(self):
|
||||||
|
with mock.patch("rigdoctor.core.reccontrol.running_pid", return_value=None), \
|
||||||
|
mock.patch("rigdoctor.core.diagnostic.start", return_value=123) as start, \
|
||||||
|
mock.patch("rigdoctor.core.reccontrol.stop_background") as stop, \
|
||||||
|
mock.patch.dict("os.environ", {}, clear=True):
|
||||||
|
rc = wrap.run(["true"])
|
||||||
|
self.assertEqual(rc, 0)
|
||||||
|
start.assert_called_once()
|
||||||
|
stop.assert_called_once()
|
||||||
|
|
||||||
|
def test_propagates_game_failure(self):
|
||||||
|
with mock.patch("rigdoctor.core.reccontrol.running_pid", return_value=None), \
|
||||||
|
mock.patch("rigdoctor.core.diagnostic.start", return_value=123), \
|
||||||
|
mock.patch("rigdoctor.core.reccontrol.stop_background"), \
|
||||||
|
mock.patch.dict("os.environ", {}, clear=True):
|
||||||
|
self.assertEqual(wrap.run(["false"]), 1)
|
||||||
|
|
||||||
|
def test_does_not_touch_an_existing_capture(self):
|
||||||
|
with mock.patch("rigdoctor.core.reccontrol.running_pid", return_value=999), \
|
||||||
|
mock.patch("rigdoctor.core.diagnostic.start") as start, \
|
||||||
|
mock.patch("rigdoctor.core.reccontrol.stop_background") as stop, \
|
||||||
|
mock.patch.dict("os.environ", {}, clear=True):
|
||||||
|
rc = wrap.run(["true"])
|
||||||
|
self.assertEqual(rc, 0)
|
||||||
|
start.assert_not_called()
|
||||||
|
stop.assert_not_called()
|
||||||
|
|
||||||
|
def test_empty_command_is_usage_error(self):
|
||||||
|
self.assertEqual(wrap.run([]), 2)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
unittest.main()
|
||||||