Compare commits
7 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 5502251789 | |
| | 4bd51a40c3 | |
| | 984292c368 | |
| | bffaf73ad4 | |
| | 7f0ab9a635 | |
| | 12339c3282 | |
| | c7e50ba4cb | |
@@ -5,6 +5,58 @@ All notable changes to RigDoctor are recorded here. Format follows
 (`MAJOR.MINOR.PATCH`, pre-1.0). `__version__` and `pyproject.toml` must match the git
 release tag (so the auto-updater, D18, can compare versions).
 
+## [0.32.0] - 2026-05-22
+### Added
+- **More for diagnostics & reports:**
+  - **`nvidia-smi -q` snapshot** — driver, throttle/clock-event reasons, clocks, power, temps,
+    PCIe link, ECC + retired pages (point-in-time at diagnostic time).
+  - **Display-server log** — auto-detected: `Xorg.0.log` on X11, or the compositor's user-journal
+    slice (gnome-shell/kwin/sway/gamescope) on Wayland.
+  - **Full system inventory** (M5 hardware/OS) is now included in each stored diagnostic and the
+    **Report** bundle — invaluable for larger/shared debugging.
+  These join the kernel log + coredump records in `syslogs.txt`/`inventory.*`, are saved per
+  diagnostic, included in the Report zip, and (logs) fed to the AI on "Explain".
+
+## [0.31.0] - 2026-05-22
+### Added
+- **Diagnostics now collect session-scoped system logs** (`core/syslogs.py`): a kernel-log
+  slice (`journalctl -k` — Xid, OOM-killer, MCE, PCIe AER, thermal, hung tasks) and
+  **crashed-process records** (`coredumpctl` — which executable, signal, and when). They're saved
+  to the diagnostic directory (`syslogs.txt`), included in the **Report** bundle, and fed to the
+  AI on "Explain" alongside the game logs. Best-effort — degrades quietly if the tools are
+  missing or access is denied; scoped to the session window so it doesn't drag in old noise.
+
+## [0.30.0] - 2026-05-22
+### Added
+- **Logging & report bundles (M15, D25)** — opt-in via one **Settings → Logging** toggle
+  (default off). When on: the app logs to a rotating `app.log`, and **each diagnostic is stored
+  in its own folder** (`~/.local/share/rigdoctor/diagnostics/<id>/`) with the capture log, a
+  structured `result.json`, a readable `report.txt`, a session-scoped game-log snapshot, and an
+  `ai/` record of every AI interaction — **the exact data sent, which model, and its reply**.
+- **Report** — a button on the diagnostic dialog (and `rigdoctor bundle`) zips a diagnostic's
+  folder plus `app.log` into `~/.local/share/rigdoctor/reports/<id>.zip` for sharing. Everything
+  stays local; the zip only leaves your machine if you share it. Available only when logging is on.
+
+## [0.29.0] - 2026-05-22
+### Added
+- **AI now resolves Steam app IDs from your library instead of guessing.** When app IDs appear
+  in the logs/findings, RigDoctor looks them up in your scanned games (`steam.appid_names()`) and
+  injects an "App IDs (resolved from your installed games)" glossary into the prompt — so the
+  model names games correctly (e.g. `2694490 = Path of Exile 2`) rather than hallucinating. Only
+  IDs it can resolve locally are listed; no network, no model "training" needed.
+
+## [0.28.1] - 2026-05-22
+### Fixed
+- **AI explanations were misreading stale/benign logs.** Three fixes so the model analyses the
+  *actual* session: (1) the prompt now states the **real game name, capture duration, and
+  outcome** (clean vs. crash) so the model stops guessing the game from log paths; (2) game logs
+  are **scoped to the session window** (Steam-console lines filtered by timestamp; a stale
+  per-app Proton log from an earlier game is skipped); (3) the reference KB flags common
+  **benign** Steam/Proton lines (`libnvidia-ml.so.1` assertion, routine minidump uploads, "fork
+  without exec") so they aren't reported as the cause. The system prompt also forbids
+  Windows-only advice (no "run as administrator") and tells the model not to invent a problem
+  when the run was clean.
+
 ## [0.28.0] - 2026-05-22
 ### Added
 - **AI explanations now include recent game logs.** When you press "Explain with AI" on a
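The 0.31.0 entry describes a kernel-log slice that is limited to the session window and degrades quietly when the tool is missing or access is denied. The sketch below illustrates that idea only; `core/syslogs.py` itself is not part of this diff, so `kernel_log_argv` and `run_best_effort` are hypothetical names, and the exact `journalctl` options the project uses are an assumption.

```python
from __future__ import annotations

import subprocess


def kernel_log_argv(since: float, until: float | None = None) -> list[str]:
    """Build a `journalctl -k` command limited to a time window.

    `since` and `until` are Unix timestamps; journalctl accepts them as `@<epoch>`.
    """
    argv = ["journalctl", "-k", "--no-pager", "-o", "short-iso", f"--since=@{int(since)}"]
    if until is not None:
        argv.append(f"--until=@{int(until)}")
    return argv


def run_best_effort(argv: list[str], timeout: float = 10.0) -> str:
    """Run a log tool and return its stdout, or "" if it is missing, denied, slow, or fails."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout, check=False)
    except (OSError, subprocess.TimeoutExpired):
        return ""  # best-effort: a missing tool must not break the diagnostic
    return proc.stdout if proc.returncode == 0 else ""
```

A caller would combine the two, for example `run_best_effort(kernel_log_argv(session_start - 60))`, where the 60-second lead mirrors the `summary.start - 60` value that `_store` passes as `since` later in this diff.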
+13
-1
@@ -264,9 +264,21 @@ root cause + suggested next steps). Adds M14 to the D14 set.
 as suggestions (consistent with D9 — it explains/recommends, applying fixes stays
 consent-gated). No new runtime dependency (HTTP via stdlib).
 
+### D25 — Logging & report bundles (M15) — *DECIDED 2026-05-22*
+Opt-in logging + shareable diagnostic reports.
+- **One combined `logging_enabled` toggle** (default off) controls both application logging
+  (rotating `app.log`) and per-diagnostic storage. Kept as a single switch for simplicity.
+- **Each diagnostic is stored in its own directory** (`DATA_DIR/diagnostics/<id>/`): capture
+  log, structured `result.json`, human-readable `report.txt`, a scoped game-log snapshot, and an
+  `ai/` folder recording each AI interaction (**exact data sent, provider+model, and the reply**).
+- **"Report"** zips one diagnostic directory (plus `app.log`) into `DATA_DIR/reports/` —
+  auto-saved there (no save dialog), shown with its path. Available only when logging is on
+  (nothing is stored otherwise). CLI: `rigdoctor bundle`.
+- Everything stays local; the report only leaves the machine if the user shares the zip.
+
 ## Open
 
-None currently — all tracked decisions (D1–D24) are resolved. New questions will be added
+None currently — all tracked decisions (D1–D25) are resolved. New questions will be added
 here as they arise. Remaining detail to flesh out during build: the tray's supporting-action
 set (D13), per-module apt package names, M12's tunnel/token specifics, and M13's
 update mechanism (APT repo vs. self-installed `.deb`).
+13
-1
@@ -2,7 +2,8 @@
 
 Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done
 
-> Module set per D14, plus **M12 (session sharing, D16)** and **M13 (auto-update, D18)**.
+> Module set per D14, plus **M12 (session sharing, D16)**, **M13 (auto-update, D18)**,
+> **M14 (AI assistant, D24)**, and **M15 (logging & reports, D25)**.
 > **M7 (stress/repro) was dropped (D7).** M10/M11 are the GUI and tray modules (D10/D11).
 > GPU scope reads "all (NVIDIA first)" — NVIDIA first, others via the vendor abstraction (D4).
 
@@ -21,6 +22,7 @@ Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done
 | M12 | Session sharing (shared terminal) | Sharing | none (relay) | all | P3 | ✅ |
 | M13 | Auto-update | (core) | none (stdlib; user-local file swap) | all | P3 | ✅ |
 | M14 | AI assistant (explain diagnostics) | (optional) | none (stdlib urllib; Ollama or Claude) | all | P3 | ✅ |
+| M15 | Logging & report bundles | (core) | none (stdlib logging + zip) | all | P3 | ✅ |
 | ~~M7~~ | ~~Stress / repro~~ | — | — | — | — | ❌ dropped (D7) |
 
 ## Notes per module
@@ -128,6 +130,16 @@ Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done
   which lifts a small local model and sharpens Claude. Stdlib `urllib` (no pip deps); output is
   advisory (D9). Configure in **Settings → AI assistant**.
 
+- **M15 Logging & report bundles** (D25) — opt-in via one `logging_enabled` toggle (default off):
+  application logging to a rotating `app.log` (`core/applog.py`) and **per-diagnostic storage**
+  (`core/diagstore.py`) — each diagnostic gets its own `DATA_DIR/diagnostics/<id>/`: capture,
+  `result.json`, `report.txt`, the full **inventory** (M5: hardware/OS), scoped **game logs**
+  (`core/gamelogs.py`), scoped **system logs** (`core/syslogs.py` — `journalctl -k`,
+  `coredumpctl`, an `nvidia-smi -q` snapshot, and the X11/Wayland display-server log), and an
+  `ai/` record of every AI interaction (exact data sent, model, reply). **"Report"** zips one
+  into `DATA_DIR/reports/` (GUI button on the diagnostic dialog; CLI `rigdoctor bundle`). Logs
+  are session-scoped and fed to the AI on "Explain". Stays local; shareable on demand.
+
 ## Bundles (final — D14)
 - **Essential:** M1 + M3 + M4 *(the MVP, NVIDIA-only — D5)*
 - **Monitoring:** M2 + M8
@@ -97,6 +97,13 @@ Ubuntu + NVIDIA first; `.deb` distribution (see `DECISIONS.md`).
 - [ ] *Possible follow-ups:* interactive chat grounded in the data; more reference-KB entries;
   an "Explain" button on the System Health page.
 
+## Phase 8 — Logging & report bundles (M15, D25)
+- [x] **Opt-in logging** (one `logging_enabled` toggle): rotating `app.log` (`core/applog.py`)
+  + **per-diagnostic storage** in its own directory (`core/diagstore.py`) — capture,
+  result, report, scoped game logs, and AI-interaction records.
+- [x] **Report** bundle — zip a diagnostic (incl. exactly what was sent to the AI, the model,
+  and its reply) into the reports folder. GUI button + `rigdoctor bundle`.
+
 > **Out of scope:** stress/repro module (D7); multi-distro support and packaging beyond
 > Ubuntu/apt + `.deb` (D15) — a thin seam is kept but not built out.
 
@@ -162,6 +162,18 @@ the actual findings plus matched reference facts from a curated, exact-match kno
 ("RAG-lite" — no embeddings/vector store, stdlib only); no fine-tuning. HTTP via stdlib `urllib`
 (no new core dependency); output is advisory (consistent with D9).
 
+### M15 — Logging & report bundles (D25)
+Opt-in (one `logging_enabled` toggle, default off). When on: the application logs to a rotating
+`app.log`, and **each diagnostic is stored in its own directory** (capture log, structured
+result, human-readable report, the full **inventory** (M5 hardware/OS), session-scoped **game
+logs** (Proton/Steam) and **system logs** (`journalctl -k`, `coredumpctl`, an `nvidia-smi -q`
+snapshot, and the X11/Wayland display-server log), and a record of every AI interaction — the
+exact data sent, the model, and its reply). The collected logs are also fed to the AI on
+"Explain". Collection is best-effort (degrades if tools are missing/denied). A **Report** action zips one diagnostic's directory
+(plus the app log) into a shareable bundle saved under the reports folder (GUI button; CLI
+`rigdoctor bundle`). Everything stays local — a report only leaves the machine if the user
+shares the zip. Stdlib only (`logging` + `zipfile`).
+
 ## 5. Non-functional requirements
 - **Zero hard deps for the core/CLI/daemon** — Python stdlib + tools already present. **Qt
   (PySide6) is required only by the GUI (M10) and tray (M11) modules**, declared in the
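The M15 text above says a Report zips one diagnostic's directory plus the app log using only `zipfile`. The body of `diagstore.make_report` is not visible in this diff, so the following is a sketch under assumptions: `make_report_zip` is a hypothetical name, and the archive layout (files under `<id>/`, with `app.log` at the top level) is a guess rather than the project's actual layout.

```python
from __future__ import annotations

import zipfile
from pathlib import Path


def make_report_zip(diag_dir: Path, reports_dir: Path, app_log: Path | None = None) -> Path:
    """Zip one diagnostic directory (plus the app log, if present) into reports_dir."""
    reports_dir.mkdir(parents=True, exist_ok=True)
    out = reports_dir / f"{diag_dir.name}.zip"
    with zipfile.ZipFile(out, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(diag_dir.rglob("*")):
            if path.is_file():
                # Archive names are relative to the parent, so the zip unpacks into <id>/...
                zf.write(path, path.relative_to(diag_dir.parent).as_posix())
        if app_log is not None and app_log.is_file():
            zf.write(app_log, "app.log")  # assumed location inside the archive
    return out
```

Naming the archive after the directory matches the `reports/<id>.zip` path given in the changelog. Everything else here is illustrative.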
+1
-1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "rigdoctor"
-version = "0.28.0"
+version = "0.32.0"
 description = "Modular hardware monitoring & crash diagnostics for Linux gamers."
 readme = "README.md"
 requires-python = ">=3.11"
@@ -1,3 +1,3 @@
 """RigDoctor — modular hardware monitoring & crash diagnostics for Linux gamers."""
 
-__version__ = "0.28.0"
+__version__ = "0.32.0"
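The changelog header notes that `__version__`, `pyproject.toml`, and the release tag must match so the auto-updater can compare versions. The updater's comparison code is not shown in this diff. The sketch below shows one way a `MAJOR.MINOR.PATCH` comparison could work; `parse_version` and `is_newer` are hypothetical helpers, and accepting a leading `v` on tags is an assumption.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' (optionally prefixed with 'v') into a tuple of ints."""
    major, minor, patch = text.removeprefix("v").split(".")
    return int(major), int(minor), int(patch)


def is_newer(candidate: str, installed: str) -> bool:
    """True when `candidate` is a strictly higher version than `installed`."""
    return parse_version(candidate) > parse_version(installed)
```

Comparing integer tuples rather than raw strings matters once a component reaches two digits. As plain strings, `"0.32.0"` sorts before `"0.9.0"`, but as tuples `(0, 32, 0)` is correctly the higher version.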
@@ -472,6 +472,23 @@ def cmd_ai(args) -> int:
     return 0 if ok else 1
 
 
+def cmd_bundle(args) -> int:
+    """Zip the latest stored diagnostic into a report bundle (M15) — needs logging enabled."""
+    from .core import diagstore
+
+    if not diagstore.enabled():
+        print("Logging is off. Enable it (Settings → Logging, or set logging_enabled) so "
+              "diagnostics are stored and can be reported.")
+        return 1
+    directory = diagstore.latest_dir()
+    if directory is None:
+        print("No stored diagnostics yet — run a diagnostic first.")
+        return 1
+    out = diagstore.make_report(directory)
+    print(f"Report written: {out}")
+    return 0
+
+
 def cmd_gameenv(args) -> int:
     from dataclasses import asdict
 
@@ -686,10 +703,16 @@ def build_parser() -> argparse.ArgumentParser:
     ai_sub.add_parser("test", help="send a tiny probe to verify connectivity").set_defaults(func=cmd_ai)
     ai_sub.add_parser("explain", help="explain the current health findings with AI").set_defaults(func=cmd_ai)
     ai_p.set_defaults(func=cmd_ai, ai_cmd=None)
+
+    bundle_p = sub.add_parser("bundle", help="zip the latest stored diagnostic into a report bundle (M15)")
+    bundle_p.set_defaults(func=cmd_bundle)
     return p
 
 
 def main(argv: list[str] | None = None) -> int:
+    from .core import applog
+
+    applog.setup()  # opt-in app logging (M15); no-op unless logging_enabled
     args = build_parser().parse_args(argv)
     return args.func(args)
 
@@ -37,6 +37,12 @@ SPAWN_LOG = STATE_DIR / "recorder.out"
 # not config: refreshed by the background scan on every launch).
 GAMES_FILE = STATE_DIR / "games.json"
 
+# Logging & reports (opt-in via `logging_enabled`). App log: rotating file of app events.
+# Each diagnostic is stored under DIAGNOSTICS_DIR/<id>/; "Report" zips one into REPORTS_DIR.
+APP_LOG = STATE_DIR / "app.log"
+DIAGNOSTICS_DIR = DATA_DIR / "diagnostics"
+REPORTS_DIR = DATA_DIR / "reports"
+
 # Update access token (M13) — gates updates to Gitea account holders (D18).
 # Stored in the OS keyring (Secret Service / GNOME Keyring) via `secret-tool` when
 # available — encrypted at rest, unlocked with the login session — else a 0600 file.
@@ -190,6 +196,7 @@ DEFAULTS: dict = {
     "ai_provider": "",  # AI assistant (M14, D24): "" (unset) | "ollama" | "claude"
     "ai_model": "",  # model name (e.g. "llama3.1" for Ollama; blank = Claude default)
     "ai_endpoint": "http://localhost:11434",  # Ollama server base URL (Claude uses a fixed endpoint)
+    "logging_enabled": False,  # opt-in: app logging + per-diagnostic storage + Report (M15)
 }
 
 
+44
-11
@@ -16,12 +16,15 @@ Answers are *grounded*: we pass the actual findings plus matched reference facts
 from __future__ import annotations
 
 import json
+import re
 import urllib.error
 import urllib.request
 
 from .. import config
 from . import ai_knowledge
 
+_APPID_RE = re.compile(r"\b\d{5,7}\b")  # Steam app IDs are 5–7 digits
+
 PROVIDERS = ("ollama", "claude")
 OLLAMA_DEFAULT_ENDPOINT = "http://localhost:11434"
 # Suggested Ollama model — strong instruction-following that fits an 8 GB GPU at Q4. Because we
@@ -33,15 +36,20 @@ CLAUDE_MAX_TOKENS = 2000
 ANTHROPIC_VERSION = "2023-06-01"
 
 SYSTEM_PROMPT = (
-    "You are RigDoctor's hardware-diagnostics assistant for Linux gamers. You are given the "
-    "structured findings RigDoctor collected from this machine — which may include recent game, "
-    "Proton, and system log excerpts — plus a set of reference facts. Explain in plain language "
-    "what they mean, correlate any log errors with the findings to pinpoint WHEN and WHY things "
-    "went wrong, identify the most likely root cause, and give concrete, ordered next steps "
-    "(exact commands where useful). Base your reasoning ONLY on the data and reference facts "
-    "provided — do not invent readings, hardware, or log lines. Be concise and practical. "
-    "Present fixes as suggestions, and clearly warn before any step that could cause data loss "
-    "or instability. Format your answer in Markdown."
+    "You are RigDoctor's hardware-diagnostics assistant for Linux gamers (Ubuntu + NVIDIA, games "
+    "via Steam/Proton). You are given session context, the structured findings RigDoctor "
+    "collected — which may include recent game/Proton/system log excerpts scoped to this session "
+    "— plus reference facts. Use the GAME NAME from the session context; never guess the game "
+    "from log paths or app IDs. Correlate log errors with the findings to pinpoint WHEN and WHY "
+    "things went wrong, identify the most likely root cause, and give concrete, ordered next "
+    "steps with exact Linux commands where useful.\n"
+    "Rules: Base your reasoning ONLY on the data and reference facts provided — never invent "
+    "readings, hardware, or log lines. This is LINUX: never suggest Windows-only steps (e.g. "
+    "'run as administrator', registry edits, toggling antivirus). Treat log lines flagged BENIGN "
+    "in the reference facts as non-causal. If no crash was recorded and there are no warning or "
+    "critical findings, say plainly that the session looks healthy and do NOT manufacture a "
+    "problem. Be concise. Present fixes as suggestions and warn before anything that risks data "
+    "loss or instability. Format your answer in Markdown."
 )
 
 
@@ -84,10 +92,35 @@ def provider_label() -> str:
     return "not configured"
 
 
+def appid_glossary(text: str) -> str:
+    """Resolve Steam app IDs that appear in `text` against the user's scanned library.
+
+    We don't teach the model app IDs — we look them up locally and hand it the mapping, so it
+    names games correctly instead of guessing. Only IDs we can resolve are listed.
+    """
+    candidates = set(_APPID_RE.findall(text))
+    if not candidates:
+        return ""
+    try:
+        from . import steam
+        names = steam.appid_names()
+    except Exception:  # never let a glossary lookup break an explanation
+        return ""
+    known = sorted((i, names[i]) for i in candidates if i in names)
+    if not known:
+        return ""
+    return "App IDs (resolved from your installed games):\n" + "\n".join(
+        f"- {appid} = {name}" for appid, name in known)
+
+
 def build_prompt(findings_text: str) -> str:
-    """The user-message content: matched reference facts + the collected findings."""
-    facts = ai_knowledge.relevant(findings_text)
+    """The user-message content: app-ID glossary + matched reference facts + the findings."""
     parts = []
+    glossary = appid_glossary(findings_text)
+    if glossary:
+        parts.append(glossary)
+        parts.append("")
+    facts = ai_knowledge.relevant(findings_text)
     if facts:
         parts.append("Reference facts (use these to interpret the findings):")
         parts += [f"- {f}" for f in facts]
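To see what `appid_glossary` would produce, the sketch below reuses the same regular expression but takes the ID-to-name mapping as a parameter instead of calling `steam.appid_names()`. That function's return type is not shown in this diff, so the sketch assumes a `dict` keyed by ID strings. `glossary` and the sample log line are made up for illustration.

```python
import re

APPID_RE = re.compile(r"\b\d{5,7}\b")  # same pattern as `_APPID_RE` in the diff


def glossary(text: str, names: dict[str, str]) -> str:
    """Like `appid_glossary`, but with the ID-to-name mapping passed in (assumed str keys)."""
    known = sorted((i, names[i]) for i in set(APPID_RE.findall(text)) if i in names)
    if not known:
        return ""
    return "App IDs (resolved from your installed games):\n" + "\n".join(
        f"- {appid} = {name}" for appid, name in known)


# Illustrative input: one 7-digit app ID, one 5-digit PID, one 3-digit app ID.
sample = "GameAction [AppID 2694490] pid 48213 exited; AppID 730 launched"
library = {"2694490": "Path of Exile 2", "730": "Counter-Strike 2"}
print(glossary(sample, library))
```

With this input only `2694490` is listed. The PID `48213` matches the pattern but is dropped because it is not in the mapping, which is how the local lookup filters out unrelated numbers. The 3-digit ID `730` never matches the 5 to 7 digit pattern, so short app IDs would be missed under this regex. An ID glued to an underscore, as in `steam_2694490`, would also be missed, because `\b` does not fire between `_` and a digit.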
@@ -64,6 +64,18 @@ ENTRIES: list[tuple[tuple[str, ...], str]] = [
     (("nvidia persistence", "persistence mode"),
      "NVIDIA persistence mode keeps the driver loaded when no app is using the GPU, avoiding "
      "re-init stalls — harmless to enable."),
+    (("libnvidia-ml.so", "interface.h", "failed to load \"libnvidia-ml"),
+     "BENIGN: a Steam log assertion 'Failed to load libnvidia-ml.so.1' (from interface.h) is "
+     "logged on many normal launches — the Steam runtime sandbox can't see the host NVML library. "
+     "It is NOT by itself a crash cause. Only investigate the driver if the GPU is genuinely "
+     "undetected (nvidia-smi fails)."),
+    (("minidump", ".dmp", "uploading minidump"),
+     "BENIGN-by-default: a minidump upload line means a crash handler ran AND that the game/engine "
+     "routinely uploads dumps; it is not proof that THIS session crashed unless a hard freeze or "
+     "non-zero exit was also recorded. Don't treat a routine minidump line as the root cause."),
+    (("fork without exec", "skipping destruction"),
+     "BENIGN: 'pid X != Y, skipping destruction (fork without exec?)' is routine Steam/Proton "
+     "process bookkeeping, not an error."),
 ]
 
 
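The `ENTRIES` list pairs keyword tuples with reference facts, and the design notes call the lookup "exact-match" with no embeddings. The body of `ai_knowledge.relevant` is not part of this diff, so the sketch below is only one plausible reading of that design: a case-insensitive substring match. `relevant_facts` is a hypothetical name, and the second sample entry is abridged from the diff.

```python
Entry = tuple[tuple[str, ...], str]

# Two entries in the same shape as ENTRIES; the second fact text is abridged.
SAMPLE_ENTRIES: list[Entry] = [
    (("fork without exec", "skipping destruction"),
     "BENIGN: 'pid X != Y, skipping destruction (fork without exec?)' is routine Steam/Proton "
     "process bookkeeping, not an error."),
    (("libnvidia-ml.so", "interface.h"),
     "BENIGN: 'Failed to load libnvidia-ml.so.1' is logged on many normal launches."),
]


def relevant_facts(text: str, entries: list[Entry] = SAMPLE_ENTRIES) -> list[str]:
    """Return facts whose keywords occur (case-insensitively) in `text`, in entry order."""
    haystack = text.lower()
    return [fact for keywords, fact in entries
            if any(keyword.lower() in haystack for keyword in keywords)]
```

For example, `relevant_facts("pid 4242 != 4243, skipping destruction (fork without exec?)")` returns only the first fact, and a findings text with none of the keywords returns an empty list, so no reference block would be added to the prompt.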
@@ -0,0 +1,63 @@
+"""Application logging (M15) — opt-in via the `logging_enabled` setting.
+
+When enabled, app events/errors are written to a rotating file (`config.APP_LOG`); when
+disabled, nothing is written (no file is created). All RigDoctor code logs through
+``applog.get_logger(__name__)``; the handler is attached once at startup by :func:`setup`.
+Stdlib ``logging`` only.
+"""
+
+from __future__ import annotations
+
+import logging
+from logging.handlers import RotatingFileHandler
+
+from .. import config
+
+_ROOT = "rigdoctor"
+_configured = False
+
+
+def setup(force: bool = False) -> bool:
+    """Attach the file handler if logging is enabled. Idempotent. Returns whether it's on."""
+    global _configured
+    logger = logging.getLogger(_ROOT)
+    enabled = bool(config.load_config().get("logging_enabled", False))
+
+    if not enabled:
+        if force:  # toggled off at runtime — detach so we stop writing
+            for h in list(logger.handlers):
+                logger.removeHandler(h)
+                h.close()
+            _configured = False
+        return False
+
+    if _configured and not force:
+        return True
+    for h in list(logger.handlers):  # avoid duplicate handlers on re-setup
+        logger.removeHandler(h)
+        h.close()
+    try:
+        config.STATE_DIR.mkdir(parents=True, exist_ok=True)
+        handler = RotatingFileHandler(config.APP_LOG, maxBytes=2_000_000, backupCount=3,
+                                      encoding="utf-8")
+        handler.setFormatter(logging.Formatter(
+            "%(asctime)s %(levelname)-7s %(name)s: %(message)s"))
+        logger.addHandler(handler)
+        logger.setLevel(logging.INFO)
+        logger.propagate = False
+        _configured = True
+        logger.info("logging started (rigdoctor %s)", _version())
+    except OSError:
+        return False
+    return True
+
+
+def get_logger(name: str) -> logging.Logger:
+    """A child logger. Safe to call before setup — it just won't write until enabled."""
+    short = name.split(".")[-1]
+    return logging.getLogger(f"{_ROOT}.{short}")
+
+
+def _version() -> str:
+    from .. import __version__
+    return __version__
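`setup()` attaches a `RotatingFileHandler` with `maxBytes=2_000_000` and `backupCount=3`, which caps the app log at one live file plus three backups. The standalone sketch below shows the same rollover behavior with much smaller limits so it is quick to observe. The logger name `rigdoctor_demo`, the limits, and the `demo_rotation` helper are illustrative and not part of the project.

```python
import logging
import tempfile
from logging.handlers import RotatingFileHandler
from pathlib import Path


def demo_rotation(log_dir: Path, max_bytes: int = 200, backups: int = 2) -> list[str]:
    """Write enough records to force several rollovers; return the file names left on disk."""
    logger = logging.getLogger("rigdoctor_demo")  # separate from the app's "rigdoctor" logger
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = RotatingFileHandler(log_dir / "app.log", maxBytes=max_bytes,
                                  backupCount=backups, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(levelname)-7s %(name)s: %(message)s"))
    logger.addHandler(handler)
    try:
        for i in range(50):
            logger.info("sample event number %d", i)
    finally:
        # Detach and close, mirroring what setup(force=True) does when logging is turned off.
        logger.removeHandler(handler)
        handler.close()
    return sorted(p.name for p in log_dir.iterdir())


with tempfile.TemporaryDirectory() as tmp:
    print(demo_rotation(Path(tmp)))  # ['app.log', 'app.log.1', 'app.log.2']
```

However many records are written, the handler keeps only the live file and `backups` older files, deleting the oldest on each rollover.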
@@ -28,6 +28,7 @@ class DiagnosticResult:
     game: str | None
     summary: Summary  # capture window: peak temps/power, events, last samples (M3)
     findings: list[Finding]  # health findings: Xid/SMART/driver/etc. (M4)
+    dir: str | None = None  # storage directory when logging is on (M15); else None
 
 
 @dataclass
@@ -97,7 +98,22 @@ def finish(last_n: int = 10, log_path=None) -> DiagnosticResult:
     summary = summarize(path, last_n=last_n)
     game = _game_from_summary(summary) or (reccontrol.read_status() or {}).get("game")
     findings = run_health_checks()
-    return DiagnosticResult(game=game, summary=summary, findings=findings)
+    result = DiagnosticResult(game=game, summary=summary, findings=findings)
+    _store(result, path, summary)
+    return result
+
+
+def _store(result: DiagnosticResult, capture_path, summary: Summary) -> None:
+    """Persist the diagnostic to its own directory when logging is enabled (M15)."""
+    try:
+        from . import diagstore
+
+        since = (summary.start - 60) if summary.start else None
+        directory = diagstore.store(result, capture_path, since=since)
+        if directory:
+            result.dir = str(directory)
+    except Exception:  # storage must never break a diagnostic
+        pass
 
 
 # --- hard-crash detection & post-crash analysis -----------------------------------
@@ -184,4 +200,6 @@ def analyze_crash(last_n: int = 15) -> DiagnosticResult:
     findings += check_previous_boot()  # the crashed boot's kernel log
     findings += run_health_checks(include_journal=False)  # SMART/driver/persistence/temps
     findings.sort(key=lambda f: _SEV_ORDER.get(f.severity, 9))
-    return DiagnosticResult(game=_game_from_summary(summary), summary=summary, findings=findings)
+    result = DiagnosticResult(game=_game_from_summary(summary), summary=summary, findings=findings)
+    _store(result, _crash_path(), summary)
+    return result
@@ -0,0 +1,152 @@
"""Per-diagnostic storage + Report bundles (M15), opt-in via `logging_enabled`.

When logging is on, each finished diagnostic is persisted to its own directory under
``config.DIAGNOSTICS_DIR/<id>/`` (capture log, structured result, human-readable report,
game-log and system-log snapshots, a hardware/OS inventory, and any AI interactions). "Report"
zips one directory, including exactly **what was sent to the AI, which model, and its reply**,
into ``config.REPORTS_DIR``.
"""

from __future__ import annotations

import json
import shutil
import time
import zipfile
from dataclasses import asdict, is_dataclass
from pathlib import Path

from .. import config


def enabled() -> bool:
    return bool(config.load_config().get("logging_enabled", False))


def _slug(name: str | None) -> str:
    s = "".join(c if c.isalnum() else "-" for c in (name or "session").lower())
    return s.strip("-")[:40] or "session"


def _new_dir(game: str | None) -> Path:
    base = config.DIAGNOSTICS_DIR
    stamp = time.strftime("%Y%m%d-%H%M%S")
    name = f"{stamp}-{_slug(game)}"
    target = base / name
    n = 1
    while target.exists():
        target = base / f"{name}-{n}"
        n += 1
    target.mkdir(parents=True, exist_ok=True)
    return target
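The directory naming above (timestamp, slugged game name, numeric suffix on collision) can be tried in isolation. This is a minimal sketch, not the module itself: `slug` and `new_dir` are stand-in names, the base directory is a temporary folder rather than `config.DIAGNOSTICS_DIR`, and the timestamp is passed in as a fixed string so the collision is reproducible.

```python
import tempfile
from pathlib import Path


def slug(name):
    """Lowercase, turn every non-alphanumeric into '-', trim dashes, cap at 40 chars."""
    s = "".join(c if c.isalnum() else "-" for c in (name or "session").lower())
    return s.strip("-")[:40] or "session"


def new_dir(base, game, stamp):
    """Create base/<stamp>-<slug>; append -1, -2, ... while the name is already taken."""
    name = f"{stamp}-{slug(game)}"
    target = base / name
    n = 1
    while target.exists():
        target = base / f"{name}-{n}"
        n += 1
    target.mkdir(parents=True, exist_ok=True)
    return target


base = Path(tempfile.mkdtemp())  # assumption: stands in for config.DIAGNOSTICS_DIR
first = new_dir(base, "Path of Exile 2", "20260522-143000")
second = new_dir(base, "Path of Exile 2", "20260522-143000")  # same second, same game
print(first.name)
print(second.name)
```

Two diagnostics stored within the same second for the same game end up in separate directories, which is what the `while target.exists()` loop is for.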


def _as_dict(obj):
    if is_dataclass(obj):
        return asdict(obj)
    return getattr(obj, "__dict__", {}) or str(obj)


def store(result, capture_path=None, since: float | None = None) -> Path | None:
    """Persist a finished diagnostic to its own directory. Returns the dir, or None if off."""
    if not enabled():
        return None
    from ..render import render_summary
    from . import ai, gamelogs, syslogs

    target = _new_dir(getattr(result, "game", None))

    if capture_path and Path(capture_path).exists():
        try:
            shutil.copyfile(capture_path, target / "capture.jsonl")
        except OSError:
            pass

    payload = {
        "game": getattr(result, "game", None),
        "stored_at": time.time(),
        "summary": _as_dict(result.summary),
        "findings": [_as_dict(f) for f in result.findings],
    }
    _write(target / "result.json", json.dumps(payload, indent=2, default=str))

    report = [f"Game: {getattr(result, 'game', None) or 'unknown'}", "",
              render_summary(result.summary), "",
              ai.format_findings(result.findings, header="Findings:")]
    _write(target / "report.txt", "\n".join(report))

    try:
        logs = gamelogs.collect(since=since)
        if logs:
            _write(target / "gamelogs.txt", logs)
    except OSError:
        pass

    try:
        sys_logs = syslogs.collect(since=since)
        if sys_logs:
            _write(target / "syslogs.txt", sys_logs)
    except OSError:
        pass

    try:  # full hardware/OS inventory (M5): useful when debugging from a shared report
        from . import inventory

        sections = inventory.collect()
        _write(target / "inventory.txt", inventory.render_text(sections))
        _write(target / "inventory.json", inventory.render_json(sections))
    except Exception:  # inventory probes vary by machine; never let it break storage
        pass
    return target


def record_ai(diag_dir, *, provider: str, model: str, system: str, prompt: str, response: str) -> None:
    """Save one AI interaction (exact data sent, model, reply) into the diagnostic's `ai/` dir."""
    if not diag_dir:
        return
    out = Path(diag_dir) / "ai"
    try:
        out.mkdir(parents=True, exist_ok=True)
    except OSError:
        return
    stamp = time.strftime("%Y%m%d-%H%M%S")
    record = {
        "timestamp": time.time(), "provider": provider, "model": model,
        "system_prompt": system, "data_sent_to_model": prompt, "model_reply": response,
    }
    _write(out / f"explain-{stamp}.json", json.dumps(record, indent=2, default=str))
    readable = (
        f"Provider: {provider}\nModel: {model}\n\n"
        f"=== System prompt ===\n{system}\n\n"
        f"=== Data sent to the model ===\n{prompt}\n\n"
        f"=== Model reply ===\n{response}\n"
    )
    _write(out / f"explain-{stamp}.txt", readable)


def make_report(diag_dir) -> Path:
    """Zip a diagnostic directory (plus the app log) into REPORTS_DIR; return the zip path."""
    diag_dir = Path(diag_dir)
    config.REPORTS_DIR.mkdir(parents=True, exist_ok=True)
    out = config.REPORTS_DIR / f"report-{diag_dir.name}.zip"
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(diag_dir.rglob("*")):
            if path.is_file():
                zf.write(path, arcname=str(Path(diag_dir.name) / path.relative_to(diag_dir)))
        if config.APP_LOG.exists():  # the application log, for context around the session
            zf.write(config.APP_LOG, arcname=str(Path(diag_dir.name) / "app.log"))
    return out
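To see what a Report zip contains, the bundling step can be reproduced against a throwaway directory. This is a sketch under assumptions: `bundle` is a hypothetical stand-in for `make_report` that takes the reports directory and app-log path as arguments instead of reading them from `config`, and the file names are invented for the demo.

```python
import tempfile
import zipfile
from pathlib import Path


def bundle(diag_dir, reports_dir, app_log=None):
    """Zip every file under diag_dir (plus an optional app log) beneath one top-level folder."""
    diag_dir = Path(diag_dir)
    reports_dir.mkdir(parents=True, exist_ok=True)
    out = reports_dir / f"report-{diag_dir.name}.zip"
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(diag_dir.rglob("*")):
            if path.is_file():
                rel = Path(diag_dir.name) / path.relative_to(diag_dir)
                zf.write(path, arcname=rel.as_posix())
        if app_log is not None and app_log.exists():
            zf.write(app_log, arcname=f"{diag_dir.name}/app.log")
    return out


root = Path(tempfile.mkdtemp())
diag = root / "20260522-143000-demo"  # invented diagnostic directory name
(diag / "ai").mkdir(parents=True)
(diag / "result.json").write_text("{}", encoding="utf-8")
(diag / "ai" / "explain-20260522-143100.json").write_text("{}", encoding="utf-8")
app_log = root / "app.log"
app_log.write_text("app log line", encoding="utf-8")

report = bundle(diag, root / "reports", app_log)
with zipfile.ZipFile(report) as zf:
    names = set(zf.namelist())
print(sorted(names))
```

Everything sits under a single folder named after the diagnostic, so unpacking several reports side by side does not mix their files.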


def latest_dir() -> Path | None:
    try:
        dirs = [d for d in config.DIAGNOSTICS_DIR.iterdir() if d.is_dir()]
    except OSError:
        return None
    return max(dirs, key=lambda d: d.stat().st_mtime) if dirs else None


def _write(path: Path, text: str) -> None:
    try:
        path.write_text(text, encoding="utf-8")
    except OSError:
        pass
@@ -10,11 +10,41 @@ vkd3d/DXVK error, a crash line, the exit code) rather than only the sensor summa
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import os
|
import os
|
||||||
|
import re
|
||||||
|
import time
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
|
||||||
# Steam keeps logs under its install root; ~/.steam/steam usually symlinks to the real one.
|
# Steam keeps logs under its install root; ~/.steam/steam usually symlinks to the real one.
|
||||||
_STEAM_LOG_DIRS = ("~/.steam/steam/logs", "~/.local/share/Steam/logs", "~/.steam/root/logs")
|
_STEAM_LOG_DIRS = ("~/.steam/steam/logs", "~/.local/share/Steam/logs", "~/.steam/root/logs")
|
||||||
_STEAM_LOG_FILES = ("console-linux.txt", "console_log.txt", "stderr.txt")
|
_STEAM_LOG_FILES = ("console-linux.txt", "console_log.txt", "stderr.txt")
|
||||||
|
_TS = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")
|
||||||
|
|
||||||
|
|
||||||
|
def _line_epoch(line: str) -> float | None:
|
||||||
|
m = _TS.match(line)
|
||||||
|
if not m:
|
||||||
|
return None
|
||||||
|
try:
|
||||||
|
return time.mktime(time.strptime(m.group(1), "%Y-%m-%d %H:%M:%S"))
|
||||||
|
except ValueError:
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def _since_filter(text: str, since: float) -> str:
|
||||||
|
"""Keep lines from the first timestamp >= `since` onward (logs are chronological).
|
||||||
|
|
||||||
|
Untimestamped lines before the window are dropped; once inside the window every line is
|
||||||
|
kept (so multi-line entries survive). This scopes a long-lived Steam log to one session.
|
||||||
|
"""
|
||||||
|
out: list[str] = []
|
||||||
|
including = False
|
||||||
|
for line in text.splitlines():
|
||||||
|
epoch = _line_epoch(line)
|
||||||
|
if epoch is not None and epoch >= since:
|
||||||
|
including = True
|
||||||
|
if including:
|
||||||
|
out.append(line)
|
||||||
|
return "\n".join(out)
|
||||||
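The windowing rule in `_since_filter` is easiest to check on a log that mixes timestamped and untimestamped lines. The sketch below re-states the two helpers under public names (`line_epoch`, `since_filter`) so it runs on its own; the sample log lines are invented.

```python
import re
import time

TS = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")


def line_epoch(line):
    """Epoch seconds for a '[YYYY-MM-DD HH:MM:SS]' prefix (local time), else None."""
    m = TS.match(line)
    if not m:
        return None
    try:
        return time.mktime(time.strptime(m.group(1), "%Y-%m-%d %H:%M:%S"))
    except ValueError:
        return None


def since_filter(text, since):
    """Drop everything before the first timestamp >= since; keep every line after it."""
    out = []
    including = False
    for line in text.splitlines():
        epoch = line_epoch(line)
        if epoch is not None and epoch >= since:
            including = True
        if including:
            out.append(line)
    return "\n".join(out)


log = "\n".join([
    "[2026-05-22 13:00:00] old session line",
    "  continuation of the old entry",
    "[2026-05-22 14:30:00] new session launch",
    "  stack frame without a timestamp",
    "[2026-05-22 14:30:05] new session error",
])
since = time.mktime(time.strptime("2026-05-22 14:00:00", "%Y-%m-%d %H:%M:%S"))
kept = since_filter(log, since).splitlines()
print(kept)
```

The untimestamped continuation of the old entry is dropped, while the untimestamped stack frame inside the window survives. That asymmetry is why the filter uses a latch (`including`) rather than testing each line on its own.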
|
|
||||||
|
|
||||||
def _tail(path: Path, max_bytes: int) -> str:
|
def _tail(path: Path, max_bytes: int) -> str:
|
||||||
@@ -51,17 +81,36 @@ def available() -> bool:
|
|||||||
return bool(_proton_logs() or _steam_console())
|
return bool(_proton_logs() or _steam_console())
|
||||||
|
|
||||||
|
|
||||||
def collect(max_bytes: int = 6000) -> str:
|
def collect(since: float | None = None, max_bytes: int = 8000) -> str:
|
||||||
"""Recent Proton + Steam log tails as one labelled text block ('' if none)."""
|
"""Recent Proton + Steam log tails as one labelled text block ('' if none).
|
||||||
|
|
||||||
|
With ``since`` (epoch), scope to that session: skip a Proton log not written during/after
|
||||||
|
the session (a stale per-app log from an earlier game), and keep only Steam-console lines
|
||||||
|
timestamped at/after ``since`` — so we don't feed the model an unrelated past session.
|
||||||
|
"""
|
||||||
sections: list[str] = []
|
sections: list[str] = []
|
||||||
|
|
||||||
protons = _proton_logs()
|
protons = _proton_logs()
|
||||||
if protons:
|
if protons:
|
||||||
tail = _tail(protons[0], max_bytes).strip()
|
log = protons[0]
|
||||||
|
fresh = since is None or _mtime(log) >= since
|
||||||
|
tail = _tail(log, max_bytes).strip() if fresh else ""
|
||||||
if tail:
|
if tail:
|
||||||
sections.append(f"--- Proton log ({protons[0].name}) ---\n{tail}")
|
sections.append(f"--- Proton log ({log.name}) ---\n{tail}")
|
||||||
|
|
||||||
console = _steam_console()
|
console = _steam_console()
|
||||||
if console:
|
if console:
|
||||||
tail = _tail(console, max_bytes).strip()
|
raw = _tail(console, 40000 if since is not None else max_bytes)
|
||||||
if tail:
|
if since is not None:
|
||||||
sections.append(f"--- Steam log ({console.name}) ---\n{tail}")
|
raw = _since_filter(raw, since)
|
||||||
|
raw = raw.strip()[-max_bytes:].strip()
|
||||||
|
if raw:
|
||||||
|
sections.append(f"--- Steam log ({console.name}) ---\n{raw}")
|
||||||
return "\n\n".join(sections)
|
return "\n\n".join(sections)
|
||||||
|
|
||||||
|
|
||||||
|
def _mtime(path: Path) -> float:
|
||||||
|
try:
|
||||||
|
return path.stat().st_mtime
|
||||||
|
except OSError:
|
||||||
|
return 0.0
|
||||||
|
|||||||
@@ -318,6 +318,11 @@ def cached_games() -> list[Game]:
|
|||||||
return [Game(**{k: g[k] for k in Game.__dataclass_fields__ if k in g}) for g in cache.get("games", [])]
|
return [Game(**{k: g[k] for k in Game.__dataclass_fields__ if k in g}) for g in cache.get("games", [])]
|
||||||
|
|
||||||
|
|
||||||
|
def appid_names() -> dict[str, str]:
|
||||||
|
"""{appid: name} for the user's scanned games — lets us resolve IDs seen in logs (M14)."""
|
||||||
|
return {g.appid: g.name for g in cached_games() if g.appid and g.name}
|
||||||
|
|
||||||
|
|
||||||
def rescan(cfg: dict | None = None) -> ScanResult:
|
def rescan(cfg: dict | None = None) -> ScanResult:
|
||||||
"""Scan the selected libraries, diff against the cache, and persist the result.
|
"""Scan the selected libraries, diff against the cache, and persist the result.
|
||||||
|
|
||||||
|
|||||||
@@ -0,0 +1,141 @@
"""Session-scoped system logs for diagnostics (M15): kernel, coredumps, NVIDIA, display.

Covers what the *system* logged when something went wrong, so the report bundle and the AI both
see it:
  * kernel ring-buffer slice (`journalctl -k`): Xid, OOM-killer, MCE, PCIe AER, thermal, hung tasks
  * systemd-coredump records (`coredumpctl`): did the game/wine dump core (SIGSEGV/ABRT), when
  * an `nvidia-smi -q` snapshot: driver, throttle/clock-event reasons, clocks, power, temps, PCIe,
    ECC + retired pages (point-in-time at diagnostic time)
  * the display-server log: `Xorg.0.log` on X11, or the compositor's user-journal slice on Wayland
Best-effort and size-bounded: degrades silently if a tool is missing or access is denied. Stdlib only.
"""

from __future__ import annotations

import os
import shutil
import subprocess
import time
from pathlib import Path

_MAX = 8000      # cap each log section so the prompt/report stays small
_NV_MAX = 10000  # nvidia-smi -q is structured + valuable; allow a bit more (head-truncated)

# Compositors whose user-journal entries are the "Wayland log" (OR-matched by journalctl).
_COMPOSITORS = ("gnome-shell", "mutter", "kwin_wayland", "Xwayland", "sway", "gamescope")
_XORG_LOGS = ("~/.local/share/xorg/Xorg.0.log", "/var/log/Xorg.0.log")


def _since_arg(since: float | None) -> str | None:
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(since)) if since else None


def _run(cmd: list[str], timeout: float = 15.0) -> str:
    try:
        # errors="replace": undecodable bytes in tool output must not raise and break collection
        proc = subprocess.run(cmd, capture_output=True, text=True, errors="replace",
                              timeout=timeout)
    except (OSError, subprocess.SubprocessError):
        return ""
    return (proc.stdout or "").strip()


def kernel_log(since: float | None = None, max_bytes: int = _MAX) -> str:
    if not shutil.which("journalctl"):
        return ""
    cmd = ["journalctl", "-k", "--no-pager"]
    since_arg = _since_arg(since)
    if since_arg:
        cmd += ["--since", since_arg]
    out = _run(cmd)
    if not out or out.strip().lower() == "-- no entries --":  # journalctl's empty marker
        return ""
    return out[-max_bytes:]


def coredumps(since: float | None = None, max_bytes: int = _MAX) -> str:
    if not shutil.which("coredumpctl"):
        return ""
    cmd = ["coredumpctl", "list", "--no-pager"]
    since_arg = _since_arg(since)
    if since_arg:
        cmd += ["--since", since_arg]
    out = _run(cmd)
    if not out or "no coredumps" in out.lower():
        return ""
    return out[-max_bytes:]


def nvidia_snapshot(max_bytes: int = _NV_MAX) -> str:
    """Point-in-time `nvidia-smi -q` (head-truncated; driver/temps/clocks/ECC sit near the top)."""
    if not shutil.which("nvidia-smi"):
        return ""
    out = _run(["nvidia-smi", "-q"])
    return out[:max_bytes] if out else ""


def _xorg_log() -> Path | None:
    for cand in _XORG_LOGS:
        path = Path(os.path.expanduser(cand))
        if path.exists():
            return path
    return None


def _session_type() -> str:
    declared = os.environ.get("XDG_SESSION_TYPE", "").lower()
    if declared in ("x11", "wayland"):
        return declared
    if os.environ.get("WAYLAND_DISPLAY"):
        return "wayland"
    return "x11" if _xorg_log() else "unknown"


def _tail_file(path: Path, max_bytes: int) -> str:
    try:
        size = path.stat().st_size
        with path.open("rb") as fh:
            if size > max_bytes:
                fh.seek(size - max_bytes)
            return fh.read().decode("utf-8", "replace")
    except OSError:
        return ""
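`_tail_file` seeks by byte offset, so the cut can land in the middle of a multi-byte UTF-8 character; decoding with `"replace"` turns the orphaned byte into U+FFFD instead of raising. A small sketch of that behaviour, with `tail_file` as a stand-alone copy and an invented file name and contents:

```python
import tempfile
from pathlib import Path


def tail_file(path, max_bytes):
    """Return at most the last max_bytes of a file as text; '' if it cannot be read."""
    try:
        size = path.stat().st_size
        with path.open("rb") as fh:
            if size > max_bytes:
                fh.seek(size - max_bytes)
            return fh.read().decode("utf-8", "replace")
    except OSError:
        return ""


log = Path(tempfile.mkdtemp()) / "Xorg.0.log"
log.write_bytes(b"A" * 100 + "é-END".encode("utf-8"))  # 'é' is two bytes in UTF-8

clean = tail_file(log, 4)    # cut lands on a character boundary
split = tail_file(log, 5)    # cut lands between the two bytes of 'é'
missing = tail_file(log.with_name("missing.log"), 10)
print(repr(clean), repr(split), repr(missing))
```

The replacement character can only appear at the very start of the returned tail, which is acceptable for a log excerpt that is already truncated.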


def display_log(since: float | None = None, max_bytes: int = _MAX) -> str:
    """Xorg.0.log on X11, or the compositor's user-journal slice on Wayland ('' if none)."""
    if _session_type() == "wayland":
        if not shutil.which("journalctl"):
            return ""
        cmd = ["journalctl", "--user", "--no-pager"]
        since_arg = _since_arg(since)
        if since_arg:
            cmd += ["--since", since_arg]
        cmd += [f"_COMM={comp}" for comp in _COMPOSITORS]  # OR-matched
        out = _run(cmd)
        if not out or out.strip().lower() == "-- no entries --":
            return ""
        return out[-max_bytes:]
    log = _xorg_log()  # X11: Xorg log isn't wall-clock-timestamped, so tail rather than scope
    return _tail_file(log, max_bytes) if log else ""


def available() -> bool:
    return bool(shutil.which("journalctl") or shutil.which("coredumpctl")
                or shutil.which("nvidia-smi") or _xorg_log())


def collect(since: float | None = None) -> str:
    """Kernel + coredumps + NVIDIA snapshot + display log as one labelled block ('' if none)."""
    sections: list[str] = []
    kern = kernel_log(since)
    if kern:
        sections.append(f"--- Kernel log (journalctl -k) ---\n{kern}")
    cores = coredumps(since)
    if cores:
        sections.append(f"--- Crashed processes (coredumpctl) ---\n{cores}")
    nvidia = nvidia_snapshot()
    if nvidia:
        sections.append(f"--- NVIDIA snapshot (nvidia-smi -q) ---\n{nvidia}")
    display = display_log(since)
    if display:
        sections.append(f"--- Display server log ({_session_type()}) ---\n{display}")
    return "\n\n".join(sections)
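The Wayland branch of `display_log` relies on journalctl treating several matches on the same field as alternatives, so one invocation covers every listed compositor. The sketch below only builds the argument list and does not run journalctl; `compositor_journal_cmd` is a hypothetical helper name, not part of the module.

```python
import time

COMPOSITORS = ("gnome-shell", "mutter", "kwin_wayland", "Xwayland", "sway", "gamescope")


def since_arg(since):
    """Format an epoch as local 'YYYY-MM-DD HH:MM:SS' for --since; None when unset."""
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(since)) if since else None


def compositor_journal_cmd(since=None):
    """Argument list for the user journal, restricted to compositor processes."""
    cmd = ["journalctl", "--user", "--no-pager"]
    arg = since_arg(since)
    if arg:
        cmd += ["--since", arg]
    # Several _COMM= matches on the same field are combined with OR by journalctl.
    cmd += [f"_COMM={comp}" for comp in COMPOSITORS]
    return cmd


unscoped = compositor_journal_cmd()
scoped = compositor_journal_cmd(since=1_000_000_000)
print(" ".join(unscoped))
print(scoped[3:5])
```

Passing the list straight to `subprocess.run` (no shell) keeps the `--since` value, which contains a space, as a single argument.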
@@ -17,6 +17,10 @@ ICON = Path(__file__).parent / "assets" / "rigdoctor.svg"
|
|||||||
|
|
||||||
|
|
||||||
def main(argv: list[str] | None = None) -> int:
|
def main(argv: list[str] | None = None) -> int:
|
||||||
|
from ..core import applog
|
||||||
|
|
||||||
|
applog.setup() # opt-in app logging (M15); no-op unless logging_enabled
|
||||||
|
applog.get_logger(__name__).info("GUI starting")
|
||||||
desktop.ensure() # self-register icon + .desktop so updates show it without re-installing
|
desktop.ensure() # self-register icon + .desktop so updates show it without re-installing
|
||||||
app = QApplication(argv if argv is not None else sys.argv)
|
app = QApplication(argv if argv is not None else sys.argv)
|
||||||
app.setApplicationName("RigDoctor")
|
app.setApplicationName("RigDoctor")
|
||||||
|
|||||||
@@ -86,6 +86,10 @@ class DiagnosticDialog(QDialog):
|
|||||||
from ..core import ai
|
from ..core import ai
|
||||||
self._explain_btn.setVisible(ai.is_configured()) # opt-in only; hidden if not set up
|
self._explain_btn.setVisible(ai.is_configured()) # opt-in only; hidden if not set up
|
||||||
buttons.addWidget(self._explain_btn)
|
buttons.addWidget(self._explain_btn)
|
||||||
|
self._report_btn = QPushButton("Report") # zip this diagnostic's logs (M15)
|
||||||
|
self._report_btn.clicked.connect(self._make_report)
|
||||||
|
self._report_btn.setVisible(bool(result.dir)) # only when logging stored the session
|
||||||
|
buttons.addWidget(self._report_btn)
|
||||||
buttons.addStretch(1)
|
buttons.addStretch(1)
|
||||||
close = QPushButton("Close")
|
close = QPushButton("Close")
|
||||||
close.setObjectName("PrimaryButton")
|
close.setObjectName("PrimaryButton")
|
||||||
@@ -111,14 +115,42 @@ class DiagnosticDialog(QDialog):
|
|||||||
threading.Thread(target=self._work_explain, daemon=True).start()
|
threading.Thread(target=self._work_explain, daemon=True).start()
|
||||||
|
|
||||||
def _work_explain(self) -> None:
|
def _work_explain(self) -> None:
|
||||||
from ..core import ai, gamelogs
|
from ..core import ai, gamelogs, syslogs
|
||||||
|
|
||||||
text = ai.format_findings(self._result.findings, header="Diagnostic findings:")
|
result = self._result
|
||||||
text += "\n\nCapture summary:\n" + render_summary(self._result.summary)
|
summary = result.summary
|
||||||
logs = gamelogs.collect()
|
events = {kind for _ts, kind, _detail in summary.events}
|
||||||
|
clean = "session-stop" in events
|
||||||
|
gpu_lost = "gpu-lost" in events
|
||||||
|
|
||||||
|
lines = [f"Game: {result.game or 'unknown'}"]
|
||||||
|
if summary.start and summary.end:
|
||||||
|
lines.append(f"Capture duration: ~{int(summary.end - summary.start)}s")
|
||||||
|
outcome = "ended cleanly (no crash detected)" if clean else \
|
||||||
|
"ended without a clean stop (possible crash/freeze)"
|
||||||
|
if gpu_lost:
|
||||||
|
outcome += "; a GPU-lost event was recorded"
|
||||||
|
lines.append(f"Outcome: {outcome}")
|
||||||
|
lines.append("")
|
||||||
|
lines.append(ai.format_findings(result.findings, header="Findings:"))
|
||||||
|
lines.append("\nCapture summary:\n" + render_summary(summary))
|
||||||
|
|
||||||
|
since = (summary.start - 60) if summary.start else None
|
||||||
|
logs = gamelogs.collect(since=since) # scoped to this session
|
||||||
if logs:
|
if logs:
|
||||||
text += "\n\nRecent game/Proton/Steam logs (newest at the end):\n" + logs
|
lines.append("\nGame/Proton/Steam logs for this session:\n" + logs)
|
||||||
self._explained.emit(ai.explain(text))
|
sys_logs = syslogs.collect(since=since) # kernel log, coredumps, NVIDIA snapshot, display log
|
||||||
|
if sys_logs:
|
||||||
|
lines.append("\nSystem logs for this session (kernel, crashed processes, NVIDIA, display):\n" + sys_logs)
|
||||||
|
text = "\n".join(lines)
|
||||||
|
ok, reply = ai.explain(text)
|
||||||
|
if result.dir: # record exactly what was sent, the model, and the reply (M15)
|
||||||
|
from ..core import diagstore
|
||||||
|
diagstore.record_ai(
|
||||||
|
result.dir, provider=ai.provider(), model=ai.model(),
|
||||||
|
system=ai.SYSTEM_PROMPT, prompt=ai.build_prompt(text),
|
||||||
|
response=reply if ok else f"[error] {reply}")
|
||||||
|
self._explained.emit((ok, reply))
|
||||||
|
|
||||||
def _on_explained(self, result) -> None:
|
def _on_explained(self, result) -> None:
|
||||||
ok, text = result
|
ok, text = result
|
||||||
@@ -126,6 +158,31 @@ class DiagnosticDialog(QDialog):
|
|||||||
self._explain_btn.setText("Explain with AI")
|
self._explain_btn.setText("Explain with AI")
|
||||||
self._show_explanation(text if ok else f"AI explanation failed:\n\n{text}")
|
self._show_explanation(text if ok else f"AI explanation failed:\n\n{text}")
|
||||||
|
|
||||||
|
# --- Report bundle (M15) ------------------------------------------------------
|
||||||
|
def _make_report(self) -> None:
|
||||||
|
from PySide6.QtCore import QUrl
|
||||||
|
from PySide6.QtGui import QDesktopServices
|
||||||
|
|
||||||
|
from ..core import diagstore
|
||||||
|
|
||||||
|
self._report_btn.setEnabled(False)
|
||||||
|
try:
|
||||||
|
out = diagstore.make_report(self._result.dir)
|
||||||
|
except OSError as exc:
|
||||||
|
self._report_btn.setEnabled(True)
|
||||||
|
QMessageBox.warning(self, "Report failed", str(exc))
|
||||||
|
return
|
||||||
|
self._report_btn.setEnabled(True)
|
||||||
|
box = QMessageBox(self)
|
||||||
|
box.setWindowTitle("Report created")
|
||||||
|
box.setText(f"Saved report:\n{out}\n\nIt contains this diagnostic's logs and any AI "
|
||||||
|
"interaction (data sent, model, and reply).")
|
||||||
|
open_btn = box.addButton("Open folder", QMessageBox.ButtonRole.ActionRole)
|
||||||
|
box.addButton("OK", QMessageBox.ButtonRole.AcceptRole)
|
||||||
|
box.exec()
|
||||||
|
if box.clickedButton() is open_btn:
|
||||||
|
QDesktopServices.openUrl(QUrl.fromLocalFile(str(out.parent)))
|
||||||
|
|
||||||
def _show_explanation(self, text: str) -> None:
|
def _show_explanation(self, text: str) -> None:
|
||||||
from ..core import ai
|
from ..core import ai
|
||||||
|
|
||||||
|
|||||||
@@ -215,6 +215,23 @@ class SetupPage(QWidget):
|
|||||||
ai_layout.addWidget(self._ai_status)
|
ai_layout.addWidget(self._ai_status)
|
||||||
root.addWidget(ai_card)
|
root.addWidget(ai_card)
|
||||||
|
|
||||||
|
# Logging (M15): opt-in app logging + per-diagnostic storage (enables the Report bundle).
|
||||||
|
log_card, log_layout = _panel("Logging")
|
||||||
|
log_desc = QLabel(
|
||||||
|
"Save application logs and store each diagnostic in its own folder so you can review "
|
||||||
|
"or <b>Report</b> it. Off by default; everything stays on your machine.<br>"
|
||||||
|
f"• Diagnostics: {config.DIAGNOSTICS_DIR}<br>"
|
||||||
|
f"• Reports: {config.REPORTS_DIR}"
|
||||||
|
)
|
||||||
|
log_desc.setObjectName("Muted")
|
||||||
|
log_desc.setWordWrap(True)
|
||||||
|
log_layout.addWidget(log_desc)
|
||||||
|
self._logging = QCheckBox("Enable logging (application + diagnostics)")
|
||||||
|
self._logging.setChecked(config.load_config().get("logging_enabled", False))
|
||||||
|
self._logging.toggled.connect(self._toggle_logging)
|
||||||
|
log_layout.addWidget(self._logging)
|
||||||
|
root.addWidget(log_card)
|
||||||
|
|
||||||
# Account access (M13/M12): one Gitea token gates updates and session sharing.
|
# Account access (M13/M12): one Gitea token gates updates and session sharing.
|
||||||
upd_card, upd_layout = _panel("Account access")
|
upd_card, upd_layout = _panel("Account access")
|
||||||
hint = QLabel("A Gitea access token unlocks updates and session sharing. "
|
hint = QLabel("A Gitea access token unlocks updates and session sharing. "
|
||||||
@@ -320,6 +337,12 @@ class SetupPage(QWidget):
|
|||||||
self._ai_test_btn.setEnabled(True)
|
self._ai_test_btn.setEnabled(True)
|
||||||
self._ai_status.setText(("✓ " if ok else "✗ ") + (msg[:200] if msg else ""))
|
self._ai_status.setText(("✓ " if ok else "✗ ") + (msg[:200] if msg else ""))
|
||||||
|
|
||||||
|
def _toggle_logging(self, on: bool) -> None:
|
||||||
|
from ..core import applog
|
||||||
|
|
||||||
|
config.update_config(logging_enabled=on)
|
||||||
|
applog.setup(force=True) # attach/detach the file handler immediately
|
||||||
|
|
||||||
def _run_wizard(self) -> None:
|
def _run_wizard(self) -> None:
|
||||||
from .setup_wizard import SetupWizard
|
from .setup_wizard import SetupWizard
|
||||||
|
|
||||||
|
|||||||
@@ -62,6 +62,23 @@ class PromptTests(unittest.TestCase):
|
|||||||
text = ai.format_findings([F()])
|
text = ai.format_findings([F()])
|
||||||
self.assertIn("[WARN] GPU: Hot — 92C", text)
|
self.assertIn("[WARN] GPU: Hot — 92C", text)
|
||||||
|
|
||||||
|
def test_appid_glossary_resolves_known_ids(self):
|
||||||
|
from rigdoctor.core import steam
|
||||||
|
with mock.patch.object(steam, "appid_names", return_value={"2694490": "Path of Exile 2"}):
|
||||||
|
glossary = ai.appid_glossary("Steam log: removed AppID 2694490 ... pid 130544")
|
||||||
|
self.assertIn("2694490 = Path of Exile 2", glossary)
|
||||||
|
|
||||||
|
def test_appid_glossary_ignores_unknown_ids(self):
|
||||||
|
from rigdoctor.core import steam
|
||||||
|
with mock.patch.object(steam, "appid_names", return_value={"570": "Dota 2"}):
|
||||||
|
self.assertEqual(ai.appid_glossary("pid 130544 used 8192 MiB"), "") # not in library
|
||||||
|
|
||||||
|
def test_build_prompt_includes_glossary(self):
|
||||||
|
from rigdoctor.core import steam
|
||||||
|
with mock.patch.object(steam, "appid_names", return_value={"2694490": "Path of Exile 2"}):
|
||||||
|
prompt = ai.build_prompt("AppID 2694490 launched")
|
||||||
|
self.assertIn("Path of Exile 2", prompt)
|
||||||
|
|
||||||
|
|
||||||
class ExplainTests(unittest.TestCase):
|
class ExplainTests(unittest.TestCase):
|
||||||
def _cfg(self, **over):
|
def _cfg(self, **over):
|
||||||
|
|||||||
@@ -0,0 +1,104 @@
"""Tests for M15 per-diagnostic storage + Report bundles + app logging."""

import json
import tempfile
import unittest
import zipfile
from dataclasses import dataclass, field
from pathlib import Path
from unittest import mock

from rigdoctor.core import applog, diagstore


@dataclass
class FakeSummary:
    start: float = 1.0
    end: float = 2.0
    samples: int = 3
    events: list = field(default_factory=list)


@dataclass
class FakeFinding:
    severity: str = "ok"
    category: str = "GPU"
    title: str = "Looks fine"
    detail: str = "no issues"


@dataclass
class FakeResult:
    game: str = "Path of Exile 2"
    summary: FakeSummary = field(default_factory=FakeSummary)
    findings: list = field(default_factory=lambda: [FakeFinding()])
    dir: str | None = None


class StoreTests(unittest.TestCase):
    def setUp(self):
        self.tmp = Path(tempfile.mkdtemp())

    def test_disabled_returns_none(self):
        with mock.patch.object(diagstore, "enabled", return_value=False):
            self.assertIsNone(diagstore.store(FakeResult()))

    def test_store_writes_artifacts(self):
        with mock.patch.object(diagstore, "enabled", return_value=True), \
             mock.patch("rigdoctor.render.render_summary", return_value="SUMMARY-TEXT"), \
             mock.patch("rigdoctor.core.gamelogs.collect", return_value="LOG-TEXT"), \
             mock.patch("rigdoctor.core.syslogs.collect", return_value="SYS-LOG"), \
             mock.patch("rigdoctor.core.inventory.collect", return_value=[]), \
             mock.patch.object(diagstore.config, "DIAGNOSTICS_DIR", self.tmp / "diagnostics"):
            directory = diagstore.store(FakeResult())
        self.assertTrue((directory / "result.json").exists())
        self.assertTrue((directory / "report.txt").exists())
        self.assertEqual((directory / "gamelogs.txt").read_text(), "LOG-TEXT")
        self.assertEqual((directory / "syslogs.txt").read_text(), "SYS-LOG")
        self.assertTrue((directory / "inventory.txt").exists())  # inventory included for debugging
        data = json.loads((directory / "result.json").read_text())
        self.assertEqual(data["game"], "Path of Exile 2")
        self.assertEqual(len(data["findings"]), 1)

    def test_record_ai_then_report_includes_ai_and_applog(self):
        diag = self.tmp / "20260522-poe2"
        diag.mkdir()
        diagstore.record_ai(diag, provider="claude", model="claude-opus-4-7",
                            system="SYS", prompt="EXACT DATA SENT", response="THE REPLY")
        ai_files = list((diag / "ai").glob("explain-*.json"))
        self.assertTrue(ai_files)
        record = json.loads(ai_files[0].read_text())
        self.assertEqual(record["model"], "claude-opus-4-7")
        self.assertEqual(record["data_sent_to_model"], "EXACT DATA SENT")
        self.assertEqual(record["model_reply"], "THE REPLY")

        app_log = self.tmp / "app.log"
        app_log.write_text("app log line")
        with mock.patch.object(diagstore.config, "REPORTS_DIR", self.tmp / "reports"), \
             mock.patch.object(diagstore.config, "APP_LOG", app_log):
            out = diagstore.make_report(diag)
        self.assertTrue(out.exists())
        with zipfile.ZipFile(out) as zf:
            names = zf.namelist()
        self.assertTrue(any(n.endswith("app.log") for n in names))
        self.assertTrue(any("/ai/explain-" in n for n in names))


class AppLogTests(unittest.TestCase):
    def test_disabled_is_noop(self):
        with mock.patch.object(applog.config, "load_config", return_value={"logging_enabled": False}):
            self.assertFalse(applog.setup(force=True))

    def test_enabled_writes_file(self):
        tmp = Path(tempfile.mkdtemp())
        with mock.patch.object(applog.config, "load_config", return_value={"logging_enabled": True}), \
             mock.patch.object(applog.config, "STATE_DIR", tmp), \
             mock.patch.object(applog.config, "APP_LOG", tmp / "app.log"):
            self.assertTrue(applog.setup(force=True))
            applog.get_logger("test").info("hello world")
            applog.setup(force=True)  # cleanup path: re-run detaches/reattaches cleanly
        self.assertTrue((tmp / "app.log").exists())


if __name__ == "__main__":
    unittest.main()
@@ -1,6 +1,8 @@
|
|||||||
"""Tests for M14 game/Proton/Steam log collection."""
|
"""Tests for M14 game/Proton/Steam log collection."""
|
||||||
|
|
||||||
|
import os
|
||||||
import tempfile
|
import tempfile
|
||||||
|
import time
|
||||||
import unittest
|
import unittest
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from unittest import mock
|
from unittest import mock
|
||||||
@@ -45,5 +47,31 @@ class CollectTests(unittest.TestCase):
             self.assertEqual(gamelogs.collect(), "")
 
 
+class SinceScopingTests(unittest.TestCase):
+    def test_since_filter_keeps_window_only(self):
+        text = (
+            "[2026-05-22 13:00:00] old session line\n"
+            "[2026-05-22 13:00:01] another old line\n"
+            "[2026-05-22 14:30:00] new session launch\n"
+            "[2026-05-22 14:30:05] new session error\n"
+        )
+        since = time.mktime(time.strptime("2026-05-22 14:00:00", "%Y-%m-%d %H:%M:%S"))
+        out = gamelogs._since_filter(text, since)
+        self.assertIn("new session launch", out)
+        self.assertIn("new session error", out)
+        self.assertNotIn("old session", out)
+
+    def test_collect_skips_stale_proton_log(self):
+        tmp = Path(tempfile.mkdtemp())
+        proton = tmp / "steam-9999.log"
+        proton.write_text("stale proton output from an earlier game")
+        old_mtime = time.time() - 3600
+        os.utime(proton, (old_mtime, old_mtime))
+        since = time.time() - 60  # session started a minute ago
+        with mock.patch.object(gamelogs, "_proton_logs", return_value=[proton]), \
+                mock.patch.object(gamelogs, "_steam_console", return_value=None):
+            self.assertEqual(gamelogs.collect(since=since), "")  # stale log excluded
+
+
 if __name__ == "__main__":
     unittest.main()
@@ -0,0 +1,95 @@
+"""Tests for M15 session-scoped system-log collection (kernel + coredumps)."""
+
+import unittest
+from unittest import mock
+
+from rigdoctor.core import syslogs
+
+
+class KernelLogTests(unittest.TestCase):
+    def test_passes_since_and_tails(self):
+        with mock.patch("shutil.which", return_value="/usr/bin/journalctl"), \
+                mock.patch.object(syslogs, "_run", return_value="X" * 100 + "TAILLINE") as run:
+            out = syslogs.kernel_log(since=1_000_000_000, max_bytes=8)
+        self.assertEqual(out, "TAILLINE")
+        cmd = run.call_args[0][0]
+        self.assertIn("-k", cmd)
+        self.assertIn("--since", cmd)
+
+    def test_missing_tool_returns_empty(self):
+        with mock.patch("shutil.which", return_value=None):
+            self.assertEqual(syslogs.kernel_log(), "")
+
+
+class CoredumpTests(unittest.TestCase):
+    def test_empty_when_no_coredumps(self):
+        with mock.patch("shutil.which", return_value="/usr/bin/coredumpctl"), \
+                mock.patch.object(syslogs, "_run", return_value="No coredumps found."):
+            self.assertEqual(syslogs.coredumps(), "")
+
+    def test_returns_list(self):
+        with mock.patch("shutil.which", return_value="/usr/bin/coredumpctl"), \
+                mock.patch.object(syslogs, "_run", return_value="TIME PID SIG EXE\n... SEGV PathOfExile"):
+            out = syslogs.coredumps()
+        self.assertIn("PathOfExile", out)
+
+
+class NvidiaTests(unittest.TestCase):
+    def test_missing_tool(self):
+        with mock.patch("shutil.which", return_value=None):
+            self.assertEqual(syslogs.nvidia_snapshot(), "")
+
+    def test_snapshot_head_truncated(self):
+        with mock.patch("shutil.which", return_value="/usr/bin/nvidia-smi"), \
+                mock.patch.object(syslogs, "_run", return_value="DRIVER\n" + "x" * 99999):
+            out = syslogs.nvidia_snapshot(max_bytes=10)
+        self.assertEqual(out, "DRIVER\nxxx")  # head, not tail
+
+
+class DisplayTests(unittest.TestCase):
+    def test_session_type_env(self):
+        with mock.patch.dict("os.environ", {"XDG_SESSION_TYPE": "wayland"}):
+            self.assertEqual(syslogs._session_type(), "wayland")
+
+    def test_x11_tails_xorg_log(self):
+        import tempfile
+        from pathlib import Path
+        log = Path(tempfile.mkdtemp()) / "Xorg.0.log"
+        log.write_text("(EE) NVIDIA(GPU-0): something failed")
+        with mock.patch.object(syslogs, "_session_type", return_value="x11"), \
+                mock.patch.object(syslogs, "_xorg_log", return_value=log):
+            out = syslogs.display_log()
+        self.assertIn("(EE) NVIDIA", out)
+
+    def test_wayland_uses_user_journal(self):
+        with mock.patch.object(syslogs, "_session_type", return_value="wayland"), \
+                mock.patch("shutil.which", return_value="/usr/bin/journalctl"), \
+                mock.patch.object(syslogs, "_run", return_value="gnome-shell: GPU error") as run:
+            out = syslogs.display_log(since=1_000_000_000)
+        self.assertIn("GPU error", out)
+        cmd = run.call_args[0][0]
+        self.assertIn("--user", cmd)
+        self.assertTrue(any(a.startswith("_COMM=") for a in cmd))
+
+
+class CollectTests(unittest.TestCase):
+    def test_collect_combines_sections(self):
+        with mock.patch.object(syslogs, "kernel_log", return_value="NVRM: Xid 79"), \
+                mock.patch.object(syslogs, "coredumps", return_value="game SIGSEGV"), \
+                mock.patch.object(syslogs, "nvidia_snapshot", return_value="Driver Version 595"), \
+                mock.patch.object(syslogs, "display_log", return_value="(EE) NVIDIA"):
+            out = syslogs.collect()
+        for needle in ("Kernel log", "Xid 79", "Crashed processes", "SIGSEGV",
+                       "NVIDIA snapshot", "595", "Display server log"):
+            self.assertIn(needle, out)
+
+    def test_collect_empty_when_nothing(self):
+        with mock.patch.object(syslogs, "kernel_log", return_value=""), \
+                mock.patch.object(syslogs, "coredumps", return_value=""), \
+                mock.patch.object(syslogs, "nvidia_snapshot", return_value=""), \
+                mock.patch.object(syslogs, "display_log", return_value=""):
+            self.assertEqual(syslogs.collect(), "")
+
+
+if __name__ == "__main__":
+    unittest.main()
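The syslogs tests above pin down two different truncation behaviours: `kernel_log` keeps the tail of its output (the newest lines), while `nvidia_snapshot` keeps the head (the driver header comes first). The real `core/syslogs.py` is not shown in this diff, so the following is only a minimal sketch of how such helpers could behave; the names `tail_bytes` and `head_bytes` are hypothetical and are not taken from RigDoctor.

```python
def tail_bytes(text: str, max_bytes: int) -> str:
    """Keep at most the last max_bytes bytes of text (hypothetical helper)."""
    if max_bytes <= 0:
        return ""  # guard: data[-0:] would otherwise return everything
    data = text.encode("utf-8", errors="replace")
    if len(data) <= max_bytes:
        return text
    # errors="ignore" drops a multi-byte character cut in half at the boundary
    return data[-max_bytes:].decode("utf-8", errors="ignore")


def head_bytes(text: str, max_bytes: int) -> str:
    """Keep at most the first max_bytes bytes of text (hypothetical helper)."""
    if max_bytes <= 0:
        return ""
    data = text.encode("utf-8", errors="replace")
    if len(data) <= max_bytes:
        return text
    return data[:max_bytes].decode("utf-8", errors="ignore")


if __name__ == "__main__":
    # Same inputs as the mocked _run output in the tests above.
    print(repr(tail_bytes("X" * 100 + "TAILLINE", 8)))
    print(repr(head_bytes("DRIVER\n" + "x" * 99999, 10)))
```

Tailing suits append-only logs, where the most recent events matter most; head truncation suits a point-in-time snapshot whose summary fields are printed first.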