Compare commits
14 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 5502251789 | | |
| | 4bd51a40c3 | | |
| | 984292c368 | | |
| | bffaf73ad4 | | |
| | 7f0ab9a635 | | |
| | 12339c3282 | | |
| | c7e50ba4cb | | |
| | a3caabc0d5 | | |
| | b59f202891 | | |
| | e6d94fbd59 | | |
| | 045f40c4de | | |
| | 2ff4056d89 | | |
| | 2fe03269e4 | | |
| | ac2a3981fc | | |
@@ -5,6 +5,94 @@ All notable changes to RigDoctor are recorded here. Format follows
 (`MAJOR.MINOR.PATCH`, pre-1.0). `__version__` and `pyproject.toml` must match the git
 release tag (so the auto-updater, D18, can compare versions).

+## [0.32.0] - 2026-05-22
+### Added
+- **More for diagnostics & reports:**
+  - **`nvidia-smi -q` snapshot** — driver, throttle/clock-event reasons, clocks, power, temps,
+    PCIe link, ECC + retired pages (point-in-time at diagnostic time).
+  - **Display-server log** — auto-detected: `Xorg.0.log` on X11, or the compositor's user-journal
+    slice (gnome-shell/kwin/sway/gamescope) on Wayland.
+  - **Full system inventory** (M5 hardware/OS) is now included in each stored diagnostic and the
+    **Report** bundle — invaluable for larger/shared debugging.
+  These join the kernel log + coredump records in `syslogs.txt`/`inventory.*`, are saved per
+  diagnostic, included in the Report zip, and (logs) fed to the AI on "Explain".
+
+## [0.31.0] - 2026-05-22
+### Added
+- **Diagnostics now collect session-scoped system logs** (`core/syslogs.py`): a kernel-log
+  slice (`journalctl -k` — Xid, OOM-killer, MCE, PCIe AER, thermal, hung tasks) and
+  **crashed-process records** (`coredumpctl` — which executable, signal, and when). They're saved
+  to the diagnostic directory (`syslogs.txt`), included in the **Report** bundle, and fed to the
+  AI on "Explain" alongside the game logs. Best-effort — degrades quietly if the tools are
+  missing or access is denied; scoped to the session window so it doesn't drag in old noise.
+
+## [0.30.0] - 2026-05-22
+### Added
+- **Logging & report bundles (M15, D25)** — opt-in via one **Settings → Logging** toggle
+  (default off). When on: the app logs to a rotating `app.log`, and **each diagnostic is stored
+  in its own folder** (`~/.local/share/rigdoctor/diagnostics/<id>/`) with the capture log, a
+  structured `result.json`, a readable `report.txt`, a session-scoped game-log snapshot, and an
+  `ai/` record of every AI interaction — **the exact data sent, which model, and its reply**.
+- **Report** — a button on the diagnostic dialog (and `rigdoctor bundle`) zips a diagnostic's
+  folder plus `app.log` into `~/.local/share/rigdoctor/reports/<id>.zip` for sharing. Everything
+  stays local; the zip only leaves your machine if you share it. Available only when logging is on.
+
+## [0.29.0] - 2026-05-22
+### Added
+- **AI now resolves Steam app IDs from your library instead of guessing.** When app IDs appear
+  in the logs/findings, RigDoctor looks them up in your scanned games (`steam.appid_names()`) and
+  injects an "App IDs (resolved from your installed games)" glossary into the prompt — so the
+  model names games correctly (e.g. `2694490 = Path of Exile 2`) rather than hallucinating. Only
+  IDs it can resolve locally are listed; no network, no model "training" needed.
+
+## [0.28.1] - 2026-05-22
+### Fixed
+- **AI explanations were misreading stale/benign logs.** Three fixes so the model analyses the
+  *actual* session: (1) the prompt now states the **real game name, capture duration, and
+  outcome** (clean vs. crash) so the model stops guessing the game from log paths; (2) game logs
+  are **scoped to the session window** (Steam-console lines filtered by timestamp; a stale
+  per-app Proton log from an earlier game is skipped); (3) the reference KB flags common
+  **benign** Steam/Proton lines (`libnvidia-ml.so.1` assertion, routine minidump uploads, "fork
+  without exec") so they aren't reported as the cause. The system prompt also forbids
+  Windows-only advice (no "run as administrator") and tells the model not to invent a problem
+  when the run was clean.
+
+## [0.28.0] - 2026-05-22
+### Added
+- **AI explanations now include recent game logs.** When you press "Explain with AI" on a
+  diagnostic, RigDoctor also gathers recent **Proton** (`~/steam-<appid>.log`) and **Steam**
+  console logs (`core/gamelogs.py`, tail-read + size-bounded) and passes them to the model, so
+  it can correlate log errors with the sensor findings and pinpoint *when* something went wrong.
+### Fixed
+- The AI explanation popup now **renders Markdown** (headings, bold, lists) instead of showing
+  raw `###`/`**` — `QTextEdit.setMarkdown`, and the model is told to answer in Markdown.
+
+## [0.27.1] - 2026-05-22
+### Changed
+- AI assistant: selecting **Ollama** now pre-fills the model field with **`qwen2.5:7b`** (a
+  strong 7B that fits an 8 GB GPU; our grounding makes a 7B sufficient). It won't overwrite a
+  model you've already entered, and you can change it freely.
+
+## [0.27.0] - 2026-05-22
+### Added
+- **AI assistant (M14, D24)** — optional, **strictly opt-in, never automatic**. Explains your
+  diagnostics in plain language only when you press **"Explain with AI"** on the diagnostic
+  dialog (or run `rigdoctor ai explain`). You choose a provider explicitly (no default):
+  **Ollama** (local, private, no key) or **Claude** (Anthropic; key stored in the keyring, with
+  a consent prompt before any data is sent). Configure in **Settings → AI assistant**.
+- Answers are **grounded**: RigDoctor passes the actual findings plus matched reference facts
+  from a curated knowledge base (`core/ai_knowledge.py` — exact keyword/code match, no
+  embeddings, stdlib only), so even a small local model gets the domain facts it needs. Stdlib
+  `urllib` only — no new core dependency. Output is advisory (D9).
+- CLI: `rigdoctor ai status|test|explain`.
+
+## [0.26.1] - 2026-05-22
+### Fixed
+- **Setup wizard contrast.** The **radio buttons** (Recording trigger) were unstyled, so the
+  selected option was invisible on the dark theme — now styled with a clear accent ring + dot.
+  Bundle **checkboxes** got explicit checked/disabled states, and stay selectable even when a
+  bundle is already installed (the page no longer looks dead when everything's present).
+
 ## [0.26.0] - 2026-05-22
 ### Added
 - **Graphical setup wizard (M9).** A first-run GUI wizard (`gui/setup_wizard.py`) walks through:
+28
-1
@@ -249,9 +249,36 @@ duplicated what the GUI already shows and added surface area. Concretely:
 (preserves fish/ls/git theming), full-screen-able, with the guest read-only unless the host
 ticks "Allow the guest to type" (the D9 consent exception). Account-gated by the Gitea token.

+### D24 — AI assistant module (M14) — *DECIDED 2026-05-22; adds to D14*
+A new optional module that **explains the collected diagnostics in plain language** (likely
+root cause + suggested next steps). Adds M14 to the D14 set.
+- **Strictly opt-in, never automatic.** The model is contacted **only** on an explicit user
+  action (an "Explain with AI" button / `rigdoctor ai explain`) — never on launch, after a
+  diagnostic, in the sample/record loop, or in the background. **Configuring** a provider does
+  not trigger any call.
+- **Local-first.** Defaults to a local **Ollama** server (data never leaves the machine, no
+  key, stdlib `urllib`). An **OpenAI-compatible** endpoint (cloud or local) can be used with a
+  key (stored in the keyring like the update token). Cloud use shows a "this sends your data to
+  X" consent before the first call.
+- **Grounded & advisory.** The prompt carries only the findings we collected; output is framed
+  as suggestions (consistent with D9 — it explains/recommends, applying fixes stays
+  consent-gated). No new runtime dependency (HTTP via stdlib).
+
+### D25 — Logging & report bundles (M15) — *DECIDED 2026-05-22*
+Opt-in logging + shareable diagnostic reports.
+- **One combined `logging_enabled` toggle** (default off) controls both application logging
+  (rotating `app.log`) and per-diagnostic storage. Kept as a single switch for simplicity.
+- **Each diagnostic is stored in its own directory** (`DATA_DIR/diagnostics/<id>/`): capture
+  log, structured `result.json`, human-readable `report.txt`, a scoped game-log snapshot, and an
+  `ai/` folder recording each AI interaction (**exact data sent, provider+model, and the reply**).
+- **"Report"** zips one diagnostic directory (plus `app.log`) into `DATA_DIR/reports/` —
+  auto-saved there (no save dialog), shown with its path. Available only when logging is on
+  (nothing is stored otherwise). CLI: `rigdoctor bundle`.
+- Everything stays local; the report only leaves the machine if the user shares the zip.
+
 ## Open

-None currently — all tracked decisions (D1–D23) are resolved. New questions will be added
+None currently — all tracked decisions (D1–D25) are resolved. New questions will be added
 here as they arise. Remaining detail to flesh out during build: the tray's supporting-action
 set (D13), per-module apt package names, M12's tunnel/token specifics, and M13's
 update mechanism (APT repo vs. self-installed `.deb`).
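D25 describes "Report" as zipping one diagnostic directory plus `app.log` into the reports folder. The sketch below shows one way that could look with stdlib `zipfile`. The function name `make_report_sketch` and the archive layout are assumptions; the project's real `diagstore.make_report` is not shown in this diff.

```python
import tempfile
import zipfile
from pathlib import Path


def make_report_sketch(diag_dir: Path, app_log: Path, reports_dir: Path) -> Path:
    """Zip one diagnostic directory (plus the app log) into reports_dir/<id>.zip.

    Hypothetical illustration of the D25 "Report" action, stdlib only.
    """
    reports_dir.mkdir(parents=True, exist_ok=True)
    out = reports_dir / f"{diag_dir.name}.zip"
    with zipfile.ZipFile(out, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(diag_dir.rglob("*")):
            if path.is_file():
                # Keep the diagnostic id as the top-level folder inside the zip.
                arcname = Path(diag_dir.name) / path.relative_to(diag_dir)
                zf.write(path, arcname=arcname.as_posix())
        if app_log.is_file():  # the app log is optional, so skip it when absent
            zf.write(app_log, arcname="app.log")
    return out


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        diag = root / "diagnostics" / "demo-id"
        diag.mkdir(parents=True)
        (diag / "result.json").write_text("{}")
        print(make_report_sketch(diag, root / "app.log", root / "reports"))
```

Writing the zip outside the diagnostic directory avoids the archive including itself, and nothing here touches the network, which matches the "everything stays local" rule.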
+24
-1
@@ -2,7 +2,8 @@

 Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done

-> Module set per D14, plus **M12 (session sharing, D16)** and **M13 (auto-update, D18)**.
+> Module set per D14, plus **M12 (session sharing, D16)**, **M13 (auto-update, D18)**,
+> **M14 (AI assistant, D24)**, and **M15 (logging & reports, D25)**.
 > **M7 (stress/repro) was dropped (D7).** M10/M11 are the GUI and tray modules (D10/D11).
 > GPU scope reads "all (NVIDIA first)" — NVIDIA first, others via the vendor abstraction (D4).

@@ -20,6 +21,8 @@ Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done
 | M9 | Installer | (meta) | none | all | P1 | 🟨 |
 | M12 | Session sharing (shared terminal) | Sharing | none (relay) | all | P3 | ✅ |
 | M13 | Auto-update | (core) | none (stdlib; user-local file swap) | all | P3 | ✅ |
+| M14 | AI assistant (explain diagnostics) | (optional) | none (stdlib urllib; Ollama or Claude) | all | P3 | ✅ |
+| M15 | Logging & report bundles | (core) | none (stdlib logging + zip) | all | P3 | ✅ |
 | ~~M7~~ | ~~Stress / repro~~ | — | — | — | — | ❌ dropped (D7) |

 ## Notes per module
@@ -117,6 +120,25 @@ Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done
 atomic symlink swap → restart, incl. the daemon). HTTPS-only, version-check-only (no
 telemetry), opt-out-able. Surfaced in the GUI; `rigdoctor update` in the CLI. (`.deb` users
 update via apt instead.)
+- **M14 AI assistant** (D24) — optional, **strictly opt-in, never automatic**: explains the
+  collected diagnostics in plain language only when the user presses **"Explain with AI"**
+  (`core/ai.py`, GUI button on the diagnostic dialog, `rigdoctor ai explain`). The user picks a
+  provider explicitly (no default): **Ollama** (local, private, no key) or **Claude** (Anthropic
+  Messages API, key in the keyring; consent prompt before sending). Answers are **grounded** —
+  we pass the actual findings plus matched reference facts from a curated knowledge base
+  (`core/ai_knowledge.py`, "RAG-lite": exact keyword/code match, no embeddings, stdlib only),
+  which lifts a small local model and sharpens Claude. Stdlib `urllib` (no pip deps); output is
+  advisory (D9). Configure in **Settings → AI assistant**.
+
+- **M15 Logging & report bundles** (D25) — opt-in via one `logging_enabled` toggle (default off):
+  application logging to a rotating `app.log` (`core/applog.py`) and **per-diagnostic storage**
+  (`core/diagstore.py`) — each diagnostic gets its own `DATA_DIR/diagnostics/<id>/`: capture,
+  `result.json`, `report.txt`, the full **inventory** (M5: hardware/OS), scoped **game logs**
+  (`core/gamelogs.py`), scoped **system logs** (`core/syslogs.py` — `journalctl -k`,
+  `coredumpctl`, an `nvidia-smi -q` snapshot, and the X11/Wayland display-server log), and an
+  `ai/` record of every AI interaction (exact data sent, model, reply). **"Report"** zips one
+  into `DATA_DIR/reports/` (GUI button on the diagnostic dialog; CLI `rigdoctor bundle`). Logs
+  are session-scoped and fed to the AI on "Explain". Stays local; shareable on demand.

 ## Bundles (final — D14)
 - **Essential:** M1 + M3 + M4 *(the MVP, NVIDIA-only — D5)*
@@ -124,6 +146,7 @@ Status: ⬜ not started · 🟦 designing · 🟨 in progress · ✅ done
 - **Diagnostics:** M5 + M6
 - **Desktop UI:** M10 + M11 *(adds PySide6)*
 - **Sharing:** M12 *(session sharing / remote assist — D16)*
+- **AI:** M14 *(optional AI explanations — D24)*

 ## MVP candidate — *confirmed (D5)*
 **M1 + M3 + M4 (Essential), NVIDIA-only, CLI-first.** Gives a working tool that captures the
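The M14 note describes grounding as "RAG-lite": exact keyword/code matching against a curated knowledge base, with no embeddings. A minimal sketch of that idea follows. The dictionary contents and the name `match_facts` are hypothetical (the real `core/ai_knowledge.py` is not part of this diff); the two example keys are the benign log lines named in the 0.28.1 changelog entry.

```python
# Hypothetical reference facts keyed by an exact substring to look for.
REFERENCE_FACTS: dict[str, str] = {
    "libnvidia-ml.so.1": "BENIGN: the libnvidia-ml.so.1 assertion is a common Steam/Proton line.",
    "fork without exec": "BENIGN: 'fork without exec' is a common Steam/Proton line.",
}


def match_facts(findings_text: str, facts: dict[str, str] = REFERENCE_FACTS) -> list[str]:
    """Return the facts whose key appears verbatim (case-insensitively) in the text."""
    lowered = findings_text.lower()
    return [fact for key, fact in facts.items() if key.lower() in lowered]


if __name__ == "__main__":
    for fact in match_facts("steam: Fork Without Exec detected in child"):
        print(fact)
```

Plain substring matching keeps the lookup deterministic and dependency-free, at the cost of missing paraphrased or misspelled log lines.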
@@ -89,6 +89,21 @@ Ubuntu + NVIDIA first; `.deb` distribution (see `DECISIONS.md`).
 - [removed] The read-only stats view (`share serve`) and bundle export — dropped per D23; the
   shared terminal is the only sharing mode.

+## Phase 7 — AI assistant (M14, D24)
+- [x] **Explain diagnostics with AI** — opt-in, never automatic (`core/ai.py`, "Explain with AI"
+      button + `rigdoctor ai explain`). Provider chosen explicitly: **Ollama** (local) or
+      **Claude** (Anthropic). Grounded with a curated reference KB (`core/ai_knowledge.py`,
+      RAG-lite, exact match — no embeddings); stdlib `urllib`. Settings → AI assistant.
+- [ ] *Possible follow-ups:* interactive chat grounded in the data; more reference-KB entries;
+      an "Explain" button on the System Health page.
+
+## Phase 8 — Logging & report bundles (M15, D25)
+- [x] **Opt-in logging** (one `logging_enabled` toggle): rotating `app.log` (`core/applog.py`)
+      + **per-diagnostic storage** in its own directory (`core/diagstore.py`) — capture,
+      result, report, scoped game logs, and AI-interaction records.
+- [x] **Report** bundle — zip a diagnostic (incl. exactly what was sent to the AI, the model,
+      and its reply) into the reports folder. GUI button + `rigdoctor bundle`.
+
 > **Out of scope:** stress/repro module (D7); multi-distro support and packaging beyond
 > Ubuntu/apt + `.deb` (D15) — a thin seam is kept but not built out.
@@ -152,6 +152,28 @@ type too (e.g. a sudo password, which stays local and is never sent to B). Accou
 Gitea token; per-session share code. The shared terminal preserves colors/theming and can be
 viewed full-screen. *(The earlier read-only stats view / bundle export were dropped — D23.)*

+### M14 — AI assistant (D24)
+Optional module that explains the collected diagnostics in plain language. **Strictly opt-in and
+never automatic** — the model is contacted only when the user presses "Explain with AI" (GUI) or
+runs `rigdoctor ai explain`; configuring it contacts nothing. The user explicitly chooses a
+provider (no default): **Ollama** (local, private, no key) or **Claude** (Anthropic Messages
+API, key in the keyring, with a consent prompt before sending data). Answers are **grounded** in
+the actual findings plus matched reference facts from a curated, exact-match knowledge base
+("RAG-lite" — no embeddings/vector store, stdlib only); no fine-tuning. HTTP via stdlib `urllib`
+(no new core dependency); output is advisory (consistent with D9).
+
+### M15 — Logging & report bundles (D25)
+Opt-in (one `logging_enabled` toggle, default off). When on: the application logs to a rotating
+`app.log`, and **each diagnostic is stored in its own directory** (capture log, structured
+result, human-readable report, the full **inventory** (M5 hardware/OS), session-scoped **game
+logs** (Proton/Steam) and **system logs** (`journalctl -k`, `coredumpctl`, an `nvidia-smi -q`
+snapshot, and the X11/Wayland display-server log), and a record of every AI interaction — the
+exact data sent, the model, and its reply). The collected logs are also fed to the AI on
+"Explain". Collection is best-effort (degrades if tools are missing/denied). A **Report** action zips one diagnostic's directory
+(plus the app log) into a shareable bundle saved under the reports folder (GUI button; CLI
+`rigdoctor bundle`). Everything stays local — a report only leaves the machine if the user
+shares the zip. Stdlib only (`logging` + `zipfile`).
+
 ## 5. Non-functional requirements
 - **Zero hard deps for the core/CLI/daemon** — Python stdlib + tools already present. **Qt
   (PySide6) is required only by the GUI (M10) and tray (M11) modules**, declared in the
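The M15 text above says system logs are session-scoped and collected best-effort. A rough sketch of both ideas follows, assuming the session window is known as two `datetime` values. The helper names are hypothetical and `core/syslogs.py` may work differently; `journalctl -k`, `--since`, `--until`, and `--no-pager` are standard journalctl options.

```python
import shutil
import subprocess
from datetime import datetime


def kernel_log_command(start: datetime, end: datetime) -> list[str]:
    """Build a journalctl call for kernel messages inside the session window."""
    fmt = "%Y-%m-%d %H:%M:%S"  # a timestamp format journalctl accepts
    return [
        "journalctl", "-k", "--no-pager",
        "--since", start.strftime(fmt),
        "--until", end.strftime(fmt),
    ]


def collect_best_effort(cmd: list[str], timeout: float = 20.0) -> str:
    """Run cmd and return stdout, or "" if the tool is missing, denied, or fails."""
    if shutil.which(cmd[0]) is None:
        return ""
    try:
        proc = subprocess.run(cmd, text=True, capture_output=True, timeout=timeout)
    except (subprocess.SubprocessError, OSError):
        return ""
    return proc.stdout if proc.returncode == 0 else ""


if __name__ == "__main__":
    window = (datetime(2026, 5, 22, 20, 0, 0), datetime(2026, 5, 22, 20, 45, 0))
    print(" ".join(kernel_log_command(*window)))
```

Returning an empty string on any failure keeps a missing or restricted tool from breaking the diagnostic, which is the "degrades quietly" behaviour the changelog describes.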
+1
-1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "rigdoctor"
-version = "0.26.0"
+version = "0.32.0"
 description = "Modular hardware monitoring & crash diagnostics for Linux gamers."
 readme = "README.md"
 requires-python = ">=3.11"
@@ -1,3 +1,3 @@
 """RigDoctor — modular hardware monitoring & crash diagnostics for Linux gamers."""

-__version__ = "0.26.0"
+__version__ = "0.32.0"
@@ -438,6 +438,57 @@ def cmd_service(args) -> int:
     return 0


+def cmd_ai(args) -> int:
+    """AI assistant (M14) — opt-in; only contacts a provider on `test`/`explain`."""
+    from .core import ai
+
+    sub = args.ai_cmd or "status"
+    if sub == "status":
+        print(f"Provider: {ai.provider() or 'not configured'}")
+        if ai.provider():
+            print(f" {ai.provider_label()}")
+            print(f" ready: {'yes' if ai.is_configured() else 'no'}")
+        else:
+            print(" Configure it in the GUI: Settings → AI assistant.")
+        return 0
+
+    if not ai.is_configured():
+        print("AI is not configured. Set it up in the GUI (Settings → AI assistant).")
+        return 1
+
+    if sub == "test":
+        ok, msg = ai.explain("Connectivity test — reply exactly: RigDoctor AI is working.")
+        print(msg)
+        return 0 if ok else 1
+
+    # explain: gather the current health findings and ask the provider to explain them.
+    from .core import health
+
+    findings = health.run_health_checks()
+    text = ai.format_findings(findings)
+    print(f"Asking {ai.provider_label()} to explain the current health findings…\n")
+    ok, msg = ai.explain(text)
+    print(msg)
+    return 0 if ok else 1
+
+
+def cmd_bundle(args) -> int:
+    """Zip the latest stored diagnostic into a report bundle (M15) — needs logging enabled."""
+    from .core import diagstore
+
+    if not diagstore.enabled():
+        print("Logging is off. Enable it (Settings → Logging, or set logging_enabled) so "
+              "diagnostics are stored and can be reported.")
+        return 1
+    directory = diagstore.latest_dir()
+    if directory is None:
+        print("No stored diagnostics yet — run a diagnostic first.")
+        return 1
+    out = diagstore.make_report(directory)
+    print(f"Report written: {out}")
+    return 0
+
+
 def cmd_gameenv(args) -> int:
     from dataclasses import asdict

@@ -645,10 +696,23 @@ def build_parser() -> argparse.ArgumentParser:
     mode_p.add_argument("mode", choices=("manual", "always-on", "game-launch"))
     mode_p.set_defaults(func=cmd_service)
     svc_p.set_defaults(func=cmd_service, service_cmd=None)
+
+    ai_p = sub.add_parser("ai", help="AI assistant (M14): explain diagnostics — opt-in, never automatic")
+    ai_sub = ai_p.add_subparsers(dest="ai_cmd")
+    ai_sub.add_parser("status", help="show the configured provider (contacts nothing)").set_defaults(func=cmd_ai)
+    ai_sub.add_parser("test", help="send a tiny probe to verify connectivity").set_defaults(func=cmd_ai)
+    ai_sub.add_parser("explain", help="explain the current health findings with AI").set_defaults(func=cmd_ai)
+    ai_p.set_defaults(func=cmd_ai, ai_cmd=None)
+
+    bundle_p = sub.add_parser("bundle", help="zip the latest stored diagnostic into a report bundle (M15)")
+    bundle_p.set_defaults(func=cmd_bundle)
     return p


 def main(argv: list[str] | None = None) -> int:
+    from .core import applog
+
+    applog.setup()  # opt-in app logging (M15); no-op unless logging_enabled
     args = build_parser().parse_args(argv)
     return args.func(args)
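The parser changes above rely on one pattern: `rigdoctor ai` with no sub-subcommand still dispatches to `cmd_ai`, which then falls back to `status`. The standalone sketch below reproduces only that pattern with stdlib `argparse`; the handler body is a stand-in, not the project's code.

```python
import argparse


def cmd_ai(args) -> int:
    # Stand-in handler: the real one inspects the provider; this only echoes the choice.
    sub = args.ai_cmd or "status"
    print(f"ai subcommand: {sub}")
    return 0


def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="rigdoctor")
    sub = p.add_subparsers(dest="cmd")
    ai_p = sub.add_parser("ai")
    ai_sub = ai_p.add_subparsers(dest="ai_cmd")
    for name in ("status", "test", "explain"):
        ai_sub.add_parser(name).set_defaults(func=cmd_ai)
    # Defaults on the parent make a bare "ai" resolve to cmd_ai with ai_cmd=None.
    ai_p.set_defaults(func=cmd_ai, ai_cmd=None)
    return p


if __name__ == "__main__":
    args = build_parser().parse_args(["ai"])
    raise SystemExit(args.func(args))
```

Setting `func` on both the parent and each child means `args.func(args)` in `main` never needs a special case for the missing sub-subcommand.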
+78
-37
@@ -37,12 +37,23 @@ SPAWN_LOG = STATE_DIR / "recorder.out"
 # not config: refreshed by the background scan on every launch).
 GAMES_FILE = STATE_DIR / "games.json"

+# Logging & reports (opt-in via `logging_enabled`). App log: rotating file of app events.
+# Each diagnostic is stored under DIAGNOSTICS_DIR/<id>/; "Report" zips one into REPORTS_DIR.
+APP_LOG = STATE_DIR / "app.log"
+DIAGNOSTICS_DIR = DATA_DIR / "diagnostics"
+REPORTS_DIR = DATA_DIR / "reports"
+
 # Update access token (M13) — gates updates to Gitea account holders (D18).
 # Stored in the OS keyring (Secret Service / GNOME Keyring) via `secret-tool` when
 # available — encrypted at rest, unlocked with the login session — else a 0600 file.
 TOKEN_FILE = CONFIG_DIR / "token"
 _SECRET_ATTRS = ["application", "rigdoctor", "type", "update-token"]

+# AI assistant (M14, D24) — API key for the Claude provider, stored in the keyring like the
+# update token (Ollama is local and needs none). Separate keyring entry + file fallback.
+AI_KEY_FILE = CONFIG_DIR / "ai-key"
+_AI_SECRET_ATTRS = ["application", "rigdoctor", "type", "ai-key"]
+

 def _secret_tool() -> str | None:
     return shutil.which("secret-tool")
@@ -53,27 +64,27 @@ def keyring_available() -> bool:
     return _secret_tool() is not None


-def _keyring_store(token: str) -> bool:
+def _keyring_store(value: str, attrs: list[str], label: str) -> bool:
     tool = _secret_tool()
     if not tool:
         return False
     try:
         proc = subprocess.run(
-            [tool, "store", "--label", "RigDoctor update token", *_SECRET_ATTRS],
-            input=token, text=True, capture_output=True, timeout=20,
+            [tool, "store", "--label", label, *attrs],
+            input=value, text=True, capture_output=True, timeout=20,
         )
         return proc.returncode == 0
     except (subprocess.SubprocessError, OSError):
         return False


-def _keyring_lookup() -> str | None:
+def _keyring_lookup(attrs: list[str]) -> str | None:
     tool = _secret_tool()
     if not tool:
         return None
     try:
         proc = subprocess.run(
-            [tool, "lookup", *_SECRET_ATTRS], text=True, capture_output=True, timeout=20
+            [tool, "lookup", *attrs], text=True, capture_output=True, timeout=20
         )
         if proc.returncode == 0 and proc.stdout.strip():
             return proc.stdout.strip()
@@ -82,54 +93,67 @@ def _keyring_lookup() -> str | None:
     return None


-def _keyring_clear() -> None:
+def _keyring_clear(attrs: list[str]) -> None:
     tool = _secret_tool()
     if not tool:
         return
     try:
-        subprocess.run([tool, "clear", *_SECRET_ATTRS], capture_output=True, timeout=20)
+        subprocess.run([tool, "clear", *attrs], capture_output=True, timeout=20)
     except (subprocess.SubprocessError, OSError):
         pass


+def _load_secret(env_var: str | None, attrs: list[str], file: Path) -> str | None:
+    if env_var:
+        env = os.environ.get(env_var)
+        if env and env.strip():
+            return env.strip()
+    from_keyring = _keyring_lookup(attrs)
+    if from_keyring:
+        return from_keyring
+    try:
+        value = file.read_text().strip()
+        return value or None
+    except OSError:
+        return None
+
+
+def _save_secret(value: str, attrs: list[str], label: str, file: Path) -> None:
+    value = value.strip()
+    if _keyring_store(value, attrs, label):
+        try:  # don't leave a plaintext copy once it's in the keyring
+            file.unlink()
+        except OSError:
+            pass
+        return
+    CONFIG_DIR.mkdir(parents=True, exist_ok=True)
+    file.write_text(value + "\n")
+    try:
+        file.chmod(0o600)
+    except OSError:
+        pass
+
+
+def _clear_secret(attrs: list[str], file: Path) -> None:
+    _keyring_clear(attrs)
+    try:
+        file.unlink()
+    except OSError:
+        pass
+
+
 def load_token() -> str | None:
     """Token from $RIGDOCTOR_TOKEN, then the OS keyring, then a 0600 file."""
-    env = os.environ.get("RIGDOCTOR_TOKEN")
-    if env and env.strip():
-        return env.strip()
-    from_keyring = _keyring_lookup()
-    if from_keyring:
-        return from_keyring
-    try:
-        token = TOKEN_FILE.read_text().strip()
-        return token or None
-    except OSError:
-        return None
+    return _load_secret("RIGDOCTOR_TOKEN", _SECRET_ATTRS, TOKEN_FILE)


 def save_token(token: str) -> None:
     """Save to the OS keyring if possible (encrypted); else a 0600 file."""
-    token = token.strip()
-    if _keyring_store(token):
-        try:  # don't leave a plaintext copy once it's in the keyring
-            TOKEN_FILE.unlink()
-        except OSError:
-            pass
-        return
-    CONFIG_DIR.mkdir(parents=True, exist_ok=True)
-    TOKEN_FILE.write_text(token + "\n")
-    try:
-        TOKEN_FILE.chmod(0o600)
-    except OSError:
-        pass
+    _save_secret(token, _SECRET_ATTRS, "RigDoctor update token", TOKEN_FILE)


 def clear_token() -> None:
-    _keyring_clear()
-    try:
-        TOKEN_FILE.unlink()
-    except OSError:
-        pass
+    _clear_secret(_SECRET_ATTRS, TOKEN_FILE)


 def token_backend() -> str:
@@ -137,12 +161,25 @@ def token_backend() -> str:
     env = os.environ.get("RIGDOCTOR_TOKEN")
     if env and env.strip():
         return "env"
-    if _keyring_lookup() is not None:
+    if _keyring_lookup(_SECRET_ATTRS) is not None:
         return "keyring"
     if TOKEN_FILE.exists():
         return "file"
     return "none"

+
+def load_ai_key() -> str | None:
+    """Claude API key from $RIGDOCTOR_AI_KEY, then the OS keyring, then a 0600 file (M14)."""
+    return _load_secret("RIGDOCTOR_AI_KEY", _AI_SECRET_ATTRS, AI_KEY_FILE)
+
+
+def save_ai_key(key: str) -> None:
+    _save_secret(key, _AI_SECRET_ATTRS, "RigDoctor AI key", AI_KEY_FILE)
+
+
+def clear_ai_key() -> None:
+    _clear_secret(_AI_SECRET_ATTRS, AI_KEY_FILE)
+
 DEFAULTS: dict = {
     "interval": 1.0,  # sampling interval in seconds (default ≤1 Hz — NFR)
     "log_max_bytes": 20_000_000,  # rotate a log segment past this size
@@ -156,6 +193,10 @@ DEFAULTS: dict = {
     "steam_libraries": [],  # Steam library paths to scan for games (M6); empty = none picked yet
     "trigger_mode": "manual",  # crash-logger trigger (D6): manual | always-on | game-launch
     "setup_done": False,  # first-run GUI setup wizard completed (M9)
+    "ai_provider": "",  # AI assistant (M14, D24): "" (unset) | "ollama" | "claude"
+    "ai_model": "",  # model name (e.g. "llama3.1" for Ollama; blank = Claude default)
+    "ai_endpoint": "http://localhost:11434",  # Ollama server base URL (Claude uses a fixed endpoint)
+    "logging_enabled": False,  # opt-in: app logging + per-diagnostic storage + Report (M15)
 }
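The refactor above funnels both secrets through one lookup order: environment variable, then keyring, then a 0600 file. The sketch below isolates that precedence so it can be exercised without `secret-tool`. The function name `load_secret_sketch`, the injected `keyring_lookup` callable, and the `environ` parameter are assumptions added for testability; they are not the project's signatures.

```python
from __future__ import annotations

import os
import tempfile
from collections.abc import Callable, Mapping
from pathlib import Path


def load_secret_sketch(
    env_var: str | None,
    keyring_lookup: Callable[[], str | None],
    file: Path,
    environ: Mapping[str, str] | None = None,
) -> str | None:
    """Resolve a secret in the order: environment variable, keyring, file."""
    env = os.environ if environ is None else environ
    if env_var:
        value = env.get(env_var, "").strip()
        if value:
            return value
    from_keyring = keyring_lookup()
    if from_keyring:
        return from_keyring
    try:
        return file.read_text().strip() or None
    except OSError:  # missing or unreadable file means "no secret stored"
        return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        token_file = Path(tmp) / "token"
        token_file.write_text("from-file\n")
        print(load_secret_sketch("DEMO_TOKEN", lambda: None, token_file, {}))
```

Checking the environment first lets a CI job or a one-off shell session override the stored value without touching the keyring.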
@@ -0,0 +1,211 @@
|
||||
"""AI assistant (M14, D24): explain the collected diagnostics in plain language.
|
||||
|
||||
**Strictly opt-in and never automatic** — the model is contacted ONLY from a direct user
|
||||
action ("Explain with AI" / ``rigdoctor ai explain``), never on launch, after a diagnostic, or
|
||||
in any loop. Choosing/configuring a provider does not contact anything. The user must pick a
|
||||
provider explicitly (there is no default).
|
||||
|
||||
Two providers, both over stdlib ``urllib`` (no pip deps in core):
|
||||
* **ollama** — a local server (data stays on the machine, no key).
|
||||
* **claude** — the Anthropic Messages API (key in the keyring).
|
||||
|
||||
Answers are *grounded*: we pass the actual findings plus matched reference facts
|
||||
(:mod:`ai_knowledge`) and ask the model to reason over them. Output is advisory (D9).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import re
|
||||
import urllib.error
|
||||
import urllib.request
|
||||
|
||||
from .. import config
|
||||
from . import ai_knowledge
|
||||
|
||||
_APPID_RE = re.compile(r"\b\d{5,7}\b") # Steam app IDs are 5–7 digits
|
||||
|
||||
PROVIDERS = ("ollama", "claude")
|
||||
OLLAMA_DEFAULT_ENDPOINT = "http://localhost:11434"
|
||||
# Suggested Ollama model — strong instruction-following that fits an 8 GB GPU at Q4. Because we
|
||||
# ground the prompt with reference facts, a 7B model is sufficient here.
|
||||
OLLAMA_SUGGESTED_MODEL = "qwen2.5:7b"
|
||||
CLAUDE_ENDPOINT = "https://api.anthropic.com/v1/messages"
|
||||
CLAUDE_DEFAULT_MODEL = "claude-opus-4-7"
|
||||
CLAUDE_MAX_TOKENS = 2000
|
||||
ANTHROPIC_VERSION = "2023-06-01"
|
||||
|
||||
SYSTEM_PROMPT = (
|
||||
"You are RigDoctor's hardware-diagnostics assistant for Linux gamers (Ubuntu + NVIDIA, games "
|
||||
"via Steam/Proton). You are given session context, the structured findings RigDoctor "
|
||||
"collected — which may include recent game/Proton/system log excerpts scoped to this session "
|
||||
"— plus reference facts. Use the GAME NAME from the session context; never guess the game "
|
||||
"from log paths or app IDs. Correlate log errors with the findings to pinpoint WHEN and WHY "
|
||||
"things went wrong, identify the most likely root cause, and give concrete, ordered next "
|
||||
"steps with exact Linux commands where useful.\n"
|
||||
"Rules: Base your reasoning ONLY on the data and reference facts provided — never invent "
|
||||
"readings, hardware, or log lines. This is LINUX: never suggest Windows-only steps (e.g. "
|
||||
"'run as administrator', registry edits, toggling antivirus). Treat log lines flagged BENIGN "
|
||||
"in the reference facts as non-causal. If no crash was recorded and there are no warning or "
|
||||
"critical findings, say plainly that the session looks healthy and do NOT manufacture a "
|
||||
"problem. Be concise. Present fixes as suggestions and warn before anything that risks data "
|
||||
"loss or instability. Format your answer in Markdown."
|
||||
)
|
||||
|
||||
|
||||
def provider() -> str:
    return config.load_config().get("ai_provider", "")


def model() -> str:
    m = config.load_config().get("ai_model", "").strip()
    if m:
        return m
    return CLAUDE_DEFAULT_MODEL if provider() == "claude" else ""


def endpoint() -> str:
    ep = config.load_config().get("ai_endpoint", OLLAMA_DEFAULT_ENDPOINT).strip()
    return ep or OLLAMA_DEFAULT_ENDPOINT


def is_local() -> bool:
    return provider() == "ollama"


def is_configured() -> bool:
    """Whether the chosen provider is ready (does NOT contact anything)."""
    p = provider()
    if p == "claude":
        return bool(config.load_ai_key())
    if p == "ollama":
        return bool(model())  # a model name is required; endpoint has a default
    return False  # no provider chosen


def provider_label() -> str:
    p = provider()
    if p == "claude":
        return f"Claude ({model()})"
    if p == "ollama":
        return f"Ollama ({model() or '?'} @ {endpoint()})"
    return "not configured"


def appid_glossary(text: str) -> str:
    """Resolve Steam app IDs that appear in `text` against the user's scanned library.

    We don't teach the model app IDs — we look them up locally and hand it the mapping, so it
    names games correctly instead of guessing. Only IDs we can resolve are listed.
    """
    candidates = set(_APPID_RE.findall(text))
    if not candidates:
        return ""
    try:
        from . import steam
        names = steam.appid_names()
    except Exception:  # never let a glossary lookup break an explanation
        return ""
    known = sorted((i, names[i]) for i in candidates if i in names)
    if not known:
        return ""
    return "App IDs (resolved from your installed games):\n" + "\n".join(
        f"- {appid} = {name}" for appid, name in known)


def build_prompt(findings_text: str) -> str:
    """The user-message content: app-ID glossary + matched reference facts + the findings."""
    parts = []
    glossary = appid_glossary(findings_text)
    if glossary:
        parts.append(glossary)
        parts.append("")
    facts = ai_knowledge.relevant(findings_text)
    if facts:
        parts.append("Reference facts (use these to interpret the findings):")
        parts += [f"- {f}" for f in facts]
        parts.append("")
    parts.append("Collected findings:")
    parts.append(findings_text.strip() or "(no findings provided)")
    return "\n".join(parts)


def explain(findings_text: str, timeout: float = 120.0) -> tuple[bool, str]:
    """Contact the configured provider to explain the findings. Returns (ok, text | error).

    The caller MUST be a direct user action (D24) — this never runs automatically.
    """
    content = build_prompt(findings_text)
    try:
        if provider() == "claude":
            return _claude(content, timeout)
        if provider() == "ollama":
            return _ollama(content, timeout)
        return False, "No AI provider is configured (Settings → AI assistant)."
    except urllib.error.HTTPError as exc:
        return False, _http_error(exc)
    except (urllib.error.URLError, OSError, TimeoutError) as exc:
        return False, f"Couldn't reach the AI provider: {exc}"
    except (ValueError, KeyError, IndexError) as exc:
        return False, f"Unexpected response from the AI provider: {exc}"


def _post(url: str, payload: dict, headers: dict, timeout: float) -> dict:
    req = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", **headers},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def _ollama(content: str, timeout: float) -> tuple[bool, str]:
    if not model():
        return False, "No Ollama model is set (Settings → AI assistant)."
    payload = {"model": model(), "system": SYSTEM_PROMPT, "prompt": content, "stream": False}
    out = _post(endpoint().rstrip("/") + "/api/generate", payload, {}, timeout)
    return True, (out.get("response") or "").strip() or "(the model returned an empty response)"


def _claude(content: str, timeout: float) -> tuple[bool, str]:
    key = config.load_ai_key()
    if not key:
        return False, "No Claude API key is set (Settings → AI assistant)."
    # One-shot call: no prompt caching (single request, short system prompt) and no thinking
    # (keeps a button-press snappy). Sampling params are omitted (removed on current Opus).
    payload = {
        "model": model(),
        "max_tokens": CLAUDE_MAX_TOKENS,
        "system": SYSTEM_PROMPT,
        "messages": [{"role": "user", "content": content}],
    }
    headers = {"x-api-key": key, "anthropic-version": ANTHROPIC_VERSION}
    out = _post(CLAUDE_ENDPOINT, payload, headers, timeout)
    text = "\n".join(b.get("text", "") for b in out.get("content", []) if b.get("type") == "text")
    return True, text.strip() or "(the model returned no text)"


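Review note: the last two lines of `_claude` keep only blocks whose `type` is `"text"` and fall back to a placeholder when nothing is left. A minimal sketch of that step in isolation, assuming the response carries a `content` list of typed blocks as the code above expects. The name `join_text_blocks` and the sample payload are hypothetical, for illustration only.

```python
def join_text_blocks(response: dict) -> str:
    """Concatenate the text of every block typed "text"; other block types are skipped."""
    blocks = response.get("content", [])
    text = "\n".join(b.get("text", "") for b in blocks if b.get("type") == "text")
    return text.strip() or "(the model returned no text)"


# Invented sample data: two text blocks around a non-text block.
sample = {
    "content": [
        {"type": "text", "text": "Likely cause: unstable undervolt."},
        {"type": "tool_use", "id": "example", "name": "noop", "input": {}},
        {"type": "text", "text": "Next step: revert to stock clocks."},
    ]
}
print(join_text_blocks(sample))
```

Filtering by type rather than indexing `content[0]` means an unexpected leading block cannot turn a valid reply into a `KeyError`.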
def _http_error(exc: urllib.error.HTTPError) -> str:
    detail = ""
    try:
        body = exc.read().decode("utf-8", "replace")
        err = json.loads(body).get("error", "")
        # The error field may be an object with a "message" or a bare string; accept both so a
        # differently shaped error body can't raise from inside the error handler.
        detail = (err.get("message", "") if isinstance(err, dict) else str(err or "")) or ""
    except (ValueError, OSError, AttributeError):  # not JSON, unreadable, or not a JSON object
        pass
    hint = " — check your API key in Settings → AI assistant." if exc.code in (401, 403) else ""
    return f"AI request failed (HTTP {exc.code}){hint}{(': ' + detail) if detail else ''}"


def format_findings(findings, header: str = "") -> str:
    """Render M4 Finding objects (or similar) into the plain-text block we send the model."""
    lines = [header] if header else []
    for f in findings:
        severity = str(getattr(f, "severity", "")).upper()
        category = getattr(f, "category", "")
        title = getattr(f, "title", "")
        detail = getattr(f, "detail", "")
        line = f"- [{severity}] {category}: {title}".rstrip()
        if detail:
            line += f" — {detail}"
        lines.append(line)
    return "\n".join(lines) if lines else "No findings."
@@ -0,0 +1,91 @@
"""Curated reference knowledge for the AI assistant (M14, D24) — "RAG-lite".

A small, hand-written set of domain facts (Xid codes, SMART attributes, common Linux-gaming
error signatures, tunable meanings). At explain-time we select the entries whose triggers
appear in the collected findings and inject them into the prompt, so even a small local model
gets the relevant facts instead of having to recall them. Provider-agnostic — it sharpens
Claude too.

Retrieval is exact keyword/substring matching, not embeddings: the keys here (``Xid 79``,
``SMART 197``, ``fallen off the bus``) are precise, so a vector store would be overkill and
would break the stdlib-only rule. Each entry is ``(triggers, fact)``; a trigger matches
case-insensitively against the findings text.
"""

from __future__ import annotations

# (triggers, fact). Keep facts short, factual, and cause-oriented — they go into the prompt.
ENTRIES: list[tuple[tuple[str, ...], str]] = [
    (("xid 79", "fallen off the bus", "gpu has fallen"),
     "NVIDIA Xid 79 / 'GPU has fallen off the bus' = the driver lost PCIe contact with the GPU "
     "mid-operation. Usual causes, in order: insufficient/unstable PSU power or a bad power "
     "cable, an unstable overclock/undervolt, PCIe link or riser issues, or overheating. Often "
     "fatal to the session (hard freeze)."),
    (("xid 13", "graphics engine exception"),
     "NVIDIA Xid 13 = graphics engine exception, frequently an unstable GPU overclock or a "
     "faulty application shader; revert any OC/UV and test."),
    (("xid 31", "fifo: mmu fault", "mmu fault"),
     "NVIDIA Xid 31 = MMU fault (illegal memory access by the app/driver) — often a game/driver "
     "bug or unstable VRAM overclock."),
    (("xid 8", "xid 62", "xid 63", "xid 64"),
     "These Xid codes commonly indicate VRAM/ECC or memory-training problems — suspect failing "
     "VRAM or an unstable memory overclock."),
    (("smart 197", "current_pending_sector", "pending sector"),
     "SMART 197 (Current Pending Sector) > 0 = sectors the drive can't read and is waiting to "
     "reallocate — early sign of a failing disk. Back up now and run an extended self-test."),
    (("smart 198", "offline_uncorrectable", "uncorrectable"),
     "SMART 198 (Offline Uncorrectable) > 0 = sectors that failed to read/write — the drive is "
     "degrading; back up immediately."),
    (("smart 5", "reallocated_sector", "reallocated sector"),
     "SMART 5 (Reallocated Sectors) climbing over time = the drive is using spares for bad "
     "sectors; a rising count predicts failure."),
    (("media and data integrity errors", "percentage used", "available spare"),
     "NVMe health: 'Media and Data Integrity Errors' > 0 is concerning; 'Percentage Used' near "
     "or over 100% and 'Available Spare' below the threshold mean the SSD is near end-of-life."),
    (("thermal throttling", "throttle", "tjmax", "package id 0"),
     "Sustained CPU/GPU temperatures at the thermal limit cause throttling (clocks drop to shed "
     "heat) — check cooling, fan curves, paste, and case airflow."),
    (("oom", "out of memory", "oom-killer", "killed process"),
     "The kernel OOM-killer terminates processes when RAM (and swap) are exhausted — a freeze "
     "or a game crashing to desktop under memory pressure points here; check swap and "
     "vm.swappiness, and watch for a memory leak."),
    (("segfault", "general protection fault", "segmentation fault"),
     "A segfault/GP-fault is a process accessing invalid memory — for games under Proton it's "
     "often a Proton/Wine or anticheat incompatibility, or unstable RAM (run memtest)."),
    (("proton", "wine", "d3d", "vkd3d", "dxvk"),
     "Proton/Wine issues: mismatched Proton version, missing vkd3d/DXVK, or shader-cache "
     "corruption are common. Try a known-good Proton version and clear the shader cache."),
    (("pcie_aspm", "aspm"),
     "PCIe ASPM (Active State Power Management) can cause GPU/NVMe instability on some boards; "
     "setting pcie_aspm=off is a common stability fix at a small idle-power cost."),
    (("cpu_governor", "powersave", "schedutil", "performance governor"),
     "The CPU frequency governor sets the clock policy; 'performance' avoids latency spikes from "
     "ramp-up at a higher power draw, while 'powersave'/'schedutil' favor efficiency."),
    (("nvidia persistence", "persistence mode"),
     "NVIDIA persistence mode keeps the driver loaded when no app is using the GPU, avoiding "
     "re-init stalls — harmless to enable."),
    (("libnvidia-ml.so", "interface.h", "failed to load \"libnvidia-ml"),
     "BENIGN: a Steam log assertion 'Failed to load libnvidia-ml.so.1' (from interface.h) is "
     "logged on many normal launches — the Steam runtime sandbox can't see the host NVML library. "
     "It is NOT by itself a crash cause. Only investigate the driver if the GPU is genuinely "
     "undetected (nvidia-smi fails)."),
    (("minidump", ".dmp", "uploading minidump"),
     "BENIGN-by-default: a minidump upload line means a crash handler ran AND that the game/engine "
     "routinely uploads dumps; it is not proof that THIS session crashed unless a hard freeze or "
     "non-zero exit was also recorded. Don't treat a routine minidump line as the root cause."),
    (("fork without exec", "skipping destruction"),
     "BENIGN: 'pid X != Y, skipping destruction (fork without exec?)' is routine Steam/Proton "
     "process bookkeeping, not an error."),
]


def relevant(findings_text: str, limit: int = 8) -> list[str]:
    """Reference facts whose triggers appear in the findings text (case-insensitive)."""
    haystack = findings_text.lower()
    hits: list[str] = []
    for triggers, fact in ENTRIES:
        if any(t in haystack for t in triggers):
            hits.append(fact)
            if len(hits) >= limit:
                break
    return hits
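Review note: plain substring triggers are simple, but short keys can over-match. For example, `"xid 8"` is also a substring of `"xid 81"`, and `"oom"` of `"zoom"`. Below is a minimal sketch of one possible tightening, assuming that requiring a trigger to stand alone is acceptable for these keys. `relevant_bounded`, `_bounded`, and the two sample entries are hypothetical, and the sketch stays within the standard library.

```python
import re

# Two sample entries in the same (triggers, fact) shape as ENTRIES; illustrative only.
SAMPLE_ENTRIES = [
    (("xid 8",), "fact about Xid 8"),
    (("oom", "out of memory"), "fact about the OOM-killer"),
]


def _bounded(trigger: str) -> re.Pattern:
    """Compile a trigger so it matches only when not glued to other letters or digits."""
    return re.compile(r"(?<![a-z0-9])" + re.escape(trigger) + r"(?![a-z0-9])")


def relevant_bounded(findings_text: str, entries=SAMPLE_ENTRIES, limit: int = 8) -> list[str]:
    """Like relevant(), but a trigger must not sit inside a longer alphanumeric token."""
    haystack = findings_text.lower()
    hits: list[str] = []
    for triggers, fact in entries:
        if any(_bounded(t).search(haystack) for t in triggers):
            hits.append(fact)
            if len(hits) >= limit:
                break
    return hits


print(relevant_bounded("NVRM: Xid 81 reported"))   # "xid 8" no longer fires on Xid 81
print(relevant_bounded("NVRM: Xid 8, pid=1"))
```

Triggers that are deliberately fragments (for example `.dmp`, or `throttle`, which should also hit `throttled`) would stop matching under this rule, so a real change would likely need to keep substring matching for those and bound only the short numeric keys.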
@@ -0,0 +1,63 @@
"""Application logging (M15) — opt-in via the `logging_enabled` setting.

When enabled, app events/errors are written to a rotating file (`config.APP_LOG`); when
disabled, nothing is written (no file is created). All RigDoctor code logs through
``applog.get_logger(__name__)``; the handler is attached once at startup by :func:`setup`.
Stdlib ``logging`` only.
"""

from __future__ import annotations

import logging
from logging.handlers import RotatingFileHandler

from .. import config

_ROOT = "rigdoctor"
_configured = False


def setup(force: bool = False) -> bool:
    """Attach the file handler if logging is enabled. Idempotent. Returns whether it's on."""
    global _configured
    logger = logging.getLogger(_ROOT)
    enabled = bool(config.load_config().get("logging_enabled", False))

    if not enabled:
        if force:  # toggled off at runtime — detach so we stop writing
            for h in list(logger.handlers):
                logger.removeHandler(h)
                h.close()
            _configured = False
        return False

    if _configured and not force:
        return True
    for h in list(logger.handlers):  # avoid duplicate handlers on re-setup
        logger.removeHandler(h)
        h.close()
    try:
        config.STATE_DIR.mkdir(parents=True, exist_ok=True)
        handler = RotatingFileHandler(config.APP_LOG, maxBytes=2_000_000, backupCount=3,
                                      encoding="utf-8")
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)-7s %(name)s: %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False
        _configured = True
        logger.info("logging started (rigdoctor %s)", _version())
    except OSError:
        return False
    return True


def get_logger(name: str) -> logging.Logger:
    """A child logger. Safe to call before setup — it just won't write until enabled."""
    short = name.split(".")[-1]
    return logging.getLogger(f"{_ROOT}.{short}")


def _version() -> str:
    from .. import __version__
    return __version__
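Review note: this module relies on a property of stdlib `logging`: records from a child logger propagate to handlers attached to its ancestors, so one handler on `rigdoctor` serves every `rigdoctor.<module>` logger. A small self-contained sketch of that behavior, using an in-memory handler instead of a file. The `demo` logger names and the `ListHandler` class are made up for illustration.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted records in a list so the example needs no file."""

    def __init__(self) -> None:
        super().__init__()
        self.lines: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.lines.append(self.format(record))


root = logging.getLogger("demo")        # stands in for the "rigdoctor" logger
child = logging.getLogger("demo.diag")  # what a get_logger-style helper would hand out

child.info("dropped")                   # before setup: INFO is below the default level

handler = ListHandler()
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
root.addHandler(handler)
root.setLevel(logging.INFO)
root.propagate = False                  # keep records out of the global root logger

child.info("kept")                      # travels up to the handler on "demo"
print(handler.lines)
```

This is why `get_logger` is safe to call before `setup`: a logger without a configured ancestor simply discards INFO records.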
@@ -28,6 +28,7 @@ class DiagnosticResult:
    game: str | None
    summary: Summary  # capture window: peak temps/power, events, last samples (M3)
    findings: list[Finding]  # health findings: Xid/SMART/driver/etc. (M4)
    dir: str | None = None  # storage directory when logging is on (M15); else None


@dataclass
@@ -97,7 +98,22 @@ def finish(last_n: int = 10, log_path=None) -> DiagnosticResult:
    summary = summarize(path, last_n=last_n)
    game = _game_from_summary(summary) or (reccontrol.read_status() or {}).get("game")
    findings = run_health_checks()
    return DiagnosticResult(game=game, summary=summary, findings=findings)
    result = DiagnosticResult(game=game, summary=summary, findings=findings)
    _store(result, path, summary)
    return result


def _store(result: DiagnosticResult, capture_path, summary: Summary) -> None:
    """Persist the diagnostic to its own directory when logging is enabled (M15)."""
    try:
        from . import diagstore

        since = (summary.start - 60) if summary.start else None
        directory = diagstore.store(result, capture_path, since=since)
        if directory:
            result.dir = str(directory)
    except Exception:  # storage must never break a diagnostic
        pass


# --- hard-crash detection & post-crash analysis -----------------------------------
@@ -184,4 +200,6 @@ def analyze_crash(last_n: int = 15) -> DiagnosticResult:
    findings += check_previous_boot()  # the crashed boot's kernel log
    findings += run_health_checks(include_journal=False)  # SMART/driver/persistence/temps
    findings.sort(key=lambda f: _SEV_ORDER.get(f.severity, 9))
    return DiagnosticResult(game=_game_from_summary(summary), summary=summary, findings=findings)
    result = DiagnosticResult(game=_game_from_summary(summary), summary=summary, findings=findings)
    _store(result, _crash_path(), summary)
    return result

@@ -0,0 +1,152 @@
"""Per-diagnostic storage + Report bundles (M15) — opt-in via `logging_enabled`.

When logging is on, each finished diagnostic is persisted to its own directory under
``config.DIAGNOSTICS_DIR/<id>/`` (capture log, structured result, human-readable report, a
game-log snapshot, and any AI interactions). "Report" zips one directory — including exactly
**what was sent to the AI, which model, and its reply** — into ``config.REPORTS_DIR``.
"""

from __future__ import annotations

import json
import shutil
import time
import zipfile
from dataclasses import asdict, is_dataclass
from pathlib import Path

from .. import config


def enabled() -> bool:
    return bool(config.load_config().get("logging_enabled", False))


def _slug(name: str | None) -> str:
    s = "".join(c if c.isalnum() else "-" for c in (name or "session").lower())
    return s.strip("-")[:40] or "session"


def _new_dir(game: str | None) -> Path:
    base = config.DIAGNOSTICS_DIR
    stamp = time.strftime("%Y%m%d-%H%M%S")
    name = f"{stamp}-{_slug(game)}"
    target = base / name
    n = 1
    while target.exists():
        target = base / f"{name}-{n}"
        n += 1
    target.mkdir(parents=True, exist_ok=True)
    return target


def _as_dict(obj):
    if is_dataclass(obj):
        return asdict(obj)
    return getattr(obj, "__dict__", {}) or str(obj)


def store(result, capture_path=None, since: float | None = None) -> Path | None:
    """Persist a finished diagnostic to its own directory. Returns the dir, or None if off."""
    if not enabled():
        return None
    from ..render import render_summary
    from . import ai, gamelogs, syslogs

    target = _new_dir(getattr(result, "game", None))

    if capture_path and Path(capture_path).exists():
        try:
            shutil.copyfile(capture_path, target / "capture.jsonl")
        except OSError:
            pass

    payload = {
        "game": getattr(result, "game", None),
        "stored_at": time.time(),
        "summary": _as_dict(result.summary),
        "findings": [_as_dict(f) for f in result.findings],
    }
    _write(target / "result.json", json.dumps(payload, indent=2, default=str))

    report = [f"Game: {getattr(result, 'game', None) or 'unknown'}", "",
              render_summary(result.summary), "",
              ai.format_findings(result.findings, header="Findings:")]
    _write(target / "report.txt", "\n".join(report))

    try:
        logs = gamelogs.collect(since=since)
        if logs:
            _write(target / "gamelogs.txt", logs)
    except OSError:
        pass

    try:
        sys_logs = syslogs.collect(since=since)
        if sys_logs:
            _write(target / "syslogs.txt", sys_logs)
    except OSError:
        pass

    try:  # full hardware/OS inventory (M5) — invaluable for larger debugging in a shared report
        from . import inventory

        sections = inventory.collect()
        _write(target / "inventory.txt", inventory.render_text(sections))
        _write(target / "inventory.json", inventory.render_json(sections))
    except Exception:  # inventory probes vary by machine; never let it break storage
        pass
    return target


def record_ai(diag_dir, *, provider: str, model: str, system: str, prompt: str, response: str) -> None:
    """Save one AI interaction (exact data sent, model, reply) into the diagnostic's `ai/` dir."""
    if not diag_dir:
        return
    out = Path(diag_dir) / "ai"
    try:
        out.mkdir(parents=True, exist_ok=True)
    except OSError:
        return
    stamp = time.strftime("%Y%m%d-%H%M%S")
    record = {
        "timestamp": time.time(), "provider": provider, "model": model,
        "system_prompt": system, "data_sent_to_model": prompt, "model_reply": response,
    }
    _write(out / f"explain-{stamp}.json", json.dumps(record, indent=2, default=str))
    readable = (
        f"Provider: {provider}\nModel: {model}\n\n"
        f"=== System prompt ===\n{system}\n\n"
        f"=== Data sent to the model ===\n{prompt}\n\n"
        f"=== Model reply ===\n{response}\n"
    )
    _write(out / f"explain-{stamp}.txt", readable)


def make_report(diag_dir) -> Path:
    """Zip a diagnostic directory (plus the app log) into REPORTS_DIR; return the zip path."""
    diag_dir = Path(diag_dir)
    config.REPORTS_DIR.mkdir(parents=True, exist_ok=True)
    out = config.REPORTS_DIR / f"report-{diag_dir.name}.zip"
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(diag_dir.rglob("*")):
            if path.is_file():
                zf.write(path, arcname=str(Path(diag_dir.name) / path.relative_to(diag_dir)))
        if config.APP_LOG.exists():  # the application log, for context around the session
            zf.write(config.APP_LOG, arcname=str(Path(diag_dir.name) / "app.log"))
    return out


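Review note: a minimal sketch of the same bundling idea against a throwaway directory, to show how `arcname` keeps every entry under one top-level folder inside the zip. The function name `bundle`, the file names, and the temporary paths are assumptions for illustration, not RigDoctor's actual layout.

```python
import tempfile
import zipfile
from pathlib import Path


def bundle(diag_dir: Path, reports_dir: Path) -> Path:
    """Zip every file under diag_dir, each entry prefixed with the directory's own name."""
    reports_dir.mkdir(parents=True, exist_ok=True)
    out = reports_dir / f"report-{diag_dir.name}.zip"
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(diag_dir.rglob("*")):
            if path.is_file():
                inner = Path(diag_dir.name) / path.relative_to(diag_dir)
                zf.write(path, arcname=inner.as_posix())  # forward slashes in the archive
    return out


with tempfile.TemporaryDirectory() as tmp:
    diag = Path(tmp) / "20260522-120000-demo"
    (diag / "ai").mkdir(parents=True)
    (diag / "report.txt").write_text("Game: demo\n", encoding="utf-8")
    (diag / "ai" / "explain.txt").write_text("reply\n", encoding="utf-8")
    archive = bundle(diag, Path(tmp) / "reports")
    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())

print(names)
```

Writing the zip outside the diagnostic directory matters here: if the archive lived under `diag_dir`, `rglob("*")` would pick up the half-written zip itself.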
def latest_dir() -> Path | None:
    try:
        dirs = [d for d in config.DIAGNOSTICS_DIR.iterdir() if d.is_dir()]
        # stat() stays inside the try: a directory can vanish between the listing and the sort
        return max(dirs, key=lambda d: d.stat().st_mtime) if dirs else None
    except OSError:
        return None


def _write(path: Path, text: str) -> None:
    try:
        path.write_text(text, encoding="utf-8")
    except OSError:
        pass
@@ -0,0 +1,116 @@
"""Collect recent game / Proton / Steam logs to enrich an AI diagnostic (M14).

Reads logs that already exist on disk — no change to how the game is launched. Two reliable
sources: Proton's per-app log (``~/steam-<appid>.log``, written when ``PROTON_LOG=1``) and
Steam's own console log. Each is tail-read and size-bounded so the AI prompt stays small. The
text is fed to the AI alongside the findings so it can see *when* something went wrong (a
vkd3d/DXVK error, a crash line, the exit code) rather than only the sensor summary.
"""

from __future__ import annotations

import os
import re
import time
from pathlib import Path

# Steam keeps logs under its install root; ~/.steam/steam usually symlinks to the real one.
_STEAM_LOG_DIRS = ("~/.steam/steam/logs", "~/.local/share/Steam/logs", "~/.steam/root/logs")
_STEAM_LOG_FILES = ("console-linux.txt", "console_log.txt", "stderr.txt")
_TS = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")


def _line_epoch(line: str) -> float | None:
    m = _TS.match(line)
    if not m:
        return None
    try:
        return time.mktime(time.strptime(m.group(1), "%Y-%m-%d %H:%M:%S"))
    except ValueError:
        return None


def _since_filter(text: str, since: float) -> str:
    """Keep lines from the first timestamp >= `since` onward (logs are chronological).

    Untimestamped lines before the window are dropped; once inside the window every line is
    kept (so multi-line entries survive). This scopes a long-lived Steam log to one session.
    """
    out: list[str] = []
    including = False
    for line in text.splitlines():
        epoch = _line_epoch(line)
        if epoch is not None and epoch >= since:
            including = True
        if including:
            out.append(line)
    return "\n".join(out)


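Review note: a worked example of the scoping rule described in `_since_filter`, assuming Steam-style `[YYYY-MM-DD HH:MM:SS]` line prefixes. The helpers below are a compact restatement for illustration (the names `_epoch` and `scope_to_session` are hypothetical), and the log lines are invented sample data.

```python
from __future__ import annotations

import re
import time

_TS = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")
_FMT = "%Y-%m-%d %H:%M:%S"


def _epoch(line: str) -> float | None:
    """Local-time epoch of a line's leading timestamp, or None if it has none."""
    m = _TS.match(line)
    if not m:
        return None
    try:
        return time.mktime(time.strptime(m.group(1), _FMT))
    except ValueError:  # digits in the right shape but not a real date
        return None


def scope_to_session(text: str, since: float) -> str:
    """Drop everything before the first line stamped at/after `since`; keep the rest."""
    lines = text.splitlines()
    for index, line in enumerate(lines):
        stamp = _epoch(line)
        if stamp is not None and stamp >= since:
            return "\n".join(lines[index:])
    return ""


log = "\n".join([
    "[2026-05-22 19:00:00] old session: game exited",
    "    continuation of the old entry",
    "[2026-05-22 21:30:05] new session: game launched",
    "    untimestamped detail that belongs to the new entry",
])
since = time.mktime(time.strptime("2026-05-22 21:30:00", _FMT))
print(scope_to_session(log, since))
```

The old entry and its continuation line are dropped together, while the untimestamped detail after the first in-window timestamp is kept. That is the property that lets multi-line entries survive the filter.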
def _tail(path: Path, max_bytes: int) -> str:
    """Last ``max_bytes`` of a file, decoded leniently (empty string on error)."""
    try:
        size = path.stat().st_size
        with path.open("rb") as fh:
            if size > max_bytes:
                fh.seek(size - max_bytes)
            return fh.read().decode("utf-8", "replace")
    except OSError:
        return ""


def _proton_logs() -> list[Path]:
    try:
        logs = list(Path.home().glob("steam-*.log"))
    except OSError:
        return []
    # Sort with the OSError-safe _mtime helper: a log can disappear between glob() and stat().
    return sorted(logs, key=_mtime, reverse=True)


def _steam_console() -> Path | None:
    for directory in _STEAM_LOG_DIRS:
        base = Path(os.path.expanduser(directory))
        for name in _STEAM_LOG_FILES:
            candidate = base / name
            if candidate.exists():
                return candidate
    return None


def available() -> bool:
    return bool(_proton_logs() or _steam_console())


def collect(since: float | None = None, max_bytes: int = 8000) -> str:
    """Recent Proton + Steam log tails as one labelled text block ('' if none).

    With ``since`` (epoch), scope to that session: skip a Proton log not written during/after
    the session (a stale per-app log from an earlier game), and keep only Steam-console lines
    timestamped at/after ``since`` — so we don't feed the model an unrelated past session.
    """
    sections: list[str] = []

    protons = _proton_logs()
    if protons:
        log = protons[0]
        fresh = since is None or _mtime(log) >= since
        tail = _tail(log, max_bytes).strip() if fresh else ""
        if tail:
            sections.append(f"--- Proton log ({log.name}) ---\n{tail}")

    console = _steam_console()
    if console:
        raw = _tail(console, 40000 if since else max_bytes)
        if since is not None:
            raw = _since_filter(raw, since)
        raw = raw.strip()[-max_bytes:].strip()
        if raw:
            sections.append(f"--- Steam log ({console.name}) ---\n{raw}")
    return "\n\n".join(sections)


def _mtime(path: Path) -> float:
    try:
        return path.stat().st_mtime
    except OSError:
        return 0.0
@@ -318,6 +318,11 @@ def cached_games() -> list[Game]:
    return [Game(**{k: g[k] for k in Game.__dataclass_fields__ if k in g}) for g in cache.get("games", [])]


def appid_names() -> dict[str, str]:
    """{appid: name} for the user's scanned games — lets us resolve IDs seen in logs (M14)."""
    return {g.appid: g.name for g in cached_games() if g.appid and g.name}


def rescan(cfg: dict | None = None) -> ScanResult:
    """Scan the selected libraries, diff against the cache, and persist the result.


@@ -0,0 +1,141 @@
"""Session-scoped system logs for diagnostics (M15): kernel, coredumps, NVIDIA, display.

Covers what the *system* logged when something went wrong, so the report bundle and the AI both
see it:
  * kernel ring-buffer slice (`journalctl -k`) — Xid, OOM-killer, MCE, PCIe AER, thermal, hung tasks
  * systemd-coredump records (`coredumpctl`) — did the game/wine dump core (SIGSEGV/ABRT), when
  * an `nvidia-smi -q` snapshot — driver, throttle/clock-event reasons, clocks, power, temps, PCIe,
    ECC + retired pages (point-in-time at diagnostic time)
  * the display-server log — `Xorg.0.log` on X11, or the compositor's user-journal slice on Wayland
Best-effort and size-bounded: degrades silently if a tool is missing or access is denied. Stdlib only.
"""

from __future__ import annotations

import os
import shutil
import subprocess
import time
from pathlib import Path

_MAX = 8000  # cap each log section so the prompt/report stays small
_NV_MAX = 10000  # nvidia-smi -q is structured + valuable; allow a bit more (head-truncated)

# Compositors whose user-journal entries are the "Wayland log" (OR-matched by journalctl).
_COMPOSITORS = ("gnome-shell", "mutter", "kwin_wayland", "Xwayland", "sway", "gamescope")
_XORG_LOGS = ("~/.local/share/xorg/Xorg.0.log", "/var/log/Xorg.0.log")


def _since_arg(since: float | None) -> str | None:
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(since)) if since else None


def _run(cmd: list[str], timeout: float = 15.0) -> str:
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.SubprocessError):
        return ""
    return (proc.stdout or "").strip()


def kernel_log(since: float | None = None, max_bytes: int = _MAX) -> str:
    if not shutil.which("journalctl"):
        return ""
    cmd = ["journalctl", "-k", "--no-pager"]
    since_arg = _since_arg(since)
    if since_arg:
        cmd += ["--since", since_arg]
    out = _run(cmd)
    if not out or out.strip().lower() == "-- no entries --":  # journalctl's empty marker
        return ""
    return out[-max_bytes:]


def coredumps(since: float | None = None, max_bytes: int = _MAX) -> str:
    if not shutil.which("coredumpctl"):
        return ""
    cmd = ["coredumpctl", "list", "--no-pager"]
    since_arg = _since_arg(since)
    if since_arg:
        cmd += ["--since", since_arg]
    out = _run(cmd)
    if not out or "no coredumps" in out.lower():
        return ""
    return out[-max_bytes:]


def nvidia_snapshot(max_bytes: int = _NV_MAX) -> str:
    """Point-in-time `nvidia-smi -q` (head-truncated — driver/temps/clocks/ECC sit near the top)."""
    if not shutil.which("nvidia-smi"):
        return ""
    out = _run(["nvidia-smi", "-q"])
    return out[:max_bytes] if out else ""


def _xorg_log() -> Path | None:
    for cand in _XORG_LOGS:
        path = Path(os.path.expanduser(cand))
        if path.exists():
            return path
    return None


def _session_type() -> str:
    declared = os.environ.get("XDG_SESSION_TYPE", "").lower()
    if declared in ("x11", "wayland"):
        return declared
    if os.environ.get("WAYLAND_DISPLAY"):
        return "wayland"
    return "x11" if _xorg_log() else "unknown"


def _tail_file(path: Path, max_bytes: int) -> str:
    try:
        size = path.stat().st_size
        with path.open("rb") as fh:
            if size > max_bytes:
                fh.seek(size - max_bytes)
            return fh.read().decode("utf-8", "replace")
    except OSError:
        return ""


def display_log(since: float | None = None, max_bytes: int = _MAX) -> str:
    """Xorg.0.log on X11, or the compositor's user-journal slice on Wayland ('' if none)."""
    if _session_type() == "wayland":
        if not shutil.which("journalctl"):
            return ""
        cmd = ["journalctl", "--user", "--no-pager"]
        since_arg = _since_arg(since)
        if since_arg:
            cmd += ["--since", since_arg]
        cmd += [f"_COMM={comp}" for comp in _COMPOSITORS]  # OR-matched
        out = _run(cmd)
        if not out or out.strip().lower() == "-- no entries --":
            return ""
        return out[-max_bytes:]
    log = _xorg_log()  # X11: Xorg log isn't wall-clock-timestamped, so tail rather than scope
    return _tail_file(log, max_bytes) if log else ""


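Review note: the Wayland branch leans on a `journalctl` matching rule, namely that several matches on the same field (here `_COMM=`) are treated as alternatives rather than combined. A small sketch that only builds the argument list, so it runs on machines without systemd. `compositor_journal_cmd` is a hypothetical helper, and the compositor names are a shortened sample.

```python
from __future__ import annotations

import time


def compositor_journal_cmd(compositors: tuple[str, ...], since: float | None = None) -> list[str]:
    """Build (not run) a user-journal query matching any of the given process names."""
    cmd = ["journalctl", "--user", "--no-pager"]
    if since is not None:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(since))
        cmd += ["--since", stamp]
    # Matches on the same field are alternatives, so this reads as sway OR gamescope.
    cmd += [f"_COMM={name}" for name in compositors]
    return cmd


print(compositor_journal_cmd(("sway", "gamescope")))
```

Passing the command as a list, as `_run` does, also avoids shell quoting problems with the space inside the `--since` timestamp.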
def available() -> bool:
|
||||
return bool(shutil.which("journalctl") or shutil.which("coredumpctl")
|
||||
or shutil.which("nvidia-smi") or _xorg_log())
|
||||
|
||||
|
||||
def collect(since: float | None = None) -> str:
|
||||
"""Kernel + coredumps + NVIDIA snapshot + display log as one labelled block ('' if none)."""
|
||||
sections: list[str] = []
|
||||
kern = kernel_log(since)
|
||||
if kern:
|
||||
sections.append(f"--- Kernel log (journalctl -k) ---\n{kern}")
|
||||
cores = coredumps(since)
|
||||
if cores:
|
||||
sections.append(f"--- Crashed processes (coredumpctl) ---\n{cores}")
|
||||
nvidia = nvidia_snapshot()
|
||||
if nvidia:
|
||||
sections.append(f"--- NVIDIA snapshot (nvidia-smi -q) ---\n{nvidia}")
|
||||
display = display_log(since)
|
||||
if display:
|
||||
sections.append(f"--- Display server log ({_session_type()}) ---\n{display}")
|
||||
return "\n\n".join(sections)
|
||||
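Review note: the hunk above truncates in two directions. The `nvidia-smi -q` snapshot keeps the head, because the fields of interest come first, while log files keep the tail, because the newest lines matter most. A minimal standalone sketch of both, assuming nothing beyond the standard library; `head_text`, `tail_file`, and `demo` are illustrative names, not the module's API:

```python
import tempfile
from pathlib import Path


def head_text(text: str, max_chars: int) -> str:
    """Keep the beginning of a snapshot whose key fields come first."""
    return text[:max_chars]


def tail_file(path: Path, max_bytes: int) -> str:
    """Return at most the last max_bytes of a file, decoded leniently ('' on I/O error)."""
    try:
        size = path.stat().st_size
        with path.open("rb") as fh:
            if size > max_bytes:
                fh.seek(size - max_bytes)  # skip everything before the final window
            return fh.read().decode("utf-8", "replace")
    except OSError:
        return ""


def demo() -> tuple[str, str]:
    """Exercise both helpers on made-up data (hypothetical log content)."""
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "example.log"
        log.write_bytes(b"old noise\n" * 50 + b"(EE) last error\n")
        head = head_text("Driver Version\n" + "x" * 1000, 14)
        return head, tail_file(log, 16)
```

Seeking by byte offset can land in the middle of a multi-byte character, which is why the decode uses the `"replace"` error handler rather than failing.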
@@ -17,6 +17,10 @@ ICON = Path(__file__).parent / "assets" / "rigdoctor.svg"


def main(argv: list[str] | None = None) -> int:
    from ..core import applog

    applog.setup()  # opt-in app logging (M15); no-op unless logging_enabled
    applog.get_logger(__name__).info("GUI starting")
    desktop.ensure()  # self-register icon + .desktop so updates show it without re-installing
    app = QApplication(argv if argv is not None else sys.argv)
    app.setApplicationName("RigDoctor")

@@ -2,15 +2,19 @@

from __future__ import annotations

-from PySide6.QtCore import Qt
import threading

+from PySide6.QtCore import Qt, Signal
from PySide6.QtGui import QFont
from PySide6.QtWidgets import (
    QDialog,
    QFrame,
    QHBoxLayout,
    QLabel,
    QMessageBox,
    QPushButton,
    QScrollArea,
    QTextEdit,
    QVBoxLayout,
    QWidget,
)
@@ -20,8 +24,12 @@ from .widgets import finding_card


class DiagnosticDialog(QDialog):
    _explained = Signal(object)  # (ok, text) from a user-triggered AI explanation

    def __init__(self, result, parent=None) -> None:
        super().__init__(parent)
        self._result = result
        self._explained.connect(self._on_explained)
        self.setWindowTitle(f"Diagnostic — {result.game}" if result.game else "Diagnostic")
        self.resize(660, 680)

@@ -73,9 +81,126 @@ class DiagnosticDialog(QDialog):
        root.addWidget(scroll, 1)

        buttons = QHBoxLayout()
        self._explain_btn = QPushButton("Explain with AI")
        self._explain_btn.clicked.connect(self._explain_with_ai)
        from ..core import ai
        self._explain_btn.setVisible(ai.is_configured())  # opt-in only; hidden if not set up
        buttons.addWidget(self._explain_btn)
        self._report_btn = QPushButton("Report")  # zip this diagnostic's logs (M15)
        self._report_btn.clicked.connect(self._make_report)
        self._report_btn.setVisible(bool(result.dir))  # only when logging stored the session
        buttons.addWidget(self._report_btn)
        buttons.addStretch(1)
        close = QPushButton("Close")
        close.setObjectName("PrimaryButton")
        close.clicked.connect(self.accept)
        buttons.addWidget(close)
        root.addLayout(buttons)

    # --- AI explanation (M14, D24) — runs only on this button press ----------------
    def _explain_with_ai(self) -> None:
        from ..core import ai

        if not ai.is_local():  # cloud provider → explicit consent before sending data
            confirm = QMessageBox.question(
                self, "Send to AI provider",
                f"This sends your diagnostic findings to {ai.provider_label()}.\n\nContinue?",
                QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No,
                QMessageBox.StandardButton.No,
            )
            if confirm != QMessageBox.StandardButton.Yes:
                return
        self._explain_btn.setEnabled(False)
        self._explain_btn.setText("Asking the AI…")
        threading.Thread(target=self._work_explain, daemon=True).start()

    def _work_explain(self) -> None:
        from ..core import ai, gamelogs, syslogs

        result = self._result
        summary = result.summary
        events = {kind for _ts, kind, _detail in summary.events}
        clean = "session-stop" in events
        gpu_lost = "gpu-lost" in events

        lines = [f"Game: {result.game or 'unknown'}"]
        if summary.start and summary.end:
            lines.append(f"Capture duration: ~{int(summary.end - summary.start)}s")
        outcome = "ended cleanly (no crash detected)" if clean else \
            "ended without a clean stop (possible crash/freeze)"
        if gpu_lost:
            outcome += "; a GPU-lost event was recorded"
        lines.append(f"Outcome: {outcome}")
        lines.append("")
        lines.append(ai.format_findings(result.findings, header="Findings:"))
        lines.append("\nCapture summary:\n" + render_summary(summary))

        since = (summary.start - 60) if summary.start else None
        logs = gamelogs.collect(since=since)  # scoped to this session
        if logs:
            lines.append("\nGame/Proton/Steam logs for this session:\n" + logs)
        sys_logs = syslogs.collect(since=since)  # kernel log + crashed-process records
        if sys_logs:
            lines.append("\nSystem logs for this session (kernel + crashed processes):\n" + sys_logs)
        text = "\n".join(lines)
        ok, reply = ai.explain(text)
        if result.dir:  # record exactly what was sent, the model, and the reply (M15)
            from ..core import diagstore
            diagstore.record_ai(
                result.dir, provider=ai.provider(), model=ai.model(),
                system=ai.SYSTEM_PROMPT, prompt=ai.build_prompt(text),
                response=reply if ok else f"[error] {reply}")
        self._explained.emit((ok, reply))

    def _on_explained(self, result) -> None:
        ok, text = result
        self._explain_btn.setEnabled(True)
        self._explain_btn.setText("Explain with AI")
        self._show_explanation(text if ok else f"AI explanation failed:\n\n{text}")

    # --- Report bundle (M15) ------------------------------------------------------
    def _make_report(self) -> None:
        from PySide6.QtCore import QUrl
        from PySide6.QtGui import QDesktopServices

        from ..core import diagstore

        self._report_btn.setEnabled(False)
        try:
            out = diagstore.make_report(self._result.dir)
        except OSError as exc:
            self._report_btn.setEnabled(True)
            QMessageBox.warning(self, "Report failed", str(exc))
            return
        self._report_btn.setEnabled(True)
        box = QMessageBox(self)
        box.setWindowTitle("Report created")
        box.setText(f"Saved report:\n{out}\n\nIt contains this diagnostic's logs and any AI "
                    "interaction (data sent, model, and reply).")
        open_btn = box.addButton("Open folder", QMessageBox.ButtonRole.ActionRole)
        box.addButton("OK", QMessageBox.ButtonRole.AcceptRole)
        box.exec()
        if box.clickedButton() is open_btn:
            QDesktopServices.openUrl(QUrl.fromLocalFile(str(out.parent)))

    def _show_explanation(self, text: str) -> None:
        from ..core import ai

        dlg = QDialog(self)
        dlg.setWindowTitle(f"AI explanation — {ai.provider_label()}")
        dlg.resize(620, 520)
        lay = QVBoxLayout(dlg)
        view = QTextEdit()
        view.setObjectName("Report")
        view.setReadOnly(True)
        view.setMarkdown(text)  # the model replies in Markdown — render it
        lay.addWidget(view)
        note = QLabel("AI-generated suggestions — verify before acting, especially anything that changes settings or data.")
        note.setObjectName("Muted")
        note.setWordWrap(True)
        lay.addWidget(note)
        close = QPushButton("Close")
        close.setObjectName("PrimaryButton")
        close.clicked.connect(dlg.accept)
        lay.addWidget(close, alignment=Qt.AlignmentFlag.AlignRight)
        dlg.exec()

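Review note: `_work_explain` derives the "Outcome" line and the log window from the capture's events before anything is sent to the model. That logic can be sketched as pure functions, which makes it easy to test without Qt. The names `describe_outcome` and `log_window_start` are hypothetical and do not appear in the diff; the event kinds `session-stop` and `gpu-lost` and the 60-second lead-in are taken from the hunk above.

```python
from typing import Iterable, Optional, Tuple

# (timestamp, kind, detail), matching how the diff unpacks summary.events.
Event = Tuple[float, str, str]


def describe_outcome(events: Iterable[Event]) -> str:
    """Summarise how a session ended from the kinds of events it recorded."""
    kinds = {kind for _ts, kind, _detail in events}
    if "session-stop" in kinds:
        outcome = "ended cleanly (no crash detected)"
    else:
        outcome = "ended without a clean stop (possible crash/freeze)"
    if "gpu-lost" in kinds:
        outcome += "; a GPU-lost event was recorded"
    return outcome


def log_window_start(start: Optional[float], lead_in: float = 60.0) -> Optional[float]:
    """Start the log window slightly before the capture; None means 'unscoped'."""
    return (start - lead_in) if start else None
```

Keeping this separate from the dialog would let the same text be built by a non-GUI caller, though whether that is worth the extra indirection is a design call for the author.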
@@ -8,6 +8,7 @@ from PySide6.QtCore import Qt, QUrl, Signal
from PySide6.QtGui import QDesktopServices
from PySide6.QtWidgets import (
    QApplication,
    QButtonGroup,
    QCheckBox,
    QComboBox,
    QDoubleSpinBox,
@@ -18,6 +19,7 @@ from PySide6.QtWidgets import (
    QLineEdit,
    QMessageBox,
    QPushButton,
    QRadioButton,
    QSizePolicy,
    QTextEdit,
    QVBoxLayout,
@@ -25,7 +27,7 @@ from PySide6.QtWidgets import (
)

from .. import config
-from ..core import alerts, installer, service, sysenv, uninstall, updates
+from ..core import ai, alerts, installer, service, sysenv, uninstall, updates
from .theme import GOOD, MUTED, WARN


@@ -54,6 +56,7 @@ class SetupPage(QWidget):
    _installed = Signal(int, str)
    _upd_state = Signal(object)
    _mode_applied = Signal(object)  # (mode, ok, message) from a trigger-mode change
    _ai_tested = Signal(object)  # (ok, message) from an AI connectivity test
    changed = Signal()  # alert settings saved — main window re-applies them live

    def __init__(self) -> None:
@@ -62,6 +65,7 @@ class SetupPage(QWidget):
        self._installed.connect(self._on_installed)
        self._upd_state.connect(self._on_upd_state)
        self._mode_applied.connect(self._on_mode_applied)
        self._ai_tested.connect(self._on_ai_tested)

        root = QVBoxLayout(self)
        root.setContentsMargins(20, 18, 20, 18)
@@ -158,6 +162,76 @@ class SetupPage(QWidget):
            self._trigger_status.setText("systemd --user isn't available on this system.")
        root.addWidget(trig_card)

        # AI assistant (M14, D24): explain diagnostics. Strictly opt-in — the model is only
        # contacted when the user presses "Explain with AI"; this panel just configures it.
        ai_card, ai_layout = _panel("AI assistant")
        ai_desc = QLabel(
            "Optionally let an AI explain your diagnostics in plain language. It runs <b>only</b> "
            "when you press “Explain with AI” — never automatically. Choose a provider:\n"
            "• Ollama — a local model on your machine (private, no key; needs Ollama running).\n"
            "• Claude — Anthropic's API (higher quality; sends findings to Anthropic; needs a key)."
        )
        ai_desc.setObjectName("Muted")
        ai_desc.setWordWrap(True)
        ai_layout.addWidget(ai_desc)

        prov_row = QHBoxLayout()
        self._ai_group = QButtonGroup(self)
        self._ai_ollama = QRadioButton("Ollama (local)")
        self._ai_claude = QRadioButton("Claude (Anthropic)")
        self._ai_group.addButton(self._ai_ollama)
        self._ai_group.addButton(self._ai_claude)
        self._ai_ollama.toggled.connect(self._on_ai_provider_changed)
        prov_row.addWidget(self._ai_ollama)
        prov_row.addWidget(self._ai_claude)
        prov_row.addStretch(1)
        ai_layout.addLayout(prov_row)

        self._ai_model = QLineEdit()
        self._ai_model.setPlaceholderText(
            f"Model (e.g. {ai.OLLAMA_SUGGESTED_MODEL} for Ollama; blank = Claude default)")
        ai_layout.addWidget(self._ai_model)
        self._ai_endpoint = QLineEdit()
        self._ai_endpoint.setPlaceholderText("Ollama server URL (default http://localhost:11434)")
        ai_layout.addWidget(self._ai_endpoint)
        self._ai_key = QLineEdit()
        self._ai_key.setEchoMode(QLineEdit.EchoMode.Password)
        self._ai_key.setPlaceholderText("Claude API key (stored in your keyring)")
        ai_layout.addWidget(self._ai_key)

        ai_btn_row = QHBoxLayout()
        ai_save = QPushButton("Save")
        ai_save.setObjectName("PrimaryButton")
        ai_save.clicked.connect(self._save_ai)
        self._ai_test_btn = QPushButton("Test")
        self._ai_test_btn.clicked.connect(self._test_ai)
        ai_btn_row.addWidget(ai_save)
        ai_btn_row.addWidget(self._ai_test_btn)
        ai_btn_row.addStretch(1)
        ai_layout.addLayout(ai_btn_row)
        self._ai_status = QLabel("")
        self._ai_status.setObjectName("Muted")
        self._ai_status.setWordWrap(True)
        ai_layout.addWidget(self._ai_status)
        root.addWidget(ai_card)

        # Logging (M15): opt-in app logging + per-diagnostic storage (enables the Report bundle).
        log_card, log_layout = _panel("Logging")
        log_desc = QLabel(
            "Save application logs and store each diagnostic in its own folder so you can review "
            "or <b>Report</b> it. Off by default; everything stays on your machine.\n"
            f"• Diagnostics: {config.DIAGNOSTICS_DIR}\n"
            f"• Reports: {config.REPORTS_DIR}"
        )
        log_desc.setObjectName("Muted")
        log_desc.setWordWrap(True)
        log_layout.addWidget(log_desc)
        self._logging = QCheckBox("Enable logging (application + diagnostics)")
        self._logging.setChecked(config.load_config().get("logging_enabled", False))
        self._logging.toggled.connect(self._toggle_logging)
        log_layout.addWidget(self._logging)
        root.addWidget(log_card)

        # Account access (M13/M12): one Gitea token gates updates and session sharing.
        upd_card, upd_layout = _panel("Account access")
        hint = QLabel("A Gitea access token unlocks updates and session sharing. "
@@ -203,8 +277,72 @@ class SetupPage(QWidget):
        self._refresh()
        self._load_alerts()
        self._trigger.setCurrentText(config.load_config().get("trigger_mode", "manual"))
        self._load_ai()
        self._refresh_update_status()

    # --- AI assistant (M14) ---------------------------------------------------
    def _load_ai(self) -> None:
        cfg = config.load_config()
        prov = cfg.get("ai_provider", "")
        self._ai_claude.setChecked(prov == "claude")
        self._ai_ollama.setChecked(prov == "ollama")
        self._ai_model.setText(cfg.get("ai_model", ""))
        self._ai_endpoint.setText(cfg.get("ai_endpoint", "http://localhost:11434"))
        if config.load_ai_key():
            self._ai_key.setPlaceholderText("Claude API key saved — type to replace")
        self._on_ai_provider_changed()

    def _ai_provider(self) -> str:
        if self._ai_claude.isChecked():
            return "claude"
        if self._ai_ollama.isChecked():
            return "ollama"
        return ""

    def _on_ai_provider_changed(self) -> None:
        prov = self._ai_provider()
        self._ai_endpoint.setVisible(prov == "ollama")
        self._ai_key.setVisible(prov == "claude")
        self._ai_test_btn.setEnabled(prov != "")
        if prov == "ollama" and not self._ai_model.text().strip():
            self._ai_model.setText(ai.OLLAMA_SUGGESTED_MODEL)  # suggested default; user can change

    def _save_ai(self) -> None:
        prov = self._ai_provider()
        config.update_config(
            ai_provider=prov,
            ai_model=self._ai_model.text().strip(),
            ai_endpoint=self._ai_endpoint.text().strip() or "http://localhost:11434",
        )
        if prov == "claude" and self._ai_key.text().strip():
            config.save_ai_key(self._ai_key.text().strip())
            self._ai_key.clear()
            self._ai_key.setPlaceholderText("Claude API key saved — type to replace")
        self._ai_status.setText("Saved." if prov else "Saved — no provider selected (AI stays off).")

    def _test_ai(self) -> None:
        self._save_ai()
        self._ai_status.setText("Testing… contacting the provider.")
        self._ai_test_btn.setEnabled(False)
        threading.Thread(target=self._work_test_ai, daemon=True).start()

    def _work_test_ai(self) -> None:
        from ..core import ai

        ok, msg = ai.explain("Connectivity test — reply exactly: RigDoctor AI is working.")
        self._ai_tested.emit((ok, msg))

    def _on_ai_tested(self, result) -> None:
        ok, msg = result
        self._ai_test_btn.setEnabled(True)
        self._ai_status.setText(("✓ " if ok else "✗ ") + (msg[:200] if msg else ""))

    def _toggle_logging(self, on: bool) -> None:
        from ..core import applog

        config.update_config(logging_enabled=on)
        applog.setup(force=True)  # attach/detach the file handler immediately

    def _run_wizard(self) -> None:
        from .setup_wizard import SetupWizard

@@ -117,7 +117,7 @@ class SetupWizard(QDialog):
            tag = " — all installed ✓" if not missing else f" — {len(missing)} to install"
            cb = QCheckBox(f"{bundle}: {names}{tag}")
            cb.setChecked(bool(missing))  # default-check bundles with something to add
-           cb.setEnabled(bool(missing) and sysenv.package_manager() == "apt")
+           cb.setEnabled(sysenv.package_manager() == "apt")  # selectable even if already installed
            self._bundle_checks[bundle] = cb
            v.addWidget(cb)
        if sysenv.package_manager() != "apt":

@@ -144,6 +144,24 @@ QCheckBox::indicator:hover {{ border-color: {ACCENT}; }}
QCheckBox::indicator:checked {{
    background: {ACCENT}; border-color: {ACCENT}; image: url("{_CHECK}");
}}
QCheckBox::indicator:disabled {{ border-color: #3a414d; background: #1c2026; }}
QCheckBox::indicator:checked:disabled {{ background: #2a6175; border-color: #2a6175; }}
QCheckBox:disabled {{ color: {MUTED}; }}

/* Radio buttons — same dark treatment as checkboxes; the selected one gets a clear
   accent dot (Fusion leaves these unstyled = the selection is invisible on dark). */
QRadioButton {{ spacing: 8px; background: transparent; }}
QRadioButton::indicator {{
    width: 17px; height: 17px; border-radius: 9px;
    border: 1px solid {MUTED}; background: #262b34;
}}
QRadioButton::indicator:hover {{ border-color: {ACCENT}; }}
QRadioButton::indicator:checked {{
    border: 1px solid {ACCENT};
    background: qradialgradient(cx:0.5, cy:0.5, radius:0.5, fx:0.5, fy:0.5,
        stop:0 {ACCENT}, stop:0.5 {ACCENT}, stop:0.55 #262b34, stop:1 #262b34);
}}
QRadioButton:disabled {{ color: {MUTED}; }}

/* Dialogs (update prompt, changelog) — match the dark theme so text is readable. */
QDialog {{ background: {BG}; }}

@@ -0,0 +1,118 @@
"""Tests for the M14 AI assistant: provider selection, grounding, parsing (no network)."""

import unittest
from unittest import mock

from rigdoctor.core import ai, ai_knowledge


class KnowledgeTests(unittest.TestCase):
    def test_matches_xid_and_smart(self):
        facts = ai_knowledge.relevant("Kernel: NVRM: Xid 79: GPU has fallen off the bus")
        self.assertTrue(any("fallen off the bus" in f for f in facts))

    def test_matches_smart_pending(self):
        facts = ai_knowledge.relevant("SMART 197 Current_Pending_Sector = 8")
        self.assertTrue(any("Pending Sector" in f for f in facts))

    def test_no_match_returns_empty(self):
        self.assertEqual(ai_knowledge.relevant("everything is fine"), [])


class ConfigStateTests(unittest.TestCase):
    def _cfg(self, **over):
        base = {"ai_provider": "", "ai_model": "", "ai_endpoint": "http://localhost:11434"}
        base.update(over)
        return base

    def test_unconfigured_by_default(self):
        with mock.patch.object(ai.config, "load_config", return_value=self._cfg()):
            self.assertFalse(ai.is_configured())

    def test_ollama_needs_model(self):
        with mock.patch.object(ai.config, "load_config", return_value=self._cfg(ai_provider="ollama")):
            self.assertFalse(ai.is_configured())
        with mock.patch.object(ai.config, "load_config",
                               return_value=self._cfg(ai_provider="ollama", ai_model="llama3.1")):
            self.assertTrue(ai.is_configured())

    def test_claude_needs_key(self):
        with mock.patch.object(ai.config, "load_config", return_value=self._cfg(ai_provider="claude")), \
                mock.patch.object(ai.config, "load_ai_key", return_value=None):
            self.assertFalse(ai.is_configured())
        with mock.patch.object(ai.config, "load_config", return_value=self._cfg(ai_provider="claude")), \
                mock.patch.object(ai.config, "load_ai_key", return_value="sk-ant-x"):
            self.assertTrue(ai.is_configured())

    def test_claude_default_model(self):
        with mock.patch.object(ai.config, "load_config", return_value=self._cfg(ai_provider="claude")):
            self.assertEqual(ai.model(), ai.CLAUDE_DEFAULT_MODEL)


class PromptTests(unittest.TestCase):
    def test_build_prompt_includes_facts_and_findings(self):
        prompt = ai.build_prompt("Xid 79: GPU has fallen off the bus")
        self.assertIn("Reference facts", prompt)
        self.assertIn("Collected findings", prompt)
        self.assertIn("fallen off the bus", prompt)

    def test_format_findings(self):
        class F:
            severity, category, title, detail = "warn", "GPU", "Hot", "92C"
        text = ai.format_findings([F()])
        self.assertIn("[WARN] GPU: Hot — 92C", text)

    def test_appid_glossary_resolves_known_ids(self):
        from rigdoctor.core import steam
        with mock.patch.object(steam, "appid_names", return_value={"2694490": "Path of Exile 2"}):
            glossary = ai.appid_glossary("Steam log: removed AppID 2694490 ... pid 130544")
        self.assertIn("2694490 = Path of Exile 2", glossary)

    def test_appid_glossary_ignores_unknown_ids(self):
        from rigdoctor.core import steam
        with mock.patch.object(steam, "appid_names", return_value={"570": "Dota 2"}):
            self.assertEqual(ai.appid_glossary("pid 130544 used 8192 MiB"), "")  # not in library

    def test_build_prompt_includes_glossary(self):
        from rigdoctor.core import steam
        with mock.patch.object(steam, "appid_names", return_value={"2694490": "Path of Exile 2"}):
            prompt = ai.build_prompt("AppID 2694490 launched")
        self.assertIn("Path of Exile 2", prompt)


class ExplainTests(unittest.TestCase):
    def _cfg(self, **over):
        base = {"ai_provider": "", "ai_model": "", "ai_endpoint": "http://localhost:11434"}
        base.update(over)
        return base

    def test_no_provider(self):
        with mock.patch.object(ai.config, "load_config", return_value=self._cfg()):
            ok, msg = ai.explain("x")
        self.assertFalse(ok)
        self.assertIn("No AI provider", msg)

    def test_ollama_parses_response(self):
        with mock.patch.object(ai.config, "load_config",
                               return_value=self._cfg(ai_provider="ollama", ai_model="llama3.1")), \
                mock.patch.object(ai, "_post", return_value={"response": "It's the PSU."}) as post:
            ok, msg = ai.explain("Xid 79")
        self.assertTrue(ok)
        self.assertEqual(msg, "It's the PSU.")
        self.assertIn("/api/generate", post.call_args[0][0])

    def test_claude_parses_content_blocks(self):
        with mock.patch.object(ai.config, "load_config", return_value=self._cfg(ai_provider="claude")), \
                mock.patch.object(ai.config, "load_ai_key", return_value="sk-ant-x"), \
                mock.patch.object(ai, "_post", return_value={"content": [
                    {"type": "text", "text": "Likely a failing disk."}]}) as post:
            ok, msg = ai.explain("SMART 197")
        self.assertTrue(ok)
        self.assertEqual(msg, "Likely a failing disk.")
        headers = post.call_args[0][2]
        self.assertEqual(headers["anthropic-version"], ai.ANTHROPIC_VERSION)
        self.assertEqual(headers["x-api-key"], "sk-ant-x")


if __name__ == "__main__":
    unittest.main()
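Review note: `ConfigStateTests` pins down when the assistant counts as configured: no provider means off, Ollama needs a model name, and Claude needs a stored key (its model can fall back to a default). The diff does not show `ai.is_configured()` itself, so the following is only a sketch of a function that would satisfy those tests. It takes the config and key as arguments for clarity, whereas the real function evidently reads them through `config.load_config()` and `config.load_ai_key()`.

```python
from typing import Mapping, Optional


def is_configured(cfg: Mapping[str, str], api_key: Optional[str]) -> bool:
    """Hypothetical stand-in for the rule the tests describe, not the project's code."""
    provider = cfg.get("ai_provider", "")
    if provider == "ollama":
        # A local model has no default, so a model name is required.
        return bool(cfg.get("ai_model", "").strip())
    if provider == "claude":
        # The model may be left blank (a default applies), but a key is required.
        return bool(api_key)
    return False  # no provider selected: the assistant stays off
```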
@@ -0,0 +1,104 @@
"""Tests for M15 per-diagnostic storage + Report bundles + app logging."""

import json
import tempfile
import unittest
import zipfile
from dataclasses import dataclass, field
from pathlib import Path
from unittest import mock

from rigdoctor.core import applog, diagstore


@dataclass
class FakeSummary:
    start: float = 1.0
    end: float = 2.0
    samples: int = 3
    events: list = field(default_factory=list)


@dataclass
class FakeFinding:
    severity: str = "ok"
    category: str = "GPU"
    title: str = "Looks fine"
    detail: str = "no issues"


@dataclass
class FakeResult:
    game: str = "Path of Exile 2"
    summary: FakeSummary = field(default_factory=FakeSummary)
    findings: list = field(default_factory=lambda: [FakeFinding()])
    dir: str | None = None


class StoreTests(unittest.TestCase):
    def setUp(self):
        self.tmp = Path(tempfile.mkdtemp())

    def test_disabled_returns_none(self):
        with mock.patch.object(diagstore, "enabled", return_value=False):
            self.assertIsNone(diagstore.store(FakeResult()))

    def test_store_writes_artifacts(self):
        with mock.patch.object(diagstore, "enabled", return_value=True), \
                mock.patch("rigdoctor.render.render_summary", return_value="SUMMARY-TEXT"), \
                mock.patch("rigdoctor.core.gamelogs.collect", return_value="LOG-TEXT"), \
                mock.patch("rigdoctor.core.syslogs.collect", return_value="SYS-LOG"), \
                mock.patch("rigdoctor.core.inventory.collect", return_value=[]), \
                mock.patch.object(diagstore.config, "DIAGNOSTICS_DIR", self.tmp / "diagnostics"):
            directory = diagstore.store(FakeResult())
        self.assertTrue((directory / "result.json").exists())
        self.assertTrue((directory / "report.txt").exists())
        self.assertEqual((directory / "gamelogs.txt").read_text(), "LOG-TEXT")
        self.assertEqual((directory / "syslogs.txt").read_text(), "SYS-LOG")
        self.assertTrue((directory / "inventory.txt").exists())  # inventory included for debugging
        data = json.loads((directory / "result.json").read_text())
        self.assertEqual(data["game"], "Path of Exile 2")
        self.assertEqual(len(data["findings"]), 1)

    def test_record_ai_then_report_includes_ai_and_applog(self):
        diag = self.tmp / "20260522-poe2"
        diag.mkdir()
        diagstore.record_ai(diag, provider="claude", model="claude-opus-4-7",
                            system="SYS", prompt="EXACT DATA SENT", response="THE REPLY")
        ai_files = list((diag / "ai").glob("explain-*.json"))
        self.assertTrue(ai_files)
        record = json.loads(ai_files[0].read_text())
        self.assertEqual(record["model"], "claude-opus-4-7")
        self.assertEqual(record["data_sent_to_model"], "EXACT DATA SENT")
        self.assertEqual(record["model_reply"], "THE REPLY")

        app_log = self.tmp / "app.log"
        app_log.write_text("app log line")
        with mock.patch.object(diagstore.config, "REPORTS_DIR", self.tmp / "reports"), \
                mock.patch.object(diagstore.config, "APP_LOG", app_log):
            out = diagstore.make_report(diag)
        self.assertTrue(out.exists())
        with zipfile.ZipFile(out) as zf:
            names = zf.namelist()
        self.assertTrue(any(n.endswith("app.log") for n in names))
        self.assertTrue(any("/ai/explain-" in n for n in names))


class AppLogTests(unittest.TestCase):
    def test_disabled_is_noop(self):
        with mock.patch.object(applog.config, "load_config", return_value={"logging_enabled": False}):
            self.assertFalse(applog.setup(force=True))

    def test_enabled_writes_file(self):
        tmp = Path(tempfile.mkdtemp())
        with mock.patch.object(applog.config, "load_config", return_value={"logging_enabled": True}), \
                mock.patch.object(applog.config, "STATE_DIR", tmp), \
                mock.patch.object(applog.config, "APP_LOG", tmp / "app.log"):
            self.assertTrue(applog.setup(force=True))
            applog.get_logger("test").info("hello world")
            applog.setup(force=True)  # cleanup path: re-run detaches/reattaches cleanly
            self.assertTrue((tmp / "app.log").exists())


if __name__ == "__main__":
    unittest.main()
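Review note: the Report test expects a zip whose entries sit under the diagnostic's folder name, including `ai/explain-*.json` and a copy of `app.log`. The diff does not include `diagstore.make_report`, so the sketch below only illustrates one way to produce an archive with that shape using the standard `zipfile` module. The function name `bundle_report`, its parameters, and the `<name>.zip` output path are assumptions.

```python
from __future__ import annotations

import zipfile
from pathlib import Path


def bundle_report(diag_dir: Path, reports_dir: Path, app_log: Path | None = None) -> Path:
    """Zip every file under diag_dir (plus an optional app log) into reports_dir."""
    reports_dir.mkdir(parents=True, exist_ok=True)
    out = reports_dir / f"{diag_dir.name}.zip"  # assumed naming scheme
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(diag_dir.rglob("*")):
            if path.is_file():
                # Store entries relative to the diagnostic folder, prefixed by its name.
                rel = path.relative_to(diag_dir).as_posix()
                zf.write(path, f"{diag_dir.name}/{rel}")
        if app_log is not None and app_log.is_file():
            zf.write(app_log, f"{diag_dir.name}/app.log")
    return out
```

Prefixing every entry with the folder name means the archive unpacks into a single directory instead of scattering files into the recipient's working directory.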
@@ -0,0 +1,77 @@
"""Tests for M14 game/Proton/Steam log collection."""

import os
import tempfile
import time
import unittest
from pathlib import Path
from unittest import mock

from rigdoctor.core import gamelogs


class TailTests(unittest.TestCase):
    def test_tail_returns_last_bytes(self):
        path = Path(tempfile.mkdtemp()) / "x.log"
        path.write_text("A" * 100 + "TAIL")
        out = gamelogs._tail(path, 4)
        self.assertEqual(out, "TAIL")

    def test_tail_short_file(self):
        path = Path(tempfile.mkdtemp()) / "x.log"
        path.write_text("short")
        self.assertEqual(gamelogs._tail(path, 9999), "short")

    def test_tail_missing(self):
        self.assertEqual(gamelogs._tail(Path("/nope/x.log"), 10), "")


class CollectTests(unittest.TestCase):
    def test_collect_includes_proton_and_steam(self):
        tmp = Path(tempfile.mkdtemp())
        proton = tmp / "steam-570.log"
        proton.write_text("err: vkd3d device lost")
        console = tmp / "console-linux.txt"
        console.write_text("Game removed AppID 570 ... exit")
        with mock.patch.object(gamelogs, "_proton_logs", return_value=[proton]), \
                mock.patch.object(gamelogs, "_steam_console", return_value=console):
            out = gamelogs.collect()
        self.assertIn("Proton log", out)
        self.assertIn("vkd3d", out)
        self.assertIn("Steam log", out)
        self.assertIn("exit", out)

    def test_collect_empty_when_none(self):
        with mock.patch.object(gamelogs, "_proton_logs", return_value=[]), \
                mock.patch.object(gamelogs, "_steam_console", return_value=None):
            self.assertEqual(gamelogs.collect(), "")


class SinceScopingTests(unittest.TestCase):
    def test_since_filter_keeps_window_only(self):
        text = (
            "[2026-05-22 13:00:00] old session line\n"
            "[2026-05-22 13:00:01] another old line\n"
            "[2026-05-22 14:30:00] new session launch\n"
            "[2026-05-22 14:30:05] new session error\n"
        )
        since = time.mktime(time.strptime("2026-05-22 14:00:00", "%Y-%m-%d %H:%M:%S"))
        out = gamelogs._since_filter(text, since)
        self.assertIn("new session launch", out)
        self.assertIn("new session error", out)
        self.assertNotIn("old session", out)

    def test_collect_skips_stale_proton_log(self):
        tmp = Path(tempfile.mkdtemp())
        proton = tmp / "steam-9999.log"
        proton.write_text("stale proton output from an earlier game")
        old_mtime = time.time() - 3600
        os.utime(proton, (old_mtime, old_mtime))
        since = time.time() - 60  # session started a minute ago
        with mock.patch.object(gamelogs, "_proton_logs", return_value=[proton]), \
                mock.patch.object(gamelogs, "_steam_console", return_value=None):
            self.assertEqual(gamelogs.collect(since=since), "")  # stale log excluded


if __name__ == "__main__":
    unittest.main()
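Review note: `SinceScopingTests` implies that `gamelogs._since_filter` drops lines stamped before the session window, using `[YYYY-MM-DD HH:MM:SS]` prefixes in local time. The implementation is not part of this diff, so the following is a sketch of one approach that would pass the test shown. The name `since_filter` is illustrative, and letting unstamped lines follow the most recent stamped line is an assumption.

```python
import re
import time

# Matches a leading "[2026-05-22 14:30:00]" style stamp, as used in the test data.
_STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")


def since_filter(text: str, since: float) -> str:
    """Keep only lines at or after `since` (epoch seconds, local time)."""
    kept = []
    keeping = False
    for line in text.splitlines():
        match = _STAMP.match(line)
        if match:
            stamp = time.mktime(time.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
            keeping = stamp >= since
        # Unstamped lines (for example continuation lines) inherit the last decision.
        if keeping:
            kept.append(line)
    return "\n".join(kept)
```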
@@ -0,0 +1,95 @@
"""Tests for M15 session-scoped system-log collection (kernel + coredumps)."""

import unittest
from unittest import mock

from rigdoctor.core import syslogs


class KernelLogTests(unittest.TestCase):
    def test_passes_since_and_tails(self):
        with mock.patch("shutil.which", return_value="/usr/bin/journalctl"), \
             mock.patch.object(syslogs, "_run", return_value="X" * 100 + "TAILLINE") as run:
            out = syslogs.kernel_log(since=1_000_000_000, max_bytes=8)
        self.assertEqual(out, "TAILLINE")
        cmd = run.call_args[0][0]
        self.assertIn("-k", cmd)
        self.assertIn("--since", cmd)

    def test_missing_tool_returns_empty(self):
        with mock.patch("shutil.which", return_value=None):
            self.assertEqual(syslogs.kernel_log(), "")


class CoredumpTests(unittest.TestCase):
    def test_empty_when_no_coredumps(self):
        with mock.patch("shutil.which", return_value="/usr/bin/coredumpctl"), \
             mock.patch.object(syslogs, "_run", return_value="No coredumps found."):
            self.assertEqual(syslogs.coredumps(), "")

    def test_returns_list(self):
        with mock.patch("shutil.which", return_value="/usr/bin/coredumpctl"), \
             mock.patch.object(syslogs, "_run", return_value="TIME PID SIG EXE\n... SEGV PathOfExile"):
            out = syslogs.coredumps()
        self.assertIn("PathOfExile", out)


class NvidiaTests(unittest.TestCase):
    def test_missing_tool(self):
        with mock.patch("shutil.which", return_value=None):
            self.assertEqual(syslogs.nvidia_snapshot(), "")

    def test_snapshot_head_truncated(self):
        with mock.patch("shutil.which", return_value="/usr/bin/nvidia-smi"), \
             mock.patch.object(syslogs, "_run", return_value="DRIVER\n" + "x" * 99999):
            out = syslogs.nvidia_snapshot(max_bytes=10)
        self.assertEqual(out, "DRIVER\nxxx")  # head, not tail


class DisplayTests(unittest.TestCase):
    def test_session_type_env(self):
        with mock.patch.dict("os.environ", {"XDG_SESSION_TYPE": "wayland"}):
            self.assertEqual(syslogs._session_type(), "wayland")

    def test_x11_tails_xorg_log(self):
        import tempfile
        from pathlib import Path
        log = Path(tempfile.mkdtemp()) / "Xorg.0.log"
        log.write_text("(EE) NVIDIA(GPU-0): something failed")
        with mock.patch.object(syslogs, "_session_type", return_value="x11"), \
             mock.patch.object(syslogs, "_xorg_log", return_value=log):
            out = syslogs.display_log()
        self.assertIn("(EE) NVIDIA", out)

    def test_wayland_uses_user_journal(self):
        with mock.patch.object(syslogs, "_session_type", return_value="wayland"), \
             mock.patch("shutil.which", return_value="/usr/bin/journalctl"), \
             mock.patch.object(syslogs, "_run", return_value="gnome-shell: GPU error") as run:
            out = syslogs.display_log(since=1_000_000_000)
        self.assertIn("GPU error", out)
        cmd = run.call_args[0][0]
        self.assertIn("--user", cmd)
        self.assertTrue(any(a.startswith("_COMM=") for a in cmd))


class CollectTests(unittest.TestCase):
    def test_collect_combines_sections(self):
        with mock.patch.object(syslogs, "kernel_log", return_value="NVRM: Xid 79"), \
             mock.patch.object(syslogs, "coredumps", return_value="game SIGSEGV"), \
             mock.patch.object(syslogs, "nvidia_snapshot", return_value="Driver Version 595"), \
             mock.patch.object(syslogs, "display_log", return_value="(EE) NVIDIA"):
            out = syslogs.collect()
        for needle in ("Kernel log", "Xid 79", "Crashed processes", "SIGSEGV",
                       "NVIDIA snapshot", "595", "Display server log"):
            self.assertIn(needle, out)

    def test_collect_empty_when_nothing(self):
        with mock.patch.object(syslogs, "kernel_log", return_value=""), \
             mock.patch.object(syslogs, "coredumps", return_value=""), \
             mock.patch.object(syslogs, "nvidia_snapshot", return_value=""), \
             mock.patch.object(syslogs, "display_log", return_value=""):
            self.assertEqual(syslogs.collect(), "")


if __name__ == "__main__":
    unittest.main()