fix(ai): analyse the actual session, not stale/benign logs — 0.28.1

The user ran a game ~20s with no crash but the AI dredged up old log lines,
guessed the wrong game, and gave Windows advice. Fixes:
- Prompt now includes the real game name + capture duration + outcome (clean vs
  crash), so the model uses the known game instead of guessing from log paths.
- gamelogs.collect(since=…): scope Steam-console lines by timestamp and skip a
  stale per-app Proton log (mtime before the session) — no unrelated past run.
- ai_knowledge: flag benign Steam/Proton lines (libnvidia-ml.so.1 assertion,
  routine minidumps, "fork without exec") as non-causal.
- System prompt: Linux-only steps (no "run as administrator"); don't manufacture
  a problem on a clean run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-22 13:38:19 +02:00
parent b59f202891
commit c7e50ba4cb
8 changed files with 146 additions and 23 deletions
+12
View File
@@ -5,6 +5,18 @@ All notable changes to RigDoctor are recorded here. Format follows
(`MAJOR.MINOR.PATCH`, pre-1.0). `__version__` and `pyproject.toml` must match the git (`MAJOR.MINOR.PATCH`, pre-1.0). `__version__` and `pyproject.toml` must match the git
release tag (so the auto-updater, D18, can compare versions). release tag (so the auto-updater, D18, can compare versions).
## [0.28.1] - 2026-05-22
### Fixed
- **AI explanations were misreading stale/benign logs.** Three fixes so the model analyses the
*actual* session: (1) the prompt now states the **real game name, capture duration, and
outcome** (clean vs. crash) so the model stops guessing the game from log paths; (2) game logs
are **scoped to the session window** (Steam-console lines filtered by timestamp; a stale
per-app Proton log from an earlier game is skipped); (3) the reference KB flags common
**benign** Steam/Proton lines (`libnvidia-ml.so.1` assertion, routine minidump uploads, "fork
without exec") so they aren't reported as the cause. The system prompt also forbids
Windows-only advice (no "run as administrator") and tells the model not to invent a problem
when the run was clean.
## [0.28.0] - 2026-05-22 ## [0.28.0] - 2026-05-22
### Added ### Added
- **AI explanations now include recent game logs.** When you press "Explain with AI" on a - **AI explanations now include recent game logs.** When you press "Explain with AI" on a
+1 -1
View File
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project] [project]
name = "rigdoctor" name = "rigdoctor"
version = "0.28.0" version = "0.28.1"
description = "Modular hardware monitoring & crash diagnostics for Linux gamers." description = "Modular hardware monitoring & crash diagnostics for Linux gamers."
readme = "README.md" readme = "README.md"
requires-python = ">=3.11" requires-python = ">=3.11"
+1 -1
View File
@@ -1,3 +1,3 @@
"""RigDoctor — modular hardware monitoring & crash diagnostics for Linux gamers.""" """RigDoctor — modular hardware monitoring & crash diagnostics for Linux gamers."""
__version__ = "0.28.0" __version__ = "0.28.1"
+14 -9
View File
@@ -33,15 +33,20 @@ CLAUDE_MAX_TOKENS = 2000
ANTHROPIC_VERSION = "2023-06-01" ANTHROPIC_VERSION = "2023-06-01"
SYSTEM_PROMPT = ( SYSTEM_PROMPT = (
"You are RigDoctor's hardware-diagnostics assistant for Linux gamers. You are given the " "You are RigDoctor's hardware-diagnostics assistant for Linux gamers (Ubuntu + NVIDIA, games "
"structured findings RigDoctor collected from this machine — which may include recent game, " "via Steam/Proton). You are given session context, the structured findings RigDoctor "
"Proton, and system log excerpts — plus a set of reference facts. Explain in plain language " "collected — which may include recent game/Proton/system log excerpts scoped to this session "
"what they mean, correlate any log errors with the findings to pinpoint WHEN and WHY things " "— plus reference facts. Use the GAME NAME from the session context; never guess the game "
"went wrong, identify the most likely root cause, and give concrete, ordered next steps " "from log paths or app IDs. Correlate log errors with the findings to pinpoint WHEN and WHY "
"(exact commands where useful). Base your reasoning ONLY on the data and reference facts " "things went wrong, identify the most likely root cause, and give concrete, ordered next "
"provided — do not invent readings, hardware, or log lines. Be concise and practical. " "steps with exact Linux commands where useful.\n"
"Present fixes as suggestions, and clearly warn before any step that could cause data loss " "Rules: Base your reasoning ONLY on the data and reference facts provided — never invent "
"or instability. Format your answer in Markdown." "readings, hardware, or log lines. This is LINUX: never suggest Windows-only steps (e.g. "
"'run as administrator', registry edits, toggling antivirus). Treat log lines flagged BENIGN "
"in the reference facts as non-causal. If no crash was recorded and there are no warning or "
"critical findings, say plainly that the session looks healthy and do NOT manufacture a "
"problem. Be concise. Present fixes as suggestions and warn before anything that risks data "
"loss or instability. Format your answer in Markdown."
) )
+12
View File
@@ -64,6 +64,18 @@ ENTRIES: list[tuple[tuple[str, ...], str]] = [
(("nvidia persistence", "persistence mode"), (("nvidia persistence", "persistence mode"),
"NVIDIA persistence mode keeps the driver loaded when no app is using the GPU, avoiding " "NVIDIA persistence mode keeps the driver loaded when no app is using the GPU, avoiding "
"re-init stalls — harmless to enable."), "re-init stalls — harmless to enable."),
(("libnvidia-ml.so", "interface.h", "failed to load \"libnvidia-ml"),
"BENIGN: a Steam log assertion 'Failed to load libnvidia-ml.so.1' (from interface.h) is "
"logged on many normal launches — the Steam runtime sandbox can't see the host NVML library. "
"It is NOT by itself a crash cause. Only investigate the driver if the GPU is genuinely "
"undetected (nvidia-smi fails)."),
(("minidump", ".dmp", "uploading minidump"),
"BENIGN-by-default: a minidump upload line means a crash handler ran AND that the game/engine "
"routinely uploads dumps; it is not proof that THIS session crashed unless a hard freeze or "
"non-zero exit was also recorded. Don't treat a routine minidump line as the root cause."),
(("fork without exec", "skipping destruction"),
"BENIGN: 'pid X != Y, skipping destruction (fork without exec?)' is routine Steam/Proton "
"process bookkeeping, not an error."),
] ]
+56 -7
View File
@@ -10,11 +10,41 @@ vkd3d/DXVK error, a crash line, the exit code) rather than only the sensor summa
from __future__ import annotations from __future__ import annotations
import os import os
import re
import time
from pathlib import Path from pathlib import Path
# Steam keeps logs under its install root; ~/.steam/steam usually symlinks to the real one. # Steam keeps logs under its install root; ~/.steam/steam usually symlinks to the real one.
_STEAM_LOG_DIRS = ("~/.steam/steam/logs", "~/.local/share/Steam/logs", "~/.steam/root/logs") _STEAM_LOG_DIRS = ("~/.steam/steam/logs", "~/.local/share/Steam/logs", "~/.steam/root/logs")
_STEAM_LOG_FILES = ("console-linux.txt", "console_log.txt", "stderr.txt") _STEAM_LOG_FILES = ("console-linux.txt", "console_log.txt", "stderr.txt")
_TS = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")
def _line_epoch(line: str) -> float | None:
m = _TS.match(line)
if not m:
return None
try:
return time.mktime(time.strptime(m.group(1), "%Y-%m-%d %H:%M:%S"))
except ValueError:
return None
def _since_filter(text: str, since: float) -> str:
"""Keep lines from the first timestamp >= `since` onward (logs are chronological).
Untimestamped lines before the window are dropped; once inside the window every line is
kept (so multi-line entries survive). This scopes a long-lived Steam log to one session.
"""
out: list[str] = []
including = False
for line in text.splitlines():
epoch = _line_epoch(line)
if epoch is not None and epoch >= since:
including = True
if including:
out.append(line)
return "\n".join(out)
def _tail(path: Path, max_bytes: int) -> str: def _tail(path: Path, max_bytes: int) -> str:
@@ -51,17 +81,36 @@ def available() -> bool:
return bool(_proton_logs() or _steam_console()) return bool(_proton_logs() or _steam_console())
def collect(max_bytes: int = 6000) -> str: def collect(since: float | None = None, max_bytes: int = 8000) -> str:
"""Recent Proton + Steam log tails as one labelled text block ('' if none).""" """Recent Proton + Steam log tails as one labelled text block ('' if none).
With ``since`` (epoch), scope to that session: skip a Proton log not written during/after
the session (a stale per-app log from an earlier game), and keep only Steam-console lines
timestamped at/after ``since`` — so we don't feed the model an unrelated past session.
"""
sections: list[str] = [] sections: list[str] = []
protons = _proton_logs() protons = _proton_logs()
if protons: if protons:
tail = _tail(protons[0], max_bytes).strip() log = protons[0]
fresh = since is None or _mtime(log) >= since
tail = _tail(log, max_bytes).strip() if fresh else ""
if tail: if tail:
sections.append(f"--- Proton log ({protons[0].name}) ---\n{tail}") sections.append(f"--- Proton log ({log.name}) ---\n{tail}")
console = _steam_console() console = _steam_console()
if console: if console:
tail = _tail(console, max_bytes).strip() raw = _tail(console, 40000 if since else max_bytes)
if tail: if since is not None:
sections.append(f"--- Steam log ({console.name}) ---\n{tail}") raw = _since_filter(raw, since)
raw = raw.strip()[-max_bytes:].strip()
if raw:
sections.append(f"--- Steam log ({console.name}) ---\n{raw}")
return "\n\n".join(sections) return "\n\n".join(sections)
def _mtime(path: Path) -> float:
try:
return path.stat().st_mtime
except OSError:
return 0.0
+22 -5
View File
@@ -113,12 +113,29 @@ class DiagnosticDialog(QDialog):
def _work_explain(self) -> None: def _work_explain(self) -> None:
from ..core import ai, gamelogs from ..core import ai, gamelogs
text = ai.format_findings(self._result.findings, header="Diagnostic findings:") result = self._result
text += "\n\nCapture summary:\n" + render_summary(self._result.summary) summary = result.summary
logs = gamelogs.collect() events = {kind for _ts, kind, _detail in summary.events}
clean = "session-stop" in events
gpu_lost = "gpu-lost" in events
lines = [f"Game: {result.game or 'unknown'}"]
if summary.start and summary.end:
lines.append(f"Capture duration: ~{int(summary.end - summary.start)}s")
outcome = "ended cleanly (no crash detected)" if clean else \
"ended without a clean stop (possible crash/freeze)"
if gpu_lost:
outcome += "; a GPU-lost event was recorded"
lines.append(f"Outcome: {outcome}")
lines.append("")
lines.append(ai.format_findings(result.findings, header="Findings:"))
lines.append("\nCapture summary:\n" + render_summary(summary))
since = (summary.start - 60) if summary.start else None
logs = gamelogs.collect(since=since) # scoped to this session
if logs: if logs:
text += "\n\nRecent game/Proton/Steam logs (newest at the end):\n" + logs lines.append("\nGame/Proton/Steam logs for this session:\n" + logs)
self._explained.emit(ai.explain(text)) self._explained.emit(ai.explain("\n".join(lines)))
def _on_explained(self, result) -> None: def _on_explained(self, result) -> None:
ok, text = result ok, text = result
+28
View File
@@ -1,6 +1,8 @@
"""Tests for M14 game/Proton/Steam log collection.""" """Tests for M14 game/Proton/Steam log collection."""
import os
import tempfile import tempfile
import time
import unittest import unittest
from pathlib import Path from pathlib import Path
from unittest import mock from unittest import mock
@@ -45,5 +47,31 @@ class CollectTests(unittest.TestCase):
self.assertEqual(gamelogs.collect(), "") self.assertEqual(gamelogs.collect(), "")
class SinceScopingTests(unittest.TestCase):
def test_since_filter_keeps_window_only(self):
text = (
"[2026-05-22 13:00:00] old session line\n"
"[2026-05-22 13:00:01] another old line\n"
"[2026-05-22 14:30:00] new session launch\n"
"[2026-05-22 14:30:05] new session error\n"
)
since = time.mktime(time.strptime("2026-05-22 14:00:00", "%Y-%m-%d %H:%M:%S"))
out = gamelogs._since_filter(text, since)
self.assertIn("new session launch", out)
self.assertIn("new session error", out)
self.assertNotIn("old session", out)
def test_collect_skips_stale_proton_log(self):
tmp = Path(tempfile.mkdtemp())
proton = tmp / "steam-9999.log"
proton.write_text("stale proton output from an earlier game")
old_mtime = time.time() - 3600
os.utime(proton, (old_mtime, old_mtime))
since = time.time() - 60 # session started a minute ago
with mock.patch.object(gamelogs, "_proton_logs", return_value=[proton]), \
mock.patch.object(gamelogs, "_steam_console", return_value=None):
self.assertEqual(gamelogs.collect(since=since), "") # stale log excluded
if __name__ == "__main__": if __name__ == "__main__":
unittest.main() unittest.main()