fix(ai): analyse the actual session, not stale/benign logs — 0.28.1

The user ran a game for ~20s with no crash, but the AI dredged up old log lines, guessed the
wrong game, and gave Windows advice.

Fixes:
- Prompt now includes the real game name + capture duration + outcome (clean vs crash), so
  the model uses the known game instead of guessing from log paths.
- gamelogs.collect(since=…): scope Steam-console lines by timestamp and skip a stale per-app
  Proton log (mtime before the session), so no unrelated past run is fed to the model.
- ai_knowledge: flag benign Steam/Proton lines (libnvidia-ml.so.1 assertion, routine
  minidumps, "fork without exec") as non-causal.
- System prompt: Linux-only steps (no "run as administrator"); don't manufacture a problem
  on a clean run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 13:38:19 +02:00
parent b59f202891
commit c7e50ba4cb
8 changed files with 146 additions and 23 deletions
@@ -5,6 +5,18 @@ All notable changes to RigDoctor are recorded here. Format follows
 (`MAJOR.MINOR.PATCH`, pre-1.0). `__version__` and `pyproject.toml` must match the git
 release tag (so the auto-updater, D18, can compare versions).
 ## [0.28.1] - 2026-05-22
 ### Fixed
 - **AI explanations were misreading stale/benign logs.** Three fixes so the model analyses the
  *actual* session: (1) the prompt now states the **real game name, capture duration, and
  outcome** (clean vs. crash) so the model stops guessing the game from log paths; (2) game logs
  are **scoped to the session window** (Steam-console lines filtered by timestamp; a stale
  per-app Proton log from an earlier game is skipped); (3) the reference KB flags common
  **benign** Steam/Proton lines (`libnvidia-ml.so.1` assertion, routine minidump uploads, "fork
  without exec") so they aren't reported as the cause. The system prompt also forbids
  Windows-only advice (no "run as administrator") and tells the model not to invent a problem
  when the run was clean.
 ## [0.28.0] - 2026-05-22
 ### Added
 - **AI explanations now include recent game logs.** When you press "Explain with AI" on a
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "rigdoctor"
-version = "0.28.0"
+version = "0.28.1"
 description = "Modular hardware monitoring & crash diagnostics for Linux gamers."
 readme = "README.md"
 requires-python = ">=3.11"
@@ -1,3 +1,3 @@
 """RigDoctor — modular hardware monitoring & crash diagnostics for Linux gamers."""
-__version__ = "0.28.0"
+__version__ = "0.28.1"
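The version is bumped in two places (`pyproject.toml` and `__version__`), and the changelog notes that they must agree with the release tag. A minimal sketch of a consistency check follows; `versions_match` is a hypothetical helper, not part of this commit, and it uses a simple regex rather than a real TOML parser.

```python
import re


def versions_match(pyproject_text: str, init_text: str) -> bool:
    """Hypothetical check: compare `version = "..."` with `__version__ = "..."`."""
    # ^ with re.MULTILINE anchors at line starts, so `version` does not match `__version__`.
    py = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, re.MULTILINE)
    init = re.search(r'^__version__\s*=\s*"([^"]+)"', init_text, re.MULTILINE)
    return bool(py and init and py.group(1) == init.group(1))


# Illustrative file contents, shortened from the hunks above.
pyproject = '[project]\nname = "rigdoctor"\nversion = "0.28.1"\n'
init_ok = '"""RigDoctor."""\n__version__ = "0.28.1"\n'
init_stale = '"""RigDoctor."""\n__version__ = "0.28.0"\n'

print(versions_match(pyproject, init_ok))
print(versions_match(pyproject, init_stale))
```

A check like this could run in CI before tagging, so a bump that touches only one of the two files is caught early.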
@@ -33,15 +33,20 @@ CLAUDE_MAX_TOKENS = 2000
 ANTHROPIC_VERSION = "2023-06-01"
 SYSTEM_PROMPT = (
-    "You are RigDoctor's hardware-diagnostics assistant for Linux gamers. You are given the "
-    "structured findings RigDoctor collected from this machine — which may include recent game, "
-    "Proton, and system log excerpts — plus a set of reference facts. Explain in plain language "
-    "what they mean, correlate any log errors with the findings to pinpoint WHEN and WHY things "
-    "went wrong, identify the most likely root cause, and give concrete, ordered next steps "
-    "(exact commands where useful). Base your reasoning ONLY on the data and reference facts "
-    "provided — do not invent readings, hardware, or log lines. Be concise and practical. "
-    "Present fixes as suggestions, and clearly warn before any step that could cause data loss "
-    "or instability. Format your answer in Markdown."
+    "You are RigDoctor's hardware-diagnostics assistant for Linux gamers (Ubuntu + NVIDIA, games "
+    "via Steam/Proton). You are given session context, the structured findings RigDoctor "
+    "collected — which may include recent game/Proton/system log excerpts scoped to this session "
+    "— plus reference facts. Use the GAME NAME from the session context; never guess the game "
+    "from log paths or app IDs. Correlate log errors with the findings to pinpoint WHEN and WHY "
+    "things went wrong, identify the most likely root cause, and give concrete, ordered next "
+    "steps with exact Linux commands where useful.\n"
+    "Rules: Base your reasoning ONLY on the data and reference facts provided — never invent "
+    "readings, hardware, or log lines. This is LINUX: never suggest Windows-only steps (e.g. "
+    "'run as administrator', registry edits, toggling antivirus). Treat log lines flagged BENIGN "
+    "in the reference facts as non-causal. If no crash was recorded and there are no warning or "
+    "critical findings, say plainly that the session looks healthy and do NOT manufacture a "
+    "problem. Be concise. Present fixes as suggestions and warn before anything that risks data "
+    "loss or instability. Format your answer in Markdown."
 )
@@ -64,6 +64,18 @@ ENTRIES: list[tuple[tuple[str, ...], str]] = [
    (("nvidia persistence", "persistence mode"),
     "NVIDIA persistence mode keeps the driver loaded when no app is using the GPU, avoiding "
     "re-init stalls — harmless to enable."),
    (("libnvidia-ml.so", "interface.h", "failed to load \"libnvidia-ml"),
     "BENIGN: a Steam log assertion 'Failed to load libnvidia-ml.so.1' (from interface.h) is "
     "logged on many normal launches — the Steam runtime sandbox can't see the host NVML library. "
     "It is NOT by itself a crash cause. Only investigate the driver if the GPU is genuinely "
     "undetected (nvidia-smi fails)."),
    (("minidump", ".dmp", "uploading minidump"),
     "BENIGN-by-default: a minidump upload line means a crash handler ran AND that the game/engine "
     "routinely uploads dumps; it is not proof that THIS session crashed unless a hard freeze or "
     "non-zero exit was also recorded. Don't treat a routine minidump line as the root cause."),
    (("fork without exec", "skipping destruction"),
     "BENIGN: 'pid X != Y, skipping destruction (fork without exec?)' is routine Steam/Proton "
     "process bookkeeping, not an error."),
 ]
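The entries above pair a tuple of keywords with a reference fact. The diff does not show how RigDoctor selects entries for a prompt, so the following is only a sketch under assumptions: `relevant_facts` is a hypothetical helper, case-insensitive substring matching is assumed, and the second fact is shortened for the example.

```python
# Entries shaped like ENTRIES above: (keywords, fact).
ENTRIES: list[tuple[tuple[str, ...], str]] = [
    (("fork without exec", "skipping destruction"),
     "BENIGN: 'pid X != Y, skipping destruction (fork without exec?)' is routine Steam/Proton "
     "process bookkeeping, not an error."),
    (("nvidia persistence", "persistence mode"),
     "NVIDIA persistence mode keeps the driver loaded when no app is using the GPU."),
]


def relevant_facts(text: str) -> list[str]:
    """Return the facts whose keywords occur in `text` (assumed: case-insensitive substring)."""
    haystack = text.lower()
    return [fact for keywords, fact in ENTRIES
            if any(keyword.lower() in haystack for keyword in keywords)]


# A log line of the kind the new BENIGN entry is meant to catch.
log_line = "pid 4242 != 4310, skipping destruction (fork without exec?)"
facts = relevant_facts(log_line)
print(facts)
```

Under this matching scheme only the BENIGN fact is returned for the sample line, which is what lets the system prompt's "treat lines flagged BENIGN as non-causal" rule apply to it.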
@@ -10,11 +10,41 @@ vkd3d/DXVK error, a crash line, the exit code) rather than only the sensor summa
 from __future__ import annotations
 import os
 import re
 import time
 from pathlib import Path
 # Steam keeps logs under its install root; ~/.steam/steam usually symlinks to the real one.
 _STEAM_LOG_DIRS = ("~/.steam/steam/logs", "~/.local/share/Steam/logs", "~/.steam/root/logs")
 _STEAM_LOG_FILES = ("console-linux.txt", "console_log.txt", "stderr.txt")
 _TS = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")
 def _line_epoch(line: str) -> float | None:
    m = _TS.match(line)
    if not m:
        return None
    try:
        return time.mktime(time.strptime(m.group(1), "%Y-%m-%d %H:%M:%S"))
    except ValueError:
        return None
 def _since_filter(text: str, since: float) -> str:
    """Keep lines from the first timestamp >= `since` onward (logs are chronological).
    Untimestamped lines before the window are dropped; once inside the window every line is
    kept (so multi-line entries survive). This scopes a long-lived Steam log to one session.
    """
    out: list[str] = []
    including = False
    for line in text.splitlines():
        epoch = _line_epoch(line)
        if epoch is not None and epoch >= since:
            including = True
        if including:
            out.append(line)
    return "\n".join(out)
 def _tail(path: Path, max_bytes: int) -> str:
@@ -51,17 +81,36 @@ def available() -> bool:
    return bool(_proton_logs() or _steam_console())
-def collect(max_bytes: int = 6000) -> str:
-    """Recent Proton + Steam log tails as one labelled text block ('' if none)."""
+def collect(since: float | None = None, max_bytes: int = 8000) -> str:
+    """Recent Proton + Steam log tails as one labelled text block ('' if none).
+    With ``since`` (epoch), scope to that session: skip a Proton log not written during/after
+    the session (a stale per-app log from an earlier game), and keep only Steam-console lines
+    timestamped at/after ``since`` — so we don't feed the model an unrelated past session.
+    """
    sections: list[str] = []
    protons = _proton_logs()
    if protons:
-        tail = _tail(protons[0], max_bytes).strip()
+        log = protons[0]
+        fresh = since is None or _mtime(log) >= since
+        tail = _tail(log, max_bytes).strip() if fresh else ""
        if tail:
-            sections.append(f"--- Proton log ({protons[0].name}) ---\n{tail}")
+            sections.append(f"--- Proton log ({log.name}) ---\n{tail}")
    console = _steam_console()
    if console:
-        tail = _tail(console, max_bytes).strip()
-        if tail:
-            sections.append(f"--- Steam log ({console.name}) ---\n{tail}")
+        raw = _tail(console, 40000 if since else max_bytes)
+        if since is not None:
+            raw = _since_filter(raw, since)
+        raw = raw.strip()[-max_bytes:].strip()
+        if raw:
+            sections.append(f"--- Steam log ({console.name}) ---\n{raw}")
    return "\n\n".join(sections)
+def _mtime(path: Path) -> float:
+    try:
+        return path.stat().st_mtime
+    except OSError:
+        return 0.0
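The Proton-log freshness test in `collect` reduces to comparing the file's mtime with the session start. A minimal standalone sketch of that comparison follows; `is_fresh` is a hypothetical helper written for this example (the commit inlines the expression), and the file name is illustrative.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Optional


def is_fresh(path: Path, since: Optional[float]) -> bool:
    """Hypothetical helper mirroring `since is None or _mtime(log) >= since`."""
    if since is None:
        return True  # no session window given: accept any log
    try:
        return path.stat().st_mtime >= since
    except OSError:
        return False  # unreadable file counts as stale, like the 0.0 fallback in _mtime


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "steam-9999.log"
    log.write_text("output from an earlier game")
    an_hour_ago = time.time() - 3600
    os.utime(log, (an_hour_ago, an_hour_ago))  # backdate the log by one hour

    session_start = time.time() - 60
    stale_result = is_fresh(log, session_start)    # last written before the session
    unscoped_result = is_fresh(log, None)          # no window: always accepted

    log.touch()  # simulate the game writing to the log during the session
    fresh_result = is_fresh(log, session_start)

print(stale_result, unscoped_result, fresh_result)
```

One consequence of using mtime is that a long-lived log touched during the session is included whole (up to `max_bytes`), even if most of its tail predates the session; only the Steam console log is filtered line by line.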
@@ -113,12 +113,29 @@ class DiagnosticDialog(QDialog):
    def _work_explain(self) -> None:
        from ..core import ai, gamelogs
-        text = ai.format_findings(self._result.findings, header="Diagnostic findings:")
-        text += "\n\nCapture summary:\n" + render_summary(self._result.summary)
-        logs = gamelogs.collect()
+        result = self._result
+        summary = result.summary
+        events = {kind for _ts, kind, _detail in summary.events}
+        clean = "session-stop" in events
+        gpu_lost = "gpu-lost" in events
+        lines = [f"Game: {result.game or 'unknown'}"]
+        if summary.start and summary.end:
+            lines.append(f"Capture duration: ~{int(summary.end - summary.start)}s")
+        outcome = "ended cleanly (no crash detected)" if clean else \
+            "ended without a clean stop (possible crash/freeze)"
+        if gpu_lost:
+            outcome += "; a GPU-lost event was recorded"
+        lines.append(f"Outcome: {outcome}")
+        lines.append("")
+        lines.append(ai.format_findings(result.findings, header="Findings:"))
+        lines.append("\nCapture summary:\n" + render_summary(summary))
+        since = (summary.start - 60) if summary.start else None
+        logs = gamelogs.collect(since=since)  # scoped to this session
        if logs:
-            text += "\n\nRecent game/Proton/Steam logs (newest at the end):\n" + logs
+            lines.append("\nGame/Proton/Steam logs for this session:\n" + logs)
-        self._explained.emit(ai.explain(text))
+        self._explained.emit(ai.explain("\n".join(lines)))
    def _on_explained(self, result) -> None:
        ok, text = result
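The session-context header that `_work_explain` now prepends can be exercised without Qt. In the sketch below, `Summary` and `session_header` are hypothetical stand-ins written for this example: they model only the fields the hunk reads (`start`, `end`, `events`) and reuse the event names and outcome strings from the diff.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Summary:
    """Hypothetical stand-in for the capture summary; only the fields the hunk reads."""
    start: Optional[float] = None
    end: Optional[float] = None
    events: list = field(default_factory=list)  # (timestamp, kind, detail) tuples


def session_header(game: Optional[str], summary: Summary) -> str:
    """Build the Game / Capture duration / Outcome lines the same way the hunk does."""
    events = {kind for _ts, kind, _detail in summary.events}
    lines = [f"Game: {game or 'unknown'}"]
    if summary.start and summary.end:
        lines.append(f"Capture duration: ~{int(summary.end - summary.start)}s")
    if "session-stop" in events:
        outcome = "ended cleanly (no crash detected)"
    else:
        outcome = "ended without a clean stop (possible crash/freeze)"
    if "gpu-lost" in events:
        outcome += "; a GPU-lost event was recorded"
    lines.append(f"Outcome: {outcome}")
    return "\n".join(lines)


# A 20-second run that stopped cleanly, like the case in the commit message.
clean_run = Summary(start=1000.0, end=1020.0, events=[(1020.0, "session-stop", "")])
header = session_header("Example Game", clean_run)
print(header)

# A run with no clean stop and a GPU-lost event, with an unknown game name.
crashed_run = Summary(start=1000.0, end=1015.0, events=[(1012.0, "gpu-lost", "")])
crash_header = session_header(None, crashed_run)
print(crash_header)
```

Putting these facts at the top of the prompt gives the model the game name and outcome directly, so it has no reason to infer either from log paths.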
@@ -1,6 +1,8 @@
 """Tests for M14 game/Proton/Steam log collection."""
 import os
 import tempfile
 import time
 import unittest
 from pathlib import Path
 from unittest import mock
@@ -45,5 +47,31 @@ class CollectTests(unittest.TestCase):
            self.assertEqual(gamelogs.collect(), "")
 class SinceScopingTests(unittest.TestCase):
    def test_since_filter_keeps_window_only(self):
        text = (
            "[2026-05-22 13:00:00] old session line\n"
            "[2026-05-22 13:00:01] another old line\n"
            "[2026-05-22 14:30:00] new session launch\n"
            "[2026-05-22 14:30:05] new session error\n"
        )
        since = time.mktime(time.strptime("2026-05-22 14:00:00", "%Y-%m-%d %H:%M:%S"))
        out = gamelogs._since_filter(text, since)
        self.assertIn("new session launch", out)
        self.assertIn("new session error", out)
        self.assertNotIn("old session", out)
    def test_collect_skips_stale_proton_log(self):
        tmp = Path(tempfile.mkdtemp())
        proton = tmp / "steam-9999.log"
        proton.write_text("stale proton output from an earlier game")
        old_mtime = time.time() - 3600
        os.utime(proton, (old_mtime, old_mtime))
        since = time.time() - 60  # session started a minute ago
        with mock.patch.object(gamelogs, "_proton_logs", return_value=[proton]), \
             mock.patch.object(gamelogs, "_steam_console", return_value=None):
            self.assertEqual(gamelogs.collect(since=since), "")  # stale log excluded
 if __name__ == "__main__":
    unittest.main()