Nox-Lumen Mfg

bug-import

Open-source in combo-skills

Published at Nox-Lumen-tech/combo-skills/bug-import. Install from chat: /skill-install Nox-Lumen-tech/combo-skills/bug-import

Elevator pitch

bug-import is not a simple "dump bugs into the KB" ETL Skill.
It uses scheduled batches to continuously turn the individual bug records stored in your KB into LTM-ready pattern facts, so that Stage‑2 reviews match patterns rather than only keywords.

Why post-processing matters

| Without distillation | With distillation |
| --- | --- |
| Retrieval surfaces raw bug chatter | Retrieval hits concise facts such as "module X is historically fragile" |
| Low precision, pricey tokens | High-signal summaries with actionable wording |
| New hires rediscover tacit lore | Repeated lessons encoded as reusable facts |
| Too many noisy tickets | More evidence improves confidence ceilings |

Individual bugs seldom reveal systemic risk; patterns require aggregate statistics. Phase 3 runs on a cron (default every 6h), never inline per ticket.

Three-phase pipeline

[Diagram: three-phase pipeline]

Phase 3 — four orthogonal dimensions

module_hotspot

| Item | Detail |
| --- | --- |
| Trigger | Same module accumulates ≥ 3 bugs |
| Template | "Module 'X' accumulated N defects, mainly class Y issues; focus code review around Y-sensitive logic." |
| Base confidence | 0.60 |
| Bonuses | +0.10 at ≥ 5 / ≥ 10 tickets, +0.10 if any critical; cap 0.95 |
| Tags | ["bug_pattern", "module:{module}", "pattern:{top_tag}"] |
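
The trigger and bonus rules above can be sketched as a small scoring function. This is a minimal illustration, not the skill's actual implementation: the bug record fields (`id`, `module`, `severity`, `pattern_tag`) and the name `module_hotspots` are assumptions, and the "≥ 5 / ≥ 10" bonus is read here as +0.10 at each threshold.

```python
from collections import Counter, defaultdict

BASE_CONFIDENCE = 0.60
CONFIDENCE_CAP = 0.95


def module_hotspots(bugs, min_bugs=3):
    """Return one candidate fact per module that has at least `min_bugs` bugs.

    `bugs` is a list of dicts with hypothetical keys:
    id, module, severity, pattern_tag.
    """
    by_module = defaultdict(list)
    for bug in bugs:
        by_module[bug["module"]].append(bug)

    facts = []
    for module, group in by_module.items():
        count = len(group)
        if count < min_bugs:
            continue  # below the trigger threshold

        confidence = BASE_CONFIDENCE
        if count >= 5:
            confidence += 0.10
        if count >= 10:  # assumption: the bonus applies again at 10 tickets
            confidence += 0.10
        if any(bug["severity"] == "critical" for bug in group):
            confidence += 0.10

        # The most frequent pattern tag in the module becomes the fact's pattern tag.
        top_tag = Counter(bug["pattern_tag"] for bug in group).most_common(1)[0][0]
        facts.append({
            "type": "fact",
            "confidence": round(min(confidence, CONFIDENCE_CAP), 2),
            "tags": ["bug_pattern", f"module:{module}", f"pattern:{top_tag}"],
            "metadata": {
                "extraction_type": "module_hotspot",
                "source_bug_ids": [bug["id"] for bug in group],
            },
        })
    return facts
```

Rounding to two decimals keeps the stored confidence free of floating-point drift, so a module with five bugs and one critical lands on 0.8 rather than 0.7999….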

cross_module_pattern

| Item | Detail |
| --- | --- |
| Trigger | Pattern tag appears in ≥ 2 modules |
| Template | "pattern_tag spread across multiple modules ⇒ systemic risk footprint" |
| Base confidence | 0.65 |
| Bonuses | +0.10 for ≥ 3 modules, +0.10 for ≥ 5 total bugs; cap 0.95 |
| Tags | ["bug_pattern", "cross_module", "pattern:{pattern_tag}"] |
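
A cross-module check differs from the module view in that it groups by pattern tag and counts distinct modules. The sketch below illustrates that grouping under assumed field names (`id`, `module`, `pattern_tag`); the `modules` metadata key and the function name are hypothetical and not taken from the skill itself.

```python
from collections import defaultdict


def cross_module_patterns(bugs):
    """Return one candidate fact per pattern tag seen in at least two modules."""
    modules_by_tag = defaultdict(set)
    ids_by_tag = defaultdict(list)
    for bug in bugs:
        modules_by_tag[bug["pattern_tag"]].add(bug["module"])
        ids_by_tag[bug["pattern_tag"]].append(bug["id"])

    facts = []
    for tag, modules in modules_by_tag.items():
        if len(modules) < 2:
            continue  # pattern is confined to a single module

        confidence = 0.65
        if len(modules) >= 3:
            confidence += 0.10
        if len(ids_by_tag[tag]) >= 5:
            confidence += 0.10

        facts.append({
            "type": "fact",
            "confidence": round(min(confidence, 0.95), 2),
            "tags": ["bug_pattern", "cross_module", f"pattern:{tag}"],
            "metadata": {
                "extraction_type": "cross_module_pattern",
                "modules": sorted(modules),  # hypothetical key, for illustration
                "source_bug_ids": ids_by_tag[tag],
            },
        })
    return facts
```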

file_hotspot

| Item | Detail |
| --- | --- |
| Trigger | Same affected_file appears in ≥ 2 tickets |
| Template | Highlights chronic defect magnets for tighter review cadence |
| Base confidence | 0.70 (file-level is sharper than module summaries) |
| Bonuses | +0.10 for ≥ 3 tickets, +0.10 for critical exposures; cap 0.95 |
| Tags | ["bug_pattern", "file_hotspot", "file:{path}"] |

recurring_root_cause

| Item | Detail |
| --- | --- |
| Trigger | Root-cause keywords co-occur in ≥ 3 defects (TF‑IDF / co-mention heuristics) |
| Template | Flags recurring RCA families for playbook checkpoints |
| Base confidence | 0.60 |
| Bonuses | +0.15 for exact keyword overlaps, +0.10 beyond five bugs |
| Tags | ["bug_pattern", "recurring_root_cause", "cause:{keyword}"] |
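
To make the co-mention trigger concrete, here is a deliberately simplified sketch. It uses a plain document-frequency count (how many distinct bugs mention a keyword) as a stand-in for the TF‑IDF / co-mention heuristics, and it leaves the bonus rules out. The `root_cause` field, the stopword list, and the function name are assumptions for illustration only.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list; a real deployment would need a fuller one.
STOPWORDS = {"the", "a", "an", "in", "on", "of", "to", "and", "is", "was", "not", "when", "after"}


def recurring_root_causes(bugs, min_bugs=3):
    """Map each keyword to the bug ids mentioning it, keeping keywords seen in >= min_bugs bugs."""
    bugs_by_keyword = defaultdict(set)
    for bug in bugs:
        # Each keyword counts once per bug, however often it is repeated in the text.
        words = set(re.findall(r"[a-z_]+", bug["root_cause"].lower()))
        for word in words - STOPWORDS:
            bugs_by_keyword[word].add(bug["id"])

    return {
        keyword: sorted(ids)
        for keyword, ids in bugs_by_keyword.items()
        if len(ids) >= min_bugs
    }
```

Each surviving keyword would then seed a fact tagged `cause:{keyword}`, with the returned ids as its `source_bug_ids`.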

These vantage points are deliberately non-duplicative: module ↔ cross-cutting ↔ filepath ↔ RCA narrative.

Incremental recomputation flow

Each cron run checks whether new bugs have arrived since the previous run. Rows that lack ltm_extracted_at feed the aggregators:

  • Match existing facts via metadata.source_bug_ids → bump confidence/update narrative
  • Otherwise mint new fact rows
  • Mark processed bugs once written

When tickets are later rejected or marked false-positive, subsequent passes automatically downgrade or delete the affected facts.
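
The match-or-mint step can be sketched as follows. This is an illustration under stated assumptions, not the skill's code: a candidate is treated as matching an existing fact when both share an `extraction_type` and at least one entry in `metadata.source_bug_ids`, the recomputed confidence simply replaces the old one, and the helper names `merge_candidates` and `mark_processed` are hypothetical.

```python
from datetime import datetime, timezone


def merge_candidates(existing_facts, candidates):
    """Update facts that share source bugs with a candidate; append the rest."""
    for candidate in candidates:
        cand_meta = candidate["metadata"]
        cand_ids = set(cand_meta["source_bug_ids"])
        match = next(
            (
                fact for fact in existing_facts
                if fact["metadata"]["extraction_type"] == cand_meta["extraction_type"]
                and cand_ids & set(fact["metadata"]["source_bug_ids"])
            ),
            None,
        )
        if match is None:
            existing_facts.append(candidate)  # no overlap: mint a new fact
            continue
        # Overlap: refresh narrative and confidence, and extend provenance.
        match["content"] = candidate["content"]
        match["confidence"] = candidate["confidence"]
        match["metadata"]["source_bug_ids"] = sorted(
            cand_ids | set(match["metadata"]["source_bug_ids"])
        )
    return existing_facts


def mark_processed(bugs, now=None):
    """Stamp ltm_extracted_at on every bug that has not been processed yet."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    for bug in bugs:
        if not bug.get("ltm_extracted_at"):
            bug["ltm_extracted_at"] = stamp
    return bugs
```

Because the candidate's confidence overwrites the stored value, the same path handles both directions: new evidence raises it, and a recomputation after rejected tickets lowers it.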

Example fact envelope

{
  "type": "fact",
  "content": "Module BMS recorded eight historical defects revolving around heartbeat timeout handling …",
  "confidence": 0.85,
  "tags": ["bug_pattern", "module:BMS", "pattern:heartbeat_timeout"],
  "metadata": {
    "source_bug_ids": ["BUG-102", "BUG-118"],
    "extraction_type": "module_hotspot",
    "project": "BMS-v3",
    "severity_distribution": {"critical": 2, "major": 4, "minor": 2}
  }
}

source_bug_ids is the audit breadcrumb powering drill-down evidence during L2 findings.

Safety budgets

| Guard | Threshold | Overflow behavior |
| --- | --- | --- |
| Facts per cron | 100 | Keep highest-confidence survivors |
| content length | 1000 chars | Hard-truncate the tail |
| source_bug_ids | 50 | Keep latest 50 references |
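
A minimal sketch of how the three guards could be applied to one run's output. The function name `apply_budgets` is hypothetical, and the code assumes `source_bug_ids` is ordered oldest to newest so that the tail of the list holds the latest references.

```python
MAX_FACTS_PER_RUN = 100
MAX_CONTENT_CHARS = 1000
MAX_SOURCE_BUG_IDS = 50


def apply_budgets(facts):
    """Return budget-compliant copies of `facts`, highest confidence first."""
    survivors = sorted(facts, key=lambda fact: fact["confidence"], reverse=True)
    trimmed = []
    for fact in survivors[:MAX_FACTS_PER_RUN]:
        metadata = dict(fact["metadata"])
        # Assumes source_bug_ids is ordered oldest to newest.
        metadata["source_bug_ids"] = metadata["source_bug_ids"][-MAX_SOURCE_BUG_IDS:]
        trimmed.append({
            **fact,
            "content": fact["content"][:MAX_CONTENT_CHARS],  # hard-truncate the tail
            "metadata": metadata,
        })
    return trimmed
```

The function builds new dicts instead of mutating its input, so the untrimmed facts stay available for logging or debugging.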

End-to-end reviewer story

Historical tickets on bms/heartbeat.c produced a file_hotspot fact linking BUG‑118, BUG‑203, BUG‑301, and BUG‑412. A small PR that tweaks the interval and adds a guard on hb_enabled now triggers retrieval:

unified_search(
  sources=["fact"],
  tags_include=["bug_pattern", "file:bms/heartbeat.c"],
  min_confidence=0.7
)

The retrieved facts give the LLM the linked KB narratives, so reviewers see how this file has failed before rather than lint noise alone, including concurrency regressions that line up with the BUG‑203 boundary cases.
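
To show which facts a query like the one above would select, here is a local stand-in for the filter. It is not the `unified_search` implementation: `filter_facts` is a hypothetical helper, and it assumes `tags_include` means a fact must carry every listed tag and that `min_confidence` is an inclusive floor.

```python
def filter_facts(facts, tags_include, min_confidence):
    """Keep facts that carry every requested tag and meet the confidence floor."""
    required = set(tags_include)
    return [
        fact for fact in facts
        if required <= set(fact["tags"]) and fact["confidence"] >= min_confidence
    ]
```

Under these assumptions a file_hotspot fact at its base confidence of 0.70 already passes a `min_confidence` of 0.7, while a module_hotspot fact at its 0.60 base would need at least one bonus to clear the same floor.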

CLI / chat triggers

/bug-import import closed bugs for Jira project ABC from the past 12 months

Natural intents: "import CSV defects", "load historical bugs".

Constraints

| Rule | Explanation |
| --- | --- |
| Avoid megabatches | Shard imports to bound LTM workloads |
| Scrub sensitivity | Respect export policies before ingestion |
| Tenant isolation | Imported corpora remain tenant-bound |
| Expect lag | Facts appear after the next cron run, not instantly |

Why scheduled—not streaming?

Dimensional thresholds (≥3 bugs, RCA keyword convergence) inherently need batch statistics. Immediate updates would churn low-evidence placeholders or thrash partially formed facts—wasting compute and polluting LTM.

What happens when corrections arrive?

Corrected labels trigger a recomputation pass: confidence drops or the fact is removed entirely, and its provenance is updated to match.

Overlap between module hotspots & file hotspots?

They stack: module-level cautions escalate with file-level magnifiers—a feature, not duplication.

Ecosystem linkage

| Partner skill | Purpose |
| --- | --- |
| code-review | Primary consumer |
| memory-sdk | Low-level LTM read/write |
| alm-integration | Optionally push structured findings outbound |
| html-report | Visual storytelling |
