bug-import

Open-source in combo-skills

Published at Nox-Lumen-tech/combo-skills/bug-import. Install from chat: /skill-install Nox-Lumen-tech/combo-skills/bug-import

Elevator pitch

bug-import is not a simple "dump bugs into the KB" ETL Skill.
It runs scheduled batches that continuously distill the individual bug records stored in your KB into LTM-ready pattern facts, so Stage-2 reviews match patterns rather than keywords alone.

Why post-processing matters

Without distillation	With distillation
Retrieval surfaces raw bug chatter	Retrieval hits concise facts such as "module X is historically fragile"
Low precision, pricey tokens	High signal summaries with actionable wording
New hires rediscover tacit lore	Repeated lessons encoded as reusable facts
More tickets add noise	More tickets add evidence and raise fact confidence (up to the cap)

Individual bugs seldom reveal systemic risk; patterns require aggregate statistics. Phase 3 runs on a cron (default every 6h), never inline per ticket.

Three-phase pipeline

Phase 3 — four orthogonal dimensions

① `module_hotspot`

Item	Detail
Trigger	Same `module` accumulates ≥ 3 bugs
Template	"Module X accumulated N defects, mainly class Y issues; focus code review around Y-sensitive logic."
Base confidence	0.60
Bonuses	+0.10 at ≥5 / ≥10 tickets, +0.10 if any critical, cap 0.95
Tags	`["bug_pattern", "module:{module}", "pattern:{top_tag}"]`
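
The trigger, template, and tags above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the function name `module_hotspots` and the bug fields `module` and `pattern_tag` are assumptions made for the example, and confidence scoring is left out.

```python
from collections import Counter, defaultdict

MIN_BUGS_PER_MODULE = 3  # trigger threshold from the table above


def module_hotspots(bugs):
    """Return one candidate per module that has accumulated enough bugs.

    Assumes each bug is a dict with "module" and "pattern_tag" keys
    (hypothetical field names, not confirmed by the skill).
    """
    tags_by_module = defaultdict(list)
    for bug in bugs:
        tags_by_module[bug["module"]].append(bug["pattern_tag"])

    candidates = []
    for module, pattern_tags in tags_by_module.items():
        count = len(pattern_tags)
        if count < MIN_BUGS_PER_MODULE:
            continue  # below the trigger, no fact for this module
        # The most frequent pattern tag plays the role of "class Y".
        top_tag = Counter(pattern_tags).most_common(1)[0][0]
        candidates.append({
            "content": (
                f"Module {module} accumulated {count} defects, mainly "
                f"{top_tag} issues; focus code review around "
                f"{top_tag}-sensitive logic."
            ),
            "tags": ["bug_pattern", f"module:{module}", f"pattern:{top_tag}"],
        })
    return candidates
```

Grouping first and filtering afterwards keeps the threshold check in one place, so the same shape can be reused for the other dimensions.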

② `cross_module_pattern`

Item	Detail
Trigger	Pattern tag appears in ≥ 2 modules
Template	"`pattern_tag` spread across multiple modules ⇒ systemic risk footprint"
Base confidence	0.65
Bonuses	+0.10 for ≥3 modules, +0.10 for ≥5 total bugs; cap 0.95
Tags	`["bug_pattern", "cross_module", "pattern:{pattern_tag}"]`
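
As a rough sketch of this dimension, the snippet below collects the set of modules per pattern tag and applies the base confidence and bonuses from the table. The function name and the bug fields `module` and `pattern_tag` are hypothetical; only the thresholds and numbers come from the table.

```python
from collections import defaultdict


def cross_module_patterns(bugs):
    """Find pattern tags that span at least two modules and score them.

    Assumes each bug is a dict with "module" and "pattern_tag" keys
    (hypothetical field names).
    """
    modules_by_tag = defaultdict(set)
    bugs_by_tag = defaultdict(int)
    for bug in bugs:
        modules_by_tag[bug["pattern_tag"]].add(bug["module"])
        bugs_by_tag[bug["pattern_tag"]] += 1

    candidates = []
    for tag, modules in modules_by_tag.items():
        if len(modules) < 2:
            continue  # trigger: the tag must appear in >= 2 modules
        confidence = 0.65  # base confidence
        if len(modules) >= 3:
            confidence += 0.10  # bonus for >= 3 modules
        if bugs_by_tag[tag] >= 5:
            confidence += 0.10  # bonus for >= 5 total bugs
        candidates.append({
            "tags": ["bug_pattern", "cross_module", f"pattern:{tag}"],
            "modules": sorted(modules),
            "confidence": round(min(confidence, 0.95), 2),  # cap 0.95
        })
    return candidates
```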

③ `file_hotspot`

Item	Detail
Trigger	Same `affected_file` appears ≥ 2 times
Template	Highlights chronic defect magnets for tighter review cadence
Base confidence	0.70 (file-level is sharper than module summaries)
Bonuses	+0.10 for ≥3 tickets, +0.10 for critical exposures; cap 0.95
Tags	`["bug_pattern", "file_hotspot", "file:{path}"]`
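
The scoring rule in this table is small enough to write out directly. The helper below is an illustrative sketch (the name `file_hotspot_confidence` is an assumption, not part of the skill); it returns `None` when the trigger is not met.

```python
def file_hotspot_confidence(ticket_count, has_critical):
    """Score a file hotspot using the numbers from the table above.

    Returns None when the file appears in fewer than two tickets,
    because the trigger has not fired.
    """
    if ticket_count < 2:
        return None
    confidence = 0.70  # base confidence for file-level facts
    if ticket_count >= 3:
        confidence += 0.10  # bonus for >= 3 tickets
    if has_critical:
        confidence += 0.10  # bonus for critical exposure
    # With these bonuses the maximum is 0.90, so the 0.95 cap is a guard only.
    return round(min(confidence, 0.95), 2)
```

For example, a file touched by four tickets, one of them critical, scores 0.90, while a file with exactly two non-critical tickets stays at the 0.70 base.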

④ `recurring_root_cause`

Item	Detail
Trigger	Root-cause keywords co-occur across ≥ 3 defects (TF‑IDF / co-mention heuristics)
Template	Flags recurring RCA families for playbook checkpoints
Base confidence	0.60
Bonuses	+0.15 for exact keyword overlaps, +0.10 for more than 5 bugs
Tags	`["bug_pattern", "recurring_root_cause", "cause:{keyword}"]`

The four dimensions deliberately do not overlap: one view per module, one across modules, one per file path, and one per root-cause narrative.

Incremental recomputation flow

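The cron job first checks whether any bugs have arrived since the previous run. New rows that lack ltm_extracted_at are fed to the aggregators, and each resulting pattern is handled as follows:

1. If it matches an existing fact via metadata.source_bug_ids, bump that fact's confidence and update its narrative.
2. Otherwise, mint a new fact row.
3. Mark the processed bugs once the facts are written.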
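
One way to picture this flow is the sketch below. It rests on several assumptions that the text does not confirm: a "match" means the same extraction_type plus at least one shared bug ID, "bump confidence" means keeping the higher of the two values, and the helper names (`pending_bugs`, `apply_candidate`, `mark_processed`) are hypothetical.

```python
from datetime import datetime, timezone


def pending_bugs(bugs):
    """Select bugs that no earlier run has processed."""
    return [bug for bug in bugs if not bug.get("ltm_extracted_at")]


def apply_candidate(facts, candidate):
    """Merge a candidate fact into `facts` in place; return the stored fact."""
    cand_meta = candidate["metadata"]
    for fact in facts:
        meta = fact["metadata"]
        same_kind = meta["extraction_type"] == cand_meta["extraction_type"]
        shared = set(meta["source_bug_ids"]) & set(cand_meta["source_bug_ids"])
        if same_kind and shared:
            # Existing fact: merge evidence (first-seen order, no duplicates),
            # keep the higher confidence, and refresh the narrative.
            merged = meta["source_bug_ids"] + cand_meta["source_bug_ids"]
            meta["source_bug_ids"] = list(dict.fromkeys(merged))
            fact["confidence"] = max(fact["confidence"], candidate["confidence"])
            fact["content"] = candidate["content"]
            return fact
    facts.append(candidate)  # no match: mint a new fact row
    return candidate


def mark_processed(bugs, now=None):
    """Stamp bugs so the next run skips them."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    for bug in bugs:
        bug["ltm_extracted_at"] = stamp
```

Stamping the bugs only after the facts are written means a failed run leaves them pending, so the next cron pass picks them up again.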

When a ticket is later rejected or marked as a false positive, subsequent passes automatically downgrade or delete the facts that relied on it.

Example fact envelope

{
  "type": "fact",
  "content": "Module BMS recorded eight historical defects revolving around heartbeat timeout handling …",
  "confidence": 0.85,
  "tags": ["bug_pattern", "module:BMS", "pattern:heartbeat_timeout"],
  "metadata": {
    "source_bug_ids": ["BUG-102", "BUG-118"],
    "extraction_type": "module_hotspot",
    "project": "BMS-v3",
    "severity_distribution": {"critical": 2, "major": 4, "minor": 2}
  }
}

source_bug_ids is the audit breadcrumb powering drill-down evidence during L2 findings.

Safety budgets

Guard	Threshold	Overflow behavior
Facts per cron	100	Keep highest-confidence survivors
`content` length	1000 chars	Hard truncate tail
`source_bug_ids`	50	Keep latest 50 references
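
The three guards can be applied in a single pass over a run's output. The sketch below is illustrative only: the function name `enforce_budgets` is hypothetical, and it assumes `source_bug_ids` is ordered oldest to newest, so "latest 50" means the last 50 entries.

```python
MAX_FACTS_PER_RUN = 100
MAX_CONTENT_CHARS = 1000
MAX_SOURCE_BUG_IDS = 50


def enforce_budgets(facts):
    """Return a bounded copy of `facts` without mutating the input."""
    # Guard 1: keep only the highest-confidence facts for this run.
    kept = sorted(facts, key=lambda f: f["confidence"], reverse=True)
    kept = kept[:MAX_FACTS_PER_RUN]

    bounded = []
    for fact in kept:
        meta = dict(fact["metadata"])
        # Guard 3: keep the latest references (assumes oldest-first order).
        meta["source_bug_ids"] = meta["source_bug_ids"][-MAX_SOURCE_BUG_IDS:]
        bounded.append({
            **fact,
            # Guard 2: hard-truncate the tail of overly long content.
            "content": fact["content"][:MAX_CONTENT_CHARS],
            "metadata": meta,
        })
    return bounded
```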

End-to-end reviewer story

Historic tickets on bms/heartbeat.c produced a file_hotspot fact that ties together BUG-118, BUG-203, BUG-301 and BUG-412. A small PR that tweaks the interval and adds a guard on hb_enabled now triggers this retrieval:

unified_search(
  sources=["fact"],
  tags_include=["bug_pattern", "file:bms/heartbeat.c"],
  min_confidence=0.7
)

The retrieved facts give the LLM the linked KB narratives, so reviewers see how this code has failed before rather than lint noise alone, including concurrency regressions that match the boundary cases in BUG-203.
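
To make the filter semantics concrete, here is an in-memory approximation of the query above. It is not the `unified_search` implementation: the helper name `filter_facts` is hypothetical, and it assumes `tags_include` requires every listed tag to be present, which the source does not state.

```python
def filter_facts(facts, tags_include, min_confidence):
    """Return facts that carry every requested tag at or above a confidence floor.

    Assumes all-of semantics for tags_include (an assumption for this sketch).
    """
    wanted = set(tags_include)
    return [
        fact
        for fact in facts
        if wanted <= set(fact["tags"]) and fact["confidence"] >= min_confidence
    ]


facts = [
    {"tags": ["bug_pattern", "file_hotspot", "file:bms/heartbeat.c"], "confidence": 0.9},
    {"tags": ["bug_pattern", "file_hotspot", "file:bms/heartbeat.c"], "confidence": 0.6},
    {"tags": ["bug_pattern", "module:BMS"], "confidence": 0.85},
]
hits = filter_facts(facts, ["bug_pattern", "file:bms/heartbeat.c"], 0.7)
```

Under these assumptions only the first fact is returned: the second falls below the confidence floor and the third lacks the file tag.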

CLI / chat triggers

/bug-import import closed bugs for Jira project ABC from the past 12 months

Natural intents: "import CSV defects", "load historical bugs".

Constraints

Rule	Explanation
Avoid megabatches	Shard imports to bound LTM workloads
Scrub sensitivity	Respect export policies before ingestion
Tenant isolation	Imported corpora remain tenant-bound
Expect lag	Facts appear after the next cron run, not instantly

Partner skill	Purpose
code-review	Primary consumer
memory-sdk	Low-level LTM RW
alm-integration	Optionally push structured findings outbound
html-report	Visual storytelling