fix(db): make gaps --refresh non-destructive (FK-safe insert_gaps)
insert_gaps did a blanket 'DELETE FROM gaps', which fails with 'FOREIGN KEY constraint failed' whenever proposal_gaps references a gap (generated proposals). Delete only gaps not referenced by a proposal so the refresh preserves proposal linkage and never trips the FK. Also logs the 2026-05-22 data refresh (761->889 drafts) in dev-journal.
@@ -4,6 +4,23 @@
---
### 2026-05-22 SESSION — Data refresh as of today: 761 → 889 drafts (full Sonnet pipeline)
**What**: First full corpus refresh since 2026-03-08. Fetched the delta from Datatracker (128 new drafts, all agent/identity/oauth-token topics), backfilled their full text, then ran the whole pipeline on Sonnet: rate → embed → extract ideas → score novelty → gap analysis → idea embeddings → convergence. Synced the fresh `drafts.db` to the server and restarted the `ietf` container so ietf.nennemann.de serves it.
**Why**: The deployed site was showing March data; the user wanted it current.
**Result** (live API stats): 816 relevant drafts (889 total, 73 false-positives), 722 authors, 973 ideas (avg novelty 2.8), 18 gaps, 170 cross-org convergent ideas (was 132). Tracked token usage ~1.0M in / 472K out on Sonnet.
**Surprise / lessons**:
- The fetch pipeline inserted the 128 new drafts but left all of them **without full text** (the text URL needs the `-NN` revision suffix, so the per-source download skipped them). Wrote `scripts/backfill-unrated-text.py` (rev-fallback) to fix this, since analysis quality depends on full text.
- The shell `ANTHROPIC_API_KEY` env var was **stale (401)**; the valid key was in `.env`. python-dotenv doesn't override an existing env var, so the CLI silently used the bad one. Had to pass the `.env` key explicitly.
- **Bug fixed**: `db.insert_gaps()` did a blanket `DELETE FROM gaps`, which trips the `proposal_gaps.gap_id` FK whenever generated proposals exist (they did: 3 proposals / 7 links). Changed it to delete only gaps not referenced by a proposal, so `gaps --refresh` is non-destructive.
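
The stale-key lesson above comes down to load precedence. A minimal stdlib sketch of that precedence, assuming a simple `KEY=VALUE` file: `load_env_file` and `DEMO_API_KEY` are hypothetical names for illustration, not this project's code. In python-dotenv itself the equivalent switch is `load_dotenv(override=True)`.

```python
import os
import tempfile


def load_env_file(path, override=False):
    """Load KEY=VALUE lines from a file into os.environ.

    With override=False (mirroring python-dotenv's default) a variable
    that is already set in the environment wins over the file.
    """
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blanks, comments, and lines that are not assignments.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            key, value = key.strip(), value.strip().strip("'\"")
            if override or key not in os.environ:
                os.environ[key] = value


def demo():
    """Return the value seen without and with override."""
    os.environ["DEMO_API_KEY"] = "stale-shell-value"
    with tempfile.TemporaryDirectory() as tmp:
        env_path = os.path.join(tmp, ".env")
        with open(env_path, "w", encoding="utf-8") as fh:
            fh.write("DEMO_API_KEY=fresh-file-value\n")
        load_env_file(env_path)                 # shell value survives
        kept = os.environ["DEMO_API_KEY"]
        load_env_file(env_path, override=True)  # file value wins
        replaced = os.environ["DEMO_API_KEY"]
    del os.environ["DEMO_API_KEY"]
    return kept, replaced
```

Whether to override by default is a judgment call: letting the shell win is handy for one-off overrides, but it hides a stale exported key exactly as described here.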
**Cost**: ~$10 tracked (Sonnet). The ideas/gaps pages are dev-only and not shown on the production site, but were refreshed anyway per user request.
---
### 2026-05-22 SESSION — Deployed the web dashboard at ietf.nennemann.de
**What**: Brought the Flask dashboard online on the nennemann-dev server (Hetzner CAX21) behind Caddy at `https://ietf.nennemann.de`, basic_auth gated (shared `vorschau` preview password), `noindex`. Added an `ietf` Docker service to `nennemann-biz/infra/dev/docker-compose.yml` (build context `/home/dev/repos/research.ietf`, host :8082 -> container :5000, data dir mounted read-write so pageview analytics persist). Container runs in PRODUCTION mode (admin routes 404).
@@ -975,7 +975,12 @@ class Database:
     # --- Gaps ---

     def insert_gaps(self, gaps: list[dict]) -> None:
-        self.conn.execute("DELETE FROM gaps")  # Replace old analysis
+        # Replace old analysis, but keep any gap still referenced by a generated
+        # proposal (proposal_gaps.gap_id FK) so a refresh never destroys proposal
+        # linkage or trips the foreign-key constraint.
+        self.conn.execute(
+            "DELETE FROM gaps WHERE id NOT IN (SELECT gap_id FROM proposal_gaps)"
+        )
         now = datetime.now(timezone.utc).isoformat()
         for g in gaps:
             self.conn.execute(
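
To see the failure and the fix side by side, here is a minimal in-memory sketch. The two-table schema, the `topic` column, and the `refresh_gaps` helper are assumptions for illustration; the real `gaps` and `proposal_gaps` tables have more columns. It assumes `gap_id` is `NOT NULL`; if it could be NULL, `NOT IN` would match no rows and a `NOT EXISTS` subquery would be the safer form.

```python
import sqlite3


def refresh_gaps(conn, topics):
    """Drop only gaps no proposal references, then insert the new ones."""
    conn.execute(
        "DELETE FROM gaps WHERE id NOT IN (SELECT gap_id FROM proposal_gaps)"
    )
    conn.executemany("INSERT INTO gaps (topic) VALUES (?)", [(t,) for t in topics])
    conn.commit()


def demo():
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE gaps (id INTEGER PRIMARY KEY, topic TEXT NOT NULL);
        CREATE TABLE proposal_gaps (
            proposal_id INTEGER NOT NULL,
            gap_id INTEGER NOT NULL REFERENCES gaps(id)
        );
        INSERT INTO gaps (id, topic) VALUES (1, 'linked'), (2, 'orphan');
        INSERT INTO proposal_gaps (proposal_id, gap_id) VALUES (10, 1);
    """)
    # The old blanket delete trips the FK because gap 1 is referenced.
    try:
        conn.execute("DELETE FROM gaps")
        blanket_failed = False
    except sqlite3.IntegrityError:
        blanket_failed = True
    # The filtered delete removes only the orphan and keeps the linked gap.
    refresh_gaps(conn, ["fresh"])
    topics = [row[0] for row in conn.execute("SELECT topic FROM gaps ORDER BY id")]
    conn.close()
    return blanket_failed, topics
```

One consequence of this approach: gaps kept for their proposal links stay in the table next to the freshly inserted ones, so the table can hold rows from more than one analysis run.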