Indexing everything goes unbounded fast. Shrink scope to one source of truth and a small curated corpus. Capture notes in one repeatable format, tag by task, and prune on a fixed cadence. That keeps retrieval predictable and the model inside its constraints.
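Concretely, the tag-and-prune loop can be this dumb (a sketch, names and the 90-day cadence are hypothetical):

```python
# Hypothetical sketch: one repeatable note format, tagged by task,
# pruned on a fixed cadence so the corpus stays bounded.
from dataclasses import dataclass

@dataclass
class Note:
    text: str
    task: str       # the single tag that drives retrieval
    age_days: int   # time since last touch

def prune(corpus: list[Note], max_age_days: int = 90) -> list[Note]:
    """Fixed-cadence prune: drop anything older than the cadence window."""
    return [n for n in corpus if n.age_days <= max_age_days]

def by_task(corpus: list[Note], task: str) -> list[Note]:
    """Retrieval stays predictable: exact filter on the task tag, nothing fuzzy."""
    return [n for n in corpus if n.task == task]
```

The point is that both operations are deterministic: retrieval is a tag filter and pruning is a date cutoff, so the corpus can only shrink between cadences.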
That’s another strong point, and I think it’s the pragmatic default: shrink scope, keep one source of truth, enforce a repeatable format, and prune on a cadence. It’s basically how you keep both retrieval and any automation predictable.
The tension I’m trying to understand is that in a lot of real setups the “corpus” isn’t voluntarily curated — it’s fragmented across machines/networks/tools, and the opportunity cost of “move everything into one place” is exactly why people fall back to grep and ad-hoc search.
Do you think the right answer is always “accept the constraint and curate harder”, or is there a middle ground where you can keep sources where they are but still get reliable re-entry (even if it’s incomplete/partial)?
I’m collecting constraints like this as the core design input (more context in my HN profile/bio if you want to compare notes).
The failure mode is missing constraints, not “coding skill”. Treat the model as a generator that must operate inside an explicit workflow: define the invariant boundaries, require a plan/diff before edits, run tests and static checks, and stop when uncertainty appears. That turns “hacky conditional” behaviour into controlled change.
Right. Each context window is a partial view, so it cannot “know the codebase” unless you supply stable artefacts. Treat project state as inputs: invariants, interfaces, constraints, and a small set of must-keep facts. Then force changes through a plan and a diff, and gate with tests and checks. That turns context limits into a controlled boundary instead of a surprise.
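The plan-then-diff gate above is mechanical enough to write down (a sketch under hypothetical names, not any real tool's API):

```python
# Hypothetical sketch: force every change through plan -> diff -> checks,
# and stop on any missing artefact instead of proceeding under uncertainty.
from dataclasses import dataclass

@dataclass
class Change:
    plan: str = ""
    diff: str = ""
    tests_passed: bool = False
    checks_passed: bool = False

def gate(change: Change) -> list[str]:
    """Return reasons to stop; an empty list means the change may land."""
    stops = []
    if not change.plan:
        stops.append("no plan")
    if not change.diff:
        stops.append("no diff")
    if not change.tests_passed:
        stops.append("tests not green")
    if not change.checks_passed:
        stops.append("static checks not green")
    return stops
```

Nothing here inspects the code itself; the gate only checks that the required artefacts exist and are green, which is exactly what makes it a boundary rather than a reviewer.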
Context condensation only stays safe when it behaves like a controlled artefact. Preserve the active directives, freeze a small set of must-keep facts, and treat the summary as versioned output with a stop rule when it drops constraints. That turns “near the limit” from random truncation into repeatable workflow.
A paper’s date does not invalidate its method. Findings stay useful only when you can re-run the same protocol on newer models and report deltas. Treat conclusions as conditional on the frozen tasks, criteria, and measurement, then update with replication, not rhetoric.
Most model research decays because the evaluation harness isn’t treated as a stable artefact. If you freeze the tasks, acceptance criteria, and measurement method, you can swap models and still compare apples to apples. Without that, each release forces a reset and people mistake novelty for progress.
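A frozen harness is small enough to sketch (toy tasks and exact-match criterion are illustrative assumptions, not a proposed benchmark):

```python
# Hypothetical sketch: freeze tasks + acceptance criterion, swap models,
# and report deltas instead of rebuilding the harness each release.

FROZEN_TASKS = [
    ("2+2", "4"),
    ("capital of France", "Paris"),
]

def score(model, tasks) -> float:
    """Pass rate against frozen tasks under a frozen exact-match criterion."""
    passed = sum(1 for prompt, expected in tasks if model(prompt) == expected)
    return passed / len(tasks)

def delta(model_a, model_b, tasks=FROZEN_TASKS) -> float:
    """Positive means model_b improved on the same frozen protocol."""
    return score(model_b, tasks) - score(model_a, tasks)
```

Because `FROZEN_TASKS` and the criterion never move, `delta` is the only number that changes between releases, which is what makes the comparison apples to apples.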
That pattern shows up when publishing has near-zero cost and review has no gate. The fix is procedural: define what counts as original contribution and require a quick verification pass before posting. Without an input filter and a stop rule, you get infinite rephrases that drown out the scarce primary work.
Citation checks are a workflow problem, not a model problem. Treat every reference as a dependency that must resolve and be reproducible. If the checker cannot fetch and validate it, it does not ship.
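"Reference as dependency" is a one-function check (a sketch; the injected `resolver` stands in for whatever fetch/validate step you actually run):

```python
# Hypothetical sketch: treat each citation as a dependency that must
# resolve before the draft ships. The resolver is injected so the check
# itself is reproducible; a real one might hit DOI or URL endpoints.

def unresolved(references: list[str], resolver) -> list[str]:
    """Return citations the resolver could not validate; ship only if empty."""
    return [ref for ref in references if not resolver(ref)]

known = {"ref-1"}  # placeholder registry of resolvable references
missing = unresolved(["ref-1", "ref-2"], known.__contains__)
```

If `missing` is non-empty, the document does not ship — same semantics as a build failing on an unresolvable dependency.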
Hallucinations get expensive when outputs run without a verification loop. Treat each claim as a hypothesis until it has evidence you can reproduce. A simple gate works in practice: source it, reproduce it, or discard it.
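That gate is literally three branches (a sketch with hypothetical names; `reproduce` stands in for whatever re-runs the evidence):

```python
# Hypothetical sketch of "source it, reproduce it, or discard it":
# a claim ships only with a source AND a reproduction that passes.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Claim:
    text: str
    source: Optional[str] = None                     # where the evidence lives
    reproduce: Optional[Callable[[], bool]] = None   # re-runs the check

def ship(claim: Claim) -> bool:
    """Unsourced or unreproducible claims are discarded, not debated."""
    if claim.source is None or claim.reproduce is None:
        return False
    return claim.reproduce()
```

The default is discard: a claim has to earn its way out of the gate, which is what keeps unverified output from running downstream.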
Context rots when it stays implicit. Make the system model an explicit artefact with fixed inputs and checkpoints, then update it on purpose. Otherwise you keep rebuilding the same picture from scratch.
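"Explicit artefact with checkpoints" can be as small as an immutable, versioned record (a sketch; the fields are illustrative):

```python
# Hypothetical sketch: the system model as a versioned artefact with
# fixed inputs, updated only on purpose via an explicit checkpoint.
from dataclasses import dataclass

@dataclass(frozen=True)
class SystemModel:
    version: int
    invariants: tuple[str, ...]
    interfaces: tuple[str, ...]

def checkpoint(model: SystemModel, new_invariants: tuple[str, ...]) -> SystemModel:
    """Every update creates a new version; the old picture is never mutated."""
    return SystemModel(model.version + 1, new_invariants, model.interfaces)
```

Because the dataclass is frozen, there is no in-place drift: the only way the model changes is a deliberate `checkpoint`, and old versions stay available for diffing.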
This is a workflow boundary problem showing up as a tool problem. When changes aren’t constrained by explicit inputs and checkpoints, models optimise locally and regress globally. Predictability comes from the workflow, not the model.