Alternate Title: What Anthropic's .claude Permissions, Opus 4.7, and a Pentest Taught Me About Scaffolding Claude
Act I: The 37 Minutes
On April 6, 2026 at 12:58:25 in the afternoon, I committed an empty V3.1 workflow scaffold to a fresh project I called v3.1-test5. I typed one sentence into the orchestrator skill: "AI web chat MVP where users can register, log in, ask a question and get an AI answer, with an admin interface to manage users." I pasted in a paragraph of success criteria, and watched.
I watched all 37 minutes and 31 seconds of it.
At 13:35:56 the same afternoon, the [Vision Complete] commit landed. Three work units delivered, parallel worktrees merged, one auto-resolved bug, a working Flask MVP with auth + admin + LLM Q&A, and, because the security-planner skill was already wired in, a Threat Model section in the vision document with a route-authorization matrix the LLM had filled in itself. The chat MVP ran. Logging in worked. Asking the local LLM a question returned an answer. The admin page listed users. The whole thing was secure enough by inspection.
37 minutes. One sentence in. Working app out. I sat there at my desk and just watched it work.
That was the high-water mark.
Two days later, on April 8, I filed BUG-378: "Worktree storage in .claude/ triggers permission approval pauses; move to prescientflow-artifacts/worktrees/." A week after that, BUG-403: "Worktree subagent edits to project files still trigger permission prompts." Within a month, around fifty bugs in the BUG-340..BUG-440 range existed for one reason: the autonomy that built the chat MVP in 37 minutes was no longer autonomous.
Anthropic had tightened the permissions on .claude/. They were right to do it. It still broke my dream: a mostly autonomous, solid team of agents supporting "vibe coders" who don't have the time or technical know-how to produce detailed specs and fine-grained design requirements, delivering a build that's closer to production-ready.
Act II: Why PrescientFlow Exists in the First Place
If you're new to this blog, the workflow now has a name: PrescientFlow. It's the subject of most posts in this series since The $0.42 Question in December and I Caught Opus Cutting Corners Again in February. It's a workflow system built on top of Claude Code that takes a vision document and runs it through five-agent reviews, a memory-aware planner, dependency-batched parallel work units, post-sprint QA, and a [Vision Complete] commit at the end. It exists because in early 2025 I realized something simple and inconvenient: a workflow with hooks to validate Claude's output is essential to getting useful code out of it, because an AI trained on average code produces exactly what a solid SDLC exists to weed out: average code. Not nice-to-have. Essential.
I built PrescientFlow because I had a real project to build. Several, actually.
I'd taken on a couple of small projects to help friends with their challenges: internal tools, nothing big but not simple either, the kind of thing where the requirements were clear but the dependencies were tangled (e.g. Archiva, an agentic RAG tool for knowledge workers running complex analysis against complex requirements; coming soon). And I'd started building my future business site and service: Riskjuggler.ai. The site is the long-term thing. The friend projects were short-term obligations. I needed both done.
What I learned building the friend projects was that PrescientFlow had to grow up and learn from real complexity before it could touch Riskjuggler.ai. The friend projects had memory needs (What files have we already touched? What bugs surfaced before? What ADRs constrain this?), context-engineering needs (which slice of code does this work unit (WU) need vs. which slice would just burn tokens?), and dependency needs (this WU has to land before that one). That forced the workflow to develop a graph memory store, a planner that reads it, and a sprint orchestrator that batches by dependency.
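To make "batches by dependency" concrete, here's a minimal sketch of the idea in Python. It's an illustration, not PrescientFlow's actual orchestrator code: a layered topological sort (Kahn's algorithm), where every work unit in batch N depends only on work units in earlier batches, so each batch can run in parallel worktrees.

```python
from collections import defaultdict

def batch_by_dependency(work_units: dict[str, set[str]]) -> list[list[str]]:
    """Group work units into parallel-safe batches: each batch depends
    only on work units from earlier batches."""
    indegree = {wu: len(deps) for wu, deps in work_units.items()}
    dependents = defaultdict(list)
    for wu, deps in work_units.items():
        for dep in deps:
            dependents[dep].append(wu)

    batches = []
    ready = sorted(wu for wu, d in indegree.items() if d == 0)
    while ready:
        batches.append(ready)
        next_ready = []
        for wu in ready:
            for child in dependents[wu]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_ready.append(child)
        ready = sorted(next_ready)

    if sum(len(b) for b in batches) != len(work_units):
        raise ValueError("dependency cycle detected")
    return batches

# Hypothetical WU names: auth must land before admin and chat.
print(batch_by_dependency({
    "WU-auth": set(),
    "WU-admin": {"WU-auth"},
    "WU-chat": {"WU-auth"},
    "WU-user-mgmt": {"WU-admin"},
}))
# [['WU-auth'], ['WU-admin', 'WU-chat'], ['WU-user-mgmt']]
```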
The friend projects were the proving ground. By the end of March 2026 they were 80%+ done. PrescientFlow was finally ready. Riskjuggler.ai was next.
That's the project I was elbow-deep in when, in early April, the prompts started coming.
Act III: The Floor Moved (Twice)
The first time, I was working in Riskjuggler.ai. Claude Code started prompting me to approve every git command, every python3 invocation, every mkdir. The settings.json my own deployment script had generated didn't have a permissions block (I'd grown comfortable with --dangerously-skip-permissions because I learned how to ask for work safely long ago). It had skill definitions and hooks, sure, but no allow array. So every command was prompting. Approve. Approve. Approve.
I checked the Anthropic release notes. There it was: tighter scrutiny on .claude/ directory writes regardless of defaultMode, and a stricter default-deny posture for previously accepted commands. As a security professional, my reaction was great job, smart move: .claude/ contains skill definitions and hooks that shape agent behavior, and treating writes to it as privileged is exactly the right posture. As a workflow builder aiming for a safe and comfortable user experience, watching my orchestrator try to spawn worktrees inside .claude/worktrees/ and prompt me three times per work unit, my reaction was oh, this is bad.
The first floor move was the permissions change. The second was Opus 4.7 itself. It came out in the middle of all of this and it was, in my testing, more inclined to pause and verify my decisions than 4.6 had been. I suspect this is a model-training difference, RL'd behavior toward second-guessing, and --auto mode (which I'd already added to several skills for orchestrator-driven runs) became a fight against the model's defaults rather than a collaboration with them. It's a clean line I'll keep repeating to myself: I'm clearly not building the kind of workflow Anthropic and others are designing for, one that still assumes other SMEs in the pipeline, specific tooling (e.g. GitHub PRs), and possibly the "Ralph Wiggum loop" to bang through possibilities (and tokens) until it gets something that works. All of that is fine for small objectives fixing parts of a large, existing solution, but I wanted to build whole apps from a natural-language, non-technical prompt, if not a well-defined product manager's spec.
I pushed the friend projects to 100% in two days after getting the orchestrator going manually, dodging prompts. The slowdown on Riskjuggler.ai, the project the workflow was actually built to enable, was the part that hurt.
Act IV: Sand Pebbles, Not a Beach
If you go look at the git log of PrescientFlow for April-May 2026, you won't see one big "Anthropic Compatibility" pull request. You'll see roughly 280 bug numbers between BUG-340 and BUG-617: sand pebbles, not a beach. Some of the high-leverage moves were:
- Splitting .claude/ from prescientflow-artifacts/. The workflow plumbing (skills, hooks, scripts, settings) stayed in .claude/, properly scoped under the new permissions. The actual project artifacts (vision documents, work units, reviews, QA reports, planner outputs, graph memory database, sprint history, completed work units, bugs) moved to a top-level prescientflow-artifacts/ directory that doesn't trigger the hardened scrutiny and is gitignored in case the builder doesn't want to commit them to their repo. This was BUG-340 through BUG-345 and a couple dozen siblings, executed over a week.
- --auto mode on every skill that participates in the orchestrator pipeline. Vision had no --auto mode. Planner's approval gate fired before the orchestrator could suppress it. Sprint had a "Ready? Say start" prompt. QA asked "Proceed with QA?". Each one was a separate bug (BUG-346, BUG-347, BUG-348, BUG-358). Each one was a separate fight against the model's tendency to verify-before-acting, especially after Opus 4.7. None of them were hard individually. The hard part was that there were so many of them (and my memory system wasn't designed right for my workflow project; more on that later).
- A deployment-time permissions block in settings.json. BUG-353 (April 4): "Project settings.json deploys without permissions block; users prompted for every Bash command." The fix was a dontAsk mode with an explicit allow-list of workflow commands and a deny-list of destructive ones. Unlisted commands are silently denied (secure-by-default), but workflow commands run unattended. A sketch of the shape of that block follows this list.
- A graph-memory uplift. This one I didn't expect. The workflow's graph memory had been built for code: modules, functions, imports, calls. When PrescientFlow's own workflow-dev repo started feeding the planner, the planner kept producing weaker plans for skill-and-script-heavy work units than it did for app-code work units. The reason was simple: the graph schema was code-shaped. So we generalized it. Workflow-dev projects now populate skill nodes, hook nodes, and ADR nodes, and the planner gets the same richness of context for an MCP-removal sprint that it gets for a Flask MVP (see the second sketch after this list). Until that fix landed, a lot of bugs surfaced manually that should have been caught at plan time.
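For the settings.json bullet above, here's a minimal sketch of the shape of the deployed permissions block. The rule strings and command list are illustrative, not the literal file PrescientFlow ships:

```json
{
  "permissions": {
    "defaultMode": "dontAsk",
    "allow": [
      "Bash(git status:*)",
      "Bash(git worktree:*)",
      "Bash(python3:*)",
      "Bash(mkdir -p prescientflow-artifacts/*)"
    ],
    "deny": [
      "Bash(rm -rf:*)",
      "Bash(git push --force:*)"
    ]
  }
}
```

Unlisted commands fall through to a silent deny, so anything novel fails quietly and surfaces in the logs instead of pausing the run.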
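And for the graph-memory bullet, the uplift amounted to widening the node vocabulary beyond code. A hypothetical sketch, with node kinds and field names of my own invention rather than the actual schema:

```python
from dataclasses import dataclass, field

# Node kinds the original, code-shaped schema knew about...
CODE_KINDS = {"module", "function", "import", "call"}
# ...plus the workflow-shaped kinds the uplift added.
WORKFLOW_KINDS = {"skill", "hook", "adr", "work_unit"}

@dataclass
class Node:
    kind: str                      # one of CODE_KINDS | WORKFLOW_KINDS
    name: str                      # e.g. "security-planner"
    edges: list[tuple[str, str]] = field(default_factory=list)  # (relation, target)

    def __post_init__(self) -> None:
        if self.kind not in CODE_KINDS | WORKFLOW_KINDS:
            raise ValueError(f"unknown node kind: {self.kind!r}")

# With skill/hook/ADR nodes in the graph, a planner query for an MCP-removal
# sprint gets the same shape of context a Flask MVP sprint gets:
skill = Node("skill", "security-planner", edges=[("constrained_by", "ADR-SEC-002")])
```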
That's three architectural moves and one I-didn't-see-it-coming. Not a complete list. The full list lives in the bug archive.
The other strategic addition during this period was the security-planner.
I'd been testing Dan Miessler's Personal AI Infrastructure alongside my own workflow. Dan is a deeply experienced InfoSec pro, and PAI has a built-in pentest capability I could use. I pointed it at my Riskjuggler.ai build. It found quite a few failures (multiple Criticals and Highs!). Things that wouldn't look good on a security and IT pro's site. That was the catalyst.
I'd known for a long time that running a security review on every work unit alongside the regular Vision/Scope/Design/Testing/Tattle-Tale gate would be heavy token burn for low ROI on internal-only projects, same as in a human SDLC. So the security-planner was added conditionally. It fires when the vision document has a Threat Model section or when work-unit titles mention auth, payment, session, login, password, or PII. Internal-only friend projects: it doesn't fire. Internet-exposed Riskjuggler.ai (coming soon): it fires. The skill loads ADR-SEC-questions.yaml, makes one narrow LLM call per applicable question per WU, and writes findings to walkthrough_findings[] in the planner output. Phase 2.5.5 of the orchestrator routes P0 findings into amended work units; Phase 2.5.6 routes P1 findings into auto-created follow-ups.
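The firing condition itself is simple enough to sketch. Something like the following, assuming a markdown vision document and using the keyword list from the paragraph above; the function and field names are illustrative:

```python
import re

SECURITY_KEYWORDS = re.compile(
    r"\b(auth|payment|session|login|password|pii)\b", re.IGNORECASE
)

def security_planner_fires(vision_doc: str, wu_titles: list[str]) -> bool:
    """Conditional gate: run the security-planner only when the vision
    declares a Threat Model section or a work-unit title touches a
    security-sensitive domain. Internal-only projects skip the token burn."""
    if "## Threat Model" in vision_doc:
        return True
    return any(SECURITY_KEYWORDS.search(title) for title in wu_titles)

# Internal tool: no threat model, titles like "Refactor CSV export" -> False.
# Riskjuggler.ai: titles like "Add login and session handling" -> True.
```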
This was the security gate I built specifically because average Claude-generated code is exactly what you'd expect: average code that isn't secure by default, since security is rarely taught (or at least not taught thoroughly) in CS programs and only gets layered in when devs get feedback from an InfoSec professional. So I built the InfoSec professional into the workflow as scaffolding.
That gate, it turns out, had a P0 bug.
Act V: The Pentest That Caught My Gate
Yesterday, May 5, 2026, I pointed PAI at a fresh deployment of v3.1-test30, a chat app the workflow had built end-to-end on an HTTPS port. The pentest report came back with 0 Critical, 1 High, 2 Medium, 3 Low, and 22 controls clean.
The High was "Session not invalidated on logout." You log in, capture the session cookie, hit POST /logout, then re-issue the captured cookie against /admin. The server returns HTTP 200 with the admin dashboard, because logout deletes the client-side cookie via Set-Cookie: session=; Max-Age=0 but maintains no server-side revocation. A captured token survives logout for the full 8-hour Max-Age.
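The fix for that class of bug is server-side state. Here's a minimal Flask sketch of what "invalidate on logout" has to mean. This is my illustration, not the generated app's actual code: an in-memory revocation set for brevity, where a real deployment would back it with Redis or a database:

```python
import secrets
from flask import Flask, request, make_response, abort

app = Flask(__name__)
active_sessions: set[str] = set()  # server-side source of truth

@app.post("/login")
def login():
    # ...verify credentials here...
    token = secrets.token_urlsafe(32)
    active_sessions.add(token)
    resp = make_response("ok")
    resp.set_cookie("session", token, httponly=True, secure=True, max_age=8 * 3600)
    return resp

@app.post("/logout")
def logout():
    # The part the generated app skipped: revoke server-side,
    # don't just expire the client's copy of the cookie.
    active_sessions.discard(request.cookies.get("session", ""))
    resp = make_response("logged out")
    resp.set_cookie("session", "", max_age=0)
    return resp

@app.get("/admin")
def admin():
    if request.cookies.get("session") not in active_sessions:
        abort(401)  # a captured, logged-out token dies here
    return "admin dashboard"
```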
It was more than one High, if I'm being honest with you about what the model produces by default. The session replay was the headline. There were also Mediums for prompt injection on the LLM endpoint and for TLS 1.2 offering only a non-AEAD cipher (CBC + HMAC-SHA1, no AES-GCM), and Lows for a weak password policy, missing HttpOnly on the CSRF cookie, and open registration. The full report is in /Volumes/claude/security-reports/.
My first reaction wasn't "the security-planner missed this". It was "the security-planner should have caught this. Let me trace it."
Long story short: I had to add a better question, and I fixed a bug that was preventing the planner from reading the entire security-planner guidance.
The bigger lesson is the one I keep coming back to. The security-planner gate, when it works, asks most of the right questions. And the gate working exposes the next problem: even when prompted to think hard, even with the planner's culture of outcome-based titles, lean prompts, atomic transitions, and a Tattle-Tale reviewer that synthesizes the other four perspectives, Claude isn't reinforcement-learned to threat-model thoroughly without scaffolding. When asked "can a captured token be replayed after logout?" via a structured walkthrough, the model gives you a usable answer. When asked "build a chat app with auth", the model gives you a chat app with auth that doesn't invalidate sessions on the server side. The reviewer skills (Vision: right problem? Scope: atomic, in-bounds? Design: patterns, no regressions? Testing: falsifiable coverage? Tattle-Tale: synthesis) won't catch it on their own. They weren't designed to.
The continuous human-curated security checklist isn't a feature of the workflow. It's the deal.
The Deal
Anthropic's permission tightening was right. My workflow's autonomy promise was the wrong abstraction. Three months of recovery taught me that scaffolding the model isn't optional: even with a security gate, an Opus 4.7 RL'd to think hard, and a planner that values atomic specs and outcome-based titles, the model produces average code that isn't secure by default. The pentest that exposed BUG-607 was the proof. Continuous human-curated security checklists aren't a feature. They're the deal.
Two commitments:
The security checklist gets published and maintained. ADR-SEC-questions.yaml started with 30 questions across four ADRs. It's going to grow every time a pentest finds something the existing question bank should have caught, or evolve whenever Claude gains a better model or skill we can call to take care of a question for us. The WU-2196/2197/2198 follow-ups added Session Lifecycle bullets to the vision template (Token storage, Storage trade-off rationale, Constant-time comparison, Transport (TLS) coupling) because SEC-002-Q1 and SEC-002-Q8 came back addressed=no in the walkthrough and we didn't have the language in the template to fix them at the source. That ratchet is going to keep tightening. I'll post the question bank publicly as it evolves.
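To give a flavor of the question bank's shape, here's a hypothetical slice built from the IDs mentioned above. The field names are my guess at a schema, not the published file:

```yaml
# ADR-SEC-questions.yaml (illustrative slice)
ADR-SEC-002:
  title: Session lifecycle
  questions:
    - id: SEC-002-Q1
      ask: "Is the session token revoked server-side on logout?"
      severity_if_unaddressed: P0
      vision_template_hook: "Session Lifecycle > Token storage"
    - id: SEC-002-Q8
      ask: "Are token comparisons constant-time?"
      severity_if_unaddressed: P1
      vision_template_hook: "Session Lifecycle > Constant-time comparison"
```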
Riskjuggler.ai gets built next. The reason I'm writing this post is that PrescientFlow is finally, finally, back to being able to do what it did on April 6 in 37 minutes, except now with the security-planner findings actually persisted; the threat model, security-planner, and red-team spot-checks; the UAT lead's auth-flow attack-pattern TC bank that auto-injects logout-replay and session-fixation tests whenever a vision exposes auth routes; and the worktree merger that auto-fixes mechanical archive drift.
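For the curious, the auto-injected logout-replay check boils down to the exact sequence PAI ran. Here's a pytest sketch using Flask 2.3+ test-client cookie helpers; the client fixture, credentials, and endpoint paths are my assumptions, not the TC bank's actual code:

```python
def test_captured_session_dies_on_logout(client):
    """Replay a captured session cookie after logout; it must be rejected."""
    client.post("/login", data={"username": "admin", "password": "correct-horse"})
    captured = client.get_cookie("session").value   # the attacker's stolen copy

    client.post("/logout")

    client.set_cookie("session", captured)          # re-issue the captured cookie
    resp = client.get("/admin")
    # Anything but success passes: 401/403 or a redirect to /login.
    assert resp.status_code != 200, "captured token survived logout (the v3.1-test30 High)"
```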
All of which is to say: I'm not trying to change the world. I'm trying to enable vibe coders, folks who can describe what they want but can't yet build it themselves, to deliver more complete and more production-ready demos than they could before, at a cost that's at least defensible against the alternative of just hiring it out.
What you'll get out of this workflow, if you adopt it, isn't a guarantee of secure code. It's the structured production of work-unit specs, test plans, QA documentation, and architecture artifacts that a skilled production engineer can pick up and reuse, because it's doing what we've always done: managing IT risk through a workflow that supports the devs with perspectives they didn't have on their own, while deterministically validating their output to assure quality and security. If you need it, the product-manager and architect skills will generate common documentation that feeds the SDLC intake on the human side. The QA test plans feed the QA team. The non-technical vibe coder (or just the super-busy technical specialist needing time for other priorities, like me) gets to ship a working demo, and the prod engineer doesn't start from scratch; they start from a spec, a test plan, and code that's already passed five-agent review.
That's the offer. It's smaller than autonomy. It's also more honest about where the model actually is.
37 minutes is back. The thing I built it for is next.
Technical Appendix A: The Recovery in Numbers
| Metric | Pre-storm (v3.1-test5, April 6) | Post-storm (current, May 6) |
|---|---|---|
| Bugs filed since Feb 6 post | ~280 (BUG-340 to BUG-617) | |
| P0-severity bugs in security-planner persistence path | 0 known | 1 (BUG-607, fixed via WU-2184) |
| Vision Complete cycle time, single-WU sprints | ~37 min (auth + admin + LLM Q&A) | Comparable; security-planner adds ~3 min |
| Reviewer perspectives per WU | 5 (Vision/Scope/Design/Testing/Tattle-Tale) | 5 (unchanged) |
| Conditional gates after Phase 4 (QA) | Smoke test, red-team, fix-bugs micro-sprint | Same, with telemetry-emitting dispatcher for 4.6 (BUG-611) |
| Threat-model questions in ADR-SEC bank | Initial 30 | 30 + 3 P1 follow-ups landed (Token storage, Storage trade-off, Constant-time comparison) |
| Red-team runtime checks | 4 (timing, headers, verb-disclosure, CSRF tokens) | 8 (added session-replay, TLS-AEAD, prompt-injection canary, HttpOnly on auth cookies; BUG-608) |
The full PrescientFlow codebase is at https://github.com/Riskjuggler/PrescientFlow.