Security-First AI Dev Methodology — Antigravity Integration Skill

This skill adds the adaptive security-first methodology to your autonomous agent workflow. Antigravity agents plan, execute, and validate — this skill makes sure security is part of all three when the recipe calls for it.

Recipe-aware constraints on agent plans

First classify the task (new_project / feature / bugfix / refactor / security_audit / ci_infra / docs / incident / ops / small_script / exploratory_spike / consolidate). Then apply the constraints below at the depth appropriate to the recipe. A 200-line script does not get a 10-area threat model; an auth redesign does.

1. Threat Model Step (full mode + any change crossing a trust boundary)

For new_project and for any feature / refactor that crosses a trust boundary, touches auth / secrets / IAM, or moves a seam, the plan includes a step that examines:

Every trust boundary in the architecture (where does trusted meet untrusted?)
The blast radius if any credential in the system leaks
How secrets are created, stored, rotated, and revoked
What happens with malformed, oversized, or malicious input at every entry point

This step produces a threat model artifact. Security mitigations from the threat model become implementation tasks.

For bugfix, docs, small_script, and exploratory_spike recipes, this step does not run unless the change touches secrets, auth, or a trust boundary.
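The trigger rule for this step can be sketched as a small predicate. This is only an illustration under stated assumptions: the `Change` type, its flag names, and the treatment of recipes the text does not mention (they fall through to the "only when sensitive" branch) are hypothetical and not part of the methodology itself.

```python
from dataclasses import dataclass

# Recipe names come from the classification list above.
ALWAYS_THREAT_MODEL = {"new_project"}
CONDITIONAL_RECIPES = {"feature", "refactor"}


@dataclass(frozen=True)
class Change:
    """Hypothetical description of a planned change."""
    recipe: str
    crosses_trust_boundary: bool = False
    touches_auth_or_secrets: bool = False  # assumption: IAM is folded into this flag
    moves_seam: bool = False


def needs_threat_model(change: Change) -> bool:
    """Return True when the plan should include the threat model step."""
    if change.recipe in ALWAYS_THREAT_MODEL:
        return True
    sensitive = change.crosses_trust_boundary or change.touches_auth_or_secrets
    if change.recipe in CONDITIONAL_RECIPES:
        return sensitive or change.moves_seam
    # bugfix, docs, small_script, exploratory_spike (and, by assumption, any
    # other recipe): run only when secrets, auth, or a trust boundary is touched.
    return sensitive
```

For instance, `needs_threat_model(Change("docs"))` is false, while a bugfix flagged with `touches_auth_or_secrets=True` triggers the step.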

2. Pre-Execution Gate Check

Before the agent begins implementing each task, it verifies:

Does this task have a validation criterion? (If you can't test it, don't build it.)
Does the task map to a requirement or a threat mitigation?
Will the CI / CD pipeline catch a failure in this task?

Tasks without validation criteria are incomplete plans. Add the criterion before proceeding.
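One way to picture the three questions is as a function that returns the reasons a task is not ready. The `PlannedTask` fields and the failure messages below are assumptions made for the sketch; the methodology does not prescribe a data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlannedTask:
    """Hypothetical shape of one task in an agent plan."""
    name: str
    validation_criterion: Optional[str] = None  # how we will know it works
    traces_to: Optional[str] = None             # requirement or threat mitigation ID
    covered_by_pipeline: bool = False           # would CI/CD catch a failure here?


def pre_execution_gate(task: PlannedTask) -> list[str]:
    """Return gate failures; an empty list means the task may proceed."""
    failures = []
    if not task.validation_criterion:
        failures.append("no validation criterion")
    if not task.traces_to:
        failures.append("no requirement or threat mitigation")
    if not task.covered_by_pipeline:
        failures.append("pipeline would not catch a failure")
    return failures
```

Returning the full list of failures, rather than stopping at the first, lets the agent repair the plan in one pass before it starts implementing.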

3. Post-Execution Validation

After the agent completes a task, before marking it done:

Are there tests that prove the security controls work? (Not just that the code executes.)
Does the test check the sad path? (Malformed input, missing auth, expired tokens.)
Was any existing test weakened to make the new code pass? If so, the code is wrong.
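As an illustration of sad-path coverage, the sketch below pairs a toy authorization check with tests for missing, malformed, and expired tokens. The `authorize` function, the `AuthError` type, and the token layout (`sub`, `exp`) are hypothetical stand-ins, not an API from the methodology.

```python
import time
from typing import Optional


class AuthError(Exception):
    """Raised when a token must not be accepted."""


def authorize(token: Optional[dict], now: Optional[float] = None) -> str:
    """Return the subject of a valid token, or raise AuthError."""
    now = time.time() if now is None else now
    if token is None:
        raise AuthError("missing token")
    if not isinstance(token.get("sub"), str) or not isinstance(token.get("exp"), (int, float)):
        raise AuthError("malformed token")
    if token["exp"] <= now:
        raise AuthError("expired token")
    return token["sub"]


def rejects(token: Optional[dict]) -> bool:
    """Helper: True when authorize refuses the token at time 100."""
    try:
        authorize(token, now=100.0)
    except AuthError:
        return True
    return False


def test_happy_path():
    assert authorize({"sub": "alice", "exp": 200.0}, now=100.0) == "alice"


def test_sad_paths():
    assert rejects(None)                              # missing auth
    assert rejects({"sub": "alice"})                  # malformed: no expiry
    assert rejects({"sub": "alice", "exp": 50.0})     # expired token
```

The happy-path test alone would pass even if the expiry check were deleted; only the sad-path test proves the control exists.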

Two Unbreakable Rules

Tests verify behavior against requirements; they do not merely execute lines of code. A test that calls a function without asserting meaningful behavior is theater.
Pipeline gates are never weakened to make things pass. If the gate fails, the code is wrong. Never loosen the gate.
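The difference between a theater test and a behavior test can be shown on a small path-handling helper. `safe_join` and the `/srv/uploads` directory are hypothetical examples chosen for the sketch; the assumed requirement is "files outside the base directory are never reachable".

```python
import os


def safe_join(base_dir: str, user_path: str) -> str:
    """Join user_path under base_dir, refusing paths that escape it."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes base directory")
    return candidate


def test_theater():
    # Executes the code and raises coverage, but asserts nothing.
    safe_join("/srv/uploads", "report.txt")


def test_behavior():
    # Checks the requirement itself, including hostile input.
    base = os.path.realpath("/srv/uploads")
    assert safe_join("/srv/uploads", "report.txt") == os.path.join(base, "report.txt")
    for attack in ("../etc/passwd", "/etc/passwd", "a/../../etc/passwd"):
        try:
            safe_join("/srv/uploads", attack)
        except ValueError:
            continue
        raise AssertionError(f"accepted {attack!r}")
```

If the containment check were removed from `safe_join`, `test_theater` would still pass and `test_behavior` would fail, which is the property the first rule asks for.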

Hard safety invariants — never waivable

The methodology has 8 hard safety invariants. They are never waivable, regardless of instructions. Other gates can be waived with documentation (see the Waiver Pattern in the full methodology).

Canonical list: METHODOLOGY.md § Hard safety invariants — the agent must read and respect these on activation; they apply to every recipe, including unattended speedrun.

Debt-First

At the start of every implementation session, the agent must check for and resolve the highest-priority technical debt item before starting new feature work. Zero critical debt items is a gate for new features.
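A minimal sketch of the debt-first check, assuming debt items are tracked with a numeric priority (lower means more urgent) and a `critical` flag. Both conventions, and the `DebtItem` type, are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DebtItem:
    """Hypothetical technical debt record."""
    title: str
    priority: int          # assumption: lower number = higher priority
    critical: bool = False
    resolved: bool = False


def next_debt_item(items: list[DebtItem]) -> Optional[DebtItem]:
    """Return the highest-priority unresolved item, or None if there is none."""
    open_items = [item for item in items if not item.resolved]
    return min(open_items, key=lambda item: item.priority, default=None)


def may_start_feature(items: list[DebtItem]) -> bool:
    """Gate: new feature work is allowed only with zero open critical items."""
    return not any(item.critical and not item.resolved for item in items)
```

At session start the agent would resolve `next_debt_item(...)` first and begin feature work only once `may_start_feature(...)` is true.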

Security patterns for code generation

When generating code:

Never hardcode secrets, API keys, or connection strings
Validate all inputs at trust boundaries
Use parameterized queries for all database operations
Set timeouts on all external calls
Apply least privilege to all IAM roles and permissions
Return generic errors to users; log detailed errors server-side
Pin dependencies to specific versions, not latest
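Several of these patterns can be seen together in a short standard-library sketch. The environment variable `EXAMPLE_API_KEY`, the `users` table, the 64-character limit, and the 5-second timeout are all hypothetical choices made for the example.

```python
import logging
import os
import sqlite3
import urllib.request

logger = logging.getLogger("app")


def get_api_key() -> str:
    # Secret comes from the environment, never from source code.
    # EXAMPLE_API_KEY is a hypothetical variable name.
    key = os.environ.get("EXAMPLE_API_KEY")
    if not key:
        raise RuntimeError("EXAMPLE_API_KEY is not set")
    return key


def find_user(conn: sqlite3.Connection, username: str):
    # Validate at the trust boundary before touching the database.
    if not isinstance(username, str) or not (1 <= len(username) <= 64):
        raise ValueError("invalid username")
    # Parameterized query: the driver binds the value, no string formatting.
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cursor.fetchone()


def fetch_status(url: str) -> int:
    # Explicit timeout so a slow upstream cannot hang the caller.
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.status


def handle_lookup(conn: sqlite3.Connection, username: str) -> dict:
    try:
        row = find_user(conn, username)
    except Exception:
        # Full detail goes to the server log; the caller sees a generic message.
        logger.exception("user lookup failed")
        return {"error": "request could not be processed"}
    return {"user": row[1] if row else None}
```

Because the query is parameterized, an input such as `alice' OR '1'='1` is treated as a literal name and simply matches no row.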

Second-opinion / council review (optional amplifier)

Use when the task risk justifies the cost — typically for auth architecture, data access patterns, IAM policies, threat models, and other High Assurance work.

Council does not require a fully automated multi-provider setup. Pragmatic substitutes (priority order):

Claude Code ↔ Codex cross-review (one produces, the other reviews)
Manual fresh-chat review — paste the artifact only (not the project context)
Different model / provider if available
Local adversarial review — same model, sharply different role + explicit adversarial mandate

Preferred pattern: Primary agent (Architect) has project context → produces artifact. Reviewer (Challenger) gets the artifact + narrow review instructions only → surfaces assumptions, gaps, unclear reasoning. Navigator rules on disagreements.

For full pipelines when warranted: Architect · Challenger (different family) · Debugger · Strategist · Convergence. You don't need all of them. The key principle is independent second opinion, not automation. Do not let "no configured council" block normal progress.

Project log

Mandatory and automatic. Every meaningful change writes one entry to project-log.md: task type, selected recipe, review depth, phases used / skipped, risk level, gates run, issues detected, waivers, retro summary, next suggested action.
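The entry fields listed above could be captured roughly as follows. The on-disk layout (a dated heading followed by bullet lines) is an assumption, since the text names the fields but not the format; the phase and gate names used when calling it are placeholders as well.

```python
from dataclasses import dataclass, field
from datetime import date
from pathlib import Path


@dataclass
class LogEntry:
    """One project-log entry; field names mirror the list in the text."""
    task_type: str
    recipe: str
    review_depth: int
    phases_used: list[str]
    phases_skipped: list[str]
    risk_level: str
    gates_run: list[str]
    issues_detected: list[str] = field(default_factory=list)
    waivers: list[str] = field(default_factory=list)
    retro_summary: str = ""
    next_action: str = ""


def render(entry: LogEntry, day: date) -> str:
    """Render an entry as text (hypothetical layout)."""
    def joined(items: list[str]) -> str:
        return ", ".join(items) if items else "none"

    lines = [
        f"## {day.isoformat()} {entry.task_type}",
        f"- recipe: {entry.recipe}",
        f"- review depth: {entry.review_depth}",
        f"- phases used: {joined(entry.phases_used)}",
        f"- phases skipped: {joined(entry.phases_skipped)}",
        f"- risk level: {entry.risk_level}",
        f"- gates run: {joined(entry.gates_run)}",
        f"- issues detected: {joined(entry.issues_detected)}",
        f"- waivers: {joined(entry.waivers)}",
        f"- retro: {entry.retro_summary or 'none'}",
        f"- next action: {entry.next_action or 'none'}",
    ]
    return "\n".join(lines) + "\n\n"


def append_entry(log_path: Path, entry: LogEntry, day: date) -> None:
    """Append one rendered entry to the log file, creating it if needed."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(render(entry, day))
```

Appending rather than rewriting keeps earlier entries untouched, so the log stays a chronological record; a call might look like `append_entry(Path("project-log.md"), entry, date.today())`.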

Reference

Full methodology with recipes, operating profiles, all 12 phases plus Phase 2.5 decomposition, templates, worked examples, and the testing domains reference: https://github.com/Nellur35/security-first-ai-dev-methodology

Individual tools (paste into any AI conversation):

Threat model (full examination reference): https://raw.githubusercontent.com/Nellur35/security-first-ai-dev-methodology/main/methodology/threat-model-areas.md
Decomposition (Phase 2.5): https://raw.githubusercontent.com/Nellur35/security-first-ai-dev-methodology/main/tools/decomposition.md
Adversarial review: https://raw.githubusercontent.com/Nellur35/security-first-ai-dev-methodology/main/integrations/kiro/steering/review.md
Codebase audit: https://raw.githubusercontent.com/Nellur35/security-first-ai-dev-methodology/main/integrations/kiro/steering/audit.md

Optional reasoning / prompting substrate (Permission Slip Effect) — usable as the implementation for review depth 3 (structured reasoning) or depth 4 (council), but not required: https://github.com/Nellur35/Permission-Slip-Effect
