AI assistance in security engineering: useful for drafts, not for judgment

Security engineering has a higher verification bar than most software development. A wrong database query is annoying. A wrong access control policy is a breach in waiting. That difference changes how AI tools are useful in this domain — not less useful, but useful in a different shape.

Where they actually help

Detection rule drafts. Writing Sigma or KQL from scratch is mostly mechanical: match this field, join to that table, exclude known-good activity. An AI can produce a competent first draft from a plain-English description of the behavior. The draft is usually wrong in a specific, findable way: a wrong field name, a wrong normalization, a clause that fires on benign activity. Fixing a specific flaw is much faster than starting from a blank editor.
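
One way to find that specific flaw is to run the draft against a handful of labeled events before it goes anywhere near production. The sketch below is illustrative only: the rule logic, the field names (process_name, command_line, parent_name), the mgmt-agent.exe exclusion, and the sample events are all hypothetical, and a real rule would be tested in its own query language against real telemetry.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]
Labeled = Tuple[Event, bool]  # (event, is_malicious)


def drafted_rule(event: Event) -> bool:
    """Hypothetical AI-drafted rule: encoded PowerShell not started by a known agent."""
    image = event.get("process_name", "").lower()
    cmdline = event.get("command_line", "").lower()
    parent = event.get("parent_name", "").lower()
    is_encoded = image == "powershell.exe" and "-encodedcommand" in cmdline
    is_known_good = parent in {"mgmt-agent.exe"}  # hypothetical allowlist entry
    return is_encoded and not is_known_good


def evaluate(rule: Callable[[Event], bool],
             samples: List[Labeled]) -> Tuple[List[Event], List[Event]]:
    """Return (false_positives, false_negatives) for the rule on labeled samples."""
    false_positives = [e for e, bad in samples if rule(e) and not bad]
    false_negatives = [e for e, bad in samples if not rule(e) and bad]
    return false_positives, false_negatives


# Hypothetical samples: one true hit, one benign exclusion, one abbreviated flag.
SAMPLES: List[Labeled] = [
    ({"process_name": "powershell.exe",
      "command_line": "powershell.exe -EncodedCommand SQBFAFgA",
      "parent_name": "winword.exe"}, True),
    ({"process_name": "powershell.exe",
      "command_line": "powershell.exe -EncodedCommand SQBFAFgA",
      "parent_name": "mgmt-agent.exe"}, False),
    ({"process_name": "PowerShell.EXE",
      "command_line": "powershell -enc SQBFAFgA",
      "parent_name": "excel.exe"}, True),
]

false_positives, false_negatives = evaluate(drafted_rule, SAMPLES)
print(len(false_positives), len(false_negatives))
```

Here the draft misses the third event because PowerShell accepts -enc as an abbreviation of -EncodedCommand, which is the kind of single, fixable gap the draft tends to carry.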

Config and policy comparison. "What's different between these two IAM policies?" and "Does this firewall rule actually do what the comment says?" are tasks where an AI reads faster than you do and catches obvious gaps. It will not catch subtle semantic differences, such as whether sts:AssumeRole with a given condition scopes to what you intend given how the service really evaluates authorization. But surfacing the obvious differences is often enough to show you where to look.
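
The "obvious gaps" half of that comparison is also cheap to cross-check mechanically. The following is a minimal sketch of a structural diff between two IAM-style policy documents, under loud assumptions: it looks only at Effect, Action, Resource, and Condition, ignores Principal, NotAction, and NotResource, and says nothing about how the service evaluates the result. The two policies, the role ARN, and the external ID are made-up examples.

```python
import json
from typing import Any, Dict, FrozenSet, Set, Tuple

# (Effect, actions, resources, canonical Condition JSON)
Statement = Tuple[str, FrozenSet[str], FrozenSet[str], str]


def as_set(value: Any) -> FrozenSet[str]:
    """Action and Resource may be a string or a list; normalize to a frozenset."""
    if value is None:
        return frozenset()
    if isinstance(value, str):
        return frozenset([value])
    return frozenset(value)


def normalize(policy: Dict[str, Any]) -> Set[Statement]:
    """Reduce a policy to a set of comparable statement tuples."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a lone statement may be a single object
        statements = [statements]
    return {
        (
            stmt.get("Effect", ""),
            as_set(stmt.get("Action")),
            as_set(stmt.get("Resource")),
            json.dumps(stmt.get("Condition", {}), sort_keys=True),
        )
        for stmt in statements
    }


def diff_policies(old: Dict[str, Any],
                  new: Dict[str, Any]) -> Tuple[Set[Statement], Set[Statement]]:
    """Return (statements only in old, statements only in new)."""
    old_set, new_set = normalize(old), normalize(new)
    return old_set - new_set, new_set - old_set


# Hypothetical policies: the new version silently drops the condition.
ROLE = "arn:aws:iam::111122223333:role/example-role"
OLD = {"Version": "2012-10-17", "Statement": [{
    "Effect": "Allow", "Action": "sts:AssumeRole", "Resource": ROLE,
    "Condition": {"StringEquals": {"sts:ExternalId": "example-id"}}}]}
NEW = {"Version": "2012-10-17", "Statement": [{
    "Effect": "Allow", "Action": ["sts:AssumeRole"], "Resource": ROLE}]}

removed, added = diff_policies(OLD, NEW)
for effect, actions, resources, condition in added:
    print("added:", effect, sorted(actions), sorted(resources), condition)
```

A diff like this shows that the condition disappeared. Whether that matters for the trust path is still a question about the service's behavior, not about the JSON.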

Unfamiliar log formats. When you're looking at an SBC's syslog schema, an enterprise firewall's proprietary audit format, or a cloud provider's activity trail for a service you don't work with daily, asking an AI to explain the fields and flag what looks anomalous is faster than reading vendor docs cold. The output still needs verification against the actual behavior of the system, but it's a faster start.

Writing the mechanical parts. Runbooks, change requests, post-incident summaries, documentation for a detection alert. The AI produces a serviceable draft; you edit for accuracy and judgment. The time savings are real.

Where they fail specifically in security work

Trust path analysis. "Is this IAM policy safe?" will get you a confident answer that is frequently wrong in the ways that matter. AWS, GCP, and Azure IAM semantics are non-obvious, version-sensitive, and full of corner cases that vary by service. An AI will tell you the policy looks right. It will not tell you that a specific combination of permissions and trust conditions enables a privilege escalation path given the service's actual runtime behavior. That analysis requires someone who understands both the policy language and how the service actually evaluates authorization.

Incident diagnosis. An AI will produce plausible hypotheses from log samples. It will miss the cause that can only be found by knowing what the system normally looks like, which requires having seen it. Incident work is not pattern matching against training data; it is differential diagnosis against a specific system's baseline. You own that.

Anything the output can't be verified against. In security work, "this looks right" is not sufficient for firewall rules, access policies, authentication flows, secret handling, or cryptographic implementations. If you cannot verify the output against the actual system behavior or an authoritative specification, don't ship it.
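
For cryptographic code, "verify against an authoritative specification" often means a known-answer test. The sketch below checks a helper against two widely published SHA-256 example digests (the message "abc" and the empty message). The helper drafted_sha256_hex is a hypothetical stand-in for whatever function is under review; it simply wraps Python's hashlib here so the example runs.

```python
import hashlib
from typing import Callable, Dict, List


def drafted_sha256_hex(data: bytes) -> str:
    """Hypothetical stand-in for an AI-drafted helper under review."""
    return hashlib.sha256(data).hexdigest()


# Published SHA-256 known answers for "abc" and the empty message.
KNOWN_ANSWERS: Dict[bytes, str] = {
    b"abc": "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
    b"": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}


def failing_vectors(func: Callable[[bytes], str],
                    vectors: Dict[bytes, str]) -> List[bytes]:
    """Return every input whose digest does not match the published value."""
    return [msg for msg, expected in vectors.items() if func(msg) != expected]


mismatches = failing_vectors(drafted_sha256_hex, KNOWN_ANSWERS)
if mismatches:
    raise SystemExit(f"known-answer test failed for: {mismatches!r}")
print("all known-answer vectors match")
```

Passing two vectors does not prove an implementation correct. It is the minimum bar: a helper that fails a published vector is wrong regardless of how plausible it looks.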

The specific failure mode

The worst case is not AI generating obviously wrong output. It's AI generating plausible output that is wrong in a way that is hard to see — the correct-looking IAM policy with one condition that doesn't do what you think, the detection rule that suppresses the alert you actually needed.

The engineers who use AI tools effectively in security work treat every output as a hypothesis that requires verification against the system, the actual policy semantics, or the actual protocol behavior. Not against the AI's description of those things. Against the primary source.

That is the skill. The AI handles generation. You own verification.