System Architecture Documentation: A Pentester's Guide

By Luke Turvey • 8 June 2026 • 16 min read

You're probably dealing with one of two situations right now.

The client has sent a folder full of diagrams, PDFs, and screenshots that don't match each other, or they've sent almost nothing and said the architecture will become clear during testing. Both slow a pentest down. Both create avoidable reporting problems. And both make it harder to explain risk in a way the client can act on.

For security work, system architecture documentation isn't a background artefact. It's part of the test surface. Good documentation helps you scope faster, spot trust boundaries earlier, choose attack paths with intent, and tie findings back to business impact. Bad documentation does the opposite. It burns time, creates false assumptions, and leaves reports full of caveats that shouldn't have been necessary.

Why Most System Architecture Documentation Fails Security Teams

A common kickoff pattern goes like this. The client shares a “high-level architecture” PDF from last year, a network diagram exported from a cloud console, and a Word document with a few paragraphs about the application stack. None of them answers the questions a pentester needs answered. Which systems are internet-facing? Where does authentication happen? What data crosses trust boundaries? Which integrations are in scope, and which ones are merely adjacent?

That frustration isn't unusual. A 2020 survey found that 87% of architecture documents were distributed as electronic Word or PDF files, with 50% as model files and 45% as web pages. It also found that architecture documentation is often not up to date, which is exactly why security teams so often inherit fragmented and stale views of the live system (Fraunhofer survey on architecture documentation formats and maintenance).

Security teams feel that failure more sharply than developers do. A developer can often fill in gaps by reading code, asking a teammate, or tracing one service at a time. A pentester usually starts with less context and less access. If the documentation is wrong, the assessment path is wrong. If the boundaries are fuzzy, test coverage becomes fuzzy too.

Why developer-friendly docs still fail a pentest

Most architecture documentation is written to explain how the system was built. That's useful, but it isn't enough for security review. A pentester needs the documentation to answer a different set of questions:

Exposure: Which services, APIs, portals, and administrative paths can an attacker reach?
Trust boundaries: Where does data move from one security context to another?
Privilege paths: Which components make authorisation decisions, enforce tenancy, or hold sensitive tokens?
Failure impact: If one component is compromised, what else becomes reachable?

A diagram can look polished and still be weak from a security perspective if it hides those answers.

Practical rule: If a diagram helps a stakeholder admire the architecture but doesn't help a tester choose the next attack path, it's incomplete.

There's also a maintenance problem. A single oversized diagram tends to become decorative, not operational. Teams stop trusting it, then stop updating it, then stop using it. That cycle is one reason general documentation quality breaks down in practice. The same warning shows up in broader product documentation work, such as GitDocAI's writing on documentation quality, where the root issue is often not effort but the lack of a maintainable structure.

What security teams actually need

Useful system architecture documentation for a pentest should be selective, current, and opinionated. It should make clear what matters for testing and leave out the noise.

A security reviewer needs:

A truthful system boundary
Named components and clear relationships
Authentication and data-flow visibility
Decision context for risky design choices
Enough operational reality to match the deployed environment

That's the difference between documentation that decorates a project and documentation that supports a security assessment.

Start with Purpose and Scope, Not Diagrams

The first mistake isn't drawing the wrong diagram. It's drawing before anyone agrees on the purpose.

Security documentation should start with the engagement type. A black-box external test, an authenticated web application test, a cloud configuration review, and a white-box design assessment all need different detail. If you skip that conversation, the resulting document will either be too shallow to help or so broad that nobody maintains it.

Ask questions that shape the document

At kickoff, pin down what the documentation needs to do. Don't ask for “the architecture”. Ask for the minimum set of facts that let you test cleanly and report clearly.

Use questions like these:

What is the assessment covering? The public application, internal admin paths, APIs, mobile back ends, cloud resources, or a mix?
Who are the meaningful actors? Anonymous users, customers, support staff, engineers, service accounts, third-party integrations.
Where does the system begin and end? That sounds basic, but scope confusion often comes from shared services and inherited platforms.
What data matters most? Credentials, payment data, personal data, audit logs, secrets, tenant data.
Which controls are relied upon? SSO, WAF, MFA, network segmentation, API gateways, managed identity, queue permissions.
What will the client expect in the report? Technical exploit detail, architectural remediation, assurance mapping, or all three.

A lot of this overlaps with building a truthful model of the system in the first place. If you want a simple refresher on how teams define systems and their boundaries, this explanation of how information systems are defined is a useful grounding point.

Scope first, then centralise the material

Most clients already have the raw material scattered across tickets, wiki pages, cloud diagrams, architecture decks, and incident notes. The problem isn't always absence. It's dispersion.

That's why it helps to use a shared workspace and a simple structure for unifying design notes, diagrams, decisions, and operating context. The principles for unifying project materials are a good example of the mindset. Keep related artefacts together so the tester, engineer, and report author aren't all reading different versions of the same system.

Here's a practical split that works well for security engagements:

Artefact	What it should answer	Who usually provides it
Scope note	What's in and out	Client lead or security owner
System context	External parties and key dependencies	Architect or tech lead
Container view	Deployed applications, services, stores	Engineering lead
Data flow view	Sensitive data movement and trust boundaries	Security or architecture
Decision records	Why risky or unusual choices were made	Architect or senior engineer

Fit the depth to the test

A frequent failure mode is asking every client for the same document pack. That's lazy, and it creates noise.

For a narrow authenticated web app test, you may only need a concise context diagram, a container view, role definitions, and notes on identity, tenancy, and data stores. For a cloud review, you'll care more about service boundaries, admin paths, secrets handling, and external integrations. For a regulated engagement, you'll also want explicit mapping between controls, sensitive processing points, and architectural decisions.
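One way to keep that discipline is to write the request list down per engagement type and check what the client has sent against it. The sketch below is illustrative only: the engagement keys and artefact names are assumptions loosely based on the paragraph above, not a standard pack.

```python
# Hypothetical mapping of engagement type to the artefacts worth requesting.
# Keys and artefact names are illustrative assumptions, not a standard.
REQUIRED_ARTEFACTS = {
    "authenticated_web_app": {
        "context diagram",
        "container view",
        "role definitions",
        "identity and tenancy notes",
        "data store notes",
    },
    "cloud_review": {
        "service boundaries",
        "admin paths",
        "secrets handling notes",
        "external integrations",
    },
}


def missing_artefacts(engagement_type: str, provided: set[str]) -> list[str]:
    """Return the artefacts still needed for this engagement, sorted by name."""
    required = REQUIRED_ARTEFACTS[engagement_type]
    return sorted(required - provided)


if __name__ == "__main__":
    gaps = missing_artefacts(
        "authenticated_web_app",
        {"context diagram", "container view"},
    )
    for artefact in gaps:
        print(f"Still needed: {artefact}")
```

The point of the lookup is that a narrow test produces a short request list, so the client isn't asked for material nobody will read.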

Documentation should answer the next question the tester will ask. If it doesn't, it's probably the wrong level of detail.

When purpose and scope are clear, the diagrams become smaller, sharper, and more useful.

Structure Your View with the C4 Model

Security teams don't need a giant “everything” diagram. They need a layered view that lets them move from scoping to attack planning without losing the thread. That's where the C4 model is practical rather than fashionable.

In UK software practice, the C4 model is a foundational method that uses four zoom levels: Context, Container, Component, and Code. Guidance built around this approach recommends maintaining the top three levels and warns that a single all-encompassing diagram often becomes outdated within weeks (Docsie overview of the C4 model and maintenance approach).

Figure: The four levels of the C4 model for software system architecture documentation.

Level 1 Context

This is the view you should ask for first. It shows the system's place in the world. Users, third parties, upstream identity providers, payment processors, messaging services, and anything else that crosses the system boundary belongs here.

For a pentester, the context diagram answers high-value questions quickly:

Which external systems can influence trust decisions?
Which users and third parties interact with the platform?
Which data flows leave organisational control?
Where might inherited trust create risk?

If a context diagram is missing, scoping drifts. Teams start testing what's visible rather than what's important.

Level 2 Containers

The container diagram is usually where the test plan starts to become concrete. In C4 terms, containers are deployable or runnable units such as web applications, APIs, worker services, databases, storage systems, and identity brokers.

A security reviewer should read this level with a threat lens:

Attack surface: Which containers are exposed directly or indirectly?
Segmentation: Which paths are assumed to be internal only?
Technology choices: Which frameworks, proxies, and storage layers affect likely weaknesses?
Identity handling: Where are tokens issued, validated, exchanged, or cached?

This level is also where poor documentation often shows itself. Teams label boxes by technology only, or they omit relationships that matter. “Node API” tells you very little. “Public API handling customer requests and tenant authorisation” tells you far more.
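If the container view is kept as structured data, the "exposed directly or indirectly" question can be answered mechanically. The following is a minimal sketch, assuming a hand-maintained inventory; the container names, technologies, and the `calls` relationships are hypothetical examples, not taken from a real system.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Container:
    name: str                      # role-based name, not just the technology
    technology: str
    internet_facing: bool = False
    calls: list[str] = field(default_factory=list)  # names of containers it talks to


def exposure_map(containers: list[Container]) -> dict[str, str]:
    """Label each container reachable from the internet as 'direct' or 'indirect'."""
    by_name = {c.name: c for c in containers}
    exposure = {c.name: "direct" for c in containers if c.internet_facing}
    queue = deque(exposure)  # start the walk from every internet-facing container
    while queue:
        current = by_name[queue.popleft()]
        for target in current.calls:
            if target in by_name and target not in exposure:
                exposure[target] = "indirect"
                queue.append(target)
    return exposure


if __name__ == "__main__":
    # Hypothetical inventory for illustration only.
    inventory = [
        Container("Edge Proxy", "nginx", internet_facing=True, calls=["Public API"]),
        Container("Public API (tenant authorisation)", "Node.js"),
        Container("Public API", "Node.js", calls=["Tenant Database"]),
        Container("Tenant Database", "PostgreSQL"),
        Container("Reporting Worker", "Python", calls=["Tenant Database"]),
    ]
    for name, level in exposure_map(inventory).items():
        print(f"{name}: {level}")
```

Anything missing from the result is, on paper, unreachable from outside. That claim is exactly the kind of "internal only" assumption worth testing.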

Level 3 Components

Component diagrams are where findings become easier to explain. Inside a given container, you can show the major building blocks that matter: auth handlers, business services, file processing modules, queue consumers, billing logic, and interfaces to external systems.

Security value comes from precision, not exhaustiveness. You don't need every class. You need the components that carry risk or enforce control.

A useful component view helps answer:

Where is access control enforced?
Which modules touch sensitive data?
Which components deserialize, transform, or forward untrusted input?
Where could logging, validation, or rate limiting be bypassed?

A clean component diagram often explains a report finding better than three pages of prose.

What to skip

C4 includes a code level, but most pentest documentation doesn't need formal code diagrams unless you're doing a design review or deep white-box assessment. For many engagements, context, containers, and selected components are enough.

The key is restraint. If every box is present, none of the important ones stand out.

Create Threat-Centred Data Flow Diagrams

A standard architecture diagram tells you what exists. A threat-centred data flow diagram tells you how compromise might happen.

That difference matters during testing. When a pentester maps a system mentally, the useful questions go beyond “what talks to what?” The key questions are “what crosses trust boundaries?”, “where is identity asserted?”, “where is data transformed?”, and “what path would an attacker try first if this control failed?”

Figure: The five sequential steps to creating a threat-centred data flow diagram for system security.

A useful data flow diagram for security is not decorative. It's annotated. It carries enough context that a reviewer can infer likely abuse cases without asking for a meeting every time a line crosses a box.

What to add beyond boxes and arrows

Start with the flows that matter most. Authentication requests, session establishment, file upload paths, privileged admin actions, webhook ingestion, queue-driven processing, exports, and any path involving sensitive data.

Then layer in security context:

Trust boundaries around public, partner, internal, and privileged zones
Entry and exit points such as public APIs, admin portals, mobile back ends, webhooks, and scheduled jobs
Data classifications for credentials, personal data, secrets, payment data, logs, and generated artefacts
Control points where authentication, authorisation, validation, encryption, or audit logging are enforced
Storage locations for persistent records, object storage, caches, session stores, and secrets systems

This is close to how good threat modelling works in practice. If you need a short reference point for that discipline, Vulnsy's explanation of what a threat model is aligns well with documenting trust boundaries, key roles, and critical assets before trying to reason about attacks.
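The annotations listed above can also be captured as data alongside the drawing, which makes it easy to pull out the flows worth testing first. This is a minimal sketch under stated assumptions: the zone names, the sensitivity ranking, and the example flows are all hypothetical and should be replaced with the client's own scheme.

```python
from dataclasses import dataclass, field
from typing import Optional

# Illustrative sensitivity ranking; an assumption, not a formal classification.
SENSITIVITY = {"public": 0, "logs": 1, "personal": 2, "payment": 3, "credentials": 4}


@dataclass
class Flow:
    name: str
    source_zone: str               # e.g. "public", "partner", "internal", "privileged"
    target_zone: str
    classification: str            # key into SENSITIVITY
    controls: list[str] = field(default_factory=list)  # e.g. ["authentication"]
    store: Optional[str] = None    # where the data lands, if it is persisted


def boundary_crossings(flows: list[Flow]) -> list[Flow]:
    """Flows that change trust zone, most sensitive data first."""
    crossing = [f for f in flows if f.source_zone != f.target_zone]
    return sorted(crossing, key=lambda f: SENSITIVITY[f.classification], reverse=True)


def uncontrolled_crossings(flows: list[Flow]) -> list[Flow]:
    """Boundary-crossing flows with no control point annotated."""
    return [f for f in boundary_crossings(flows) if not f.controls]


if __name__ == "__main__":
    flows = [
        Flow("Static assets", "public", "public", "public"),
        Flow("Login", "public", "internal", "credentials", ["authentication"], "session store"),
        Flow("Webhook ingestion", "partner", "internal", "personal"),
    ]
    for flow in uncontrolled_crossings(flows):
        print(f"No control annotated on: {flow.name}")
```

A flow with no annotated control is not necessarily vulnerable. It means the documentation does not say where the control lives, which is a question for the client before it becomes a caveat in the report.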

Make the diagram answer test questions

The biggest documentation error here is over-detail in the wrong places. Best-practice guidance warns against over-documenting non-critical components while leaving rationale and risks implicit, and recommends documenting key components plus explicit links between non-functional requirements such as security and compliance and the components they apply to (Bool.dev guidance on architecture documentation trade-offs).

That has a direct security implication. If your diagram shows every helper service but doesn't show where tenant isolation is enforced, it's not useful. If it lists every internal queue but doesn't mark where untrusted file content is scanned, it's not useful.

Here's a better working pattern.

Build the diagram in five passes

1. Draw actors and stores: Include people, systems, services, and data stores that influence security decisions or handle sensitive data.
2. Trace the high-risk paths: Follow login, password reset, admin actions, file processing, billing, export, background jobs, and third-party callbacks.
3. Mark the trust boundaries: Numerous findings stem from these boundaries. Public-to-app, app-to-internal service, tenant-to-tenant, admin-to-customer, cloud account boundaries, and third-party handoffs all matter.
4. Annotate control ownership: Note where validation happens, where permissions are checked, where secrets are retrieved, and where audit evidence is generated.
5. Record the assumptions: If the design assumes a queue is internal-only, a service account is tightly scoped, or an upstream identity provider guarantees a claim, write that down.
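The final pass, recording assumptions, is the one most often skipped, so it helps to give it a concrete shape. The sketch below shows one possible way to keep an assumption ledger and turn the unverified entries into test-plan lines. The field names and the `depends_on` values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Assumption:
    statement: str          # what the design takes for granted
    depends_on: str         # component or control the assumption leans on
    verified: bool = False  # set to True once testing has confirmed it


def open_assumptions(assumptions: list[Assumption]) -> list[str]:
    """Return one test-plan line per assumption that has not been verified."""
    return [
        f"Verify: {a.statement} (depends on {a.depends_on})"
        for a in assumptions
        if not a.verified
    ]


if __name__ == "__main__":
    # Example entries based on the assumptions named in the text above;
    # the dependencies are illustrative guesses.
    ledger = [
        Assumption("Job queue is internal-only", "network segmentation"),
        Assumption("Service account is tightly scoped", "cloud IAM policy", verified=True),
        Assumption("Identity provider guarantees the tenant claim", "upstream SSO"),
    ]
    for line in open_assumptions(ledger):
        print(line)
```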

The fastest route to a good finding is often a bad assumption written into the architecture.

Pair diagrams with lightweight decision records

A diagram shows structure. It rarely captures why a risky design exists. That's where Architecture Decision Records help.

Use a short ADR when a design choice affects test strategy or report clarity. Examples include:

a file-processing service that accepts content before malware scanning
a tenancy model that depends on claims from an upstream gateway
a partner integration allowed to bypass normal enrolment logic
a logging architecture that omits request bodies for privacy reasons

The ADR doesn't need to be long. It should capture the choice, the alternatives considered, the trade-off accepted, and the operational consequence. That gives a pentester context for both exploitation and remediation. It also helps the client understand why the finding exists, not just where it appears.
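As a rough illustration of how short such a record can be, here is a sketch that models the four fields named above and renders them as plain text. The class name, field names, and the example content are assumptions made for illustration, not a prescribed ADR format.

```python
from dataclasses import dataclass


@dataclass
class DecisionRecord:
    title: str
    choice: str
    alternatives: list[str]
    trade_off: str
    consequence: str

    def render(self) -> str:
        """Render the record as five plain-text lines."""
        lines = [
            f"ADR: {self.title}",
            f"Choice: {self.choice}",
            "Alternatives considered: " + "; ".join(self.alternatives),
            f"Trade-off accepted: {self.trade_off}",
            f"Operational consequence: {self.consequence}",
        ]
        return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical content for the file-processing example in the list above.
    record = DecisionRecord(
        title="Accept uploads before malware scanning",
        choice="Store the file first, then scan asynchronously",
        alternatives=["Scan inline before storage", "Reject unscanned file types"],
        trade_off="Faster uploads in exchange for a window of unscanned content",
        consequence="Unscanned files must never be served or parsed by other services",
    )
    print(record.render())
```

A tester reading that record knows to probe the window between storage and scanning, and the report author already has the remediation context.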

Select Tools and Standardise Your Deliverables

Tooling choices shape whether system architecture documentation stays useful or dies after the first workshop. The right answer depends less on fashion and more on who needs to edit the artefact, how often it changes, and whether you need traceable revisions.

Best-practice guidance recommends treating documentation as a living artefact set using docs-as-code, version control, and review cycles so it continues to reflect the current system rather than an old design intent (Qt guidance on architecture documentation practices).

Docs-as-code versus GUI editors

Both approaches can work. They solve different problems.

Approach	Good fit	Advantages	Drawbacks
Docs-as-code with Mermaid or PlantUML	Teams already using Git and code review	Versioned changes, diffable diagrams, easy reuse in repos	Harder for non-technical stakeholders to edit
GUI tools such as diagrams.net or Lucidchart	Mixed audiences and workshop-heavy work	Quick editing, easier collaboration in meetings, visual flexibility	Version control is often weaker and drift is easier
Hybrid approach	Most security consultancies	Fast workshop drafting plus versioned final artefacts	Requires discipline and a clear source of truth

For pentesters, the hybrid approach is often the most practical. Sketch quickly with the client in a GUI tool. Then convert the diagrams that matter into a versioned format if they'll be reused across releases or follow-on testing.
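To make the "convert into a versioned format" step concrete, the sketch below turns a list of relationships into Mermaid flowchart text that can be committed and diffed. The helper names and the example edges are hypothetical, and labels are assumed not to contain quotes or pipe characters, which would need escaping.

```python
import re


def node_id(name: str) -> str:
    """Turn a display name into a simple identifier for use in Mermaid."""
    return re.sub(r"[^0-9a-zA-Z]+", "_", name).strip("_").lower()


def to_mermaid(edges: list[tuple[str, str, str]]) -> str:
    """Build Mermaid flowchart text from (source, target, label) tuples."""
    names: list[str] = []
    for source, target, _ in edges:
        for name in (source, target):
            if name not in names:  # keep first-seen order for stable diffs
                names.append(name)

    lines = ["flowchart LR"]
    for name in names:
        lines.append(f'    {node_id(name)}["{name}"]')
    for source, target, label in edges:
        lines.append(f"    {node_id(source)} -->|{label}| {node_id(target)}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical relationships for illustration only.
    print(to_mermaid([
        ("Edge Proxy", "Public API", "HTTPS"),
        ("Public API", "Tenant Database", "SQL over TLS"),
    ]))
```

Because the output order follows the input order, a change to one relationship shows up as a one-line diff in review.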

Standardise the output, not just the drawing tool

Consultants often waste time because every engagement produces different artefacts with different naming, legends, and levels of detail. Standardisation fixes that.

Create a small internal pack:

A context diagram template with standard labels for users, third parties, and boundaries
A container template that always marks exposure, authentication points, and data stores
A threat-centred DFD template with a consistent legend for trust boundaries and sensitive flows
An ADR template with fields for decision, rationale, risk, and mitigation

Reporting quality improves when your diagrams use the same naming conventions as your findings, because clients understand them faster. A report that says “IDOR in Tenant Management API” lands better when the same service name already appears in the architecture pack.
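A naming check like this is simple to automate before a report goes out. The sketch below flags finding titles that don't mention any component named in the architecture pack. It uses a plain case-insensitive substring match, which is an assumption about how titles are written, and every name other than the Tenant Management API example is hypothetical.

```python
def unmatched_findings(finding_titles: list[str], component_names: list[str]) -> list[str]:
    """Return finding titles that mention no known component name."""
    known = [name.lower() for name in component_names]
    return [
        title
        for title in finding_titles
        if not any(name in title.lower() for name in known)
    ]


if __name__ == "__main__":
    components = ["Tenant Management API", "Customer Portal"]
    findings = [
        "IDOR in Tenant Management API",
        "Stored XSS in Admin Portal",  # hypothetical finding with an unknown name
    ]
    for title in unmatched_findings(findings, components):
        print(f"No matching component in the architecture pack: {title}")
```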

Here's the operational side of that workflow in practice.

Screenshot from https://vulnsy.com

A reporting platform can help if it treats diagrams and evidence as part of the engagement record rather than detached attachments. For example, Vulnsy's pentest documentation workflow covers storing findings, screenshots, and supporting material in a way that fits report production, which is useful when architecture artefacts need to support final remediation narratives.

What works and what doesn't

What works:

Naming containers by role, not only by technology
Storing diagrams next to scope notes and decision records
Exporting deliverables in formats clients can review without special tooling
Using the same labels in diagrams, notes, and findings

What doesn't:

Treating screenshots from a whiteboard as final documentation
Building one master diagram for every stakeholder
Keeping the only editable copy on one consultant's laptop
Using symbols without a legend when the diagram includes security meaning

If the document can't survive handover to another tester, it isn't standardised enough.

Implement a Documentation Review Cadence

Most guidance on system architecture documentation spends plenty of time on what to include and not enough on how to keep it useful after release. That's where teams usually lose confidence. The issue isn't that the first version was poor. It's that nobody owned the drift.

A recurring criticism in general best-practice writing is that post-release governance and versioning are underexplained, even though that's where documentation typically stops being trusted as systems evolve (FreeCodeCamp discussion of the post-release documentation gap).

Figure: A checklist of six essential steps for maintaining and reviewing system architecture documentation regularly.

Treat reviews as security work

Architecture reviews shouldn't sit in a documentation bucket. They belong in security governance because drift changes risk.

A review cadence works when it's tied to actual delivery points. Major releases, new integrations, authentication changes, tenancy changes, and infrastructure migrations all justify a review. Some guidance, including the C4 maintenance guidance cited earlier, also recommends quarterly architecture reviews to compare documentation against deployed infrastructure and catch drift early. That is a sensible default when the platform changes regularly.
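The trigger logic is simple enough to encode, which helps when the review needs an owner and a reminder rather than goodwill. The following is a minimal sketch: the trigger labels mirror the delivery points listed above, and treating a quarter as 91 days is an assumption.

```python
from datetime import date, timedelta

# Change types that justify a review on their own, taken from the list above.
REVIEW_TRIGGERS = {
    "major release",
    "new integration",
    "authentication change",
    "tenancy change",
    "infrastructure migration",
}

# Assumed length of a quarter for the default cadence.
QUARTER = timedelta(days=91)


def review_due(last_review: date, today: date, changes: set[str]) -> bool:
    """A review is due if a trigger occurred or a quarter has passed."""
    triggered = bool(changes & REVIEW_TRIGGERS)
    overdue = today - last_review >= QUARTER
    return triggered or overdue


if __name__ == "__main__":
    print(review_due(date(2026, 1, 1), date(2026, 2, 1), {"new integration"}))
```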

Use a short review checklist

Keep the review practical. A useful cadence doesn't need ceremony. It needs ownership and a repeatable set of questions.

Check system boundaries: Have any new third parties, admin paths, or public endpoints appeared?
Review identity flows: Did token handling, SSO logic, role mapping, or service-to-service auth change?
Verify sensitive data paths: Is data entering new stores, queues, exports, or analytics pipelines?
Confirm control locations: Are validation, authorisation, logging, and secrets handling still where the docs say they are?
Update decision records: Have new trade-offs been accepted, or old assumptions been invalidated?
Retire dead artefacts: Remove diagrams nobody should rely on anymore.
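The first checklist item, boundary drift, is the easiest to check with a set comparison. The sketch below assumes you can obtain two lists of public endpoints, one from the documentation and one from the deployed environment. How those lists are gathered is out of scope here, and the hostnames are placeholders.

```python
def boundary_drift(documented: set[str], deployed: set[str]) -> dict[str, list[str]]:
    """Compare documented endpoints with what is actually deployed."""
    return {
        # Live but missing from the docs: likely untested attack surface.
        "undocumented": sorted(deployed - documented),
        # In the docs but no longer live: candidates for retirement.
        "stale": sorted(documented - deployed),
    }


if __name__ == "__main__":
    # Placeholder hostnames for illustration only.
    documented = {"app.example.com", "api.example.com", "legacy.example.com"}
    deployed = {"app.example.com", "api.example.com", "admin.example.com"}
    for kind, hosts in boundary_drift(documented, deployed).items():
        print(kind, hosts)
```

Both outputs feed the checklist: undocumented entries update the boundary, and stale entries point at artefacts to retire.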

This work pays off long after the first engagement. Updated documentation makes retests quicker, incident response cleaner, onboarding easier, and future reports less speculative.

Good architecture documentation doesn't try to freeze the system. It tracks enough truth that security decisions remain grounded.

The teams that get value from documentation aren't the ones with the prettiest diagrams. They're the ones that review, trim, and correct them before drift becomes normal.

If your team wants pentest deliverables that stay organised from scoping through final report export, Vulnsy provides a reporting workflow built for security engagements, including structured findings, evidence handling, and reusable documentation that supports clearer client-ready output.


Written by

Luke Turvey

Security professional at Vulnsy, focused on helping penetration testers deliver better reports with less effort.