How Is Penetration Testing Done: Practical 2026 Guide

By Luke Turvey | 26 May 2026 | 18 min read
You've got the brief, the target list, a date in the calendar, and a vague sense that “the hacking part” should start now. That's usually where newer testers get stuck. They know the tools, they know the buzzwords, but they haven't yet internalised that a professional penetration test isn't a free-form attack. It's a controlled exercise with boundaries, evidence requirements, and a deliverable that has to survive scrutiny from engineers, managers, and sometimes auditors.

That's the answer to how penetration testing is done. Not as chaos. Not as a race to get a shell. It's done as a structured workflow that starts well before exploitation and finishes well after it. In common practice, the work begins with planning and scoping, moves through reconnaissance and scanning, then exploitation, privilege escalation or persistence testing, and ends with cleanup and reporting, a lifecycle reflected in established guidance from IBM on penetration testing.

For solo testers and small teams, that structure matters even more. When you don't have separate people for scoping, testing, QA, and report production, weak process hurts fast. You lose hours chasing out-of-scope issues, gathering poor evidence, or rebuilding notes into a report at the end. Good testers learn that efficiency isn't about rushing the attack phase. It's about keeping the whole engagement organised so every hour produces something useful.

Before the Hack: Understanding the Goal of a Penetration Test

A penetration test is a business validation exercise disguised as an attack simulation. You're trying to answer a practical question: if someone targeted this environment under the agreed conditions, what could they realistically reach, abuse, or extract?

That sounds obvious, but it changes how you work. If your only goal is proving compromise, you'll stop at the first shell and call it success. If your goal is validating risk, you keep going carefully enough to show impact, explain exposure, and give the client something they can fix. The end product isn't just access. It's a documented account of what was exploitable, what access was achieved, and what needs remediation.

Why the workflow is more structured than it looks

Most professional engagements follow the same broad pattern because it works. You define the target and rules first. Then you gather information, enumerate services, test hypotheses, exploit what's permitted, assess post-exploitation impact, clean up, and write the report. That flow is consistent across mainstream guidance, but the important part in practice is the logic behind it.

Each phase exists to reduce wasted effort:

  • Planning prevents avoidable risk: It keeps you away from systems that should never be touched and clarifies what “success” means.
  • Reconnaissance narrows the field: It tells you where the attack surface is instead of spraying effort everywhere.
  • Exploitation validates findings: It separates scanner noise from genuine compromise paths.
  • Reporting converts technical work into decisions: Without that, the engagement produces activity, not value.

Practical rule: If a finding can't be explained, reproduced, and prioritised, it isn't ready for the report.

That's why strong teams treat note-taking as part of testing, not admin overhead. The command you ran, the request you modified, the screenshot that proves access, the conditions needed to reproduce the issue. Those details are what turn a technical discovery into something a client can act on.

The goal isn't “hack more”

A lot of public content about pentesting makes it sound like the work is mostly tool execution. It isn't. The core skill is deciding what matters, what to ignore, and how far to go without crossing the agreed line.

That's especially true for application assessments. If your day-to-day includes web targets, it's worth refreshing your approach to effective app security testing because the best results usually come from combining methodical coverage with targeted manual work, not from leaning too hard on scanners.

A mature tester learns to think like this:

  • What is this organisation worried about?
  • Which attack path is realistic in this environment?
  • What evidence will an engineer need to fix this quickly?
  • What can I safely stop testing because it won't change the outcome?

Those questions separate a useful pentest from a noisy one.

Defining the Battlefield: Scoping, Legals, and Test Models

Most bad engagements don't fail in exploitation. They fail in preparation. The tester starts with fuzzy boundaries, vague permissions, weak objectives, and assumptions that nobody wrote down. Then the test drifts.

A proper scope stops that. It tells you what's in, what's out, which methods are allowed, who to call if something breaks, and what the client expects to learn. In practice, a UK-aligned penetration test is usually executed as a scoped, rules-of-engagement-driven workflow: define target systems and allowed methods, perform reconnaissance and scanning, then attempt controlled exploitation, privilege escalation, and post-exploitation analysis before producing a remediation report, consistent with Imperva's overview of penetration testing methodology.

What needs to be in scope before testing starts

A usable scoping document isn't long for the sake of it. It's precise.

Include these points:

  • Target definition: Name the applications, hosts, ranges, environments, APIs, or offices you're authorised to test.
  • Explicit exclusions: Call out systems that are sensitive, shared, fragile, or owned by third parties.
  • Test window: State when active testing is allowed and whether out-of-hours work is permitted.
  • Permitted techniques: Clarify whether phishing, password spraying, denial-of-service style stress, social engineering, or persistence checks are allowed.
  • Communications plan: Name the primary contacts, escalation route, and emergency stop process.
  • Success criteria: Record whether the goal is compliance evidence, risk validation, attack path demonstration, or remediation verification.

If you need a starting point, a penetration testing scope of work template is useful because it forces you to turn loose assumptions into written decisions.
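
The scope points above can also be kept in a machine-checkable form, so a target or timestamp can be tested against the written agreement before any traffic is sent. The following is a minimal sketch, not a standard format: the `Scope` class, its field names, the 09:00 to 17:00 window, and the addresses (taken from reserved documentation ranges) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, time
from ipaddress import ip_address, ip_network


@dataclass
class Scope:
    """Hypothetical in-memory form of a written scope document."""
    in_scope: list[str]                                  # CIDR ranges authorised for testing
    excluded: list[str] = field(default_factory=list)    # explicit exclusions always win
    window_start: time = time(9, 0)                      # assumed daily test window
    window_end: time = time(17, 0)

    def allows_target(self, address: str) -> bool:
        """True only if the address is in an authorised range and not excluded."""
        ip = ip_address(address)
        if any(ip in ip_network(net) for net in self.excluded):
            return False
        return any(ip in ip_network(net) for net in self.in_scope)

    def allows_time(self, moment: datetime) -> bool:
        """True if the moment falls inside the agreed daily window."""
        return self.window_start <= moment.time() <= self.window_end


# Illustrative values only: documentation ranges, not a real client.
scope = Scope(
    in_scope=["203.0.113.0/24"],
    excluded=["203.0.113.10/32"],   # e.g. a fragile or third-party host
)

print(scope.allows_target("203.0.113.25"))   # inside the range
print(scope.allows_target("203.0.113.10"))   # explicitly excluded
print(scope.allows_target("198.51.100.7"))   # never authorised
```

Checking exclusions before inclusions is a deliberate choice here: a host that appears in both lists is treated as out of scope, which is the safer failure mode.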

The legal paperwork isn't optional

Authorisation has to be unambiguous. That means a signed statement of work or rules of engagement, an authorisation letter that identifies the tester or company, and whatever confidentiality paperwork the engagement requires, usually including an NDA. If cloud assets or third-party services are involved, permissions may need to be checked against provider terms as well.

This isn't bureaucracy for its own sake. It protects both sides. It also gives you confidence when the test gets noisy and someone notices unusual traffic or login attempts. You want an audit trail that shows this activity was approved.

If the client can't clearly authorise the work, don't start the work.

Black box, grey box, and white box change the job

One of the biggest gaps in typical “how is penetration testing done” explainers is that they list the phases but skip the testing model. That omission causes real confusion. A mainstream summary may describe reconnaissance, scanning, exploitation, maintaining access, and reporting, but often doesn't explain how black-box, grey-box, or white-box access changes the work. As noted in Bright Security's penetration testing guide, that gap matters because PCI guidance treats these models as materially affecting the accuracy and completeness of results.

Here's the practical difference:

Model | What you get | What it's good for | Common pitfall
Black box | Little or no internal knowledge | External attacker realism | Wasting time rediscovering basics
Grey box | Limited credentials or design context | Balanced realism and coverage | Assuming user access represents admin reality
White box | Broad internal knowledge, code, configs, architecture | Deep validation and thoroughness | Producing findings with low real-world exploitability

Small teams often do best with grey-box work because it cuts wasted effort without removing attacker realism entirely. Black-box tests are useful, but they can spend too much time proving discoverability. White-box tests are thorough, but they can drift into code review or configuration audit if you don't keep the objective tight.

Choose the model based on the client's question, not your preference.

Mapping the Target: Reconnaissance and Vulnerability Discovery

The hands-on work starts long before exploitation. Good reconnaissance gives you a map. Bad reconnaissance gives you a bag of disconnected outputs that you spend the rest of the engagement trying to interpret.

The difference is intent. You're not collecting information because the methodology says so. You're collecting it to build attack paths.

Passive first, then active with purpose

Passive reconnaissance means gathering useful context without directly interacting with the target systems in a way that's likely to trigger logging or alter state. Depending on the engagement, that may include public search results, cached content, breach exposure checks where authorised, technology fingerprinting from public assets, certificate transparency review, or searching platforms like Shodan for externally visible services.

Active reconnaissance begins when you touch the target. That's where tools like Nmap, Gobuster, ffuf, Nikto, Burp Suite, WhatWeb, or service-specific enumerators come in. You identify live hosts, exposed services, directories, virtual hosts, application behaviour, authentication points, and anything else that turns a broad scope into a shortlist of likely entry points.

A simple mental model helps:

  • Passive work tells you what might exist
  • Active work tells you what responds
  • Manual analysis tells you what matters
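
As one way to turn "what responds" into something you can work with, the sketch below reduces Nmap XML output (the kind written with `-oX`) to a short list of open services. The embedded XML is a trimmed, made-up sample in that general shape rather than output from a real scan, and `open_services` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed, invented sample in the general shape of Nmap XML output.
SAMPLE_XML = """<nmaprun>
  <host>
    <status state="up"/>
    <address addr="203.0.113.25" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22">
        <state state="open"/>
        <service name="ssh"/>
      </port>
      <port protocol="tcp" portid="443">
        <state state="open"/>
        <service name="https"/>
      </port>
      <port protocol="tcp" portid="8080">
        <state state="closed"/>
        <service name="http-proxy"/>
      </port>
    </ports>
  </host>
</nmaprun>"""


def open_services(xml_text: str) -> list[tuple[str, int, str]]:
    """Return (address, port, service name) for every port reported as open."""
    results = []
    root = ET.fromstring(xml_text)
    for host in root.iter("host"):
        address = host.find("address").get("addr")
        for port in host.iter("port"):
            if port.find("state").get("state") != "open":
                continue  # closed and filtered ports are not entry points
            service = port.find("service")
            name = service.get("name") if service is not None else "unknown"
            results.append((address, int(port.get("portid")), name))
    return results


for address, port, name in open_services(SAMPLE_XML):
    print(f"{address}:{port} {name}")
```

The output is only the "responds" layer. Deciding which of those services matters is still manual analysis.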

Build an attack surface map, not a tool dump

Junior testers often run several tools and save all the output, but they don't convert it into a working model. That's the missing step.

You want a map that answers questions like these:

  • Which assets are internet-facing?
  • Which technologies repeat across the environment?
  • Where are the admin interfaces?
  • Which services expose authentication?
  • Which hosts or applications suggest trust relationships?
  • Which findings can plausibly chain together?

That map can be a spreadsheet, notes in Obsidian, a markdown file, or whatever system you maintain during the job. The format matters less than discipline. If you can't look at your notes and quickly explain the attack surface to another tester, your reconnaissance isn't organised enough.
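
If the map lives in structured notes, the questions above become simple queries. The following is a sketch under stated assumptions: the `Asset` fields, the helper functions, and the hostnames are hypothetical and chosen only to mirror a few of those questions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """One row of a hypothetical attack surface map."""
    name: str
    internet_facing: bool
    technology: str
    has_admin_interface: bool
    exposes_auth: bool


# Invented entries for illustration.
ASSETS = [
    Asset("portal.example.test", True, "nginx", False, True),
    Asset("admin.example.test", True, "nginx", True, True),
    Asset("build.internal.example.test", False, "jenkins", True, True),
    Asset("static.example.test", True, "nginx", False, False),
]


def exposed_admin(assets: list[Asset]) -> list[str]:
    """Admin interfaces reachable from the internet."""
    return [a.name for a in assets if a.internet_facing and a.has_admin_interface]


def repeated_technologies(assets: list[Asset]) -> list[str]:
    """Technologies that appear on more than one asset."""
    counts = Counter(a.technology for a in assets)
    return sorted(tech for tech, n in counts.items() if n > 1)


print(exposed_admin(ASSETS))
print(repeated_technologies(ASSETS))
```

A spreadsheet with the same columns and a filter does the same job. The point is that each question maps to a field you recorded on purpose.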

For web and app-heavy work, the OWASP Testing Guide overview is a useful reference because it helps you turn broad application testing into a repeatable checklist without losing room for manual judgement.

Scanners help. They don't decide.

Automated scanners such as Nessus, OpenVAS, Nuclei, web DAST tools, and SAST outputs can save time. They're good at breadth, repeatability, and spotting known patterns. They are not good at understanding context.

That matters because a scanner will happily hand you:

  • false positives that don't survive validation
  • low-context findings that aren't exploitable in this environment
  • duplicate issues across multiple paths
  • noisy output that hides a more important chained weakness

A scanner finding isn't a vulnerability until you've verified what it actually means on that target.

Manual verification is where the essential work happens. You confirm versions carefully, test whether a condition is reachable, inspect request flow, look for authorisation gaps, and probe business logic. Scanners rarely catch the flaws that matter most in mature environments: broken assumptions between systems, weak role boundaries, insecure workflow design, and trust relationships that collapse when chained.

A practical rhythm works well here. Run broad enumeration early. Triage aggressively. Validate the high-value items first. Return to the scanner list later if you need wider coverage.

Solo testers especially need this discipline. You don't have enough time to chase every medium-confidence output. Pick the findings that could change the outcome of the test.
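
The triage step can be sketched in a few lines: collapse duplicates reported across multiple paths, then order what remains so the highest-severity items get validated first. The finding dictionaries, check names, and severity labels below are invented for illustration and do not match any particular scanner's export format.

```python
# Lower rank means validate sooner. The labels are an assumed convention.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}

# Invented scanner output for illustration.
RAW_FINDINGS = [
    {"check": "outdated-tls", "host": "portal.example.test", "path": "/", "severity": "medium"},
    {"check": "outdated-tls", "host": "portal.example.test", "path": "/login", "severity": "medium"},
    {"check": "default-credentials", "host": "admin.example.test", "path": "/", "severity": "critical"},
    {"check": "missing-header", "host": "static.example.test", "path": "/", "severity": "low"},
]


def triage(findings: list[dict]) -> list[dict]:
    """Drop duplicates of the same check on the same host, then sort by severity."""
    seen = set()
    unique = []
    for finding in findings:
        key = (finding["check"], finding["host"])
        if key in seen:
            continue  # same issue reported again under another path
        seen.add(key)
        unique.append(finding)
    return sorted(unique, key=lambda f: SEVERITY_RANK[f["severity"]])


for finding in triage(RAW_FINDINGS):
    print(finding["severity"], finding["check"], finding["host"])
```

Scanner severity is only a starting order. Context from the attack surface map should move items up or down before you spend time on them.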

Gaining Access: Exploitation and Post-Exploitation Techniques

This is the phase most people picture when they ask how penetration testing is done. It's also the phase that goes wrong fastest when the earlier work was sloppy.

Exploitation isn't “try payloads until something lands”. It's a sequence of decisions. You validate that a weakness is real, choose a safe way to demonstrate impact, stay inside the agreed rules, and capture enough evidence that someone else could follow your path later.

A realistic exploitation chain

Take a common web application scenario. During reconnaissance, you identify an authenticated function that behaves differently for low-privilege and high-privilege users. Manual testing shows that the application trusts a client-side parameter more than it should. By modifying a request in Burp Suite, you access data that should belong to another user. That confirms an authorisation failure.

At that point, a weak tester writes “IDOR found” and moves on. A stronger tester asks the next question. Can the flaw expose administrative actions, credentials, internal references, or something that leads to code execution?

Suppose the answer is yes. An administrative workflow lets you upload a file, but validation is inconsistent. You adapt the request, get server-side execution, and land a limited shell. You've now proved a chain, not just two isolated issues.

That chain is what the client remembers.
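
The authorisation failure at the start of that chain can be illustrated with a toy example. This is a deliberately simplified sketch, not the code of any real application: the in-memory `DOCUMENTS` store, the usernames, and both handler functions are hypothetical.

```python
# Invented data store: object ID mapped to owner and content.
DOCUMENTS = {
    101: {"owner": "alice", "body": "Alice's invoice"},
    102: {"owner": "bob", "body": "Bob's invoice"},
}


def get_document_vulnerable(session_user: str, document_id: int) -> str:
    """Trusts the client-supplied ID and never checks who owns the object."""
    return DOCUMENTS[document_id]["body"]


def get_document_fixed(session_user: str, document_id: int) -> str:
    """Enforces ownership on the server before returning the object."""
    document = DOCUMENTS.get(document_id)
    if document is None or document["owner"] != session_user:
        raise PermissionError("not authorised for this object")
    return document["body"]


# Alice changes the ID in her request from 101 to 102.
print(get_document_vulnerable("alice", 102))   # returns Bob's data

try:
    get_document_fixed("alice", 102)
except PermissionError as error:
    print("blocked:", error)
```

The fixed handler returns the same error for a missing object and for someone else's object, so the response does not confirm which IDs exist.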

Initial access is only the start

Once you have a foothold, the post-exploitation phase tells you whether the compromise is shallow or serious. Typical lines of enquiry include:

  • Privilege escalation: Can the low-privilege shell become an administrator or root-equivalent account?
  • Credential access: Are there API keys, SSH keys, service credentials, config secrets, or browser tokens stored locally?
  • Lateral movement: Can those credentials reach another host, application, or management plane?
  • Data access: What information is exposed under the agreed rules of engagement?
  • Persistence testing: Is there a permitted way to maintain access long enough to prove a control weakness?

The right depth depends on scope. Some clients want proof of initial compromise only. Others want you to explore realistic blast radius. Never assume. Check the rules of engagement before you touch persistence mechanisms, mailboxes, production data, or adjacent systems.

What works in practice

A lot of exploitation work is less glamorous than people expect. You confirm a known weakness manually. You search for a public proof of concept on Exploit-DB or within Metasploit. You adjust it because the target version or environment isn't identical. You test in the least destructive way available. Then you document every step.

That documentation needs to happen as you go:

  1. Record the vulnerable endpoint or service
  2. Capture the exact request, payload, or command
  3. Save evidence of the result
  4. Note any prerequisites
  5. Write down the impact while it's fresh

If you leave this until the end, you'll forget the details that make the report usable.
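
One lightweight way to make those five steps habitual is to log a structured entry each time something works. The sketch below assumes a hypothetical `EvidenceEntry` record and an in-memory list; the field names, the endpoint, and the example values are illustrative, and real notes could just as well live in markdown or a reporting tool.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class EvidenceEntry:
    """One step of evidence, mirroring the five points above."""
    target: str                      # vulnerable endpoint or service
    action: str                      # exact request, payload, or command
    result: str                      # what happened, or where the screenshot lives
    prerequisites: list[str] = field(default_factory=list)
    impact: str = ""
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def append_entry(log: list[dict], entry: EvidenceEntry) -> None:
    """Store the entry as a plain dict so it can be serialised later."""
    log.append(asdict(entry))


evidence_log: list[dict] = []
append_entry(
    evidence_log,
    EvidenceEntry(
        target="https://portal.example.test/api/documents/102",   # invented endpoint
        action="GET with low-privilege session, ID changed from 101 to 102",
        result="200 response containing another user's document",
        prerequisites=["any authenticated account"],
        impact="Read access to other users' documents",
    ),
)

print(json.dumps(evidence_log, indent=2))
```

The timestamp is recorded automatically in UTC, which helps when a client later needs to match your activity against their own logs.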

The cleanest exploit path usually wins. Not the cleverest one.

That's especially true on client work. A reliable manual proof beats a fragile exploit chain that only works once and leaves weak evidence behind.

Common mistakes during post-exploitation

New testers often overdo or underdo this phase.

They overdo it when they keep pivoting because they can, not because it improves the finding. They underdo it when they stop at initial code execution and never establish what the compromise means.

A practical balance looks like this:

Situation | Good decision | Bad decision
You have a low-privilege shell | Check local privilege escalation paths and stored credentials | Start changing system state unnecessarily
You found service credentials | Test access to authorised related systems only | Spray them across assets outside scope
You can read sensitive data | Capture minimal proof and stop | Dump excessive data “for evidence”
You can maintain access | Verify only if RoE allows it | Plant persistence without explicit approval

Cleanup matters too. Remove test accounts if you created them. Delete artefacts you uploaded where appropriate. Note anything that couldn't be fully reverted. Professionalism isn't just how you break in. It's how you leave the environment.

The Final Deliverable: Efficient Reporting and Remediation Guidance

The report is where the penetration test becomes useful. Everything before it is evidence gathering.

Clients don't buy a pentest because they want a transcript of commands. They buy it because they need a clear statement of risk, proof that the risk is real, and guidance their teams can act on. If the report is weak, the value of the technical work collapses.

Industry data shows pentesters spend between 20% and 40% of total engagement time on manual report writing and formatting, according to the SANS white paper on reporting overhead. Solo testers and small consultancies feel that pain most because the same person often tests, writes, edits, and delivers.

What a strong pentest report contains

A good report serves more than one reader. Leadership wants the business picture. Engineers want enough technical depth to reproduce and fix. Security managers want prioritisation.

That usually means at least these components:

  • Executive summary: Plain-English overview of what was tested, what level of risk was identified, and the most important remediation themes.
  • Methodology and scope: The agreed boundaries, testing model, dates, assumptions, and exclusions.
  • Attack narrative or engagement summary: A concise explanation of notable attack paths and how issues chained together.
  • Detailed findings: For each issue, include description, affected assets, evidence, impact, reproduction steps, and remediation guidance.
  • Appendices if needed: Tooling notes, screenshots, affected endpoints, or supplementary evidence.

If you want a concrete reference for structure, this penetration test report guide shows the kind of layout and detail that makes findings easier to consume.

The report should help people fix things

A common mistake is writing findings for other testers instead of for the team that has to remediate them. The write-up is technically accurate, but it doesn't answer the practical questions the client has.

For each finding, engineers usually need to know:

  • what the issue is
  • where it exists
  • how you proved it
  • what conditions make it exploitable
  • how urgent it is in context
  • what a sensible fix looks like
  • whether retesting is needed after the fix

The fix itself deserves the most care. A recommendation like “sanitise input” is too vague. A better recommendation ties the fix to the observed weakness, such as enforcing server-side authorisation checks on object access, removing trust in client-supplied role indicators, hardening file upload validation, or rotating exposed credentials and reviewing their reuse.

A report earns trust when the remediation advice is specific enough that the client can hand it to the right engineer without translation.
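
The list of things engineers need can double as a completeness check before a finding goes into the report. The following is a minimal sketch: the field names are an assumed mapping of those seven questions, and `missing_fields` is a hypothetical helper rather than part of any reporting tool.

```python
# Assumed field names, one per question an engineer needs answered.
REQUIRED_FIELDS = [
    "issue",          # what the issue is
    "location",       # where it exists
    "proof",          # how you proved it
    "conditions",     # what makes it exploitable
    "urgency",        # how urgent it is in context
    "fix",            # what a sensible fix looks like
    "retest_needed",  # whether retesting is needed after the fix
]


def is_blank(value) -> bool:
    """Treat None and empty or whitespace-only strings as unanswered."""
    return value is None or (isinstance(value, str) and not value.strip())


def missing_fields(finding: dict) -> list[str]:
    """Return the required fields that are absent or blank, in checklist order."""
    return [name for name in REQUIRED_FIELDS if is_blank(finding.get(name))]


# Invented draft finding with two gaps.
draft = {
    "issue": "Missing server-side authorisation check on document access",
    "location": "GET /api/documents/{id}",
    "proof": "",
    "conditions": "Any authenticated account",
    "urgency": "high",
    "fix": "Enforce ownership checks on the server for every object lookup",
}

print(missing_fields(draft))
```

A boolean `False` for `retest_needed` counts as an answer, so only fields that were never filled in are flagged.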

Manual formatting is where time disappears

Small teams often accept reporting pain as normal. It doesn't have to be. The least valuable part of reporting is usually the mechanical work: copying screenshots into Word, reapplying styles, rebuilding recurring findings, checking numbering, and cleaning up formatting after revisions.

That's where dedicated reporting platforms make sense. Some teams stay with markdown and templates. Others build internal scripts. Others use reporting tools designed for pentest workflows. Vulnsy is one example. It's a platform built for scoping projects, storing reusable findings, attaching screenshots and proof-of-concept evidence, and exporting consistent DOCX deliverables without manual document assembly.

That doesn't replace technical judgement. It removes repetitive publishing work so you can spend more time validating findings and improving remediation quality.

Retesting closes the loop

The best report still isn't the end of the job. Remediation should be validated. Retesting confirms whether the fix resolves the issue and whether the control works under the same conditions that previously failed.

That's also where a clear report pays off. If your original evidence and reproduction steps are clean, retesting is efficient. If the original write-up is vague, everyone pays for it later.

For many clients, the reporting phase is the part they use longest. Exploitation happens once. The report shapes backlog decisions, remediation work, internal discussions, and audit conversations for months.

Evolving Your Pentesting Workflow

The most useful way to think about penetration testing is as a loop, not a single event. Planning shapes reconnaissance. Reconnaissance shapes exploitation. Exploitation shapes reporting. Reporting shapes remediation. Retesting then feeds the next round of planning.

That loop is why repeatability matters. A 2026 industry roundup of global benchmark data suggests penetration testing is commonly treated as a repeatable control rather than a one-off exercise. In that dataset, 32% of organisations test annually or bi-annually, 51% outsource to third-party specialists, 81% of vulnerabilities found were rated high or critical, and quarterly testing corresponded to 53% lower breach rates than testing annually or less often, as summarised by ZeroThreat's penetration testing statistics roundup. The source is global, but the operational lesson applies broadly: mature teams test, remediate, and retest.

For solo testers and small teams, workflow maturity usually comes from a few boring improvements done consistently:

  • Standardise your notes: Keep the same evidence format every time.
  • Trim your toolset: A smaller set of tools you know well beats a sprawling toolkit you barely manage.
  • Triage early: Don't let scanner output control your day.
  • Write findings while testing: Waiting until the end creates rework and weak detail.
  • Reuse what should be reusable: Scope language, finding templates, remediation patterns, and report structure should not be rebuilt from scratch on every engagement.

The testers who scale cleanly aren't always the ones with the flashiest tradecraft. They're the ones who can run a disciplined engagement end to end, produce a report people can use, and repeat that standard under deadline.


If reporting is the part of your workflow that keeps eating nights and weekends, Vulnsy is worth a look. It's built for pentesters who want a cleaner way to scope engagements, manage findings, attach evidence, and generate client-ready reports without wrestling with manual formatting every time.

Written by

Luke Turvey

Security professional at Vulnsy, focused on helping penetration testers deliver better reports with less effort.
