Password Cracking Software: A Pentester's 2026 Guide

By Luke Turvey • 3 June 2026 • 16 min read

The worst advice about password cracking software is also the most common: treat it as a niche topic for criminals or a flashy sideshow in a penetration test. That view misses how these tools are used in professional security work.

In a real engagement, password cracking software isn't there to impress anyone. It helps answer practical questions. Are users choosing weak, predictable passwords? Did a breach expose reusable credentials? Can an attacker turn one foothold into wider access because the same password appears across systems? Those are assurance questions, not movie scenes.

For a pentester, the value isn't in “cracking passwords” as a headline. The value is in producing evidence a client can act on, then documenting that evidence clearly enough that security teams, managers, and auditors all understand the risk and the fix.

Beyond the Hollywood Hype of Hacking

Password cracking is still commonly pictured as a dark-room activity involving stolen data, green terminal text, and a hoodie. In practice, authorised security teams use the same category of tooling the way a locksmith tests a lock. The purpose is controlled validation. If a password store, user policy, or authentication design is weak, you want to learn that during a sanctioned assessment, not during incident response.

That distinction matters because the software is not inherently malicious. Authorisation is what separates a penetration test from a criminal act. Used properly, password cracking software helps testers measure exposure after obtaining hashes during an internal assessment, validate whether password policy controls work in reality, and show clients how quickly weak credential practices turn into broader compromise.

Why the history still matters

A lot of modern discussion makes password cracking sound new. It isn't. A foundational milestone was the 1998 Electronic Frontier Foundation Deep Crack machine, which recovered a 56-bit DES key in 56 hours while testing over 90 billion keys per second. That mattered because it showed, very publicly, that weak cryptography and weak secrets don't fail in theory first. They fail when dedicated hardware meets poor defensive choices.

Today's pentesting workflow inherits that lesson. We don't assume a hash is “safe enough” because it looks technical. We ask what algorithm was used, whether salts were unique, whether credentials can be attacked offline, and what business impact follows if even a small subset is recovered.

Password work in a pentest is rarely about showing off tooling. It's about proving whether the client's assumptions about credential security are actually true.

If you're mentoring junior testers, it helps to frame this as part of ethical hacking rather than a specialist trick. Vulnsy's overview of what's involved in ethical hacking is useful for that broader context, especially for teams that need to explain why credential testing belongs inside a legitimate assurance process.

What clients usually get wrong

Clients often focus on the brand names of tools. That's not the main issue. The primary issue is whether the environment gives an attacker favourable conditions, such as weak passwords, poor storage practices, or opportunities to reuse recovered credentials.

A mature tester keeps the conversation there. The finding isn't “Hashcat was used”. The finding is that the organisation made password guessing economically viable.

How Password Cracking Fundamentally Works

At a basic level, password cracking is a comparison exercise. A system stores a transformed version of the password, usually a hash, and the attacker or tester keeps generating guesses, transforms those guesses the same way, and checks for a match.

A simple analogy helps. Think of the original password as a recipe stored in a locked box. Hashing turns that recipe into a fixed label. The label identifies the recipe, but you can't read the recipe back from it. To recover the original, you don't reverse the label. You keep trying possible recipes until one produces the same label.
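
The guess, transform, compare loop can be sketched in a few lines of Python. This is a minimal lab illustration, not how any particular tool is implemented. It assumes an unsalted SHA-256 hash purely to keep the example short (that is not a suitable storage scheme), and the function names and sample values are hypothetical.

```python
import hashlib


def sha256_hex(candidate: str) -> str:
    """Transform a guess the same way the (hypothetical) target system did."""
    return hashlib.sha256(candidate.encode("utf-8")).hexdigest()


def find_match(stored_hash: str, candidates):
    """Return the first candidate whose hash equals the stored hash, else None."""
    for candidate in candidates:
        if sha256_hex(candidate) == stored_hash:
            return candidate
    return None


# Lab-only value created here, not taken from any real system.
stored = sha256_hex("Summer2026!")
guesses = ["password", "letmein", "Summer2026!", "qwerty"]
recovered = find_match(stored, guesses)
```

Nothing is reversed at any point. The stored value is only ever compared against freshly hashed guesses, which is why the quality of the guesses matters so much.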

A diagram illustrating how password hashing with salting works and how an attacker attempts to crack it.

Hashes, salts, and peppers

A hash is the stored output of the password after the hashing function runs. Good password storage doesn't stop there.

A salt is a unique random value added to each password before hashing. That means two users with the same password should still end up with different stored hashes. This defeats simplistic precomputed lookup approaches and forces work per password.

A pepper is an additional application-level secret held separately from the password database. If implemented properly, it means a stolen hash set alone is less useful because the attacker doesn't have all the material needed to reproduce the final stored value.

Here's the practical mental model:

Hashing stops plaintext exposure in storage.
Salting stops one cracked value from cascading neatly across matching passwords in the dataset.
Peppering adds another barrier outside the database itself.
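
As a rough sketch of how the three layers fit together, the example below uses Python's standard library `hashlib.scrypt` with a per-password random salt, and mixes in a pepper with HMAC before the slow hash. The pepper construction, the parameter values, and the function names are assumptions chosen for illustration; a real system should follow its framework's guidance and keep the pepper outside source code and outside the database.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Hypothetical pepper for illustration only. In practice it would be held
# separately from the password database, for example in a secrets manager.
PEPPER = b"example-pepper-not-for-production"


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) using a unique salt and an application-level pepper."""
    if salt is None:
        salt = os.urandom(16)  # unique random salt per password
    # One possible way to apply a pepper: HMAC the password before the slow hash.
    peppered = hmac.new(PEPPER, password.encode("utf-8"), hashlib.sha256).digest()
    digest = hashlib.scrypt(peppered, salt=salt, n=2**14, r=8, p=1, dklen=32)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# Two users choosing the same password end up with different stored values.
salt_a, hash_a = hash_password("Winter2025")
salt_b, hash_b = hash_password("Winter2025")
```

Because each salt is random, `hash_a` and `hash_b` differ even though the password is identical, so one recovered value does not reveal the other.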

The common cracking approaches

Testers and attackers generally rely on a few familiar methods.

Dictionary attacks start with likely passwords, breached-password patterns, company themes, seasons, keyboard walks, and common user choices.
Brute-force attacks try all combinations within a defined character set and length range. This is expensive, so it's usually reserved for suitable targets or narrower search spaces.
Rule-based and hybrid attacks mutate base words in realistic ways. That includes capitalising the first letter, appending a year, swapping letters for symbols, or adding an exclamation mark.
Rainbow table attacks used precomputed hash lookups and were far more relevant before widespread unique salting. They matter mainly as a historical lesson in why salts are essential.

A junior tester should understand one thing early: password cracking software succeeds less because of magic and more because users are predictable.

Practical rule: If your candidate generation reflects how people actually choose passwords, you'll usually learn more than if you just throw blind brute force at everything.
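
To make the rule-based idea concrete, here is a small sketch of candidate mutation for use in an authorised lab or policy audit. It is not the rule syntax of Hashcat or John the Ripper; the substitution table, years, and suffixes are hypothetical choices that mirror the habits described above.

```python
from itertools import product

# Hypothetical substitution table reflecting common "letters for symbols" swaps.
LEET = {"a": "@", "o": "0", "e": "3", "s": "$"}


def mutate(base_word: str, years=("2025", "2026"), suffixes=("", "!")):
    """Yield realistic variations of a base word in a deterministic order."""
    stems = [base_word, base_word.capitalize()]
    # Add a symbol-swapped version of each stem.
    stems += ["".join(LEET.get(ch, ch) for ch in stem) for stem in list(stems)]
    seen = set()
    for stem, year, suffix in product(stems, ("",) + tuple(years), suffixes):
        candidate = f"{stem}{year}{suffix}"
        if candidate not in seen:  # skip duplicates
            seen.add(candidate)
            yield candidate


candidates = list(mutate("summer"))
```

One base word becomes a couple of dozen plausible candidates, including the capitalised word with a year and an exclamation mark. That is far cheaper than blind brute force, and far closer to what users actually type.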

Online and offline are not the same problem

This is the distinction responders care about most. In an online attack, guesses go against a live service. The defender can slow or stop them with rate limits, lockouts, MFA, and monitoring. In an offline attack, the attacker already has stolen hashes and does the guessing away from the victim system.

The key problem is that once an attacker steals password hashes, the cracking process happens offline, evading logs and alerts. That changes the security question. You're no longer asking, “Can we spot repeated login failures?” You're asking, “Did we expose reusable password material, and how expensive did we make cracking?”

That's why incident responders often prioritise hash type, salting quality, and likely credential reuse before they look for login noise. There may be no login noise at all.
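
The question of how expensive cracking has been made can be reasoned about with simple arithmetic. The sketch below computes the size of a brute-force search space and the worst-case time to exhaust it. The two guess rates are hypothetical round numbers chosen only to contrast a fast general-purpose hash with a deliberately slow password hash; they are not benchmarks of any tool or hardware.

```python
def keyspace(charset_size: int, min_len: int, max_len: int) -> int:
    """Number of candidates across all lengths in [min_len, max_len]."""
    return sum(charset_size ** length for length in range(min_len, max_len + 1))


def worst_case_seconds(total_candidates: int, guesses_per_second: float) -> float:
    """Time to try every candidate at a constant guess rate."""
    return total_candidates / guesses_per_second


# Hypothetical rates for comparison only (assumptions, not measurements).
FAST_HASH_RATE = 1e10  # guesses per second against a fast, unsuitable hash
SLOW_HASH_RATE = 1e4   # guesses per second against a slow password hash

space = keyspace(charset_size=26, min_len=8, max_len=8)  # 8 lowercase letters
fast_days = worst_case_seconds(space, FAST_HASH_RATE) / 86400
slow_days = worst_case_seconds(space, SLOW_HASH_RATE) / 86400
```

Under these assumed rates, the same eight-letter lowercase space takes about 21 seconds to exhaust at the fast rate and about 242 days at the slow rate. The password policy is identical in both cases; only the storage choice changed.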

Legitimate Use Cases in Penetration Testing

Password cracking earns its place in a test when it helps answer a scoped security question. If it doesn't support an objective, it becomes theatre.

One common use case is password policy validation. A client may believe their policy is strong because it requires complexity. A controlled cracking exercise often shows the opposite. Users adapt to the policy in predictable ways. They capitalise the first letter, add a number at the end, and rotate through familiar patterns. Cracking recovered hashes during an internal assessment can demonstrate that the written policy and the lived reality are not the same.
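
One way to evidence that gap without circulating plaintext is to summarise the structure of recovered passwords. The sketch below maps each character to a class and counts the resulting shapes. The class letters and function names are illustrative assumptions, and the sample values are invented lab data.

```python
from collections import Counter


def structure_mask(password: str) -> str:
    """Map each character to a class: u=upper, l=lower, d=digit, s=other."""
    classes = []
    for ch in password:
        if ch.isupper():
            classes.append("u")
        elif ch.islower():
            classes.append("l")
        elif ch.isdigit():
            classes.append("d")
        else:
            classes.append("s")
    return "".join(classes)


def summarise_patterns(recovered):
    """Count structural masks so a report can show patterns, not plaintext."""
    return Counter(structure_mask(p) for p in recovered)


# Invented lab values, not real client data.
sample = ["Summer2026", "Winter2025", "London2024!", "x7#Kq"]
summary = summarise_patterns(sample)
```

A summary dominated by one shape, such as a capital letter followed by lowercase letters and four digits, is direct evidence that users are satisfying the complexity rule in the same predictable way.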

Where it fits in a real engagement

Another use case is post-compromise analysis during a simulation. If testers gain access to a database, domain artefact, or application backup that contains password hashes, a cracking exercise helps determine whether that exposure would lead to broader compromise. The important deliverable isn't a pile of recovered passwords. It's evidence about blast radius.

A few examples make this concrete:

User password quality checks, where the objective is to prove that weak password selection remains possible despite formal requirements.
Privilege escalation support, where one recovered local admin or service credential reveals poor segregation between systems.
Credential reuse confirmation, where a cracked password opens access to a different application, VPN, or administrative interface inside scope.

What useful evidence looks like

A professional tester doesn't dump cracked values into a report and call it done. The result needs context.

A good workflow usually records:

Evidence area	What matters
Source of hashes	How the hashes were obtained within authorised scope
Storage quality	Whether the hashes appeared salted and what algorithm family was involved
Recovered credential patterns	Common constructions, naming themes, or business references
Privilege relevance	Whether recovered credentials belonged to standard users, admins, or service accounts
Reuse impact	Whether the same secret unlocked additional systems in scope

That structure is what turns technical output into a finding a client can act on.
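
If your team keeps engagement notes in code or structured files, the table above maps naturally onto a small record type. This is a hypothetical sketch, not a standard schema; the class name, field names, and sample values are assumptions that simply mirror the table rows.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class CrackingEvidence:
    """One record per hash source, mirroring the evidence areas in the table."""
    hash_source: str
    storage_quality: str
    privilege_relevance: str
    reuse_impact: str
    credential_patterns: list = field(default_factory=list)

    def missing_fields(self):
        """Names of evidence areas that are still empty."""
        return [name for name, value in asdict(self).items() if not value]


# Invented example values for illustration.
record = CrackingEvidence(
    hash_source="Application database backup obtained within agreed scope",
    storage_quality="Unsalted fast hash",
    privilege_relevance="Standard users and one service account",
    reuse_impact="",
    credential_patterns=["Season plus year", "Company name plus digits"],
)
gaps = record.missing_fields()
```

A check like `missing_fields()` is a cheap way to catch a finding that lists recoveries but never states whether reuse was tested.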

The strongest password-cracking result in a pentest is often not the one with the most recoveries. It's the one that clearly proves a business-relevant path from hash exposure to wider access.

What doesn't work

Two habits waste time.

First, indiscriminate cracking without a defined question. That burns effort and creates noisy evidence.

Second, reporting the tooling rather than the risk. Clients don't need a fan-club write-up about your GPU setup. They need to know whether their password storage, user behaviour, and access model leave them exposed.

An Overview of Common Cracking Tools

Most professional discussions start and end with Hashcat and John the Ripper, and that's fair. They remain the tools testers most often encounter in assessments, incident response, and lab work. They're powerful because they combine broad hash support, flexible attack modes, and the ability to make efficient use of available hardware.

Hashcat is often the first tool people mention because it's strongly associated with GPU acceleration and high-performance offline cracking workflows. John the Ripper has long been valued for its flexibility, mature formats, and ability to fit a wide range of Unix, Windows, and mixed-environment tasks. In practice, experienced testers choose based on the job, the hash material, and the workflow they already trust.

What these tools actually provide

The software matters less than the capabilities it brings together:

Hash-format support so testers can identify and process common password storage schemes.
Attack-mode flexibility including dictionary, mask, incremental, and rule-based strategies.
Candidate transformation so realistic mutations mirror how users create passwords.
Hardware efficiency so offline testing uses available CPU or GPU resources sensibly.
Session management so longer-running jobs can be paused, resumed, and documented properly.

If you're building wider tooling familiarity, lists of best free penetration testing tools are useful for context because they place cracking tools alongside recon, exploitation, and reporting utilities rather than treating them as a world of their own.

The trade-offs junior testers often miss

A tool with more speed isn't automatically the better choice for every engagement. Speed matters when the target storage is weak and the scope allows a proper offline assessment. But flexibility, reproducibility, and clean evidence handling often matter more than chasing raw throughput.

John the Ripper, for example, may fit nicely into environments where format handling and custom workflows matter. Hashcat often shines where hardware acceleration and large candidate generation pipelines are central. Neither replaces judgement.

A more mature way to think about password cracking software is this:

Decision factor	Why it matters
Hash type	Determines feasibility and candidate strategy
Available hardware	Affects practical throughput, not just theoretical capability
Time budget	Pentests run on deadlines, so prioritisation matters
Reporting needs	You need evidence that can be explained and reproduced
Client objective	Policy validation needs different output than privilege escalation support

Teams that want a broader view of how tooling fits into engagements can also review Vulnsy's take on pen testing software, which is useful for thinking beyond a single category of utility.

What to avoid

Avoid writing about these tools as if the software alone determines success. Most failures happen earlier. Weak storage choices, reusable credentials, and poor segmentation create the conditions. The tool just measures them.

Navigating Legal and Ethical Boundaries

The line between ethical testing and criminal behaviour is simple to describe and easy to violate: authorisation. If you don't have explicit written permission, password cracking activity is not pentesting. It's unauthorised access work, or preparation for it.

In the UK, that matters because the Computer Misuse Act 1990 isn't an abstract warning. If a tester exceeds agreed scope, targets systems without permission, or handles credentials irresponsibly, the defence of “I was only testing” won't help much. Your statement of work, rules of engagement, and client approvals have to be precise.

Scope before software

Before any password cracking begins, get these basics pinned down:

Written permission covering the systems, accounts, environments, and dates in scope.
Clear handling rules for recovered passwords, sensitive records, and screenshots.
Agreed escalation paths for what happens if you recover privileged or shared credentials.
Defined stop conditions for high-risk findings, such as access to production data.

That level of clarity protects both the client and the tester. It also avoids the common mistake where a technical team says “test our environment” but legal or management never approved credential recovery work.
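
Some teams encode those basics as a simple pre-flight check so that nobody starts a cracking job against the wrong system or outside the agreed dates. The sketch below is one hypothetical way to do that. The class, field names, and example systems are assumptions, and a check like this supplements the signed paperwork rather than replacing it.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Engagement:
    """Scope details copied from the signed rules of engagement."""
    in_scope_systems: frozenset
    start: date
    end: date
    credential_recovery_approved: bool


def may_crack(engagement: Engagement, system: str, today: date) -> bool:
    """True only if cracking is approved, the system is in scope, and dates fit."""
    return (
        engagement.credential_recovery_approved
        and system in engagement.in_scope_systems
        and engagement.start <= today <= engagement.end
    )


# Invented example engagement.
engagement = Engagement(
    in_scope_systems=frozenset({"hr-app", "intranet-db"}),
    start=date(2026, 6, 1),
    end=date(2026, 6, 14),
    credential_recovery_approved=True,
)
```

The point of the explicit `credential_recovery_approved` flag is the mistake described above: a system can be in scope for testing without credential recovery ever having been approved.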

Documentation is part of ethics

Ethical boundaries don't end when the cracking session starts. They continue through evidence handling, storage, and reporting. If a test recovers valid credentials, you should minimise exposure, avoid unnecessary plaintext retention, and share only what the client needs to remediate and validate impact.

That same mindset applies in adjacent compliance areas. For example, teams that handle meeting evidence, interviews, or call recordings often run into consent questions, so a practical guide on recording legality for meetings can be helpful as a reminder that evidence collection has legal boundaries too.

Unauthorised skill is still unauthorised activity. The professionalism isn't in what you can do. It's in doing it only with permission.

Handling discovered secrets responsibly

Once a password is recovered, curiosity becomes a liability. Don't test every possible reuse path unless the scope allows it. Don't retain a library of client passwords. Don't include unnecessary plaintext in broad report circulation.

A disciplined tester records enough to prove impact, enough to support remediation, and no more than that.
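
A small redaction helper is one way to keep that discipline in report drafts. The sketch below keeps the first character and the length of a recovered password and masks the rest. The function name and the default of one visible character are assumptions; agree the actual redaction rule with the client.

```python
def redact(password: str, visible: int = 1) -> str:
    """Keep the first `visible` characters and the length; mask everything else."""
    if len(password) <= visible:
        return "*" * len(password)  # never reveal a very short secret in full
    return password[:visible] + "*" * (len(password) - visible)


# Invented lab value.
masked = redact("Summer2026!")
```

The masked form still supports statements about length and construction while keeping the working credential out of widely circulated documents.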

Effective Defensive Controls and Detection

When defenders ask how to stop password cracking software, the honest answer is that you usually don't “stop the software” directly. You remove the easy opportunities and raise the cost of success.

That starts with reducing reliance on passwords alone. A compromised password is far less useful when access also depends on a second factor. MFA doesn't solve every problem, but it changes the attacker's path and reduces the value of a recovered secret in many common scenarios.

A list of six numbered security measures for protection against password cracking and cyber attacks.

What UK teams should prioritise

For password storage, the UK guidance is clear. The NCSC advises storing passwords using deliberately slow, purpose-built password hashing functions such as Argon2id, scrypt, or bcrypt, paired with a unique salt. (Argon2id and scrypt are also memory-hard; bcrypt is slow but not memory-hard.) That recommendation matters because once hashes are stolen, the defender's web login controls no longer matter. The fight moves to the attacker's offline compute budget.

That gives defenders a sensible order of operations:

Use MFA where it materially reduces password-only risk. For many remote access and business-critical systems, it should be standard, not optional.
Store passwords with Argon2id, scrypt, or bcrypt plus unique salts. Fast general-purpose hashes are the wrong tool for password storage.
Limit online guessing. Rate limiting, account lockouts, and anomaly detection still matter for live authentication endpoints.
Reduce reusable secrets. Segment privileges, rotate shared credentials, and remove unnecessary password-based service dependencies.
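
For the online guessing control in that list, the sketch below shows the basic idea of a sliding-window limit on failed logins. It is an in-memory illustration with hypothetical names and thresholds. A production service would normally rely on its identity platform or a shared store, and would also consider limits per source address.

```python
from collections import defaultdict, deque


class FailedLoginLimiter:
    """Block an account once it has too many failures inside a time window."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 300):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # account -> failure timestamps

    def record_failure(self, account: str, now: float) -> None:
        self._failures[account].append(now)

    def is_blocked(self, account: str, now: float) -> bool:
        attempts = self._failures[account]
        # Drop failures that have aged out of the window.
        while attempts and now - attempts[0] > self.window_seconds:
            attempts.popleft()
        return len(attempts) >= self.max_failures


# Timestamps are passed in explicitly (in seconds) so the behaviour is easy to test.
limiter = FailedLoginLimiter(max_failures=3, window_seconds=60)
for timestamp in (0, 10, 20):
    limiter.record_failure("alice", timestamp)
```

A control like this only affects guesses made against the live service. It does nothing once hashes have left the environment, which is why it sits alongside the storage controls rather than replacing them.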

Detection has limits

Teams often ask what alerts will show an offline cracking campaign. Usually, none. If the attacker already has the hashes, the expensive part happens elsewhere. That means good detection strategy starts earlier, around the theft of the hash material, suspicious access to stores that contain it, and unusual privilege movement after the fact.

A practical defensive checklist looks like this:

Protect the stores first by hardening databases, backups, and identity infrastructure that may contain hashes or authentication artefacts.
Make theft less useful with modern password hashing, unique salts, and sound secret separation.
Constrain post-crack movement with MFA, least privilege, and reduced credential reuse.
Secure sensitive records at rest so exported assessment evidence or incident material isn't left exposed. For teams handling regulated files, this primer on GPG for regulated record security is a useful operational reference.
Review second-order risk such as password resets, service desk verification, and break-glass procedures.
Improve user choices by blocking weak and known-bad password patterns before they enter the estate.
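
The last item in that checklist can be sketched as a normalise-then-lookup check at password-set time. The example strips trailing digits and symbols, lowercases the remainder, undoes a few common substitutions, and compares the result with a blocklist. The substitution table and the blocklist entries are hypothetical; a real deployment would use a much larger known-bad list plus organisation-specific terms.

```python
import re

# Hypothetical table for undoing common symbol-for-letter swaps.
UNLEET = str.maketrans({"@": "a", "0": "o", "3": "e", "$": "s", "1": "l"})


def normalise(password: str) -> str:
    """Strip trailing digits/symbols, lowercase, and undo common substitutions."""
    stem = re.sub(r"[\d\W_]+$", "", password)
    return stem.lower().translate(UNLEET)


def is_known_bad(password: str, blocklist) -> bool:
    """True if the normalised password matches a blocked base word."""
    return normalise(password) in blocklist


# Invented blocklist; "acme" stands in for a company name.
blocklist = {"password", "summer", "acme"}
```

With this normalisation, `P@ssw0rd2026!` and `Summer2026!` are both rejected because they reduce to blocked base words, while a long unrelated passphrase passes. That targets the predictable constructions that complexity rules alone tend to encourage.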

For organisations building stronger identity controls, it also helps to align staff on what multi-factor authentication protects and where its limits are.

What works and what doesn't

What works is layered friction. Good storage, limited reuse, secondary factors, and tight privilege boundaries combine well.

What doesn't work is relying on complexity rules alone while storing passwords badly or leaving hash material exposed. If defenders only focus on login pages, they miss the underlying problem.

Documenting Cracking Findings for Impactful Reports

Many testers do the technical part well and the reporting part badly. Password findings are a common example. The notes say “cracked multiple hashes” and attach screenshots, but the client still doesn't know what to prioritise.

A stronger finding turns recovered credentials into a business-relevant story. It should explain what was obtained, why it matters, how far the risk extends, and what to fix first.

A reporting structure that lands well

A practical password-cracking finding usually needs four parts:

Clear title such as “Weak password storage enabled recovery of user credentials” or “Recoverable credentials allowed cross-system reuse”.
Plain-English description explaining that stolen hashes or exposed credential material could be tested offline or reused within scope.
Impact statement tied to access, privilege escalation, internal movement, or exposure of regulated data.
Actionable remediation covering password storage, reset scope, MFA rollout, privilege review, and password policy improvements.

If you recovered passwords for privileged users, say so. If you proved reuse across systems, say which systems in scope were affected. If you only established a pattern rather than broad compromise, be precise about that too. Good reporting is specific without becoming reckless.

Evidence should support the conclusion

A useful evidence pack is compact. Include the hash source, the affected account types, examples of password construction patterns where appropriate, and only the minimum plaintext needed to prove impact.

A simple internal template can help:

Report element	Good practice
Risk summary	Explain the security and business consequence in plain language
Technical detail	Note how the hashes or credentials were obtained within scope
Proof of impact	Show authorised validation of access or reuse where applicable
Remediation	Prioritise fixes that reduce future exposure, not just password resets

The client doesn't benefit from a dramatic cracking story. They benefit from a concise finding that links evidence to remediation.

The best consultants become memorable because their reports are easy to read, consistent, and defensible. That's where workflow tooling matters. If you're tired of rebuilding Word templates, reformatting screenshots, and copying standard remediation text by hand, Vulnsy helps standardise findings, attach evidence cleanly, and generate professional DOCX deliverables without the reporting grind.

Written by

Luke Turvey

Security professional at Vulnsy, focused on helping penetration testers deliver better reports with less effort.