What we find

What others miss is rarely obvious.

Security leaders do not need another long report full of undifferentiated findings. They need confidence about what is exploitable, what is material, and what should change now. That is the gap we are built to close.

The assurance gap

A clean report is not the same as a tested assumption.

Most organizations already have some form of assurance: annual penetration tests, automated scans, compliance assessments, questionnaires, SBOM reviews, vulnerability reports. Those activities can be useful. They can also create false confidence when the test is not shaped around how the system actually works.

The gap shows up where things are hard to standardize: authorization logic, tenant boundaries, custom protocols, embedded devices, business workflows, compensating controls, and vulnerability reachability. Those areas take curiosity, judgment, and persistence from the person doing the work.

The real question is not “Was this tested?” It is “Was it tested deeply enough to answer the decision we actually need to make?”

Representative engagements

Three times standard testing stopped short.

Real Alpha Defense engagements, anonymized at each client’s request. No names, products, or exploit recipes, just how the work is different.

Authorization & multi-tenant isolation

The tenant boundary that wasn’t

A client asked us to assess a web application they were considering acquiring. The target supplied several years of clean penetration-test reports from a reputable firm. At kickoff, the target’s team explained that cross-customer access was impossible because records were separated at the infrastructure level. For the buyer, that tenant isolation was not abstract; it was part of the acquisition risk model.

Where surface testing stops too early

A tool can send a request, receive a successful response, and treat the behavior as normal. Here, the application kept working when a session-token value was removed. It still returned a normal-looking page, with the expected number of records per page. To a scanner, the request looked successful whether the value was present or not.

What we did differently

We manually compared the actual records returned across roles, sessions, and data contexts. The page loaded either way; what changed was the data. With the value removed, the application returned more records than it should have. That value was not an authentication artifact; it was quietly filtering data by domain, organization, and production instance. The tenant boundary depended on inconsistent data scoping, not the hard separation the buyer had been promised.
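The comparison described above can be sketched in a few lines. This is an illustration only, not the tooling used on the engagement: the `Record` type, the in-memory `RECORDS` list, and the `fetch_page` function are hypothetical stand-ins for a real application, and the scoping value is modeled as a simple tenant name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    record_id: int
    tenant: str


# Hypothetical data standing in for the application under test.
RECORDS = [
    Record(1, "globex"),
    Record(2, "acme"),
    Record(3, "initech"),
    Record(4, "acme"),
]


def fetch_page(scope_value: Optional[str], page: int, page_size: int = 2):
    """Simulated endpoint: answers 200 whether or not the scoping value is sent."""
    visible = [r for r in RECORDS if scope_value is None or r.tenant == scope_value]
    start = page * page_size
    return 200, visible[start:start + page_size]


def collect_ids(scope_value: Optional[str]) -> set:
    """Walk every page and collect the record IDs actually returned."""
    ids = set()
    page = 0
    while True:
        _status, rows = fetch_page(scope_value, page)
        if not rows:
            return ids
        ids.update(r.record_id for r in rows)
        page += 1


def unexpected_records(scope_value: str) -> set:
    """Records that appear only when the scoping value is stripped from the request."""
    return collect_ids(None) - collect_ids(scope_value)
```

In this toy model, both requests return status 200 and a full first page of two records, so a check based on status code or page size sees no difference. `unexpected_records("acme")` returns `{1, 3}`, the records that belong to other tenants.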

Firmware, embedded & hardware

The custom protocol no scanner understood

A client with a connected-device product line brought us in to assess multiple products over a multi-year program. One product owner was initially reluctant to share source code. Prior testing had run without it, and the argument was that a real attacker would not have it either.

Where standard methods fall short

Custom devices often hold their most important risk where generic scanners have no model: a proprietary protocol, device-specific behavior, an attack surface no public tool represents. That cannot be reduced to running a checklist.

What we did differently

We started where the client was comfortable, without source. We extracted and reverse-engineered the firmware, analyzed the custom communication protocol, built tooling tailored to the product, and wrote a fuzzer for the protocol. That produced multiple critical vulnerabilities with enough technical clarity for the client to act. It also changed the relationship: once the product owner saw the work, they provided source code and direct engineering access for the rest of the program.
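To give a sense of what a protocol-aware fuzzer involves, here is a minimal sketch. Everything in it is an assumption made for illustration: the frame layout (marker byte, command, declared length, payload, checksum), the toy `parse_frame` function standing in for a device, and the flaw planted in it. None of it reflects the client’s protocol or our actual tooling.

```python
import random
from typing import Optional

MAGIC = 0xA5  # hypothetical frame marker, not taken from any real product


def checksum(body: bytes) -> int:
    return sum(body) % 256


def build_frame(command: int, payload: bytes, length: Optional[int] = None) -> bytes:
    """Build a frame with a valid checksum; `length` can lie about the payload size."""
    declared = len(payload) if length is None else length
    body = bytes([MAGIC, command, declared]) + payload
    return body + bytes([checksum(body)])


def parse_frame(frame: bytes):
    """Toy parser standing in for the device; it trusts the declared length."""
    if len(frame) < 4 or frame[0] != MAGIC:
        raise ValueError("bad header")
    if checksum(frame[:-1]) != frame[-1]:
        raise ValueError("bad checksum")
    declared = frame[2]
    # Planted flaw: no bounds check against the bytes actually received.
    payload = bytes(frame[3 + i] for i in range(declared))
    return frame[1], payload


def fuzz(seed_payload: bytes, iterations: int = 200, seed: int = 1):
    """Mutate one field at a time, keep the checksum valid, record unexpected failures."""
    rng = random.Random(seed)
    findings = []
    for _ in range(iterations):
        strategy = rng.choice(["flip_byte", "lie_about_length", "random_command"])
        payload = bytearray(seed_payload)
        command, length = 0x01, None
        if strategy == "flip_byte" and payload:
            payload[rng.randrange(len(payload))] ^= 1 << rng.randrange(8)
        elif strategy == "lie_about_length":
            length = rng.randrange(256)
        else:
            command = rng.randrange(256)
        frame = build_frame(command, bytes(payload), length)
        try:
            parse_frame(frame)
        except ValueError:
            continue  # clean rejection, not a finding
        except Exception as exc:  # anything else means the parser lost control
            findings.append((strategy, frame.hex(), type(exc).__name__))
    return findings
```

The design point is that the fuzzer recomputes the checksum after each mutation, so frames get past the first validation step and reach the parsing logic. A generic tool that does not know the frame layout would mostly produce inputs rejected at the checksum.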

Vulnerability prioritization & reachability

The 5% that actually mattered

A client had invested in a commercial binary-scanning platform and received a report listing numerous vulnerable libraries and insecure components compiled into an application. It gave useful visibility, but not the answer the team most needed: which findings created immediate, exploitable risk in the actual product?

When severity is not enough

A vulnerability list can be technically accurate and still be operationally unhelpful. If a report says a vulnerable component exists but never analyzes whether the affected code is called, reachable, or exploitable in the deployed application, the engineering team is left to triage in the dark.

What we did differently

We analyzed reachability and exploitability in the context of the actual product, separating theoretical exposure from practical risk. Only a small portion of the flagged vulnerable code created immediate exploitable risk. The client patched that subset quickly, reducing customer risk right away, then worked the broader backlog down on a more deliberate timeline.
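One piece of that analysis, the reachability check, can be illustrated with a simple call-graph walk. This is a sketch under stated assumptions: the call graph, entry points, and function names below are hypothetical, and a real analysis also has to handle indirect calls and dynamic loading, and then ask whether a reachable flaw is actually exploitable.

```python
from collections import deque


def reachable_from(call_graph, entry_points):
    """Breadth-first walk: every function reachable from the product's entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        fn = queue.popleft()
        for callee in call_graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def triage(call_graph, entry_points, flagged):
    """Split scanner findings into reachable code and code that is never called."""
    live = reachable_from(call_graph, entry_points)
    return {
        "fix_now": sorted(f for f in flagged if f in live),
        "backlog": sorted(f for f in flagged if f not in live),
    }


# Hypothetical call graph: caller -> callees.
CALL_GRAPH = {
    "main": ["handle_request"],
    "handle_request": ["parse_header", "libimg_decode"],
    "legacy_export": ["libxml_parse"],  # compiled in, but nothing calls it
}

result = triage(
    CALL_GRAPH,
    entry_points=["main"],
    flagged=["libimg_decode", "libxml_parse"],
)
```

Here `result` is `{"fix_now": ["libimg_decode"], "backlog": ["libxml_parse"]}`: both components are present in the binary, but only one sits on a path from the entry point.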

The common thread is not a vulnerability class. It is a way of working: we look for the places where conventional assurance can appear complete while leaving the hardest question unanswered.

For security leaders

Questions to ask any testing partner.

Whether you are choosing a firm or reviewing an assessment you already have, these five questions reveal whether the work finds what matters or just documents what is easy to check.

01
Technical depth
Does the team doing the work have the judgment to challenge assumptions, follow evidence, and move beyond a narrow checklist?
02
Authorization
Do they test authorization by inspecting the data actually returned across roles, tenants, and edge cases, treating architecture claims as hypotheses to prove rather than assurances to accept?
03
When tooling ends
Can they reverse-engineer, build harnesses, write fuzzers, or create custom methods when the target environment requires it?
04
Exploitability
Does the report distinguish vulnerable code that merely exists from vulnerable code that is actually reachable in the deployed product?
05
Translated into action
Does the output help you decide what to fix now, what to monitor, and what can wait?

How we work

The principles behind the stories.

Test the assumption behind the control, not just the control.
Follow the evidence: the data returned, not the status code.
Build the method the system requires, even when no tool exists.
Separate theoretical concern from exploitable risk.
Translate findings into decisions: fix now, monitor, or wait.

From the blog.

If your pentest costs the same as last year, you’re paying too much

Human-led, AI-amplified: the model that actually works

Why we write

Get started

Considering a second opinion?

A due-diligence assessment, or a test of something a scanner cannot reach. You reach the experienced testers who do the work, not a sales queue.

Not ready to scope a test? Just ask a question. Every message reaches an expert directly and gets a personal reply, not a sales call.

Get a Second Opinion
Explore Services