February 9, 2026 • 2 min read • Fateh Mohammed
Secret Detection in Pull Requests: Beyond Regex Patterns
Regex catches known secret formats but misses custom tokens and context-dependent credentials. Here is how layered detection closes the gap.
Secret detection is one of the highest-value checks you can run on a pull request. A leaked API key or database credential in a public repository can be exploited within minutes.
Most secret scanners rely on regex patterns. That works for well-structured tokens like AWS access keys or GitHub personal access tokens. It does not work for everything.
Where regex falls short
Regex-based detection struggles with:
- Custom tokens — internal API keys that do not follow a standard format.
- Connection strings — database URLs embedded in configuration that vary by provider.
- Context-dependent secrets — a base64 string that looks benign in isolation but is a signing key in context.
- Multi-line secrets — PEM keys or certificates split across lines.
Layered detection
A robust secret detector uses multiple strategies:
1. Pattern matching
Start with regex for known formats. This catches the majority of structured secrets with zero false positives.
2. Entropy analysis
High-entropy strings in assignment contexts (environment variables, config files, constructor arguments) are likely secrets even without a known format.
3. Semantic analysis
Examine variable names, file paths, and surrounding code. A variable named db_password assigned a high-entropy string is a stronger signal than either indicator alone.
4. Contextual validation
Check whether the file is a test fixture, documentation example, or production configuration. Test fixtures with fake credentials should not block a merge.
Practical considerations
False positives in secret detection are especially costly because they erode trust fast. Developers who see three false positive secret alerts in a week will start ignoring all of them.
Balance sensitivity with precision. A detector that catches 95% of real secrets with 1% false positives is more useful than one that catches 99% with 20% false positives.