Table of contentWhy is it hard to detect secrets like API keys and other credentials?Why do code reviews fail at finding secrets in source code?What is a "good" secrets detection algorithm?What is a false positive in secrets detection?Are secrets detection algorithms language-dependent?
Secrets detection is probabilistic—that is to say that it is not always possible to determine what is a true secret (or true positive). Because secrets share very few common, distinctive factors, decisions must be taken by aggregating weak signals to make strong predictions.
One of the most common factors is that almost all secrets are strings that look random. We call these strings high entropy strings. The issue is that this common factor is not very distinctive: 99% of strings that look random in source code aren’t secrets. They are for example database IDs or other types of false positives.
Of course, some secrets have fixed patterns: AWS keys often start with AKIA for example. But most secrets do not. The portions of code surrounding them are also very different, depending on what the secrets are used for and how they are used by the developers in the context of specific applications. Usernames and passwords can be used to authenticate in many different ways, and it is really hard to distinguish between real and fake credentials used as placeholders for example.
All this makes it extremely challenging to accurately capture all true secrets without also capturing false positives. At some point, a line in the sand needs to be drawn that considers the cost of a secret going undetected (a false negative) and compares it to the outcome that too many false positives would create. Different organizations with different people, cultures and workflows will draw different lines!
Code reviews are great overall for detecting logic flaws or maintaining certain good coding practices. But they are not adequate protection for detecting secrets, mostly for two reasons:
Detecting secrets in source code is like finding needles in a haystack: there are a lot more sticks than there are needles, and you don’t know how many needles might be in the haystack. In the case of secrets detection, you don’t even know what all the needles look like!
Ideally, you want your detection system to achieve at the same time:
Balancing the equation to ensure that the algorithm captures as many secrets as possible without flagging too many false results is an intricate and extremely difficult challenge. Read more on evaluating secrets detection algorithms.
A false positive in secrets detection refers to when a secret candidate is wrongly marked as a true secret when it is in fact a non-sensitive string.
Typical examples of strings that can be mistaken for true secrets are:
Secrets detection is, for the most part, not language specific. Of course, there are some subtleties to take into account, like the way variables are assigned in any programming language. But there is no need to support all the different syntaxes in their greatest details. This means that the same algorithms can be applied to any project, in any programming language, without using things like Abstract Syntax Trees.