PCI and SHA-1
In jPTS’ user interface, we need a way to click on a given card and show all transactions for that card over a specified period of time. Because all sensitive data is stored using AES-256
in the secureData
column of the tl_capture
table using the CryptoService—a column that can be further encrypted at the database level—we need a lightweight, database index-friendly, one-way consistent hash of the primary account number.
Because computing a hash these days is an extremely fast operation, knowing the last four digits of a 16-digit PAN and the hash is enough information to brute-force the full PAN in a matter of milliseconds. Therefore, jPTS uses a dynamic per-BIN secret that incorporates several layers of defense.
Here is a screenshot of the UI page that requires this feature. When you click on a specific card, the transactions for that card alone are displayed.
The 128-bit SHA-1 algorithm, the same as other hash algorithms can be brute forced to produce a collision under certain circumstances. With today’s CPUs, finding a collision would take approximately 2600 years, a time that can be reduced to about 100 years using a cluster of GPUs. In 2017, the first collision for SHA-1 1 was found. Further research has provided additional cryptanalysis strategies that could theoretically reduce this time to approximately 24 days. (See also 2345).
Understanding Collisions
Imagine you have a PDF file that has the SHA-1 hash:
845cfdca0ba5951580f994250cfda4fee34a1b58
Using the techniques mentioned above, it is theoretically feasible to find another file different from the original one that has the same hash, though this requires significant computing effort. However, this scenario applies to large files, not to small 16 to 19 digit Primary Account Numbers (PANs).
When dealing with a limited number of digits, especially where the last digit must satisfy a modulus 10 condition (LUHN)6, finding a collision is not possible as there are no collissions. Studies have shown that the minimum size for a collision is approximately 422 bytes7. This refers to full 8-bit bytes, not just a restricted pattern within the ASCII range of 0x30 to 0x39 (representing digits 0 through 9).
Going to an extreme, imagine you have just one digit, from 0 to 9:
# | SHA-1 hash |
---|---|
0 | b6589fc6ab0dc82cf12099d1c2d40ab994e8410c |
1 | 356a192b7913b04c54574d18c28d46e6395428ab |
2 | da4b9237bacccdf19c0760cab7aec4a8359010b0 |
3 | 77de68daecd823babbb58edb1c8e14d7106e83bb |
4 | 1b6453892473a467d07372d45eb05abc2031647a |
5 | ac3478d69a3c81fa62e60f5c3696165a4e5e6ac4 |
6 | c1dfd96eea8cc2b62785275bca38ac261256e278 |
7 | 902ba3cda1883801594b6e1b452790cc53948fda |
8 | fe5dbbcea5ce7e2988b8c69bcfdfde8904aabc1f |
9 | 0ade7c2cf97f75d009975f4d720d1fa6c19f4897 |
You can see there are no collisions. If we add two digits, there are no collisions either. With three digits, still none. Even with 16 digits, none. At 19 digits, none. And with 100 digits, still none. The shortest collision theoretically possible is around 422 bytes, but we're talking about full 8-bit bytes here, not just limited to the scope of 0x30 to 0x39 values, which also need to honor the LUHN algorithm.