The Top 5 Most Common Security Issues I Discover When Reviewing Code
The most important part of my job as a code reviewer is helping other developers keep their systems secure. It’s something I don’t take lightly, and there are many ways that an application can fail to pass muster from a security standpoint. Here are the things I find most often—hopefully, there will be a few items here that stick out to you the next time you see them in a PR.
NOTE: The following code examples have been contrived to provide detailed, illustrative representations of real security issues I’ve found while reviewing code. They have not been pulled from actual codebases or commits. They are written in Python and Ruby, but the concepts apply broadly.
1. Sensitive Information in Log Files
Personally Identifiable Information (PII) is governed by a few different laws, GDPR in Europe and CCPA in California being the two most relevant examples at the time of publication, though more legislation on this topic will arrive over the next few years. These laws govern the ways that companies can store and use personally identifying information, such as names, physical addresses, email addresses, ID numbers, and the like. Generally, this legislation requires that such information be stored securely. Logging is one of the biggest violators of this requirement. Take the following example:
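The following is a contrived Python sketch of the pattern (the `User` model and handler names are invented for illustration):

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class User:
    def __init__(self, user_id, email):
        self.id = user_id
        self.email = email

def handle_password_reset(user):
    # Looks like a harmless breadcrumb, but it writes the user's
    # email address (PII) straight into the application logs.
    message = "Password reset requested for %s" % user.email
    logger.info(message)
    return message
```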
This is ostensibly reasonable and safe, but it's actually leaking users' email addresses into log files. What makes logging this information a security issue? Logging systems tend not to have the same protections as a database, and attackers know this and can exploit it. The email address in the above code should be replaced by an obfuscated or encrypted User ID number.
Secrets can also find their way into log files. As the name implies, "secret" keys are credentials that should never be exposed, whereas some API keys are designed to be public. I find that the chief offenders are database drivers that print out the entire connection string, password included, on a connection failure. Connection attempts for these sadly misguided drivers should be wrapped in an exception handler that swallows or sanitizes the error rather than printing it to the logs. Lesser offenders include debug logging that sneaks into a pull request, and development-only logging that ships with the wrong log level.
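A hedged sketch of one way to wrap such a driver. The `connect` callable and DSN here are placeholders for whatever driver and connection string your application actually uses:

```python
import logging

logger = logging.getLogger(__name__)

def connect_with_safe_logging(connect, dsn):
    # `connect` stands in for your driver's connect function and
    # `dsn` for the connection string, which embeds the password.
    try:
        return connect(dsn)
    except Exception:
        # Record that the connection failed, but never log the
        # exception itself: some drivers embed the full connection
        # string in the error message.
        logger.error("Database connection failed (details withheld)")
        # `from None` drops the original exception from the chain
        # so it can't resurface in a later traceback.
        raise ConnectionError("database unavailable") from None
```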
2. Poor Cryptography Choices
Typically this involves systems that use a one-time key or other token. I've seen folks take the easy way out by Base64-encoding a timestamp (highly guessable and not at all pseudo-random), or by choosing a collision-prone hash like MD5 for a unique key without a unique constraint on the table. Other failures include using encryption algorithms known to be broken or too weak for modern security requirements.
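To make the contrast concrete, here is a sketch of both approaches. The function names are invented; Python's standard `secrets` module is the idiomatic source of cryptographically strong tokens:

```python
import base64
import secrets
import time

def weak_reset_token():
    # Guessable: anyone who knows roughly when the token was
    # issued can enumerate candidate values by decoding timestamps.
    return base64.urlsafe_b64encode(str(time.time()).encode()).decode()

def strong_reset_token():
    # Unguessable: 32 bytes from the OS's cryptographically
    # secure random number generator.
    return secrets.token_urlsafe(32)
```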
3. Insufficient Access Controls
As the 2021 OWASP Top 10 puts it: "Broken Access Control moves up from the fifth position to the category with the most serious web application security risk; the contributed data indicates that on average, 3.81% of applications tested had one or more Common Weakness Enumerations (CWEs) with more than 318k occurrences of CWEs in this risk category. The 34 CWEs mapped to Broken Access Control had more occurrences in applications than any other category."
The most common example of this is related to row-level security, where a user can access only a subset of the information in a table. It's easy to forget to add a filter to a query or a check to an endpoint, and the mistake can be hard to spot in the UI if the filtering is happening correctly elsewhere in the code. This is most easily discovered by checking that the query filters on index actions are also present on all of the other actions that operate on individual rows.
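A contrived sqlite3 sketch of the row-level mistake (the table and column names are invented). The index action is scoped correctly; the show action needs the same predicate, which is the part that tends to get forgotten:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER, user_id INTEGER, total INTEGER)")
conn.executemany("INSERT INTO invoices VALUES (?, ?, ?)",
                 [(1, 100, 50), (2, 200, 75)])

def list_invoices(user_id):
    # Index action: correctly scoped to the requesting user.
    return conn.execute(
        "SELECT id, total FROM invoices WHERE user_id = ?", (user_id,)
    ).fetchall()

def get_invoice(user_id, invoice_id):
    # Show action: the same scoping is just as necessary here.
    # Without the user_id predicate, user 100 could read user
    # 200's invoice simply by guessing its ID.
    return conn.execute(
        "SELECT id, total FROM invoices WHERE id = ? AND user_id = ?",
        (invoice_id, user_id),
    ).fetchone()
```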
The other common access control failure that I see is missing session checks on endpoints—when, say, every endpoint requires a Python decorator to check the session, it’s easy to leave that off. The best strategy here is to default everything to closed and then use decorators to open up access instead.
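A minimal sketch of the default-closed idea, with an invented dispatcher and a `@public` decorator serving as the explicit opt-out (real frameworks provide hooks for this, but the shape is the same):

```python
def public(handler):
    # The explicit opt-out: only handlers marked @public skip
    # the session check.
    handler.is_public = True
    return handler

def dispatch(handler, session):
    # Default deny: a handler nobody remembered to annotate stays
    # closed rather than silently open.
    if not getattr(handler, "is_public", False) and not session.get("user_id"):
        return {"status": 401}
    return handler(session)

@public
def health_check(session):
    return {"status": 200}

def account_page(session):
    return {"status": 200, "user": session["user_id"]}
```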
4. Unsecured Caches
A pattern that I'm seeing more and more as single-page applications become more complex and common is cache layers exposed to the front-end, where any cache entry can be loaded by anyone who knows its key. This can leak sensitive information across sessions and may enable privilege-escalation attacks, particularly if the session is used to store access keys. When implementing front-end cache access, it's important to limit reads to only those items written by the current user. If it's a shared cache, writes should be verified server-side to avoid injection attacks based on bad behavior from users.
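One way to sketch that limit is a cache facade that namespaces every key by the authenticated user, so a user can never read another user's entry even by guessing the raw key. The class and backend here are invented for illustration:

```python
class UserScopedCache:
    def __init__(self, backend):
        # `backend` is any mapping-like store; a plain dict here,
        # but it could be a Redis client or similar in practice.
        self._backend = backend

    def _scoped(self, user_id, key):
        # Every key is prefixed server-side with the user's ID,
        # so the client never controls the full key.
        return f"user:{user_id}:{key}"

    def get(self, user_id, key):
        return self._backend.get(self._scoped(user_id, key))

    def set(self, user_id, key, value):
        self._backend[self._scoped(user_id, key)] = value
```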
5. Trusting the Client Too Much
If you’re new to development, you can think of the client as the “front-end” that users interact with and the “back-end” as the systems that power the front-end.
For example, say an application has a feature where users can upload images. The front-end client may validate that a selected file appears to be the expected type (e.g., the file name ends in ".png" or ".jpg"), but this validation can be easily circumvented by an attacker looking for a way to insert an executable file into the system. Putting too much trust in the client for this validation opens a security hole. The application's back-end should also validate that the file is the correct and expected type.
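A minimal server-side sketch of that back-end check: inspect the file's leading "magic" bytes rather than trusting the client-supplied extension. The PNG and JPEG signatures shown are the real ones; the helper names are invented:

```python
# Well-known file signatures ("magic numbers").
MAGIC_PREFIXES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
}

def detect_image_type(data: bytes):
    # Identify the file by its content, not its name.
    for prefix, kind in MAGIC_PREFIXES.items():
        if data.startswith(prefix):
            return kind
    return None

def validate_upload(filename: str, data: bytes) -> bool:
    # The extension alone proves nothing: an attacker can rename
    # an executable to photo.png. Verify the bytes as well.
    return detect_image_type(data) is not None
```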
While I don't have evidence for this beyond personal experience, PullRequest gives engineers in its reviewer network, like me, a unique lens: we review code for a multitude of engineering teams. And from what I've observed, this is a visible and growing trend.
I believe the increase in this behavior is a result, at least in part, of the increasing specialization in the industry—web development has become increasingly fragmented into front-end and back-end specialists and requires more collaboration between the two groups to produce functional software. When this collaboration isn’t going well, there can be a tendency for back-end concerns to migrate to the front-end, which can create security vulnerabilities.
These five security issues are especially important to catch in pull request code review because they're generally very unlikely to be caught in QA. They may be surfaced in a routine penetration test (or pen test) if your organization participates in these, but pen testing usually happens at long intervals, while pull request code review is regular, ongoing, and proactive; a pen test will usually uncover security issues that already exist. In short, it's much safer to catch security issues in code review before you, or an attacker, discover them in production.
If I had only one bit of advice that I could beam into every programmer’s mind, it would be to check every single thing you write to logs for PII and secrets, as well as every error you throw for the same. This is the most common recurring issue that I catch time and time again.
Post author: Will Barrett.