Top 5 Myths about API Security and What to Do Instead
Introduction to API Security Myths
Rob Dickinson: [00:00:00] The topic of our talk is the top five myths about API security and what to do instead. I’m Rob Dickinson, VP of engineering at Graylog. Let’s jump right into it. We’re going to go through five myths. Myth number one: attacks are rare.
Myth #1: Attacks Are Rare
It’s a terrible myth. We would like to think that our attackers are outliers.
Back in the old days, if you sampled API traffic by volume, you might see something like this: most of your users are valid users, with maybe a couple of bad apples in there. The other way this would manifest is folks saying something like, “We’ll worry about getting attacked when we reach some level of popularity.” But even then, you’re hoping that your attackers are outliers.
The Reality of API Traffic and Attacks
With modern traffic and modern APIs, the API environments are so hostile that in a lot of cases, we see something more like this: The attackers are not just significant; the attackers may actually [00:01:00] dominate the traffic by volume. If you’re a financial services company, for example, this is probably the world that you live in on a daily basis.
The Public Nature of Cloud APIs
The reality is a lot of folks are struggling with this idea. Just to give you two real-world quotes around this: we met with a major online media company, and one of the quotes that came out of that meeting was, “Our firewall blocks roughly half of the traffic, and we know there’s still significant attacks getting through.” That’s huge. That’s a huge amount of traffic. This is a very popular online property, so that’s literally millions of requests a month that are getting blocked. On the financial services side, here’s another quote: a CISO for one of the major crypto trading platforms saying that they assume 80 percent of traffic will be malicious. Everybody wants what they have. Everybody wants to steal their coin.
The other reality of this is that your cloud APIs are really not private. Attackers don’t care that your API is only supposed to be used by your mobile app. That’s an excuse we hear a lot. Attackers don’t care about that. If it’s accessible to them, they’re going to use it. Attackers also don’t care that your API is only supposed to be used by a specific partner or for one specific purpose. If that API is accessible, they’re going to hammer on it. The other thing that comes into this is that public clouds have known IP address ranges. When you’re deploying your application to a public cloud, you’re deploying it into an IP address range where it really is discoverable. That’s pretty scary. A lot of folks don’t acknowledge the risk of this. There’s really nothing private about a public cloud, as it turns out.
What that translates to is the new reality: attacks are a constant, everyday part of life. To validate that idea, we got some great feedback from specific customers and from the community, but we also wanted to try to [00:03:00] quantify this a little bit and take a data-driven approach.
Honeypot Applications Reveal Attack Frequency
How we proved this to ourselves was with some honeypot applications. We took some simple honeypot apps and deployed them to the cloud. Because they were honeypots, we didn’t have any legitimate users. We didn’t run any ads pointing to these apps, and we didn’t have any links pointing to them. We anticipated that the only folks who would be able to find these apps were folks port scanning or IP scanning through the known IP address ranges for the cloud providers. We used our technology to record those API calls, assuming that most of them would be coming from spiders, bots, and live attackers, because it’s not a real app and not something we promoted in any way.
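To make that concrete, here is a minimal sketch of the kind of honeypot we’re describing. It is not the app we actually deployed; it assumes Flask and a local JSONL file, both hypothetical choices: accept any request on any path, record the full call, and return a bland response.

```python
# Minimal honeypot sketch (hypothetical): accept any request on any path,
# record the full call to a JSONL file, and return a bland response.
# A real deployment would ship these records to a collector instead.
import json
import time

from flask import Flask, request

app = Flask(__name__)
LOG_PATH = "honeypot_calls.jsonl"  # assumed location; adjust as needed

@app.route("/", defaults={"path": ""}, methods=["GET", "POST", "PUT", "DELETE", "PATCH"])
@app.route("/<path:path>", methods=["GET", "POST", "PUT", "DELETE", "PATCH"])
def record_call(path):
    entry = {
        "ts": time.time(),
        "remote_addr": request.remote_addr,
        "method": request.method,
        "path": "/" + path,
        "query": request.query_string.decode("utf-8", errors="replace"),
        "headers": dict(request.headers),
        "body": request.get_data(as_text=True)[:4096],  # cap stored body size
    }
    with open(LOG_PATH, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return {"status": "ok"}, 200

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```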
What we found was maybe not completely surprising, but really good [00:04:00] confirmation of all the ideas we’ve talked about so far. Having done this multiple times now, across multiple cloud environments, we see that on average it only takes about 28 minutes for the first attacker to show up. You deploy an endpoint to a cloud environment, and not even an hour later your first attackers are showing up to try to break in. When you look at the volume, what we’ve measured is 154 attacks per day per endpoint, over 50,000 attacks per year per endpoint. That’s a really significant amount of malicious traffic, and it speaks to how hostile a lot of these API environments, these public cloud environments, really are. The other thing that surprised us is [00:05:00] that in a lot of cases these attack vectors aren’t really understood, and that’s why I want to jump into a quick demo here to show what we learned from one of these honeypots, because we had a case that really surprised even us.
What I’m showing you here still says Resurface; it’s soon to be rebranded as Graylog API Security. This is the technology we used to record those API calls and to do the threat analysis on what we were going to see here. The first couple of things that show up on this list of problems and issues we detected were things we expected to see. We expected to see leaked runtime errors and malformed bodies, because we built that behavior into the honeypot application on purpose. The restricted-file attacks are a very classic attack pattern, so we weren’t [00:06:00] surprised to see those at all. Then we saw this: the count is a little smaller here, but we saw basically a steady stream of redirect-oriented attacks. That was a total surprise. As it turns out, the framework we used to build the honeypot application has a default redirect mechanism that’s on by default. We had no idea that was the case, and redirecting through your property is a really useful technique for attackers. There were no meaningful redirections in our app; we weren’t doing anything like OAuth or anything else that would actually require redirects as part of our processing. So again, we thought we knew this app. We thought we designed this little app and knew everything about what it was going to do. We got good confirmation [00:07:00] about the kinds of attackers that showed up, and then this was a real light bulb moment for us: there is this default behavior around redirects, and we were completely unaware of it. How did we find it? We found it by monitoring the attack traffic itself. These systems are very complex. They have a lot of moving parts, and they change very quickly. I think this is a fabulous example of how even what seems like a very simple case can have unexpected behaviors compared to what you expected.
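Redirect probes like these are easy to spot once you’re recording the traffic. The sketch below is not our signature engine, just a toy illustration assuming the JSONL record format from the earlier sketch: it flags requests whose query strings carry redirect-style parameters aimed at hosts outside a small allow-list.

```python
# Toy detector (hypothetical, not the product's signature engine): flag recorded
# requests whose query strings carry redirect-style parameters pointing at
# external hosts -- the pattern behind the default-redirect surprise.
import json
from urllib.parse import parse_qs, urlparse

REDIRECT_PARAMS = {"url", "next", "redirect", "redirect_uri", "return", "return_to"}
TRUSTED_HOSTS = {"myapp.example.com"}  # assumed allow-list for this sketch

def suspicious_redirects(log_path="honeypot_calls.jsonl"):
    hits = []
    with open(log_path) as f:
        for line in f:
            entry = json.loads(line)
            params = parse_qs(entry.get("query", ""))
            for name, values in params.items():
                if name.lower() not in REDIRECT_PARAMS:
                    continue
                for value in values:
                    host = urlparse(value).netloc
                    if host and host not in TRUSTED_HOSTS:
                        hits.append((entry["remote_addr"], entry["path"], value))
    return hits

if __name__ == "__main__":
    for addr, path, target in suspicious_redirects():
        print(f"possible open-redirect probe from {addr} on {path} -> {target}")
```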
Now that we’ve dispelled myth number one that attacks are rare, let’s talk about one of my next favorite myths. This idea that attackers are outsiders.
Myth #2: Attackers Are Outsiders
So the myth here and what we hope is going to happen is that our attackers are outside of our secure perimeter or [00:08:00] our safe jurisdiction. That’s a very human thing to hope for. There’s something very human about saying, “Let’s build a wall between us and the people that we don’t like.” In a lot of early web-based systems, this idea of drawing a very hardened secure perimeter was a big part of the security posture.
The Reality of Modern Threats
The reality, though, when we look at modern systems, is that these threats are everywhere. Even if you think you have some kind of secure perimeter, chances are that perimeter is much more porous than it used to be. One of the cases you really want to be aware of is an authenticated user being an attacker. In a lot of cases where we’re thinking about perimeter security, we’re thinking about blocking the outsiders. We’re blocking the script kiddies, the port scanners, all [00:09:00] the network-level attacks. That’s all great, and you should do that. But you have to recognize that if a user signs up, and they’re a valid user signing in with valid credentials, they’re going to breeze right through that secure perimeter. They’re going to go right through your firewall. The idea of an authenticated user being an attacker is very much an application-level attack, but one that has really grown in popularity in the last few years. Some other variations of this, again working within that secure perimeter, are two different kinds of insider attacks. You could have an attack carried out literally by an insider: an employee of the company, a contractor, someone else affiliated with the company, somebody who potentially has access to the equipment. Any of those people who are insiders, who have any kind of trusted access, [00:10:00] can potentially be attackers. Gartner says insiders are one of the fastest-growing segments of attack traffic. Another variation of that is host takeover. This goes more to the safe-jurisdiction side of things. You might say, “We don’t have any customers in country X, so we’ll block all of the traffic from that country.” What happens then is that attackers based in that country will just stage their attacks from somewhere else: from a different jurisdiction or, ideally from their point of view, from within your secure perimeter. These are all areas where perimeter security is really not faring as well as it used to.
The other thing that you see in the news quite a lot, and that we’ll see more of moving forward, are supply chain attacks. These are very much insider-oriented attacks. The [00:11:00] important thing to acknowledge here is that a lot of APIs depend on other APIs, which depend on other APIs; you tend to have these very long dependency chains. There’s also a ton of software components that go into making a modern microservice, and all of those components have threat aspects that can show up essentially as insider attacks, right? You are running your supply chain inside your secure perimeter, and that supply chain is very long and complicated. We have to acknowledge that these threats can come from anywhere. It’s not just overseas attackers that we can easily block based on their country of origin.
The Limitations of Perimeter Security
The other thing about perimeter security is that you have to acknowledge some really significant trade-offs at the perimeter. I’m not [00:12:00] anti-WAF. I’m certainly not anti-perimeter security. Everybody needs a firewall. Everybody needs a WAF. Get the best WAF that you can. But there are trade-offs involved, and everybody running a WAF today is going to recognize them. As you ask the WAF to be more aggressive about what it blocks, the performance impact of the WAF goes up. It adds latency to those transactions. That matters, because there’s a direct correlation between the performance of transactions and the value of that service. This isn’t just the inconvenience of being a little bit slower: you cross a certain threshold in performance and people just nope out. The other thing that happens as you ask the WAF to be more aggressive is that your number of false positives typically [00:13:00] increases, and a lot of organizations have a very low tolerance for dealing with false positives. The risk of losing revenue because a legitimate transaction was blocked is a risk that a lot of organizations struggle with. That’s just the reality. Additionally, the perimeter is not static. WAFs need to be managed, operated, and configured properly; if they’re not, it’s easy for intruders to breach the perimeter. Detecting whether the firewall is functioning as intended, and providing feedback on how it could perform better, is crucial in managing these trade-offs.
Another issue is that many WAFs are quite generic. They’re good at blocking common attacks but not as adept at being fine-tuned for specific applications. This makes it challenging to harden a WAF against particular user behaviors, applications, and the types of threats and risks that need to be mitigated. Performance issues and false positives can arise from a WAF that’s not properly tuned, highlighting the importance of firewall management throughout its lifecycle.
The Evolving API Economy and Zero Trust Model
APIs are increasingly designed to be shared, and companies that participate in the API economy tend to have better earnings. Gartner cites that firms with public APIs generate 12 percent more revenue than competitors without them. This is significant for business health, but the value is only realized if APIs are composable. The push for internal services to become external allows outside entities to create new systems with these services, which challenges the notion of perimeter security.
The zero trust model addresses this challenge by not relying on a secure perimeter but instead integrating checks and authentications into all systems, whether they’re internal or external. This model greatly assists in managing security.
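As a rough illustration of what “checks and authentications in all systems” can look like in code, here is a minimal sketch. It assumes JWT bearer tokens and the PyJWT library, which are assumptions of this example rather than a prescription: every service verifies the token and its scope on every request, even for internal service-to-service calls, instead of trusting network location.

```python
# Zero-trust sketch (hypothetical): every service verifies the caller's token on
# every request, even for "internal" traffic -- no call is trusted just because
# it originated inside the perimeter.
import jwt  # PyJWT
from flask import Flask, abort, request

app = Flask(__name__)
ISSUER_PUBLIC_KEY = open("issuer_public.pem").read()  # assumed key distribution

def require_caller():
    auth = request.headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        abort(401)
    try:
        # Verify signature, expiry, and audience on every single request.
        return jwt.decode(
            auth.removeprefix("Bearer "),
            ISSUER_PUBLIC_KEY,
            algorithms=["RS256"],
            audience="orders-service",  # hypothetical service name
        )
    except jwt.InvalidTokenError:
        abort(401)

@app.route("/internal/orders/<order_id>")
def get_order(order_id):
    claims = require_caller()          # verified identity, not network location
    if "orders:read" not in claims.get("scope", "").split():
        abort(403)                     # least privilege: check the scope too
    return {"order_id": order_id, "requested_by": claims["sub"]}
```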
Graylog’s Approach to Asynchronous Detection
Graylog focuses on asynchronous detection and alerting as an alternative to perimeter security. By recording requests and responses through the firewall, Graylog can provide feedback on the firewall’s effectiveness and identify areas for improvement. This includes detecting misconfigurations or alternate routes that bypass the firewall.
Graylog’s system allows for asynchronous detection of quality, security, and threat-related issues without slowing down transactions. This offers a new balance between real-time protections in the firewall and asynchronous detections in the monitoring system. Additionally, Graylog makes it easy to create custom signatures without coding, allowing for API-specific or domain-specific protections.
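The product does this with its own capture and detection engine; the following is only a toy sketch of the asynchronous shape of the idea, using a hypothetical in-process queue and a couple of made-up signatures. The point is that the request path only hands off a copy of the transaction, so the detection work never adds latency to the caller.

```python
# Asynchronous detection sketch (hypothetical): the request path only enqueues a
# copy of the transaction; a background worker applies the detection rules, so
# slow or numerous rules never add latency to the caller.
import queue
import re
import threading

CAPTURE = queue.Queue()

RULES = [
    ("sql injection probe", re.compile(r"union\s+select|or\s+1=1", re.I)),
    ("path traversal probe", re.compile(r"\.\./")),
]

def capture(method, path, status, body):
    """Called from the request path: cheap, non-blocking handoff."""
    CAPTURE.put({"method": method, "path": path, "status": status, "body": body})

def detector():
    """Runs off the request path; alerting here can be as expensive as needed."""
    while True:
        txn = CAPTURE.get()
        haystack = txn["path"] + " " + txn["body"]
        for label, pattern in RULES:
            if pattern.search(haystack):
                print(f"ALERT: {label}: {txn['method']} {txn['path']}")
        CAPTURE.task_done()

threading.Thread(target=detector, daemon=True).start()

# Example handoff from a request handler:
capture("GET", "/users?id=1 OR 1=1", 200, "")
CAPTURE.join()  # only for the demo, so the alert prints before exit
```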
Myth #3: IP Addresses Equate to Users
A common misconception is that IP addresses equate to users. IP blocking is effective against certain attacks and for enforcing jurisdictions, but it’s less effective against application-level attacks. User tokens, which map far more closely to individual users, are harder to extract and track than IP addresses. Graylog’s view is that while IP address blocking is useful, it’s not as effective at the higher application levels.
The Complexity of IP Address Identification
An IP address may represent multiple users, or a single user may use multiple IP addresses over time, which complicates the identification of individual users. There’s nothing wrong with that. In fact, we see this all the time, right? Anything that’s managed through DHCP: you go to your coffee shop, you connect to the network, and you’re not being assigned a network address that no one else has ever used and that will uniquely identify you. You’re going to get the IP address that someone else was using earlier in the day, because addresses are rotated all the time. These aren’t things that are easily changeable. The concepts we’re talking about here are baked into TCP/IP and how the internet works at a fundamental level.
We just have to be careful about equating users and IP addresses. We can’t conflate those at a really fine grain. If you’re still struggling with the idea that maybe IP addresses are individual users in certain cases (and it’s fine if you are), the last data point I’ll leave you with is this: think about when you log into Gmail, or Google, or Facebook, whatever it is. What do you log in with? You’re not logging in with your IP address. You’re logging in with a username. And there’s a reason for that: all of the things we just covered. You really cannot count on an IP address to be a durable, identifying token that identifies one particular account.
IP blocking still has lots of merit for dealing with safe jurisdictions and network-level attacks, things like DDoS attacks, where it really makes sense to trace back to an IP range. Do that all day and feel good about it. But be careful about crossing the threshold and thinking of an IP address as an individual user, because it really isn’t, and that model breaks down very quickly.
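One practical consequence, sketched below with hypothetical names and limits: keep a coarse network-level limit keyed on the IP for floods, but tie application-level policy to the authenticated account that the token identifies.

```python
# Sketch (hypothetical names and limits): coarse network-level throttling can key
# on the IP, but application-level policy should key on the authenticated account,
# because one IP can hide many users and one user can move across many IPs.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
buckets = defaultdict(deque)  # key -> recent request timestamps

def allow(key, limit):
    now = time.time()
    q = buckets[key]
    while q and now - q[0] > WINDOW_SECONDS:
        q.popleft()
    if len(q) >= limit:
        return False
    q.append(now)
    return True

def handle_request(client_ip, account_id):
    # Network-level guardrail: very generous, aimed at floods from one address.
    if not allow(("ip", client_ip), limit=5000):
        return 429
    # Application-level policy: tied to the account the token identifies.
    if not allow(("account", account_id), limit=100):
        return 429
    return 200

print(handle_request("203.0.113.7", "acct-123"))  # 200 on the first call
```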
All right, myth number four: authentication and TLS are all you need.
Myth #4: Authentication and TLS Are Sufficient
Another one of my favorite myths. The reality is that authentication and encryption are necessary, but they’re not necessarily sufficient.
The Vulnerabilities of Authentication and Encryption
Authentication and TLS mechanisms can break. And when we look at the numbers on this, authentication failures are at the heart of many of the breaches that we see reported in the news all the time. Part of that is that authentication is something that typically applies system-wide. So as an attacker, if you can crack the authentication system, or if you can find something there that isn’t configured or it’s not working properly, that’s typically going to give you very broad access across the system. This is a really great entry point for attackers. They know that, hopefully, everything is authenticated. And so if you can crack the authentication system, you can typically get a lot of data out of the system.
Configuration Issues and Authenticated Attackers
The other reality here, looking at this from the development side: a lot of folks might say something like, “Yeah, but we only want our systems to run with TLS. We only want our systems to run with encryption.” That’s a lot more difficult in reality than it sounds. All of these microservices are typically developed and tested without encryption. If you’re a developer running software on your laptop as you develop, or you’re working in certain kinds of QA environments, it’s very commonplace not to go through the extra steps of deploying TLS. TLS takes extra work, and that extra work comes at a cost. When you look at the tech stacks actually running these APIs, almost all of the ones I’ve seen operate in HTTP mode by default, and even when you turn on TLS, they still have the ability to downgrade and run without it. They’re developed and tested that way. So it’s extremely difficult to ensure that your microservices are only running in TLS mode in production. It’s really easy to have a configuration issue or an environment-specific issue where the TLS layer is not working as expected, and the microservices are perfectly happy running without that layer of protection. It’s an easy mistake to make.
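One way to catch that mistake is to check it continuously rather than assume it. Here is a small sketch of that kind of check, with placeholder hostnames: probe each supposedly TLS-only endpoint over plain HTTP and fail loudly if it answers. Depending on your policy, an HTTP-to-HTTPS redirect may be acceptable; this sketch treats any plaintext answer as worth reviewing.

```python
# Sketch: verify that endpoints which are supposed to be TLS-only actually refuse
# plaintext HTTP. Hostnames below are placeholders for your own services.
import urllib.error
import urllib.request

ENDPOINTS = [
    "payments.internal.example.com",
    "accounts.internal.example.com",
]

def serves_plaintext(host, timeout=5):
    """Return True if the host answers a plain-HTTP request at all."""
    try:
        urllib.request.urlopen(f"http://{host}/healthz", timeout=timeout)
        return True                      # got a response over plaintext HTTP
    except urllib.error.HTTPError:
        return True                      # an HTTP error is still a plaintext answer
    except (urllib.error.URLError, OSError):
        return False                     # refused / unreachable over plaintext

for host in ENDPOINTS:
    if serves_plaintext(host):
        print(f"WARNING: {host} still answers plain HTTP; TLS is not being enforced")
```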
The other kind of configuration issue we can get into here is around the authentication layer, or even at the gateway level. The Optus hack in Australia, for example, is a good example of this. That was a data leak based on the fact that the API endpoint in question didn’t have authentication on it. It wasn’t deployed properly. It wasn’t behind the gateway the way it was designed to be. The microservice was developed assuming that authentication would be provided by the gateway, but then it got deployed the wrong way, not sitting behind a gateway, which means it was happy to run without any authentication at all. These configuration issues are difficult to detect in production. They’re easy to make, and they’re also easy to break, because we’re changing these systems all the time. We’re rolling out new features, rolling out new services, hardening things, making things better. Any time we change the configuration or the working state of the system, we have the potential to break some of the security protections that we hope are there all the time.
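In the spirit of that example, the same kind of check can be automated for authentication. The sketch below uses placeholder URLs: call each supposedly protected endpoint with no credentials and flag anything that doesn’t answer 401 or 403.

```python
# Sketch: call supposedly-protected endpoints with no credentials and flag any
# that answer with data instead of 401/403 -- the misconfiguration behind
# "the gateway was supposed to handle auth." URLs are placeholders.
import urllib.error
import urllib.request

PROTECTED_ENDPOINTS = [
    "https://api.example.com/v1/customers",
    "https://api.example.com/v1/accounts/12345",
]

for url in PROTECTED_ENDPOINTS:
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            print(f"FAIL: {url} returned {resp.status} with no credentials")
    except urllib.error.HTTPError as err:
        if err.code in (401, 403):
            print(f"ok:   {url} correctly rejected the unauthenticated call")
        else:
            print(f"WARN: {url} returned {err.code}; review whether that leaks anything")
    except urllib.error.URLError as err:
        print(f"skip: {url} unreachable from this network ({err.reason})")
```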
Now you could look at this list and say, “Yes, that’s all valid, but those are things the API provider has control over,” and how much this list bites you depends on your level of sophistication. But I would argue that every organization, large or small, is going to deal with these issues at one point or another. There’s another side to this coin, which is that even if your authentication is working perfectly, and even if your TLS and your other means of encryption are working perfectly, you are still at risk from authenticated attackers. What we mean by an authenticated attacker is an attacker who signs up as a customer. One of the worst cases is an attacker who actually puts down their credit card, or whatever method of payment, and signs up as a paying customer. It’s like the scene in a bank heist movie where the first thing the robbers do is go to the bank and open a safety deposit box, so they can get a glimpse of what the vault actually looks like. The idea is that an attacker can come in, sign up as a customer, and gain access that way. This is really hard to prevent, because most organizations want to get more customers. Now you could say, “Hang on, we have strong know-your-customer requirements,” or that you’re assured somehow that attackers aren’t just showing up as valid users. And obviously, a lot of financial institutions have know-your-customer requirements. Does that make them immune to attack? Absolutely not. The reality is that any valid account can be used as an attack surface from this perspective. That also doesn’t necessarily mean the account holder is in on the attack. I may sign up as a customer at the bank, and an attacker may take over my valid account without my knowledge and use it as a way to stage an attack against the bank. This is also very difficult to protect against, and you have to have a specific security program that identifies these kinds of attacks. Everything we’ve talked about here around authenticated attackers goes right through the firewall and right through the API gateway. It looks like it’s being done on behalf of a real customer. That means we have to be able to tell the attackers from the valid users based on what their actual activity is: not just their jurisdiction, not just whether they have valid credentials, but an audit of the actual activity of those users.
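What “auditing the actual activity” can mean in practice is sketched below, with hypothetical field names and thresholds: count the distinct resources each authenticated account touches per hour and flag accounts that enumerate far more than a normal user would. Real systems build much richer behavioral baselines, but the principle is the same.

```python
# Toy sketch: even fully authenticated traffic gets audited per account. Here we
# count distinct resources each account touches per hour and flag accounts that
# enumerate far more than a normal user would. Field names are hypothetical.
from collections import defaultdict

ENUMERATION_THRESHOLD = 500  # distinct resources per account per hour (assumed)

def flag_enumerators(records):
    """records: iterable of dicts like {"account": ..., "hour": ..., "resource": ...}"""
    seen = defaultdict(set)
    for r in records:
        seen[(r["account"], r["hour"])].add(r["resource"])
    return [key for key, resources in seen.items()
            if len(resources) > ENUMERATION_THRESHOLD]

sample = [{"account": "acct-9", "hour": "2024-05-01T14", "resource": f"/customers/{i}"}
          for i in range(1200)]
print(flag_enumerators(sample))  # [('acct-9', '2024-05-01T14')]
```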
All right, our last and final myth, one that’s very close to my heart as the software developer. This is something that we hear about all the time.
Myth #5: Developers Don’t Care About Security
One of the biggest problems in security is this idea that developers don’t care about security. If I had a nickel for every CISO I’ve heard say some variation of this, I’d be a very rich person. This is really more of a cultural item than the myths we looked at before, which were much more technical in nature. I think this really is something that’s holding back our industry in terms of how we think about security, especially application-level security.
The Role of Developers and QA in Security
My position is that developers really do care about security, so I bristle a little bit at the idea that nobody cares on that side of the house. A good developer will always try to balance correctness, performance, security, complexity, and cost. That’s what you do as a developer: you’re constantly trying to optimize for all of those things and keep them in balance. That’s what makes you a good developer and a more successful developer over time. So security is not the only consideration, but it is one of the considerations.

Here’s where it starts to get more difficult for the developers. Developers and DevOps staff typically don’t have access to those production environments. As a developer, how do I care about security when I can’t really see the impact of security in the production environment? What I see as a developer is the system running on my laptop, or the system running in a sterile staging environment. Not to throw stones at our friends in security, but I love it when I hear folks say things like, “If the system had been designed to be secure from the outset, we wouldn’t be having all these problems in production.” It’s very difficult to design for requirements that you can’t see with your own eyes and can’t really test. By the same token, it’s bad security policy to just make all your production systems available to whoever wants to get in and see them; we’re not advocating for that. It’s very difficult to react to things we can’t see, and very difficult to design protections around things we can’t really understand.

The other part you have to acknowledge is that developers don’t own the roadmap in a lot of cases. In most organizations, product changes have to be approved by product management; they have to be scheduled and put into a release. It’s not just a matter of going to the developer, waving your hands, and saying you have to care about security more. Security concerns have to fit in balance with all these other things, and there are always lots of different things competing for developers’ attention. If you want your developers to care more about security, you have to acknowledge this. You have to open up ways of sharing that data, the attack data and the production incidents, so it can flow back through the development cycle. You need your product management team on board: when your developers are working on security, they’re not working on other things, and you need product management to embrace security as one of the features and capabilities you’re improving over time. So not just hand waving, but really putting your developers in a place where they can make the right choices and keep correctness, performance, security, complexity, and cost all in balance. That makes developers like me really happy, when you have the information to do that.
The other reality here is that it’s not just development; it’s also QA. QA really cares too, and we should acknowledge that. Good QA teams try to anticipate and validate all the possible inputs and all the possible states of the system. But the QA team never has access to production; they only work with sterile staging environments. How is QA supposed to know that they’re adequately testing the threat vectors that should be expected in production? They typically can’t; they try to do their best. Just as we said before, without direct exposure to what those security issues look like, it’s very difficult to make those kinds of threats part of your test plan.

The other thing to acknowledge, when we’re asking development and QA to care more about security, is that any change the developers make has to be tested. The tests involved have to be updated, or written if none exist, to cover the new capabilities as the security posture gets stronger. Not everyone in security knows what the development cycle or the test cycle looks like. In most environments, you can’t just have a firefighting exercise with development and say, “We’ve got this thing we need to fix, and you’re going to have to fix it by tomorrow.” As we said, you have to have product management involved, but you absolutely have to have QA involved. Nothing gets released until the QA team says it’s okay to release it. That has to be acknowledged as part of the landscape.

Another reality is that you end up with a lot of incomplete tests. As we said before, because QA doesn’t have access to these environments directly and can’t see these behaviors directly, they’re trying to be imaginative. But you can’t release something without having adequate tests in place. There’s real, material work here: if you’ve identified a new issue, it’s probably new because it fell through all of your existing test plans, so you’re going to have to improve those test plans, and your test cycles are going to get longer as you test for more and more of those security issues.

I would challenge anybody on the security side: if you’re feeling friction between your organization and the development organization (most of us are, and that’s totally natural), the thing to do is to realize that it’s not just an enthusiasm gap you have to cross. We really have to work to get requirements and understanding over to the development team and the QA team so they can better predict and anticipate what needs to be done. And by all means, if you do that within the existing development and QA process, you’ll make everyone involved in that process happier. That’s a huge part of this.
All right. To recap with a few last recommendations, and to sum this up like GI Joe always said back in the day: knowing is half the battle. Here are the things to keep in mind, all on one slide.
Recommendations for Improving Security Practices
For modern APIs and modern cloud-based systems, you really should be assuming constant attacks. Don’t wait. As we showed before, those attackers are going to show up right away, and the attacks will last as long as that property is reachable. So you want to prepare for this. You want to practice for those attacks. Being under attack is going to be the normal status quo, so you want to get more and more comfortable with that.
In terms of our thinking, we want to shift our thinking from thinking about secure perimeters and hiding behind those secure perimeters to really embracing zero trust techniques. That’s a very powerful mental model to apply to your secure design. And I think the folks who’ve already made that shift to more of a zero trust mindset are having a much better time in their security posture.
I hope I made the case here that IP addresses are not users. You should always catch yourself, or at least double-check your thinking, when it comes to this: is it really a policy that you can attach to an IP address, knowing that those IP addresses are multiplexed? Or is it a policy that really needs to be tied to the user account, based on the token they provide or the name they log in with? Those are separate concepts from the application point of view, so don’t conflate them.
Also, don’t assume that authentication, TLS, and other security features are always configured and working properly. You have to have checks and balances here. You have to have a way of auditing that the authentication is right and that you’re not sending sensitive information over non-secure protocols. That has to be an active part of your security posture. It’s not good enough after a breach to say, “Yeah, things weren’t supposed to work that way.” That’s not a valid defense. You have to have telemetry and processes that assure these security features are working the way you hope they are.
The last one is that you’ve got to have good baton passing between security and your development and QA groups, and you’ll get a lot of love from the organization if you can crack this. As we’re able to share more information and more of that context, we’re able to have more empathy on both sides and mesh the security practices with the development and QA practices. If we can turn that from an adversarial relationship into a cooperative relationship, the benefits will show up everywhere.
Conclusion
So thank you for attending this session. I know we covered a lot of ground. Thank you.