re:Invent 2023, My Thoughts from Inside the Madness
If re:Invent and I were in a relationship, we'd be at the Tin/Aluminum anniversary by now. Of course, there were a few years in there we spent apart…but everyone did, so it's OK!
I'm going with a Thanksgiving theme for this, since re:Invent interrupted my Thanksgiving plans this year.
Setting the table
If you have stumbled across this post by accident, you should know this will be security focused…as everything related to cloud should be. Cloud service providers, like AWS, seek to abstract the operational headaches of hosting – leaving security of said abstraction to the rest of us. I like to think the best of us! Admittedly, though, I am biased. Also, I will not break down everything announced at re:Invent; AWS does a pretty good job of that.
Official numbers are not available but estimates place attendance at around 65K people. I'd say a quarter of that were salespeople. AWS focused more on sales this year, too. The certification lounge, usually in a prominent location outside of the expo hall, was moved to the end of a hallway and in its place was a joint-use area for executive sales sessions and video/marketing. I don't like the trend, but it was inevitable.
Turkey, ham or turducken… The meat is always first
One trend obvious to everyone, including AWS, was that people were there to learn about governance. When AWS opens up session registration, they play a little game of wait-and-see: waiting to see who registers for what and where they might need to shift sessions around before they lock in rooms. Here is some of an email I sent to Chris Konrad, WWT's global security practice leader, during the first part of the week:
Why, though? AWS isn't new. Cloud isn't new. Don't get me wrong, I am glad I no longer feel like I am the only one beating the drum for proper cloud governance. But why are people flocking to these sessions? Like I said earlier, because cloud abstracts operations, not security. Security and governance remain a personal activity and companies have recently become aware of what they are not aware of.
More evidence of this was in the Expo Hall. Most of you know that AWS has a separate security-focused conference called re:Inforce, where everything, including the vendors, is security related. However, the re:Invent Expo Hall this year could have been confused with re:Inforce. There was more security, and more of it in prime locations on the floor. As I walked around the floor over the week, I was thinking about how the Expo Hall itself hasn't changed (physically), but the vendor names have, and the focus on security has.
So why is it that we have more visibility than ever into a platform (AWS) and more tools than we can shake a stick at, yet security and governance of cloud workloads is still viewed as an unachievable goal? It's because, for years, the approach to governing cloud environments has been upside down. This blog post isn't the place, but if you are in this boat, shoot me an email. We have much to discuss on how to get it right!
Slow down, get some dressing
Prior to re:Invent, AWS released some service updates. While at re:Invent, I workshopped some of these new changes, which I was hoping would make applying governance at scale easier. Although they are nice enhancements, doing what we do at scale can still be an issue, as many of the services focus on account-level visibility and controls, not org-level.
Take this as a friendly reminder that every improvement or change made by AWS can introduce a governance issue, especially if you are default-allowing vs. denying actions. Visibility and governance across orgs, especially across pieces and parts of an org, is a challenge. To maintain compliance at scale and provide attestation throughout, it is crucial for governance and controls to be inherited through a framework. Do not rely on validation after the fact.
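As a concrete example of what "inherited through a framework" can look like in AWS, a service control policy attached at the organization or OU level denies the action in every account beneath it – no after-the-fact validation required. This is a minimal, hypothetical SCP; tailor the action list to your own guardrails:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyCloudTrailTampering",
      "Effect": "Deny",
      "Action": [
        "cloudtrail:StopLogging",
        "cloudtrail:DeleteTrail"
      ],
      "Resource": "*"
    }
  ]
}
```

Because SCPs cascade down the org tree, a new account dropped into the OU inherits the control on day one, which is exactly the kind of front-loaded governance I keep harping on.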
There is one pre-conference enhancement that I'll call out specifically, and that was to AWS's Config Rules Development Kit (RDK). Like everything else, some AI flair was added to make Config rule creation a bit easier. Even with the newly released RDK improvements, teams will still need a deep understanding of AWS. Complexity has been reduced, but there is still a steep learning curve to detecting actions in an environment that is constantly in flux, on resources that are constantly changing. Why not spend the same effort on the front end and make everyone's job easier? Again, it's all about the approach and letting your developers develop.
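To make the learning curve concrete, here is a minimal sketch (in Python, the language the RDK generates boilerplate for) of the kind of evaluation logic a custom Config rule wraps. The bucket-encryption check is an illustrative example I made up, not something from the announcements:

```python
# Sketch of the evaluation logic at the heart of a custom AWS Config rule.
# The RDK generates the Lambda plumbing around a function like this; the
# check itself is a hypothetical example, not an AWS-provided rule.

COMPLIANT = "COMPLIANT"
NON_COMPLIANT = "NON_COMPLIANT"
NOT_APPLICABLE = "NOT_APPLICABLE"

def evaluate(configuration_item: dict) -> str:
    """Return a compliance verdict for one recorded resource."""
    if configuration_item.get("resourceType") != "AWS::S3::Bucket":
        return NOT_APPLICABLE
    config = configuration_item.get("configuration") or {}
    # Flag buckets recorded without server-side encryption configured.
    if config.get("serverSideEncryptionConfiguration"):
        return COMPLIANT
    return NON_COMPLIANT

# Example configuration items, shaped loosely like AWS Config snapshots.
encrypted = {"resourceType": "AWS::S3::Bucket",
             "configuration": {"serverSideEncryptionConfiguration": {"rules": []}}}
plain = {"resourceType": "AWS::S3::Bucket", "configuration": {}}

print(evaluate(encrypted))  # COMPLIANT
print(evaluate(plain))      # NON_COMPLIANT
```

The hard part isn't this function – it's knowing which resource types, which configuration fields, and which edge cases matter while the environment keeps changing underneath you.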
Loosen the belt
I wasn't just there to learn about CPUs, GPUs, or regionally local S3 beast-mode; I also wanted to learn about what my consumers (developers) were there for. Cloud is built for developers – you can check another one of my blog posts about why. But they are trying to respond to business asks, whether those come pre- or post-development. I mentioned the RDK enhancements above for Config rules; what I didn't mention are the improved integrations with 3rd-party code scanning and validation tools.
Developers hate inefficiency. That is why, as security pros, we have to quit being "blockers" and get out in front of developers' needs – because anything we do to slow them down is viewed as inefficient, slowing down them and (by proxy) the business. Code scanning is typically viewed as a blocker; however, when done right – like at Google with Critique – it can be viewed as a process improvement or even an efficiency generator. If done right, these integrations, along with ML, improve a developer's experience when deploying in AWS. We should check that out!
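The difference between "blocker" and "efficiency generator" is mostly the feedback: fast, specific, and actionable beats a slow, opaque "request denied." Here is a minimal, hypothetical sketch of a pre-merge check in that spirit – the patterns are illustrative, nowhere near a real secret-detection ruleset:

```python
import re

# Hypothetical sketch: a fast pre-merge check that hands developers an
# actionable finding (file, line, what tripped it) instead of an opaque
# "blocked" status. Patterns are illustrative only.

PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key_header": re.compile(r"-----BEGIN (RSA |EC )?PRIVATE KEY-----"),
}

def scan(filename: str, text: str) -> list[str]:
    """Return one human-readable finding per matched line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append(f"{filename}:{lineno}: possible {name}")
    return findings

sample = "db_url = 'postgres://app@db/prod'\nkey = 'AKIAABCDEFGHIJKLMNOP'\n"
for finding in scan("config.py", sample):
    print(finding)  # config.py:2: possible aws_access_key
```

A check like this that runs in milliseconds and points at the exact line reads as help; the same check bolted on after deployment reads as a blocker.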
There were also some good developer sessions regarding OTEL and other workload telemetry options for containers. AWS is coming around to OTEL, and we security pros should be ready and willing to work with our developer friends to bake reasonable options into images. Again, let developers develop. We need to work with the cloud ops or SRE teams to come up with a telemetry plan that doesn't allow a Node.js app to upload HTTP POSTs containing password hashes. I mean, theoretically… So, I've heard.
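What does that telemetry plan look like in practice? One piece of it is a scrubbing step before export. Here is a minimal sketch, assuming span/event attributes arrive as plain dicts; the field names and key list are hypothetical, and a real OTEL pipeline would do this in a processor before the exporter ships anything:

```python
# Sketch of a telemetry-scrubbing step. Key names are illustrative; in a
# real pipeline this logic lives in a span/log processor before export.

SENSITIVE_KEYS = {"password", "password_hash", "authorization", "set-cookie"}

def scrub(attributes: dict) -> dict:
    """Return a copy of telemetry attributes with sensitive values masked."""
    clean = {}
    for key, value in attributes.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, dict):
            clean[key] = scrub(value)  # recurse into nested attribute maps
        else:
            clean[key] = value
    return clean

event = {"http.method": "POST",
         "http.body": {"user": "amy", "password_hash": "5f4dcc3b..."}}
print(scrub(event))
```

The point is to agree on the deny-list with the dev and SRE teams up front, so the telemetry that leaves the node is rich for them and safe for us.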
This time add gravy, so it goes down easier
AI was everywhere: in the sessions, on the expo floor, and in the conversations at the meal halls. From what attendees were saying, everyone is leery of AI usage. Developers at one table were talking about how AI-generated code and reviews will just slow them down, another group was talking about AI having access to untokenized data, and another was worrying about the impact on event correlation. It is for this reason that AWS held numerous sessions (aside from the avalanche of AI product announcements) to try and explain how their AI services are structured and what is and is not done with customer data.
Let's be honest, whatever AI engine, model or vendor service you leverage will need the same access as a person. And maybe that's everyone's problem with it. Oftentimes, some of our least-trained engineers and analysts have access to the most sensitive data: call center employees with access to PII, SOC employees with access to raw event data, and help desk employees with access to AD and AAA services. And who do the bad actors go after? These folks. Who will they go after now? The AI version of these folks. To leverage AI, we must let go of our corporate training philosophy and trust that every vendor will properly secure and train their AI against bad data and social engineering. I guess if it's not a person, it would be anti-social engineering.
Another concern I have with all the AI-driven configuration of rules, reports, metrics and alerts is the same issue a lot of sysadmins have with built-in AWS IAM roles: when AWS decides to update them, you get the new permission set, whether you like it or not. The same goes for AI. As it learns, as the model changes, as the platform adapts, so too will what it generates for you.
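One way to keep that variability honest is to pin what you actually reviewed: fingerprint the approved artifact and alert when a regenerated version drifts. A minimal sketch, with a hypothetical rule payload:

```python
import hashlib
import json

# Sketch: treat AI-generated rules/reports as artifacts to pin, not to trust.
# Hash the version a human reviewed; alert when the generator's output drifts.
# The rule content below is hypothetical.

def fingerprint(artifact: dict) -> str:
    """Stable SHA-256 over a canonical JSON encoding of the artifact."""
    canonical = json.dumps(artifact, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

approved_rule = {"name": "deny-public-s3", "action": "deny", "scope": "org"}
approved_hash = fingerprint(approved_rule)

# Later, the model "helpfully" regenerates the rule with a narrower scope.
regenerated = {"name": "deny-public-s3", "action": "deny", "scope": "account"}

if fingerprint(regenerated) != approved_hash:
    print("drift detected: regenerated rule no longer matches approved version")
```

It doesn't make the AI deterministic, but it does turn silent drift into an explicit change-control event.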
Repeatability is the name of our game, and AI output is a variable defined as a dynamic array. Speaking of which, one of AWS's big announcements was Amazon Q, and it is off to an interesting start! Shortly after the new BI features were announced at re:Invent, Q started giving away more than it was supposed to. By "more" I mean everything:
The official response was that there was no security issue found, just continuing to "tune" the product. That's like a CISO telling Congress the breach never happened because they had no logs of it.
I'm not picking on AWS here; this is the new world we live in. As we move toward this new frontier, people will treat an AI system like a person and explain that it just needs more training, instead of finding fault with the company that created it or is doing the training. But for them, that is the beauty of AI: who do you hold accountable? The creator of the engine or the creator of the data? And what if the data is sourced from multiple "commercially-reasonable" sources?
When we get into lawyer speak, it's time to move on.
Sweet potato casserole
Before I head to the dessert table, one quick side dish that really should be called dessert: eBPF. A year or two ago, one vendor had an eBPF agent; now everyone has to say their product is eBPF-aware. I'm not going to pick apart any OEM on here (call me), but it's important to know why you should care. Real quick: in Linux you have user space and kernel space. The two are segmented for security purposes – kernel space is where the core OS lives and handles interacting with hardware, and user space is where apps run (this is a 20K-foot view, don't shoot me).
When it comes to containers, and even more traditional virtual machines in cloud, efficiency is the name of the game. Anything, like a security agent, that adds latency is out. Period. If it is cumbersome to deploy, adds a traffic hop or requires key management, it is out too. So what does that leave? eBPF. eBPF is a way to write programs in user space that are verified and executed in kernel space. This is especially useful when looking at data prior to or after cryptographic operations. Yes, Virginia, there is a way to inspect traffic without having to perform SSL termination.
But…there is a downside, as you can imagine. Being able to execute code in kernel space is a dangerous power, and there have already been reported incidents involving eBPF. So as you start down this journey and get the emails about some great eBPF agent, think about how you are going to monitor eBPF activity and what all hooks into it.
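On that monitoring point, here is a minimal sketch of a baseline check, assuming program metadata shaped like the JSON that `bpftool prog show -j` emits (fields trimmed here) and a hypothetical allowlist of agent names:

```python
import json

# Sketch: inventory loaded eBPF programs and flag anything not on an
# approved baseline. Input is assumed to look like (trimmed) JSON from
# `bpftool prog show -j`; the allowlist names are hypothetical.

EXPECTED = {"sec_agent_probe", "net_flow_monitor"}

def unexpected_programs(bpftool_json: str) -> list[dict]:
    """Return loaded programs whose name is not on the approved baseline."""
    programs = json.loads(bpftool_json)
    return [p for p in programs if p.get("name") not in EXPECTED]

# Trimmed sample of what bpftool-style output might contain.
sample = json.dumps([
    {"id": 42, "type": "kprobe", "name": "sec_agent_probe"},
    {"id": 77, "type": "kprobe", "name": "mystery_hook"},
])

for prog in unexpected_programs(sample):
    print(f"unexpected eBPF program: id={prog['id']} name={prog['name']}")
```

A cron job or agent doing this kind of diff won't stop a kernel-level implant by itself, but it at least tells you when something new has hooked into the kernel.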
Honestly, there was nothing earth-shattering at re:Invent this year. To me, that is a good thing. It means AWS, as a platform, has stabilized, and they are starting to improve the functionality of their core services. Yes, they added more integrations and more AI and ML enhancements - but who isn't? Which AI is better? Time will tell, but I do like how transparent AWS seemed to be about how their AI is structured and how data is used and discarded.
New integrations with internal services and 3rd-party platforms are increasing in the areas that I care about. Let's be honest, this blog is all about what I thought and security is what I care about! As I mentioned, security is about repeatable processes and securing developers and cloud environments at scale.
Many of the newly announced integrations (by AWS and other vendors) allow for security to be introduced at multiple phases in a cloud deployment process, from the infrastructure to the app and its workload environment. The downside is that the majority of these enhancements provide security after the action. I am tired of being asked to chase the actions of others, yet be held responsible. There is a better way, so let's talk!
The final thing I will mention: pay attention to all the under-the-radar service enhancements, like data caching and database sharding. I know these seem blah, but think about the functionality these enhancements add for replication, export, etc. That functionality is what new service enhancements mean to me!
Have a secure change-freeze!