What is Cloud Governance
In the past few years while I’ve been doing cloud security, I’ve observed how governance and other cloud activities work. Cloud Governance really focuses on three things:
1. Security/Legal/Compliance (aka risk reduction)
2. Cost Optimization
3. Providing the business value, which is why you’re in the cloud in the first place.
Typically these functions are split across different parts of the organization. Your InfoSec team cares about cloud security, while your finance department is asking why you’re spending so much money and why you can’t spend less.
Meanwhile the developers are being asked to deliver features and keep the business application running.
If you break down your cloud program into these three core functions: development, security and finance, you can see how each manages an aspect of your cloud: agility, risk and cost.
Security’s goal is to counter risk.
Finance wants to counter increasing costs.
Meanwhile your developers need agility to deliver business value quickly.
Cloud Governance Models
If the three competing forces in cloud governance are Developers, Security and Finance, then the three areas they want to minimize are inagility, risk and cost.
The default state of things is Cloud Anarchy. Developers rule and things move fast. This is the land of Shadow Clouds, where the other two concerns, Cost and Risk, run high.
Alternately if security rules are too strict, the risk is reduced, but at the expense of time to deliver and cost.
When two sides pull together, here security and finance, cost and risk are reduced, again at the expense of the agility needed to deliver business value.
When your cloud program is focused equally on the needs of security, finance and developers you have a modern cloud governance program.
…or as Thanos would say…
In all four diagrams, the area of the triangle is constant. When any particular constituency pulls away from the others, the areas of concern in the circle increase. In many cases it’s fine if one of the constituencies sits outside the circle a little bit, but when one is 1.5x more powerful than the others (as depicted with these triangles) the impacts to cost, risk, or agility dramatically increase.
Agility is not just about being able to deploy your application without hassle from Finance & Security. It’s about removing the undifferentiated heavy lifting to focus on the business problem at hand. Your business doesn’t build queuing servers, it processes data.
I used to teach a cloud security class for SANS, and I would include these slides to highlight why infosec needs to focus on more than just traditional on-premises mentality. (These slides actually came from our VP of Cloud Architecture Don Browning for a presentation to a local tech organization in Atlanta several years ago. I’ve blatantly stolen them because they sum up where a modern cloud program needs to be.)
This is a depiction of all the things AWS offers…
Yet, when traditional governance looks at traditional controls, the approved solutions tend to focus only on these…
The problem with traditional governance is that it leaves out the ability to use the vast array of tooling that AWS offers. This is what removes that undifferentiated heavy lifting, and frankly, what excites and empowers developers. Example.
Serverless is more than just Lambda and API Gateway. Those are the glue that holds these other higher order services together. And these higher order services are how companies can shed the undifferentiated heavy lifting.
Inagility is all the bureaucracy that decreases the ability to deliver business value. It is a security team that says “no”, a paved road that doesn’t go where the business needs dictate, a finance team that cannot see its way to invest in, or reward, optimization.
The paved road
Many companies try to develop the “Paved Road” approach. They build the tooling and CI/CD pipelines to make it easy for a developer to be just a developer and not have to consider the cost and risk aspects. The paved road allows developers to go fast.
The problem with the paved road is that it only goes in one direction. When the business needs change, when new tools are called for, when a different direction is needed, the friction of going off-road is too great. Rare is the company that can build an entire Cloud Dept of Transportation.
Sometimes companies starting their cloud journeys create the paved on-ramps, then forget to create the actual road. So developers ramp up to highway speed only to suddenly find themselves going 80mph on a gravel road.
Paved roads typically have guardrails. Some guardrails are implemented to help remind you where the road is, and some guardrails make it impossible to leave the road. These Jersey Barrier guardrails stifle agility. When developers only know the one road, they lose the ability to develop new and innovative solutions, solutions that can potentially reduce both risk and cost.
The billing model for cloud is very, well, cloudy. It is easy to predict cost when we need 14 m5.xlarge EC2 instances running 24/7 to run our application. That can be budgeted for. In Corey Quinn’s talk at the DevOps Enterprise Summit last year, he highlighted what finance wants. They want to allocate costs cleanly and they want a good degree of predictability. EC2 is predictable. Lambda writing to SQS, calling Amazon Rekognition and saving data to S3 is much harder to predict, since the costs will be based on demand. However, the total cost of a serverless solution compared to an always-on solution is (unless the incoming workload is constant) much less.
Two slides I’ll highlight (because my diagramming ability is even worse than his) demonstrate that as your cloud program matures the overall efficiency goes up, but the costs become more unpredictable.
According to Corey, Allocation is another aspect of cloud finance that is vital. But unless the allocations align to the needs of the business, the data you get from your CSP’s billing portal probably doesn’t align with the questions finance and the business need to answer. Namely: “What does it cost me to do X? And should we continue to do X?” This can be solved with a multi-account strategy where accountability lies with the application developer or owner. But that only works if the process to get a new account is not too burdensome.
Finally there is an issue around incentives. As described in my triangles and circles above, developers are incentivized to deliver features. Ergo, they are focused on reducing the inagility in their world.
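As a rough illustration of why one account per application makes “What does it cost me to do X?” answerable, here is a minimal sketch. The line-item shape, the account IDs and the application names are all hypothetical; a real billing export has many more fields and is not modeled here.

```python
from collections import defaultdict

def cost_by_application(line_items, account_to_app):
    """Sum billing line items per application, keyed by the owning account.

    line_items: iterable of dicts with "account_id" and "cost_usd" (hypothetical shape).
    account_to_app: mapping of account ID to the application that owns the account.
    Spend in accounts with no known owner is reported as "unallocated".
    """
    totals = defaultdict(float)
    for item in line_items:
        app = account_to_app.get(item["account_id"], "unallocated")
        totals[app] += item["cost_usd"]
    return dict(totals)

# Made-up data for illustration only.
account_to_app = {"111111111111": "image-pipeline", "222222222222": "marketing-site"}
line_items = [
    {"account_id": "111111111111", "cost_usd": 10.0},
    {"account_id": "111111111111", "cost_usd": 2.5},
    {"account_id": "222222222222", "cost_usd": 4.0},
    {"account_id": "333333333333", "cost_usd": 1.5},
]
totals = cost_by_application(line_items, account_to_app)
```

With one account per application the mapping is a simple lookup. Anything that lands in "unallocated" is spend that nobody is accountable for.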
When cloud budgets are held centrally there is minimal incentive for optimization (which is even worse when Security limits the higher-order services a developer is permitted to use). Cloud Economics needs to be built into the development life-cycle. Two-Pizza teams need to be incentivized to optimize and cut waste.
A t2.micro instance (probably the most common instance in most environments) costs $8.50/mo (on-demand with list-pricing). That’s $102/yr for those who need the math done for them. Go to any development team and say “We see that i-1234567 is under-utilized and needs to be rightsized or turned off.” The level of effort, for that development team, to do anything (other than ignore the request) would easily exceed the $102 in savings.
If you multiply that request across 800 accounts and 300 or so development teams, it is easy to see why the cloud finance team can’t make headway on reducing the so-called “low hanging fruit”.
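The arithmetic behind that request can be sketched in a few lines. The $8.50/mo price comes from the figures above; the effort hours and the loaded hourly rate are assumptions picked purely for illustration.

```python
T2_MICRO_MONTHLY_USD = 8.50  # on-demand list price quoted above

def annual_savings(monthly_cost_usd: float) -> float:
    """Yearly savings from turning off one always-on instance."""
    return monthly_cost_usd * 12

def worth_the_effort(monthly_cost_usd: float, effort_hours: float, hourly_rate_usd: float) -> bool:
    """True only if a year of savings exceeds the cost of the engineering time spent."""
    return annual_savings(monthly_cost_usd) > effort_hours * hourly_rate_usd

# Assumed: 2 hours of developer time at a loaded rate of $75/hour (hypothetical numbers).
savings = annual_savings(T2_MICRO_MONTHLY_USD)
pays_off = worth_the_effort(T2_MICRO_MONTHLY_USD, effort_hours=2, hourly_rate_usd=75)
```

Under those assumed numbers the team spends $150 of time to save $102 a year, so ignoring the request is the rational choice for the team.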
Moral of that story:
Modern Cloud Governance must align cost optimization to the developers
Two-Pizza teams need to get ice-cream when they reduce the costs of running their own application. Until the incentives are aligned for developers to cost-optimize they will continue to focus on feature delivery only.
Security teams need to be realigned too. Security is focused on reducing risk, but their view of risk is myopic. They focus on the risk to Confidentiality, Integrity and Availability. They miss the existential risk to the business that comes from disrupters who don’t have an 80-person policy team writing 300-page security policies (and creating a culture of no). These security teams will make darn sure that no data breaches occur while their company goes out of business.
Risk Reduction in a Modern Cloud Governance organization needs to focus on realistic cloud security standards, guardrails, and education. Guardrails need to be flexible. Not all workloads have the same level of risk tolerance. Static Web-hosting in S3 is a valid use of a public S3 bucket (ex: this website).
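To make “flexible guardrail” concrete, here is a minimal sketch of a public-bucket check that takes the workload into account. The Bucket record, the workload labels and the allow-list are hypothetical; this does not call any AWS API and is not how any particular tool implements the check.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Bucket:
    name: str
    public: bool
    workload: str  # hypothetical label, e.g. "static-website" or "payments"

# Assumed policy: only these workloads may legitimately expose a public bucket.
PUBLIC_ALLOWED_WORKLOADS = {"static-website"}

def violates_public_bucket_guardrail(bucket: Bucket) -> bool:
    """Flag public buckets unless the workload is one where public access is expected."""
    return bucket.public and bucket.workload not in PUBLIC_ALLOWED_WORKLOADS

blog = Bucket(name="example-blog", public=True, workload="static-website")
cards = Bucket(name="example-card-data", public=True, workload="payments")
```

A rigid version of this rule would flag both buckets. The flexible version flags only the payments bucket, which is the one that carries real risk.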
This is not easy! I started Practical Cloud Security to try and document what risks exist in these higher-order AWS services. I ran out of steam after two weekends. When I fetched the announcements from AWS for the last week I got:
There were 51 announcements since Friday, July 17 2020 11:23PM from the feed published on Friday, July 24 2020 11:23PM
That’s 51 different things (just in AWS!) a security team needs to review to know if the risk profile in an organization has changed and if the standards, guardrails or education programs need to be updated. And the weeks approaching AWS re:Invent double to triple that, and more of the announcements are new products that require deeper review. Oh, and rinse and repeat for Azure, Google, IBM, Oracle, Alibaba, etc. Don’t get me started on SaaS!
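Counting a week of announcements is easy to automate, which is part of why the number is so sobering. The sketch below is not the script that produced the output above. It assumes a standard RSS 2.0 feed with RFC 822 pubDate values and runs against a made-up three-item sample instead of the real feed.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

def count_recent_items(rss_xml: str, now: datetime, days: int = 7) -> int:
    """Count RSS items published in the `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    root = ET.fromstring(rss_xml)
    count = 0
    for item in root.iter("item"):
        published = item.findtext("pubDate")
        if published is None:
            continue  # skip items that carry no publication date
        if cutoff < parsedate_to_datetime(published) <= now:
            count += 1
    return count

# Made-up sample feed: two items inside the window, one before it.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>Service A adds feature</title><pubDate>Mon, 20 Jul 2020 17:00:00 +0000</pubDate></item>
<item><title>Service B in new region</title><pubDate>Thu, 23 Jul 2020 09:30:00 +0000</pubDate></item>
<item><title>Older announcement</title><pubDate>Fri, 10 Jul 2020 12:00:00 +0000</pubDate></item>
</channel></rss>"""

now = datetime(2020, 7, 24, 23, 23, tzinfo=timezone.utc)
recent = count_recent_items(SAMPLE_FEED, now)
```

Counting is the trivial part. Each counted item still needs a human to decide whether it changes a standard, a guardrail or a training deck.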
(As an aside: it doesn’t help when the CSP launches a new thing without the core security controls, or decides that the insecure setting is the default one, or allows any guest in your Azure AD to create a new Subscription).
Bringing balance to the force, er, cloud governance
Aligning the competing needs of the developers, security and finance teams can’t be done from remote parts of the org-chart. Incentives matter and a balanced modern cloud governance approach requires close interaction and common goals and objectives.
Yet the competing interests will exist in far-flung parts of the organization. The different constituencies speak different languages. And governance cannot be a one-size-fits-all model. The marketing website and the payment system have different needs. Large enterprises have multiple lines of business with different deadlines, different revenue models, and different risk.
Hire developers who want to go off-roading. Hire security people who will empower them to not crash and burn. Hire finance people who cheer them on and can ask “where are we going today?” Culture eats strategy for breakfast. Build not just a cloud strategy, but empower a cloud culture.
A successful cloud strategy needs to bring together the three constituencies into a common conversation to balance the cost, risk, and ability to agilely deliver what the business requires. It requires less of a top-down governance model and more of a bottom-up common understanding. Each side must give and take in balance.
Or you can go the way of the dinosaurs.