Clouds start up. Actually, cloud computing services are usually said to ‘spin up’ into life. It’s a sort of emotive way of saying that the technology is ready to use, while also making reference to the fact that somewhere in a datacenter, on some distant server rack, a hard disk has been spun up into life to serve and deliver a cloud instance. The fact that some or all of the disks in a particular cloud server may already be rotating is academic: the cloud starts up when an instance is spun up.
As instantaneous as cloud spin-up is, like any vigorous exercise it’s best to warm up first. In cloud land, this is a practice known as cloud provisioning. It is the process of readying a cloud for its intended tasks by making sure that it has the required attributes and performance characteristics. The trouble is, as automated as so many of the technologies in this space are, cloud provisioning can be quite a chore.
But why is cloud provisioning a chore?
Because it encompasses the administrative tasks of specifying processing power, storage capacity, transactional capability, security protection, disaster recovery capability, system monitoring & observability controls, cloud service dependency allocations, user access policy enforcement and so on… all the way through to reporting and documentation (the list goes on, so let’s agree that provisioning is burdensome) for a cloud once it is ‘spun up’ and brought to life.
A process known as ‘tagging’ is one of the first management steps needed to make cloud provisioning happen. As Cast Software notes, a cloud tagging strategy specifies the standards and procedures that an organization’s cloud estate must adhere to and execute.
Also vocal on this subject is Benjamin Brial, founder of Cycloid, a company known for its cloud-centric platform engineering technologies. “Tagging is the basis for creating a structured cloud infrastructure, especially in multi-cloud environments that feature different cloud resources being used for different purposes, but doing it manually is a chore,” said Brial.
Keeping tabs on tags
“In the cloud, tagging allows users to add important and descriptive metadata (tags) to cloud infrastructure to identify resources or values like staging, development, or production,” he added. “Good tags drive good decision-making but, in a rapidly growing cloud environment, you need to be keeping track of what is happening across a lot of different areas like cost, usage, availability, performance and security in what is a constantly transforming cloud infrastructure. This requires maintaining and managing tags across a lot of different provisioned cloud environments and providers which is often a huge and messy task. To truly get a handle on tag management and make it less of a chore, you need to be able to integrate it natively in your continuous orchestration pipelines and then visualize and study multi-cloud projects in a central place to establish a baseline.”
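A tag baseline of the kind Brial describes can be enforced programmatically. Below is a minimal sketch of such a check; the required tag names and the example resources are illustrative assumptions, not any particular cloud provider's API or Cycloid's product.

```python
# Minimal sketch of a tag-policy check across provisioned resources.
# REQUIRED_TAGS and the resource inventory are invented for illustration.

REQUIRED_TAGS = {"environment", "owner", "cost-center"}

def untagged(resources):
    """Return a mapping of resource IDs to the required tags they are missing."""
    problems = {}
    for res_id, tags in resources.items():
        missing = REQUIRED_TAGS - set(tags)
        if missing:
            problems[res_id] = sorted(missing)
    return problems

resources = {
    "vm-001": {"environment": "production", "owner": "team-a", "cost-center": "42"},
    "vm-002": {"environment": "staging"},  # missing owner and cost-center
}

print(untagged(resources))  # → {'vm-002': ['cost-center', 'owner']}
```

Running a check like this in a continuous orchestration pipeline, rather than by hand, is one way to make tag hygiene "less of a chore" across many environments.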
But cloud provisioning doesn’t start and end with tagging and filling out a few document requirements. Agreeing with our warm-up-before-a-workout analogy is Omar Abi Issa, cloud specialist at OVHcloud. He thinks that cloud provisioning does include a large number of administrative tasks, so, as with going to the gym, habit (and perhaps the repetitive logic that comes from muscle memory) is key – in this case, that means automation.
Bang for your (cloud) buck
“Public cloud is usually more automated than using dedicated servers and a lot of the provisioning is taken care of by the Cloud Services Provider (CSP),” said OVHcloud’s Abi Issa. “Whereas with dedicated servers, end users are responsible for everything themselves – but dedicated servers will give you more processing power for your buck. If you’re in a smaller company or have a less technical workforce, the provisioning automation in public cloud can make life a lot easier, but you won’t get the same performance-to-price ratio that you will get with dedicated [on-premises cloud] servers. However, it also depends on the infrastructure strategy that the company has, and whether it already has pre-prepared automations available such as backing up infrastructure, scaling infrastructure etc.”
He further notes that from a user perspective, the choice of automation tools can also make a difference. For example, with public cloud service providers, organizations tend to use Terraform to automate infrastructure and network deployment, or switch to tools like Ansible or Puppet for OS and other software deployment.
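The core idea behind declarative tools like Terraform is reconciliation: compare the desired state with the actual state and compute the actions needed to close the gap. The toy sketch below illustrates that idea only; the resource names are invented and this is not Terraform's actual planning engine.

```python
# Illustrative sketch of declarative reconciliation: given a desired
# state and the actual state, compute create/update/delete actions.

def plan(desired, actual):
    """Return the actions needed to move 'actual' to match 'desired'."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions

desired = {"web-server": {"size": "small"}, "database": {"size": "large"}}
actual = {"web-server": {"size": "medium"}, "old-cache": {"size": "small"}}

print(plan(desired, actual))
# → [('update', 'web-server'), ('create', 'database'), ('delete', 'old-cache')]
```

Because the plan is derived from a declared end state rather than hand-typed commands, the same automation can be re-run safely, which is what makes provisioning repeatable.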
Instant access factors
Dave Chapman, chief cloud evangelist at Capgemini, has been through more cloud provisioning consultations than he cares to remember (but in a good way, with good customers). He says that cloud provisioning is a huge step onwards from pre-cloud infrastructure provisioning in the era before we had the always-on pipe of the web-connected cloud to draw from.
“It is a major step onwards because it allows almost instant access to processing power, data storage and the latest functional tooling. Cloud provisioning can be automated in such a way that supports faster time to market and enables innovation by supporting experimentation without long-term commitment. It also unlocks accurate billing via tagging and facilitates financial and sustainability operations through both the tagging and through automated de-provisioning. All of which means that an organization only runs what it needs, when it needs it. This represents significant efficiency and sustainability benefits,” said Chapman.
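The automated de-provisioning Chapman mentions is often driven by tags. A hedged sketch of the idea, assuming an invented `expires` tag and illustrative resources:

```python
# Sketch of tag-driven de-provisioning: any resource whose 'expires'
# tag is in the past gets flagged for teardown. The tag name and the
# inventory are assumptions for illustration only.

from datetime import datetime, timezone

def expired(resources, now=None):
    """Return IDs of resources whose expiry tag is at or before 'now'."""
    now = now or datetime.now(timezone.utc)
    doomed = []
    for res_id, tags in resources.items():
        expiry = tags.get("expires")
        if expiry and datetime.fromisoformat(expiry) <= now:
            doomed.append(res_id)
    return doomed

resources = {
    "experiment-1": {"expires": "2024-01-31T00:00:00+00:00"},
    "prod-api": {},  # no expiry tag: never auto-deprovisioned
}

check_time = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(expired(resources, now=check_time))  # → ['experiment-1']
```

A scheduled job running a check like this is one way an organization "only runs what it needs, when it needs it", with the efficiency and sustainability benefits Chapman describes.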
He further notes that cloud platform operations is an ongoing developmental and operational task, so building the core capability should be central to any cloud transformation programme. Yet today, many businesses are under-investing when it comes to doing it properly – and that’s a predicament that the Capgemini team are clearly keen to fix.
Distributed cloud, distributed provisioning
According to Jay Jenkins in his role as chief technology officer (CTO) for cloud computing at Akamai, there’s some good news for cloud provisioning aficionados. The positive message here is that much of the hard work that’s been done to migrate to DevOps practices (the harmonious union of developers & operations teams working towards shared goals) can be reused in the realm of cloud provisioning. He points to methodologies and practices including Infrastructure-as-Code (IaC) and Continuous Integration & Continuous Deployment (CI/CD), which are argued to be essential for repeatable deployments – and, in the distributed environments that typify enterprise cloud, it simply means that there are more targets.
“There are a number of areas where distributed cloud provisioning is different to today’s centralized models. Day 2 operations for distributed deployments need additional tools. Tools for observability need to be extended across a fleet of locations. Networking and load balancing need to be reconsidered. If a service is not just distributed but needs to be deployed across multiple clouds, then the tooling needs to enable this,” advised Jenkins. “In order to do this properly, you also need to understand what you’re trying to achieve with a distributed architecture. Is it performance? Privacy? Regulatory compliance? Sustainability goals?”
Jenkins and the Akamai team say that a good platform engineering team should ideally be governing workload placement to meet these particular needs and remove any toil from the application developers. Developers need to focus on composable architectures so that the right services can be placed across the distributed fleet.
Cloud Security Posture Management
What all this brings us to is the notion of Cloud Security Posture Management (CSPM).
Saumitra Das, vice president of engineering at Qualys, thinks that we (as a technology industry) have become better at security around cloud provisioning since the days when publicly exposed Amazon S3 buckets (a public cloud storage resource on AWS) were so common. “There are several more ways to deploy this now, with more and more automation possible to make this a scalable process. Tools like the Center for Internet Security (CIS) benchmarks help maintain industry standards and they can be applied at different points in the software development lifecycle. This can be done all the way from shifting left, by applying build-time controls to Infrastructure as Code (IaC), to the right, after deployment, using Cloud Security Posture Management (CSPM) tools. However, there are still too many organizations that have deployments done from the cloud console that are hard to maintain and verify,” said Das.
The Qualys cloud engineer reminds us that most CSPM and IaC tools have these benchmarks and they are continuously updated to match the changes happening in the major clouds. There are other best practice benchmarks from each cloud provider as well. The key appears to be to prioritize, because there will always be more things to manage than any organization can fix.
“For example, if the CIS benchmark flags 10,000 policy violations, then a firm needs to decide in which order to fix them. This can be a combination of the severity of the control, which cloud account it occurs in (production, development or QA), whether those accounts have sensitive data in them, or whether they are on resources that can access that sensitive data, whether the misconfigurations allow an external attacker to get in, etc. One key piece that is needed is to combine vulnerability in computing assets with cloud misconfigurations, because you frequently need both of these to make an attack work end to end,” explained Das.
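The kind of prioritization Das describes can be expressed as a simple risk score combining severity, account type and data exposure. The weights and findings below are invented assumptions for illustration, not a Qualys or CIS scoring scheme.

```python
# Sketch of misconfiguration triage: rank findings by severity,
# account criticality, data sensitivity and external reachability.
# All weights and example findings are illustrative assumptions.

SEVERITY = {"critical": 100, "high": 50, "medium": 20, "low": 5}
ACCOUNT_WEIGHT = {"production": 3, "qa": 2, "development": 1}

def risk_score(finding):
    score = SEVERITY[finding["severity"]] * ACCOUNT_WEIGHT[finding["account"]]
    if finding.get("sensitive_data"):
        score *= 2
    if finding.get("externally_reachable"):
        score *= 2
    return score

findings = [
    {"id": "open-storage-bucket", "severity": "high", "account": "production",
     "sensitive_data": True, "externally_reachable": True},
    {"id": "weak-password-policy", "severity": "critical", "account": "development"},
    {"id": "unused-access-key", "severity": "low", "account": "qa"},
]

ranked = sorted(findings, key=risk_score, reverse=True)
print([f["id"] for f in ranked])
# → ['open-storage-bucket', 'weak-password-policy', 'unused-access-key']
```

Note how a merely “high” severity finding outranks a “critical” one here: an externally reachable production resource holding sensitive data is the combination that, as Das says, makes an attack work end to end.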
We can automate cloud provisioning and management better by shifting left and not allowing bad configurations to go into production. We can also combine the outputs of vulnerability assessment with cloud misconfiguration assessment to fix cyber risk in the right order, taking into account the probability of attack along with impact. It’s important to automate guardrails in every new account in a centralized way, so that variations in individual teams’ skills do not create diverse attack surfaces.
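A shift-left guardrail of this kind amounts to validating a deployment spec before it is ever applied. The rules checked in this sketch (no public access, encryption required) and the spec format are illustrative assumptions; real policy engines have far richer rule sets.

```python
# Hedged sketch of a shift-left guardrail: reject a deployment spec in
# the pipeline, before it reaches production. Rule set is illustrative.

def guardrail_violations(spec):
    """Return human-readable violations found in an IaC-style spec."""
    violations = []
    for name, res in spec.get("resources", {}).items():
        if res.get("public_access"):
            violations.append(f"{name}: public access is not allowed")
        if not res.get("encrypted", False):
            violations.append(f"{name}: encryption at rest is required")
    return violations

spec = {
    "resources": {
        "logs-bucket": {"public_access": False, "encrypted": True},
        "uploads-bucket": {"public_access": True, "encrypted": False},
    }
}

for v in guardrail_violations(spec):
    print(v)
# → uploads-bucket: public access is not allowed
# → uploads-bucket: encryption at rest is required
```

Failing the CI/CD pipeline whenever this list is non-empty is the centralized, automated guardrail that keeps team-by-team skill differences from reaching production.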
What’s next for cloud provisioning?
Modern thinking in cloud provisioning is primarily based around Infrastructure-as-Code i.e. the ability to pre-architect many (if not all) of the lower substrate foundations that enterprise software applications need to run. Software engineers need to think about where components are placed. These thoughts then need to be engineered into declarative assets or scripts. For large, distributed and dynamic architectures, this is cumbersome.
What if we could create a fabric of networking and infrastructure that can move components closer to the user or to the device dynamically? What if it could move the services based on performance, cost, privacy, regulations, or sustainability goals? This is exactly what Akamai is trying to do with its Gecko project.
“In the future, infrastructure will be self-driving and cloud provisioning will be even more automated than it is today,” concludes Akamai’s Jenkins. “Components will be moved based on real-world application behavior and the goals of the organization’s own IT stack. Infrastructure will not be deployed as code separate from the application; infrastructure will be inferred based on the code and profile of the application, its data, its services and its users.”
In other words, cloud provisioning has been a chore at times in the past, but in our increasingly automated infrastructure future, cloud provisioning will have been provisioned and provided for.