Cloud Foundation - Azure Landing Zone

Published On: 30/04/2023 Author: MKK

Cloud Foundation – Landing Zone design

Cloud computing is an increasingly strong enabler of efficiency, agility, and rapid innovation as the breadth and depth of cloud services expand. Building a foundational cloud environment, however, involves decisions across a variety of cloud products, services, and solutions, including those offered by cloud partners. Customers are looking for direction to help them set up and operate an environment that is compatible with their IT practices, empowers their builders and operators, and satisfies their governance needs. That guidance should come from the cloud vendor.

This article presents a guided-path strategy to help customers design and evolve their cloud environment, based on a consolidated collection of definitions, scenarios, advice, and automations. The strategy covers people, processes, and technology.

How can a CSP assist cloud adoption?

To support cloud adoption, AWS offers the Cloud Foundations Framework, a foundational set of capabilities that help you deploy, administer, and govern your workloads. A capability consists of a definition, scenarios, guidance, and supporting solutions for establishing and managing a particular portion of a cloud environment. Capabilities are designed to integrate with your technology ecosystem as a whole.

AWS has defined 29 capabilities across six categories to help you establish a cloud foundation, as depicted in the image below. AWS has also published articles and a whitepaper on how to apply these resources in a real-world deployment. For an introduction, see the Cloud Foundations Framework Overview; for in-depth technical information, see the Establishing your Cloud Foundation on AWS whitepaper.

Microsoft Azure has pursued a very similar strategy and has established the Cloud Adoption Framework (CAF) and the Well-Architected Framework. In addition, several third-party consulting businesses, such as Nord Cloud and Dexmach, have built accelerators based on CAF, termed the Azure Cloud Foundation. Each of these accelerators follows its own deployment approach.

So far, we have seen what is natively available from the top two public cloud vendors, AWS and Microsoft; however, the real fun begins with a real-world deployment. Let’s put all of this to work and construct a real-world cloud foundation project on either Amazon Web Services or Microsoft Azure.

Landing Zone Design – Azure

A Landing Zone is a baseline of standardized infrastructure required for cloud-based operations. It is a common foundation, or city plan, covering networking, governance, logging, auditing, and security. It is built on a common set of best practices, regulations, and standards, and it is centrally administered. Landing zones rest on four pillars. The LZ simplifies how cloud resources are accessed, based on the access levels and actors that have been defined.

Management Group & Subscription, Network & Subnets, Identity & Access Management, and Security & Compliance are the four pillars that are commonly defined as being the foundation of the Landing Zone as far as Azure Public Cloud is concerned.

Constructing the Azure Landing Zone so that it conforms to the Frameworks that were mentioned earlier.

Because of the rapid pace of digital change, businesses need a fresh strategy for developing and delivering information technology services. Utilizing the public cloud is an essential component of this, but in order to derive the most value possible from using the public cloud, you will need to adopt new ways of working and establish a robust foundation upon which to build and scale.

When laying these foundations, businesses frequently run into difficulties with Landing Zone implementations, including the following:

Time and money: The implementation of a Landing Zone that has been thoughtfully designed can take many months, and it can be expensive to learn on the job.

Governance complexity: Beginning on a small scale and expanding eventually results in increased complexity and higher operational costs.

Security Challenges: Security and guardrails need to be at the forefront of design; otherwise, it will not be maintainable at scale.

Automation is an afterthought: If automation is not at the core of design, it will hinder progress down the line.

I’ve established an approach for implementing public clouds that is resilient, scalable, and secure; I call it the Azure Cloud Foundation Framework, and it distills the essence of CAF, ALZ, and ESLZ. This methodology is based on my experience and the lessons I’ve learned from working on more than 27 public cloud transformation projects over the past several years. It lays the groundwork for cloud adoption in terms of technology, organization, and governance, and it offers the following:

The foundation will be ready to use in one to three weeks and can be modified and expanded.

Governance at scale requires automation of subscriptions, networking, policies, auditing, & logging.

Enterprise-ready security, based on best practices blueprints and guardrails, requires built-in security at scale for centralized requirements management.

Automation is needed to run a fast growing cloud application estate efficiently and cost-effectively.

To create a production-ready cloud infrastructure, you must make many considerations while establishing a cloud adoption strategy. Early decisions can affect your ability to improve and scale your ecosystem. Complexity has driven clients to seek prescriptive assistance across a variety of services that might establish a foundational environment.

Where to start the project?

The foundational components of an Azure Cloud Foundation project form the basis of your cloud environment. These components are designed to provide a stable and secure platform for building, deploying, and managing your applications and services. We must invest sufficient effort in each component to build a solid foundation. Here are some key foundational components:

Management Group: An Azure Management Group is a hierarchical container within the Azure Management Hierarchy that helps you manage access, policies, and compliance across multiple Azure subscriptions. It serves as a way to apply governance and management controls at a higher level, making it easier to manage large-scale Azure deployments.

Azure Subscription: The billing and management entity in Azure. It’s where you allocate resources, manage access, and control costs. Consider organizing your subscriptions based on business units, projects, or environments.

Resource Group: Resource Groups help you logically organize and manage related resources together. They provide a way to apply policies, manage access, and track costs for a specific set of resources.

Network & Subnets:

Virtual Network (VNet): A VNet is a logically isolated network within Azure. It allows you to segment resources and control communication between them.
Subnets: Subnets are subdivisions of a VNet that enable you to further isolate and manage resources. Design topics include virtual network segmentation patterns, topology considerations, and Virtual WAN.
Network Security Groups (NSGs): NSGs control inbound and outbound traffic to resources based on rules you define.
Azure Firewall: Provides centralized network security and protection for resources within VNet.
Load Balancers: Distribute incoming network traffic across multiple resources for improved availability and performance.
Bastion Host: Deployment patterns, where to deploy (which subscription), and which tools to deploy.
Other Connectivity: On-premises connectivity, DC/DR, and BCP considerations.

Identity & Access Management:

RBAC design principles, considerations, and recommendations.
Role-Based Access Control (RBAC): RBAC defines roles and permissions to control who can perform specific actions on Azure resources.
Service Principals: Service principals are used to authenticate applications, services, and automation scripts.
Personas: Identify the personas and their day-in-the-life scenarios, along with PIM considerations.

Azure Storage:

Azure Storage Accounts: These provide scalable and durable cloud storage for various data types, including blobs, files, tables, and queues.
Azure Managed Disks: Managed Disks offer durable and scalable disk storage for virtual machines.

Logging and Monitoring:

Azure Monitor: Collects and analyzes telemetry data from Azure resources, enabling performance monitoring and troubleshooting.
Azure Log Analytics: Centralizes log data and offers advanced querying and analysis capabilities.

Security and Compliance:

Microsoft Defender for Cloud (formerly Azure Security Center): Provides advanced threat protection across Azure resources and helps implement security best practices.
Azure Policy: Defines and enforces policies for resource configurations and compliance. Includes Policy as Code considerations using Git (GitHub/GitLab).
Encryption: Implement encryption for data at rest and in transit using Azure services.

Automation and Deployment:

Azure Resource Manager (ARM) Templates: ARM templates define infrastructure as code, allowing for consistent resource deployments.
Azure Automation: Automate management tasks using runbooks and scripts.

DevOps Tools:

A distributed version control system widely used for source code management and collaboration.
Configuration Management and Infrastructure as Code for building, changing, and versioning infrastructure safely and efficiently.
Continuous Integration (CI) and Continuous Deployment (CD) that automates the building, testing, and deployment of code.

Cost Management (FinOps):

Azure Cost Management and Billing: Monitor and control cloud spending through budgeting, cost analysis, and reporting.
Resource Inventory Management: Visibility and configuration of cloud-based IT-level service or workload resources. A bigger IT-level system of record can track environment resources and configurations (e.g., CMDB for ITSM-managed environments) to provide visibility and configuration management of all cloud software, hardware, and other resources.
Records Management: allows you to establish data retention according to internal policy and regulatory needs, including how to archive data before deletion. Financial records, transactional data, audit logs, business documents, PII, and other retention-controlled data may be included.

Backup and Disaster Recovery:

Azure Backup: Provides data protection for virtual machines, files, and applications.
Azure Site Recovery: Enables disaster recovery planning by replicating workloads to another Azure region or on-premises datacenter.
Third-party backup solution: Any existing third-party backup solution can be leveraged.

Governance and Compliance:

Azure Policy and Blueprints: Define and enforce governance standards and best practices across your organization.
Regulatory Compliance: Implement configurations to meet industry-specific compliance requirements.

These foundational components lay the groundwork for creating a stable, secure, and manageable cloud environment in Azure. They provide the building blocks necessary for your applications and services to thrive while maintaining control, security, and efficiency.

Management Group & Subscription

If your company has a large number of Azure subscriptions, you need an effective way to manage access, policies, and compliance across them. Management groups offer a governance scope above subscriptions. You organize subscriptions into management groups, and the governance conditions that you apply propagate to all associated subscriptions through inheritance.

Regardless of the kinds of subscriptions you have, management groups give you enterprise-grade management at scale. However, all subscriptions within a single management group must trust the same Azure Active Directory (Azure AD) tenant.

You may, for instance, assign a policy to a management group to restrict the regions in which virtual machines (VMs) can be created. Once assigned, the policy applies to all nested management groups, subscriptions, and resources, so VMs can be created only in authorized regions.

You can build a flexible structure of management groups and subscriptions to organize your resources into a hierarchy for unified policy and access control. The diagram that follows illustrates one way of constructing a governance hierarchy with management groups.

Within the management group called “Corp,” you can apply a policy, such as one that limits VM locations to the West US region. This policy is inherited by all of the Enterprise Agreement (EA) subscriptions that are descendants of that management group, and it applies to all of the VMs under those subscriptions. The owner of the resource or subscription cannot alter this policy, which improves governance.
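To make the inheritance rule concrete, here is a minimal Python sketch of how an allowed-regions restriction could flow down a hierarchy. It is an illustrative model only and does not call any Azure API; the `Scope` class, the subscription name `EA-Subscription-1`, and the assumption that restrictions from every ancestor are intersected are mine, not taken from the diagram.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scope:
    """A management group or subscription in the hierarchy (illustrative model only)."""
    name: str
    parent: Optional["Scope"] = None
    # Regions allowed by a policy assigned directly at this scope; None means no assignment here.
    allowed_regions: Optional[set] = None

    def effective_regions(self) -> Optional[set]:
        """Intersect every allowed-regions assignment from this scope up to the root."""
        result = None
        node = self
        while node is not None:
            if node.allowed_regions is not None:
                if result is None:
                    result = set(node.allowed_regions)
                else:
                    result &= node.allowed_regions
            node = node.parent
        return result

    def can_deploy(self, region: str) -> bool:
        """A region is allowed when no ancestor restricts it away."""
        regions = self.effective_regions()
        return regions is None or region in regions


root = Scope("Tenant Root Group")
corp = Scope("Corp", parent=root, allowed_regions={"westus"})
sub = Scope("EA-Subscription-1", parent=corp)  # hypothetical subscription name

print(sub.can_deploy("westus"))  # True
print(sub.can_deploy("eastus"))  # False
```

Because the restrictions are intersected, a scope lower in the tree can narrow the allowed regions but cannot widen what “Corp” permits, which mirrors the point that a subscription owner cannot override the inherited policy.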

Another point to consider: if you plan a large expansion in the security domain, in the near future or in the long run, it is advisable to isolate security workloads in a separate Security subscription. This step is optional. To do so, create a Security management group within the Platform management group and a Security subscription underneath it, as shown below.

Keep in mind that Microsoft has different pricing structures for non-production and production workloads. Because of this, it is best to keep your non-production and production subscriptions separate to reduce spending on non-production workloads. My recommendation is to create a management group named after the BU and create three distinct subscriptions per application underneath it, as shown in the picture below.

Azure offers several consumption models that allow you to pay for and use cloud services based on your specific needs and preferences. These models provide flexibility and cost optimization for various types of workloads. Here are some of the key Azure consumption models:

*Consumption Models*	*Description*
Pay-As-You-Go (PAYG):	You are billed for the resources you use on an hourly or per-minute basis. This model offers flexibility and scalability, making it suitable for short-term projects, development, and testing.
Azure Reservations:	Allows you to pre-purchase resources like virtual machines or databases for a one- or three-year term. This model can provide significant cost savings compared to PAYG rates, making it ideal for predictable workloads.
Azure Spot Virtual Machines:	Azure Spot VMs provide access to unused Azure capacity at significantly reduced prices compared to regular VMs. However, resources may be reclaimed if capacity is needed elsewhere.
Azure Dev/Test Pricing:	Azure offers discounted rates for virtual machines used in development and testing environments. This pricing model helps reduce costs for non-production workloads.
Azure Hybrid Benefit:	The Azure Hybrid Benefit allows you to apply on-premises Windows Server or SQL Server licenses toward virtual machines in Azure. This can result in cost savings by using existing licenses.
Enterprise Agreements (EAs):	Enterprise Agreements are long-term agreements that offer discounts for organizations with larger-scale deployments. EAs provide custom pricing and flexible payment terms.
Azure Marketplace and AppSource:	Azure Marketplace and AppSource offer various third-party solutions that you can purchase or subscribe to on top of your Azure resources.
CSP (Cloud Solution Provider) Program:	CSP partners offer Azure services along with their managed services and support as part of a single solution. This model is suitable for businesses looking for a fully managed Azure experience.
Reserved Capacity for Software:	Azure offers reserved capacity for software products like SQL Database or Cosmos DB, allowing you to commit to a specific amount of resources at a discounted rate.
Azure Kubernetes Service (AKS) Consumption:	AKS is a managed Kubernetes service in which Azure operates the control plane and you pay primarily for the worker nodes you use, making it cost-effective for containerized workloads.

When choosing an Azure consumption model, consider your workload requirements, budget constraints, and long-term plans. Each model offers different advantages and trade-offs, so it’s important to select the one that aligns with your organization’s needs. Azure’s flexibility allows you to switch between consumption models as your needs evolve, enabling you to optimize costs and performance over time.

Summing Up: Key Takeaways from Our Foundation Journey:

Are you saying that the showcased management group structure cannot be changed?
I didn’t state that; rather, I meant that since this structure follows Microsoft best practices, it is preferable to use it.

What are the Landing Zone A1 & A2 subscriptions created under the Corp management group?
Those are just examples from Microsoft’s best practices; these subscriptions are not actually created under the Corp management group.

How did you come to the conclusion that each application requires three separate subscriptions?
In my experience, this model meets the needs of all teams, including FinOps. There is an alternative model, “single subscription for all BU applications,” but using it increases complexity and overhead.

Are there any other factors that can be considered for grouping subscriptions?
Yes, there are plenty of factors that can be considered, including “Organizational Unit”, “App Category”, “Location”, “Restriction Level”, “App Stack”, and “Usage Pattern”.

Why do Dev and Test have different subscriptions instead of one?
From an architectural standpoint it makes little difference; the split mainly helps with billing, and it is ideal from a CI/CD perspective.

What consumption model are you suggesting?
It depends on your organization’s FinOps strategy; however, I always prefer the Enterprise Agreement (EA) for a large environment because it will save you money in the long run with broader service limits.

Why are you saying that the “Security” management group is optional rather than mandatory?
This is because we have another group called “Connectivity” where we can control all of these security services, and we can rename this group “Network & Security.” However, if you prefer to keep all security resources, such as Firewall, Key Vaults, DDoS, DNS, and LBs, in a separate management group, that is fine; create a separate management group named “Security” underneath the “Platform” management group.

Are management groups and subscriptions supported in all consumption models?
Management groups aren’t currently supported in Cost Management features for Microsoft Customer Agreement (MCA) subscriptions.

Can you elaborate about the list of roles and Management Group access, and inheritance?
Certainly. I will cover this in the RBAC section; introducing it here would only cause confusion.

Resource Group

Azure Resource Groups are containers that help you manage and organize related Azure resources. When working with Resource Groups, there are several important considerations to keep in mind to ensure efficient resource management, security, and compliance. Here are some key considerations:

Logical Grouping:

Group resources that are related to a specific application, project, or environment within a single Resource Group.
Avoid mixing resources from different applications or purposes within the same Resource Group.

Resource Tagging:

Implement tagging for resources within a Resource Group.
Tags can provide additional context, aid in resource management, and support cost allocation.

Resource Policies:

Define and enforce policies specific to your Resource Groups using Azure Policy.
Policies can help ensure compliance with standards and best practices.

Resource Permissions:

Apply appropriate role-based access control (RBAC) to Resource Groups to control who can manage or access the resources.
Consider assigning permissions at both the Resource Group level and the resource level as needed.

Resource Group Limits:

Understand the limits and quotas for Resource Groups within your subscription and region.
Plan your resource allocation accordingly to avoid hitting these limits.

Resource Group Locking: There are two types of resource locks: Delete (CanNotDelete) and Read-Only (ReadOnly).

Delete Lock: This prevents the resource from being deleted, but reads and other modifications are allowed.
Read-Only Lock: This prevents all modifications to the resource, including updates and deletion; the resource can still be read.

Monitoring & Alerting:

Implement monitoring and alerting for resources within a Resource Group.
Use Azure Monitor to track performance, detect issues, and set up alerts.

There are many other factors to take into account, such as “Resource Naming Conventions”, “Resource Group Structure”, “Resource Dependency”, “Resource Lifecycle Management”, “Resource Movement”, “Resource Group Cleanup”, “Resource Group Templates”, and “Cross-Resource Group Dependencies”, but it’s possible that these things won’t be necessary until later in the design phase of the project.

I would strongly suggest making use of Azure Resource Locks, as they help prevent accidental deletions or modifications to critical Azure resources. When applied, resource locks prevent users from performing certain actions on the locked resources, helping to maintain the integrity and stability of those resources.
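As a quick illustration of what each lock level permits, the following Python sketch models the behavior as a lookup table. Azure names the two lock levels `CanNotDelete` and `ReadOnly`; the `is_allowed` helper and the three coarse operation names (`read`, `update`, `delete`) are simplifications I introduce for illustration and are not part of any Azure SDK.

```python
from typing import Optional

# Operations blocked by each lock level (simplified model of Azure resource locks).
BLOCKED_OPERATIONS = {
    None: set(),                        # no lock applied
    "CanNotDelete": {"delete"},         # reads and updates still work
    "ReadOnly": {"update", "delete"},   # only reads work
}


def is_allowed(operation: str, lock: Optional[str]) -> bool:
    """Return True when the operation is permitted under the given lock level."""
    if lock not in BLOCKED_OPERATIONS:
        raise ValueError(f"unknown lock level: {lock}")
    return operation not in BLOCKED_OPERATIONS[lock]


for lock in (None, "CanNotDelete", "ReadOnly"):
    allowed = [op for op in ("read", "update", "delete") if is_allowed(op, lock)]
    print(lock, "->", allowed)
```

In practice, a `CanNotDelete` lock is usually the safer default for shared platform resources, because a `ReadOnly` lock also blocks routine configuration changes.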

Summing Up: Key Takeaways from Our Foundation Journey:

What is an Azure Resource Group?
An Azure Resource Group is a logical container that holds related Azure resources for an application or a solution. It helps manage and organize resources, making it easier to monitor, manage, and apply policies across those resources collectively.

What’s the purpose of using Azure Resource Groups?
Azure Resource Groups provide several benefits, including logical grouping, resource management, access control through RBAC, and monitoring and billing.

What’s the relationship between resources and resource groups?
Resources (like virtual machines, storage accounts, databases, etc.) are individual entities you create within a resource group. A resource group can contain multiple resources, and these resources can be of different types.

Can I move resources between different resource groups?
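Yes, you can move resources between resource groups or even between subscriptions, as long as the resource type supports the move operation. A move changes only the resource group or subscription; it does not change the Azure region of the resource. For a move across subscriptions, both subscriptions must belong to the same Azure AD tenant.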

What’s the role of resource group tags?
Tags are name-value metadata that you can apply to a resource group and to the resources within it. They help categorize and track resources for purposes like billing, monitoring, and management, and they can be organized around criteria important to your organization. Note that resources do not automatically inherit the tags of their resource group.

How can I apply access control to a resource group?
Azure Role-Based Access Control (RBAC) allows you to control who can do what within your resource group. You can assign roles (like Owner, Contributor, or Reader) to users, groups, or applications at the resource group level to manage access to its resources.

What happens if I delete a resource group?
Deleting a resource group will also delete all the resources contained within it. This action is irreversible, so be careful when deleting resource groups.

Can I deploy resources from Azure Marketplace directly into a resource group?
Yes, you can deploy various solutions and resources from the Azure Marketplace directly into a specific resource group. This simplifies the deployment process and ensures the resources are placed where you want them.

How do I manage policies for a resource group?
Azure Policy allows you to enforce organizational standards and policies across your resources. You can apply policies to a resource group to ensure compliance and governance.

Network & Subnets

Azure Networking is a crucial component. Here we decide and finalize the vNet model, segmentation, NSGs, Firewall, Load Balancer, gateways, ExpressRoute connectivity, DNS, peering, resilience, security, isolation, scalability, connectivity to services, and hybrid scenarios.

Let’s begin with the design of the vNet. As shown in the design diagram above, every management group has a subscription, and every subscription has at least one subnet. Be aware that Azure reserves five IP addresses in each subnet for its own use, so keep this in mind when creating the IP address schema and the connectivity design. You will also need to determine how many IP addresses and subnets each subscription requires.

Let’s bring the above diagram here to examine it carefully, and figure out what all our vNets are.

I have outlined all of our possible virtual networks in the above diagram. Depending on the number of IP addresses and subnets required, we can partition them further. In our situation, we require a virtual network for the Identity subscription, a virtual network for the Management subscription, a virtual network for the Connectivity subscription, and three virtual networks per application for the Development, Test, and Production subscriptions. If your BU1 has one more application named App02, then we may need to create three more subscriptions: Dev-Sub-02, Test-Sub-02, and Prod-Sub-02. Naming conventions are up to your organization’s standards. In our case, I’ve used App01 and App02, which represent Application 01 and Application 02 respectively.
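The count of subscriptions and virtual networks grows quickly with each new application, so a short Python sketch can help enumerate them. The subscription names follow the Dev-Sub-02 pattern used above; the `vnet-` prefix and the function names are hypothetical choices of mine, and your own naming standard should replace them.

```python
from typing import List

PLATFORM_SUBSCRIPTIONS = ["Identity", "Management", "Connectivity"]
ENVIRONMENTS = ["Dev", "Test", "Prod"]


def application_subscriptions(app_number: int) -> List[str]:
    """Subscription names for one application, e.g. Dev-Sub-02 for App02."""
    return [f"{env}-Sub-{app_number:02d}" for env in ENVIRONMENTS]


def required_vnets(app_count: int) -> List[str]:
    """One vNet per platform subscription plus one per application subscription."""
    names = [f"vnet-{name.lower()}" for name in PLATFORM_SUBSCRIPTIONS]
    for app_number in range(1, app_count + 1):
        for subscription in application_subscriptions(app_number):
            names.append(f"vnet-{subscription.lower()}")  # hypothetical vNet naming
    return names


print(application_subscriptions(2))  # ['Dev-Sub-02', 'Test-Sub-02', 'Prod-Sub-02']
print(len(required_vnets(2)))        # 9
```

With two applications, the sketch yields three platform vNets plus six application vNets, which is the total the IP address plan has to accommodate.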

As previously mentioned, the Development and Testing networks do not need to be segmented and can instead have a flat structure. Even in the worst-case scenario (an attack), production would not be affected; nonetheless, the decision ultimately rests with your organizational strategy and standards.

The production subscription, however, needs to be segmented, with the subnet mask chosen from the IP address and subnet requirements. Suppose the production app runs in a 3-tier model: it then needs three segments. If each segment needs at least 25 usable IPs including future growth, a /27 mask fits, because a /27 holds 32 addresses and 27 remain usable after Azure reserves five. The goal is to support the required number of IP addresses within a restricted broadcast domain; how many /27 subnets the address space can hold is secondary.
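The sizing arithmetic can be checked with Python’s standard `ipaddress` module. This is a minimal sketch: the `10.10.0.0/24` address space and the tier names `web`, `app`, and `data` are assumptions for illustration, and the five reserved addresses per subnet reflect Azure’s documented behavior.

```python
import ipaddress

AZURE_RESERVED_PER_SUBNET = 5  # addresses Azure keeps for itself in every subnet


def smallest_prefix(usable_hosts: int) -> int:
    """Return the longest IPv4 prefix whose subnet still offers the requested usable IPs."""
    needed = usable_hosts + AZURE_RESERVED_PER_SUBNET
    host_bits = (needed - 1).bit_length()  # smallest power of two that holds `needed`
    return 32 - host_bits


vnet = ipaddress.ip_network("10.10.0.0/24")  # hypothetical production address space
tiers = ["web", "app", "data"]               # hypothetical names for the three tiers

prefix = smallest_prefix(25)
subnets = dict(zip(tiers, vnet.subnets(new_prefix=prefix)))

for tier, network in subnets.items():
    usable = network.num_addresses - AZURE_RESERVED_PER_SUBNET
    print(f"{tier}: {network} ({usable} usable addresses)")
```

Asking for 28 usable addresses instead of 25 would already push each tier to a /26, so it is worth agreeing on the growth allowance before fixing the mask.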

Do I need to worry about Network Topology?

Not really. A significant amount of connectivity terminates in the primary region, and from there it links to a variety of legs: vNets, the BC/DR region, the Internet, on-premises connectivity, ExpressRoute, site-to-site VPN, end-user landing points, and so on. The landing point therefore becomes the hub, the legs listed above become spokes, and the connectivity model is hub-and-spoke.

We have a ready-made networking service called Azure Virtual WAN that provides a unified management interface for many different features related to networking, security, and routing. In addition to its capacity for connectivity, it is also capable of supporting forward-looking technologies such as SD-WAN and VPN CPE.

The connectivity subscription will include one more Virtual WAN hub for the DR region, because the DR site will be placed in the closest region on the same continent as the primary location. This configuration will be an exact replica of the primary location’s. Connecting the Region 01 hub to the Region 02 hub is straightforward; Microsoft calls this “Hub-to-Hub connectivity.”

Any-to-any connectivity can be achieved through the Virtual WAN hubs of the global transit network architecture. This design eliminates or greatly reduces the need for full mesh or partial mesh connectivity between spokes, which is more difficult to build and maintain. In addition, routing control is simpler to set up and keep up to date in hub-and-spoke networks than in mesh networks.
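The saving over a mesh is easy to quantify: a full mesh of n spokes needs n(n-1)/2 links, while hub-and-spoke needs one link per spoke. The Python sketch below compares the two; the function names are my own, and it counts logical connections only, not any specific Azure peering or gateway resource.

```python
def full_mesh_links(spokes: int) -> int:
    """Every spoke connects directly to every other spoke."""
    return spokes * (spokes - 1) // 2


def hub_and_spoke_links(spokes: int) -> int:
    """Every spoke connects once, to the hub."""
    return spokes


for spokes in (3, 6, 12):
    print(spokes, full_mesh_links(spokes), hub_and_spoke_links(spokes))
```

At 12 spokes the mesh already needs 66 links against 12 for hub-and-spoke, which is why routing is easier to maintain in the hub model.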

As per Microsoft: Any-to-any connection enables an organization to connect its internationally dispersed users, branches, datacenters, virtual networks (VNets), and apps to one another by way of the “transit” hub(s). This is possible within the context of a global architecture. The Azure Virtual WAN serves as the international communications backbone.

Load-Balancing Services

As we know, load balancing distributes workloads over many computing resources. It optimizes resource utilization, throughput, and response time, and it avoids overloading any single resource. By distributing workloads among redundant computing resources, it can also boost availability.

Azure offers load-balancing services to distribute workloads across multiple computing resources. They can be categorized along two dimensions: global versus regional, and HTTP(S) versus non-HTTP(S). These services include:

*Service*	*Scope*	*Traffic*	*Layer*
Azure Front Door	Global	HTTP(S)	Layer7
Azure Traffic Manager	Global	Non-HTTP(S)	DNS (GTM)
Azure Application Gateway	Regional	HTTP(S)	Layer7
Azure Load Balancer	Regional or Global	Non-HTTP(S)	Layer4
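The two dimensions in the table can be expressed as a small lookup, shown here as a Python sketch. It is a first-pass simplification of the table only; the `pick_load_balancer` function is hypothetical, and it ignores the fact that Azure Load Balancer can also operate globally.

```python
# First-pass mapping of (is_global, is_http) to a service, taken from the table above.
LOAD_BALANCER_CHOICES = {
    (True, True): "Azure Front Door",
    (True, False): "Azure Traffic Manager",
    (False, True): "Azure Application Gateway",
    (False, False): "Azure Load Balancer",  # simplification: it can also be global
}


def pick_load_balancer(is_global: bool, is_http: bool) -> str:
    """Return a starting recommendation for one workload."""
    return LOAD_BALANCER_CHOICES[(is_global, is_http)]


print(pick_load_balancer(is_global=True, is_http=True))    # Azure Front Door
print(pick_load_balancer(is_global=False, is_http=False))  # Azure Load Balancer
```

Treat the result as a starting point for each workload; a complete solution may combine several of these services.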

Microsoft provides the following flowchart to help choose a load-balancing solution for an application. It guides you through the key decision criteria to arrive at a recommendation.

Treat this flowchart as a starting point. Every application has different needs, so begin with the recommendation and then assess it more thoroughly. Each workload in your application should be evaluated separately. A complete solution may include multiple load-balancing solutions.

Azure Bastion Services

Azure Bastion gives you access to a virtual machine via your browser and the Azure portal, or via your local SSH or RDP client. Azure Bastion is a fully managed PaaS service that you provision in your virtual network. It provides secure and seamless RDP/SSH connectivity to your virtual machines over TLS, from the Azure portal or a native client. Virtual machines reached through Azure Bastion don’t need a public IP address, agent, or client software.

Bastion secures RDP and SSH for all VMs in its virtual network. Azure Bastion secures RDP/SSH access to virtual machines without exposing them to the outside world.

Azure Bastion employs an HTML5 web client that is automatically transmitted to the user’s local device. Your RDP/SSH session is encrypted using TLS on port 443. This makes it more secure for traffic to traverse firewalls. Bastion supports TLS version 1.2 and later. Previous versions of TLS are not supported.

Azure Bastion establishes an RDP/SSH connection to your Azure VM utilizing the private IP address of the VM. On your virtual machine, a public IP address is not required.

You are not required to configure any NSGs on the Azure Bastion subnet. As Azure Bastion connects to your virtual machines using a private IP, you can configure your NSGs to only permit RDP/SSH from Azure Bastion. This eliminates the need to manage NSGs whenever you require a secure connection to your virtual machines. To learn more about NSGs, please visit Network Security Groups.
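The idea of permitting RDP/SSH only from Azure Bastion can be sketched as a simple source check. This is a plain-Python illustration of the rule logic, not an Azure SDK call; the address range `10.0.1.0/26` for the Bastion subnet and the function name are hypothetical and should be replaced with your own values.

```python
import ipaddress

# Hypothetical address range of the Bastion subnet; replace with your own.
BASTION_SUBNET = ipaddress.ip_network("10.0.1.0/26")
REMOTE_ACCESS_PORTS = {22, 3389}  # SSH and RDP


def is_remote_access_allowed(source_ip: str, destination_port: int) -> bool:
    """Allow RDP/SSH only when the source address lies in the Bastion subnet."""
    if destination_port not in REMOTE_ACCESS_PORTS:
        return False  # this sketch only models the RDP/SSH rules
    return ipaddress.ip_address(source_ip) in BASTION_SUBNET


if __name__ == "__main__":
    print(is_remote_access_allowed("10.0.1.5", 3389))     # source inside the Bastion subnet
    print(is_remote_access_allowed("203.0.113.10", 22))   # source outside the Bastion subnet
```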

Azure Bastion is a fully managed platform-as-a-service (PaaS) service from Azure that provides secure RDP/SSH connectivity. Because you do not need to expose the VMs to the internet, they are protected against port scanning by malicious and errant users.

Azure Bastion resides at the perimeter of your virtual network, eliminating the need to safeguard each VM in your virtual network. The Azure platform protects against zero-day exploits by hardening and updating Azure Bastion on your behalf.

There are three distinct ways we can deploy bastion hosts:

Single instance of Bastion host across all applications
A single Bastion host is used to connect to all application and environment networks, which gives the lowest cost and the best user experience. Two Bastion hosts are enough to cover both regions. The drawback is that each host has connectivity to every resource (~$140 to $210 per month per Bastion).

Single Bastion per project environment for each region.
Network connectivity is limited to a single project environment, shared by all applications in a region. With three project environments (Dev, Test, and Prod) in each of two regions, six Bastion hosts are required (~$140 to $210 per month per Bastion).

Single bastion host per application per environment per region
This option offers the strongest isolation, but it is the most expensive and brings a poor user experience, high operational overhead, and a highly complex setup (~$140 to $210 per month per Bastion).
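To compare the three options, the host count and the monthly cost range can be estimated with a few lines of arithmetic. This is a minimal sketch using the price range quoted above; the option names and function names are made up for this example, and actual pricing depends on SKU, region, and usage.

```python
def bastion_count(option: str, regions: int, environments: int, applications: int) -> int:
    """Number of Bastion hosts needed for each deployment option."""
    if option == "shared":            # one Bastion per region for everything
        return regions
    if option == "per_environment":   # one Bastion per environment per region
        return regions * environments
    if option == "per_application":   # one Bastion per application per environment per region
        return regions * environments * applications
    raise ValueError(f"unknown option: {option}")


def monthly_cost_range(hosts: int, low: int = 140, high: int = 210) -> tuple[int, int]:
    """Monthly cost range in USD, using the per-Bastion range quoted above."""
    return hosts * low, hosts * high


if __name__ == "__main__":
    # Two regions and three environments as in the text; four applications is an assumed figure.
    for option in ("shared", "per_environment", "per_application"):
        hosts = bastion_count(option, regions=2, environments=3, applications=4)
        print(option, hosts, monthly_cost_range(hosts))
```

With two regions and three environments, the second option yields six hosts, or roughly $840 to $1,260 per month.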

Azure Bastion enables manual scalability of hosts. You can configure the number of host instances (scale units) to manage the number of concurrent RDP/SSH connections supported by Azure Bastion. Azure Bastion can manage more concurrent sessions when the number of host instances is increased. Reducing the number of instances reduces the number of supported concurrent sessions. Azure Bastion supports a maximum of fifty host instances. This capability is exclusive to the Azure Bastion Standard SKU.
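A rough sizing helper for host instances might look like the following. The per-instance session capacity is deliberately left as a parameter because it depends on the workload and on current Azure limits; the minimum of two instances is an assumption for the Standard SKU, and the function name is hypothetical.

```python
import math

MAX_INSTANCES = 50  # maximum stated above
MIN_INSTANCES = 2   # assumption: minimum instance count for the Standard SKU


def instances_needed(concurrent_sessions: int, sessions_per_instance: int) -> int:
    """Estimate how many host instances cover the expected concurrent sessions."""
    if concurrent_sessions < 0 or sessions_per_instance <= 0:
        raise ValueError("sessions must be non-negative and capacity must be positive")
    needed = max(MIN_INSTANCES, math.ceil(concurrent_sessions / sessions_per_instance))
    if needed > MAX_INSTANCES:
        raise ValueError("demand exceeds what a single Bastion can scale to")
    return needed


if __name__ == "__main__":
    # 20 sessions per instance is an illustrative figure, not a quoted limit.
    print(instances_needed(concurrent_sessions=100, sessions_per_instance=20))
```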

Summing Up: Key Takeaways from Our Foundation Journey:

So you are saying we should rule out every other topology and go straight to Hub & Spoke?
Yes. With a Mesh, Star, or Full Mesh topology, traffic would be hard to control, and maintaining the highly complex NSGs needed for intercommunication would become a nightmare as the environment grows.

You never evaluated the popular alternative, Route Server, before choosing Virtual WAN. Why?
That’s a good question, Peter and Kevin. I’m starting this foundation project from scratch (greenfield), and I want to simplify and streamline my network management, so I chose Azure Virtual WAN.

So if it were not a greenfield project, you would have evaluated Route Server as well, correct?
Absolutely, yes. If our setup required a classic hub-and-spoke architecture and I wanted to use Network Virtual Appliances that do not support active/active operation and easy scaling, I would have used Route Server.

Why can’t Traffic Manager or a third-party virtual appliance be an option here?
Traffic Manager could be an option provided it supports the Hub & Spoke architecture, but it is a DNS-based traffic routing service and is best suited to active/active or twin DC/DR scenarios.

What is your personal choice for deploying Azure Bastion hosts, and why?
A single Bastion per project environment for each region. We pay more for this setup, but it is worth it from a security perspective. In the other options discussed, connections must hop between environments, which carries the risk of credential theft that an attacker can easily exploit to intrude.

If I have my own license for a third-party virtual load balancer such as F5 or NGINX, can I use it?
Yes, Azure provides support for various third-party networking appliances and virtual appliances, allowing you to integrate them into your Azure infrastructure to enhance network functionality, security, and load balancing.

At this point, you should have a solid understanding of the overall design principles in the Azure Cloud Foundation project you’re working on or plan to work on.

Identity & Access Management

Identity and Access Management (IAM) is where role-based access control (RBAC) is put into practice.

IAM covers the management of roles and the assignment of privileges to identities, and it is available within the cloud provider’s console. It is the set of technologies that lets you build the critical preventive policies that help stop security incidents from occurring.

How does Azure RBAC work?

Azure RBAC lets you manage who has access to which resources by assigning Azure roles. It is essential to understand this concept because it explains how permissions are enforced. A role assignment consists of three elements: the security principal, the role definition, and the scope.

Security Principal: In short, an Azure object (identity) that can be assigned a role, such as a user, group, service principal, or application. An easy way to remember it: who can do it?

A security principal is an object that represents a user, group, service principal, or managed identity that is requesting access to Azure resources. You can assign a role to any of these security principals.

Role Definition: In short, a collection of actions that the assigned identity is allowed to perform. An easy way to remember it: what can be done?

Azure includes several built-in roles that you can use. For example, the Virtual Machine Contributor role allows a user to create and manage virtual machines. If the built-in roles don’t meet the specific needs of your organization, you can create your own Azure custom roles. This video provides a quick overview of built-in roles and custom roles.

Azure has data actions that enable you to grant access to data within an object. For example, if a user has read data access to a storage account, then they can read the blobs or messages within that storage account. For more information, see Understand Azure role definitions.

Scope: In short, the set of Azure resources that the access applies to. An easy way to remember it: where can it be done?

A scope can be specified on four different levels in Azure: the management group, the subscription, the resource group, or the resource itself.

Scopes are structured in a parent-child relationship, and roles can be assigned at any of these levels. A role assigned at a parent scope is inherited by its child scopes.

Role Assignments: In short, a role assignment is the combination of a role definition (what can be done?), a security principal (who can do it?), and a scope (where can it be done?).

Access is granted by creating a role assignment and revoked by removing it.

The following diagram shows an example of a role assignment. In this example, the Marketing group has been assigned the Contributor role for the pharma-sales resource group. This means that users in the Marketing group can create or manage any Azure resource in the pharma-sales resource group. Marketing users do not have access to resources outside the pharma-sales resource group, unless they are part of another role assignment.

You can assign roles using the Azure portal, Azure CLI, Azure PowerShell, Azure SDKs, or REST APIs.
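The role assignment model described above (who, what, where) can be sketched in plain Python. This is a simplified illustration and not the Azure SDK: the action names in `ROLE_DEFINITIONS`, the subscription ID `sub1`, and the helper names are all hypothetical, and real Azure role definitions contain far more granular actions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleAssignment:
    principal: str  # who: user, group, service principal, or managed identity
    role: str       # what: name of the role definition
    scope: str      # where: management group, subscription, resource group, or resource


# Hypothetical, heavily simplified role definitions (role name -> allowed actions).
ROLE_DEFINITIONS = {
    "Contributor": {"read", "write", "delete"},
    "Reader": {"read"},
}


def scope_covers(assigned_scope: str, target_scope: str) -> bool:
    """A scope covers itself and every child scope beneath it."""
    assigned = assigned_scope.rstrip("/")
    target = target_scope.rstrip("/")
    return target == assigned or target.startswith(assigned + "/")


def is_allowed(assignments, principal: str, action: str, target_scope: str) -> bool:
    """True if any assignment grants the action to the principal at the target scope."""
    return any(
        a.principal == principal
        and action in ROLE_DEFINITIONS.get(a.role, set())
        and scope_covers(a.scope, target_scope)
        for a in assignments
    )


# The example from the text: Marketing has Contributor on the pharma-sales resource group.
ASSIGNMENTS = [
    RoleAssignment("Marketing", "Contributor", "/subscriptions/sub1/resourceGroups/pharma-sales"),
]

if __name__ == "__main__":
    vm = "/subscriptions/sub1/resourceGroups/pharma-sales/providers/Microsoft.Compute/virtualMachines/vm1"
    print(is_allowed(ASSIGNMENTS, "Marketing", "write", vm))
    print(is_allowed(ASSIGNMENTS, "Marketing", "write", "/subscriptions/sub1/resourceGroups/finance"))
```

The prefix check in `scope_covers` is what models inheritance: an assignment on the resource group applies to the virtual machine inside it, but not to a different resource group.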

To Summarize RBAC:

RBAC is an authorization system built on Azure Resource Manager (ARM)
Designed for fine-grained access management of Azure Resources
Role assignment is a combination of:

Role definition – list of permissions such as create VM, delete SQL, assign permissions, etc.
Security Principal – user, group, service principal, or managed identity
Scope – resource, resource group, subscription, or management group

Hierarchical

Management Groups > Subscriptions > Resource Groups > Resources

Built-in and Custom roles are supported

Privileged Identity Management (PIM):

Privileged Identity Management (PIM) is a service in Azure Active Directory (Azure AD) that enables you to manage, control, and monitor access to important resources in your organization.

What does it do?

Privileged Identity Management provides time-based and approval-based role activation to mitigate the risks of excessive, unnecessary, or misused access permissions on resources that you care about. Here are some of the key features of Privileged Identity Management:

Provide just-in-time privileged access to Azure AD and Azure resources
Assign time-bound access to resources using start and end dates
Require approval to activate privileged roles
Enforce multi-factor authentication to activate any role
Use justification to understand why users activate
Get notifications when privileged roles are activated
Conduct access reviews to ensure users still need roles
Download audit history for internal or external audit
Prevent removal of the last active Global Administrator and Privileged Role Administrator role assignments
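Several of the features listed above (time-bound eligibility, MFA, justification, and approval) can be modeled as checks performed at activation time. The following is a conceptual sketch only: `EligibleAssignment` and `activate` are hypothetical names, and this is not how the PIM service is called in practice.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class EligibleAssignment:
    principal: str
    role: str
    start: datetime              # eligibility start date
    end: datetime                # eligibility end date
    max_activation: timedelta    # how long one activation may last
    requires_approval: bool = True


def activate(assignment: EligibleAssignment, now: datetime, justification: str,
             mfa_passed: bool, approved: bool) -> datetime:
    """Return the time at which the activated role expires, or raise PermissionError."""
    if not (assignment.start <= now <= assignment.end):
        raise PermissionError("outside the eligibility window")
    if not mfa_passed:
        raise PermissionError("multi-factor authentication is required")
    if not justification.strip():
        raise PermissionError("a justification is required")
    if assignment.requires_approval and not approved:
        raise PermissionError("approval is required")
    # Just-in-time access: the activation never outlives the eligibility window.
    return min(now + assignment.max_activation, assignment.end)


if __name__ == "__main__":
    eligible = EligibleAssignment("alice", "Contributor",
                                  datetime(2023, 1, 1), datetime(2023, 12, 31),
                                  timedelta(hours=8))
    print(activate(eligible, datetime(2023, 6, 1, 9, 0), "Deploy hotfix", True, True))
```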

Identify accounts for eligible assignments:

Hold a brainstorming session to compile a list of the account types that should be permanently eligible to request elevated roles in order to carry out their regular responsibilities.

Once the list is complete, sign in as a Global Administrator and assign the appropriate roles to all of the selected accounts. Please note: ensure that MFA is enabled beforehand.

Summing Up: Key Takeaways from Our Foundation Journey:

I feel like you’ve only talked about RBAC and not about how to move forward?
Is that so? In fact, I’ve already covered what needs to be done. What you now know about RBAC is enough, and you have probably already started thinking about the next steps. If not, read the remaining questions below.

Now that I understand how RBAC functions, how can I implement it in my production?
First, make a list of the personas (roles) based on how your current IT team is set up. Then decide which permissions each persona should have by default and which permissions they need for their elevated tasks. Don’t forget the principle of least privilege.

I have a complete list of Personas, including Cloud Architect, Engineer, Developer, Test Engineer, Security Engineer, and Data Engineer. What next?
Good to know. Now create a spreadsheet with columns for Persona, Roles and Responsibilities, Default Azure Role, Scope, PIM Eligible, Eligibility Duration, etc.
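A minimal sketch of such a spreadsheet, generated as CSV, is shown here. The column names come from the answer above; the rows are hypothetical examples and the "PIM Eligible" column is assumed to hold the role a persona may activate, so replace all values with your own decisions.

```python
import csv
import io

COLUMNS = ["Persona", "Roles and Responsibilities", "Default Azure Role",
           "Scope", "PIM Eligible", "Eligibility Duration"]

# Hypothetical example rows; replace with your own personas and decisions.
ROWS = [
    {"Persona": "Cloud Architect", "Roles and Responsibilities": "Design and govern the platform",
     "Default Azure Role": "Reader", "Scope": "Management group",
     "PIM Eligible": "Contributor", "Eligibility Duration": "8 hours"},
    {"Persona": "Security Engineer", "Roles and Responsibilities": "Monitor and harden security posture",
     "Default Azure Role": "Security Reader", "Scope": "Subscription",
     "PIM Eligible": "Security Admin", "Eligibility Duration": "4 hours"},
]


def build_persona_sheet(rows) -> str:
    """Render the persona rows as CSV text that a spreadsheet tool can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_persona_sheet(ROWS))
```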

I’ve completed the persona spreadsheet. So now I assign default roles, create groups, and add the appropriate users as members with PIM-eligible role assignments?
Exactly. Assign least-privilege permissions by default and enable eligible assignments as per your spreadsheet, and that’s it.

From a persona’s perspective, how do they activate the permissions assigned to them?
They can simply navigate to Azure AD roles and check the eligible roles tab to see whether any roles are waiting to be activated. Once a role is activated, they can perform their elevated administrative tasks.

Is this privileged RBAC access monitored for auditing?
Yes, Azure PIM provides robust capabilities for auditing and monitoring. It records all activation requests, approvals, denials, and access activities. This audit trail assists organizations in monitoring and reviewing the usage of privileged roles, thereby facilitating compliance and security assessments.

Security & Compliance

Utilizing Azure’s extensive security tools and capabilities is one of the most compelling arguments for deploying applications and services on the platform. These tools and capabilities facilitate the development of secure solutions on the Azure platform. Microsoft Azure ensures the confidentiality, availability, and integrity of customer data, in addition to facilitating transparent accountability.

These security services enable you to meet the security requirements of your business and safeguard your cloud-based Resource, Access, Cost, Security, and Operations (RACSO)

We will only skim the surface of most security concepts available in Azure, because we cannot cover all of them in depth. However, the concepts that are essential for a production Cloud Foundation are covered in detail in this article.

The security services are arranged in the diagram according to the resources (columns) that they protect. The diagram also groups the services into the following categories (rows):

Secure and protect – Layered, defense-in-depth services for identity, hosts, networks, and data. This set of security services and capabilities helps you assess and improve your Azure security posture.
Detect threats – Services that identify suspicious activities and make it easier to mitigate them.
Investigate and respond – Services that pull logging data so that you can investigate suspicious activity and respond to it.

The Microsoft cloud security benchmark contains a collection of high-impact security recommendations that you can use to help secure the services you use in Azure. It consists of the following:

Security controls – These recommendations apply broadly across your Azure tenant and the Azure services you use. Each recommendation identifies the stakeholders who are typically involved in planning, approving, or implementing the benchmark.

Service baselines – These apply the controls to individual Azure services to provide recommendations on the security configuration of each service.

Policy driven Governance

Now that we are familiar with what Azure security offers, let’s continue our discussion of the Landing Zone and dig a little deeper into the topic.

Creating a well-governed and secure Cloud Foundation Landing Zone in Azure requires defining and implementing a number of policies that ensure compliance, security, and adherence to best practices. Azure Policy is a service that lets you create, assign, and manage policies in your Azure environment in order to enforce rules and effects on your resources. The following is a list of Azure Policies you might consider implementing in your Cloud Foundation Landing Zone:

Tagging Policies: Enforce the use of specific tags on resources to aid in categorization, cost allocation, and management.

Required Tags Policy: Enforce the presence of specific tags (e.g., “Environment,” “Owner”) on resources.

Resource Naming Policies: Enforce consistent naming conventions for resources.

Naming Convention Policy: Enforce a naming pattern for resources (e.g., “prefix-environment-resourcetype-number”).

Network Security Policies: Enforce network security best practices.

Network Security Groups (NSG) Policy: Ensure that NSGs are associated with resources to control inbound and outbound traffic.
Deny Public IP Policy: Deny the creation of public IP addresses unless a specific justification is provided.

Identity and Access Management (IAM) Policies: Enforce least privilege access to resources.

Role-Based Access Control (RBAC) Policies: Ensure proper RBAC assignments to limit access based on job roles.
Deny High-Privilege Roles Policy: Deny assignments of high-privilege roles unless approved.

Encryption Policies: Enforce data encryption at rest and in transit.

Require Encryption Policy: Require resources to have encryption enabled, such as Azure Disk Encryption or HTTPS.

Compliance Policies: Enforce compliance with specific regulations or standards.

CIS Benchmark Policy: Enforce Azure CIS benchmark recommendations.
HIPAA Compliance Policy: Enforce compliance with HIPAA regulations for healthcare data.

Resource Lock Policies: Prevent accidental deletion or modification of critical resources.

Prevent Deletion Lock Policy: Prevent deletion of critical resources.

Virtual Machine (VM) Policies: Enforce VM-specific configurations.

Allowed VM SKUs Policy: Allow only specific VM SKUs based on your organization’s standards.
OS Disk Encryption Policy: Enforce encryption for OS disks of VMs.

Monitoring and Auditing Policies: Ensure proper monitoring and auditing settings.

Diagnostic Settings Policy: Enforce diagnostic settings for resources to send logs and metrics to Azure Monitor.

Service Limits and Quotas Policies: Ensure that resources are provisioned within acceptable limits.

Resource Quotas Policy: Enforce resource quotas to prevent excessive resource consumption.

These are just 10 examples from Azure Security Benchmark v3 (ASB), and the specific policies you choose to apply will depend on your organization’s requirements, industry regulations, and best practices. Azure Policy provides a wide range of built-in policies, and you can also create custom policies tailored to your organization’s needs. It’s important to regularly review and update these policies as your environment evolves.
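To make one of the listed policies concrete, here is a minimal sketch of what a "Required Tags" policy definition could look like, built as a Python dictionary and printed as JSON. The structure follows the general if/then shape of an Azure Policy rule, but the helper name `require_tag_policy` and the display name are made up for this example, and the output should be checked against the current Azure Policy definition documentation before use.

```python
import json


def require_tag_policy(tag_name: str, effect: str = "deny") -> dict:
    """Build a policy-definition body that rejects resources missing the given tag."""
    return {
        "displayName": f"Require tag '{tag_name}' on resources",  # illustrative name
        "mode": "Indexed",  # evaluate only resource types that support tags and location
        "policyRule": {
            "if": {"field": f"tags['{tag_name}']", "exists": "false"},
            "then": {"effect": effect},
        },
    }


def missing_tags(resource_tags: dict, required: list[str]) -> list[str]:
    """Local helper: list the required tags that a resource does not carry."""
    return [tag for tag in required if tag not in resource_tags]


if __name__ == "__main__":
    print(json.dumps(require_tag_policy("Environment"), indent=2))
    print(missing_tags({"Environment": "Prod"}, ["Environment", "Owner"]))
```

Starting with the `audit` effect instead of `deny` is a common way to see which existing resources would be affected before enforcement is switched on.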

The Azure Security Benchmark workbook is designed to enable Cloud Architects, Security Engineers, and Governance Risk Compliance Professionals to gain situational awareness for cloud security posture and hardening. Benchmark recommendations provide a starting point for selecting specific security configuration settings and facilitate risk reduction. The Azure Security Benchmark includes a collection of high-impact security recommendations for improving posture.

This workbook provides visibility and situational awareness for security capabilities delivered with Microsoft technologies in predominantly cloud-based environments. Customer experience will vary by user and some panels may require additional configurations for operation. Recommendations do not imply coverage of respective controls as they are often one of several courses of action for approaching requirements which is unique to each customer. Recommendations should be considered a starting point for planning full or partial coverage of respective requirements.

Don’t worry too much: I have already compiled CIS Benchmark 1.3, the Azure Security Benchmark v3 (ASB) from Defender, and 90+ Enterprise Scale Landing Zone policies in a spreadsheet. This spreadsheet should help any organization meet most of the typical requirements of its corporate governance. You can download each source individually by clicking the hyperlinks above and consolidate them yourself, or you can download the already consolidated spreadsheet directly using this link.

Summing Up: Key Takeaways from Our Foundation Journey:

So you are saying that the CIS Benchmark + ASB + ESLZ policies suffice for LZ governance?
Certainly, yes. Policy-driven governance is a cornerstone of the Enterprise-scale Landing Zone (ESLZ). Corporate, industry, or country-specific governance requirements can be codified declaratively using Azure Policy. ESLZ provides 90+ custom policies that help meet the most common corporate governance requirements.

Do I need to apply all the policies mentioned in your spreadsheet?
No. Identify the policies that are relevant to your Landing Zone and your workloads; not every policy will be applicable.

When I bring more services or applications into my Azure environment, what should I do?
You will have to re-evaluate the applied policies and either amend the existing ones if required or bring in more policies from the spreadsheet. Feel free to reach out to me if you need any support.

This brings us to the end of the Azure Landing Zone Guide. If you follow the advice in this article for your LZ projects, you should be far less likely to get stuck midway through the process, and you will have a sound basis for a secure environment. Please feel free to reach out to me if you need any support. Thank you. Regards, MKK