Azure DevOps Self-Hosted VMSS Agents Part 2

Creating an Azure Virtual Machine Scale Set from an Azure Compute Gallery Image.

Overview

In Azure DevOps Self-Hosted VMSS Agents Part 1 we built a VM image using virtual-environments: we created a VM image definition and an initial VM image version, both generated by the scripts. In this post we will create an Azure Virtual Machine Scale Set (VMSS) based on that VM image version.

The VMSS is created using the Azure CLI. This could have been done in Terraform, Bicep, etc., but as I had started the series using predominantly shell scripting I decided to keep things consistent. You can take the essence of what is being performed and port it to your preferred method if desired.

Create Azure Networking

The VMSS depends on networking in Azure. An existing resource group, virtual network and subnet could be used, but for this example deployment we will create a dedicated network. This will be done using the same container approach used in Part 1.

First run our deployment environment:

docker run -it --rm ghcr.io/tonyskidmore/azure-tools:latest /bin/bash

Within the container clone the repository that contains the scripts that will generate our Azure resources:

git clone https://github.com/tonyskidmore/azdo-vmss-virtual-environments.git
cd azdo-vmss-virtual-environments

Export the necessary environment variables that the scripts use to authenticate to Azure, substituting the values below with your Azure service principal details.

export ARM_SUBSCRIPTION_ID=00000000-0000-0000-0000-000000000000
export ARM_TENANT_ID=00000000-0000-0000-0000-000000000000
export ARM_CLIENT_ID=00000000-0000-0000-0000-000000000000
export ARM_CLIENT_SECRET=AAABjkwhs7862782626_BsGGjkskj_MaGv
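Because the scripts rely on all four of these variables, it can be useful to confirm they are set before launching anything. The following is a minimal sketch of such a check; the function name `check_arm_env` is hypothetical and is not part of the repository.

```shell
# Report any service principal variable that is unset or empty.
# Returns 0 when all four are present, 1 otherwise.
check_arm_env() {
  local missing=0 name value
  for name in ARM_SUBSCRIPTION_ID ARM_TENANT_ID ARM_CLIENT_ID ARM_CLIENT_SECRET; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_arm_env || echo "export the missing variables first"` before the deployment scripts would flag an incomplete environment early.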

Launch the script to build the network resources:

scripts/network-create.sh

By default, the following resources will be created in the target Azure subscription as part of the scripted network execution:

| Resource | Resource name | Description |
| --- | --- | --- |
| Resource Group | rg-azdo-agents-networks-01 | Resource group created by the scripts/network-create.sh script |
| Virtual network | vnet-azdo-agents-01 | Virtual network created by the scripts/network-create.sh script |
| Subnet | snet-azdo-agents-01 | Network subnet created by the scripts/network-create.sh script |

The above values, as well as other settings, can be changed by exporting environment variables prior to running the script.

| Environment variable | Default value | Description |
| --- | --- | --- |
| AZ_NET_RESOURCE_GROUP_NAME | rg-azdo-agents-networks-01 | Resource group created by the scripts/network-create.sh script |
| AZ_LOCATION | uksouth | Location for all created resources |
| AZ_NET_NAME | vnet-azdo-agents-01 | Virtual network created by the scripts/network-create.sh script |
| AZ_NET_ADR_PREFIXES | 172.16.0.0/12 | Virtual network prefix created by the scripts/network-create.sh script |
| AZ_SUBNET_NAME | snet-azdo-agents-01 | Network subnet created by the scripts/network-create.sh script |
| AZ_SUBNET_ADR_PREFIXES | 172.16.0.0/24 | Network subnet prefix created by the scripts/network-create.sh script |
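To illustrate how exported variables can override these defaults, here is a rough sketch of the pattern, together with Azure CLI commands that would create equivalent resources. It is an assumption about the general approach, not a copy of scripts/network-create.sh. The `run` helper and the `DRY_RUN` switch are hypothetical names introduced for this example, and by default the commands are only printed.

```shell
# Print the command unless DRY_RUN=0, in which case it is executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

create_network() {
  # Keep any exported value; otherwise fall back to the documented default.
  : "${AZ_NET_RESOURCE_GROUP_NAME:=rg-azdo-agents-networks-01}"
  : "${AZ_LOCATION:=uksouth}"
  : "${AZ_NET_NAME:=vnet-azdo-agents-01}"
  : "${AZ_NET_ADR_PREFIXES:=172.16.0.0/12}"
  : "${AZ_SUBNET_NAME:=snet-azdo-agents-01}"
  : "${AZ_SUBNET_ADR_PREFIXES:=172.16.0.0/24}"

  run az group create \
    --name "$AZ_NET_RESOURCE_GROUP_NAME" \
    --location "$AZ_LOCATION"

  # Create the virtual network and its first subnet in one call.
  run az network vnet create \
    --resource-group "$AZ_NET_RESOURCE_GROUP_NAME" \
    --name "$AZ_NET_NAME" \
    --address-prefixes "$AZ_NET_ADR_PREFIXES" \
    --subnet-name "$AZ_SUBNET_NAME" \
    --subnet-prefixes "$AZ_SUBNET_ADR_PREFIXES"
}
```

For example, `export AZ_LOCATION=northeurope` followed by `create_network` would print the two commands with the new location and every other setting left at its default.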

PATH issue

Issues have been raised about the Linux $PATH environment variable against both virtual-environments (#3695) and azure-pipelines-agent (#3461). Both are now closed but remain unresolved (at the time of writing). The issue is described in detail in both of those links, but in short: when the Linux image generated by virtual-environments is used with a VMSS, the $PATH is only a subset of what it should be, for example when compared against a Microsoft-hosted agent using the same image. As a result, tools that cannot be found in the $PATH, e.g. ansible, will not execute.
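One way to see the symptom is to compare the $PATH of a VMSS agent against a reference $PATH, for example one printed by `echo $PATH` in a pipeline running on a Microsoft-hosted agent. The helper below is a small sketch for that comparison; `missing_from_path` is a hypothetical name and the reference value has to be supplied by you.

```shell
# Print every directory from the colon-separated list in $1
# that is not present in the current $PATH, one per line.
missing_from_path() {
  local IFS=: dir
  for dir in $1; do
    case ":$PATH:" in
      *":$dir:"*) ;;          # already on the PATH, nothing to report
      *) echo "$dir" ;;       # expected but missing
    esac
  done
}
```

Running `missing_from_path "$REFERENCE_PATH"` in a pipeline step on the VMSS agent lists the directories that were dropped, and `command -v ansible` confirms whether a specific tool can still be resolved.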

I experimented with a few things to see if I could resolve this issue without making major modifications to the image generation process or having to create an alternate image. I thought that using cloud-init might allow me to make some modifications that would fix the problem. Initially it appeared to work using this custom-data script (which I have retained for information purposes), but it proved to be unreliable after more in-depth testing.

What does seem to have fixed the issue, at least for the symptoms I have been seeing, is running a script prior to the deployment of the Azure Pipelines agent extension, as described in Customizing Virtual Machine Startup via the Custom Script Extension. I have configured this as part of the scripted VMSS deployment. In my testing so far it has proven to be reliable and fixes the issue I had encountered. I would be interested in receiving feedback on how this approach works for others.

Create Azure VM Scale Set

Once the networking dependency has been addressed we can move on to creating the VMSS. The container workflow is the same as above; we just need to run the relevant script:

scripts/vmss-create.sh

Note: The --generate-ssh-keys and --authentication-type SSH options are included on the az vmss create command, but the scripts currently make no provision for saving the generated keys, as this is not a requirement for this walkthrough.
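To give a feel for what such a call can look like, the sketch below assembles an az vmss create command from the environment variables documented in this post and prints it rather than running it. It is an illustration under assumptions, not the contents of scripts/vmss-create.sh: the function names are hypothetical, only a subset of the settings is shown, and the subnet default is taken from the networking section. The two resource IDs follow the standard Azure resource ID layout.

```shell
# Resource ID of the Azure Compute Gallery image version to deploy.
image_version_id() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Compute/galleries/%s/images/%s/versions/%s\n' \
    "$ARM_SUBSCRIPTION_ID" \
    "${AZ_ACG_RESOURCE_GROUP_NAME:-rg-ve-acg-01}" \
    "${AZ_ACG_NAME:-acg_01}" \
    "${AZ_ACG_DEF:-ubuntu20}" \
    "${AZ_ACG_VERSION:-1.0.0}"
}

# Resource ID of the subnet; an ID is needed because the network
# lives in a different resource group from the VMSS.
subnet_id() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Network/virtualNetworks/%s/subnets/%s\n' \
    "$ARM_SUBSCRIPTION_ID" \
    "${AZ_NET_RESOURCE_GROUP_NAME:-rg-azdo-agents-networks-01}" \
    "${AZ_NET_NAME:-vnet-azdo-agents-01}" \
    "${AZ_SUBNET_NAME:-snet-azdo-agents-01}"
}

# Print (not execute) an az vmss create command built from the settings.
print_vmss_create() {
  echo az vmss create \
    --resource-group "${AZ_VMSS_RESOURCE_GROUP_NAME:-rg-vmss-azdo-agents-01}" \
    --name "${AZ_VMSS_NAME:-vmss-azdo-agents-01}" \
    --image "$(image_version_id)" \
    --subnet "$(subnet_id)" \
    --vm-sku "${AZ_VMSS_VM_SKU:-Standard_D2_v3}" \
    --storage-sku "${AZ_VMSS_STORAGE_SKU:-StandardSSD_LRS}" \
    --admin-username "${AZ_VMSS_ADMIN_NAME:-adminuser}" \
    --instance-count "${AZ_VMSS_INSTANCE_COUNT:-0}" \
    --authentication-type SSH \
    --generate-ssh-keys
}
```

Calling `print_vmss_create` after exporting ARM_SUBSCRIPTION_ID shows the full command, which makes it easy to review the image and subnet IDs before anything is created.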

Created Azure Resources

By default, the following resources will be created in the target Azure subscription as part of the scripted execution:

| Resource | Resource name | Description |
| --- | --- | --- |
| Resource Group | rg-vmss-azdo-agents-01 | Resource group where the Virtual machine scale set will be located |
| Virtual machine scale set | vmss-azdo-agents-01 | The Virtual machine scale set linked to our virtual-environments Ubuntu 20.04 image |

Many of the values that control the VMSS creation can be changed by exporting environment variables prior to running the script.

| Environment variable | Default value | Description |
| --- | --- | --- |
| AZ_ACG_NAME | acg_01 | The Azure Compute Gallery that contains the VM image version |
| AZ_ACG_RESOURCE_GROUP_NAME | rg-ve-acg-01 | The resource group that contains the above Azure Compute Gallery |
| AZ_ACG_DEF | ubuntu20 | The VM image definition name |
| AZ_ACG_VERSION | 1.0.0 | Image version to use |
| AZ_NET_RESOURCE_GROUP_NAME | rg-azdo-agents-networks-01 | Resource group where the Virtual network is located |
| AZ_NET_NAME | vnet-azdo-agents-01 | Virtual network where the VMSS will be located |
| AZ_SUBNET_NAME | snet-azdo-agents-01 | Subnet to which the VMSS will be attached |
| AZ_VMSS_RESOURCE_GROUP_NAME | rg-vmss-azdo-agents-01 | Resource group where the VMSS will be located |
| AZ_VMSS_NAME | vmss-azdo-agents-01 | Name of the VM Scale Set to be created/updated |
| AZ_VMSS_VM_SKU | Standard_D2_v3 | VM SKU for the instances in the VMSS |
| AZ_VMSS_STORAGE_SKU | StandardSSD_LRS | Storage SKU for the instances in the VMSS |
| AZ_VMSS_ADMIN_NAME | adminuser | Admin username of the created instances |
| AZ_VMSS_INSTANCE_COUNT | 0 | The default VM instance count. Setting this to 0 means instances will only be created on demand |
| AZ_VMSS_MANAGED_IDENTITY | true | Whether to create an Azure AD Managed Identity (MI) for the VMSS |
| AZ_VMSS_CREATE_RBAC | true | Whether to assign the Contributor and User Access Administrator roles on the subscription to the above MI as part of the VMSS creation |
| AZ_VMSS_CUSTOM_DATA | true | Whether to supply custom data for cloud-init. Looks for the file scripts/custom-data.sh if enabled |
| AZ_VMSS_EXT_CSE | true | Whether the Custom Script Extension should be activated |
| AZ_VMSS_EXT_CSE_URI | see below | The location of the Custom Script Extension file to download |
| AZ_VMSS_EXT_CSE_CMD | ./cse-vmss-startup.sh | The name of the Custom Script Extension file to execute |

Default value for AZ_VMSS_EXT_CSE_URI: https://raw.githubusercontent.com/tonyskidmore/azurerm-vmss-cse/main/cse-vmss-startup.sh
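The Linux Custom Script Extension takes its configuration as a small JSON document with a `fileUris` list and a `commandToExecute` string. As a hedged sketch of how the two CSE variables could map onto that document, the helper below builds the settings and prints a matching az vmss extension set command. The function names are hypothetical, the repository's script may wire this up differently, and the simple `printf` assumes neither value contains quotes or backslashes.

```shell
# Build the Custom Script Extension settings JSON from the CSE variables.
# Assumes the values contain no double quotes or backslashes.
cse_settings() {
  local uri cmd
  uri="${AZ_VMSS_EXT_CSE_URI:-https://raw.githubusercontent.com/tonyskidmore/azurerm-vmss-cse/main/cse-vmss-startup.sh}"
  cmd="${AZ_VMSS_EXT_CSE_CMD:-./cse-vmss-startup.sh}"
  printf '{"fileUris":["%s"],"commandToExecute":"%s"}\n' "$uri" "$cmd"
}

# Print (not execute) a command that would attach the extension to the VMSS.
print_cse_command() {
  echo az vmss extension set \
    --resource-group "${AZ_VMSS_RESOURCE_GROUP_NAME:-rg-vmss-azdo-agents-01}" \
    --vmss-name "${AZ_VMSS_NAME:-vmss-azdo-agents-01}" \
    --publisher Microsoft.Azure.Extensions \
    --name CustomScript \
    --version 2.0 \
    --settings "'$(cse_settings)'"
}
```

Exporting a different AZ_VMSS_EXT_CSE_URI and AZ_VMSS_EXT_CSE_CMD before calling `print_cse_command` shows how a startup script of your own would be referenced instead.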

You are now ready to move on to Creating an Azure DevOps Virtual Machine Scale Self-Hosted Agent Pool!