
Pairing With AI: Infrastructure as Code

By Kevin Formsma
June 22, 2023

Introduction

Tools to improve the development experience have existed for decades. From syntax checking to code formatting, refactoring to auto-completion, developers working within an IDE frequently rely on extensions that add functionality and improve their productivity. Recently, AI has enhanced these extensions so they can offer better recommendations to developers. As infrastructure engineers, let’s explore some of those new tools and how they handle Infrastructure-as-Code (IaC) tasks.

Pairing Partners

Pair programming is a well-established practice in which two individuals work synchronously and collaboratively on a piece of code. One developer usually drives while the other observes and provides real-time feedback, suggestions, and debugging. Despite the benefits, pair programming has its challenges. This is where AI tools come in. With an AI pairing partner, developers don’t need to coordinate schedules for synchronous time or manage social dynamics and awkward breaks. The AI partner takes the observer role, gathering information from the context of the current codebase and providing suggestions on what comes next.

For this coding exercise, we will look at two AI pairing tools that integrate into VSCode. GitHub developed Copilot as one of the first AI-driven pair programming tools. Copilot offers suggestions that work like auto-completion but are generated by an AI model powered by OpenAI Codex. AWS released a similar AI-powered IDE integration, CodeWhisperer. Both tools can generate code suggestions from descriptions written as inline comments. The examples below demonstrate how this feature behaves when the task is writing IaC.

Examples

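The example infrastructure task for our exploration is to deploy an AWS S3 bucket. This bucket should include the following explicit configuration: public access blocked, ACLs disabled, SSE-KMS encryption enabled, and a tag with the key "AI" and the value "HelloWorld".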

This resource will be configured in three different IaC tools: CloudFormation, Terraform, and CDK. Copilot and CodeWhisperer are given the same context prompt and evaluated in each IaC tool. Let’s see what they suggest as our pairing partner.

CloudFormation

CloudFormation is represented in JSON or YAML rather than a true programming language. The AI pairing tools typically call out specific programming languages they interact with, but a lot of IaC tasks involve templating resources out in YAML rather than Python or Go. We start with an empty YAML template with a comment to prompt for creation of the S3 resource.

# AWS Cloudformation template with a S3 bucket
# configured to block public access, disable ACLs, enable SSE-KMS encryption
# Tagged with "AI" -> "HelloWorld"

Copilot

[Figure: Copilot suggesting CloudFormation]

Copilot is able to use the context from the comments to provide suggestions on the implementation. It created the basic template structure and added the S3 bucket complete with tags, encryption, and public access blocked, but it did not understand what was intended by “disable ACLs” in the prompt. When the developer helps by adding this information, Copilot sets the ownership to BucketOwnerPreferred rather than BucketOwnerEnforced, which is the value that actually disables ACLs.
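For reference, here is a minimal sketch of a template that would satisfy all four requirements, written as a Python dictionary and serialized to JSON (CloudFormation accepts JSON as well as YAML). This is not Copilot’s output. The logical ID HelloWorldBucket and the helper name build_template are illustrative assumptions, and the encryption block assumes the AWS managed KMS key is acceptable, since the prompt does not name a key.

```python
import json


def build_template() -> dict:
    """Return a CloudFormation template for the example S3 bucket.

    The logical ID "HelloWorldBucket" is a hypothetical name chosen
    for illustration only.
    """
    bucket_properties = {
        # Block every form of public access.
        "PublicAccessBlockConfiguration": {
            "BlockPublicAcls": True,
            "BlockPublicPolicy": True,
            "IgnorePublicAcls": True,
            "RestrictPublicBuckets": True,
        },
        # BucketOwnerEnforced is the setting that disables ACLs;
        # BucketOwnerPreferred leaves them enabled.
        "OwnershipControls": {
            "Rules": [{"ObjectOwnership": "BucketOwnerEnforced"}]
        },
        # SSE-KMS; with no key ID given, S3 uses the AWS managed key.
        "BucketEncryption": {
            "ServerSideEncryptionConfiguration": [
                {"ServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"}}
            ]
        },
        "Tags": [{"Key": "AI", "Value": "HelloWorld"}],
    }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "HelloWorldBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": bucket_properties,
            }
        },
    }


if __name__ == "__main__":
    # Print the template as JSON, ready to save to a file.
    print(json.dumps(build_template(), indent=2))
```

The one line that matters most for this comparison is the ObjectOwnership value, since that is where the suggestion went wrong.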

CodeWhisperer

AWS CodeWhisperer did not attempt to generate suggestions for the CloudFormation template. This is disappointing but not surprising since it is not listed directly as a supported language.

Terraform

Terraform uses HCL (HashiCorp Configuration Language) to declare the resources to be managed. For context, we started a single file containing the basic Terraform configuration, provider settings, and a variable for the bucket name. The S3 bucket is again described in a comment.

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
  required_version = "~> 1.4"
}
provider "aws" {
  region = "us-east-1"
}
variable "bucket_name" {
  description = "Name of the S3 Bucket"
  type        = string
}

# S3 bucket configured with tagged with "AI" -> "HelloWorld"
# Disable public access for the bucket
# Disable ACLs for the bucket
# Enable SSE-KMS encryption for the bucket

Copilot

[Figure: Copilot suggesting Terraform]

Copilot’s suggestions are mixed when using the context from the Terraform file. Initially, it suggests an S3 bucket resource with deprecated attributes (acl, server_side_encryption_configuration) and invalid attributes (e.g. block_public_acls, restrict_public_buckets). Once the developer removes the invalid code, Copilot correctly generates resources that meet the desired configuration for the S3 bucket. As with CloudFormation, some developer input is needed to guide Copilot towards the valid resources available in Terraform. The largest fault here is that suggestions are generated with invalid resource or attribute names.
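To make the valid shape concrete: in version 4 of the AWS provider, public access blocking, ownership controls, and default encryption are each configured on their own resource rather than as attributes of aws_s3_bucket. The sketch below builds that layout using Terraform’s JSON configuration syntax, generated from Python. It is an illustration, not the code Copilot produced. The resource label hello_world and the function name are assumptions, and the output is meant to sit in a .tf.json file next to the provider and variable blocks shown above.

```python
import json


def build_terraform_config() -> dict:
    """Return a Terraform JSON configuration for the example bucket.

    The label "hello_world" is a hypothetical resource name. The
    configuration relies on var.bucket_name being declared elsewhere.
    """
    # Reference to the bucket that the companion resources attach to.
    bucket_id = "${aws_s3_bucket.hello_world.id}"
    return {
        "resource": {
            # The bucket itself carries only the name and tags.
            "aws_s3_bucket": {
                "hello_world": {
                    "bucket": "${var.bucket_name}",
                    "tags": {"AI": "HelloWorld"},
                }
            },
            # Public access settings live on a separate resource.
            "aws_s3_bucket_public_access_block": {
                "hello_world": {
                    "bucket": bucket_id,
                    "block_public_acls": True,
                    "block_public_policy": True,
                    "ignore_public_acls": True,
                    "restrict_public_buckets": True,
                }
            },
            # BucketOwnerEnforced disables ACLs.
            "aws_s3_bucket_ownership_controls": {
                "hello_world": {
                    "bucket": bucket_id,
                    "rule": {"object_ownership": "BucketOwnerEnforced"},
                }
            },
            # SSE-KMS as the default encryption for new objects.
            "aws_s3_bucket_server_side_encryption_configuration": {
                "hello_world": {
                    "bucket": bucket_id,
                    "rule": {
                        "apply_server_side_encryption_by_default": {
                            "sse_algorithm": "aws:kms"
                        }
                    },
                }
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_terraform_config(), indent=2))
```

Note that block_public_acls and restrict_public_buckets are valid here only because they sit on aws_s3_bucket_public_access_block; placed directly on aws_s3_bucket they are rejected.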

CodeWhisperer

Again, as with CloudFormation, AWS CodeWhisperer did not attempt to generate suggestions for the Terraform file.

CDK

The CDK supports several programming languages. For this example, Python was chosen because both AI pairing tools support it. A basic CDK project was initialized using the CDK Toolkit. The Python file for the CDK stack was then updated with a comment describing the S3 resource, as shown.

from aws_cdk import (
    aws_s3 as s3,
    Stack,
    Tags
)
from constructs import Construct

class CdkStack(Stack):

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # S3 bucket configured with 
        # - Disable public access for the bucket
        # - Disable ACLs for the bucket
        # - Enable SSE-KMS encryption for the bucket
        # - Tag with "AI" -> "HelloWorld"

Copilot

[Figure: Copilot suggesting CDK]

Copilot quickly generates an L2 construct for the S3 bucket with most settings provided. It has to be prompted again for the configuration that disables ACLs, and it then suggests the wrong setting. In some test iterations, Copilot generated an ownership controls configuration that looked correct but was actually written for the L1 S3 bucket construct. As with the Terraform example, Copilot is prone to providing invalid suggestions that give the appearance of correctness.
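Because a plausible-looking suggestion can still be wrong, one way to catch it is to inspect the CloudFormation template that the CDK synthesizes instead of trusting the construct code. The following is a rough sketch of such a check using only the Python standard library. The function name find_bucket_issues and the sample template are hypothetical, and the check assumes the template has already been loaded from JSON into a dictionary.

```python
def find_bucket_issues(template: dict) -> list[str]:
    """Return a list of problems for every S3 bucket in a template."""
    issues = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::S3::Bucket":
            continue
        props = resource.get("Properties", {})

        # ACLs are disabled only when ownership is BucketOwnerEnforced.
        rules = props.get("OwnershipControls", {}).get("Rules", [])
        if not any(r.get("ObjectOwnership") == "BucketOwnerEnforced" for r in rules):
            issues.append(f"{logical_id}: ACLs are not disabled")

        # All four public access flags must be true.
        access = props.get("PublicAccessBlockConfiguration", {})
        flags = ("BlockPublicAcls", "BlockPublicPolicy",
                 "IgnorePublicAcls", "RestrictPublicBuckets")
        if not all(access.get(flag) is True for flag in flags):
            issues.append(f"{logical_id}: public access is not fully blocked")

        # At least one default encryption rule must use SSE-KMS.
        sse_rules = props.get("BucketEncryption", {}).get(
            "ServerSideEncryptionConfiguration", [])
        algorithms = {
            rule.get("ServerSideEncryptionByDefault", {}).get("SSEAlgorithm")
            for rule in sse_rules
        }
        if "aws:kms" not in algorithms:
            issues.append(f"{logical_id}: SSE-KMS encryption is not enabled")

        # The bucket must carry the AI -> HelloWorld tag.
        tags = {t.get("Key"): t.get("Value") for t in props.get("Tags", [])}
        if tags.get("AI") != "HelloWorld":
            issues.append(f"{logical_id}: missing tag AI=HelloWorld")
    return issues


# Hypothetical template that mirrors the mistake described above:
# everything is configured except that ownership is BucketOwnerPreferred.
SAMPLE_TEMPLATE = {
    "Resources": {
        "HelloWorldBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {
                "OwnershipControls": {
                    "Rules": [{"ObjectOwnership": "BucketOwnerPreferred"}]
                },
                "PublicAccessBlockConfiguration": {
                    "BlockPublicAcls": True,
                    "BlockPublicPolicy": True,
                    "IgnorePublicAcls": True,
                    "RestrictPublicBuckets": True,
                },
                "BucketEncryption": {
                    "ServerSideEncryptionConfiguration": [
                        {"ServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"}}
                    ]
                },
                "Tags": [{"Key": "AI", "Value": "HelloWorld"}],
            },
        }
    }
}

if __name__ == "__main__":
    for issue in find_bucket_issues(SAMPLE_TEMPLATE):
        print(issue)
```

In practice the dictionary would come from the template file that cdk synth writes under cdk.out, loaded with json.load. The same check applies to a hand-written CloudFormation template in JSON.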

CodeWhisperer

[Figure: CodeWhisperer suggesting CDK]

CodeWhisperer is finally able to provide useful suggestions as a pairing partner within the context of CDK. It suggests an L2 construct for the bucket with encryption and public access blocked. When prompted for the ownership controls, it correctly chooses the value that disables ACLs, something Copilot was not able to do. It makes quick work of adding the appropriate tags to the resource. Importantly, none of the suggestions provided here by CodeWhisperer are invalid or confuse L1 and L2 CDK constructs.

Final Thoughts

AI pairing tools that can quickly generate IaC resources or adapt configuration will improve the developer experience for maintaining increasingly complex infrastructure. Copilot and CodeWhisperer provide a starting point, as AI pairing tools are still early in their maturity. While initially focused on improving application development, these tools have a future in broader workflows.

AWS has an opportunity to improve CodeWhisperer so that it supports and understands its own services, namely by adding support for CloudFormation. Both CloudFormation and Terraform are well defined and should be readily incorporated into the models to provide accurate suggestions to the developer. GitHub’s Copilot is flexible in the coding contexts where it will provide suggestions, but it was more prone to providing incorrect code. That might be problematic for developers with less experience or those learning a new IaC tool or cloud service. GitHub is already working on more advanced features for Copilot, as envisioned in their Copilot X preview.

As these tools mature, the user story for using an AI pairing tool for IaC development will improve. In the current state, Copilot is the more usable of the two, particularly for experienced developers who can provide partial code context and recognize incorrect suggestions. As organizations focus on improving the developer experience, AI pairing will likely become a mainstay.