Migrating Go and Node.js Fargate tasks and Lambda functions to Graviton ARM processors with CDK


I've been using AWS's Graviton ARM processors for personal projects for a while.


They're cheaper, use lower-power chips, and might be better for the environment [0].


[0]


But... can I actually use them for commercial projects? I decided to try and migrate some production services:


TypeScript Lambda functions.

Go Lambda functions.

A Next.js Web application written in TypeScript, running in Fargate.

Docker containers that run Go code.


Since I use an M1 Mac to develop on, I was fairly confident that my code would work fine on an ARM processor in AWS.


The migration process is explained in an AWS blog [1], but this post covers the use of CDK instead.


[1]


TypeScript / Node.js Lambda functions


Migrating Node.js Lambda functions with CDK is very straightforward. It's just a case of setting the `architecture` field to `ARM_64` in the `NodejsFunction` properties.


import { Stack, StackProps, CfnOutput } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as lambdaNodeJS from 'aws-cdk-lib/aws-lambda-nodejs';

export class LfurlStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const f = new lambdaNodeJS.NodejsFunction(this, 'test', {
      entry: 'function.ts',
      architecture: lambda.Architecture.ARM_64,
    })
    const functionUrl = f.addFunctionUrl({
      authType: lambda.FunctionUrlAuthType.NONE,
    })
    new CfnOutput(this, 'functionUrl', {
      value: functionUrl.url,
    })
  }
}

Go Lambda functions


Migrating a Go function requires changing the runtime from `awslambda.Runtime_GO_1_X()` to `awslambda.Runtime_PROVIDED_AL2()`, as well as setting the `Architecture` field to `awslambda.Architecture_ARM_64()`.


In my production account, this caused AWS Security Hub to complain that I was using an unsupported runtime for Lambda, even though it is supported. In fact, the Amazon Linux 2 runtime is more up-to-date than the built-in Go runtime [2].


[2]


This blog post from Capital One [3] also talks about using a build tag to reduce the binary size by stripping out the RPC code (no longer required, since we're using `provided.al2`) from the output binary, to make it start up faster.


[3]


I tried with and without. Adding `-tags lambda.norpc` to my build config took 280KB off the output binary size, but there was a negligible difference in cold start times. However, since it's supported and tested in the AWS libraries [4], I stuck with it.


[4]


package main

import (
	"github.com/aws/aws-cdk-go/awscdk/v2"
	"github.com/aws/aws-cdk-go/awscdk/v2/awslambda"
	awsapigatewayv2 "github.com/aws/aws-cdk-go/awscdkapigatewayv2alpha/v2"
	awsapigatewayv2integrations "github.com/aws/aws-cdk-go/awscdkapigatewayv2integrationsalpha/v2"
	awslambdago "github.com/aws/aws-cdk-go/awscdklambdagoalpha/v2"
	"github.com/aws/constructs-go/constructs/v10"
	jsii "github.com/aws/jsii-runtime-go"
)

func NewExampleStack(scope constructs.Construct, id string, props *awscdk.StackProps) awscdk.Stack {
	stack := awscdk.NewStack(scope, &id, props)

	bundlingOptions := &awslambdago.BundlingOptions{
		GoBuildFlags: &[]*string{jsii.String(`-ldflags "-s -w" -tags lambda.norpc`)},
	}
	f := awslambdago.NewGoFunction(stack, jsii.String("handler"), &awslambdago.GoFunctionProps{
		Runtime:      awslambda.Runtime_PROVIDED_AL2(),
		Architecture: awslambda.Architecture_ARM_64(),
		Entry:        jsii.String("../lambda"),
		Bundling:     bundlingOptions,
	})
	// Add a Function URL.
	url := f.AddFunctionUrl(&awslambda.FunctionUrlOptions{
		AuthType: awslambda.FunctionUrlAuthType_NONE,
	})
	awscdk.NewCfnOutput(stack, jsii.String("lambdaFunctionUrl"), &awscdk.CfnOutputProps{
		ExportName: jsii.String("lambdaFunctionUrl"),
		Value:      url.Url(),
	})

	return stack
}
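
For completeness, the Lambda handler code itself doesn't need to change at all. A minimal sketch of a Go handler behind the Function URL might look like the following (this is illustrative rather than taken from my project; the `lambda.norpc` tag only strips the legacy RPC entry point, so the handler is written exactly as it would be for x86):


package main

import (
	"context"

	"github.com/aws/aws-lambda-go/events"
	"github.com/aws/aws-lambda-go/lambda"
)

// handler responds to Lambda Function URL requests. The same source compiles
// for ARM64 without modification, since the architecture is chosen at build time.
func handler(ctx context.Context, req events.LambdaFunctionURLRequest) (events.LambdaFunctionURLResponse, error) {
	return events.LambdaFunctionURLResponse{
		StatusCode: 200,
		Body:       "Hello from Graviton",
	}, nil
}

func main() {
	lambda.Start(handler)
}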

Next.js and Go Docker containers


I use a minor variation on the Vercel example [5] to create my production Next.js Docker containers.


[5]


I could build and run the Dockerfile perfectly on my ARM Mac already, so I didn't have to make any changes.


However, if you're customising your Docker images to use additional tools, you might need to make sure that you're downloading the appropriate x86 or ARM version of binary distributions.


I found that using `dpkg --print-architecture` to detect the architecture and normalise the filenames made this simpler.


RUN mkdir -p /downloads
RUN curl -fsSL -o /downloads/awscli_amd64.zip https://awscli.amazonaws.com/awscli-exe-linux-x86_64-2.7.12.zip
RUN curl -fsSL -o /downloads/awscli_arm64.zip https://awscli.amazonaws.com/awscli-exe-linux-aarch64-2.7.12.zip
RUN curl -fsSL -o /downloads/go_amd64.tar.gz "https://go.dev/dl/go1.18.3.linux-amd64.tar.gz"
RUN curl -fsSL -o /downloads/go_arm64.tar.gz "https://go.dev/dl/go1.18.3.linux-arm64.tar.gz"
# Keep only the version that matches the target architecture (amd64 or arm64).
RUN mv "/downloads/awscli_$(dpkg --print-architecture).zip" /downloads/awscli.zip
RUN mv "/downloads/go_$(dpkg --print-architecture).tar.gz" /downloads/go.tar.gz

CDK changes were minimal. I only had to change the `platform` of the `DockerImageAsset` to `LINUX_ARM64` and the `cpuArchitecture` of the `FargateTaskDefinition` to `ARM64`.


import { Stack, StackProps, CfnOutput } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { DockerImageAsset, Platform } from 'aws-cdk-lib/aws-ecr-assets';
import * as path from 'path';
import { ApplicationLoadBalancedFargateService } from 'aws-cdk-lib/aws-ecs-patterns';
import { ContainerImage, FargateTaskDefinition, CpuArchitecture, OperatingSystemFamily } from 'aws-cdk-lib/aws-ecs';

export class ArmTestStack extends Stack {
    constructor(scope: Construct, id: string, props?: StackProps) {
        super(scope, id, props);

        const image = new DockerImageAsset(this, "ArmNodeExample", {
            directory: path.join(__dirname, "../node-docker-example"),
            platform: Platform.LINUX_ARM64,
        })
        const taskDefinition = new FargateTaskDefinition(this, "TaskDef", {
            runtimePlatform: {
                operatingSystemFamily: OperatingSystemFamily.LINUX,
                cpuArchitecture: CpuArchitecture.ARM64,
            },
            cpu: 1024,
            memoryLimitMiB: 2048,
        });
        taskDefinition.addContainer("Web", {
            portMappings: [{ containerPort: 3000 }],
            image: ContainerImage.fromDockerImageAsset(image),
        });
        const service = new ApplicationLoadBalancedFargateService(this, "LoadBalancedService", {
            assignPublicIp: true,
            taskDefinition,
        })
        new CfnOutput(this, "endpointURL", { value: service.loadBalancer.loadBalancerDnsName, })
    }
}

Go Docker containers


The process for updating the Go-based Docker CDK projects was exactly the same as for the Node.js Docker containers: set the Docker image platform and the task definition's runtime platform to ARM64.


The `DockerImageAsset` construct in CDK is smart enough to compile the Go code with `GOARCH` set to build for ARM64.
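

For reference, the equivalent two changes in a Go CDK stack look roughly like this (a sketch only; the construct IDs, Docker context directory, and container port below are placeholders rather than values from my project):


package main

import (
	"github.com/aws/aws-cdk-go/awscdk/v2"
	"github.com/aws/aws-cdk-go/awscdk/v2/awsecrassets"
	"github.com/aws/aws-cdk-go/awscdk/v2/awsecs"
	"github.com/aws/constructs-go/constructs/v10"
	jsii "github.com/aws/jsii-runtime-go"
)

func NewArmDockerStack(scope constructs.Construct, id string, props *awscdk.StackProps) awscdk.Stack {
	stack := awscdk.NewStack(scope, &id, props)

	// Build the Docker image for ARM64.
	image := awsecrassets.NewDockerImageAsset(stack, jsii.String("ArmGoExample"), &awsecrassets.DockerImageAssetProps{
		Directory: jsii.String("../go-docker-example"),
		Platform:  awsecrassets.Platform_LINUX_ARM64(),
	})

	// Run the task on ARM64 (Graviton) Fargate.
	taskDefinition := awsecs.NewFargateTaskDefinition(stack, jsii.String("TaskDef"), &awsecs.FargateTaskDefinitionProps{
		RuntimePlatform: &awsecs.RuntimePlatform{
			OperatingSystemFamily: awsecs.OperatingSystemFamily_LINUX(),
			CpuArchitecture:       awsecs.CpuArchitecture_ARM64(),
		},
		Cpu:            jsii.Number(1024),
		MemoryLimitMiB: jsii.Number(2048),
	})
	taskDefinition.AddContainer(jsii.String("Web"), &awsecs.ContainerDefinitionOptions{
		Image:        awsecs.ContainerImage_FromDockerImageAsset(image),
		PortMappings: &[]*awsecs.PortMapping{{ContainerPort: jsii.Number(8080)}},
	})

	return stack
}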


Github Actions CI/CD pipelines


Github Actions with Lambda


For TypeScript / Node.js and Go, these all just worked without any changes. Nothing to do at all!


Github Actions with Fargate / Docker


For a simple "Hello World" app, I just needed to add the QEMU and Docker Buildx support to allow the x86 runners to be able to build for ARM64, and that was it.


- name: Set up QEMU
  uses: docker/setup-qemu-action@v2

- name: Set up Docker Buildx
  id: buildx
  uses: docker/setup-buildx-action@v2

Unfortunately, when I tried to get the Next.js multi-stage example working, I got errors like this:


Step 4/23 : COPY package.json yarn.lock* package-lock.json* pnpm-lock.yaml* ./
failed to get destination image "sha256:b146e0c3ee7ccd3e761491562dfa9c96075c7ed9932c6237a9217cc13fb6c527": image with reference sha256:b146e0c3ee7ccd3e761491562dfa9c96075c7ed9932c6237a9217cc13fb6c527 was found but does not match the specified platform: wanted linux/arm64, actual: linux/amd64

It seems that Github Actions can also get confused between ARM and x86 downloads, and might need a reminder to use the ARM version (run `docker pull --platform=linux/arm64 container:label`). Unfortunately, I ran into other weirdness around builds, and wasted a lot of time trying to work out what the problem was.


I "fixed" everything by moving to a self-hosted Github runner than runs in an ARM instance within AWS.


I didn't want to run my own CI/CD instance, but the CDK setup is fairly straightforward, and it comes with some benefits:


I can use AWS Session Manager to get onto the machine and read the logs.

The CI server can have access to run integration tests against internal services.

The CI server can be more performant than Github's hosted runners.

It's cheaper to run your own than to pay for build minutes.

VPC flow logs can highlight unusual behaviour.


package main

import (
	_ "embed"
	"os"

	"github.com/aws/aws-cdk-go/awscdk"
	"github.com/aws/aws-cdk-go/awscdk/awsec2"
	"github.com/aws/aws-cdk-go/awscdk/awsiam"
	"github.com/aws/aws-cdk-go/awscdk/awsssm"
	"github.com/aws/constructs-go/constructs/v3"
	"github.com/aws/jsii-runtime-go"
)

type CIRunnerStackProps struct {
	awscdk.StackProps
	VPCID string
}

//go:embed userdata.sh
var userData string

func NewCIRunnerStack(scope constructs.Construct, id string, props *CIRunnerStackProps) awscdk.Stack {
	var sprops awscdk.StackProps
	if props != nil {
		sprops = props.StackProps
	}
	stack := awscdk.NewStack(scope, &id, &sprops)

	vpc := awsec2.Vpc_FromLookup(stack, jsii.String("SharedVpc"), &awsec2.VpcLookupOptions{
		VpcId: &props.VPCID,
	})
	role := awsiam.NewRole(stack, jsii.String("BuildServerRole"), &awsiam.RoleProps{
		AssumedBy: awsiam.NewServicePrincipal(jsii.String("ec2.amazonaws.com"), nil),
	})

	machineImage := awsec2.MachineImage_FromSSMParameter(jsii.String("/aws/service/canonical/ubuntu/server/focal/stable/current/arm64/hvm/ebs-gp2/ami-id"), awsec2.OperatingSystemType_LINUX, nil)

	// Add an EC2 instance running ARM64.
	instance := awsec2.NewInstance(stack, jsii.String("BuildServer"), &awsec2.InstanceProps{
		InstanceType:     awsec2.InstanceType_Of(awsec2.InstanceClass_BURSTABLE4_GRAVITON, awsec2.InstanceSize_SMALL),
		MachineImage:     machineImage,
		Vpc:              vpc,
		AllowAllOutbound: jsii.Bool(true),
		BlockDevices: &[]*awsec2.BlockDevice{
			{
				DeviceName: jsii.String("/dev/sda1"),
				Volume: awsec2.BlockDeviceVolume_Ebs(jsii.Number(128),
					&awsec2.EbsDeviceOptions{
						DeleteOnTermination: jsii.Bool(true),
						VolumeType:          awsec2.EbsDeviceVolumeType_GP3,
					},
				),
				MappingEnabled: jsii.Bool(true),
			},
		},
		DetailedMonitoring:        jsii.Bool(true),
		RequireImdsv2:             jsii.Bool(true),
		Role:                      role,
		SecurityGroup:             nil,
		UserData:                  awsec2.UserData_Custom(&userData),
		UserDataCausesReplacement: jsii.Bool(true),
		VpcSubnets:                &awsec2.SubnetSelection{},
	})
	instance.Role().AddManagedPolicy(awsiam.ManagedPolicy_FromAwsManagedPolicyName(jsii.String("AmazonSSMManagedInstanceCore")))
	// Enable CloudWatch Agent (https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Agent-on-EC2-Instance.html).
	instance.Role().AddManagedPolicy(awsiam.ManagedPolicy_FromAwsManagedPolicyName(jsii.String("CloudWatchAgentServerPolicy")))

	// Give the Github Runner instance access to the secrets it needs to register itself.
	patParameter := awsssm.StringParameter_FromSecureStringParameterAttributes(stack, jsii.String("GithubPAT"), &awsssm.SecureStringParameterAttributes{
		ParameterName: jsii.String("/github/actions/runner/pat"),
	})
	patParameter.GrantRead(instance)

	// aws ssm start-session --target ${INSTANCE_ID} --region=eu-west-1
	awscdk.NewCfnOutput(stack, jsii.String("InstanceID"), &awscdk.CfnOutputProps{
		Value: instance.InstanceId(),
	})

	return stack
}

You might have noticed the use of two external dependencies. One is the `userdata.sh` file used to configure the instance, the other is a Github Personal Access Token which I'd previously added to SSM parameter store.


The Github Personal Access token is used to register the runner with Github Actions. It must be generated by an owner of the Organisation, and also have the `admin:org` and `admin:enterprise` scopes enabled.


The `userdata.sh` file installs Docker, the AWS CLI, and then installs the Github Actions runner.


Note that I've hard coded it to work for `https://github.com/a-h` - you'll need to adjust this to match your Github organisation.


#!/bin/sh

# Install Docker
apt-get -y update
apt-get -y install htop jq apt-transport-https ca-certificates curl gnupg lsb-release unzip
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=arm64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get -y update
apt-get install -y docker-ce docker-ce-cli containerd.io
usermod -a -G docker ubuntu
systemctl start docker
systemctl enable docker

# Install AWS CLI.
apt-get install -y awscli

# https://docs.github.com/en/actions/hosting-your-own-runners/configuring-the-self-hosted-runner-application-as-a-service

# Get the personal access token, and runner registration token from SSM.
export GITHUB_PAT=`aws ssm get-parameter --region=eu-west-1 --name="/github/actions/runner/pat" --query "Parameter.Value" --output text --with-decryption`

# Install actions-runnner.
mkdir -p /opt/actions-runner
cd /opt/actions-runner
curl -o actions-runner-linux-arm64-2.294.0.tar.gz -L https://github.com/actions/runner/releases/download/v2.294.0/actions-runner-linux-arm64-2.294.0.tar.gz
echo "98c34d401105b83906fd988c184b96d1891eaa1b28856020211fee4a9c30bc2b  actions-runner-linux-arm64-2.294.0.tar.gz" | shasum -a 256 -c
tar xzf ./actions-runner-linux-arm64-2.294.0.tar.gz
echo "Configuring"
sudo chown -R github:github /opt/actions-runner
sudo RUNNER_ALLOW_RUNASROOT=1 ./config.sh --unattended --work "_work" --url https://github.com/a-h --pat $GITHUB_PAT --replace --name "ARM Runner"
echo "Installing"
sudo ./svc.sh install

# Start service.
sudo ./svc.sh start

To use the runner, I had to configure the Github Actions YAML to target it by setting the `runs-on` field to match the self-hosted runner's labels.


jobs:
  deploy:
    runs-on: [self-hosted, Linux, ARM64]

If Github Actions had built-in ARM64 runners, that would be really helpful. Unfortunately, it's not even on the Github roadmap. There's a feedback item at [7] if you'd like to vote for that.


[7]


Go Lambda function performance


The Go ARM Lambda functions started faster than before. It's clearly visible in the graph when I switched.


However, it could be that the performance improvement is down to removing the Lambda RPC code and migrating to AL2, rather than to ARM itself.


./go_arm_lambda_init_duration.png


There was no detectable shift in duration, probably because it's a highly network-bound service.


./go_arm_lambda_duration.png



Fargate Tasks


I didn't spot any performance difference between the ARM and x64 versions of the Next.js container I migrated, so it's just cheaper for this use case.


Summary


ARM Lambda functions and Fargate tasks are my default for new workloads now.


It was really easy to migrate AWS Lambda functions. While there's a risk in the migration, the payoff seems to be a faster cold start and 20% off the cost.


Docker workloads are also easy to migrate, but Github Actions spoiled it slightly by not having ARM build servers. Still, maintaining a self-hosted Github Actions runner isn't much effort at all; it's been running for a few weeks without problems.


