Workload Identity Federation: AWS Lambda Access to GCP Cloud Storage and Pub/Sub

Overview
Workload Identity Federation allows AWS Lambda to authenticate to Google Cloud without using service account keys. Instead, the function's AWS IAM role identity is exchanged for a short-lived Google credential via the Security Token Service, which is then used to impersonate a Google Cloud service account, granting the Lambda function access to the specified GCP resources.
Prerequisites
Before proceeding, ensure the following Google Cloud APIs are enabled in your project:
IAM API
Resource Manager API
Service Account Credentials API
Security Token Service API
These APIs can be enabled using the gcloud services enable command.
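For example, all four can be enabled with a single command (API service names per the public Google Cloud catalog; replace gcp-project-id with your project ID):
gcloud services enable \
iam.googleapis.com \
cloudresourcemanager.googleapis.com \
iamcredentials.googleapis.com \
sts.googleapis.com \
--project=gcp-project-id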
Service Account Creation with Required Permissions
Create a GCP service account and grant it the following roles (this set covers GCS buckets and Pub/Sub topics); sample commands are shown after the list.
Service Account Token Creator
Pub/Sub Subscriber
Storage Bucket Viewer
Storage Object Viewer
Workload Identity User
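As a sketch, assuming a hypothetical service account named aws-lambda-sa (adjust names and verify role IDs against your organization's standards), the account can be created and granted the data-access roles at the project level as shown below. The two impersonation roles, Service Account Token Creator and Workload Identity User, are bound directly on the service account in step 3.
gcloud iam service-accounts create aws-lambda-sa \
--project=gcp-project-id \
--display-name="AWS Lambda federation service account"
gcloud projects add-iam-policy-binding gcp-project-id \
--member="serviceAccount:aws-lambda-sa@gcp-project-id.iam.gserviceaccount.com" \
--role="roles/pubsub.subscriber"
# Repeat the add-iam-policy-binding command for roles/storage.bucketViewer and roles/storage.objectViewer.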
Configuration Steps
The following sections detail the CLI commands used to set up the Workload Identity Federation.
1. Create Workload Identity Pool
A workload identity pool is a Google Cloud Identity and Access Management (IAM) entity that manages external identities, allowing them to authenticate to Google Cloud and access resources without traditional service account keys.
gcloud iam workload-identity-pools create identity-pool-name \
--project=gcp-project-id \
--location="global" \
--description="workload identity pool description" \
--display-name="workload identity pool display name"
Note: Replace identity-pool-name with an identifier for the pool and gcp-project-id with your GCP project ID. Update the description and display name per your naming conventions.
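To confirm the pool was created, you can describe it:
gcloud iam workload-identity-pools describe identity-pool-name \
--project=gcp-project-id \
--location="global"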
2. Create Provider for Identity Pool (AWS Lambda)
This step creates a provider within the workload identity pool, specifically for AWS Lambda. The attribute-mapping defines how AWS IAM attributes are mapped to Google Cloud attributes.
gcloud iam workload-identity-pools providers create-aws provider_name \
--project=gcp-project-id \
--location="global" \
--workload-identity-pool="identity-pool-name" \
--account-id="aws-account-id" \
--attribute-mapping="google.subject=assertion.arn,attribute.account=assertion.account,attribute.aws_role=assertion.arn.extract('assumed-role/{role}/')"
Note: Replace provider_name with an identifier (e.g. aws-provider), gcp-project-id with your project ID, identity-pool-name with the pool created in the previous step, and aws-account-id with your AWS account ID. Keep the attribute mapping as shown, and don't hardcode any role here.
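To illustrate what this mapping produces, suppose the Lambda function runs under the IAM role aws-lambda-iam-role in account 123456789012 (made-up values). The assumed-role ARN it presents then maps as follows:
arn:aws:sts::123456789012:assumed-role/aws-lambda-iam-role/my-function
-> google.subject = the full ARN above
-> attribute.account = 123456789012
-> attribute.aws_role = aws-lambda-iam-role (the segment extracted by 'assumed-role/{role}/')
The attribute.aws_role value is what the IAM bindings in the next step match against.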
3. Bind IAM Policy with Service Account for Workload Identity Pool
These commands grant the roles/iam.serviceAccountTokenCreator and roles/iam.workloadIdentityUser roles to the AWS Lambda IAM role, resolved through the workload identity pool, allowing it to impersonate the GCP service account.
gcloud iam service-accounts add-iam-policy-binding gcp-service-account-email \
--member="principalSet://iam.googleapis.com/projects/gcp-project-number/locations/global/workloadIdentityPools/identity-pool-name/attribute.aws_role/aws-lambda-iam-role" \
--role="roles/iam.serviceAccountTokenCreator" \
--project gcp-project-id
gcloud iam service-accounts add-iam-policy-binding gcp-service-account-email \
--member="principalSet://iam.googleapis.com/projects/gcp-project-number/locations/global/workloadIdentityPools/identity-pool-name/attribute.aws_role/aws-lambda-iam-role" \
--role="roles/iam.workloadIdentityUser" \
--project gcp-project-id
Note: Replace gcp-service-account-email with the service account email, gcp-project-number with the project number, identity-pool-name with the pool name, aws-lambda-iam-role with the AWS Lambda role name, and gcp-project-id with the project ID.
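To verify that both bindings were applied, you can inspect the service account's IAM policy:
gcloud iam service-accounts get-iam-policy gcp-service-account-email \
--project=gcp-project-id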
4. Create Credentials from Workload Identity Pool
This command generates a client library configuration file that AWS Lambda can use to authenticate with Google Cloud.
gcloud iam workload-identity-pools create-cred-config projects/gcp-project-number/locations/global/workloadIdentityPools/identity-pool-name/providers/provider_name \
--service-account=gcp-service-account-email \
--aws \
--output-file=clientLibraryConfig-aws-provider.json # output file name for the workload identity pool credential config
Note: Replace gcp-project-number with the project number, identity-pool-name with the pool name, provider_name with the provider name, and gcp-service-account-email with the service account email.
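For reference, the generated file should look roughly like the following (abridged; exact fields may vary by gcloud version). Note that inside Lambda there is no EC2 metadata endpoint, so the Google auth library instead reads the AWS_REGION, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN environment variables that Lambda sets automatically:
{
  "type": "external_account",
  "audience": "//iam.googleapis.com/projects/gcp-project-number/locations/global/workloadIdentityPools/identity-pool-name/providers/provider_name",
  "subject_token_type": "urn:ietf:params:aws:token-type:aws4_request",
  "service_account_impersonation_url": "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/gcp-service-account-email:generateAccessToken",
  "token_url": "https://sts.googleapis.com/v1/token",
  "credential_source": {
    "environment_id": "aws1",
    "region_url": "http://169.254.169.254/latest/meta-data/placement/availability-zone",
    "url": "http://169.254.169.254/latest/meta-data/iam/security-credentials",
    "regional_cred_verification_url": "https://sts.{region}.amazonaws.com?Action=GetCallerIdentity&Version=2011-06-15"
  }
}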
5. Configure AWS Lambda Environment Variables
Within your AWS Lambda function's environment variables, set the following:
Environment Variable | Value
GOOGLE_CLOUD_PROJECT | gcp-project-id
GOOGLE_APPLICATION_CREDENTIALS | clientLibraryConfig-aws-provider.json (path to the credential file within the AWS Lambda function)
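These variables can also be set from the CLI (this assumes the credential file is bundled at the root of your deployment package, so it resolves under /var/task inside Lambda):
aws lambda update-function-configuration \
--function-name your-lambda-name \
--environment "Variables={GOOGLE_CLOUD_PROJECT=gcp-project-id,GOOGLE_APPLICATION_CREDENTIALS=/var/task/clientLibraryConfig-aws-provider.json}"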
Layer Creation for Lambda:
Create an AWS Lambda layer that includes the required GCP client libraries, enabling the Lambda function to consume messages from a Pub/Sub topic and access files in a GCP Storage bucket.
Steps for Layer Creation:
1. Create folder for layer
mkdir -p gcp-layer/python
cd gcp-layer
2. Install required GCP libraries
python3.10 -m pip install google-cloud-pubsub google-cloud-storage google-auth -t python/
3. Package the layer
zip -r gcp-layer.zip python
4. Publish Lambda layer
aws lambda publish-layer-version \
--layer-name gcp-client-libs \
--description \"Google Cloud Python clients for Pub/Sub and Storage\" \
--zip-file fileb://gcp-layer.zip \
--compatible-runtimes python3.10
5. Add the layer to your Lambda function
aws lambda update-function-configuration \
--function-name your-lambda-name \
--layers arn:aws:lambda:REGION:ACCOUNT_ID:layer:gcp-client-libs:VERSION
Note: To avoid compatibility issues, create the Lambda layer on a machine or virtual environment that uses the same Python version as your Lambda runtime.
Below are sample Lambda functions that use the Google Cloud client libraries to access Cloud Storage objects and consume Pub/Sub messages via Workload Identity Federation.
Lambda Code for accessing GCS storage bucket objects
import json
from google.cloud import storage

def lambda_handler(event, context):
    try:
        # The storage.Client() will automatically pick up credentials
        # from the GOOGLE_APPLICATION_CREDENTIALS environment variable
        # and the project from GOOGLE_CLOUD_PROJECT.
        client = storage.Client()
        bucket_name = "test-gcs-bucket"  # Replace with your GCS bucket name

        # Example: List objects in the bucket
        blobs = client.list_blobs(bucket_name)
        object_names = [blob.name for blob in blobs]

        # Example: Download a specific object
        # blob_name = "newclientLibraryConfig-aws-provider.json"  # Replace with an actual object name
        # bucket = client.bucket(bucket_name)
        # blob = bucket.blob(blob_name)
        # content = blob.download_as_text()

        return {
            "statusCode": 200,
            "body": json.dumps({
                "message": f"Successfully listed objects in bucket {bucket_name}",
                "objects": object_names
                # "downloaded_content": content  # Uncomment if you want to download and return content
            })
        }
    except Exception as e:
        return {
            "statusCode": 500,
            "body": f"Error: {str(e)}"
        }
Lambda Code for consuming Pub/Sub Events:
import json
import os
from google.cloud import pubsub_v1

# No need to explicitly load credentials or set environment variables here.
# They are already set in the Lambda environment configuration.

def lambda_handler(event, context):
    subscriber = None
    try:
        # The Pub/Sub client will automatically pick up credentials
        # from GOOGLE_APPLICATION_CREDENTIALS and the project from
        # GOOGLE_CLOUD_PROJECT environment variables.
        subscriber = pubsub_v1.SubscriberClient()

        # Get project_id from the environment variable, as it is already set for GCS
        project_id = os.environ.get("GOOGLE_CLOUD_PROJECT")
        if not project_id:
            raise ValueError("GOOGLE_CLOUD_PROJECT environment variable not set.")

        subscription_id = "test-topic-sub"  # Your Pub/Sub subscription ID
        subscription_path = f"projects/{project_id}/subscriptions/{subscription_id}"

        # Pull messages from the subscription
        response = subscriber.pull(
            request={"subscription": subscription_path, "max_messages": 10},
            retry=None  # You might want to add a retry strategy for production
        )

        ack_ids = []
        messages_data = []  # To store decoded message data
        for received_message in response.received_messages:
            # Decode the message data
            message_data = received_message.message.data.decode('utf-8')
            messages_data.append(message_data)
            ack_ids.append(received_message.ack_id)
            print(f"Received message: {message_data}")
            # Optional: print attributes
            if received_message.message.attributes:
                print(f"Attributes: {received_message.message.attributes}")

        if ack_ids:
            # Acknowledge the messages to remove them from the subscription
            subscriber.acknowledge(
                request={"subscription": subscription_path, "ack_ids": ack_ids}
            )
            print(f"Acknowledged {len(ack_ids)} messages.")
        else:
            print("No messages pulled from subscription.")

        return {
            "statusCode": 200,
            "body": json.dumps({"processed_messages": messages_data})
        }
    except Exception as e:
        print(f"Error processing Pub/Sub messages: {e}")
        return {
            "statusCode": 500,
            "body": f"Error: {str(e)}"
        }
    finally:
        # Ensure the subscriber client is closed
        if subscriber is not None:
            subscriber.close()
Conclusion:
Workload Identity Federation offers a secure, keyless approach to accessing Google Cloud resources from external identity providers. By eliminating long-lived service account keys, it significantly reduces the attack surface and simplifies identity management across hybrid and multi-cloud environments. With the configuration demonstrated in this blog, organizations can seamlessly authenticate workloads while adhering to modern security best practices.
Summary:
In this guide, we configured Workload Identity Federation to allow secure, keyless access from AWS Lambda to GCP resources such as Pub/Sub and Cloud Storage buckets.
Next Steps:
I will provide terraform module code for setting up workload identity federation in my next blog.
Ref Link: https://cloud.google.com/iam/docs/workload-identity-federation-with-other-clouds
Feel free to reach out to me on LinkedIn.