GCP Cybersecurity Lab: Unmasking Malicious Activity with Cloud Logging & Monitoring

José Toledo

Disclaimers & Personal Context

  • My Views: This project and the views expressed in this blog post are my own and do not necessarily reflect the official stance or opinions of Google Cloud or any other entity.

  • Learning Journey: This lab is an opportunity for me to continue expanding my self-learning journey across various cloud providers. I want to recognize that Google Cloud Platform actually has phenomenal, expertly built courses that I certainly don't intend to replace. If you're looking for structured, official training, check out Cloud Skills Boost – it's a fantastic resource!

  • Lab Environment: This lab is for educational purposes only. All "malicious" activities are simulated using benign scripts and intentional misconfigurations within my dedicated lab project. No real malware is involved.

  • Cost & Cleanup: I'm starting this lab with a fresh GCP account, similar to what a new user might experience. At the time of this writing (mid-2025), new GCP sign-ups typically come with a generous $300 in free credits, which should be more than enough to complete this lab without incurring significant costs. I'll provide a comprehensive cleanup section at the very end of this guide to help you remove all created resources and avoid any unexpected billing.

  • Crucial Tip: Always perform cloud labs in a dedicated, isolated project to avoid impacting production environments or existing resources. Ask me how I know – I may or may not have broken things by testing in production before... and learned the hard way!

Introduction

In today's digital landscape, the cloud is where a vast amount of sensitive data and critical operations reside. As more organizations move to cloud platforms like Google Cloud Platform (GCP), the need for robust cybersecurity skills has never been greater. But how do I learn to detect suspicious activity when I don't have a real attack to analyze? That's exactly what this lab is for!

I'm here to explore logging and monitoring in GCP with a simple, hands-on lab. My goal is to simulate common security vulnerabilities and "malicious" activities within a controlled environment. Then, the real fun begins: I'll act as a cloud security detective, using GCP's powerful logging and monitoring tools to find the evidence, analyze what happened, and understand how to prevent it.

Be Prepared: This is a Comprehensive Lab! This guide covers a lot of ground and involves many steps. Depending on your experience and how many breaks you take, this lab could easily take 2-4 hours (or more) to complete from start to finish. Feel free to complete it in multiple sittings!

This lab is designed to be flexible: you can choose your preferred way to follow along:

  • Command Line Interface (CLI) Enthusiasts: Copy-paste the provided gcloud CLI commands directly into Cloud Shell or your local terminal. This is often faster and more repeatable.

  • Console Explorers: For many steps, I'll also provide instructions on how to achieve the same results by clicking your way through the intuitive Google Cloud Console. This is great for visual learners and understanding where things live.

    • Note for Console users: When following Console instructions, you won't be running the gcloud CLI commands. This means you'll need to manually retrieve details like internal VM IP addresses from the GCP Console UI when prompted (e.g., from the Compute Engine VM instances list).

I recommend using Google Cloud Shell for this lab. It comes with the gcloud CLI pre-installed and authenticated, saving you setup time. To access Cloud Shell, simply click the rectangle icon with >_ (typically located at the top-right of the GCP Console window).

Let's get started!

Phase 0: Prerequisites & Environment Setup

This initial phase ensures my GCP project is properly configured and ready to host our cybersecurity lab.

1. Create or Select My Dedicated GCP Project

  • Why a dedicated project? Isolation is key for security labs. A dedicated project makes it easy to track resources, manage permissions, and clean up completely afterward.

  • Option A: Create a New Project (Cloud Console - Recommended):

    1. Open the GCP Console.

    2. At the top of the page, click on the project selector dropdown.

    3. In the "Select a project" dialog, click NEW PROJECT or if you just set this account up you can use the default project.

    4. Enter a descriptive Project name (e.g., GCP Security Lab - My Project).

    5. Click CREATE.

    6. Once the project is created, ensure it's selected in the project selector dropdown.

  • Option B: Select an Existing Project (gcloud CLI):

    • If you already created the project via the console, you can select it using the gcloud CLI:

        # My project ID for this lab is polar-cyclist-466100-e3
        gcloud config set project polar-cyclist-466100-e3
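
    • Prefer to create the project from the CLI instead? Here's a minimal sketch (the project ID below is hypothetical; project IDs must be globally unique, and you may also need to link a billing account before enabling paid services):

        # Create a brand-new project entirely from the CLI (hypothetical ID)
        gcloud projects create my-security-lab-project --name="GCP Security Lab"

        # Optional: link a billing account (list your account IDs with 'gcloud billing accounts list')
        gcloud billing projects link my-security-lab-project \
            --billing-account=000000-000000-000000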
      

2. Set Project ID Environment Variable

  • Why an environment variable? Using an environment variable for my project ID makes gcloud commands cleaner, less prone to typos, and easily adaptable.

  • Important Security Note: While I'm showing my project ID here for demonstration purposes, in real-world scenarios, it's generally good practice to keep your project IDs private.

  • How to set the variable (Cloud Shell or local terminal):

      # Set my project ID for the lab
      export GCP_PROJECT_ID="polar-cyclist-466100-e3"
      echo "GCP_PROJECT_ID is set to: $GCP_PROJECT_ID"
      # Replace polar-cyclist-466100-e3 with your own project ID before running.
    

3. Enable Required GCP APIs

  • Why enable APIs? Many GCP services require their specific APIs to be explicitly enabled in your project before you can interact with them. Enabling them now prevents errors later on.

  • How to enable (gcloud CLI - Recommended):

      gcloud services enable \
          compute.googleapis.com \
          logging.googleapis.com \
          monitoring.googleapis.com \
          iam.googleapis.com \
          storage.googleapis.com \
          securitycenter.googleapis.com \
          --project=$GCP_PROJECT_ID
    

    (This command may take a minute or two to complete as services are activated.)
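
    Optional sanity check: you can list the enabled services to confirm everything activated (plain gcloud output piped through grep, nothing fancy):

      # All six lab APIs should appear in the filtered output
      gcloud services list --enabled --project=$GCP_PROJECT_ID | grep -E "compute|logging|monitoring|iam|storage|securitycenter"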

  • How to enable (Cloud Console - Alternative):

    1. In the GCP Console, navigate to APIs & Services > Enabled APIs & Services (use the navigation menu on the left).

    2. Click + ENABLE APIS AND SERVICES.

    3. Search for and enable the following APIs one by one by clicking on them and then clicking "ENABLE":

      • Compute Engine API

      • Cloud Logging API

      • Cloud Monitoring API

      • Identity and Access Management (IAM) API

      • Cloud Storage API

      • Security Command Center API

Important Note for Cloud Shell Users: Redeclaring Variables

If you're using Cloud Shell and decide to take a break, close your browser tab, or open a new Cloud Shell session, your shell's environment variables (like $GCP_PROJECT_ID, $REGION, $ZONE, etc.) will not persist automatically.

To avoid "command not found" or "Project ID must be specified" errors, it's a good practice to re-export these variables at the beginning of each phase when you return to the lab.

Here are the essential variables you'll use throughout the lab. Copy and paste this block if you ever restart your Cloud Shell:

# Essential Variables to redeclare if your Cloud Shell session restarts
export GCP_PROJECT_ID="polar-cyclist-466100-e3" # Your Project ID
export REGION="us-central1"
export ZONE="${REGION}-a"

# IPs and Names (will be updated after VMs are created/verified in Phase 2)
# If you restart your session AFTER Phase 2, you'll need to manually set these from your VM list
export VM_ATTACKER_INTERNAL_IP="10.128.0.2" # Get this from 'gcloud compute instances list'
export VM_VICTIM_INTERNAL_IP="10.128.0.3"  # Get this from 'gcloud compute instances list'
export SENSITIVE_BUCKET_NAME="${GCP_PROJECT_ID}-sensitive-data"

# Networking Resources (will be updated after they are created in Phase 1)
export ROUTER_NAME="nat-router-${REGION}"
export NAT_NAME="nat-gateway-${REGION}"
export NETWORK_NAME="default"

(When you see variable declarations like this at the start of a new phase, remember to run them if your session is fresh.)
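
If you'd rather not re-paste the block after every restart, one optional convenience (a sketch, assuming you're comfortable persisting lab values in your shell profile) is to append the exports to ~/.bashrc, which Cloud Shell preserves across sessions:

# Persist the lab variables across Cloud Shell restarts (optional)
echo 'export GCP_PROJECT_ID="polar-cyclist-466100-e3"' >> ~/.bashrc
echo 'export REGION="us-central1"' >> ~/.bashrc
echo 'export ZONE="${REGION}-a"' >> ~/.bashrc
source ~/.bashrc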

Phase 1: Secure Infrastructure Build

In this crucial phase, I'll lay down the foundation of my lab environment. This involves setting up my "attacker" and "victim" virtual machines (VMs), establishing initial, secure network rules, and configuring the necessary outbound internet access. The goal here is to establish a clear, secure baseline before I introduce any "malicious" activities later on.

1. Create My Custom Service Accounts

  • Why custom service accounts? In GCP, VMs operate with an associated Service Account, which acts as their identity. This service account dictates what permissions the VM has to interact with other GCP services (like Cloud Storage or other Compute Engine resources). By creating dedicated, minimal service accounts now, I can later demonstrate a common security mistake: intentionally over-permissioning one of them to simulate a privilege escalation attack.

  • How to create (gcloud CLI - Recommended): I'll create sa-attacker-vm for my attacker VM and sa-victim-vm for my victim VM. Initially, I'll grant them only the very basic roles/compute.viewer permission.

      # For my vm-attacker
      gcloud iam service-accounts create sa-attacker-vm \
          --display-name="Service Account for Attacker VM Lab" \
          --project=$GCP_PROJECT_ID
    
      # Grant initial, minimal permissions (Compute Viewer)
      gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
          --member="serviceAccount:sa-attacker-vm@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
          --role="roles/compute.viewer"
    
      # For my vm-victim
      gcloud iam service-accounts create sa-victim-vm \
          --display-name="Service Account for Victim VM Lab" \
          --project=$GCP_PROJECT_ID
    
      # Grant initial, minimal permissions (Compute Viewer)
      gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
          --member="serviceAccount:sa-victim-vm@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
          --role="roles/compute.viewer"
    

    Tip: After creating service accounts, it's always a good idea to wait a minute or two (maybe ten in my case…) for them to fully propagate across GCP before trying to use them in subsequent steps. This helps avoid "permission denied" or "resource not found" errors during initial setup.
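
    Optional check: a quick listing confirms both service accounts exist before you reference them during VM creation:

      # Both sa-attacker-vm and sa-victim-vm should appear in this list
      gcloud iam service-accounts list --project=$GCP_PROJECT_ID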

  • How to create (Cloud Console - Alternative):

    1. Navigate to IAM & Admin > Service Accounts in the GCP Console.

    2. Click + CREATE SERVICE ACCOUNT.

    3. For sa-attacker-vm:

      • Service account name: sa-attacker-vm

      • Description: Service account for Attacker VM Lab

      • Click CREATE AND CONTINUE.

      • For Grant this service account access to project, select Compute Viewer (role ID roles/compute.viewer).

      • Click CONTINUE, then DONE.

    4. Repeat steps 2-3 for sa-victim-vm:

      • Service account name: sa-victim-vm

      • Description: Service account for Victim VM Lab

      • Grant it the Compute Viewer role as well.

2. Deploy My Virtual Machines (VMs)

  • Why deploy VMs? I need two isolated Compute Engine instances to simulate my attack scenario: one that initiates the "malicious" actions (vm-attacker) and one that serves as the target (vm-victim). I'll configure them with only internal IP addresses for enhanced security. This also forces me to implement Cloud NAT later, demonstrating a secure outbound connectivity pattern.

  • How to deploy (gcloud CLI - Recommended): I'll ensure both VMs are in the same region and zone for easy internal communication. My chosen zone is us-central1-a.

      # Define region and zone variables for consistency
      export REGION="us-central1"
      export ZONE="${REGION}-a"
      echo "Deploying VMs in zone: $ZONE"
    
      # Create vm-attacker
      gcloud compute instances create vm-attacker \
          --project=$GCP_PROJECT_ID \
          --zone=$ZONE \
          --machine-type=e2-micro \
          --network-interface=network=default,no-address \
          --maintenance-policy=MIGRATE \
          --provisioning-model=STANDARD \
          --service-account=sa-attacker-vm@${GCP_PROJECT_ID}.iam.gserviceaccount.com \
          --scopes=https://www.googleapis.com/auth/cloud-platform \
          --tags=attacker-vm,ssh \
          --create-disk=auto-delete=yes,boot=yes,device-name=vm-attacker,image=projects/debian-cloud/global/images/family/debian-12,mode=rw,size=10,type=pd-standard \
          --no-shielded-secure-boot \
          --no-shielded-vtpm \
          --no-shielded-integrity-monitoring \
          --labels=vm-type=attacker,lab=security
    
      # Create vm-victim
      gcloud compute instances create vm-victim \
          --project=$GCP_PROJECT_ID \
          --zone=$ZONE \
          --machine-type=e2-micro \
          --network-interface=network=default,no-address \
          --maintenance-policy=MIGRATE \
          --provisioning-model=STANDARD \
          --service-account=sa-victim-vm@${GCP_PROJECT_ID}.iam.gserviceaccount.com \
          --scopes=https://www.googleapis.com/auth/cloud-platform \
          --tags=victim-vm,ssh \
          --create-disk=auto-delete=yes,boot=yes,device-name=vm-victim,image=projects/debian-cloud/global/images/family/debian-12,mode=rw,size=10,type=pd-standard \
          --no-shielded-secure-boot \
          --no-shielded-vtpm \
          --no-shielded-integrity-monitoring \
          --labels=vm-type=victim,lab=security
    

    These commands will typically take a few minutes to complete.

  • How to deploy (Cloud Console - Alternative):

    1. Navigate to Compute Engine > VM instances in the GCP Console.

    2. Click + CREATE INSTANCE.

    3. For vm-attacker:

      • Name: vm-attacker

      • Region: us-central1

      • Zone: us-central1-a

      • Machine configuration: Series E2, Type e2-micro.

      • Boot disk: Click CHANGE. Select Debian GNU/Linux, Debian 12 (bookworm) (or latest stable Debian). Size 10 GB, Standard persistent disk. Click SELECT.

      • Identity and API access:

        • Service account: Select sa-attacker-vm@YOUR_PROJECT_ID.iam.gserviceaccount.com.

        • Access scopes: Keep Allow default access.

      • Firewall: Ensure Allow HTTP traffic and Allow HTTPS traffic are UNCHECKED.

      • Advanced options > Networking, Disks, Security, Management...

        • Go to the Networking tab.

        • Under Network interfaces, click the pencil icon next to default (or your VPC network name).

          • External IP: Select None.

          • Network tags: Type attacker-vm and press Enter. Then type ssh and press Enter.

          • Click Done.

      • Click CREATE.

    4. Repeat steps 2-3 for vm-victim:

      • Name: vm-victim

      • Use sa-victim-vm as the Service account.

      • Add network tags victim-vm and ssh.

      • Ensure no external IP.

  • Verify VMs are Deployed and Running:

    • Why: It's good practice to immediately confirm that your resources have been created as expected before moving on. This step will also provide you with the internal IP addresses of your VMs, which you'll need shortly.

    • How to verify (gcloud CLI):

        gcloud compute instances list --project=$GCP_PROJECT_ID
      

      Look for output similar to this:

        NAME: vm-attacker
        ZONE: us-central1-a
        MACHINE_TYPE: e2-micro
        PREEMPTIBLE:
        INTERNAL_IP: 10.128.0.2  <-- Note this IP for vm-attacker
        EXTERNAL_IP:
        STATUS: RUNNING
      
        NAME: vm-victim
        ZONE: us-central1-a
        MACHINE_TYPE: e2-micro
        PREEMPTIBLE:
        INTERNAL_IP: 10.128.0.3  <-- Note this IP for vm-victim
        EXTERNAL_IP:
        STATUS: RUNNING
      

      For my lab, vm-attacker's internal IP is 10.128.0.2 and vm-victim's is 10.128.0.3. Make a note of your specific IPs, as they might differ slightly.

3. Configure Initial Network Security (Firewall Rules)

  • Why configure firewall rules? Firewall rules control network traffic to and from my VMs. I'll start by ensuring I can SSH into my VMs for management, and then I'll create a rule that explicitly denies the "malicious" communication. This establishes a known, secure network baseline.

  • How to configure (gcloud CLI - Recommended):

    • Allow SSH for Management (via IAP - Identity-Aware Proxy):

        gcloud compute firewall-rules create allow-ssh-from-iap \
            --project=$GCP_PROJECT_ID \
            --network=default \
            --action=ALLOW \
            --direction=INGRESS \
            --rules=tcp:22 \
            --source-ranges=35.235.240.0/20 \
            --target-tags=ssh \
            --description="Allow SSH from IAP for VM management"
      
      • This rule allows me to SSH into my VMs using Google's secure Identity-Aware Proxy. Identity-Aware Proxy (IAP) lets users connect to VM instances over HTTPS without exposing them to the public internet directly. It's a great security practice as it centrally manages access to your VMs based on IAM roles, rather than relying solely on firewall rules for external access. Learn more about IAP.
    • Block Malicious Communication (Initial DENY): Now, create the firewall rule that denies traffic on my "malicious" port (8080) from vm-attacker's internal IP to instances tagged victim-vm.

        # IMPORTANT: Replace '10.128.0.2' with the actual INTERNAL_IP of your vm-attacker that you noted down!
        export VM_ATTACKER_INTERNAL_IP="10.128.0.2"
      
        gcloud compute firewall-rules create block-malicious-traffic-initial \
            --project=$GCP_PROJECT_ID \
            --network=default \
            --action=DENY \
            --direction=INGRESS \
            --rules=tcp:8080 \
            --source-ranges="$VM_ATTACKER_INTERNAL_IP/32" \
            --target-tags=victim-vm \
            --priority=1000 \
            --description="Initial rule to block malicious traffic from attacker IP to victim VMs."
      
  • How to configure (Cloud Console - Alternative):

    1. Navigate to VPC Network > Firewall rules in the GCP Console.

    2. Click + CREATE FIREWALL RULE.

    3. For allow-ssh-from-iap:

      • Name: allow-ssh-from-iap

      • Direction: Ingress

      • Action: Allow

      • Targets: Specified target tags, then enter ssh

      • Source filter: IPv4 ranges, enter 35.235.240.0/20

      • Protocols and ports: Specified protocols and ports, select tcp and enter 22.

      • Click CREATE.

    4. For block-malicious-traffic-initial:

      • Name: block-malicious-traffic-initial

      • Direction: Ingress

      • Action: Deny

      • Targets: Specified target tags, then enter victim-vm

      • Source filter: IPv4 ranges, then enter VM_ATTACKER_INTERNAL_IP/32 (you'll need to manually use the IP you noted from gcloud compute instances list).

      • Protocols and ports: Specified protocols and ports, select tcp and enter 8080.

      • Click CREATE.

4. Enable Outbound Internet Access for VMs (Cloud NAT)

  • Why enable Cloud NAT? My VMs do not have external IP addresses for security reasons. However, to install software (like Apache2 via apt update/install), they need a way to make outbound connections to the internet. Cloud NAT provides this securely by allowing VMs with internal IPs to initiate outbound connections without exposing them to inbound internet traffic.

  • How to enable (gcloud CLI - Recommended):

    • Create a Cloud Router: This is a prerequisite for a NAT gateway.

        export ROUTER_NAME="nat-router-${REGION}"
        export NAT_NAME="nat-gateway-${REGION}"
        export NETWORK_NAME="default"
      
        gcloud compute routers create ${ROUTER_NAME} \
            --project=$GCP_PROJECT_ID \
            --region=${REGION} \
            --network=${NETWORK_NAME} \
            --description="Cloud Router for NAT in ${REGION}"
      
    • Create the NAT Gateway: This connects to the router and provides the NAT functionality for the subnet where my VMs live.

        gcloud compute routers nats create ${NAT_NAME} \
            --project=$GCP_PROJECT_ID \
            --router=${ROUTER_NAME} \
            --region=${REGION} \
            --nat-all-subnet-ip-ranges \
            --auto-allocate-nat-external-ips \
            --enable-dynamic-port-allocation \
            --enable-logging \
            --log-filter=ERRORS_ONLY
      

      This step may take a few minutes to complete as the NAT gateway provisions.
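
      To confirm the gateway exists (optional), you can describe it; the real end-to-end test comes in Phase 2, when apt update succeeds from a VM with no external IP:

        gcloud compute routers nats describe ${NAT_NAME} \
            --router=${ROUTER_NAME} \
            --region=${REGION} \
            --project=$GCP_PROJECT_ID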

  • How to enable (Cloud Console - Alternative):

    1. Navigate to Network Services > Cloud NAT in the GCP Console.

    2. Click CREATE NAT GATEWAY.

    3. Gateway name: nat-gateway-us-central1

    4. VPC network: default

    5. Region: us-central1

    6. Cloud Router: Select Create new router.

      • Name: nat-router-us-central1

      • Click CREATE.

    7. NAT mapping: Select Automatic (recommended).

    8. Region subnets: Ensure your us-central1 subnet is selected.

    9. NAT IP addresses: Select Automatic IP address allocation.

    10. Click CREATE.

Phase 2: Initial Lab Verification & Victim Preparation

Now that my core infrastructure is in place from Phase 1, it’s time to verify everything is working as expected and prepare my "victim" VM for the upcoming simulated attacks. This ensures that when I introduce "malicious" activity, I have a clear baseline of what a "good" and "secure" state looks like.

1. Verify VM Status and Internal IPs

  • Why verify? Before I proceed, I need to confirm that both my vm-attacker and vm-victim are running correctly and to get their internal IP addresses. These internal IPs are crucial for my firewall rules and for direct VM-to-VM communication later in the lab.

  • How to verify (gcloud CLI - Recommended):

      gcloud compute instances list --project=$GCP_PROJECT_ID
    

    Look for the output similar to what you saw earlier:

      NAME: vm-attacker
      ZONE: us-central1-a
      MACHINE_TYPE: e2-micro
      PREEMPTIBLE:
      INTERNAL_IP: 10.128.0.2  <-- IMPORTANT: Note this IP for vm-attacker!
      EXTERNAL_IP:
      STATUS: RUNNING
    
      NAME: vm-victim
      ZONE: us-central1-a
      MACHINE_TYPE: e2-micro
      PREEMPTIBLE:
      INTERNAL_IP: 10.128.0.3  <-- IMPORTANT: Note this IP for vm-victim!
      EXTERNAL_IP:
      STATUS: RUNNING
    

    For my lab, I'll be using 10.128.0.2 as vm-attacker's internal IP and 10.128.0.3 as vm-victim's internal IP. Make sure you use your specific IPs if they are different, as they are unique to your project's VPC network.

2. Test SSH Connectivity to Both VMs

  • Why test SSH? I need to confirm that I can successfully connect to my VMs. This is how I'll perform configurations and execute commands directly on the instances.

  • How to test (gcloud CLI - Recommended):

      echo "Attempting to SSH into vm-attacker..."
      gcloud compute ssh vm-attacker --zone=$ZONE --project=$GCP_PROJECT_ID
      # Once successfully connected and you see the prompt (e.g., 'user@vm-attacker:~$'), type 'exit' to return to Cloud Shell.
      exit
    
      echo "Attempting to SSH into vm-victim..."
      gcloud compute ssh vm-victim --zone=$ZONE --project=$GCP_PROJECT_ID
      # Once connected, type 'exit' to return to Cloud Shell.
      exit
    
  • How to test (Cloud Console - Alternative):

    1. Navigate to Compute Engine > VM instances in the GCP Console.

    2. Locate vm-attacker (and then vm-victim).

    3. In the "Connect" column, click the SSH button. A new browser window or tab will open with an SSH session to your VM.

    4. Verify you see the VM's command prompt. Close the SSH window/tab when done.

3. On vm-victim: Install Apache & Prepare Listener

  • Why prepare vm-victim? For my first simulated attack, vm-victim needs to be listening on a specific "malicious" port so that vm-attacker has a target to connect to. I'll install a lightweight web server (Apache2) and configure it, and then place a dummy "sensitive" file that the attacker will attempt to "exfiltrate."

  • How to prepare (Inside vm-victim SSH session - Recommended):

    • First, SSH into vm-victim from your Cloud Shell:

        gcloud compute ssh vm-victim --zone=$ZONE --project=$GCP_PROJECT_ID
      
    • Once inside the vm-victim SSH session, run the following commands one by one:

        # Update package lists (this should now work due to Cloud NAT!)
        sudo apt update -y
      
        # Install Apache2
        sudo apt install apache2 -y
      
        # Configure Apache to listen on port 8080
        # I'll back up the original config first, good practice!
        sudo cp /etc/apache2/ports.conf /etc/apache2/ports.conf.bak
        echo "Listen 8080" | sudo tee -a /etc/apache2/ports.conf
        # Modify the default virtual host to serve on 8080
        sudo cp /etc/apache2/sites-available/000-default.conf /etc/apache2/sites-available/000-default.conf.bak
        sudo sed -i 's/<VirtualHost \*:80>/<VirtualHost \*:8080>/g' /etc/apache2/sites-available/000-default.conf
        sudo systemctl restart apache2
      
        # Verify Apache is listening on 8080
        # You should see output indicating port 8080 is in a 'LISTEN' state.
        sudo ss -tuln | grep 8080
      
        # Create a simple "sensitive" file for exfiltration
        echo "This is sensitive data from the victim VM!" | sudo tee /var/www/html/sensitive_data.txt
      
    • After running all commands inside vm-victim, type exit to close the SSH session and return to Cloud Shell:

        exit
      

4. On vm-attacker: Test Blocked Communication (Expected to Fail)

  • Why test for failure? This is my critical baseline verification. I want to explicitly prove that my initial DENY firewall rule is correctly enforcing security before I intentionally break it. If this connection succeeds, something is wrong with my firewall rule.

  • How to test (Inside vm-attacker SSH session - Recommended):

    • First, SSH into vm-attacker from your Cloud Shell:

        gcloud compute ssh vm-attacker --zone=$ZONE --project=$GCP_PROJECT_ID
      
    • Once inside the vm-attacker SSH session, run the following command (remembering to use vm-victim's actual internal IP):

        # IMPORTANT: Replace '10.128.0.3' with the actual INTERNAL_IP of your vm-victim!
        VM_VICTIM_INTERNAL_IP="10.128.0.3" 
      
        echo "Attempting to connect to vm-victim at: $VM_VICTIM_INTERNAL_IP:8080 (EXPECTED TO FAIL)"
        curl -v --connect-timeout 5 "$VM_VICTIM_INTERNAL_IP:8080/sensitive_data.txt"
      
    • Expected Result: The curl command should fail with a timeout or connection refused error. It might hang for a few seconds before failing. This is exactly what I want to see!

    • After running the command inside the VM, type exit to close the SSH session and return to Cloud Shell:

        exit
      

Now, let's move into a critically important phase: Phase 3: Enable Comprehensive Logging & Monitoring. This is where I'll set up the "eyes and ears" of my security operations, ensuring that all the "malicious" activities I'm about to simulate are thoroughly recorded. This proactive approach is essential for effective detection and analysis.

Phase 3: Enable Comprehensive Logging & Monitoring

Goal: To effectively detect malicious activities, I need to ensure the right logs are being collected before any incidents occur. This phase sets up the core observability tools that will allow me to be a true security detective later.

1. Enable VPC Flow Logs for My Subnet

  • Why enable Flow Logs? Network traffic is a goldmine for security insights. VPC Flow Logs record IP traffic flow (including source/destination IPs, ports, protocols, and whether traffic was allowed or denied) to and from network interfaces in my Virtual Private Cloud (VPC). This will be absolutely crucial for detecting and understanding the network connections in Scenario 1 (Unauthorized Port Access).

  • How to enable (gcloud CLI - Recommended):

      gcloud compute networks subnets update default \
          --region=$REGION \
          --enable-flow-logs \
          --logging-metadata=include-all \
          --logging-flow-sampling=1.0 \
          --logging-aggregation-interval=INTERVAL_5_SEC \
          --project=$GCP_PROJECT_ID
    

    Here, --logging-flow-sampling=1.0 means I'm collecting 100% of the traffic samples (for maximum detail in this lab), and --logging-aggregation-interval=INTERVAL_5_SEC means logs are aggregated every 5 seconds (for higher granularity).
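
    Optional check: describing the subnet should now show flow logs enabled (I'm assuming the standard enableFlowLogs/logConfig output fields here):

      gcloud compute networks subnets describe default \
          --region=$REGION \
          --project=$GCP_PROJECT_ID \
          --format="yaml(enableFlowLogs,logConfig)"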

  • How to enable (Cloud Console - Alternative):

    1. Navigate to VPC Network > VPC networks in the GCP Console.

    2. Click on the default network.

    3. Go to the Subnets tab.

    4. Find the us-central1 subnet and click its name.

    5. Click EDIT.

    6. Scroll down to Flow logs and select On.

    7. For finer detail (recommended for this lab), set:

      • Aggregation interval: 5 seconds

      • Sample rate: 1 (100%)

      • Include metadata: Include all metadata

    8. Click SAVE.

2. Install Google Cloud Ops Agent on Both VMs

  • Why install Ops Agent? While GCP collects some basic VM metrics and logs, the Ops Agent provides much deeper visibility inside the VM's operating system. It collects comprehensive system metrics (like CPU, memory, disk I/O) which go to Cloud Monitoring, and detailed OS logs (syslog, auth.log, and importantly, application logs like Apache's access/error logs) which go to Cloud Logging. This will be vital for debugging, performance monitoring, and detecting unusual activity (like my CPU-intensive script in Scenario 3 or Apache access logs in Scenario 1).

  • How to install (Inside VM SSH session - Recommended):

    • First, SSH into vm-attacker from your Cloud Shell:

        gcloud compute ssh vm-attacker --zone=$ZONE --project=$GCP_PROJECT_ID
      
    • Once inside the vm-attacker SSH session, run these commands:

        # Download the Ops Agent installation script
        curl -sSO https://dl.google.com/cloudagents/add-google-cloud-ops-agent-repo.sh
      
        # Run the script to install the agent and set up the repository
        sudo bash add-google-cloud-ops-agent-repo.sh --also-install
      

      This script will download and install the Ops Agent. It might take a couple of minutes to complete.

    • After running the commands inside the VM, type exit to close the SSH session and return to Cloud Shell:

        exit
      
    • Now, repeat the exact same steps for vm-victim:

      • SSH into vm-victim from your Cloud Shell:

          gcloud compute ssh vm-victim --zone=$ZONE --project=$GCP_PROJECT_ID
        
      • Once inside the vm-victim SSH session, run the Ops Agent installation commands again:

          # Download the Ops Agent installation script
          curl -sSO https://dl.google.com/cloudagents/add-google-cloud-ops-agent-repo.sh
        
          # Run the script to install the agent and set up the repository
          sudo bash add-google-cloud-ops-agent-repo.sh --also-install
        
      • Type exit to close the SSH session.

          exit
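
    • Optional check (on either VM): the agent's service status should report it as active once installed (the same status command the lab uses again in section 2.2):

          sudo systemctl status google-cloud-ops-agent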
        

2.1. Grant Ops Agent Permissions to VM Service Accounts (CRITICAL STEP!)

  • Why: The Ops Agent collects logs and metrics and sends them to Cloud Logging and Cloud Monitoring. The service accounts associated with your VMs (sa-attacker-vm and sa-victim-vm) need explicit IAM permissions to write to these services. Without these roles, the Ops Agent will fail its API checks and won't send any data, regardless of its configuration.

  • How to grant (gcloud CLI - Recommended):

      echo "Granting roles/logging.logWriter and roles/monitoring.metricWriter to sa-attacker-vm..."
      gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
          --member="serviceAccount:sa-attacker-vm@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
          --role="roles/logging.logWriter" \
          --condition=None
      gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
          --member="serviceAccount:sa-attacker-vm@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
          --role="roles/monitoring.metricWriter" \
          --condition=None
    
      echo "Granting roles/logging.logWriter and roles/monitoring.metricWriter to sa-victim-vm..."
      gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
          --member="serviceAccount:sa-victim-vm@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
          --role="roles/logging.logWriter" \
          --condition=None
      gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
          --member="serviceAccount:sa-victim-vm@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
          --role="roles/monitoring.metricWriter" \
          --condition=None
    

    These IAM changes can take 1-2 minutes to fully propagate. It's a good idea to wait a moment before proceeding.
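
    Optional check: you can confirm the bindings landed by filtering the project's IAM policy for one of the service accounts (a standard gcloud output-shaping pattern):

      # Should list roles/compute.viewer, roles/logging.logWriter, and roles/monitoring.metricWriter
      gcloud projects get-iam-policy $GCP_PROJECT_ID \
          --flatten="bindings[].members" \
          --filter="bindings.members:sa-attacker-vm@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
          --format="table(bindings.role)"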

2.2. Configure Ops Agent for Apache Logs on vm-victim (Critical Step! If missed, you won’t get logs)

  • Why: Even after installing the Ops Agent, it doesn't automatically collect all specific application logs (like Apache's) without explicit configuration. This step tells the agent exactly where to find Apache logs and how to send them to Cloud Logging.

  • How to configure (Inside vm-victim SSH session - Recommended and proven to work):

    • First, SSH into vm-victim from your Cloud Shell:

        gcloud compute ssh vm-victim --zone=$ZONE --project=$GCP_PROJECT_ID
      
    • Once inside vm-victim, copy and paste this entire block of commands. This script, sourced from GCP's own documentation, will correctly configure the Ops Agent and restart it.

        # Configures Ops Agent to collect telemetry from the app. You must restart the agent for the configuration to take effect.
      
        set -e
      
        # Check if the file exists
        if [ ! -f /etc/google-cloud-ops-agent/config.yaml ]; then
          # Create the file if it doesn't exist.
          sudo mkdir -p /etc/google-cloud-ops-agent
          sudo touch /etc/google-cloud-ops-agent/config.yaml
        fi
      
        # Create a back up of the existing file so existing configurations are not lost.
        sudo cp /etc/google-cloud-ops-agent/config.yaml /etc/google-cloud-ops-agent/config.yaml.bak
      
        # Configure the Ops Agent.
        sudo tee /etc/google-cloud-ops-agent/config.yaml > /dev/null << EOF
        metrics:
          receivers:
            apache:
              type: apache
          service:
            pipelines:
              apache:
                receivers:
                  - apache
        logging:
          receivers:
            apache_access:
              type: apache_access
            apache_error:
              type: apache_error
          service:
            pipelines:
              apache:
                receivers:
                  - apache_access
                  - apache_error
        EOF
      
        # Restart the Ops Agent to apply the new configuration
        sudo systemctl restart google-cloud-ops-agent
        echo "Ops Agent restarted after configuration."
        sudo systemctl status google-cloud-ops-agent # Verify status
      
  • After running the commands inside the VM, type exit to close the SSH session and return to Cloud Shell:

      exit
    

3. Enable Cloud Audit Logs (Data Access) for Cloud Storage

  • Why enable Data Access logs? Cloud Audit Logs (cloudaudit.googleapis.com/activity) are enabled by default and track administrative actions (e.g., who created a VM, who changed an IAM policy). However, by default, they don't log actual data read or write operations for services like Cloud Storage. To detect someone accessing sensitive files in my buckets (like you’ll see in Scenario 2), I need to explicitly enable these "Data Access" logs.

  • How to enable (gcloud CLI with yq - Recommended for accuracy):

    • Note: If yq is not installed in your Cloud Shell, you'll need to install it first. I found that it wasn't pre-installed in my Cloud Shell. Here's how:

        # Verify if yq is installed (should return a path if installed)
        which yq
        # If no output, install yq:
        YQ_VERSION="v4.42.1" # Check https://github.com/mikefarah/yq/releases/latest for the latest version
        wget https://github.com/mikefarah/yq/releases/download/${YQ_VERSION}/yq_linux_amd64 -O yq
        chmod +x yq
        sudo mv yq /usr/local/bin/
      
    • Now, use yq to modify your project's IAM policy to enable these audit logs.

        # 1. Fetch the current IAM policy and save it to a temporary file
        gcloud projects get-iam-policy $GCP_PROJECT_ID --format=yaml > /tmp/policy.yaml
      
        # 2. Add the audit config for Cloud Storage Data Access logs using yq
        yq -i '
        .auditConfigs += [
          {"service": "storage.googleapis.com", "auditLogConfigs": [{"logType": "DATA_READ"}, {"logType": "DATA_WRITE"}]}
        ]
        ' /tmp/policy.yaml
      
        # 3. Apply the modified IAM policy
        gcloud projects set-iam-policy $GCP_PROJECT_ID /tmp/policy.yaml
      

      You may be prompted to confirm changes to the IAM policy; type y or A if so.
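
      To verify the change, fetch just the audit configuration back out of the policy (optional check):

        # Should show storage.googleapis.com with DATA_READ and DATA_WRITE log types
        gcloud projects get-iam-policy $GCP_PROJECT_ID --format="yaml(auditConfigs)"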

  • How to enable (Cloud Console - Alternative):

    1. Navigate to IAM & Admin > Audit Logs in the GCP Console.

    2. In the "Data Access audit logs configuration" table, find Google Cloud Storage.

    3. Click the checkbox next to Google Cloud Storage.

    4. In the info panel that appears on the right, under "Log Types", select all three checkboxes:

      • Admin Read (usually enabled by default)

      • Data Read

      • Data Write

    5. Click SAVE.

Let's dive into Phase 4: Executing Malicious Scenarios. This is the core "attack" part of the lab, where I'll intentionally introduce vulnerabilities and perform simulated attacks.

Phase 4: Executing Malicious Scenarios

Goal: In this phase, I will systematically introduce vulnerabilities into my environment and then execute simulated attacks. The primary purpose of these "attacks" is to generate specific, detectable security events that I can later find and analyze using my logging and monitoring setup. Remember, this is all within your controlled lab environment!

Scenario 1: Unauthorized Port Access (Firewall Misconfiguration)

This scenario simulates a common security vulnerability where a network port is accidentally (or maliciously) opened, allowing unauthorized access to a service that should be private.

  1. The "Bad Permission": Modify Firewall Rule (Change from DENY to ALLOW)

    • Why: I previously set up a DENY firewall rule to block traffic on port 8080 from vm-attacker to vm-victim. To simulate a misconfiguration, I now need to change this rule to ALLOW. Since gcloud doesn't let me directly update a firewall rule's action, I'll delete the old DENY rule and re-create it with an ALLOW action.

    • How to modify (gcloud CLI - Recommended):

        echo "Deleting the DENY firewall rule..."
        gcloud compute firewall-rules delete block-malicious-traffic-initial --project=$GCP_PROJECT_ID --quiet
      
        echo "Creating a new firewall rule with ALLOW action for the malicious port..."
        # IMPORTANT: Replace '10.128.0.2' with the actual INTERNAL_IP of your vm-attacker!
        export VM_ATTACKER_INTERNAL_IP="10.128.0.2" # Using my vm-attacker IP for this example
      
        gcloud compute firewall-rules create block-malicious-traffic-initial \
            --project=$GCP_PROJECT_ID \
            --network=default \
            --action=ALLOW \
            --direction=INGRESS \
            --rules=tcp:8080 \
            --source-ranges="$VM_ATTACKER_INTERNAL_IP/32" \
            --target-tags=victim-vm \
            --priority=1000 \
            --description="Malicious (misconfigured) rule: Allows traffic from attacker IP to victim VMs."
      

      (You should see output confirming the deletion and creation of the rule.)

    • How to modify (Cloud Console - Alternative):

      1. Navigate to VPC Network > Firewall rules in the GCP Console.

      2. Find the rule named block-malicious-traffic-initial.

      3. Select the checkbox next to its name and click the DELETE button at the top. Confirm the deletion.

      4. Click + CREATE FIREWALL RULE.

      5. Name: block-malicious-traffic-initial (use the exact same name)

      6. Description: Malicious (misconfigured) rule: Allows traffic from attacker IP to victim VMs.

      7. Direction of traffic: Ingress

      8. Action on match: Allow (This is the crucial change!)

      9. Targets: Specified target tags, then enter victim-vm

      10. Source filter: IPv4 ranges, then enter VM_ATTACKER_INTERNAL_IP/32 (use the actual IP you noted for vm-attacker, e.g., 10.128.0.2/32).

      11. Protocols and ports: Specified protocols and ports, select tcp and enter 8080.

      12. Ensure the rule is enabled.

      13. Click CREATE.

  2. On vm-attacker: Successful Connection and Data Exfiltration

    • Why: With the firewall now "misconfigured" (allowing traffic), vm-attacker can successfully connect to vm-victim and access the sensitive data. This is the simulated network attack and data theft.

    • How to execute (gcloud CLI with --command - Recommended):

        # IMPORTANT: Replace '10.128.0.3' with the actual INTERNAL_IP of your vm-victim!
        export VM_VICTIM_INTERNAL_IP="10.128.0.3" # Using my vm-victim IP for this example
      
        echo "Attempting to connect to vm-victim at: $VM_VICTIM_INTERNAL_IP:8080 (EXPECTED TO SUCCEED)"
        gcloud compute ssh vm-attacker --zone=$ZONE --project=$GCP_PROJECT_ID --command="
        # The VM_VICTIM_INTERNAL_IP is passed directly into the command string here
        curl -s $VM_VICTIM_INTERNAL_IP:8080/sensitive_data.txt
      
        curl -s $VM_VICTIM_INTERNAL_IP:8080/sensitive_data.txt > exfiltrated_sensitive_data.txt
      
        echo \"Verifying content of exfiltrated_sensitive_data.txt:\"
        cat exfiltrated_sensitive_data.txt
        "
      
    • Expected Result: You should see the content "This is sensitive data from the victim VM!" printed directly in your terminal, and the exfiltrated_sensitive_data.txt file (on vm-attacker) will contain that text. This signifies a successful unauthorized access.

Scenario 2: Service Account Privilege Escalation (Cloud Storage Data Exfiltration)

This scenario simulates an attacker leveraging overly permissive IAM roles on a service account to gain unauthorized access to sensitive data stored in Cloud Storage.

  1. Create a Sensitive Cloud Storage Bucket:

    • Why: This bucket will hold my "sensitive" data that the attacker will try to steal. It needs to be in my project.

    • How to create (gcloud CLI - Recommended):

        export SENSITIVE_BUCKET_NAME="${GCP_PROJECT_ID}-sensitive-data" # This uses your project ID to ensure uniqueness
        echo "Creating sensitive Cloud Storage bucket: gs://${SENSITIVE_BUCKET_NAME}..."
      
        gcloud storage buckets create gs://${SENSITIVE_BUCKET_NAME} \
            --project=$GCP_PROJECT_ID \
            --location=$REGION \
            --uniform-bucket-level-access
      

      Bucket names must be globally unique. Using your project ID in the name helps ensure this.

    • How to create (Cloud Console - Alternative):

      1. Navigate to Cloud Storage > Buckets in the GCP Console.

      2. Click + CREATE BUCKET.

      3. Name: Enter your-project-id-sensitive-data (e.g., polar-cyclist-466100-e3-sensitive-data).

      4. Choose where to store your data: Select Region and then us-central1.

      5. Choose a default storage class: Standard.

      6. Choose how to control access to objects: Uniform.

      7. Click CREATE.

  2. Upload "Sensitive" Files to the Bucket:

    • Why: I need some dummy "sensitive" data in the bucket for the attacker to attempt to exfiltrate.

    • How to upload (gcloud CLI - Recommended):

        echo "Creating dummy sensitive file locally in Cloud Shell..."
        echo -e "Admin_Password=VerySecret123\nDB_User=dbadmin\nDB_Pass=SuperSecureDB!" > secret_passwords.txt
      
        echo "Uploading sensitive file to the bucket..."
        gcloud storage cp secret_passwords.txt gs://${SENSITIVE_BUCKET_NAME}/secret_passwords.txt
      
    • How to upload (Cloud Console - Alternative):

      1. Navigate to Cloud Storage > Buckets in the GCP Console.

      2. Click on the name of your newly created bucket (your-project-id-sensitive-data).

      3. Click UPLOAD FILES.

      4. On your local computer (not Cloud Shell), create a simple text file named secret_passwords.txt with some dummy sensitive content.

      5. Select and upload this file.

  3. On vm-attacker: Initial Attempt to Access Bucket (Expected to Fail)

    • Why: My sa-attacker-vm (the service account associated with vm-attacker) currently only has roles/compute.viewer. It should not be able to list or access objects in Cloud Storage. This confirms the initial, secure (least privilege) state of the service account before I escalate its permissions.

    • How to attempt (gcloud CLI with --command - Recommended):

        echo "Attempting initial Cloud Storage access from vm-attacker (EXPECTED TO FAIL)..."
        gcloud compute ssh vm-attacker --zone=$ZONE --project=$GCP_PROJECT_ID --command="
        # Directly use the full bucket name string here (replace with YOUR bucket name):
        gcloud storage ls gs://polar-cyclist-466100-e3-sensitive-data/
      
        # Directly use the full bucket name string here (replace with YOUR bucket name):
        gcloud storage cp gs://polar-cyclist-466100-e3-sensitive-data/secret_passwords.txt .
        "
      
    • Expected Result: Both gcloud storage commands within the SSH session should return "Permission denied" or similar authorization errors.

  4. The "Bad Permission": Grant Excessive IAM Role

    • Why: This is the core misconfiguration. I am intentionally granting sa-attacker-vm the ability to read Cloud Storage objects. This simulates a common privilege escalation vulnerability where an entity (like a VM's service account) is given more permissions than it needs, allowing it to access data it shouldn't.

    • How to grant (gcloud CLI - Recommended):

        echo "Granting 'roles/storage.objectViewer' to sa-attacker-vm..."
        gcloud projects add-iam-policy-binding $GCP_PROJECT_ID \
            --member="serviceAccount:sa-attacker-vm@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
            --role="roles/storage.objectViewer" \
            --condition=None
      

      IAM changes can take 1-2 minutes to fully propagate across GCP. I'll add a sleep command in the next step to account for this propagation time.

    • How to grant (Cloud Console - Alternative):

      1. Navigate to IAM & Admin > IAM in the GCP Console.

      2. Click + GRANT ACCESS.

      3. In the New principals field, type or select sa-attacker-vm@YOUR_PROJECT_ID.iam.gserviceaccount.com.

      4. In the Select a role field, search for Storage Object Viewer (role ID roles/storage.objectViewer).

      5. Click SAVE.

  5. On vm-attacker: Successful Data Exfiltration from Cloud Storage

    • Why: With the new, excessive permission now granted to its service account, vm-attacker can successfully access the sensitive bucket and exfiltrate the data. This is the simulated privilege escalation and data theft.

    • How to execute (gcloud CLI with --command):

        echo "Attempting successful Cloud Storage access from vm-attacker..."
        gcloud compute ssh vm-attacker --zone=$ZONE --project=$GCP_PROJECT_ID --command="
        echo \"Waiting 60 seconds for IAM propagation...\"
        sleep 60 # Give IAM changes time to propagate
      
        # Directly use the full bucket name string here (replace with YOUR bucket name):
        echo \"Attempting to list objects in the sensitive bucket (EXPECTED TO SUCCEED)...\"
        gcloud storage ls gs://polar-cyclist-466100-e3-sensitive-data/
      
        # Directly use the full bucket name string here (replace with YOUR bucket name):
        echo \"Attempting to download a sensitive file (EXPECTED TO SUCCEED)...\"
        gcloud storage cp gs://polar-cyclist-466100-e3-sensitive-data/secret_passwords.txt .
      
        echo \"Verifying content of secret_passwords.txt:\"
        cat secret_passwords.txt
        "
      
    • Expected Result: The gcloud storage ls command should list secret_passwords.txt, and gcloud storage cp should successfully download it. cat secret_passwords.txt will display the sensitive passwords. This confirms a successful privilege escalation and data exfiltration.

Scenario 3: Malicious Script Execution / Resource Abuse

Goal: I'll simulate a VM running an unauthorized, resource-intensive process, which could indicate activity like cryptomining, often a sign of compromise. This will generate metrics and logs that I can detect later.

  1. On vm-attacker: Prepare and Run a CPU-Intensive Script

    • Why: I'll use a simple Python script that performs continuous hashing. This is a CPU-bound task that will drive up vm-attacker's CPU utilization, mimicking the resource consumption of a cryptocurrency miner or other unwanted workload.

    • Action A: Create the Python script file locally in your Cloud Shell.

        # Create the Python script file in your current Cloud Shell directory.
        # Quoting the heredoc delimiter ('EOF') keeps the shell from touching
        # the quotes and braces inside the Python code.
        cat <<'EOF' > cpu_intensive_script.py
        import hashlib
        import os
        import sys
        import time

        def cpu_intensive_task(duration_seconds=300):
            start_time = time.time()
            print(f"[{time.ctime()}] Starting CPU-intensive task for {duration_seconds} seconds...")
            counter = 0
            while (time.time() - start_time) < duration_seconds:
                hashlib.sha256(os.urandom(1024)).hexdigest()
                counter += 1
            print(f"[{time.ctime()}] CPU-intensive task complete. Hashed {counter} times.")

        if __name__ == "__main__":
            duration = 300
            if len(sys.argv) > 1:
                try:
                    duration = int(sys.argv[1])
                except ValueError:
                    print("Invalid duration, using default 300 seconds.")
            cpu_intensive_task(duration)
        EOF
      
    • Action B: Copy the script to vm-attacker's home directory.

        echo "Copying script to vm-attacker..."
        gcloud compute scp cpu_intensive_script.py vm-attacker:~ --zone=$ZONE --project=$GCP_PROJECT_ID
      
    • Action C: SSH into vm-attacker and execute the script.

        echo "Starting CPU-intensive script on vm-attacker..."
        gcloud compute ssh vm-attacker --zone=$ZONE --project=$GCP_PROJECT_ID --command="
        sudo apt update -y && sudo apt install python3 -y # Ensure Python 3 is installed
      
        chmod +x cpu_intensive_script.py
      
        # Run the script in the background for 5 minutes (300 seconds)
        nohup python3 cpu_intensive_script.py 300 > cpu_script.log 2>&1 &
      
        echo \"CPU-intensive script started in the background. It will run for 5 minutes.\"
        "
      
    • Expected Result: The command should execute successfully, indicating the script has started in the background on vm-attacker. You won't see direct CPU usage immediately in your terminal, but it will begin to affect vm-attacker's CPU metrics.
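
    • Optional check: A quick follow-up command confirms the script is alive and the load is climbing (pgrep -af prints matching processes with their full command lines):

        gcloud compute ssh vm-attacker --zone=$ZONE --project=$GCP_PROJECT_ID --command="
        pgrep -af python3   # should show cpu_intensive_script.py
        uptime              # load average should be climbing on an e2-micro
        "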

Alright, the stage is set, the attacks have (simulated) occurred, and our logging and monitoring infrastructure is primed. It's time for the grand finale: Phase 5: Detecting and Analyzing the Attacks in Cloud Logging & Monitoring!

This is where I put on my detective hat. All the setup and "malicious" activities were just to generate the clues. Now, I'll use GCP's observability tools to piece together what happened, identify the vulnerabilities, and understand how I would detect these in a real-world scenario.

Phase 5: Detect & Analyze Events in Cloud Logging & Monitoring

Goal: My primary goal in this phase is to use Cloud Logging (for raw log analysis) and Cloud Monitoring (for metrics, dashboards, and alerts) to find the evidence of the simulated attacks. This demonstrates how GCP's built-in tools can be leveraged for security operations.

First, a quick refresher on the tools:

  • Cloud Logging: The central place in GCP to collect, store, analyze, and export all your logs from GCP services, VMs, and custom applications.

  • Cloud Monitoring: GCP's service for collecting metrics, creating dashboards to visualize them, and setting up alerts based on metric thresholds or log patterns.

Let's dive into finding the evidence for each scenario.

1. Cloud Logging (Logs Explorer)

  • Why Logs Explorer? This is my primary interface for searching, filtering, and analyzing all the log data collected from my GCP resources. It's like my digital crime scene investigation kit.

  • How to access:

    • Navigate to Logging > Logs Explorer in the GCP Console.

Scenario 1: Unauthorized Port Access (Firewall Misconfiguration)

This attack involved a network connection being allowed that should have been blocked. I'll look for: the firewall rule change itself, the network traffic flow, and application-level logs.

  1. Clue 1: Firewall Rule Change (Admin Activity Log)

    • What I'm looking for: Evidence that someone modified my block-malicious-traffic-initial firewall rule from DENY to ALLOW. This is an administrative action recorded in Audit Logs.

    • How to find (Logs Explorer Query):

      • You'll paste the following text into the "Query" section of the Logs Explorer (the large text box).

      • Important: If you're returning to this lab after a break, make sure to adjust the time range in the Logs Explorer to cover when you actually performed the firewall rule change! Click the time range selector at the top of Logs Explorer and choose a wider range like "Last 24 hours" or "Last 7 days" if needed.

          resource.type="gce_firewall_rule"
          protoPayload.methodName="v1.compute.firewalls.insert"
          protoPayload.request.name="block-malicious-traffic-initial"
        
        • Analyze: Look at the protoPayload.request.alloweds field (it should be an array containing an entry for TCP port 8080) in the log detail. This confirms the new rule allows the traffic. The protoPayload.authenticationInfo.principalEmail will show who made the change (your user account).
    • How to find (Logs Explorer Query - for the deletion of the DENY rule):

        resource.type="gce_firewall_rule"
        protoPayload.methodName="v1.compute.firewalls.delete"
      
      • Analyze: This log entry confirms the removal of the old rule.
    • What this means: These logs are critical administrative change records. If these actions weren't authorized (e.g., if someone else deleted the DENY rule and inserted an ALLOW rule), it would immediately indicate a compromise of an administrator's account or an insider threat.

  2. Clue 2: VPC Flow Logs (Connection Accepted)

    • What I'm looking for: Direct evidence of network traffic from vm-attacker to vm-victim on port 8080 that was allowed by the firewall. My VPC Flow Logs will show this.

    • How to find (Logs Explorer Query):

      • In Logs Explorer, clear any previous text from the main "Query" text box.

      • Then, paste the entire query below into that main "Query" text box.

      • Important: Adjust the time range! Make sure your time range selected at the top of Logs Explorer covers the exact time you executed the curl command in Scenario 1 (after changing the firewall rule to ALLOW).

      • Note on variables in queries: The query below uses ${GCP_PROJECT_ID} as a placeholder. Logs Explorer won't expand shell variables, so replace it with your actual project ID (e.g., polar-cyclist-466100-e3). Also, replace 10.128.0.2 and 10.128.0.3 with your actual VM IPs.

        logName="projects/${GCP_PROJECT_ID}/logs/compute.googleapis.com%2Fvpc_flows"
        jsonPayload.connection.src_ip="10.128.0.2"
        jsonPayload.connection.dest_ip="10.128.0.3"
        jsonPayload.connection.dest_port=8080
      • Click Run query.

      • Analyze: You should now see log entries for this specific connection. Their presence confirms the traffic flowed after the firewall rule change.

    • What this means: This is concrete proof of unauthorized network access. Correlating this with the firewall rule change shows the attack vector.
  3. Clue 3: Apache Access Logs (from Ops Agent on vm-victim)

    • What I'm looking for: Application-level evidence on vm-victim that a connection was received by Apache. My Ops Agent (installed in Phase 3) collects these.

    • How to find (Logs Explorer Query): First, get vm-victim's Instance ID. Run gcloud compute instances describe vm-victim --zone=$ZONE --format='value(id)' --project=$GCP_PROJECT_ID in Cloud Shell. Copy the numerical ID. (e.g., 1469099579618837772).

    • Paste this query into the Query section.

        resource.type="gce_instance"
        resource.labels.instance_id="1469099579618837772" # REPLACE with your vm-victim instance ID
        log_id("apache_access")
      
      • Analyze: You should find an entry indicating a GET /sensitive_data.txt request from 10.128.0.2 (or your attacker VM's IP).
    • What this means: This confirms the application itself (Apache) received the request, showing the full chain of events from network misconfiguration to application compromise.
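
    • CLI alternative: The ID lookup and the log query can be collapsed into one Cloud Shell snippet using command substitution. A sketch, under the same variable assumptions as before:

        # Look up vm-victim's instance ID and query its Apache access logs in one go.
        VICTIM_ID=$(gcloud compute instances describe vm-victim \
          --zone=$ZONE --project=$GCP_PROJECT_ID --format='value(id)')
        gcloud logging read "
          resource.type=\"gce_instance\"
          resource.labels.instance_id=\"${VICTIM_ID}\"
          log_id(\"apache_access\")
        " --project=$GCP_PROJECT_ID --freshness=1d --limit=10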

Scenario 2: Service Account Privilege Escalation (Cloud Storage Data Exfiltration)

This attack involved a service account gaining excessive permissions and then accessing sensitive data in Cloud Storage. I'll look for the IAM change and the data access.

  1. Clue 1: IAM Policy Change (Admin Activity Log)

    • What I'm looking for: The administrative action where sa-attacker-vm was granted the roles/storage.objectViewer role.

    • How to find (Logs Explorer Query):

      • Paste this query into the Query section.

      • Note: When using this query in Logs Explorer, you might also find it helpful to select "Activity" under the "Log names" filter in the UI to narrow the displayed logs.

          logName="projects/${GCP_PROJECT_ID}/logs/cloudaudit.googleapis.com%2Factivity"
          protoPayload.methodName:SetIamPolicy
          protoPayload.serviceData.policyDelta.bindingDeltas.role="roles/storage.objectViewer"
        
        • Click Run query.

        • Analyze: This log entry confirms the sensitive permission was granted. Expand the log and check protoPayload.serviceData.policyDelta to see the exact role (roles/storage.objectViewer) and member (sa-attacker-vm) that were added. The protoPayload.authenticationInfo.principalEmail will show who performed this action (your user account, in this lab).

        • What this means: This is a critical security event. Granting overly broad permissions is a common attack vector for privilege escalation. Any SetIamPolicy event that grants sensitive roles should be thoroughly investigated.
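
        • CLI cross-check: Alongside the audit trail, you can confirm what the service account is currently granted. A minimal sketch using gcloud's --flatten and --filter flags:

            # List every role currently bound to sa-attacker-vm in the project policy.
            gcloud projects get-iam-policy $GCP_PROJECT_ID \
              --flatten="bindings[].members" \
              --filter="bindings.members:sa-attacker-vm@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
              --format="table(bindings.role)"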

  2. Clue 2: Cloud Storage Data Access Logs (Object Read/List)

    • What I'm looking for: Direct evidence that sa-attacker-vm actually listed and downloaded the sensitive file from the Cloud Storage bucket. I enabled these specific "Data Access" logs in Phase 3.

    • How to find (Logs Explorer Query):

        log_id("cloudaudit.googleapis.com%2Fdata_access")
        protoPayload.authenticationInfo.principalEmail="sa-attacker-vm@polar-cyclist-466100-e3.iam.gserviceaccount.com" # REPLACE with your project ID
        (protoPayload.methodName="storage.objects.get" OR protoPayload.methodName="storage.objects.list")
        protoPayload.resourceName="projects/_/buckets/polar-cyclist-466100-e3-sensitive-data/objects/secret_passwords.txt" # REPLACE with your project ID and bucket name
      
      • Analyze: You should see entries for storage.objects.list and storage.objects.get where the principalEmail is your sa-attacker-vm service account. This is direct proof of data access. (A CLI version of this query follows below.)
    • What this means: This confirms data exfiltration. Coupled with the IAM change, it shows how a privilege escalation directly led to data compromise.
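
    • CLI alternative: The same Data Access entries can be pulled from Cloud Shell. A sketch, again assuming $GCP_PROJECT_ID is set:

        # Data Access entries produced by the attacker service account.
        gcloud logging read "
          log_id(\"cloudaudit.googleapis.com/data_access\")
          protoPayload.authenticationInfo.principalEmail=\"sa-attacker-vm@${GCP_PROJECT_ID}.iam.gserviceaccount.com\"
        " --project=$GCP_PROJECT_ID --freshness=1d --limit=10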

Scenario 3: Malicious Script Execution / Resource Abuse

This attack involved a VM running a CPU-intensive script, potentially indicative of cryptomining. I'll primarily look at metrics and VM internal logs.

Cloud Monitoring (Metrics Explorer, Dashboards, Alerts)

  • Why Cloud Monitoring? While Logs Explorer is great for detailed forensic analysis, Cloud Monitoring excels at visualizing trends, setting thresholds, and alerting on anomalies (like sustained high CPU or unusual network spikes).

  • How to access:

    • Navigate to Operations > Monitoring in the GCP Console.

Resource abuse like this is best detected with metrics rather than logs, so that's where I'll start.

  1. Clue 1: High CPU Usage (Metrics Explorer)
  • What I'm looking for: A clear, sustained spike in CPU utilization on vm-attacker corresponding to when I ran the cpu_intensive_script.py. Compute Engine reports this hypervisor-level CPU metric automatically; the Ops Agent adds more detailed in-guest metrics on top.

  • How to find (Metrics Explorer):

    • In Cloud Monitoring, navigate to Metrics Explorer.

    • Select a metric:

      • Resource Type: VM Instance

      • Metric: CPU utilization (found under Instance -> CPU)

    • Filter:

      • Add a filter for instance_name: vm-attacker
    • Group by: instance_name (optional, but helps visualize individual VM metrics)

    • Aggregator: mean (or max)

    • Aligner: mean (or max)

    • Time range: Adjust the time range (e.g., "Last 1 hour" or "Last 30 minutes") to cover when you ran the script.

    • Analyze: You should see a clear, sustained increase in CPU utilization (e.g., to 80-100%) for vm-attacker during the script's execution period. This is the primary indicator.

  • What this means: Unexplained high CPU utilization, especially on a VM that typically has low usage, is a strong indicator of compromise, cryptomining, or an unauthorized workload. This is a crucial metric for security monitoring.
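
  • API alternative: Metrics can also be pulled programmatically, which is handy for scripted checks. Below is a rough sketch against the Cloud Monitoring API's timeSeries.list endpoint; treat the filter string and date handling as assumptions to adjust for your setup:

      # Query the last hour of CPU utilization for vm-attacker via the Monitoring API.
      START=$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)
      END=$(date -u +%Y-%m-%dT%H:%M:%SZ)
      curl -s -G \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        "https://monitoring.googleapis.com/v3/projects/${GCP_PROJECT_ID}/timeSeries" \
        --data-urlencode "filter=metric.type=\"compute.googleapis.com/instance/cpu/utilization\" AND metric.labels.instance_name=\"vm-attacker\"" \
        --data-urlencode "interval.startTime=${START}" \
        --data-urlencode "interval.endTime=${END}"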

  2. Clue 2: Network Bytes (Metrics Explorer - Cross-Scenario Insight)

    • What I'm looking for: While not the primary detection for cryptomining (which is CPU-bound), observing network traffic can be useful for data exfiltration (Scenario 1 & 2).

    • How to find (Metrics Explorer):

      • Metric: VM Instance -> Network bytes received or Network bytes sent

      • Filter: instance_name="vm-victim" (for Scenario 1) or instance_name="vm-attacker" (for Scenario 2, if data was sent out to internet, though ours was internal).

      • Analyze: You might see smaller spikes corresponding to the curl or gcloud storage operations.

Building Custom Dashboards & Alerts (Advanced Detection)

For real-world monitoring, I wouldn't manually search logs or metrics every time. I'd set up dashboards for a quick overview and alerts for immediate notification.

  1. Create Custom Dashboards:

    • Why: Dashboards provide a centralized, visual overview of key metrics and log patterns.

    • How to create:

      • In Cloud Monitoring, navigate to Dashboards > Create Custom Dashboard.

      • Add Widget:

        • Line Chart: For vm-attacker CPU utilization.

        • Stacked Bar Chart: For VPC Flow Logs (log_id("compute.googleapis.com/vpc_flows")), showing the count of flows for specific IP/port combinations (you'd need to create a log-based metric for this first; see step 2 below).

        • Gauge/Scorecard: For a custom log-based metric tracking "Sensitive IAM Role Grants."

      • Save your dashboard.
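
    • Dashboards as code: Dashboards can also be defined in JSON and created with gcloud monitoring dashboards create, which makes them repeatable across projects. A minimal sketch, trimmed to a single CPU chart (the layout below is my own assumption, not a required format):

        cat > /tmp/dashboard.json <<'EOF'
        {
          "displayName": "Security Lab Overview",
          "gridLayout": {
            "widgets": [
              {
                "title": "vm-attacker CPU utilization",
                "xyChart": {
                  "dataSets": [
                    {
                      "timeSeriesQuery": {
                        "timeSeriesFilter": {
                          "filter": "metric.type=\"compute.googleapis.com/instance/cpu/utilization\" resource.type=\"gce_instance\" metric.labels.instance_name=\"vm-attacker\"",
                          "aggregation": {
                            "alignmentPeriod": "60s",
                            "perSeriesAligner": "ALIGN_MEAN"
                          }
                        }
                      }
                    }
                  ]
                }
              }
            ]
          }
        }
        EOF
        gcloud monitoring dashboards create --config-from-file=/tmp/dashboard.json --project=$GCP_PROJECT_ID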

  2. Create Log-Based Metrics (Crucial for Dashboard/Alerting on Logs):

    • Why: You can convert log patterns into numerical metrics. This allows you to graph log events on dashboards and set alerts.

    • How to create:

      • Navigate to Operations > Logging > Log-based Metrics.

      • Click CREATE METRIC.

      • For "Sensitive IAM Role Grants" (Scenario 2):

        • Metric Type: Counter

        • Log Filter: log_id("cloudaudit.googleapis.com%2Factivity") AND protoPayload.methodName="google.iam.admin.v1.IAM.SetIamPolicy" AND protoPayload.response.bindings.role="roles/storage.objectViewer"

        • Name: sensitive_iam_role_grants_counter

        • Click CREATE METRIC.

      • For "Unauthorized Port 8080 Access" (Scenario 1):

        • Metric Type: Counter

        • Log Filter: log_id("vpc_flows") AND jsonPayload.connection.src_ip="10.128.0.2" AND jsonPayload.connection.dest_ip="10.128.0.3" AND jsonPayload.connection.dest_port=8080 AND jsonPayload.action="ALLOW" (Adjust IPs for your VMs)

        • Name: unauthorized_port_8080_access

        • Click CREATE METRIC.
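
    • CLI alternative: Both metrics can also be created from Cloud Shell with gcloud logging metrics create. A sketch mirroring the filters above (adjust the IPs for your VMs):

        # Log-based metric for sensitive IAM role grants (Scenario 2).
        gcloud logging metrics create sensitive_iam_role_grants_counter \
          --project=$GCP_PROJECT_ID \
          --description="Counts grants of roles/storage.objectViewer" \
          --log-filter='log_id("cloudaudit.googleapis.com/activity") AND protoPayload.methodName:"SetIamPolicy" AND protoPayload.serviceData.policyDelta.bindingDeltas.role="roles/storage.objectViewer"'

        # Log-based metric for port-8080 flows (Scenario 1).
        gcloud logging metrics create unauthorized_port_8080_access \
          --project=$GCP_PROJECT_ID \
          --description="Counts flows from vm-attacker to vm-victim:8080" \
          --log-filter='log_id("compute.googleapis.com/vpc_flows") AND jsonPayload.connection.src_ip="10.128.0.2" AND jsonPayload.connection.dest_ip="10.128.0.3" AND jsonPayload.connection.dest_port=8080'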

  3. Create Alerting Policies:

    • Why: Alerts provide immediate notification when a suspicious condition is met, allowing for rapid response.

    • How to create:

      • In Cloud Monitoring, navigate to Alerting > Create Policy.

      • For High CPU Usage (Scenario 3):

        • Select Metric: VM Instance -> CPU utilization

        • Filter: instance_name="vm-attacker"

        • Transform data: Keep default for mean and 5 min window.

        • Configure alert trigger: Condition is above 80% for 5 minutes.

        • Notification channels: Add an email or other channel.

        • Name: High CPU on Attacker VM

        • Click CREATE POLICY.

      • For Sensitive IAM Role Grant (Scenario 2 - using Log-based Metric):

        • Select Metric: Find your custom sensitive_iam_role_grants_counter metric (under Global > Logging).

        • Transform data: Keep default for sum and 5 min window.

        • Configure alert trigger: Condition is above 0 for 5 minutes (meaning any count greater than zero).

        • Notification channels: Add an email or other channel.

        • Name: Sensitive IAM Role Granted

        • Click CREATE POLICY.

      • For Unauthorized Port 8080 Access (Scenario 1 - using Log-based Metric):

        • Select Metric: Find your custom unauthorized_port_8080_access metric.

        • Transform data: Keep default for sum and 1 min window.

        • Configure alert trigger: Condition is above 0 for 1 minute.

        • Notification channels: Add an email or other channel.

        • Name: Unauthorized Port 8080 Access

        • Click CREATE POLICY.
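
    • Policies as code: Alert policies can also be captured in JSON and created via the (alpha) gcloud surface, which is useful for rebuilding a lab quickly. The policy below is a sketch of the high-CPU alert; the field names follow the AlertPolicy resource, but treat it as a starting point rather than a drop-in:

        cat > /tmp/cpu-policy.json <<'EOF'
        {
          "displayName": "High CPU on Attacker VM",
          "combiner": "OR",
          "conditions": [
            {
              "displayName": "CPU above 80% for 5 minutes",
              "conditionThreshold": {
                "filter": "metric.type=\"compute.googleapis.com/instance/cpu/utilization\" resource.type=\"gce_instance\" metric.labels.instance_name=\"vm-attacker\"",
                "comparison": "COMPARISON_GT",
                "thresholdValue": 0.8,
                "duration": "300s",
                "aggregations": [
                  { "alignmentPeriod": "300s", "perSeriesAligner": "ALIGN_MEAN" }
                ]
              }
            }
          ]
        }
        EOF
        # Requires the alpha component; notification channels can be added in the Console afterwards.
        gcloud alpha monitoring policies create --policy-from-file=/tmp/cpu-policy.json --project=$GCP_PROJECT_ID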

Phase 6: Cleaning Up Your Lab Environment

  • Why clean up? This is a critical final step in any cloud lab! To avoid incurring unnecessary costs for resources you're no longer using and to keep your GCP project tidy, it's essential to delete all the resources I created during this lab.

I'll provide gcloud CLI commands for quick cleanup, and I'll outline the Console steps as well.

Important Note on Deletion Order: Resources sometimes have dependencies (e.g., you can't delete a network router if a NAT gateway is using it, or a service account if it's attached to a running VM). I'll provide the commands in a logical order to minimize dependency errors.

1. Delete Compute Engine VMs

  • Why: VMs are one of the primary sources of cost. Deleting them first ensures you stop accruing compute charges.

  • How to delete (gcloud CLI - Recommended):

      echo "Deleting Compute Engine VMs (vm-attacker and vm-victim)..."
      gcloud compute instances delete vm-attacker vm-victim --zone=$ZONE --project=$GCP_PROJECT_ID --quiet
    
  • How to delete (Cloud Console - Alternative):

    1. Navigate to Compute Engine > VM instances in the GCP Console.

    2. Select the checkboxes next to vm-attacker and vm-victim.

    3. Click the DELETE button at the top and confirm the deletion.

2. Delete Cloud Storage Bucket

  • Why: Even small amounts of data in Storage buckets can accrue charges over time.

  • How to delete (gcloud CLI - Recommended):

      export SENSITIVE_BUCKET_NAME="${GCP_PROJECT_ID}-sensitive-data"
      echo "Deleting Cloud Storage bucket: gs://${SENSITIVE_BUCKET_NAME}..."
      gcloud storage rm -r gs://${SENSITIVE_BUCKET_NAME} --project=$GCP_PROJECT_ID --quiet
    
  • How to delete (Cloud Console - Alternative):

    1. Navigate to Cloud Storage > Buckets in the GCP Console.

    2. Select the checkbox next to your ${GCP_PROJECT_ID}-sensitive-data bucket.

    3. Click the DELETE button at the top and confirm the deletion.

3. Delete Firewall Rules

  • Why: While generally free, keeping unnecessary firewall rules is bad security practice.

  • How to delete (gcloud CLI - Recommended):

      echo "Deleting Firewall Rules (allow-ssh-from-iap and block-malicious-traffic-initial)..."
      gcloud compute firewall-rules delete allow-ssh-from-iap block-malicious-traffic-initial --project=$GCP_PROJECT_ID --quiet
    
  • How to delete (Cloud Console - Alternative):

    1. Navigate to VPC Network > Firewall rules in the GCP Console.

    2. Select the checkboxes next to allow-ssh-from-iap and block-malicious-traffic-initial.

    3. Click the DELETE button at the top and confirm the deletion.

4. Delete Cloud NAT Gateway

  • Why: The NAT Gateway itself has a cost, even if traffic is low.

  • How to delete (gcloud CLI - Recommended):

      echo "Deleting Cloud NAT Gateway..."
      export ROUTER_NAME="nat-router-${REGION}"
      export NAT_NAME="nat-gateway-${REGION}"
      gcloud compute routers nats delete ${NAT_NAME} --router=${ROUTER_NAME} --region=$REGION --project=$GCP_PROJECT_ID --quiet
    
  • How to delete (Cloud Console - Alternative):

    1. Navigate to Network Services > Cloud NAT in the GCP Console.

    2. Select the checkbox next to nat-gateway-us-central1 (substitute your region if you used a different one).

    3. Click the DELETE button at the top and confirm the deletion.

5. Delete Cloud Router

  • Why: The Cloud Router, a prerequisite for NAT, also incurs a small cost.

  • How to delete (gcloud CLI - Recommended):

      echo "Deleting Cloud Router..."
      gcloud compute routers delete ${ROUTER_NAME} --region=$REGION --project=$GCP_PROJECT_ID --quiet
    
  • How to delete (Cloud Console - Alternative):

    1. Navigate to Network Services > Cloud Routers in the GCP Console.

    2. Select the checkbox next to nat-router-us-central1 (again, substitute your region if different).

    3. Click the DELETE button at the top and confirm the deletion.

6. Delete Custom Service Accounts

  • Why: While typically free, it's good practice to clean up unused service accounts.

  • How to delete (gcloud CLI - Recommended):

      echo "Deleting Service Accounts (sa-attacker-vm and sa-victim-vm)..."
      gcloud iam service-accounts delete sa-attacker-vm@${GCP_PROJECT_ID}.iam.gserviceaccount.com --project=$GCP_PROJECT_ID --quiet
      gcloud iam service-accounts delete sa-victim-vm@${GCP_PROJECT_ID}.iam.gserviceaccount.com --project=$GCP_PROJECT_ID --quiet
    
  • How to delete (Cloud Console - Alternative):

    1. Navigate to IAM & Admin > Service Accounts in the GCP Console.

    2. Select the checkboxes next to sa-attacker-vm and sa-victim-vm.

    3. Click the DELETE button at the top and confirm the deletion.

7. (Optional) Disable Cloud Audit Data Access Logs

  • Why: If you enabled Data Access logs for Cloud Storage, you can disable them to reduce log volume if you don't need them for other purposes in your project.

  • How to disable (gcloud CLI - Recommended):

      # 1. Fetch the current IAM policy
      gcloud projects get-iam-policy $GCP_PROJECT_ID --format=yaml > /tmp/policy.yaml
    
      # 2. Use yq to remove the audit config for storage.googleapis.com
      #    (Note: This requires yq to be installed, as it was in Phase 3)
      yq -i 'del(.auditConfigs[] | select(.service == "storage.googleapis.com"))' /tmp/policy.yaml
    
      # 3. Apply the modified IAM policy
      gcloud projects set-iam-policy $GCP_PROJECT_ID /tmp/policy.yaml
    
  • How to disable (Cloud Console - Alternative):

    1. Navigate to IAM & Admin > Audit Logs in the GCP Console.

    2. Find Google Cloud Storage in the list.

    3. Click the checkbox next to it.

    4. In the info panel on the right, uncheck Data Read and Data Write.

    5. Click SAVE.
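
  • Verify (either path): A quick policy read should show no auditConfigs entry left for storage.googleapis.com:

      gcloud projects get-iam-policy $GCP_PROJECT_ID --format="yaml(auditConfigs)"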

8. Delete the Entire GCP Project (Most Comprehensive Cleanup)

  • Why: This is the most thorough way to ensure all resources and associated configurations are removed, guaranteeing no further costs.

  • How to delete (Cloud Console - Recommended):

    1. Go to IAM & Admin > Settings in the GCP Console.

    2. Click SHUT DOWN.

    3. Enter your Project ID (mine was polar-cyclist-466100-e3) to confirm. Note: Project deletion can take several days to complete fully.
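
  • How to delete (gcloud CLI - Alternative): If you prefer the CLI here too, one command schedules the whole project for deletion (it enters the same pending-deletion window, so it remains recoverable for a limited time):

      gcloud projects delete $GCP_PROJECT_ID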

Conclusion & Next Steps

Phew! If you've made it this far, congratulations amigos! You've successfully navigated a comprehensive GCP cybersecurity lab. You've built a multi-VM environment, simulated various attack scenarios, meticulously enabled logging and monitoring, and then acted as a digital detective to unearth the evidence of those attacks using Cloud Logging and Cloud Monitoring.

It's important to note that this lab was simplified for clarity and accessibility. In the real world, detecting a sophisticated threat actor is a far more complex challenge, involving advanced threat intelligence, anomaly detection, security information and event management (SIEM) systems, and deep forensic analysis. However, this lab serves as an excellent foundation and a great way to familiarize yourself with where crucial security signals reside within GCP. Understanding where to look and how logs and metrics behave in a simulated compromise is an invaluable skill.

Your Challenge: To truly deepen your learning, I challenge you to go back to Cloud Logging's Logs Explorer and Cloud Monitoring's Metrics Explorer. Don't just copy-paste my queries. Instead:

  • Try to generate the log queries on your own. Experiment with different filters.

  • Think about what other types of events or metrics you could use to detect these scenarios.

  • Consider what insights you would genuinely benefit from in a real security operations center (SOC) for each attack type. How would you prioritize the information?

What's Next? This lab touched upon just a few facets of GCP security. Consider exploring:

  • Security Command Center's other capabilities (even in the Free Tier).

  • Setting up VPC Service Controls for data perimeter security.

  • Implementing Identity-Aware Proxy for applications, not just SSH.

  • Diving deeper into Cloud IAM best practices.

Just like always, the journey of learning cybersecurity never truly ends.

Thanks for making it to the end. Keep learning!
