Project 1: Creating a Custom GCC Pass

Samarth Sharma

Introduction

For Stage 1 of my SPO600 project, I developed a custom pass for the GCC compiler that analyzes the code being compiled: it prints each function's name, counts its basic blocks, and reports the total number of GIMPLE statements in the function. The output gives insight into code structure during compilation.

Environment Setup

I established a three-directory structure for my GCC development environment:

# Directory structure
~/git/gcc/         # GCC source code repository
~/gcc-build-001/   # Build directory
~/gcc-test-001/    # Installation and testing directory

After reviewing the GCC documentation, I identified four key files that would need modification to implement a custom pass:

  1. A new source file (tree-my-pass.cc) - Containing the pass implementation

  2. passes.def - To register my pass in the compilation pipeline

  3. tree-pass.h - To declare my pass's factory function

  4. Makefile.in - To integrate my new source file into the build system

Developing the Custom Pass

Step 1: Creating the Pass Source File

First, I created a new file tree-my-pass.cc in the GCC source code directory with the following implementation:

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree.h"
#include "gimple.h"
#include "tree-pass.h"
#include "gimple-iterator.h"
#include "pass_manager.h"
#include "basic-block.h"

namespace {

const pass_data my_pass_data = {
    GIMPLE_PASS,         // Pass type: operates on GIMPLE
    "my-pass",           // Name of the pass
    OPTGROUP_NONE,       // No optimization group
    TV_NONE,             // No timing variable (timevar) id
    0,                   // No properties required
    0,                   // No properties provided
    0,                   // No properties destroyed
    0                    // No specific flags
};

class my_pass : public gimple_opt_pass {
public:
    my_pass(gcc::context *ctxt)
        : gimple_opt_pass(my_pass_data, ctxt) { }

    // The gate method: always run this pass
    bool gate(function *) override {
        return true;
    }

    // The execute method: called for every function
    unsigned int execute(function *fun) override {
        if (dump_file) {
            fprintf(dump_file, "Processing function: %s\n", function_name(fun));

            // Count basic blocks
            int basic_block_count = 0;
            basic_block bb;
            FOR_EACH_BB_FN(bb, fun) {
                basic_block_count++;
            }
            fprintf(dump_file, "Number of basic blocks: %d\n", basic_block_count);

            // Count GIMPLE statements
            int gimple_stmt_count = 0;
            FOR_EACH_BB_FN(bb, fun) {
                for (gimple_stmt_iterator gsi = gsi_start_bb(bb);
                     !gsi_end_p(gsi);
                     gsi_next(&gsi)) {
                    gimple_stmt_count++;
                }
            }
            fprintf(dump_file, "Number of GIMPLE statements: %d\n", gimple_stmt_count);
        }
        return 0;
    }
};

} // anonymous namespace

// Factory function to create an instance of the pass
gimple_opt_pass *
make_pass_my_pass(gcc::context *ctxt)
{
    return new my_pass(ctxt);
}

This implementation:

  • Defines a GIMPLE pass that runs on each function

  • Counts basic blocks in each function

  • Counts GIMPLE statements in each function

  • Outputs the results to the dump file

Step 2: Registering the Pass in passes.def

To register my pass in the GCC pipeline, I added the following line to passes.def:

NEXT_PASS (pass_my_pass);

This registers my pass by name. The position of the line within passes.def determines where in the compilation pipeline the pass executes.

Step 3: Declaring the Pass in tree-pass.h

To make my pass's factory function available, I added this declaration to tree-pass.h:

extern gimple_opt_pass *make_pass_my_pass (gcc::context *ctxt);

Step 4: Updating Makefile.in

To include my pass in the build, I added it to the list of object files in Makefile.in:

tree-my-pass.o \

I made sure to use the same indentation and trailing backslash as other entries to maintain proper Makefile syntax.

Building and Testing Process

Building GCC with My Custom Pass

I configured and built GCC with my custom pass:

cd ~/gcc-build-001
../git/gcc/configure --prefix=$HOME/gcc-test-001 --enable-languages=c --disable-bootstrap --disable-multilib
time make -j$(nproc) |& tee rebuild.log
make install

Testing the Custom Pass

After resolving the build issues, I tested my pass with a simple C program:

#include <stdio.h>

int add(int a, int b) {
    return a + b;
}

int main() {
    int result = add(5, 3);
    printf("Result: %d\n", result);
    return 0;
}

Compiling with the dump for my custom pass enabled:

~/gcc-test-001/bin/gcc -fdump-tree-my-pass test.c -o test

This generated a dump file test.c.234t.my-pass with output like:

Processing function: add
Number of basic blocks: 1
Number of GIMPLE statements: 1

Processing function: main
Number of basic blocks: 2
Number of GIMPLE statements: 4

Reflections and Learning

This project gave me a good understanding of GCC's pass management system and intermediate representations. What surprised me most was how GIMPLE breaks seemingly simple operations apart: a single line of C code can generate multiple GIMPLE statements, revealing the complexity hidden beneath high-level code. Another thing I noticed was how aggressively GCC consolidates basic blocks during optimization. Functions that looked like they should have many control paths were often reduced to just 2-3 basic blocks, showing how the compiler eliminates redundant code paths that we might not notice.

This project's code is available at my GitHub repository: https://github.com/samartho4/SPO-600/

Troubleshooting
