Welcome again everyone, I want to start off by saying that I finally came up with a solution that works for stage 2 and also meets the criteria of stage 3 by including more than 2 variants! This solution took me quite some time. In fact, I spent several days deciphering where I went wrong and started to compile an entirely new solution.

I’ll first start off by giving some context as its been quite some time since my last artcle: When I started building my GCC pass for clone pruning, my goal was to find and analyze cloned functions that are generated by AFMV and determine if these variants are identical. If they are, we prune these functions. I found that my first implementation was a strong one, but it had limitations and didn’t achieve the expectations. I’ll go over the list of things that went wrong with the old code. Please visit the following link to see the old code: https://github.com/Hamza-Teli/spo600-project-stage-2/blob/main/tree-hteli1.cc

Old Code Issues

There were several issues with the old code which are listed and explained below:

Manual variant handling → The old logic relied on two maps, for resolver and variants and assumed each base function has only two variants. It also hard-coded for default functions so if a .default suffix was mising, it would break
Two passes: I used two seperate FOR_EACH_FUNCTION macros. The first pass identifies resolver functions and the second gathers variants. This is quite inefficient.
Incorrect basic block and gimple counts: I think because I was iterating again and again, I think it gave incorrect counts for GIMPLE and basic blocks.

New Code Changes

Now I’ll discuss the changes I made to the new code and the approach I took to finally get a working solution. However, please note that I will also discuss the limitations of my code towards the end. Please see the new code here and below as well: https://github.com/Hamza-Teli/spo600-project-stage-2/blob/main/tree-hteli1-final-version.cc

/* This pass was creatd by Hamza Teli with the help of
Professor Chris Tyler's SPO600 wiki and lecture. 

This pass accomplishes the following:


*/
// These headers were taken from Professor Chris Tyler's Week 7 Lecture
// These headers were taken from Professor Chris Tyler's Week 7 Lecture
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree.h"
#include "gimple.h"
#include "tree-pass.h"
#include "ssa.h"
#include "tree-pretty-print.h"
#include "gimple-iterator.h"
#include "gimple-walk.h"
#include "internal-fn.h"
#include "gimple-pretty-print.h"
#include "cgraph.h"

// Added headers
#include "gimple-ssa.h"
#include "attribs.h"
#include "pretty-print.h"
#include "tree-inline.h"
#include "intl.h"
#include "function.h"
#include "basic-block.h"
#include <string>
#include <map>
#include <vector>
#include <cstdio>


// Namespace <--- This section I learned from SPO600 Week 7 - Class 1 Lecture from Professor Chris Tyler
namespace{

    struct FunctionMeta {
        std::string name;          // the name of the function
        function *funcPtr;         // pointer to function
        size_t gimpleCount;        // number of gimple
        size_t basicBlockCount;    // number of basic blocks
    };

    // Then we can store it in a map
    static std::map<std::string, FunctionMeta> functionMap;


    const pass_data pass_data_hteli1 = {
            GIMPLE_PASS, /* type */
            "hteli1", /* name of my pass [We will use this inside passes.def as pass_hteli1_pass]*/ 
            OPTGROUP_NONE, /* optinfo_ flags */
            TV_NONE, /* tv_id */
            PROP_cfg, /* specify that we need properties */
            0, /* the properties provided */
            0, /* the properties destroyed */
            0, /* todo_flags_start */
            0, /* todo_flags_finish */
    };

    /*
        Please refer to the instructions below from Professor CHris Tyler that helped me build the class:
        Pass Code
            Each pass provides a C++ source file which provides:
            -   A pass_data structure which defines the pass details (defined in tree-pass.h);
            -   A class which defines a gimple_opt_pass and includes the public methods gate and execute.
            -   The gate method controls whether the pass is active. It can consider multiple factors, including command-line arguments, source language, and target. This method returns a boolean.
            -   The execute method executes the pass. It accepts a pointer to a function object and will be called once for each function in the unit being compiled (which is not always what you want!).
            -   A method in the anon namespace named make_pass_name which returns a pointer to the gimple_opt_pass described above.
    */


    // This is where you identify the class
    class pass_hteli1 : public gimple_opt_pass {
        public:

            // Constructor -------------------
            pass_hteli1(gcc::context *ctxt) : gimple_opt_pass(pass_data_hteli1, ctxt) {

            }

            // The gate function 
            bool gate(function *) final override {
                // return 
                return true;

            }

            // The execute function: this is where the magic happens
            unsigned int execute (function * func) override {

                // Get function name
                std::string functionName = IDENTIFIER_POINTER(DECL_NAME(func->decl));

                // Get basic blocks
                size_t basicBlockCount = get_number_of_basic_blocks(func);
                // Get gimple count
                size_t gimpleCount = get_number_of_gimple_statements(func);

                functionMap[functionName] = {
                    functionName,
                    func,
                    gimpleCount,
                    basicBlockCount
                };

                // This is where we will get started with identifying the functions that have been cloned
                if (dump_file) {
                    fprintf(dump_file, "Function: %s\n", functionName.c_str());
                    fprintf(dump_file, "  Basic Blocks: %zu\n", basicBlockCount);
                    fprintf(dump_file, "  GIMPLE Stmts: %zu\n", gimpleCount);
                    fprintf(dump_file, "---------------------------\n");
                }

                print_function_map_summary();

                // Return value
                return 0;
            }


            // Get the number of basic blocks
            size_t get_number_of_basic_blocks(function *func) {
                // Instantiate counter
                size_t count = 0;

                basic_block bb;
                FOR_EACH_BB_FN(bb, func){
                    count++;
                }

                return count;
            }

            // Get the number of statements inside the function
            size_t get_number_of_gimple_statements(function *func) {
                // Initialize count and basic block
                size_t count = 0;
                basic_block bb;


                FOR_EACH_BB_FN(bb, func) {
                    for (gimple_stmt_iterator gsi = gsi_start_bb(bb);
                        !gsi_end_p(gsi);
                        gsi_next(&gsi)) {

                        gimple *stmt = gsi_stmt(gsi);
                        if (!stmt) continue; 
                        count++;
                    }
                }

                if (dump_file) {
                    fprintf(dump_file, "Number of gimple statements is %zu\n", count);
                }

                return count;
            }

            // Print the entire map; called at end of each execute, but only
            // the last invocation will show the full dataset.
            void print_function_map_summary() const {
                if (!dump_file)
                    return;

                fprintf(dump_file, "\n=== FunctionMap Summary (end of current execute) ===\n");

                for (const auto &pair : functionMap) {
                    const auto &baseEntry = pair.second;
                    const std::string &baseName = baseEntry.name;

                    // Skip functions that are not base (i.e., have a dot in their name)
                    if (baseName.find('.') != std::string::npos)
                        continue;

                    // Construct resolver name
                    std::string resolverName = baseName + ".resolver";

                    // Check if resolver exists
                    if (functionMap.find(resolverName) == functionMap.end())
                        continue; // No resolver, skip

                    // Now gather all variants that start with baseName + "." and are NOT the resolver
                    std::vector<FunctionMeta> variants;
                    for (const auto &innerPair : functionMap) {
                        const std::string &variantName = innerPair.first;
                        if (variantName.find(baseName + ".") == 0 &&
                            variantName != resolverName) {
                            variants.push_back(innerPair.second);
                        }
                    }

                    // Compare base function with each variant
                    for (const auto &variant : variants) {
                        fprintf(dump_file,
                                "Comparing base '%s' with variant '%s'\n",
                                baseName.c_str(), variant.name.c_str());

                        if (variant.basicBlockCount == baseEntry.basicBlockCount &&
                            variant.gimpleCount == baseEntry.gimpleCount) {
                                fprintf(dump_file,
                                    "PRUNE: %s (BB: %zu, GIMPLE: %zu) and %s (BB: %zu, GIMPLE: %zu) are identical\n",
                                    baseName.c_str(), baseEntry.basicBlockCount, baseEntry.gimpleCount,
                                    variant.name.c_str(), variant.basicBlockCount, variant.gimpleCount);

                        } else {
                            fprintf(dump_file,
                                "NOPRUNE: %s (BB: %zu, GIMPLE: %zu) and %s (BB: %zu, GIMPLE: %zu) differ\n",
                                baseName.c_str(), baseEntry.basicBlockCount, baseEntry.gimpleCount,
                                variant.name.c_str(), variant.basicBlockCount, variant.gimpleCount);
                        }
                    }
                }

                fprintf(dump_file, "=== End Summary ===\n");
            }
    };
}

// This is used inside the tree-pass.h file
gimple_opt_pass* make_pass_hteli1(gcc::context *ctxt) {
    return new pass_hteli1(ctxt);
}

A new struct that holds the functions attributes I want to record such as function name, pointer, GIMPLE count, and basic block count → I created a new struct named FunctionMeta that holds these attributes. Please note, that because this pass its running each time for each function, all I had to do was uncomment the function pointer parameter in the execute function, and use that to get the name. I then used the old methods I created in the old pass named get_number_of_gimple_statements and get_number_of_basic_blocks. Then I just kept adding to the static vector that holds a series of FunctionMeta structs.
New print_function_map_summary() method → This function basically analyzes the whole map at once and prints the results
Improved Variant detection → So instead of assuming each base has 2 variants, the new code first finds the base function, then its corresponding resolver, and anything else it captures as long as the base name matches.
Logic to print PRUNE/NOPRUNE → All variants are compared against the base and its printed as PRUNE or NOPRUNE as long as the GIMPLE and basic block counts match.

This is a brief overview of what was changed, and now I’ll show you a demonstration on each machine.

Outputs

For both machines, after I made the changes to my tree pass inside git/gcc/gcc directory. I executed the following commands:

Please navigate to the gcc-build-001 directory and execute the following commands:

time make -j$(nproc) |& tee fourth_build.log
make install
addgcc

PLEASE NOTE: addgcc is an alias I created from before. If you don’t know what this alias does, please read my previous article. All it does is it switches my gcc compiler to the test one.

x86 Output

On the x86 machine, I navigated to the test_clone directory inside the spo600 folder that Professor Chris Tyler provided. Again, if you are confused, please read the previous article. I went into this directory and executed make clean then make. This generated two files that I viewed that shows my pass works! This can be seen below:

PRUNE File Output - x86


Number of gimple statements is 58
Function: main
  Basic Blocks: 4
  GIMPLE Stmts: 58
---------------------------

=== FunctionMap Summary (end of current execute) ===
Comparing base 'scale_samples' with variant 'scale_samples.popcnt'
PRUNE: scale_samples (BB: 31, GIMPLE: 165) and scale_samples.popcnt (BB: 31, GIMPLE: 165) are identical
=== End Summary ===

NOPRUNE File Output - x86

Number of gimple statements is 58
Function: main
  Basic Blocks: 4
  GIMPLE Stmts: 58
---------------------------

=== FunctionMap Summary (end of current execute) ===
Comparing base 'scale_samples' with variant 'scale_samples.arch_x86_64_v3'
NOPRUNE: scale_samples (BB: 31, GIMPLE: 165) and scale_samples.arch_x86_64_v3 (BB: 40, GIMPLE: 193) differ
=== End Summary ===

As you can see that it works and shows the necessay meta data that I captured inside the FunctionMeta map. It makes a comparison and outputs the necessary result! (Took me so long but it was all worth it)

Let’s move on to the AArch64 system and test out my gcc there now.

AArch64 output

Again, I re-built the gcc, switched the system gcc to my custom one, and compiled the test_clone files. Let’s look at the output below

PRUNE File Output - AArch64


Number of gimple statements is 55
Function: main
  Basic Blocks: 4
  GIMPLE Stmts: 55
---------------------------

=== FunctionMap Summary (end of current execute) ===
Comparing base 'scale_samples' with variant 'scale_samples.rng'
PRUNE: scale_samples (BB: 32, GIMPLE: 130) and scale_samples.rng (BB: 32, GIMPLE: 130) are identical

NOPRUNE File Output - AArch64


Number of gimple statements is 55
Function: main
  Basic Blocks: 4
  GIMPLE Stmts: 55
---------------------------

=== FunctionMap Summary (end of current execute) ===
Comparing base 'scale_samples' with variant 'scale_samples.sve2'
NOPRUNE: scale_samples (BB: 32, GIMPLE: 130) and scale_samples.sve2 (BB: 13, GIMPLE: 49) differ
=== End Summary ===

If only you can see the smile on my face right now because it works :)

I want to conclude this section by saying that on both systems, the correct PRUNE and NOPRUNE output is being displayed. Now I will go into the logic of my code that allows it to meet the criteria of stage 3.

Stage 3 Logic

If you look at my pass I have a function named print_function_map_summary(). This function contains the logic that accounts for each variant, no matter how many there are. You can see the code below as well

// Print the entire map; called at end of each execute, but only
            // the last invocation will show the full dataset.
            void print_function_map_summary() const {
                if (!dump_file)
                    return;

                fprintf(dump_file, "\n=== FunctionMap Summary (end of current execute) ===\n");

                for (const auto &pair : functionMap) {
                    const auto &baseEntry = pair.second;
                    const std::string &baseName = baseEntry.name;

                    // Skip functions that are not base 
                    if (baseName.find('.') != std::string::npos)
                        continue;

                    // Construct resolver name
                    std::string resolverName = baseName + ".resolver";

                    // Check if resolver exists
                    if (functionMap.find(resolverName) == functionMap.end())
                        continue; // No resolver, skip


                    std::vector<FunctionMeta> variants;
                    for (const auto &innerPair : functionMap) {
                        const std::string &variantName = innerPair.first;
                        if (variantName.find(baseName + ".") == 0 &&
                            variantName != resolverName) {
                            variants.push_back(innerPair.second);
                        }
                    }

                    // Compare base function with each variant
                    for (const auto &variant : variants) {
                        fprintf(dump_file,
                                "Comparing base '%s' with variant '%s'\n",
                                baseName.c_str(), variant.name.c_str());

                        if (variant.basicBlockCount == baseEntry.basicBlockCount &&
                            variant.gimpleCount == baseEntry.gimpleCount) {
                                fprintf(dump_file,
                                    "PRUNE: %s (BB: %zu, GIMPLE: %zu) and %s (BB: %zu, GIMPLE: %zu) are identical\n",
                                    baseName.c_str(), baseEntry.basicBlockCount, baseEntry.gimpleCount,
                                    variant.name.c_str(), variant.basicBlockCount, variant.gimpleCount);

                        } else {
                            fprintf(dump_file,
                                "NOPRUNE: %s (BB: %zu, GIMPLE: %zu) and %s (BB: %zu, GIMPLE: %zu) differ\n",
                                baseName.c_str(), baseEntry.basicBlockCount, baseEntry.gimpleCount,
                                variant.name.c_str(), variant.basicBlockCount, variant.gimpleCount);
                        }
                    }
                }

                fprintf(dump_file, "=== End Summary ===\n");
            }

So how does it accomplish this? There are 3 core steps.

Look through each function in the map → It’ll skip the resolver and consider the base functions
Collect all variants → It does this by using a for loop for each base function and collecting all its variants by using its basename and ensuring its not a resolver
Compares the data inside the struct

This implementation took me long and it finally works. Although it works, I do want to comment on its limitations and inefficiencies.

Limitations of current implementation

If you look at my code, you can tell that the vector map I created to hold each FunctionMeta struct is done for every single function. Which in my opinion is very inefficient. Theres no need to collect the data of variants and functions without a resolver. We should just discard them from the beginning. So I think its better if I filter out non-relevant functions earlier.

Secondly, the print_function_map_summary() method, is called for each pass on every function. This too is very inefficient as I’m logging redundant information again and again, the summary is only printed correctly at the end of the pass because that includes every function. That too is another issue. So what I would do differently is figure out how I can ensure that is only called after the last function pass. But is there a method or way of knowing? I have no idea and have not read enough documentation to know.

Lastly, I believe my pass does not include additional important checks that Professor Chris Tyler pointed out as well. For example, while my pass does compare the number of basic blocks and gimple statements, it doesnt go further. Like for example I don’t check the type of operation being performed in assignments. In addition, I’m not tracking how variables are reused or if the operands are the same. If I did this, my comparison would be more accurate.

In conclusion, I believe that by adding these optimizations like first filtering out functions,one singular output, and additional checks would drastically improve the performance and accuracy of my pass.

Conclusion

Because this is my last article for this course, I want to conclude by discussing more about this course to others. Who knows, maybe before taking the course, you come across this article. I want to say to other students at Seneca that this course is definitely not easy. If you are looking for a professional option thats easy, you are in the wrong place. HOWEVER, if you are looking to learn some new concepts I cannot recommend this course enough. Professor Chris Tyler challenges you and provides the opportunity for you to work on some very interesting topics. I can say with full confidence that I learned a lot during this course. It made me think in different ways, I brushed up on my C programming syntax as well, and looked at solutions differently. Thank you Professor Chris Tyler!

References

Please visit the SPO600 wiki (the link has changed): http://spo600.proximity.on.ca/doku.php?id=spo600:start

Project Stage 2 Fix & Stage 3 Implementation!

Table of contents