Week 11 - Project Stage 2: Phase 2A - Enhancing Cloned Function Identifier


Welcome back! This article continues my previous post, “Week 11 - Project Stage 2: Phase 1&2 (Create New Pass + Identify Cloned Function)”, where I created the custom GCC pass and implemented the logic that identifies cloned functions. After reviewing the output again, I found that the pass needed a few fixes and that its output needed refining. Read on for the improvements I made :)
As always, you can find the GitHub repository here: https://github.com/Hamza-Teli/spo600-project-stage-2
What’s new?
Building upon my earlier work, I made the following enhancements:
- An enhanced variant mapping → I created a new variant map that pairs each base function name with a vector of its variants. This makes it easy to track every variant of a function.
// The execute function: this is where the magic happens
unsigned int execute (function * /*func*/) override {
// I created a static flag to ensure it only prints once
static bool end = false;
if (end) {
return 0;
}
// Let's create a map that holds the resolver function for each base name
std::map<std::string, std::string> resolverMap;
// another map to store the variant functions (the key is the basename and corresponding values are variants)
std::map<std::string, std::vector<std::string>> variantMap;
// Use cgraph node
cgraph_node *node;
// This is where we will get started with identifying the functions that have been cloned
if (dump_file) {
- Bug fix for default functions → Upon reviewing my code, I found that it excluded the default function from the variant list, because it skipped every function name that had no dot. A name without a dot is now treated as the default variant.
// Second pass goes here where we use the names inside our map and find all the variants
FOR_EACH_FUNCTION(node) {
// Get the function pointer
function *current_function_pointer = node->get_fun();
// Validate
if (!current_function_pointer)
continue;
// Get the complete function name
std::string functionName(function_name(current_function_pointer));
// Instantiate the variables
std::string baseName;
std::string suffix;
// Get the dot
size_t dot = functionName.find('.');
// Check if the dot is there
if (dot== std::string::npos){
// If there is no dot then we treat it as a default one
baseName = functionName;
suffix = "default";
functionName = baseName + ".default";
}
else {
baseName = functionName.substr(0, dot);
suffix = functionName.substr(dot + 1);
}
// If the function has the resolver suffix, it is not a variant, so skip it
if (suffix == "resolver") {
continue;
}
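To show the name-splitting step on its own, here is a minimal stand-alone sketch. The helper name split_variant_name is hypothetical and is not part of the pass; it assumes only the behaviour shown above, where a name is split at its first dot and a name without a dot becomes the default variant.

```cpp
#include <string>
#include <utility>

// Hypothetical helper (not in the original pass): splits a function name at
// its first '.' and returns {base name, suffix}.
std::pair<std::string, std::string> split_variant_name(const std::string &functionName) {
    const std::size_t dot = functionName.find('.');
    if (dot == std::string::npos) {
        // No dot: treat the undecorated function as the default variant.
        return {functionName, "default"};
    }
    // Everything before the first dot is the base, everything after is the suffix.
    return {functionName.substr(0, dot), functionName.substr(dot + 1)};
}
```

With this sketch, "scale_samples.sve2" splits into the base "scale_samples" and the suffix "sve2", while a plain "scale_samples" yields the suffix "default". A name containing several dots keeps everything after the first dot as its suffix.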
- Made the code more modular → I created a helper function that simply prints the maps I created. This way it's easier for the end user to read what's going on. (I will probably extend this further, as I still have quite a bit of duplicate code.)
// This function takes the variant map and the resolver map and prints them
void print_all_cloned_variants(const std::map<std::string, std::vector<std::string>> &variantMap, const std::map<std::string, std::string> &resolverMap) {
// dump_file is NULL when the dump is not enabled, so bail out before printing
if (!dump_file)
return;
// Loop over every base function and its variants
for (const auto &element : variantMap) {
// Get the key and value
const std::string &baseName = element.first;
const std::vector<std::string> &variants = element.second;
// Now we print the resolver in a nice, clean manner
fprintf(dump_file, "------------------------- Summary --------------------------\n");
fprintf(dump_file, "Resolver Function: %s\n", resolverMap.at(baseName).c_str());
// Now simply print the variants for that resolver function
for (const auto &variant : variants) {
fprintf(dump_file, " --------> Variant: %s\n", variant.c_str());
}
fprintf(dump_file, "-----------------------------------------------------------\n");
}
}
- Refined the output presentation → I also made the output a lot easier to read by adding extra fprintf statements that divide it into sections, so the end user can see what's going on at a glance.
// Now we check if base has a resolver
if (resolverMap.find(baseName) != resolverMap.end()) {
variantMap[baseName].push_back(functionName);
fprintf(dump_file, "**** ---> Clone variant successfully found: %s (base function: %s) with resolver: %s\n", functionName.c_str(), baseName.c_str(), resolverMap[baseName].c_str());
fprintf(dump_file, "--------------------------------------------------------------------------------\n");
}
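The grouping logic can also be tried outside GCC. The following is a rough sketch under stated assumptions: group_variants and the VariantMap alias are hypothetical names, and the names vector stands in for the function names that the pass reads from the call graph with FOR_EACH_FUNCTION.

```cpp
#include <map>
#include <string>
#include <vector>

using VariantMap = std::map<std::string, std::vector<std::string>>;

// Hypothetical stand-alone version of the second pass. `names` replaces the
// names the real pass gets from the cgraph; `resolverMap` maps base name to
// resolver name, as in the pass.
VariantMap group_variants(const std::vector<std::string> &names,
                          const std::map<std::string, std::string> &resolverMap) {
    VariantMap variantMap;
    for (const std::string &name : names) {
        const std::size_t dot = name.find('.');
        // substr(0, npos) returns the whole string when there is no dot.
        const std::string baseName = name.substr(0, dot);
        const std::string suffix =
            (dot == std::string::npos) ? "default" : name.substr(dot + 1);
        if (suffix == "resolver")
            continue;  // the resolver itself is not a variant
        if (resolverMap.count(baseName) == 0)
            continue;  // no resolver for this base, so it was not cloned
        variantMap[baseName].push_back(baseName + "." + suffix);
    }
    return variantMap;
}
```

Given the names "scale_samples.sve2", "scale_samples", "scale_samples.resolver" and "main", plus a resolver entry for "scale_samples", the sketch groups "scale_samples.sve2" and "scale_samples.default" under "scale_samples" and ignores the rest.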
- Bug fix for output → The print_all_cloned_variants function was printing on every iteration, so I moved the call past the if (dump_file) block and added a static bool flag that records whether the end has been reached.
// Custom function that prints the map created
///.... FOR EACH
}
print_all_cloned_variants(variantMap, resolverMap);
// Set end to true
end = true;
// Return value
return 0;
}
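The run-once guard can be seen in isolation in the small sketch below. It is only an illustration: execute_once and summaries_printed are hypothetical names, and the counter stands in for the summary printing that the real pass does.

```cpp
// Hypothetical counter, used here only to observe how often the work runs.
static int summaries_printed = 0;

// Stand-in for execute(): a function-local static keeps its value across
// calls, so the body after the guard runs only on the first call.
unsigned int execute_once() {
    static bool end = false;
    if (end) {
        return 0;  // already done on an earlier call
    }
    ++summaries_printed;  // stands in for printing the summary
    end = true;
    return 0;
}
```

Calling execute_once() three times leaves summaries_printed at 1, which mirrors how the flag keeps the summary from being printed once per call.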
Testing & Output
After making these changes, I tested the pass by re-running make clean and make inside the following directory: spo600/examples/test-clone. It created the following files:
clone-test-aarch64-noprune-clone-test-core.c.265t.hteli1
clone-test-aarch64-prune-clone-test-core.c.265t.hteli1
PRUNE File Output
[hteli1@aarch64-002 test-clone]$ cat clone-test-aarch64-prune-clone-test-core.c.265t.hteli1
;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=5500, cgraph_uid=24, symbol_order=23)
--------------------------------------------------------------------------------
**** ---> Resolver was found for base function: scale_samples
**** ---> Clone variant successfully found: scale_samples.rng (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
**** ---> Clone variant successfully found: scale_samples.default (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
------------------------- Summary --------------------------
Resolver Function: scale_samples.resolver
--------> Variant: scale_samples.rng
--------> Variant: scale_samples.default
-----------------------------------------------------------
NOPRUNE File Output
[hteli1@aarch64-002 test-clone]$ cat clone-test-aarch64-noprune-clone-test-core.c.265t.hteli1
;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=5500, cgraph_uid=24, symbol_order=23)
--------------------------------------------------------------------------------
**** ---> Resolver was found for base function: scale_samples
**** ---> Clone variant successfully found: scale_samples.sve2 (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
**** ---> Clone variant successfully found: scale_samples.default (base function: scale_samples) with resolver: scale_samples.resolver
--------------------------------------------------------------------------------
------------------------- Summary --------------------------
Resolver Function: scale_samples.resolver
--------> Variant: scale_samples.sve2
--------> Variant: scale_samples.default
-----------------------------------------------------------
Conclusion
In conclusion, these changes improved the logic and made my code cleaner and more modular. I think the code now works as required, so I will move on to the next stage. If I need to make any further changes, I will document them in a new article, so stay tuned!
In the next stage, I will create the logic to compare the variants I found. I plan to do this by reusing the variant map I created here.