Static Compilation. Where is my printf?

Static compilation, more precisely static linking, is the process of building a computer program so that all the library code the program depends on is included within the program's executable file. This is done by linking the program against static libraries (.a files on Unix-like systems, .lib files on Windows) rather than dynamic libraries (.so files on Unix-like systems, .dll files on Windows).

When you build a program statically, the linker copies the library routines the program uses directly into the executable. The advantages of static compilation include:

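  • Portability: The resulting executable is self-contained, which means it does not depend on the system's shared libraries and can be run on any system with a compatible kernel and architecture without additional dependencies. One caveat: with glibc, functions that rely on name-service lookups, such as getaddrinfo, still load shared libraries at runtime even in a static build.

  • Performance: Statically linked programs can start and run slightly faster because the dynamic loader does not have to resolve symbols at startup, and library functions are called directly rather than through the indirection that dynamic linking adds.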

  • Reliability: Since all the code the program needs is contained within its own executable, it's not susceptible to issues like "dependency hell" or problems arising from the wrong version of a shared library being present on the system.

However, there are also disadvantages:

  • Size: Statically compiled executables are typically larger because they include all the code they use, rather than sharing common libraries across the system.

  • Updates: If a bug in a library is fixed, or the library is otherwise improved, you need to relink the program against the updated static library to benefit from the changes. With dynamic libraries, you can simply update the library on the system.

  • Memory Usage: Different statically linked programs each carry their own copy of the library code, so that code is not shared between them in memory the way a single shared library is. Multiple instances of the same executable still share its code pages, but across many different programs the total memory usage is higher.

Overall, static compilation produces executables that are self-contained and include all required library code. This approach has its advantages in terms of simplicity and reliability, but it leads to larger files and to the same library code being duplicated across programs.

On a Linux system, tools like objdump and nm can be used to examine statically built C programs to find the printf function and its call locations within the binary. Allow me to show you the way:

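  1. Compile the Program Statically: First, you need to compile your C program statically. You can do this using the -static flag with gcc. On some distributions the static C library is packaged separately (for example, glibc-static on Fedora) and must be installed first: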

     gcc -static -o myprogram myprogram.c
    
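  2. Identify the printf Function in the Binary: Use the nm tool to list the symbols in the binary. Because printf is statically linked, its code and its symbol are part of the binary:

     nm --defined-only myprogram | grep ' printf$'

    This prints the address of the printf function within your binary. Anchoring the pattern with $ filters out any other symbols whose names merely start with printf. If the binary has been stripped, nm finds no symbols and this step does not work.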

  3. Disassemble the Binary: Use objdump to disassemble the binary and find the printf code:

     objdump -d myprogram > myprogram.asm

    Then search myprogram.asm for the address found with nm. The disassembled code for printf starts at a label line made up of that address followed by the symbol name in angle brackets.

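  4. Find Calls to printf: To find where printf is being called from, search the disassembly for call instructions that target it:

     grep -B 5 'call.*<printf>' myprogram.asm

    The -B 5 flag shows the 5 lines before each call instruction, which usually include the instructions that set up the arguments. The address at the start of each matching line is the address of an instruction that calls printf. To identify the calling function, scroll up from the match to the nearest label line of the form <name>:. If nothing matches, two things are worth checking. First, GCC replaces simple calls such as printf("hello\n") with puts, so the program may not call printf at all. Second, objdump may label the target with another symbol that shares the same address, in which case searching for the address from step 2 works instead.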

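  5. Analyze the Call Sites: Each call site has an address in the disassembly. You can look around these addresses to understand the context of the call, such as which function is making the call and what parameters are being passed. On AMD64 Linux, the first argument, the pointer to the format string, is passed in the rdi register, and further integer arguments follow in rsi, rdx, and so on.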

The disassembly shows the assembly language representation of the machine code, that is, the actual binary code for printf. The output of objdump can be difficult to understand if you are not familiar with assembly language.

These steps assume that you are using an AMD64 machine and have the necessary tools available in a Unix-like environment. The process and the tools may differ on other systems or architectures.
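
A quick way to check these assumptions before starting is sketched below. It only reports what it finds and changes nothing on the system. The variable name missing is arbitrary.

```shell
# Machine architecture; on AMD64 Linux this prints x86_64.
arch=$(uname -m)
echo "architecture: $arch"

# Check that the tools used in the steps are on the PATH.
missing=""
for tool in gcc nm objdump; do
  command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
done

if [ -n "$missing" ]; then
  echo "missing tools:$missing"
else
  echo "gcc, nm and objdump are available"
fi
```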

Written by

Jyotiprakash Mishra

I am Jyotiprakash, a deeply driven computer systems engineer, software developer, teacher, and philosopher. With a decade of professional experience, I have contributed to various cutting-edge software products in network security, mobile apps, and healthcare software at renowned companies like Oracle, Yahoo, and Epic. My academic journey has taken me to prestigious institutions such as the University of Wisconsin-Madison and BITS Pilani in India, where I consistently ranked among the top of my class. At my core, I am a computer enthusiast with a profound interest in understanding the intricacies of computer programming. My skills are not limited to application programming in Java; I have also delved deeply into computer hardware, learning about various architectures, low-level assembly programming, Linux kernel implementation, and writing device drivers. The contributions of Linus Torvalds, Ken Thompson, and Dennis Ritchie—who revolutionized the computer industry—inspire me. I believe that real contributions to computer science are made by mastering all levels of abstraction and understanding systems inside out. In addition to my professional pursuits, I am passionate about teaching and sharing knowledge. I have spent two years as a teaching assistant at UW Madison, where I taught complex concepts in operating systems, computer graphics, and data structures to both graduate and undergraduate students. Currently, I am an assistant professor at KIIT, Bhubaneswar, where I continue to teach computer science to undergraduate and graduate students. I am also working on writing a few free books on systems programming, as I believe in freely sharing knowledge to empower others.