Optimizing C++ Code for CKB-VM

Cryptape

This article documents the optimization efforts to directly port C++ code over to the CKB Virtual Machine (CKB-VM) without any modifications, using Bitcoin’s code as the example.

It explores the challenges in tailoring the Bitcoin codebase for CKB-VM (ckb-bitcoin-vm) and details the innovative approaches to overcome binary size constraints, shedding light on the potential for porting more C/C++ codebases to the CKB-VM.

Why musl libc & LLVM libcxx

Feel free to skip this recap section if you are familiar with the latest in CKB contract development.

Early Days With GNU

In the very early days, we used RISC-V's patched GNU toolchain to compile C and C++ code (Polyjuice, the Ethereum compatible layer on CKB, worked this way). GNU toolchain provided us with a version of newlib libc (though we patched it a bit since CKB-VM lacked an MMU) on RISC-V. It also enabled GCC's libstdc++ to compile C++ code. At this stage, we were cool as long as we stuck with the complete GNU toolchain.

Transitioning to LLVM

As time went by, LLVM’s support for RISC-V rapidly caught up. We gained more by switching over to Clang and the LLVM toolchain, thanks to its stability, popularity (more maintainers are actively working on LLVM these days) and ubiquity (LLVM binary releases are available for almost all platforms). The official LLVM distribution then became the go-to for building CKB-VM programs, no changes or forks needed.

Challenges in New Workflow

This new workflow, while beneficial, certainly came with its own caveats: LLVM releases do not ship with a C standard library (typically named libc) nor a C++ standard library (conventionally named libc++) for RISC-V, which can be problematic for non-trivial C/C++ code. In the past we maintained a crappy subset of libc, which was good enough for the libraries we used, but it could not solve the whole problem. In addition, C++ code had been pretty much ignored.

Porting libc & libc++ to CKB-VM

Lately, we've been porting full-fledged libc and libc++ to CKB-VM, so that C/C++ code can be compiled without hacks. The two libraries we ported over to CKB-VM's RISC-V environment are:

  • musl libc: a mature, high-quality, and cross-platform implementation of C standard library. Visit the GitHub repo.

  • LLVM libcxx: the C++ standard library from the LLVM suite, with over a billion active users across Apple’s macOS, iOS, watchOS, and tvOS, Google Search, the Android operating system, FreeBSD, etc. Visit the GitHub repo.

Now, one can combine the official LLVM toolchain, musl libc, and LLVM libcxx together to compile much non-trivial C/C++ code to CKB-VM. To me, this would unlock a new set of possibilities for CKB-VM, including the example used in this post: we compiled part of the Bitcoin implementation in C++ directly to CKB-VM without any code changes.

It's worth mentioning that only musl libc requires slight changes to cope with CKB-VM's runtime environment. No changes are required to LLVM toolchain and the LLVM libcxx. However, as you will see later, optional tweaks to the LLVM libcxx can greatly reduce the final generated binary size. We do want to stress that those changes will remain optional as an enhancement. It's always possible to take the upstream libcxx and compile it to CKB-VM as it is.

Compiling C++ Code From Bitcoin Codebase

Understanding Build System in C/C++

Typical C/C++ codebases have build systems like Makefiles or CMake. The proper way to build such a codebase is to tweak the arguments passed to the build system for a different target. We've seen such workflows before, for example when fetching and building the secp256k1 library, or when configuring and compiling with Clang and the LLVM toolchain for RISC-V.

However, it might not always be this case. The Bitcoin codebase, by default, only compiles everything together into a single binary. We only need the Bitcoin Script VM related code from Bitcoin. In addition, not all of Bitcoin's code can be compiled to CKB-VM's RISC-V configuration now as threading and networking are two major blockers.

Customizing Bitcoin Codebase for CKB-VM

Here we took a different approach:

  1. Start with the file containing Bitcoin Script VM implementation.

  2. Compile this file to RISC-V format, linked with other code that reads a Bitcoin transaction and feeds it into the Bitcoin Script VM. (Of course, this interpreter C++ source file references functions from other C++ source files, so the linking step will emit errors saying certain functions cannot be found.)

  3. Look for the additional functions from the Bitcoin codebase, add additional C++ source files containing those functions to our compilation workflow.

  4. Repeat the whole process until we have enough functions and C++ source files for the Bitcoin Script VM interpreter to use.

Now we have our CKB-VM runnable binary. The description above might sound a bit dry. To make it concrete, we have prepared a branch for you to try the whole process out:

$ git clone --recursive https://github.com/xxuejie/ckb-bitcoin-vm
$ cd ckb-bitcoin-vm
$ git checkout 7ceb35103c35dfccb32232da92621eb190fc0e67

In this commit, we are only linking main.o, compiled from main.cpp, with interpreter.o, compiled from interpreter.cpp. The latter provides the implementation of VerifyScript function for executing and verifying Bitcoin Script.

If we compile the code as it is, errors will be thrown:

$ make
(omitted...)

ld.lld-18: error: undefined symbol: CSHA256::CSHA256()
>>> referenced by interpreter.cpp:28 (deps/bitcoin/src/script/interpreter.cpp:28)
>>>               build/interpreter.o:(EvalScript(std::__1::vector<std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, std::__1::allocator<std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>>>&, CScript const&, unsigned int, BaseSignatureChecker const&, SigVersion, ScriptExecutionData&, ScriptError_t*))
>>> referenced by interpreter.cpp:28 (deps/bitcoin/src/script/interpreter.cpp:28)
>>>               build/interpreter.o:(EvalScript(std::__1::vector<std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, std::__1::allocator<std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>>>&, CScript const&, unsigned int, BaseSignatureChecker const&, SigVersion, ScriptExecutionData&, ScriptError_t*))
>>> referenced by interpreter.cpp:28 (deps/bitcoin/src/script/interpreter.cpp:28)
>>>               build/interpreter.o:(EvalScript(std::__1::vector<std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, std::__1::allocator<std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>>>&, CScript const&, unsigned int, BaseSignatureChecker const&, SigVersion, ScriptExecutionData&, ScriptError_t*))
>>> referenced 11 more times

ld.lld-18: error: undefined symbol: CSHA256::Write(unsigned char const*, unsigned long)
>>> referenced by interpreter.cpp:28 (deps/bitcoin/src/script/interpreter.cpp:28)
>>>               build/interpreter.o:(EvalScript(std::__1::vector<std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, std::__1::allocator<std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>>>&, CScript const&, unsigned int, BaseSignatureChecker const&, SigVersion, ScriptExecutionData&, ScriptError_t*))
>>> referenced by interpreter.cpp:28 (deps/bitcoin/src/script/interpreter.cpp:28)
>>>               build/interpreter.o:(EvalScript(std::__1::vector<std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, std::__1::allocator<std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>>>&, CScript const&, unsigned int, BaseSignatureChecker const&, SigVersion, ScriptExecutionData&, ScriptError_t*))
>>> referenced by interpreter.cpp:29 (deps/bitcoin/src/script/interpreter.cpp:29)
>>>               build/interpreter.o:(EvalScript(std::__1::vector<std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, std::__1::allocator<std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>>>&, CScript const&, unsigned int, BaseSignatureChecker const&, SigVersion, ScriptExecutionData&, ScriptError_t*))
>>> referenced 62 more times

(omitted...)
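
The error-driven loop described earlier can be partly mechanized. The following is a minimal sketch, not part of the ckb-bitcoin-vm repository: a hypothetical helper, UndefinedSymbols, that pulls the distinct symbol names out of captured linker output. It assumes every error line contains the marker "error: undefined symbol: " exactly as in the ld.lld output above; other linkers or versions may format this differently.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the repository): collects the distinct names
// that follow "error: undefined symbol: " in linker output, in order of
// first appearance.
std::vector<std::string> UndefinedSymbols(const std::string& linker_output) {
    static const std::string kMarker = "error: undefined symbol: ";
    std::vector<std::string> symbols;
    std::set<std::string> seen;

    std::istringstream lines(linker_output);
    std::string line;
    while (std::getline(lines, line)) {
        const std::size_t pos = line.find(kMarker);
        if (pos == std::string::npos) continue;  // not an undefined-symbol line
        const std::string name = line.substr(pos + kMarker.size());
        // Report each symbol once, even if the linker mentions it repeatedly.
        if (seen.insert(name).second) symbols.push_back(name);
    }
    return symbols;
}
```

Fed the two errors shown above, this would return CSHA256::CSHA256() and CSHA256::Write(unsigned char const*, unsigned long). Each name can then be searched for in the Bitcoin source tree to find the file that defines it.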

Retrieving the Missing SHA256 Files

One obvious finding is that the implementation of SHA256 is missing. We can search the Bitcoin codebase for the CSHA256 class. It turns out that deps/bitcoin/src/crypto/sha256.cpp contains the very CSHA256 implementation we need. Now we can add a compilation step for sha256.cpp to the Makefile:

# Note this line is also modified
BITCOIN_LIBS := interpreter.o sha256.o

# (omitted...)

build/interpreter.o: deps/bitcoin/src/script/interpreter.cpp $(MUSL_TARGET) $(LIBCXX_TARGET)
        $(CLANGXX) -c $< -o $@ $(CXXFLAGS)

# This is newly added
build/sha256.o: deps/bitcoin/src/crypto/sha256.cpp $(MUSL_TARGET) $(LIBCXX_TARGET)
        $(CLANGXX) -c $< -o $@ $(CXXFLAGS) -I build

See this commit for details. Note that to compile sha256.cpp file, a Bitcoin config header file is required. Luckily, an empty header file suffices for the problem. For now we keep the Bitcoin config header file in build folder. This explains why we need a new -I argument when compiling sha256.cpp file.

Handling Additional Libraries like JSONLite

When compiling the code again, the linker no longer complains about the missing CSHA256-related functions. However, there are other functions missing:

$ make
(omitted...)

ld.lld-18: error: undefined symbol: jsonlite_static_buffer_init
>>> referenced by main.cpp:196
>>>               build/main.o:(main)

ld.lld-18: error: undefined symbol: jsonlite_parser_init
>>> referenced by main.cpp:196
>>>               build/main.o:(main)

ld.lld-18: error: undefined symbol: jsonlite_default_callbacks
>>> referenced by main.cpp:35
>>>               build/main.o:(main)
>>> referenced by main.cpp:35
>>>               build/main.o:(main)
>>> referenced by main.cpp:35
>>>               build/main.o:(main)

(omitted...)

It's not just Bitcoin-related code that needs to be linked into our binary. Other parts are also required, for example, the JSONLite library used by our main.cpp to parse inputs in JSON format.

Let's add the JSONLite implementation to the binary:

# Note this line is also modified
BITCOIN_LIBS := interpreter.o sha256.o jsonlite.o

# (omitted...)

build/sha256.o: deps/bitcoin/src/crypto/sha256.cpp $(MUSL_TARGET) $(LIBCXX_TARGET)
        $(CLANGXX) -c $< -o $@ $(CXXFLAGS) -I build

# This is newly added
build/jsonlite.o: deps/jsonlite/amalgamated/jsonlite/jsonlite.c $(MUSL_TARGET) $(LIBCXX_TARGET)
        $(CLANG) -c $< -o $@ $(CFLAGS) -I deps/jsonlite/amalgamated/jsonlite

See here for the exact commit. The only thing worth mentioning is that JSONLite is written in C, hence we compile it with Clang as a C source file rather than C++.

We can repeat the same process above to add more missing functions. Here are some more fixes in the same way.

Adding Secp256k1 Library

There is something worth mentioning when we add pubkey.cpp file:

  • This cpp file includes a line for secp256k1.h header file, so we need to adjust the compilation arguments accordingly.

  • For the same reason, pubkey.cpp references functions provided by the secp256k1 library, as shown below:

$ git checkout 4edbbe95c0c7155b811aaee46088b9d9196c414d
$ make
(omitted..)

ld.lld-18: error: undefined symbol: secp256k1_ec_pubkey_parse
>>> referenced by pubkey.h:62 (deps/bitcoin/src/pubkey.h:62)
>>>               build/pubkey.o:(CPubKey::Verify(uint256 const&, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>> const&) const)

ld.lld-18: error: undefined symbol: secp256k1_ecdsa_signature_normalize
>>> referenced by pubkey.cpp:284 (deps/bitcoin/src/pubkey.cpp:284)
>>>               build/pubkey.o:(CPubKey::Verify(uint256 const&, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>> const&) const)
>>> referenced by pubkey.cpp:419 (deps/bitcoin/src/pubkey.cpp:419)
>>>               build/pubkey.o:(CPubKey::CheckLowS(std::__1::vector<unsigned char, std::__1::allocator<unsigned char>> const&))

ld.lld-18: error: undefined symbol: secp256k1_ecdsa_verify
>>> referenced by pubkey.cpp:0 (deps/bitcoin/src/pubkey.cpp:0)
>>>               build/pubkey.o:(CPubKey::Verify(uint256 const&, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>> const&) const)

(omitted..)

The typical workflow in C involves configuring and compiling the secp256k1 library, then linking it to our binary, as we used to do before. However, secp256k1 is a well-organized library, in which one major source file includes all the implementations spread across multiple header files. We can simply compile this single secp256k1.c source file and add it to our compilation pipeline.

# Note this line is also modified
BITCOIN_LIBS := interpreter.o sha256.o jsonlite.o script.o hash.o pubkey.o \
        secp256k1.o

# (omitted...)

build/pubkey.o: deps/bitcoin/src/pubkey.cpp $(MUSL_TARGET) $(LIBCXX_TARGET)
        $(CLANGXX) -c $< -o $@ $(CXXFLAGS) -I deps/bitcoin/src/secp256k1/include

# This is newly added
build/secp256k1.o: deps/bitcoin/src/secp256k1/src/secp256k1.c $(MUSL_TARGET) $(LIBCXX_TARGET)
        $(CLANG) -c $< \
                -o $@ \
                $(CFLAGS) \
                -DENABLE_MODULE_EXTRAKEYS \
                -DENABLE_MODULE_SCHNORRSIG \
                -DECMULT_WINDOW_SIZE=6 \
                -I deps/bitcoin/src/secp256k1/include

See here for the exact commit. Note that secp256k1 requires certain tweaks so as to work properly in our environment.

Compiling & Linking Secp256k1 Precomputed Tables

When running the compilation command, you should notice there are some more secp256k1 related symbols reported missing:

$ git checkout 31b6b00d59892099d3dcb5500a551d6198dd6971
$ make
(omitted..)

ld.lld-18: error: undefined symbol: secp256k1_pre_g_128
>>> referenced by scalar_impl.h:166 (deps/bitcoin/src/secp256k1/src/scalar_impl.h:166)
>>>               build/secp256k1.o:(secp256k1_ecmult)
>>> referenced by scalar_impl.h:166 (deps/bitcoin/src/secp256k1/src/scalar_impl.h:166)
>>>               build/secp256k1.o:(secp256k1_ecmult)

ld.lld-18: error: undefined symbol: secp256k1_pre_g
>>> referenced by scalar_impl.h:166 (deps/bitcoin/src/secp256k1/src/scalar_impl.h:166)
>>>               build/secp256k1.o:(secp256k1_ecmult)
>>> referenced by scalar_impl.h:166 (deps/bitcoin/src/secp256k1/src/scalar_impl.h:166)
>>>               build/secp256k1.o:(secp256k1_ecmult)

(omitted..)

These symbols are the precomputed tables used by secp256k1 library, exactly the parts missing from the above secp256k1.c source file. We need to manually add them to our compilation pipeline:

# Note this line is also modified
BITCOIN_LIBS := interpreter.o sha256.o jsonlite.o script.o hash.o pubkey.o \
        secp256k1.o precomputed_ecmult.o

# (omitted..)

build/secp256k1.o: deps/bitcoin/src/secp256k1/src/secp256k1.c $(MUSL_TARGET) $(LIBCXX_TARGET)
        $(CLANG) -c $< \
                -o $@ \
                $(CFLAGS) \
                -DENABLE_MODULE_EXTRAKEYS \
                -DENABLE_MODULE_SCHNORRSIG \
                -DECMULT_WINDOW_SIZE=6 \
                -I deps/bitcoin/src/secp256k1/include

# This is newly added
build/precomputed_ecmult.o: deps/bitcoin/src/secp256k1/src/precomputed_ecmult.c $(MUSL_TARGET) $(LIBCXX_TARGET)
        $(CLANG) -c $< \
                -o $@ \
                $(CFLAGS) \
                -DENABLE_MODULE_EXTRAKEYS \
                -DENABLE_MODULE_SCHNORRSIG \
                -DECMULT_WINDOW_SIZE=6 \
                -I deps/bitcoin/src/secp256k1/include

See here for the exact commit. Now all requirements related to secp256k1 have been properly added to our binary. We continue with the same steps as above:

  • Take a missing symbol

  • Locate the cpp file containing the symbol in Bitcoin's codebase

  • Add the very cpp file to the compilation pipeline

  • Repeat

Here are the final missing pieces. Now we've added enough of Bitcoin's code to make a proper binary for Bitcoin Script verification:

$ git checkout 72be89ea92e215214f3e922e03a9f159edc0d42f
$ make
$ ckb-debugger --bin build/bitcoin_vm_stripped bitcoin_vm "$(curl -s https://mempool.space/api/tx/382b61d20ad4fce5764aae6f4d4e7fa10abbb3f9ed8692fb262b70a3ed494d5c)"
Run result: 0
Total cycles consumed: 1297195(1.2M)
Transfer cycles: 79134(77.3K), running cycles: 1218061(1.2M)

The Makefile at TIP follows the same compilation process, but is reorganized to help us locate the original cpp source files semi-automatically.

Coincidences That Made It Work

Several coincidences actually came together to make the above workflow work:

  • Unique Source File Name: All the required source files have different names, so we put all the compiled object files in the same build folder. Should more than one source file share the same name, we’d have to either rename the files, or maintain a hierarchy of directories in build folder.

  • Compatibility with Clang: Each required source file can be compiled by Clang to RISC-V, using the musl libc/libcxx of one’s choice.

Example: Using Bitcoin's EncodeHexTx

Compatibility with Clang is really crucial. Let's see a real example here: Bitcoin's codebase has an EncodeHexTx function that can be super helpful to generate the serialized format of a Bitcoin transaction. We wanted to use this function during development to compare the recovered transaction from JSON, with the expected transaction, making sure our implementation is correct.

Effort Required and Binary Size Constraints in CKB-VM

But the actual path here is hardly smooth: while we only require a single EncodeHexTx function from core_write.cpp file, we have to make sure all symbols referenced directly or indirectly by core_write.cpp are included in the linking phase, even if a majority of the functions are never executed. If you try this compilation approach, you will realize it pulls in a huge number of files, as seen in this commit. In addition to pulling in many more files than originally required, even more adjustments are required to ensure successful compilation:

  • Threading Support: Threading support must be enabled in libcxx, though we never really use, nor can use any threads in CKB-VM.

  • Atomic Builtins: Atomic builtins must be provided (so if you are trying to build the code from this commit, make sure to do another git submodule update --init)

  • Config Definitions: bitcoin-config.h file must be filled with proper definitions

  • Symbol Definition: A G_TRANSLATION_FUN symbol must be defined for Bitcoin i18n usage (though we never really use this symbol)

  • Name Collision: We have to include chainparams.cpp and kernel/chainparams.cpp, which share the same name. To avoid name collision, we rename the object file for the latter one as kernel_chainparams.o
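
One way to avoid such collisions systematically, rather than renaming files one by one, is to derive the object file name from the source path. The sketch below is a hypothetical helper, FlatObjectName, and not something the repository's Makefile does: it assumes paths are given relative to the Bitcoin source directory and flattens directory separators into underscores.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only: turns a source path (relative to
// the source root) into a flat object file name by replacing '/' with '_'.
std::string FlatObjectName(const std::string& relative_source) {
    std::string name = relative_source;
    const std::size_t slash = name.rfind('/');
    const std::size_t dot = name.rfind('.');
    // Drop the extension, but only if the dot belongs to the file name
    // rather than to a directory component.
    if (dot != std::string::npos &&
        (slash == std::string::npos || dot > slash)) {
        name.erase(dot);
    }
    for (char& c : name) {
        if (c == '/') c = '_';
    }
    return name + ".o";
}
```

With this scheme chainparams.cpp maps to chainparams.o and kernel/chainparams.cpp maps to kernel_chainparams.o, matching the manual rename above. The trade-off is that every file outside the root gets a longer name than the flat build folder currently uses, and contrived paths such as a_b/c.cpp and a/b_c.cpp would still collide.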

Despite our efforts to compile and link with minimal binary size, the resulting binary is 941K in size, far exceeding CKB-VM’s limit. As we shall see in the next section, purging unused C++ functions requires a manual process for now. It would also take a huge effort to reduce this binary down to a reasonable size.

More Challenges Porting C++ Codebases

What we are trying to say here is: the Bitcoin codebase is well organized, much better organized than many other C/C++ codebases out there. Even so, we hit source files that are hard to include in the final binary. In the above example, we only hit the problem of binary size. Beyond this, some C/C++ source files might use features not yet supported on CKB-VM, such as networking. It’s pure luck that we managed to nail down a small set of files that can be compiled and linked into a CKB-VM program for Bitcoin Script verification. Many C/C++ libraries might require manual adjustments before they can be ported to CKB-VM.

Stripping Down Binary Size

In addition to the existing techniques for reducing the binary size of C programs, C++ code has its own quirks that require attention.

We have also prepared a branch so you can try the code at different stages:

$ git clone --recursive https://github.com/xxuejie/ckb-bitcoin-vm
$ cd ckb-bitcoin-vm
$ git checkout c0f0ab77eae78753d4bf9b5d7c992b772aecfbd3
$ make
$ ls -lh build/bitcoin_vm_stripped
-rwxrwxr-x 1 ubuntu ubuntu 549K Sep  6 01:18 build/bitcoin_vm_stripped

Initial Techniques for Size Reduction

At this stage, we have only applied the old techniques used for compiling C code: build the code at the -Os optimization level, use -ffunction-sections -fdata-sections when compiling each source file, then use --gc-sections when linking, so the linker can help us purge all unused functions. However, as shown above, the final binary size is 549K, which is too big for CKB. We will need more techniques to reduce the binary size.
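
To make the mechanism concrete, here is a minimal sketch with made-up function names (SumBytes, XorBytes and Demo are not from the Bitcoin codebase, and demo.cpp is a hypothetical file name). The point is that an unused function with external linkage cannot be dropped by the compiler alone; it is the combination of per-function sections and linker garbage collection that removes it.

```cpp
// Compile: clang++ -Os -ffunction-sections -fdata-sections -c demo.cpp
// Link:    pass --gc-sections to the linker (e.g. -Wl,--gc-sections)
#include <cstddef>
#include <cstdint>

// Referenced by Demo below, so its section is reachable and must stay.
std::uint32_t SumBytes(const std::uint8_t* data, std::size_t len) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < len; ++i) sum += data[i];
    return sum;
}

// Never referenced by the program. It has external linkage, so the compiler
// must still emit it into the object file. With -ffunction-sections it lands
// in its own .text.<mangled name> section, which the linker is free to
// discard under --gc-sections when nothing points to it.
std::uint32_t XorBytes(const std::uint8_t* data, std::size_t len) {
    std::uint32_t acc = 0;
    for (std::size_t i = 0; i < len; ++i) acc ^= data[i];
    return acc;
}

// Stand-in for the program's entry path: only SumBytes is reachable from it.
std::uint32_t Demo() {
    const std::uint8_t bytes[] = {1, 2, 3, 4};
    return SumBytes(bytes, sizeof(bytes));
}
```

Without -ffunction-sections, both functions would typically share a single .text section in the object file, and the linker would have to keep or drop them together.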

Fine-Tuning libcxx Configuration

The libcxx comes with different build options, and the first low-hanging fruit we can reach, is tweaking the configuration a bit to reduce code size:

LLVM_CMAKE_OPTIONS := -DCMAKE_BUILD_TYPE=MinSizeRel
LLVM_CMAKE_OPTIONS += -DLIBCXX_ENABLE_UNICODE=OFF -DLIBCXX_ENABLE_RANDOM_DEVICE=OFF
LLVM_CMAKE_OPTIONS += -DLIBCXXABI_SILENT_TERMINATE=ON

$(LIBCXX_TARGET): $(MUSL_TARGET)
        cd $(LIBCXX) && \
                CLANG=$(CLANG) \
                        BASE_CFLAGS="$(BASE_CFLAGS)" \
                        MUSL=$(MUSL)/release \
                        LLVM_VERSION="18.1.8" \
                        LLVM_CMAKE_OPTIONS="$(LLVM_CMAKE_OPTIONS)" \
                  ./build.sh
        touch $@

See here for the exact commit. These changes optimize the build configuration of libcxx and libcxxabi as follows:

  • Instruct CMake to build libcxx using MinSizeRel configuration, which is the release configuration with minimal binary size

  • Disable unnecessary features in libcxx, including Unicode and random device support

  • Instruct libcxxabi to use a silent, minimal C++ terminate function (we don't really want debug info in the release configuration; simple termination with a non-zero exit code is more than enough for us.)

After those changes, the final binary is reduced to 473K and can now fit in a CKB block to be deployed. But the binary size is still far from optimal, so we will need to look elsewhere.

Tackling Unwanted Support for Localization and Wide Character

After some analysis into the binary layout (unfortunately, this is all manual process for now, I don't have anything reusable to share now, sorry), it appears that libcxx contains a huge part of the code for localization (including charset, date, time, money format, etc.) and wide character support. Libcxx does provide options so we can turn them off, but the reality is not great:

  • Localization is largely tied to C++'s IOstream module. When localization is disabled, IOstream will be missing. But Bitcoin requires the IOstream module for serialization support and other needs. There is some work underway to resolve this, but as of the latest version we tested (LLVM 19.1.0-rc2), there is still no way to disable localization completely while maintaining full access to the IOstream module.

  • When musl libc is used, there is code that includes wide character header files, regardless of whether the wide character feature is enabled.

Historically, most C++ code would use dynamic linking, so the binary size of libcxx was not a very big concern. Even though static linking is indeed supported, it’s fair to say that proper modularity is still a work in progress in the whole C++ world.

To cope with those issues, we have to patch libcxx:

  • Instead of disabling localization directly, we’ve patched libcxx to insert a call to a non-returning exit function at the start of the functions used to initialize localization objects. With those hints, Clang is smart enough to figure out that most of the localization-related functions will not be reached, and will then purge them. By merely inserting an exit call with a non-zero error code, we are essentially saying "all localization-related code paths will result in script execution failure", which won't harm security.

  • We’ve also patched libcxx so musl-related header file would not try to touch wide character header files.
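
The effect of the exit-call patch can be illustrated outside libcxx. The following is a simplified sketch, not the actual patch: BuildMonthNames and InitLocaleTables are hypothetical stand-ins for libcxx's locale code, and ParseDecimalDigit stands in for the code the script really uses.

```cpp
#include <cstdlib>
#include <string>

// Stand-in for a large helper that only the localization code needs.
std::string BuildMonthNames() {
    std::string names;
    const char* months[] = {"January", "February", "March"};
    for (const char* month : months) {
        names += month;
        names += ';';
    }
    return names;
}

// Stand-in for a patched locale initializer. std::exit is declared
// [[noreturn]], so the compiler treats everything after the call as
// unreachable and, with optimization enabled, drops it, including the
// reference to BuildMonthNames.
std::string InitLocaleTables() {
    std::exit(1);  // any use of localization terminates with a non-zero code
    return BuildMonthNames();  // original body, now dead code
}

// Code that never touches localization behaves exactly as before.
int ParseDecimalDigit(char c) {
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}
```

Once InitLocaleTables no longer references BuildMonthNames, nothing in the program points at the helper anymore, so the section-based garbage collection described earlier can remove it at link time. InitLocaleTables itself shrinks to little more than the exit call.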

This commit is slightly larger, so we won't bother listing the changed parts; the full commit can be found here. If you want to find out exactly which lines changed in libcxx, use the following commands:

$ git checkout f26293fcacd75863db1cfcc1afaf7e55e67e1325
$ curl -s https://raw.githubusercontent.com/llvm/llvm-project/llvmorg-18.1.8/libcxx/include/__support/musl/xlocale.h | diff -u - llvm_patch/libcxx/include/__support/musl/xlocale.h
$ curl -s https://raw.githubusercontent.com/llvm/llvm-project/llvmorg-18.1.8/libcxx/src/locale.cpp | diff -u - llvm_patch/libcxx/src/locale.cpp

With the patch to xlocale.h file, we can now add -DLIBCXX_ENABLE_WIDE_CHARACTERS=OFF to completely turn off wide character support in libcxx, but localization still has certain quirks. Though we cannot disable it completely, we can try to comment out code as we do above.

Those patches are more of a temporary solution, and are also tied to specific LLVM versions. When a project is upgraded to a newer version of LLVM, it is very likely that those patches will need to be adjusted accordingly. Also it's possible that a certain version of LLVM will take care of those problems (or maybe we can help out to take care of the issues), so patches would be no longer needed.

With those changes, the final binary is reduced to 323K. This is a good result, but let's see if we can go below 323K.

Tackling Unpurged Functions Due to C++ Exceptions

Let's pay attention to a specific function: the FormatParagraph function in strencodings.cpp file in Bitcoin's codebase. Even though strencodings.cpp is pulled into Bitcoin VM because of other functions in the file, FormatParagraph is never called from our binary. One can change this function so it returns with a non-zero error code immediately, and the resulting binary would still function properly. Moreover, if you remove this function from strencodings.cpp altogether, the compilation process will still succeed, and the resulting binary size is actually reduced, which means the unused function was being kept in the binary.

Let's do some digging here:

$ git checkout f26293fcacd75863db1cfcc1afaf7e55e67e1325
$ make
$ llvm-readelf-18 --section-headers build/strencodings.o | grep FormatParagraph
  [65] .text._Z15FormatParagraphNSt3__117basic_string_viewIcNS_11char_traitsIcEEEEmm PROGBITS 0000000000000000 001322 00027e 00  AX  0   0  2
  [66] .rela.text._Z15FormatParagraphNSt3__117basic_string_viewIcNS_11char_traitsIcEEEEmm RELA 0000000000000000 081380 0009d8 18   I 215  65  8
  [67] .gcc_except_table._Z15FormatParagraphNSt3__117basic_string_viewIcNS_11char_traitsIcEEEEmm PROGBITS 0000000000000000 0015a0 00007c 00   A  0   0  4
  [68] .rela.gcc_except_table._Z15FormatParagraphNSt3__117basic_string_viewIcNS_11char_traitsIcEEEEmm RELA 0000000000000000 081d58 000480 18   I 215  67  8
$ llvm-readelf-18 --symbols build/bitcoin_vm | grep FormatParagraph
 59657: 0000000000036510   554 FUNC    LOCAL  HIDDEN      4 _Z15FormatParagraphNSt3__117basic_string_viewIcNS_11char_traitsIcEEEEmm

We can see that this FormatParagraph function has indeed been compiled with -ffunction-sections -fdata-sections, so a separate text section is generated for this very function. However, the linker still chooses to keep this function in the final binary, even though --gc-sections is passed as a linking argument.

This means that not all C++ functions can be purged properly using --gc-sections.

Let's try a small experiment here:

$ git checkout f26293fcacd75863db1cfcc1afaf7e55e67e1325
$ make
$ llvm-objcopy-18 --remove-section=.rela.gcc_except_table._Z15FormatParagraphNSt3__117basic_string_viewIcNS_11char_traitsIcEEEEmm build/strencodings.o build/strencodings.o
$ rm build/bitcoin_vm*
$ make
$ llvm-readelf-18 --symbols build/bitcoin_vm | grep FormatParagraph
(no output)
$ ls -lh build/bitcoin_vm_stripped
-rwxrwxr-x 1 ubuntu ubuntu 320K Sep  6 02:32 build/bitcoin_vm_stripped

If we manually remove the relocation section of C++ except table for FormatParagraph, and run linker using the processed object file, the linking process would succeed in removing FormatParagraph from the final binary, resulting in a smaller binary size!

We would have to dive into the actual implementation of the linker to know the exact mechanism here, but I can offer my own guess:

  • Some C++ functions might contain code that throws exceptions. To support proper stack unwinding in case of exceptions, Clang needs to generate an except table for each such function.

  • The linker does not work out (or this feature has not yet been developed) the full list of functions that can be the recipient of a stack unwinding process, so it can only gather all available except tables from all functions.

  • Each such except table also poses a dependency on the actual function, so even though a function is not actually used, it is indirectly included in the final binary.
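
To see which kind of function gets an except table in the first place, consider the sketch below. The two functions are made up for illustration (they are not from Bitcoin), and the statement about generated sections is an expectation based on the FormatParagraph output above, not something verified for every toolchain version.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// `out` has a non-trivial destructor and the loop body can throw, so the
// compiler has to emit cleanup code for unwinding, plus an except table
// entry that describes it. Compiled with -ffunction-sections on the
// toolchain used in this article, this function would be expected to get a
// .gcc_except_table.<mangled name> section next to its .text section.
std::string JoinWithLimit(const std::vector<std::string>& parts,
                          std::size_t max_len) {
    std::string out;
    for (const std::string& part : parts) {
        out += part;  // may throw std::bad_alloc or std::length_error
        if (out.size() > max_len) {
            throw std::length_error("joined string too long");
        }
    }
    return out;
}

// No calls, no throwing operations and no destructors to run: there is
// nothing to clean up during unwinding, so no except table is needed here.
std::size_t CountSpaces(const char* text, std::size_t len) noexcept {
    std::size_t count = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (text[i] == ' ') ++count;
    }
    return count;
}
```

Running llvm-readelf --section-headers on the resulting object file and searching for each function name, as done for FormatParagraph above, is a quick way to check which functions carry an except table and are therefore candidates for the problem described here.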

It really is a trial-and-error process, but the above experiment tells us that removing the relocation section of the except table alone is enough to help the linker purge the function. This leads to a workflow we can use here:

Step 1: Locate a C++ function that is not purged and will never be called by any code (This is really important here, since removing the except table of a function that might be called, may have unexpected results.)

Step 2: Purge the relocation section of the except table from this function in the build process

We have put together a workflow to streamline this process as much as possible. See this commit for the complete setup. Right now, we use this manual approach to purge two functions that are relatively big and unused:

  • FormatParagraph from strencodings.cpp in Bitcoin codebase

  • SHA256AutoDetect from sha256.cpp in Bitcoin codebase. What's important here, is by purging SHA256AutoDetect function, TransformD64 function from the same file will also be purged from the binary. This TransformD64 function alone saves us ~20K of the final binary size.

Purging these two functions using the above workflow reduces the size of the final binary to 299K. I do agree the workflow here is less than optimal, but it works and solves our problem for now. Maybe there is a better and more automatic process out there that we don't know of yet; we will continue searching for it.

Optimizing Beyond C++

If you compare the code from our experiment branch with the current TIP, you will realize there are some more optimizations:

  • We added a compile-time macro that turns all printf calls into dummy functions in musl. Even though you won't use printf directly in your code, you might still find printf-related code in the final binary. Two common reasons are:

    • Assertions in C introduce fprintf calls

    • C++ might print a message in case of uncaught exceptions

    By adding this macro to dummify printf functions, we save ~5K of the final binary size.

  • We also added NO_DEBUG_INFO to strip certain debug code from our main.cpp file.
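
The idea behind the printf macro can be sketched as follows. This is not the actual musl patch: DemoPrintf and the macro name DEMO_DUMMY_PRINTF are hypothetical, chosen only to show how one compile-time switch can replace a formatting function with a stub.

```cpp
#include <cstdarg>
#include <cstdio>

// DEMO_DUMMY_PRINTF is a hypothetical macro name; it would normally be
// supplied on the compiler command line (-DDEMO_DUMMY_PRINTF).
int DemoPrintf(const char* format, ...) {
#ifdef DEMO_DUMMY_PRINTF
    // Dummy variant: ignore all arguments and report zero characters
    // written. Nothing here references the formatting engine, so the linker
    // no longer has a reason to keep it in the binary.
    (void)format;
    return 0;
#else
    // Regular variant: forward to the C library's formatted output.
    va_list args;
    va_start(args, format);
    const int written = std::vprintf(format, args);
    va_end(args);
    return written;
#endif
}
```

Callers compile unchanged in both configurations; only the amount of code pulled in behind the call differs. The price is that diagnostic messages, such as those from failed assertions, are silently lost in the dummy configuration.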

The final result here is a binary of 288K in size. There might still be room for improvement (it's possible that there are more unused C++ functions in the binary due to exception handling), but we will call it a day here. It is good enough for us :)

Conclusion

Despite all those quirks and problems, I personally find it quite inspiring that a lot of C/C++ code can now be compiled and executed on CKB-VM. The potential here is really promising.

✍🏻 Written by Xuejie Xiao


Find more of his work on his personal site: Less is More.

By the way, Cryptape is seeking system engineer wizards! Join us to revolutionize blockchain technology: join@cryptape.com
