Python developer learns C++

Samuel Drew
15 min read

Brain-space is a limited resource. I hadn’t touched C/C++ since uni (excluding some tinkering with Arduino C) until a couple of months ago, when I was inspired by Mitchell Hashimoto, co-founder and former CEO of HashiCorp. In an interview, Hashimoto said he read the entire documentation of the Zig programming language to write his Ghostty terminal emulator. The whole thing. Top to bottom. What a concept! I’m very used to skimming the docs or reading bits that are relevant to a particular challenge. But I always assumed I didn’t have the brain-space to realistically store every page of the documentation.

And the reality is, I probably don’t. I can’t just go read the whole of cppreference.com[1]. I also definitely won’t go and read every C++ Standards Committee paper. Because that would be insane. But what I did do, and what you also can and should do, is read learncpp.com.

Learncpp.com is a well written, concise rendering of the foundational knowledge required to write C++. But many of the concepts can be transferred to programming languages in general. I can confidently say I’m a far better software engineer than I was before I started reading it.

I’m not going to summarise learncpp.com in this article. Instead, I’ll share a list of things that, as an experienced DevOps engineer who’s mostly only ever used Python, were either a brand new concept to me or were a correction to a misunderstanding I had based on some false assumptions I’d made long ago. My hope for this article? I want the inquisitive reader to be the informed reader. To see that there’s so many things readers may not have known about C++ that the authors cover in expertly succinct detail with examples, quizzes and zero condescending tones.

A disclaimer: learncpp is still a working site and the writers are updating it constantly, not only when new C++ standards are released but also in response to how readers react to each article. Feedback is listened to and new articles are written all the time. So this article may become outdated very quickly, but with that in mind, let’s treat this like a snapshot in time.

Compile time optimisations and the constexpr keyword

The compiler will try (when configured as such) to optimise as much as it can. There are a lot of optimisation techniques the compiler uses. One such technique is called constant propagation. It’s when the compiler replaces occurrences of a variable that is known to be constant with its value, saving the user time that would otherwise be spent fetching the variable from memory.

Another one of the techniques compilers use is called compile-time evaluation. That is, the compiler will look for any expressions that are evaluatable at compile-time (such as, variables that do not change after their initialisation, const variable declarations and statements that use only constant variables). It will then evaluate those expressions at compile time, replacing anywhere in the code that they appear with their evaluated expression.

Programmers can use the constexpr keyword in a few different ways to tell the compiler that the expression is evaluatable at compile-time (though the compiler has the final say on whether or not it does actually perform the evaluation at compile-time.)

Programmers can force the compiler to evaluate an expression at compile-time. The initialiser of a constexpr object must be evaluated at compile time. This means the initialiser must itself be a constant expression. If it isn’t, the compiler will error.

int main() {
    const int expr { 7 }; // expr **may** be evaluated at compile time
}

int main() {
    constexpr int expr { 7 }; // Because expr and x are constexpr, their initialisers
    constexpr int x { expr }; // must be evaluated at compile-time
}

Of course coming from an interpreted language, I’ve never considered how optimisations work or what degree of control the user might have over them. The authors of learncpp.com go into more details and cover a few more optimisation techniques used in compilers.

The static keyword does COMPLETELY different things depending on where it is used

Static means a few different things in C++. Static storage duration describes objects that exist throughout the life of the program when it is run. Automatic storage duration, by contrast, describes local variables that are destroyed as soon as execution gets past the braced section (block statement) of code inside which the variable was declared. Scope is a separate idea: it is the region of the code in which a name can be used.

int x { 80 };    // x is a global variable. global variables have static storage duration

int main () {
    int y { 100 };    // y has automatic storage duration
    {
        int z { 120 };    // z also has automatic storage duration
    }    // z goes out of scope and is destroyed here
}    // y goes out of scope and is destroyed here
// x exists until the program exits

The static keyword can be used to declare a variable with static storage duration within a block of code. Doing so will create a static local variable. This variable will exist in memory throughout the life of the program, just like a global variable. The difference is that the static local variable is only accessible from within the block statement. This is useful in functions that are called multiple times where we want an object used in the function to persist between function calls (e.g. a unique id number that is incremented every time a function is called).

The static keyword can also be used in the global namespace (not within any block statements). This gives variables and functions internal linkage (non-const global variables and functions default to external linkage). An internally linked global is only accessible within the translation unit (roughly, the file) in which it is declared. Externally linked globals, on the other hand, can be accessed from any translation unit.

There’s so much more under this topic including namespaces, using directives, and inline functions. Chapter 7 is a brilliant read for anyone who finds C++ semantics daunting at first glance.

Templating classes, functions, Aliases - Oh my!

Seeing an unfamiliar symbol in a language can be a bit spooky. You might spend some time ignoring it, hoping it’ll go away. After all, surely those angled brackets are more afraid of you than you are of them. Right? Let’s hope so. Angled brackets are used in C++ in templating, both declaring and invoking. Let’s see my beautiful examples.

In Python, everything is sort of pass by reference and dynamically typed. That is, any name can be rebound to an object of any type at any time with not so much as a judgmental side-eye from the interpreter. It will happily create a single list of [int, int, char, string, double, whateveryoulike] and allow users to change any of those list items into anything else. I knew going into C++ that this wasn’t the case as I had messed around with statically typed languages like Java and Arduino C before. So if a user were to try to define a function (or class) in C++, the compiler would need to know exactly what types it can expect as function parameters (or class members).

#include <iostream>
#include <string>
void myprintfunc(std::string x) {
    std::cout << "my parameter: " << x << "\n";
}

int main () {
    myprintfunc("spaghetti"); // outputs: "my parameter: spaghetti"
    myprintfunc(8.5); // compile time error: no matching function, a double cannot be converted to std::string
}

In the above example, we have a function that can take a string. But what if we wanted to pass it a floating-point number? One way to solve this would be to write overloaded functions for each parameter type that users are expected to need. This would quickly become a long and tiresome task. Codebases might have hundreds of functions like this. Each would need overloaded versions for each parameter type. It would also rely on the programmer to know exactly what types are expected and to not mistakenly forget any types.

The neat C++ alternative is to use templating (the spooky, scary angled brackets <>). By using a templated function here, the compiler can be instructed to generate its own overloaded functions as required by the code that calls it.

#include <iostream>
#include <string>

template <typename T>
void myprintfunc(T x) {
    std::cout << "my parameter: " << x << "\n";
}

int main () {
    myprintfunc<int>(8);         // outputs: "my parameter: 8"
    myprintfunc<double>(8.5);    // outputs: "my parameter: 8.5"
    myprintfunc<std::string>(std::string("hello world"));    // outputs: "my parameter: hello world"
}

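Behind the scenes, the compiler generates (instantiates) a separate version of the function for each distinct type it is called with.
The angled brackets can often be omitted from the call, because the compiler can deduce T from the type of the argument (template argument deduction). Since C++17, class templates can often deduce their template arguments from a constructor call as well.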

Templating can also be used to create class templates in much the same way. It allows the user to create objects of that class with any type substituted for T. Templating is used a lot in the standard library (e.g. in std::array and std::vector). I use templating in all my classes in my Cpp data structures and algorithms library.

Overloading operators (and functions)

Operators in C++ are implemented as functions. The arguments to the operator are the operands that appear on either side of it. Operators can be unary, binary or ternary where they have 1, 2 or 3 arguments (eg. -19 uses the unary - operator to flip the sign of the value, 2+2 uses the binary + operator to add the arguments on either side of it.)[2]

Just like functions, operators can be overloaded to extend their functionality. Let’s say, for example, we create a new class. If we try to send an instance of that class to the output stream via the << operator, we will get a compilation error because << doesn’t know what to do with our new class. We can overload it to “teach” it how to print to console.

#include <iostream>

class Myint {
public:
    int m_int {30};
};

std::ostream& operator<< (std::ostream& out, const Myint& myint)
{
    out << myint.m_int;
    return out;    // we have to return the stream here so we can chain << operators together.
}

int main() {
    Myint newint {};
    std::cout << newint;    // prints newint.m_int (30) to console
}

When overloading a binary operator, the 1st parameter is the left operand (what appears on the left side of the operator in normal use) and the 2nd parameter is the right operand. We can overload operators outside any class body (as above) or we can overload them using member functions (also known as methods). Because a member function’s 1st argument is always the implicit object (this in C++ and Java, self in Python), we omit the left operand from the parameter list, just as we would in any other member function definition.

I’ve never stopped to consider how operators are implemented in a language before. Since learning about operator overloading in C++, I will always be thinking about this whenever I learn a new language.

The = delete specifier

For a language known for being strict and explicit, there’s actually a lot of ways that C++ allows implicit conversions, promotions, and type inference.

For example, a char can be implicitly promoted to an int. If we have a function that accepts an int parameter, but it is called with an argument of type char, the char will be implicitly converted into an int so that the function may be used.

If we need to prevent this behavior, we can tell the compiler not to use specific overloads of the function.

// code example from learncpp.com (https://www.learncpp.com/cpp-tutorial/deleting-functions/)
#include <iostream>

void printInt(int x)
{
    std::cout << x << '\n';
}

void printInt(char) = delete; // calls to this function will halt compilation
void printInt(bool) = delete; // calls to this function will halt compilation

int main()
{
    printInt(97);   // okay
    printInt('a');  // compile error: function deleted
    printInt(true); // compile error: function deleted

    return 0;
}

This can be combined with a deleted function template to only allow a specific type to be used.

I like the fine-grained control this gives me as a programmer working in C++. Python type hints are nice, but things like this and constexpr make me feel empowered to write fast code while taking preventative measures against user error.

Copy constructors and deep copy vs shallow copy

Classes have a special kind of constructor called a copy constructor. Even if we don’t define it, the class will have an implicit (default) copy constructor. The copy constructor is invoked whenever an instance of the class is copied. This can be done explicitly when we initialise a new object with another object of the same class. The copy constructor is also called automatically[3] when we pass an object by value and when a function returns an object by value (as opposed to by reference or address).

The thing about the copy constructor is that it defaults to shallow copy. That is, unless we specify our own copy constructor, our class will be copied member-wise and if we have any dynamically allocated members (members that are actually pointers to an object in memory on the heap), the pointer will be copied, not the data that it points to. This saves time but leaves two objects holding pointers to the same data. If one object’s destructor is called and it destroys the data it’s pointing to, the other object will be pointing to deallocated memory and using it will lead to undefined behavior.

Deep vs Shallow copies was a concept I was familiar with in theory but had never had to implement in practice. Since reading this section, I implemented deep copy constructors in my data structures library.

L-values, R-values, and temporaries

The way I remember the difference in my head is that L-values are anything that would sit comfortably on the left side of an assignment operator. But the authors of learncpp.com have a better definition.

An lvalue (pronounced “ell-value”, short for “left value” or “locator value”, and sometimes written as “l-value”) is an expression that evaluates to an identifiable object or function (or bit-field).

The importance of l-value expressions may not be apparent until we know what r-values do. An r-value is the opposite of an l-value. An r-value evaluates to a value, not an identifiable object. This becomes tricky because some functions (and operators) expect certain arguments to be l-values and others to be r-values. An l-value will be implicitly converted into its value to get an r-value. However, an r-value cannot be converted into an l-value. The assignment operator = expects an l-value on the left and an r-value on the right. If an l-value is placed on the right, it will be converted into an r-value. The distinction is important because functions can treat r-values differently from l-values. Ref qualifiers can be used in function definitions to change how functions handle references to l-values and references to r-values.

The given example for this is when using an accessor that provides a reference to a member when the implicit object is an r-value. This is dangerous because when the statement in which the r-value is used is finished, the r-value is destroyed. If there are any remaining pointers or references to the now destroyed r-value object, accessing them will result in undefined behaviour.

#include <iostream>
#include <string>
#include <string_view>

class Thing
{
private:
    std::string m_name{};
public:
    Thing(std::string_view name)
        : m_name {name} 
        {
        }

    const std::string& getName() const {return m_name;} // accessor returns a reference
};

// createThing() returns a Thing by value (which means the returned value is an rvalue)
Thing createThing(std::string_view name)
{
    Thing newThing {name};
    return newThing;
}

int main()
{
    // reference becomes dangling when return value of createThing() is destroyed
    // ("spaghetti" is just a placeholder name for illustration)
    const std::string& thingName {createThing("spaghetti").getName()};
    std::cout << thingName << '\n'; // undefined behavior
}

Temporary objects (also anonymous objects) are objects that are created in an expression just for the duration of the expression, after which they will be destroyed. When a function returns by value, that means the resulting value will be an r-value. Because the r-value in the above example is a temporary object, when the statement finishes, it is destroyed and thingName becomes dangling.

I know that section was a mouthful. I’m not great at explaining it. Check out chapter 12 for a better explanation and read the rest too because it’s really good.

Array decay (C-style Arrays are not just pointers)

This isn’t actually a C++ thing so much as a fun-fact about C that is brought up. I wrote it down because that’s something I (and many others) learned in uni. C-style arrays are just pointers, right? WRONG! They only look that way because doing just about anything to them will turn them into pointers (array decay). Seriously:

In most cases, when a C-style array is used in an expression, the array will be implicitly converted into a pointer to the element type, initialized with the address of the first element (with index 0). Colloquially, this is called array decay (or just decay for short).

Thus, many people have the incorrect belief that arrays and pointers are one and the same in C. And from that, the idea that arrays don’t know how long they are in C. They do. As long as we have a non-decayed array, we can find its length, for example by dividing the sizeof the array by the sizeof one element (sizeof is an operator that reports a size in bytes, not a function). In practice this doesn’t help because if we have an undecayed array, we likely defined its length.

Because of this, it is generally agreed that C-style arrays are totally whack, thus it’s recommended to use the standard library classes std::array and std::vector.

I thought it was a neat piece of information to take back into any C projects I do in the future.

Further reading…

There’s so much more on this website including smart pointers, copy elision, standard library algorithms, and more. These are just the things I took note of as particularly interesting or weird. I hope the inspired reader goes and reads the whole thing. I’d recommend it to any software engineer. C/C++ is the foundation that many other languages build on top of. I’ve been working in software for a few years now and never known that this delightful fountain of knowledge was here all along. Have I “learned” C++? No. I plan to build all my projects in C++ for a little while to get a handle on the language. After that I’ll try my hand at other systems-level languages like Odin or Zig (I think I’ll appreciate them more after I’ve worked in C++ for a while).

Brain-space is a funny thing. There always seems to be more of it hiding around every corner. I managed to fill a lot of knowledge gaps these past 2 months and when the time comes to learn another language, I will be so much better at it for having learned how C/C++ works. Thanks, me from 2 months ago.


[1] Also Hashimoto says he wouldn’t do this in the interview. He explicitly says “I would not do this with C++” so take from that what you will.

[2] operator precedence is another interesting topic I hadn’t thought about before reading. And it’s not just BOMDAS order of operations. Have you ever considered the = assignment operator? It’s a binary operator too and can be overloaded.

[3] if the copy is not elided in compilation.


Written by

Samuel Drew

I am a developer from Brisbane. Now living in Wellington. I've gone from a degree in software engineering to a career in traditional engineering and then back again to software engineering as a DevOps and Cloud infrastructure engineer. I love learning how things are made and I try to simplify things down to understand them better. I'm using hashnode as a blogging platform to practice my technical writing.