C++: RAII

What is RAII?

RAII stands for “Resource Acquisition Is Initialization”. It is probably one of the nicest and most useful features that C++ has given to the world. D and Rust incorporated it as part of their specifications too.

What does it mean?

RAII, “obviously”, means that any entity that asks for a resource from the system (memory, file handle, network connection, etc.) should be responsible for releasing such a resource when its life has ended.

In C++ jargon, it means that any resource needed by an object must be acquired by the object’s constructor and released in its destructor.

Thanks to this very interesting feature, when a variable that represents an object created with value semantics goes out of scope, its destructor is invoked automatically and seamlessly, thus releasing any resource the object could have acquired in its lifetime.

That solves a lot of resource-related issues in a very transparent way when the variables go out of scope:

  • Any dynamically allocated memory owned by this object can be released.
  • Any open file handle can be closed.
  • Any network connection can be closed.
  • Any database connection can be closed.
  • Any registry handle can be released.
  • Any mutex can be unlocked.
  • … and so on.

The nicest thing is that, while several languages provide garbage collectors that only handle memory, RAII is a cleaner alternative that handles NOT ONLY memory but any kind of resource.

Let’s see how we can use it.

First, let’s see how constructors and destructors are invoked:

#include <iostream>

class A final
{
    int n;
public:
    explicit A(int n) : n{n} { std::cout << "Hello " << n << std::endl; }
    ~A() { std::cout << "Bye " << n << std::endl; }
};

void test()
{
    A a{1};
    A b{2};
    A c{3};
}

int main()
{
    std::cout << "Begin" << std::endl;
    test();
    std::cout << "End" << std::endl;
}

When running this, the output will be:

Begin
Hello 1
Hello 2
Hello 3
Bye 3
Bye 2
Bye 1
End

We can notice two things here:

  1. The destructors are invoked automatically before exiting the function test. Why there? Because a, b, and c were created in that code block.
  2. The destructors are invoked in the reverse order of construction.

So, since the destructors are invoked automatically, we can use that interesting feature (RAII) to free any resource acquired by our code. For example, modifying the class A to store that int value on the heap instead (bad idea, by the way):

class A final
{
    int* pn;
public:
    explicit A(int n) 
    : pn{new int{n}} 
    {
        std::cout << "Hello " << *pn << std::endl; 
    }

    ~A()
    { 
        std::cout << "Bye " << *pn << std::endl; 
        delete pn;
    }
};

Notice that I am acquiring the resource (allocating memory) in the constructor and releasing it in the destructor.

In this way, the user of my class A does not need to worry about the resources it uses.

“Out of scope” also means that if my function ends abruptly or returns prematurely, the compiler will guarantee that the destructor of the objects will still be invoked before transferring the control to the caller.

Let’s test that by adding an exception:

#include <iostream>

class A final
{
    int* pn;
public:

    explicit A(int n) 
    : pn{new int{n}} 
    {
        std::cout << "Hello " << *pn << std::endl; 
    }

    ~A()
    { 
        std::cout << "Bye " << *pn << std::endl; 
        delete pn;
    }
};

void test(int nonzero)
{
    A a{1};
    A b{2};

    if (nonzero == 0)
        throw "Arg cannot be zero";

    A c{3};
}

int main()
{
    std::cout << "Begin" << std::endl;
    try
    {
        test(0);
    }
    catch (const char* e)
    {
        std::cout << e << std::endl;
    }
    std::cout << "End" << std::endl;
}

Notice that I am throwing an exception after objects a and b were created. When the exception occurs, the function test ends abruptly, but the destructors of a and b are invoked before control reaches the catch block, so “Bye 2” and “Bye 1” are printed before “Arg cannot be zero”.

The destructor of object c was not invoked because the object was not created when the exception occurred.

The same behavior occurs if you return prematurely from a function.
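A minimal sketch of the early-return case follows. Tracer and the function names are hypothetical helpers that are not part of the examples above; they record construction and destruction in a string instead of printing, so the order is easy to check.

```cpp
#include <string>

// Hypothetical helper: logs "+id" on construction and "-id" on destruction.
class Tracer final
{
    std::string& out;
    char id;
public:
    Tracer(std::string& log, char name) : out(log), id(name) { out += '+'; out += id; }
    ~Tracer() { out += '-'; out += id; }

    Tracer(const Tracer&) = delete;
    Tracer& operator=(const Tracer&) = delete;
};

// Returns early when bail_out is true; b is then never constructed.
void early_return(std::string& log, bool bail_out)
{
    Tracer a{log, 'a'};
    if (bail_out)
        return;             // a's destructor still runs here
    Tracer b{log, 'b'};
}

// Runs the function above and returns the recorded sequence of events.
std::string run_early_return(bool bail_out)
{
    std::string log;
    early_return(log, bail_out);
    return log;
}
```

With bail_out set to true, only a is created and destroyed; with false, both objects are created and then destroyed in reverse order.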

Now, look at the class B that I have added to my example:

#include <iostream>

class A final
{
    int* pn;
public:
    explicit A(int n) 
    : pn{new int{n}} 
    {
        std::cout << "Hello " << *pn << std::endl; 
    }

    ~A()
    { 
        std::cout << "Bye " << *pn << std::endl; 
        delete pn;
    }
};

class B final
{
    A a;
    A b;
    
public:
    B(int valueA, int valueB) : a{valueA}, b{valueB} { }
};

void test()
{
    B a { 4, 5};
    B b { 6, 7};
}

int main()
{
    std::cout << "Begin" << std::endl;
    test();
    std::cout << "End" << std::endl;
}

The output is:

Begin
Hello 4
Hello 5
Hello 6
Hello 7
Bye 7
Bye 6
Bye 5
Bye 4
End

Why are the destructors of A being called when B objects go out of scope if I did not write a destructor for B?

Because when you do not write a destructor, the compiler generates one automatically that invokes the destructors of all member variables with value semantics.

So, if your basic classes handle resources explicitly, the likelihood of you needing to acquire or release resources explicitly in your constructors or destructors is actually low.

What about pointers?

RAII does not work with raw pointers, so if you declare something like:

int* array = new int[1024];

in a function, nothing will happen when the variable array goes out of scope: the pointer itself disappears, but the memory it points to is never released, so it leaks.

Is there any way to have pointers handled by RAII?

YES! Through smart pointers!
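As a small sketch of that idea, the raw array above can be held by a std::unique_ptr<int[]>. The function name sum_of_squares is only an illustration; the point is that no delete[] appears anywhere.

```cpp
#include <cstddef>
#include <memory>

// Allocates an array on the heap, fills it and sums it.
int sum_of_squares(std::size_t count)
{
    // std::make_unique<int[]>(count) value-initializes the elements to 0.
    std::unique_ptr<int[]> array = std::make_unique<int[]>(count);

    for (std::size_t i = 0; i < count; ++i)
        array[i] = static_cast<int>(i * i);

    int total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += array[i];

    return total;   // array's destructor calls delete[] when the function exits
}
```

The same release happens if the function exits through an exception or an early return.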

Other non-memory related uses?

  • std::ifstream and std::ofstream automatically close the file they opened for reading or writing.
  • std::lock_guard<T> locks a mutex in its constructor and unlocks it in its destructor, avoiding mutexes left locked by mistake.
  • If you are writing some UI, you could have a MouseRestorer that automatically sets the mouse cursor back to its default after a time-consuming piece of code changed it to an hourglass.
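The std::lock_guard case can be sketched like this. Counter and increment_after_failure are hypothetical names made up for this example; the exception is only there to show that the mutex is released on that path too.

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical shared counter protected by a mutex.
struct Counter
{
    std::mutex mutex;
    int value = 0;

    // Increments under the lock; throws when asked to simulate a failure.
    void increment(bool fail)
    {
        std::lock_guard<std::mutex> guard{mutex};   // locks here
        if (fail)
            throw std::runtime_error{"increment failed"};
        ++value;
    }   // guard's destructor unlocks, on normal exit and on exception
};

// A failed call must not leave the mutex locked for the next one.
int increment_after_failure()
{
    Counter counter;
    try
    {
        counter.increment(true);
    }
    catch (const std::runtime_error&)
    {
        // the lock was already released during stack unwinding
    }
    counter.increment(false);   // could not acquire the lock if it were still held
    return counter.value;
}
```

Without the guard, the throw would skip a manual unlock() call and the second increment could never acquire the mutex.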

C++17: std::any

When trying to implement something that will store a value of an unknown data type (to be as generic as possible, for example), we had these possibilities before C++17:

  • Having a void* pointer to something that will be assigned at runtime. The problem with this approach is that it leaves all responsibility for managing the lifetime of the data pointed to by this void pointer to the programmer. Very error prone.
  • Having a union with a limited set of data types available. We can still use this approach with the C++17 std::variant.
  • Having a base class (e.g. Object) and storing pointers to instances of classes derived from it (à la Java).
  • Having an instance of template typename T (for example). Nice approach, but to make it useful and generic, we need to propagate the typename T throughout the generic code that will use ours. Probably verbose.

So, let’s welcome std::any.

std::any, as you may have guessed, is a class shipped with C++17 and declared in the header <any> that can store a value of any copy-constructible type, so these lines are completely valid:

std::any a = 123;
std::any b = "Hello";
std::any c = std::vector<int>{10, 20, 30};

Obviously, this is C++, and you as the user need to know the data type of what you stored in an instance of std::any. To retrieve the stored value, you have to use std::any_cast<T>, as in this code:

#include <any>
#include <iostream>

int main()
{
    std::any number = 150;
    std::cout << std::any_cast<int>(number) << "\n";
}

If you try to cast the value stored in an instance of std::any to anything but the actual type, a std::bad_any_cast exception is thrown. For example, if you try to cast that number to a string, you will get this runtime error:

terminate called after throwing an instance of 'std::bad_any_cast'
  what():  bad any_cast
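If terminating is not what you want, the exception can be caught like any other. A minimal sketch, where as_string_or is a hypothetical helper name:

```cpp
#include <any>
#include <string>

// Returns the stored string, or a fallback if the std::any holds another type.
std::string as_string_or(const std::any& value, const std::string& fallback)
{
    try
    {
        return std::any_cast<std::string>(value);
    }
    catch (const std::bad_any_cast&)
    {
        return fallback;    // wrong type (or empty std::any)
    }
}
```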

If the value stored in an instance of std::any is an instance of a class or struct, its destructor will be invoked when the instance of std::any goes out of scope.

Another really nice thing about std::any is that you can replace the existing value stored in an instance of it, with another value of any other type, for example:

std::any content = 125;
std::cout << std::any_cast<int>(content) << "\n";

content = std::string{"Hello world"};
std::cout << std::any_cast<std::string>(content) << "\n";
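Since the stored type can change at runtime, it may help to ask the std::any what it holds before casting. The sketch below uses the has_value() and type() member functions; describe is a hypothetical helper name.

```cpp
#include <any>
#include <string>
#include <typeinfo>

// Describes what a std::any currently holds.
std::string describe(const std::any& content)
{
    if (!content.has_value())
        return "empty";
    if (content.type() == typeid(int))
        return "int: " + std::to_string(std::any_cast<int>(content));
    if (content.type() == typeid(std::string))
        return "string: " + std::any_cast<std::string>(content);
    return "something else";
}
```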

About lifetimes

Let’s consider this class:

struct A
{
  int n;
  A(int n) : n{n} { std::cout << "Constructor\n"; }
  ~A() { std::cout << "Destructor\n"; }
  A(A&& a) : n{a.n} { std::cout << "Move constructor\n"; }
  A(const A& a) : n{a.n} { std::cout << "Copy constructor\n"; }
  void print() const { std::cout << n << "\n"; }
};

This class stores an int, and prints it out with “print”. I wrote constructor, copy constructor, move constructor and destructor with logs telling me when the object will be created, copied, moved or destroyed.

So, let’s create a std::any instance with an instance of this class:

std::any some = A{4516};

This will be the output of such code:

Constructor
Move constructor
Destructor
Destructor

Why are two constructors and two destructors invoked if I only created one instance?

Because the instance of std::any stores a copy (ok, in this case a “moved version”) of the original object I created, and while that may be trivial in my example, it may not be for a complex object.

How to avoid this problem?

Using std::make_any.

std::make_any is very similar to std::make_shared in the way it will take care of creating the object instead of copying/moving ours. The parameters passed to std::make_any are the ones you would pass to the object’s constructor.

So, I can modify my code to this:

auto some = std::make_any<A>(4517);

And the output will be:

Constructor
Destructor
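Because std::make_any forwards its arguments to the constructor, it also works with constructors that take several parameters. A small sketch using std::string (make_filled is a hypothetical name):

```cpp
#include <any>
#include <cstddef>
#include <string>

// Builds a std::string made of `count` copies of `fill` directly inside the std::any.
std::any make_filled(std::size_t count, char fill)
{
    return std::make_any<std::string>(count, fill);
}
```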

Now, I want to invoke the method “print”:

auto some = std::make_any<A>(4517);
std::any_cast<A>(some).print();

And when I do that, the output is:

Constructor
Copy constructor
4517
Destructor
Destructor

Why was that extra copy created?

Because std::any_cast<A> returns a copy of the stored object. If I want to avoid the copy and use a reference, I need to ask explicitly for a reference in std::any_cast, something like:

auto some = std::make_any<A>(4517);
std::any_cast<A&>(some).print();

And the output will be:

Constructor
4517
Destructor

It is also possible to use std::any_cast<T> passing a pointer to an instance of std::any instead of a reference.

In that case, if the cast is possible, it returns a valid pointer to the stored T object; otherwise, it returns nullptr. For example:

auto some = std::make_any<A>(4517);
std::any_cast<A>(&some)->print();
std::cout << std::any_cast<int>(&some) << "\n";

In this case, notice that I am passing a pointer to “some” instead of a reference. When this occurs, the implementation returns a pointer to the target type if the stored object is of the same data type (as in the second line) or a null pointer if not (as in the third line, where I am trying to cast my object from type A to int). Using this overloaded version with pointers avoids throwing an exception and allows you to check whether the returned pointer is null.
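The null check mentioned above could look like this minimal sketch (int_or is a hypothetical helper name):

```cpp
#include <any>

// Returns the stored int, or fallback when the std::any holds something else.
int int_or(const std::any& value, int fallback)
{
    // Passing a pointer selects the non-throwing overload of std::any_cast.
    if (const int* p = std::any_cast<int>(&value))
        return *p;
    return fallback;
}
```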

std::any is a very good tool for storing things that we, as implementers of something reusable, do not know a priori; it could be used to store, for example, additional parameters passed to threads, objects of any type stored as extra information in UI widgets (similar to the Tag property in Windows.Forms.Control in .NET, for example), etc.
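To make the widget case a bit more concrete, here is a rough sketch. Widget, CustomerId and customer_id_of are all hypothetical names invented for this example; they do not belong to any real UI library.

```cpp
#include <any>
#include <string>

// Hypothetical UI widget that carries arbitrary user data, in the spirit of Tag.
struct Widget
{
    std::string caption;
    std::any tag;   // whatever the user of the widget wants to attach
};

// Hypothetical payload a caller might attach to a widget.
struct CustomerId
{
    int value;
};

// Reads the customer id back, or returns -1 if the tag holds something else.
int customer_id_of(const Widget& w)
{
    if (const auto* id = std::any_cast<CustomerId>(&w.tag))
        return id->value;
    return -1;
}
```

The widget code never needs to know about CustomerId; only the code that attached the tag and reads it back does.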

Performance-wise, std::any may need to store its content on the heap and also needs to do some extra verification to return the values only if the cast is valid, so it is not as fast as having a generic object known at compile time. (Where the content is actually stored depends on the library implementation; some of them, such as gcc’s standard library, store elements whose sizeof is small locally. Thanks, TheFlameFire.)

C++20: Concepts, an introduction

I am pretty new to C++ Concepts, so I will post here the things I learn as I start using them.

C++ Concepts are one of these three large features that are shipped with C++20:

  • Concepts
  • Ranges
  • Modules

Basically, C++ Concepts define a set of conditions or constraints that a data type must fulfill in order to be used as a template argument.

For example, I would want to create a function that sums two values and prints the result. In C++17 and older I would code something like this:

template <typename A, typename B>
void sum_and_print(const A& a, const B& b)
{
    std::cout << (a + b) << "\n";
}

And it works properly for types A and B that DO have the operator+ available. If the types I am using do not have operator+, the compiler naïvely will try to substitute types A and B for the actual types and when trying to use the missing operator on them, it will fail miserably.

The way the compiler works is correct, but failing while doing the actual substitution with no earlier verification is kind of a reactive behavior instead of a proactive one. And in this way, the error messages because of substitution error occurrences are pretty large, hard to read and understand.

C++20 Concepts provide a mechanism to make explicit the requirements that, in my example, types A and B need to fulfill in order to be allowed to use the “sum_and_print” function template. So, when available, the compiler checks that those requirements are fulfilled BEFORE starting the actual substitution.

So, let’s start with the obvious one: I will code a concept that mandates that all types that will honor it will have operator+ implemented. It is defined in this way:

template <typename T, typename U = T>
concept Sumable =
 requires(T a, U b)
 {
    { a + b };
    { b + a };
 };

The new keyword concept is used to define a C++ Concept. It is defined as a template because the concept will be evaluated against the type or types that are used as template arguments here (in my case, T and U).

I named my concept “Sumable” and after the “=” sign, the compiler expects a predicate that is evaluated at compile time. For example, if I wanted to create a concept to restrict the types to be only “int” or “double”, I could define it as:

template <typename T>
concept SumableOnlyForIntsAndDoubles = std::is_same<T, int>::value || std::is_same<T, double>::value;

The type trait “std::is_same<T, U>” can be used here to create the constraint.

Back to my first example, I need operator+ to be implemented for types A and B, so I need to specify a set of requirements for that constraint. The new keyword “requires” is used for that purpose.

So, every requirement between the braces of the requires expression (a requires expression always has a braced body, even when it holds a single requirement) is something the types being evaluated must fulfill. In my case, “a+b” and “b+a” must be valid operations. If types T or U do not implement operator+, the requirements are not fulfilled and the compiler stops before even trying to substitute A and B for actual types.

So, with such implementation, my function “sum_and_print” works like a charm for ints, doubles, floats and strings!

But, what if I have another type like this one:

struct N
{
    int value;

    N operator+(const N& n) const
    {
        return { value + n.value };
    }
};

Though it implements operator+, it does not implement operator<< needed to work with std::cout.

To add such constraint, I need to add an extra requirement to my concept. So, it could be like this one:

template <typename T, typename U = T>
concept Sumable =
 requires(T a, U b)
 {
    { a + b };
    { b + a };
 }
 && requires(std::ostream& os, const T& a)
 {
     { os << a };
 };

The operator && is used here to specify that those requirements need to be fulfilled: Having operator+ AND being able to do “os << a“.

If my types do not fulfill such requirements, I get an error like this in gcc:

<source>:16:5:   in requirements with 'std::ostream& os', 'const T& a' [with T = N]
<source>:18:11: note: the required expression '(os << a)' is invalid
   18 |      { os << a };
      |        ~~~^~~~

That, though it looks complicated, is far easier to read than the messages that the compiler produces when type substitution errors occur.

So, if I want to have my code working properly, I need to add an operator<< overloaded for my type N, having finally something like this:

#include <iostream>

template <typename T, typename U = T>
concept Sumable =
 requires(T a, U b)
 {
    { a + b };
    { b + a };
 }
 && requires(std::ostream& os, const T& a)
 {
     { os << a };
 };

template <Sumable A, Sumable B>
void sum_and_print(const A& a, const B& b)
{
    std::cout << (a + b) << "\n";
}

struct N
{
    int value;

    N operator+(const N& n) const
    {
        return { value + n.value };
    }
};

std::ostream& operator<<(std::ostream& os, const N& n)
{
    os << n.value;
    return os;
}

int main()
{
    sum_and_print( N{6}, N{7});
}

Notice that in my “sum_and_print” function template I am writing “template <Sumable A, Sumable B>” instead of the former “template <typename A, typename B>”. This is the way I ask the compiler to validate such type arguments against the “Sumable” concept.


What if I wanted to have several “greeters” implemented in several languages and a function “greet” that uses my greeter to say “hi”? Something like this:

template <Greeter G>
void greet(G greeter)
{
    greeter.say_hi();
}

As you can see, I want my greeters to have a method “say_hi“. Thus, the concept could be defined like this one in order to mandate the type G to have the method say_hi() implemented:

template <typename G>
concept Greeter = requires(G g)
{
    { g.say_hi() } -> std::convertible_to<void>;
};

With such concept in place, my implementation would be like this one:

#include <concepts>
#include <iostream>

template <typename G>
concept Greeter = requires(G g)
{
    { g.say_hi() } -> std::convertible_to<void>;
};

struct spanish_greeter
{
    void say_hi() { std::cout << "Hola amigos\n"; }
};

struct english_greeter
{
    void say_hi() { std::cout << "Hello my friends\n"; }
};


template <Greeter G>
void greet(G greeter)
{
    greeter.say_hi();
}


int main()
{
    greet(spanish_greeter{});
    greet(english_greeter{});
}

Why would I want to use concepts instead of, say, base classes? Because:

  1. With concepts, you do not need base classes, inheritance, virtual and pure virtual methods and all that OO stuff only to fulfill a contract on probably unrelated types. You simply need to fulfill the requirements the concept defines and that’s it. (The Interface Segregation principle of SOLID works nicely here anyway: your concepts should define the minimum set of constraints your types need.)
  2. Concepts are a “zero-cost abstraction” because their validation is performed completely at compile time. Once the types are verified and accepted, the compiler does not generate any code related to this verification, contrary to the runtime overhead of virtual calls in an object-oriented approach. This means smaller binaries, a smaller memory footprint and better performance!

I tested this stuff using gcc 10.2 and it works like a charm.

C++: Smart pointers, part 1

This is the first of several posts I wrote related to smart pointers:

  1. Smart pointers
  2. unique_ptr
  3. More on unique_ptr
  4. shared_ptr
  5. weak_ptr

Memory management in C is error-prone because keeping track of every block of memory allocated and deallocated can be confusing and stressful.

Although C++ has the same manual memory management as C, it provides additional features that make memory management easier:

  • When an object is instantiated on the stack (e.g., Object o;), the C++ runtime ensures that the object’s destructor is invoked when the object goes out of scope (when the end of the enclosing block is reached, a premature ‘return’ is encountered, or an exception is thrown), thereby releasing all memory and resources allocated for that object. This very nice feature is called RAII.
  • (Ab)using the feature of operator overloading, we can create classes that simulate pointer behaviour. These classes are called: Smart pointers.
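To give a first idea of how both features combine, here is a deliberately tiny owning pointer. simple_ptr, Point and demo_simple_ptr are hypothetical names for this sketch; it is not how the standard smart pointers are implemented.

```cpp
#include <utility>

// Minimal owning pointer, for illustration only.
template <typename T>
class simple_ptr final
{
    T* ptr;
public:
    explicit simple_ptr(T* p = nullptr) : ptr{p} {}
    ~simple_ptr() { delete ptr; }   // RAII: release in the destructor

    // An owner must not be copied, or the object would be deleted twice.
    simple_ptr(const simple_ptr&) = delete;
    simple_ptr& operator=(const simple_ptr&) = delete;

    // Moving transfers ownership and leaves the source empty.
    simple_ptr(simple_ptr&& other) noexcept : ptr{std::exchange(other.ptr, nullptr)} {}

    // Overloaded operators make the object behave like a pointer.
    T& operator*() const { return *ptr; }
    T* operator->() const { return ptr; }
    explicit operator bool() const { return ptr != nullptr; }
};

struct Point
{
    int x;
    int y;
};

int demo_simple_ptr()
{
    simple_ptr<Point> p{new Point{3, 4}};
    (*p).x = 10;            // operator*
    return p->x + p->y;     // operator->; the Point is deleted when p goes out of scope
}
```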