C++: Strategy Pattern vs. Policy Based Design

Strategy Pattern

In object-oriented programming languages like Java or C#, there is the very popular Strategy Pattern.

The Strategy Pattern consists of injecting “strategies” into a class that will consume them. These “strategies” are classes that provide a set of methods with a specific behavior that the consuming class needs. The purpose of this pattern is to offer versatility by allowing these behaviors to be modified, extended, or replaced without changing the class that consumes them. To achieve this, the consumer is composed of interfaces that define the behavior to be consumed, and at some point (typically during instantiation), the infrastructure injects concrete implementations of these interfaces into the consumer.

This is the example that will be used in this post: A store needs a product provider and a stock provider to gather all the information it requires.

Using the Strategy Pattern, the class diagram for the example is shown below:

In C++ the implementation of this pattern is similar to the one described here: The “interfaces” (implemented as Abstract Base Classes in C++) IProductProvider and IStockProvider and their data classes and structs will be declared as follows:

struct Product
{
    std::string id;
    std::string name;
    std::string brand;
};

class IProductProvider
{
public:
    virtual ~IProductProvider() = default;
    
    virtual std::shared_ptr<Product> get_product_by_id(const std::string& id) const = 0;
    virtual std::vector<std::string> get_all_product_ids() const = 0;
};

struct Stock final
{
    std::string product_id;
    int count;
};

class IStockProvider
{
public:
    virtual ~IStockProvider() = default;
    
    virtual int get_stock_count_by_product_id(const std::string& id) const = 0;
};

Now, the following code shows the service class that uses both providers, implementing the Strategy Pattern:

struct ProductAndStock final
{
    std::shared_ptr<Product> product;
    int stock_count;
};

class Store
{
    std::unique_ptr<IProductProvider> _product_provider;
    std::unique_ptr<IStockProvider> _stock_provider;
    
public:
    Store(
        std::unique_ptr<IProductProvider> product_provider, 
        std::unique_ptr<IStockProvider> stock_provider)
    : _product_provider{std::move(product_provider)}
    , _stock_provider{std::move(stock_provider)}
    {
    }
    
    std::vector<ProductAndStock> get_all_products_in_stock() const
    {
        std::vector<ProductAndStock> result;
        
        for (const auto& id : _product_provider->get_all_product_ids())
        {
            // We will not load products that are not in stock
            const auto count = _stock_provider->get_stock_count_by_product_id(id);
            if (count == 0)
                continue;
                
            result.push_back({_product_provider->get_product_by_id(id), count});
        }
        
        return result;
    }
};

Assuming the product and stock information is available in a database, programmers can implement both interfaces and instantiate the Store like this:

Store store{
        std::make_unique<DBProductProvider>(), 
        std::make_unique<DBStockProvider>()};

In the declaration above, it is assumed that DBProductProvider and DBStockProvider are derived from IProductProvider and IStockProvider, respectively, and that they implement all the pure virtual methods declared in their base classes. These methods should contain code that accesses the database, retrieves the information, and maps it to the data types specified in the interfaces.

Pros:

  1. The Store class is completely agnostic about how the data is retrieved, which means low coupling.
  2. Because of that, the providers can easily be replaced with other implementations, and there is no need to modify anything in the Store.
  3. For the same reason, unit tests are easy to write, because creating MockProductProvider and MockStockProvider classes is trivial.
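
The mock mentioned in point 3 could look roughly like the sketch below. It repeats Product and IProductProvider so that it compiles on its own; the add() helper and the in-memory map are assumptions made for illustration, not part of the original design.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Product
{
    std::string id;
    std::string name;
    std::string brand;
};

class IProductProvider
{
public:
    virtual ~IProductProvider() = default;

    virtual std::shared_ptr<Product> get_product_by_id(const std::string& id) const = 0;
    virtual std::vector<std::string> get_all_product_ids() const = 0;
};

// Hypothetical test double: serves products from an in-memory map
// instead of a database.
class MockProductProvider final : public IProductProvider
{
    std::map<std::string, std::shared_ptr<Product>> _products;

public:
    // Helper used only by tests to register a product.
    void add(Product product)
    {
        const std::string id = product.id;
        _products[id] = std::make_shared<Product>(std::move(product));
    }

    std::shared_ptr<Product> get_product_by_id(const std::string& id) const override
    {
        const auto it = _products.find(id);
        if (it == _products.end())
            return nullptr; // unknown id
        return it->second;
    }

    std::vector<std::string> get_all_product_ids() const override
    {
        std::vector<std::string> ids;
        for (const auto& entry : _products)
            ids.push_back(entry.first);
        return ids;
    }
};
```

A test would fill the mock through add() and hand it to the Store as a std::unique_ptr<IProductProvider>, exactly like the database-backed provider.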

Cons:

This is not a severe drawback, but because every provider method is virtual, each call is resolved at runtime (typically through a vtable lookup), which adds a small overhead and usually prevents inlining.

Alternative to Strategy Pattern: Policy-Based Design

Policy-Based Design is an idiom in C++ that solves the same problem as the Strategy Pattern, but in a C++ way, i.e., with templates ;)

Policy-Based Design turns the consumer class into a class template, where the “Policies” (what the Strategy Pattern calls “strategies”) are passed as template parameters.

The example above would look like this with Policy-Based Design:

struct Product
{
    std::string id;
    std::string name;
    std::string brand;
};

struct Stock final
{
    std::string product_id;
    int count;
};

struct ProductAndStock final
{
    std::shared_ptr<Product> product;
    int stock_count;
};

template <
    typename ProductProviderPolicy,
    typename StockProviderPolicy>
class Store
{
    ProductProviderPolicy _product_provider;
    StockProviderPolicy _stock_provider;
    
public:
    Store() = default;
    
    std::vector<ProductAndStock> get_all_products_in_stock() const
    {
        std::vector<ProductAndStock> result;
        
        for (const auto& id : _product_provider.get_all_product_ids())
        {
            // We will not load products that are not in stock
            const auto count = _stock_provider.get_stock_count_by_product_id(id);
            if (count == 0)
                continue;
                
            result.push_back({_product_provider.get_product_by_id(id), count});
        }
        
        return result;
    }
};

To use it:

Store<DBProductProvider, DBStockProvider> store;
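
What a policy class looks like may not be obvious from the line above, so here is a minimal sketch. The InMemory* providers and the reduced MiniStore template are hypothetical stand-ins (they replace the database providers and the full Store so that the example compiles on its own). Note that the policies derive from nothing: they only have to offer the member functions the template calls.

```cpp
#include <memory>
#include <string>
#include <vector>

struct Product
{
    std::string id;
    std::string name;
    std::string brand;
};

// Hypothetical policy: a plain class, no base class, no virtual methods.
struct InMemoryProductProvider
{
    std::vector<std::string> get_all_product_ids() const
    {
        return {"p1", "p2"};
    }

    std::shared_ptr<Product> get_product_by_id(const std::string& id) const
    {
        return std::make_shared<Product>(Product{id, "Product " + id, "Acme"});
    }
};

// Hypothetical policy: only "p1" is in stock.
struct InMemoryStockProvider
{
    int get_stock_count_by_product_id(const std::string& id) const
    {
        return id == "p1" ? 5 : 0;
    }
};

// Reduced stand-in for the Store template shown above.
template <typename ProductProviderPolicy, typename StockProviderPolicy>
class MiniStore
{
    ProductProviderPolicy _product_provider;
    StockProviderPolicy _stock_provider;

public:
    std::vector<std::string> names_in_stock() const
    {
        std::vector<std::string> names;
        for (const auto& id : _product_provider.get_all_product_ids())
        {
            if (_stock_provider.get_stock_count_by_product_id(id) == 0)
                continue;
            names.push_back(_product_provider.get_product_by_id(id)->name);
        }
        return names;
    }
};
```

Declaring MiniStore<InMemoryProductProvider, InMemoryStockProvider> store; then works the same way as the database-backed declaration, and all calls are bound at compile time.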

Pros:

  1. Better performance, because all calls are bound at compile time and the compiler can see all the code involved, which lets it inline and optimize aggressively. Notice that the _product_provider and _stock_provider members are not even pointers!
  2. Classes are still replaceable with other implementations, though not at runtime.
  3. It is also easy to create unit tests with this approach.

Cons:

  1. In this example, no IProductProvider or IStockProvider interfaces are defined, because templates do not need them. So the methods that a policy must implement have to be documented somewhere. BUT that can be solved with C++20 Concepts, right? Thanks to Concepts, the check that the providers implement all the required methods is done at compile time.

Something like this:

// The Store calls its providers from const member functions,
// so the requirements are checked against a const object.
template <typename T>
concept TProductProvider = 
    requires(const T t, const std::string& id) {
        { t.get_product_by_id(id) } -> std::same_as<std::shared_ptr<Product>>;
        { t.get_all_product_ids() } -> std::same_as<std::vector<std::string>>;
    };
    
template <typename T>
concept TStockProvider =
    requires(const T t, const std::string& id) {
        { t.get_stock_count_by_product_id(id) } -> std::same_as<int>;
    };

and then declare the Store as follows:

template <
    TProductProvider ProductProviderPolicy, 
    TStockProvider StockProviderPolicy>
class Store
{
    ProductProviderPolicy _product_provider;
    StockProviderPolicy _stock_provider;
    
public:
    Store() = default;
    
    std::vector<ProductAndStock> get_all_products_in_stock() const
    {
        std::vector<ProductAndStock> result;
        
        for (const auto& id : _product_provider.get_all_product_ids())
        {
            // We will not load products that are not in stock
            const auto count = _stock_provider.get_stock_count_by_product_id(id);
            if (count == 0)
                continue;
                
            result.push_back({_product_provider.get_product_by_id(id), count});
        }
        
        return result;
    }
};

The static nature of this idiom means it is not a completely general solution, but it probably covers a large percentage of cases. Programmers sometimes reach for dynamic solutions to problems that do not require such dynamism.

C++: std::monostate

std::monostate was released with std::variant in C++17.

Essentially, std::monostate is a type where all its instances have exactly the same (and empty) state, hence its name.

Think of its implementation as being similar to something like this:

class empty { };

And what is it useful for?

  1. An instance of this class can be used to represent the initial value of a std::variant when declared as its first alternative. A default-constructed std::variant holds a value of its first alternative, so std::monostate is quite useful when the initial value is not known a priori or when the other alternatives do not have default constructors.
std::variant<std::monostate, int, std::string> value; // here value holds a std::monostate.
  2. It can be used when a “no-value” state must be a valid std::variant value.
std::variant<int, std::string, std::monostate> value{2}; // stores an int.
value = std::monostate{}; // now stores a new instance of std::monostate.
  3. Independently of std::variant, it can be used when a class with no data is required. A concrete example?

Currently, I am working on a UI layout system. The layout_container can store UI widgets and, depending on the layout policy used, each widget may need to specify a “constraint.” The constraint could be a relative XY coordinate, or a position inside a border_layout (north, south, etc.) or NOTHING (for example, in a flow-based layout, the UI controls are positioned one next to the other, flowing like text, and thus, they do not need any constraint).

So I could define my layout_container like this:

template <typename Constraint, typename LayoutPolicy>
class layout_container;

and have the specific types like:

using border_layout_container = layout_container<border_layout_constraint, border_layout_policy>;
using xy_layout_container = layout_container<xy_layout_constraint, xy_layout_policy>;
using flow_layout_container = layout_container<std::monostate, flow_layout_policy>;

I do not need to change anything in my layout_container to support (or not) constraints thanks to std::monostate.

  4. Additionally, std::monostate implements all comparison operators (operator==, operator!=, etc.). All instances compare equal, by the way. Finally, std::monostate can also be used as a key in std::unordered_map instances because the standard library provides a std::hash specialization for it.
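
The first two uses can be put together in a small sketch. The Handle and Setting types and the is_empty() helper are hypothetical names chosen for illustration; the point is that Handle has no default constructor, yet the variant is still default-constructible because std::monostate comes first.

```cpp
#include <string>
#include <variant>

// Hypothetical type without a default constructor.
struct Handle
{
    explicit Handle(int fd) : fd{fd} {}
    int fd;
};

// std::monostate as first alternative makes Setting default-constructible
// and doubles as the "no value" state.
using Setting = std::variant<std::monostate, Handle, std::string>;

inline bool is_empty(const Setting& setting)
{
    return std::holds_alternative<std::monostate>(setting);
}
```

A default-constructed Setting starts out empty, can later hold a Handle or a std::string, and can be reset by assigning std::monostate{} to it.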


C++: [[maybe_unused]]

When you declare a function or a method with a parameter that is never used, the compiler emits a warning about it (if -Wunused-parameter is enabled in gcc and clang).

See the code below:

#include <iostream>
#include <string>

void print_msg(const std::string& e)
{
  std::cout << "Hello world\n";
}

int main()
{
  print_msg("Friday");
  return 0;
}

The g++ compiler output is:

In function 'void print_msg(const std::string&)':
<source>:4:35: warning: unused parameter 'e' [-Wunused-parameter]
    4 | void print_msg(const std::string& e)
      |                ~~~~~~~~~~~~~~~~~~~^

It is a good idea to enable this compiler flag (-Wunused-parameter) to remove all unused parameters from your code, making it cleaner.

That could mean removing the parameter completely like below:

void print_msg();

or simply removing the variable name from the argument list:

void print_msg(const std::string& );

Why would you choose the latter instead of the former?

  1. You are sure you will use that parameter later, so you prefer keeping it to avoid modifying all the places where your function is called.
  2. You are overriding a method where the parameter is not used and you cannot modify the interface of the virtual method you are implementing.
  3. You want to catch an exception but do not want to do anything with the exception itself (probably you want to ignore the exception, log a message, or generate a more generic error code).
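
Case 2 could look like the sketch below. Listener and CountingListener are hypothetical classes made up for this example: the override cannot change the signature, so it simply leaves the parameters unnamed (the names are kept as comments for readability).

```cpp
#include <string>

// Hypothetical interface that we are not allowed to change.
class Listener
{
public:
    virtual ~Listener() = default;
    virtual int on_event(const std::string& name, int payload) = 0;
};

// This implementation only counts events, so it needs neither parameter.
// Leaving them unnamed avoids the "unused parameter" warning.
class CountingListener final : public Listener
{
    int _count = 0;

public:
    int on_event(const std::string& /*name*/, int /*payload*/) override
    {
        return ++_count;
    }
};
```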

However, in some scenarios, you simply do not know if a variable will be used or not (e.g. conditional compilation based on #define and #ifdef preprocessor directives or template-based code).

Consider this example:

You have a log subsystem but want to enable it only when the LOG_ENABLED preprocessor definition is set. This is its implementation:

void log_msg(const std::string& msg)
{
#ifdef LOG_ENABLED    
    std::cout << msg << "\n";
#else
    // do nothing
#endif
}

int main()
{
  log_msg("This is something to log");
}

If the LOG_ENABLED definition is set, everything will work as expected; otherwise, the “unused parameter” warning will be emitted.

Since it is a good idea to enable such warnings AND it is also a good idea to have your code as clean and expressive as possible, [[maybe_unused]] is the hero here.

[[maybe_unused]] is an attribute introduced in C++17 that marks structs, classes, variables, parameters, enums, etc. as “possibly unused”, so the compiler does not warn when such entities are not used in the scope where they are defined. Thus, apart from removing the warning, marking a variable as “maybe unused” is a nice way to self-document the code and the intentions behind the symbols you create.

Your program will look like this with this attribute:

#include <iostream>
#include <string>


void log_msg([[maybe_unused]] const std::string& msg)
{
#ifdef LOG_ENABLED    
    std::cout << msg << "\n";
#else
    // do nothing
#endif
}

int main()
{
  log_msg("This is something to log");
}
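
The attribute is not limited to parameters. A minimal sketch of another common situation, a local variable that is only read by assert(): when NDEBUG is defined the assert expands to nothing and the variable becomes unused. The sum_positive function is a hypothetical example.

```cpp
#include <cassert>
#include <vector>

// Sums the values, checking in debug builds that all of them are positive.
int sum_positive(const std::vector<int>& values)
{
    int sum = 0;
    for (const int v : values)
    {
        // Only read by assert(), which disappears when NDEBUG is defined.
        [[maybe_unused]] const bool is_positive = v > 0;
        assert(is_positive);
        sum += v;
    }
    return sum;
}
```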

More on this attribute here: https://en.cppreference.com/w/cpp/language/attributes/maybe_unused

RAII

What is RAII?

RAII stands for ‘Resource Acquisition Is Initialization.’ It is arguably one of the most elegant and useful features that C++ has introduced to the world. Both D and Rust have incorporated it into their specifications as well.

What does it mean?

RAII, “obviously,” means that any entity requesting a resource from the system (memory, file handle, network connection, etc.) should be responsible for releasing that resource when its lifetime has ended.

In C++ jargon, this means that any resource needed by an object must be acquired by the object’s constructor and released in its destructor.

Thanks to this very interesting feature, when a variable representing an object created with value semantics goes out of scope, its destructor is invoked automatically and seamlessly, releasing any resources the object may have acquired during its lifetime.

That solves many resource-related issues in a very transparent way when the variables go out of scope:

  • Any dynamically allocated memory owned by this object can be released.
  • Any open file handle can be closed.
  • Any network connection can be closed.
  • Any database connection can be closed.
  • Any registry handle can be released.
  • Any mutex can be unlocked.

… and so on.

The best part is that, while several languages offer garbage collectors limited to handling memory, RAII provides a cleaner alternative for managing not only memory but any type of resources.

Now, let’s see how it can be used.

First, let’s look at how these constructors and destructors are invoked:

#include <iostream>

class A final
{
    int n;

public:
    explicit A(int n) : n{n} { std::cout << "Hello " << n << std::endl; }
    ~A() { std::cout << "Bye " << n << std::endl; }
};

void test()
{
    A a{1};
    A b{2};
    A c{3};
}

int main()
{
    std::cout << "Begin" << std::endl;
    test();
    std::cout << "End" << std::endl;
}

When running this, the output will be:

Begin
Hello 1
Hello 2
Hello 3
Bye 3
Bye 2
Bye 1
End

Two things can be noticed here:

  1. The destructors are invoked automatically before exiting the test function. Why is that? Because a, b, and c were created in that code block.
  2. The order of destructor calls is the reverse of their creation order.

Since the destructors are invoked automatically, we can leverage this interesting feature (RAII) to free any resources acquired by the object. For example, class A could be modified to store that int value on the heap instead (which is a bad idea, by the way):

class A final
{
    int* pn;

public:
    explicit A(int n) 
    : pn{new int{n}} 
    {
        std::cout << "Hello " << *pn << std::endl; 
    }

    ~A()
    { 
        std::cout << "Bye " << *pn << std::endl; 
        delete pn;
    }
};

Note that resources are acquired in the constructor and released in the destructor.

In this way, the user of class A does not need to worry about the resources it uses.

“Out of scope” also covers the cases where the function ends abruptly or returns prematurely: the destructors of the objects are guaranteed to be invoked before control is transferred back to the caller.

This will be tested by adding an exception:

#include <iostream>

class A final
{
    int* pn;

public:
    explicit A(int n) 
    : pn{new int{n}} 
    {
        std::cout << "Hello " << *pn << std::endl; 
    }

    ~A()
    { 
        std::cout << "Bye " << *pn << std::endl; 
        delete pn;
    }
};

void test(int nonzero)
{
    A a{1};
    A b{2};

    if (nonzero == 0)
        throw "Arg cannot be zero";

    A c{3};
}

int main()
{
    std::cout << "Begin" << std::endl;
    try
    {
        test(0);
    }
    catch (const char* e)
    {
        std::cout << e << std::endl;
    }
    std::cout << "End" << std::endl;
}

Note that an exception is thrown after objects a and b are created. When the exception occurs, the function test ends abruptly, but the destructors of a and b will be invoked before entering the catch block.

The destructor of object c is not invoked because the object was not created when the exception occurred.

The same behavior occurs if a function is exited prematurely.
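
The early-return case can be sketched as follows. Tracer is a hypothetical helper that appends to a log in its constructor and destructor, which makes the order of the calls visible without printing anything.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: records when it is created and destroyed.
class Tracer final
{
    std::vector<std::string>& _log;
    std::string _name;

public:
    Tracer(std::vector<std::string>& log, std::string name)
    : _log(log)
    , _name{std::move(name)}
    {
        _log.push_back("begin " + _name);
    }

    ~Tracer() { _log.push_back("end " + _name); }

    Tracer(const Tracer&) = delete;
    Tracer& operator=(const Tracer&) = delete;
};

bool process(int value, std::vector<std::string>& log)
{
    Tracer a{log, "a"};

    if (value < 0)
        return false; // early return: a is destroyed here, b never existed

    Tracer b{log, "b"};
    return true;      // b is destroyed first, then a
}
```

Calling process(-1, log) leaves "begin a" and "end a" in the log; calling it with a non-negative value records both objects, destroyed in reverse order of creation.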

Now, look at class B that has been added to the example:

#include <iostream>

class A final
{
    int* pn;
public:
    explicit A(int n) 
    : pn{new int{n}} 
    {
        std::cout << "Hello " << *pn << std::endl; 
    }

    ~A()
    { 
        std::cout << "Bye " << *pn << std::endl; 
        delete pn;
    }
};

class B final
{
    A a;
    A b;
    
public:
    B(int valueA, int valueB) : a{valueA}, b{valueB} { }
};

void test()
{
    B a { 4, 5};
    B b { 6, 7};
}

int main()
{
    std::cout << "Begin" << std::endl;
    test();
    std::cout << "End" << std::endl;
}

The output is:

Begin
Hello 4
Hello 5
Hello 6
Hello 7
Bye 7
Bye 6
Bye 5
Bye 4
End

Why are the destructors of A called when B objects go out of scope if a destructor for B was not written?

Because when a class does not define a destructor, the compiler generates one automatically, and that destructor invokes the destructors of all member variables with value semantics.

Thus, if the basic building-block classes manage their own resources, classes composed of them rarely need to acquire or release resources explicitly in their own constructors or destructors.

What about pointers?

RAII does not work with raw pointers, so if something like this is declared in a function:

int* array = new int[1024];

nothing will happen when the variable array goes out of scope: the memory is not released and it leaks unless delete[] is called explicitly.

Is there any way to have pointers handled by RAII?

YES! Through smart pointers!
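
A minimal sketch of the idea with std::unique_ptr, which owns the array and calls delete[] in its destructor. The sum_of_squares function is a hypothetical example; only std::make_unique and std::unique_ptr come from the standard library.

```cpp
#include <cstddef>
#include <memory>

// Fills a heap-allocated buffer with squares and returns their sum.
int sum_of_squares(std::size_t count)
{
    // The buffer is owned by the smart pointer, not by a raw pointer.
    auto buffer = std::make_unique<int[]>(count);

    for (std::size_t i = 0; i < count; ++i)
        buffer[i] = static_cast<int>(i * i);

    int sum = 0;
    for (std::size_t i = 0; i < count; ++i)
        sum += buffer[i];

    return sum; // delete[] runs automatically when buffer goes out of scope
}
```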

Other non-memory related uses?

  • std::ifstream and std::ofstream automatically close the file they opened for reading or writing.
  • std::lock_guard<T> locks a mutex in its constructor and unlocks it in its destructor, so a mutex is never left locked by mistake.
  • When writing UI code, a MouseRestorer could be useful: it automatically restores the default mouse cursor after it has been changed to an hourglass during a time-consuming piece of code.
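
The last idea could be sketched like this. Cursor and CursorRestorer are hypothetical names, and a plain enum stands in for whatever a real UI toolkit would use to set the cursor.

```cpp
// Hypothetical cursor state; a real toolkit would offer its own API.
enum class Cursor { arrow, hourglass };

// Sets a temporary cursor on construction and restores the previous
// one on destruction, however the enclosing scope is left.
class CursorRestorer final
{
    Cursor& _cursor;
    Cursor _previous;

public:
    CursorRestorer(Cursor& cursor, Cursor temporary)
    : _cursor(cursor)
    , _previous(cursor)
    {
        _cursor = temporary;
    }

    ~CursorRestorer() { _cursor = _previous; }

    CursorRestorer(const CursorRestorer&) = delete;
    CursorRestorer& operator=(const CursorRestorer&) = delete;
};
```

Declaring a CursorRestorer at the top of the slow block is enough; the cursor goes back to its previous value when the block ends, even if an exception is thrown.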

C++ vs. Rust: Factorial

Let’s dig a little into some implementations (in C++ and Rust) of a function that computes the factorial of an integer passed as an argument. I will try to make them as similar as possible and point out the differences between them:

Recursive factorial

This is the simplest and best-known factorial implementation:

int factorial(int n)
{
    if (n <= 1)
        return 1;

    return n * factorial(n - 1);
}

Now, the “same” implementation in Rust:

fn factorial(n : i32) -> i32
{
    if n <= 1
    {
        return 1;
    }
        
    return n * factorial(n - 1);
}

Though the two versions are similar and will probably compile to very similar machine code, there are some interesting differences to take into account:

  1. All functions in Rust start with the “fn” keyword. In C++ they start with the function return type, void, or auto.
  2. In Rust the return type is specified after ->. Since C++11 you can do the same with a trailing return type if you declare your function with “auto”. If a Rust function does not specify a return type, it does not return anything (like a C++ void function).
  3. The Rust type “i32” is a 32-bit integer. In C++, “int” is only guaranteed to be at least 16 bits wide. As far as I know, it is 32 bits wide on all current mainstream implementations, but that may not be true for old platforms and compilers or for very small targets, where int could be 16-bit. Having integers with a well-defined size on all platforms makes code portability easier. (C++ also has std::int32_t, but it is an alias for another integer type, not a distinct type.)
  4. Rust’s “if” discourages the use of parentheses around the condition.
  5. Rust requires the “if” and “else” blocks to be enclosed in curly braces.
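
Points 2 and 3 can be combined on the C++ side. The sketch below uses a trailing return type and std::int32_t to get as close as possible to the Rust signature; the name factorial32 is just a hypothetical variant of the function above.

```cpp
#include <cstdint>
#include <limits>

// Where std::int32_t exists, it has exactly 31 value bits plus the sign bit.
static_assert(std::numeric_limits<std::int32_t>::digits == 31,
              "std::int32_t is a 32-bit signed integer");

// Trailing return type: the closest C++ spelling to Rust's "-> i32".
constexpr auto factorial32(std::int32_t n) -> std::int32_t
{
    return n <= 1 ? 1 : n * factorial32(n - 1);
}

// 12! is the largest factorial that fits in a signed 32-bit integer.
static_assert(factorial32(12) == 479001600, "12! fits in 32 bits");
```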

Non-recursive implementation

C++ version, using “while“. I am not using “for” because the C/C++/Java/C# -like “for” does not exist in Rust.

int nonRecursiveFactorial(int n)
{
    int r = 1;
    
    while (n >= 1)
    {
        r *= n;
        n--;
    }

    return r;
}

And now, the same code in Rust:

fn non_recursive_factorial(mut n : i32) -> i32
{
    let mut r = 1;
    
    while n >= 1
    {
        r *= n;
        n -= 1;
    }
    
    return r;
}

Again, interesting differences:

  1. I called my function “nonRecursiveFactorial” in C++ and “non_recursive_factorial” in Rust. Though I can name my functions whatever I want, the Rust compiler suggests using snake_case instead of camelCase.
  2. Notice I marked my argument as “mut” and my variable “r” as “mut” as well. “mut” stands for “mutable” and means that the value of that variable can be modified during its lifetime. All variables in Rust are immutable by default (similar to a C++ const variable), and that simple feature prevents a lot of accidental-mutation and concurrency problems.
  3. Again, “while” does not have parentheses around its condition.
  4. Notice I am writing n -= 1; instead of n--; in Rust. Rust does not have “++” or “--” operators: they make the evaluation order harder to reason about, and code using them can be hard to read in some scenarios.

I want to use “for” anyway

C++ version:

int factorialWithFor(int n)
{
    int r = 1;
    
    for (int i = 2; i < n + 1; i++)
        r *= i;

    return r;
}

Rust version:

fn factorial_with_for(n : i32) -> i32
{
    let mut r = 1;
    
    for i in 2..n + 1
    {
        r *= i;
    }
    
    return r;
}

Once more, interesting differences:

  1. The “for” loop in Rust is a range-based for loop, similar to the C++11 range-based for loop or the C# foreach.
  2. The variable i is declared by the loop and lives only inside that block. It is immutable by default; the loop binds a new value to it on every iteration.
  3. After the “in” keyword, I wrote “2..n + 1”. That is the Rust way of creating the half-open range [2, n + 1), so the last value the loop sees is n. (The inclusive form “2..=n” is equivalent.)
  4. If I wanted a countdown instead, I could write “(2..n + 1).rev()”, which counts down from n to 2.

So far, a very nice language indeed.

C++17: Structured bindings

When working with a value of a compound type, you usually want to get at its internal fields in order to work with them.

When tuples were introduced (my post about tuples), the way of doing this was similar to this:

std::tuple<int, std::string, std::string> person { 20050316, "Isabel", "Pantoja" };
int birth_date = std::get<0>(person);
std::string& first_name = std::get<1>(person);
std::string& last_name = std::get<2>(person);

In this way, you were able to access any elements inside the tuple. Quite useful, but quite verbose as well.

So, std::tie could help us to make this code by far easier to read:

std::tuple<int, std::string, std::string> person { 20050316, "Isabel", "Pantoja" };

int birth_date;
std::string first_name;
std::string last_name;

std::tie(birth_date, first_name, last_name) = person;

The magic that std::tie does here is building a tuple of non-const references to the variables passed to it; assigning person to that tuple copies each element into the corresponding variable.

C++17 introduced “structured bindings”, a far more evident and elegant way of doing this without getting lost in what std::tie or std::get<N> do.

A structured binding, as its name suggests, binds a set of names to the corresponding values inside a tuple. Currently, these constructs are supported:

  • std::tuple<T...>
  • std::pair<T, U>
  • std::array<T, N> and built-in (C-style) arrays
  • Structs (with public data members)!

How does it work?

The example above could be re-written like this:

std::tuple<int, std::string, std::string> person { 20050316, "Isabel", "Pantoja" };
auto& [birth_date, first_name, last_name] = person;

So, the compiler will “decompose” person into its individual elements and will bind birth_date to the first element of person, first_name to the second one and last_name to the third one.

Notice that birth_date, first_name, and last_name refer to the elements stored inside person because of auto&. If I had used auto instead, they would refer to the elements of a copy of person.
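
The difference between auto& and auto can be sketched with two small functions. Both names (rename_in_place and rename_on_copy) are hypothetical and exist only to contrast the two forms.

```cpp
#include <string>
#include <tuple>

using Person = std::tuple<int, std::string, std::string>;

// auto&: the names refer to the elements of person itself,
// so the assignment is visible to the caller.
inline void rename_in_place(Person& person, const std::string& name)
{
    auto& [birth_date, first_name, last_name] = person;
    first_name = name;
}

// auto: the tuple is copied first and the names refer to the copy,
// so the caller's tuple is left untouched.
inline void rename_on_copy(Person& person, const std::string& name)
{
    auto [birth_date, first_name, last_name] = person;
    first_name = name; // modifies the copy only
}
```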

And you can do the same thing with arrays:

std::array<int, 4> values {{ 8, 5, 2, 9 }};
auto [v0, v1, v2, v3] = values;
std::cout << "Sum: " << (v0 + v1 + v2 + v3) << "\n";

int values2[] = { 5, 4, 3 };
auto [i0, i1, i2] = values2;
std::cout << i0 << "; " << i1 << "; " << i2 << "\n";

With std::pair:

auto pair = std::make_pair(19451019, "María Zambrana");
auto& [date, name] = pair;
std::cout << date << ": " << name << "\n";

Or even with structs!!!

struct Point
{
  int x;
  int y;
};

Point p { 10, 25 };
auto [x, y] = p;
std::cout << "(" << x << "; " << y << ")\n";

This feature makes iterating maps incredibly elegant.

Compare this pre-C++11 code:

for (std::map<std::string, std::string>::const_iterator it = translations.begin();
     it != translations.end();
     it++)
{
  std::cout << "English: " << it->first << "; Spanish: " << it->second << "\n";
}

With C++17, it can be written like this:

for (auto& [english, spanish] : translations)
{
  std::cout << "English: " << english << "; Spanish: " << spanish << "\n";
}

Amazing!
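
Maps benefit in one more place: std::map::insert returns a std::pair holding an iterator and a bool, which a structured binding can name directly. A small sketch, where add_translation is a hypothetical helper:

```cpp
#include <map>
#include <string>

// Tries to add a translation and reports what is stored afterwards.
inline std::string add_translation(std::map<std::string, std::string>& translations,
                                   const std::string& english,
                                   const std::string& spanish)
{
    // position points to the element with that key; inserted tells
    // whether it was added now or was already there.
    const auto [position, inserted] = translations.insert({english, spanish});
    return (inserted ? "added " : "kept ") + position->second;
}
```

Inserting the same key twice keeps the first value, because insert does not overwrite existing elements.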

C++ vs. Rust: Hello World

This is my first program in Rust, obviously, a “Hello World“! :)

Two ways of creating it:

1. Everything manually

a. Need to create a file with .rs extension. In my case: HelloWorld.rs

b. Write the actual program in that file:

// Hello World in Rust

/* 
 * Same multiline comment like in C
 */

fn main()
{
  println!("Hello world");
}

c. Go to the command line and compile the file with the rustc compiler

rustc HelloWorld.rs -o HelloWorld

d. Execute the binary file

./HelloWorld
Hello world

2. Using cargo.

a. cargo is the Rust package manager: it helps you manage dependencies and is also useful in the build process. You can use it to create your program and compile it. So, we can create a new “cargo package”:

cargo new HelloWorld

b. That creates a new “cargo package” called “HelloWorld”: a new “HelloWorld” directory containing a Cargo.toml descriptor file and a src directory with a main.rs skeleton file, which already contains a “Hello World” program similar to the one I wrote above:

fn main() {
    println!("Hello, world!");
}

c. To compile the package, we can build it using cargo too:

cd HelloWorld
cargo build

d. If everything is OK, a new directory called target is created and, inside it, a debug directory (or a release one when building with cargo build --release). Inside the debug directory we can run the HelloWorld executable.

3. “Hello World” content

The comments are similar to those of C-like languages: // for single-line and /* */ for multiline comments.

The “fn” keyword introduces a Rust function. All functions in Rust have a name and a list of parameters (possibly empty). The return type is written after “->”; when it is omitted, as in main here, the function does not return anything.

“main”, as in C, is the program entry point and is the function that is executed when you invoke the program from the command line.

“println!” is a Rust macro that prints the specified text followed by a newline. A Rust macro is a piece of code able to generate code by itself (metaprogramming).

This is it for now. I will continue writing about Rust while learning it. Thanks for reading!

4. Comparison with C++

This is the most similar implementation of “Hello world” in C++:

#include "fmt/core.h"

int main()
{
  fmt::print("Hello world\n");
}

I used libfmt to print the “Hello world” text to make both implementations as similar as possible.

Notice that Rust does not need any #include stuff. Actually, Rust lets you import libraries in a modern way (similar to Java imports or C++20 modules), but println! is a macro included in the standard library and imported by default.

Function main() is the program entry point in Rust as well, but there it does not need to return anything. In C++ it MUST be declared as returning int; if no return statement is reached, it returns 0 implicitly.

C++: boost::pfr

At work I used this library a few days ago and found it amazingly useful.

Basically, boost::pfr lets you work with the fields of a given struct as if it were a std::tuple, letting you access the fields through an index representing the position at which each field is declared inside the struct.

boost::pfr also has the benefit of being header-only and of not depending on any other Boost library, making it easy to install (copy its header files and that’s it) and to use.

Let’s see the example below. Suppose you have a struct Person:

struct Person
{
  std::string first_name;
  std::string last_name;
  int birth_year;
};

With boost::pfr, you can access any member of the struct given an index, using a function template called get() that is very similar to its std::tuple counterpart:

#include <iostream>
#include <string>
#include "boost/pfr.hpp"

int main()
{
  Person composer { "Martin", "Gore", 1961};
  
  auto& a = boost::pfr::get<0>(composer); // will return a std::string reference
  auto& b = boost::pfr::get<1>(composer);
  auto& c = boost::pfr::get<2>(composer); // will return a reference to the int field

  std::cout << a << ", " << b << ", " << c << "\n";
  return 0;
}

To get the number of fields in a struct, you have boost::pfr::tuple_size:

int main()
{
  std::cout << boost::pfr::tuple_size<Person>::value << "\n";  // returns 3 for our struct
}

To get the type of a given field, you have boost::pfr::tuple_element:

int main()
{
  boost::pfr::tuple_element<0, Person>::type s; 
  return 0;
}

In the example above, I declared a variable s whose type is the same type of the element in the 0-th position in the struct Person (i.e. a std::string).

Use cases:

You could create a generic library that iterates through all the members of any struct and does something with each of them (similar to std::for_each or std::visit; boost::pfr provides a for_each_field function template for this iteration): for example, printing the values out (see my example below), saving them to disk, deserializing them, or populating them from an external source (think of a SQL result set or a JSON object).

For printing the values of a struct, boost::pfr already ships with a function called io_fields that does exactly that:

int main()
{
  Person singer { "Dave", "Gahan", 1962 };
  std::cout << boost::pfr::io_fields(singer) << "\n";
}

This prints out “{ Dave, Gahan, 1962 }“.

boost::pfr also ships with a lot of utilities to compare structs field by field and to use any struct as a key in standard library maps and sets.

To learn more about this nice library, visit its documentation: https://apolukhin.github.io/pfr_non_boost/

C++14: [[deprecated]]

[[deprecated]] is another attribute, useful to mark something (a function, a method, a variable, a class, etc.) as still valid but superseded by something newer, and likely to be removed in the future.

In a similar vein to [[nodiscard]], [[deprecated]] can carry a message explaining why the entity has been marked as such.

The compiler will show a warning when a deprecated entity is actually used in our code.

For example, I have this code:

#include <iostream>
#include <string>

void print(const std::string& msg)
{
    std::cout << msg << "\n";
}

int main()
{
    print("Hello world");
}

After a lot of code has started to use my print() function, I realize that a newer version taking std::string_view instead of std::string could perform better and, since I do not want to break any existing code, I decide to keep both functions in my system.

So, to discourage the usage of my old function, I mark it as deprecated:

#include <iostream>
#include <string>
#include <string_view>
void println(std::string_view msg)
{
    std::cout << msg << "\n";
}

[[deprecated("Use println instead")]]
void print(const std::string& msg)
{
    std::cout << msg << "\n";
}

int main()
{
    print("Hello world");
}

But, since I am still using the old version in my main() function, the compiler emits a warning like this one:

main.cpp:17:24: warning: ‘void print(const string&)’ is deprecated: Use println instead [-Wdeprecated-declarations]

That warning will disappear once I replace every invocation of print() with println().
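While both functions coexist, one way to avoid duplicating logic is to let the deprecated function forward to its replacement. The following is only a sketch under my own assumptions: the functions take a caller-supplied std::ostream (so the result is easy to inspect), and the names legacy_id, color and println_to_string are hypothetical, added just to show the attribute on a type alias and on an enumerator.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <string_view>

// New function: accepts any string-like argument without copying it.
void println(std::ostream& os, std::string_view msg)
{
    os << msg << "\n";
}

// Old function: kept for compatibility, it only forwards to println().
[[deprecated("Use println instead")]]
void print(std::ostream& os, const std::string& msg)
{
    println(os, msg);
}

// The attribute can also be applied to other entities (hypothetical names).
using legacy_id [[deprecated("Use std::string instead")]] = const char*;

enum class color
{
    red,
    green,
    crimson [[deprecated("Use color::red instead")]] = red
};

// Helper that captures the output of println() in a string.
std::string println_to_string(std::string_view msg)
{
    std::ostringstream os;
    println(os, msg);
    return os.str();
}
```

Nothing here triggers the warning until some code actually calls print(), names legacy_id or uses color::crimson.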

C++17: [[nodiscard]] attribute

C++17 adds a new attribute called [[nodiscard]] to tell the user that the return value of a function or method should not be ignored: it must be used or assigned to a variable.

For example, look at this code:

int sum(int a, int b)
{
  return a + b;
}

int main()
{
  sum(10, 20);
  return 0;
}

It produces no visible result or side effects, but if the programmer forgot to assign the return value to a variable, the mistake will not be immediately obvious.

Now, in this scenario:

#include <cstring>

char* getNewMessage()
{
  char* nm = new char[100];
  strcpy(nm, "Hello world");
  return nm;
}

int main()
{
  getNewMessage();
  return 0;
}

This produces a memory leak because the returned pointer was not stored anywhere, so there is no way to deallocate its memory.

Marking a function or method with [[nodiscard]] encourages the compiler to show a warning when it is invoked and its return value is discarded.

Since C++20, you can also attach a message to the [[nodiscard]] attribute. That message will be displayed if a warning is generated.

In my examples, we could mark my functions like this:

#include <cstring>

[[nodiscard]]
int sum(int a, int b)
{
  return a + b;
}

[[nodiscard("Release the memory using delete[]")]]
char* getNewMessage()
{
  char* nm = new char[100];
  strcpy(nm, "Hello world");
  return nm;
}

int main()
{
  sum(10, 20);
  getNewMessage();
  return 0;
}

And in this case, g++ emits the following compilation warnings:

In function 'int main()':
<source>:19:6: warning: ignoring return value of 'int sum(int, int)', declared with attribute 'nodiscard' [-Wunused-result]

<source>:20:16: warning: ignoring return value of 'char* getNewMessage()', declared with attribute 'nodiscard': 'Release the memory using delete[]' [-Wunused-result]

Though using it can add verbosity to your declarations, it is a good idea because it prevents some errors from occurring.
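One way to reduce that verbosity is to put the attribute on a type instead of on every function: any function that returns that type by value is then treated as nodiscard. A minimal sketch, where the status enum and the set_volume function are hypothetical names of my own:

```cpp
// Marking the type itself means every function returning it by value
// produces a warning when the result is ignored.
enum class [[nodiscard]] status
{
    ok,
    invalid_argument
};

// Hypothetical function: rejects negative values and reports the outcome.
status set_volume(int& volume, int new_value)
{
    if (new_value < 0)
    {
        return status::invalid_argument;
    }

    volume = new_value;
    return status::ok;
}
```

A statement such as set_volume(v, -1); written on its own should produce the same kind of warning shown above, without set_volume being annotated.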

More on [[nodiscard]]: https://en.cppreference.com/w/cpp/language/attributes

C++20: Useful concepts: Requiring type T to be derived from a base class

How can I create a concept that requires my parameterized types to inherit from a base class?

I am creating a heavily object-oriented class hierarchy and I want the functions related to that class hierarchy to accept only instances of classes derived from the base class of that hierarchy.

So, for example, I have this base class:

#include <string>

class Object
{
public:
    virtual ~Object() = default;

    virtual std::string to_string() const = 0;
};

I create derived classes from that class, for example: class Int.

class Int : public Object
{
    int n;
public:
    Int(int n) : n{n}  {}

    std::string to_string() const override
    {
        return std::to_string(n);
    }
};

And now I want to create a function template “print” that invokes the method “to_string()” of the given parameter and prints it out.

Because of my design, I want my function template to accept only instances of classes derived from my class Object.

In this case, I can create a C++20 concept using the type trait std::is_base_of<Base, Derived>, like this:

#include <type_traits>

template <typename T>
concept ConceptObject = std::is_base_of<Object, T>::value;

In the lines above, I am creating a new C++20 concept called “ConceptObject” that requires type T to be a class derived from Object.

So, finally, my function template “print” can be expressed in this way:

template <ConceptObject T>
void print(const T& s)
{
    std::cout << s.to_string() << "\n";
}

And it will only compile if the parameter s is an instance of a class derived from Object:

int main()
{
    print(Int{5});
}

Pretty nice!

If you want to read more about C++ concepts, I have this post introducing them: C++20: Concepts, an introduction

C++20: {fmt} and std::format

In this post I will show a very nice open source library that lets the programmers create formatted text in a very simple, type-safe and translatable way: {fmt}

{fmt} is also the reference implementation for the C++20 standard std::format (the {fmt} author [https://www.zverovich.net/] submitted his paper to the standard committee), but at the time of writing (June 2021) only Microsoft Visual Studio implements it. So I will describe the standalone library for now.

These are the main features we can find in {fmt}:

  • We can specify the format and the values to be formatted, similar to the C printf family.
  • The {fmt} functions do not require the programmer to match the type in the format specifier to the actual values that are being formatted.
  • With {fmt}, programmers can optionally specify the order of the values to be formatted. This feature is very useful when internationalizing texts, where the order of the parameters is not always the same as in the original language.
  • Some format validations can be done at compile time.
  • It is based on variadic templates, but there are lower-level functions in this library (such as fmt::vformat) that take a type-erased argument list instead.
  • Its performance is way ahead of the C++ std::ostream family.

To test {fmt}, I used Godbolt’s Compiler Explorer.

Hello world

This is a very simple “Hello world” program using {fmt}:

#include <fmt/core.h>

int main()
{
    fmt::print("Hello world using {{fmt}}\n");
}

fmt::print() is a helper function that prints formatted text using the standard output (similar to what std::cout does).

As you can see, I did not specify any values to be formatted, BUT I wrote {{fmt}} instead of {fmt}. For the {fmt} functions, {{ and }} are escape sequences for { and } respectively. A single { or } delimits a replacement field, which may contain an argument index or a format specifier and is replaced by the value of an argument.

Using simple variable replacements

For example, if I want to print out two values, I can do something like:

int age = 16;
std::string name = "Ariana";

fmt::print("Hello, my name is {} and I'm {} years old", name, age);

This will print:

Hello, my name is Ariana and I'm 16 years old

fmt::print() replaced each {} with the respective variable value. Notice also that I used std::string and it worked seamlessly. With printf(), the programmer would have to pass the const char* representation of the string (e.g. name.c_str()).

If the programmers specify fewer arguments than {}, the compiler (or the runtime) reports an error. More on this below.

If the programmers specify more arguments than {}, the extra ones are simply ignored by fmt::print() or fmt::format().

Defining arguments order

The programmers can also specify the order in which the arguments will be used:

void show_numbers_name(bool ascending)
{
    std::string_view format_spec = ascending ? "{0}, {1}, {2}\n" : "{2}, {1}, {0}\n";
    
    auto one = "one";
    auto two = "two";
    auto three = "three";

    // {fmt} 8 and later require fmt::runtime() for format strings chosen at runtime
    fmt::print(fmt::runtime(format_spec), one, two, three);
}

int main()
{
    show_numbers_name(true);
    show_numbers_name(false);
}

As you can see, the {} can contain a number that tells {fmt} the index of the argument to be used at that position when formatting the output.

Formatting containers

Containers can easily be printed out by including <fmt/ranges.h>:

std::vector<int> my_vec = { 10, 20, 30, 40, 50 };
fmt::print("{}\n", my_vec);

Format errors

There are two ways that {fmt} uses to process errors:

If there is an error in the format, a runtime exception is thrown, for example:

int main()
{
    try
    {
        fmt::print("{}, {}, {}\n", 1, 2);
    }
    catch (const fmt::format_error& ex)
    {
        fmt::print("Exception: {}\n", ex.what());
    }
}

In the example above, the format string has three replacement fields but I only provided two arguments, so a fmt::format_error exception is thrown.

But if the format string is always constant, we can verify its correctness at compile time in this way:

#include <fmt/core.h>
#include <fmt/compile.h>

int main()
{
    fmt::print(FMT_COMPILE("{}, {}, {}\n"), 1, 2);
}

FMT_COMPILE is a macro found in <fmt/compile.h> that performs the format validation in compile-time, thus in this case, a compile-time error is produced.

Custom types

To format your custom types, you must create a template specialization of the class template fmt::formatter and implement the methods parse() and format(), as in this example:

// My custom type
struct person
{
    std::string first_name;
    std::string last_name;
    size_t social_id;
};

// fmt::formatter full template specialization
template <>
struct fmt::formatter<person>
{
    // Parses the format specifier, if needed (in my case, only return an iterator to the context)
    constexpr auto parse(format_parse_context& ctx) { return ctx.begin(); }

    // Actual formatting. fmt::format_to receives the output iterator, the format string and then the actual values from my custom type
    template <typename FormatContext>
    auto format(const person& p, FormatContext& ctx) const {
        return fmt::format_to(
            ctx.out(), 
            "[{}] {}, {}", p.social_id, p.last_name, p.first_name);
    }
};

int main()
{
    fmt::print("User: {}\n", person { "Juana", "Azurduy", 23423421 });
}

Neat, huh?

Links

{fmt} GitHub page: https://github.com/fmtlib/fmt

{fmt} API reference: https://fmt.dev/latest/api.html

Compiler explorer: https://godbolt.org/

C++17: std::optional

The C++17 standard library ships with a very interesting class template: std::optional<T>.

The idea behind it is to make explicit the fact that a variable may or may not hold an actual value.

Before the existence of std::optional<T>, the only way to implement such semantics was through pointers or tagged unions (read about C++17 std::variant here).

For example, suppose I want to declare a struct person that stores a person’s first name, last name and nickname. Since not everyone has a nickname, I would have to implement that (in older C++) in this way:

struct person
{
  std::string first_name;
  std::string last_name;
  std::string* nickname; //no nickname if null
};

To make explicit that the nickname will be optional, I need to write a comment stating that “null” represents “no nickname” in this scenario.

And it works, but:

  • It is error prone because the user can easily do something like p.nickname->length(); and run into undefined behavior when the nickname is null.
  • Since the nickname is stored as a pointer, the instance needs to be created on the heap, adding one level of indirection and one additional dynamic allocation/deallocation only to support the desired behavior (or the programmers need to manage the nickname themselves and store a pointer to it in this struct).
  • Because of the previous point, it is not at all obvious whether the instance pointed to should be explicitly released by the programmer or will be released automatically by the struct itself.
  • The “optionalness” here is not explicit at all at code level.

std::optional<T> provides safeguards for all these things:

  • Its instances can be created on the stack and the value lives inside the std::optional<T> object itself, so there is no extra allocation, deallocation or null pointer: RAII takes care of the value (the standard does not allow implementations to allocate the contained value dynamically).
  • The “optionalness” of the attribute is completely explicit when used: Nothing is more explicit than marking as “optional” something… optional, isn’t it?
  • Instances of std::optional<T> hide direct access to the object, so accessing the actual value pushes the programmer to do an extra check.
  • If we try to get the value through value() on an instance that is not storing anything, a known exception is thrown instead of undefined behavior.

Refactoring my code, it will look like this one:

#include <optional>
#include <string>

struct person
{
  std::string first_name;
  std::string last_name;
  std::optional<std::string> nickname;
};

The code is pretty explicit and needs no further explanation or documentation about the optional character of “nickname”.

So let’s create two people, one with nickname and the other one with no nickname:

int main()
{
  person p1 { "John", "Doe", std::nullopt };
  person p2 { "Robert", "Balboa", "Rocky" };
}

In the first instance, I have used “std::nullopt”, which represents an std::optional<T> with no value (i.e. an “absence of value”).

In the second case, I am implicitly invoking the std::optional<T> constructor that receives an actual value.

The verbose alternative would be:

int main()
{
    person p1 { "John", "Doe", std::optional<std::string> { } };
    person p2 { "Robert", "Balboa", std::optional<std::string> {"Rocky"} };
}

The parameterless constructor represents an absence of value (std::nullopt) and the other constructor represents an instance storing an actual value.

Next I will overload the operator<< to work with my struct person, keeping in mind that if the person has a nickname, I want to print it out.

This could be a valid implementation:

std::ostream& operator<<(std::ostream& os, const person& p)
{
    os << p.last_name << ", " << p.first_name;
    
    if (p.nickname.has_value())
    {
        os << " (" << p.nickname.value() << ")";
    }
    
    return os;
}

The has_value() method returns true if the optional<T> instance is storing an actual value. The value can be retrieved using the value() method.

There is also an explicit operator bool that does the same thing as the has_value() method: it checks whether the instance stores an actual value.

There are also overloads of operator* and operator-> to access the stored value.

So, a less verbose implementation of my operator<< shown above would be:

std::ostream& operator<<(std::ostream& os, const person& p)
{
    os << p.last_name << ", " << p.first_name;
    
    if (p.nickname)
    {
        os << " (" << *(p.nickname) << ")";
    }
    
    return os;
}

Another way to retrieve the stored value, OR return an alternative value, is the “value_or()” method.

void print_nickname(const person& p)
{
    std::cout << p.first_name << " " << p.last_name << "'s nickname: "
              << p.nickname.value_or("[no nickname]") << "\n";
}

For this example, if the nickname variable stores an actual value, it will return it, otherwise, the value returned will be, as I coded: “[no nickname]”.

What will happen if I call optional<T>::value() when no value is actually stored? A std::bad_optional_access exception will be thrown:

try
{
    std::optional<int> op {};
    std::cout << op.value() << "\n";
}
catch (const std::bad_optional_access& e)
{
    std::cerr << e.what() << "\n";
}

Notice I have used the value() method instead of operator*. When I use operator* instead of value(), the exception is not thrown and the user runs into undefined behavior.

So, use std::optional<T> in these scenarios:

  • You have some attributes or function arguments that may have no value and are, therefore, optional. std::optional<T> makes that decision explicit at the code level.
  • You have functions that may OR may not return something. For example, what will be the minimum integer found in an empty list of ints? So instead of returning an int with its minimum value (std::numeric_limits<int>::min()), it would be more accurate to return an std::optional<int>.
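The second scenario could be sketched like this; min_value is a hypothetical name, and the function simply reports “no result” for an empty list instead of inventing a sentinel value:

```cpp
#include <algorithm>
#include <optional>
#include <vector>

// Returns the smallest element, or std::nullopt when the list is empty.
std::optional<int> min_value(const std::vector<int>& values)
{
    if (values.empty())
    {
        return std::nullopt;
    }

    return *std::min_element(values.begin(), values.end());
}
```

The caller then decides what an empty list means, for example with min_value(numbers).value_or(0).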

Note that std::optional<T> does not support reference types (i.e. std::optional<T&>), so if you want to store an optional reference, you probably want to use a std::reference_wrapper<T> instead of type T (i.e. std::optional<std::reference_wrapper<T>>).
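A minimal sketch of that combination, where find_by_initial is a hypothetical helper: it returns an optional reference to an element of the vector, so the caller can modify the original string through it.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Returns a reference to the first name starting with `initial`, if any.
std::optional<std::reference_wrapper<std::string>>
find_by_initial(std::vector<std::string>& names, char initial)
{
    for (auto& name : names)
    {
        if (!name.empty() && name.front() == initial)
        {
            return std::ref(name);
        }
    }

    return std::nullopt;
}
```

To reach the referenced object, call get() on the wrapper, for example found->get() = "Andy";.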

C++17: std::any

When trying to implement something that will store a value of an unknown data type (to be as generic as possible, for example), we had these possibilities before C++17:

  • Having a void* pointer to something that will be assigned at runtime. The problem with this approach is that it leaves all responsibility for managing the lifetime of the data pointed to by this void pointer to the programmer. Very error prone.
  • Having a union with a limited set of data types available. We can still use this approach with C++17 std::variant.
  • Having a base class (e.g. Object) and storing pointers to instances of classes derived from it (à la Java).
  • Having an instance of template typename T (for example). Nice approach, but to make it useful and generic, we need to propagate the typename T throughout the generic code that will use ours. Probably verbose.

So, let’s welcome std::any.

std::any, as you may have guessed, is a class shipped in C++17 and declared in the header <any> that can store a value of any copy-constructible type, so these lines are completely valid:

std::any a = 123;
std::any b = "Hello";
std::any c = std::vector<int>{10, 20, 30};

Obviously, this is C++ and you as user need to know the data type of what you stored in an instance of std::any, so, to retrieve the stored value you have to use std::any_cast<T> as in this code:

#include <any>
#include <iostream>

int main()
{
    std::any number = 150;
    std::cout << std::any_cast<int>(number) << "\n";
}   

If you try to cast the value stored in an instance of std::any to anything but the actual type, a std::bad_any_cast exception is thrown. For example, if you try to cast that number to a string, you will get this runtime error:

terminate called after throwing an instance of 'std::bad_any_cast'
  what():  bad any_cast
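To avoid terminating the program, the exception can be caught like any other. A small sketch, where as_text is a hypothetical helper of my own:

```cpp
#include <any>
#include <string>

// Tries to read the stored value as a std::string and reports failure as text.
std::string as_text(const std::any& value)
{
    try
    {
        return std::any_cast<std::string>(value);
    }
    catch (const std::bad_any_cast&)
    {
        return "[not a std::string]";
    }
}
```

Keep in mind that a std::any initialized from a string literal such as "Hello" stores a const char*, not a std::string, so as_text reports a failure for it too.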

If the value stored in an instance of std::any is an instance of a class or struct, std::any ensures that the destructor of that value is invoked when the std::any instance goes out of scope.

Another really nice thing about std::any is that you can replace the existing value stored in an instance of it, with another value of any other type, for example:

std::any content = 125;
std::cout << std::any_cast<int>(content) << "\n";

content = std::string{"Hello world"};
std::cout << std::any_cast<std::string>(content) << "\n";

About lifetimes

Let’s consider this class:

struct A
{
  int n;
  A(int n) : n{n} { std::cout << "Constructor\n"; }
  ~A() { std::cout << "Destructor\n"; }
  A(A&& a) : n{a.n} { std::cout << "Move constructor\n"; }
  A(const A& a) : n{a.n} { std::cout << "Copy constructor\n"; }
  void print() const { std::cout << n << "\n"; }
};

This class stores an int, and prints it out with “print”. I wrote constructor, copy constructor, move constructor and destructor with logs telling me when the object will be created, copied, moved or destroyed.

So, let’s create a std::any instance with an instance of this class:

std::any some = A{4516};

This will be the output of such code:

Constructor
Move constructor
Destructor
Destructor

Why are two constructors and two destructors invoked if I only created one instance?

Because the instance of std::any stores a copy (ok, in this case a “moved version”) of the original object I created, and while in my example that is trivial, for a complex object it may not be.

How to avoid this problem?

Using std::make_any.

std::make_any is very similar to std::make_shared in the way it will take care of creating the object instead of copying/moving ours. The parameters passed to std::make_any are the ones you would pass to the object’s constructor.

So, I can modify my code to this:

auto some = std::make_any<A>(4517);

And the output will be:

Constructor
Destructor

Now, I want to invoke the method “print”:

auto some = std::make_any<A>(4517);
std::any_cast<A>(some).print();

And when I do that, the output is:

Constructor
Copy constructor
4517
Destructor
Destructor

Why was that extra copy created?

Because std::any_cast<A> returns a copy of the stored object. If I want to avoid the copy and use a reference, I need to explicitly ask for a reference in std::any_cast, something like:

auto some = std::make_any<A>(4517);
std::any_cast<A&>(some).print();

And the output will be:

Constructor
4517
Destructor

It is also possible to use std::any_cast<T> passing a pointer to an instance of std::any instead of a reference.

In such case, if the cast is possible, it returns a valid pointer to the stored T; otherwise it returns nullptr. For example:

auto some = std::make_any<A>(4517);
std::any_cast<A>(&some)->print();
std::cout << std::any_cast<int>(&some) << "\n";

In this case, notice that I am passing a pointer to “some” instead of a reference. When this occurs, the implementation returns a pointer to the stored object if it is of the requested type (as in the second line) or a null pointer if not (as in the third line, where I am trying to cast my object of type A to int). Using this pointer overload avoids throwing an exception and allows you to check whether the returned pointer is null.
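That null check makes it easy to probe for several types in sequence without any exception handling. A minimal sketch, where describe_any is a hypothetical function of my own:

```cpp
#include <any>
#include <string>

// Probes the stored type with the pointer overload of std::any_cast,
// which returns nullptr instead of throwing when the type does not match.
std::string describe_any(const std::any& value)
{
    if (const int* number = std::any_cast<int>(&value))
    {
        return "int: " + std::to_string(*number);
    }

    if (const std::string* text = std::any_cast<std::string>(&value))
    {
        return "string: " + *text;
    }

    return "unknown type";
}
```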

std::any is a very good tool for storing things that we, as implementers of something reusable, do not know a priori; it could be used to store, for example, additional parameters passed to threads, objects of any type stored as extra information in UI widgets (similar to the Tag property in Windows.Forms.Control in .NET, for example), etc.

Performance-wise, std::any may need to store its contents on the heap and also needs to do some extra verification to return the value only if the cast is valid, so it is not as fast as having an object whose type is known at compile time. Where the value is actually stored depends on the library implementation: some of them (gcc’s standard library, for example) store small objects locally inside the std::any instance [thanks TheFlameFire].

C++20: Concepts, an introduction

I am pretty new to C++ Concepts, so I will post here the things I learn while starting to use them.

C++ Concepts are one of these three large features that are shipped with C++20:

  • Concepts
  • Ranges
  • Modules

Basically, C++ Concepts define a set of conditions or constraints that a data type must fulfill in order to be used as a template argument.

For example, I would want to create a function that sums two values and prints the result. In C++17 and older I would code something like this:

template <typename A, typename B>
void sum_and_print(const A& a, const B& b)
{
    std::cout << (a + b) << "\n";
}

And it works properly for types A and B that DO have the operator+ available. If the types I am using do not have operator+, the compiler naïvely will try to substitute types A and B for the actual types and when trying to use the missing operator on them, it will fail miserably.

The way the compiler works is correct, but failing during the actual substitution with no earlier verification is reactive behavior rather than proactive. As a result, the error messages produced by substitution failures are pretty long and hard to read and understand.

C++20 Concepts provide a mechanism to make explicit the requirements that, in my example, types A and B would need to fulfill in order to be allowed to use the “sum_and_print” function template. So when available, the compiler will check that those requirements are fulfilled BEFORE starting the actual substitution.

So, let’s start with the obvious one: I will code a concept that mandates that all types that will honor it will have operator+ implemented. It is defined in this way:

template <typename T, typename U = T>
concept Sumable =
 requires(T a, U b)
 {
    { a + b };
    { b + a };
 };

The new keyword concept is used to define a C++ Concept. It is defined as a template because the concept will be evaluated against the type or types that are used as template arguments here (in my case, T and U).

I named my concept “Sumable” and after the “=” sign, the compiler expects a predicate that needs to be evaluated on compile time. For example, if I would want to create a concept to restrict the types to be only “int” or “double”, I could define it as:

template <typename T>
concept SumableOnlyForIntsAndDoubles = std::is_same<T, int>::value || std::is_same<T, double>::value;

The type trait “std::is_same<T, U>” can be used here to create the constraint.

Back to my first example, I need that operator+ will be implemented in types A and B, so I need to specify a set of requirements for that constraint. The new keyword “requires” is used for that purpose.

So, any definition between braces in the requires block (actually “requires” always introduces a block, even when only one requirement is specified) is something the types being evaluated must fulfill. In my case, “a+b” and “b+a” must be valid operations. If types T or U do not implement operator+, the requirements will not be fulfilled and thus the compiler will stop before even trying to substitute A and B for actual types.

So, with such implementation, my function “sum_and_print” works like a charm for ints, doubles, floats and strings!

But, what if I have another type like this one:

struct N
{
    int value;

    N operator+(const N& n) const
    {
        return { value + n.value };
    }
};

Though it implements operator+, it does not implement operator<< needed to work with std::cout.

To add such constraint, I need to add an extra requirement to my concept. So, it could be like this one:

template <typename T, typename U = T>
concept Sumable =
 requires(T a, U b)
 {
    { a + b };
    { b + a };
 }
 && requires(std::ostream& os, const T& a)
 {
     { os << a };
 };

The operator && is used here to specify that those requirements need to be fulfilled: Having operator+ AND being able to do “os << a“.

If my types do not fulfill such requirements, I get an error like this in gcc:

<source>:16:5:   in requirements with 'std::ostream& os', 'const T& a' [with T = N]
<source>:18:11: note: the required expression '(os << a)' is invalid
   18 |      { os << a };
      |        ~~~^~~~

That, though it looks complicated, is far easier to read than the messages that the compiler produces when type substitution errors occur.

So, if I want to have my code working properly, I need to add an operator<< overloaded for my type N, having finally something like this:

#include <iostream>

template <typename T, typename U = T>
concept Sumable =
 requires(T a, U b)
 {
    { a + b };
    { b + a };
 }
 && requires(std::ostream& os, const T& a)
 {
     { os << a };
 };

template <Sumable A, Sumable B>
void sum_and_print(const A& a, const B& b)
{
    std::cout << (a + b) << "\n";
}

struct N
{
    int value;

    N operator+(const N& n) const
    {
        return { value + n.value };
    }
};

std::ostream& operator<<(std::ostream& os, const N& n)
{
    os << n.value;
    return os;
}

int main()
{
    sum_and_print( N{6}, N{7});
}

Notice that in my “sum_and_print” function template I am writing “template <Sumable A, Sumable B>” instead of the former “template <typename A, typename B>“. This is the way I ask the compiler to validate such type arguments against the “Sumable” concept.


What if I wanted to have several “greeters” implemented in several languages and a function “greet” that uses my greeter to say “hi”? Something like this:

template <Greeter G>
void greet(G greeter)
{
    greeter.say_hi();
}

As you can see, I want my greeters to have a method “say_hi“. Thus, the concept could be defined like this one in order to mandate the type G to have the method say_hi() implemented:

template <typename G>
concept Greeter = requires(G g)
{
    { g.say_hi() } -> std::convertible_to<void>;
};

With such concept in place, my implementation would be like this one:

#include <concepts>
#include <iostream>

template <typename G>
concept Greeter = requires(G g)
{
    { g.say_hi() } -> std::convertible_to<void>;
};

struct spanish_greeter
{
    void say_hi() { std::cout << "Hola amigos\n"; }
};

struct english_greeter
{
    void say_hi() { std::cout << "Hello my friends\n"; }
};


template <Greeter G>
void greet(G greeter)
{
    greeter.say_hi();
}


int main()
{
    greet(spanish_greeter{});
    greet(english_greeter{});
}

Why would I want to use concepts instead of, say, base classes? Because:

  1. While using concepts, you do not need to use base classes, inheritance, virtual and pure virtual methods and all that OO stuff only to fulfill a contract on probably unrelated stuff, you simply need to fulfill the requirements the concept defines and that’s it (Interface Segregation of SOLID principles would work nice here, anyway, where your concepts define the minimum needed possible constraints for your types).
  2. Concepts are a “Zero-cost abstraction” because their validation is performed completely at compile-time, and, if properly verified and accepted, the compiler does not generate any code related to this verification, contrary to the runtime overhead needed to run virtual things in an object-oriented approach. This means: Smaller binaries, smaller memory footprint and better performance!

I tested this stuff using gcc 10.2 and it works like a charm.

Deleaker, part 0: Intro

I am testing this nice desktop tool called “Softanics Deleaker” (https://www.deleaker.com/). It was written by Artem Razin and, as you can deduce by its name, it is an application that helps the programmers to find memory leaks in C++, Delphi and .NET applications.

Starting with this post, I will publish several blog entries about my experiences using it and the features it exposes.

I installed it and installed the Visual Studio extension that ships with the installer. For my tests, I am using Visual Studio 2019 16.4 preview.

In Visual Studio I created a C++ console application and wrote this very simple and correct application:

#include <iostream>

int main()
{
    std::cout << "Hello World!\n";
    return 0;
}

When I run the local debugger, and since I have installed the Deleaker VS extension, Deleaker will load all libraries and symbols of my application and will open a window similar to this one:

I still do not know what all those options mean, but the important thing here is the “No leaks found” message. The filter containing the “266 hidden” items refers to known leaks that Deleaker knows that exist in the Microsoft C Runtime Library.

Now, I will create another very small program, this time containing a small memory leak:

#include <iostream>

int main()
{
    for (int i = 0; i < 10; i++)
    {
        char* s = new char{'a'};
        std::cout << *s << "\n";
    }

    return 0;
}

As you can see, I am dynamically allocating one byte to contain a character and I am forgetting to delete it. When I debug it, I get this interesting Deleaker window:

Now Deleaker detected my forgotten allocation and says that to me: “ConsoleApplication2.exe!main Line 7”.

As you can see, the “Hit Count” says that the allocation occurred 10 times (because my loop) and it says that 370 bytes leaked on this problem. Though that seems weird because I allocated only 1 byte 10 times, the 370 bytes appear because I compiled my code in Debug Mode and the compiler adds a lot of extra info per allocation. When I changed my compilation to Release Mode, I got the actual 10 bytes in the Size column.

When you click into the information table in the row containing the memory leak information, the Visual Studio editor highlights that line and moves the caret to such position (the new char{‘a’} line , so you realize where you allocated memory that was not released.
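One possible way to remove this kind of leak is to let a smart pointer own the allocation. The block below is only a sketch: print_chars is a hypothetical helper that is not part of the original program, and I have not run it through Deleaker, so treat the "no leak" outcome as an expectation rather than a measured result.

```cpp
#include <iostream>
#include <memory>
#include <string>

// Hypothetical helper (not from the original program): allocates one char per
// iteration, like the leaking loop, but lets std::unique_ptr release each one.
std::string print_chars(int count)
{
    std::string printed;
    for (int i = 0; i < count; i++)
    {
        // The unique_ptr deletes the char when it goes out of scope
        // at the end of each iteration, so no explicit delete is needed.
        auto s = std::make_unique<char>('a');
        std::cout << *s << "\n";
        printed += *s;
    }
    return printed;
}
```

Calling print_chars(10) from main() would print the same ten lines as the leaking version. A plain delete s; at the end of the loop body would also work, but the smart pointer cannot be forgotten.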

And that is it for now.

In the next blog entries I will explore how to “Deleak” not-so-obvious things, and how Deleaker behaves with shared pointers, COM objects, shared libraries, templates, virtual destructors and so on :)

Happy 2020!

 

C++ “Hello world”

Ok, the most famous first program in any programming language is the ‘Hello World’ program, so I will explain how to create one in this post.

For my example, I will use ‘g++’ in a Linux environment, but ‘clang++’ works exactly the same way.

To write a ‘Hello World’ program in C++, you will need to create an empty file and name it with an appropriate extension (any name will do, for instance, HelloWorld.cpp). Common C++ file extensions include ‘.cpp’, ‘.cxx’, and ‘.cc’.

The compiler does not require the filename to match the name of any ‘class’ or other content inside the file. You can also store the file in any folder; there is no need to create it in a specific directory containing all the elements of a ‘package’ (like in Java).

Once you have created an empty HelloWorld.cpp file, you can open it in any text editor and start writing the following lines of code:

#include <iostream>
 
int main()
{
  std::cout << "Hello world\n";
}

Save the file, open a terminal, navigate to the folder where your file is located, and then enter the following command:

g++ HelloWorld.cpp -o HelloWorld

If you do not receive any messages after entering the command, congratulations! Your program has compiled successfully. Otherwise, there is an error in your code that you will need to fix before compiling again.

Once it compiles correctly, you will need to run the program. In a Linux/Unix environment, you do this by typing ./ followed by the program’s name:

./HelloWorld

And the program output should be:

Hello world

Understanding how all of this works

The C++ compilation process consists of three main steps:

  • Preprocessing: This step involves the preprocessor, which performs various text substitutions and transformations on your code before it is compiled.
  • Compilation: During this phase, your code is converted into machine code, with placeholders for calls to functions that reside in external libraries.
  • Linking: This step resolves those function calls by linking them to the actual functions in the libraries your program uses. If you do not specify any additional libraries (as in our example), your program will only be linked to the Standard Library, which comes with any C++ compiler.

#include

#include <iostream>

All lines starting with ‘#’ are called ‘preprocessor directives’. These are instructions that the preprocessor recognizes and executes.

#include tells the preprocessor to locate the file specified either inside quotes or between angle brackets and insert its content where the #include directive is used.

If the filename is enclosed in angle brackets (as in our case), the preprocessor searches for the file in a predefined directory that the compiler is aware of. For example, it will look for the file iostream in that directory. In a Linux environment, these files are typically located in a path similar to this one (I am using g++ 8.2):

/usr/include/c++/8

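If the filename is enclosed in double quotes, the preprocessor first looks for the file relative to the folder of the file that contains the #include, then in folders explicitly specified when compiling the program (for example, with the -I option of g++), and finally in the predefined directories used for angle brackets.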

iostream is the file that contains a lot of code allowing our programs to handle input and output. In our ‘Hello World’, we will need std::cout, which is defined in this file.

main function

int main()

When you invoke your program, the operating system needs to know which piece of code to execute. This code resides in the main function.

Every function declares a return type. For example, if you call a function sum that adds two integers, it must return a value containing the result, so sum returns an integer value (an int). A function that returns nothing is declared with the return type void. Some old compilers allowed the main() function to return void, but the C++ standard specifies that main() must return an int value.

However, even if main() is declared to return an int, if you do not explicitly return anything, the compiler will not complain and will automatically return 0. Note that this behavior is exceptional and only allowed for the main() function.

The return value of the main() function indicates whether an error occurred. A return value of 0 means the program executed without errors, while a non-zero value indicates an error. The specific non-zero value depends entirely on the programmer’s design and error-handling mechanisms.
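As a small sketch of this convention, the helper below computes the value that main() would return. The name run and the "at least one argument" rule are assumptions made up for this illustration; EXIT_SUCCESS and EXIT_FAILURE are the standard macros from <cstdlib>.

```cpp
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical helper: returns the value that main() would hand back
// to the operating system.
int run(const std::vector<std::string>& args)
{
    if (args.empty())
    {
        // Report the problem and signal failure with a non-zero value.
        std::cerr << "error: expected at least one argument\n";
        return EXIT_FAILURE;
    }

    std::cout << "Hello " << args.front() << "\n";
    return EXIT_SUCCESS; // success, same meaning as returning 0
}
```

A main(int argc, char* argv[]) could then end with return run({argv + 1, argv + argc}); and, in a Linux shell, echo $? right after running the program would show the returned value.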

The program will continue running as long as the main() function is executing. Once its execution ends, the program terminates and returns the value to the operating system.

The body of any function is enclosed in curly braces.

std::cout

std::cout << "Hello world\n";

std::cout is a pre-existing object that represents command line output. The << operator essentially sends the text "Hello world\n" to the std::cout object, resulting in that text being displayed in the terminal.

The \n character sequence indicates a newline.
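Since each << returns the stream it wrote to, several values can be chained in one statement, and the same code works with any std::ostream, not only std::cout. A minimal sketch follows; greet is a hypothetical helper introduced just for this example.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes a greeting to any output stream.
// Each << returns the stream, which is why the calls can be chained.
void greet(std::ostream& out, const std::string& name, int year)
{
    out << "Hello " << name << ", welcome to " << year << "\n";
}
```

Calling greet(std::cout, "world", 2020); prints the line in the terminal, while passing a std::ostringstream collects the same text in memory.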

g++

g++ and clang++ are the most popular C++ compilers for Unix platforms today. You can replace one with the other almost without restrictions because clang++ was designed to be a drop-in replacement for g++.

When you say something like:

g++ HelloWorld.cpp

You are instructing the g++ compiler to go through the entire compilation process for the file HelloWorld.cpp. ‘Go through the entire compilation process’ in this case means running the preprocessor on the file, compiling it, linking it, and producing an executable.

Since I did not specify the name of the executable file in the command line example above, the g++ command generates a file called a.out in the current folder.

To specify the name of the file to be generated, you must invoke g++ with the -o option followed by the name of the executable file you want to create.

C++17: std::variant

Let’s suppose I have a system that handles students, teachers and crew of a school.

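To model that in an object-oriented style, I would have a class hierarchy similar to this one:

#include <iostream>
#include <map>
#include <memory>
#include <string>
#include <utility>

using namespace std; // the snippets in this post use unqualified standard names

class person
{
    std::string name;

public:
    template <typename String>
    person(String&& name) : name { forward<String>(name) }
    {
    }

    virtual ~person() { }
    const string& get_name() const { return name; }
    virtual void do_something() = 0;
};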

class student : public person
{
public:
    using person::person;

    void do_homework()
    {
        cout << "Need access to Stack Overflow\n";
    }

    void do_something() override
    {
        cout << "I am doing something the students do\n";
    }
};

class teacher : public person
{
public:
    using person::person;

    void teach()
    {
        cout << "This is the unique truth\n";
    }

    void do_something() override
    {
        cout << "I am doing something the teachers do\n";
    }
};

class crew : public person
{
public:
    using person::person;

    void help_team()
    {
        cout << "I am helping teachers and students\n";
    }

    void do_something() override
    {
        cout << "I am doing something crew do\n";
    }
};

And my collection would be defined like this:

map<size_t, person*> people;

where the size_t ID would be the key of the map.

Since I do not want to deal with raw pointers, this would be a better definition:

map<size_t, unique_ptr<person>> people;

Now, I will insert some elements into my collection:

people.insert(make_pair(14, make_unique<student>("Phil Collins")));
people.insert(make_pair(25, make_unique<teacher>("Peter Gabriel")));
people.insert(make_pair(32, make_unique<crew>("Justin Bieber")));

To get the name of person 14, I should do something like:

people.find(14)->second->get_name(); //being 100% sure that person with ID 14 exists
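When I am not 100% sure that the ID exists, the iterator returned by find must be compared against end() before it is dereferenced. Here is a minimal sketch of that check. To keep it short it uses a simplified map from ID to name instead of the class hierarchy, and name_or is a hypothetical helper.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper over a simplified map (ID -> name): returns the stored
// name, or the fallback when the ID is not present.
std::string name_or(const std::map<std::size_t, std::string>& people,
                    std::size_t id,
                    const std::string& fallback)
{
    auto it = people.find(id);
    if (it == people.end())
    {
        // Dereferencing end() would be undefined behavior, so bail out first.
        return fallback;
    }
    return it->second;
}
```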

And to do something specific implemented in a derived class, I need to downcast:

static_cast<crew&>(*people.find(32)->second).help_team();

Since C++11, the language has been evolving toward a more generic, template-metaprogramming-friendly style and moving away from classical OOP design, where inheritance and polymorphism are among the most important tools.

So, how could I implement something similar to the thing shown above without inheritance and polymorphism?

Let me introduce std::variant ! :)

C++17 introduced std::variant, which is basically a class template where you specify the possible types of the values that a variant instance can store. For my example, I could define something like:

using person = std::variant<student, teacher, crew>;

In this line, I am defining an alias person that represents a variant value that can store a student, a teacher or a crew (think of variant as something like a type-safe union).

So, my map would be defined in this way:

map<size_t, person> people;

And my classes student, teacher, and crew could be defined as follows:

class student
{
    std::string name;

public:
    template <typename String>
    student(String&& name) : name { forward<String>(name) }
    {
    }

    const string& get_name() const { return name; }

    void do_homework()
    {
        cout << "Need access to Stack Overflow\n";
    }

    void do_something()
    {
        cout << "I am doing something the students do\n";
    }
};

class teacher
{
    std::string name;

public:
    template <typename String>
    teacher(String&& name) : name { forward<String>(name) }
    {
    }

    const string& get_name() const { return name; }

    void teach()
    {
        cout << "This is the unique truth\n";
    }

    void do_something()
    {
        cout << "I am doing something the teachers do\n";
    }
};

class crew
{
    std::string name;

public:
    template <typename String>
    crew(String&& name) : name { forward<String>(name) }
    {
    }

    const string& get_name() const { return name; }

    void help_team()
    {
        cout << "I am helping teachers and students\n";
    }

    void do_something()
    {
        cout << "I am doing something crew do\n";
    }
};

To keep my example clean and to demonstrate that I do not need inheritance and polymorphism, notice that I am defining neither a base class nor any virtual methods. In real production code, though, one could create a base class with no virtual methods and inherit from it to avoid code duplication.

Notice also that I am not using any pointers (raw or smart), so the map contains actual values, which removes one level of indirection and lets the compiler optimize based on that knowledge.

So, let me add some objects to the map:

people.insert(make_pair(14, student { "Phil Collins" }));
people.insert(make_pair(25, teacher { "Peter Gabriel" }));
people.insert(make_pair(32, crew { "Justin Bieber" }));

To get the person with id 14:

auto& the_variant = people.find(14)->second;

To get the “student” inside that variant object, I need to use the function std::get, specifying the type I expect:

auto& the_student = get<student>(the_variant);
cout << the_student.get_name() << "\n";

If I try to get an object whose type is not the one currently stored in the variant, std::get throws a std::bad_variant_access exception. For example, if I try this with the variant from the example above:

auto& the_teacher = get<teacher>(the_variant); // throws: the variant holds a student
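When the stored type is not known in advance, std::get_if offers a non-throwing alternative: it takes a pointer to the variant and returns nullptr on a type mismatch. The sketch below contrasts both behaviors. It uses simplified student and teacher structs rather than the classes from this post, and the two helper functions are hypothetical names chosen for the illustration.

```cpp
#include <string>
#include <variant>

// Simplified stand-ins for the classes in this post.
struct student { std::string name; };
struct teacher { std::string name; };
using person = std::variant<student, teacher>;

// Returns true when std::get<teacher> throws for this variant.
bool get_teacher_throws(const person& p)
{
    try
    {
        (void)std::get<teacher>(p);
        return false;
    }
    catch (const std::bad_variant_access&)
    {
        return true;
    }
}

// Non-throwing access: std::get_if yields nullptr when the variant
// currently holds a different type.
std::string teacher_name_or(const person& p, const std::string& fallback)
{
    if (const teacher* t = std::get_if<teacher>(&p))
    {
        return t->name;
    }
    return fallback;
}
```

std::holds_alternative<teacher>(p) is another option when only the yes/no answer is needed.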

To execute a specific method of a given class, I do not need to do any downcasting because I already have the object of the given type, so, instead of:

static_cast<crew&>(*people.find(32)->second).help_team();

I would do:

get<crew>(people.find(32)->second).help_team();

which is far more straightforward and cleaner.

Now, given that I have a method called “do_something” in all my classes, I want to be able to invoke it no matter the type of the object stored in the variant.

In the polymorphic world, I would do something like this:

for (auto& p : people)
{
    p.second->do_something();
}

With variants, there is a function for this called std::visit.

What visit does is access the variant object and invoke the callable passed as its first argument with the object stored in the variant. So, given my example, I could do something like:

auto& the_variant = people.find(14)->second;
visit([](auto& s)
{
    s.do_something();
}, the_variant);

The magic is in the “auto” part here. When you “visit” a variant with a generic lambda, the compiler generates one version of the lambda body for each type specified in the variant declaration, in my case 3 (one for student, one for crew and one for teacher), and executes the matching one depending on the type of the value stored in the variant. So, to execute do_something() for all objects in the map, I need to do something like:

for (auto& p : people)
{
    visit([](auto& s)
    {
        s.do_something();
    }, p.second);
}

It is beautiful, isn’t it? Polymorphic-like behavior without virtual functions, inheritance, or a heap allocation per object.
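std::visit can also return a value, as long as the visitor returns the same type for every alternative. The following sketch shows that, plus one way to run type-specific logic inside a generic lambda with if-free type checks. The structs are simplified stand-ins for the classes in this post (their do_something returns a string instead of printing), and do_all and is_teacher are hypothetical helpers.

```cpp
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Simplified stand-ins: do_something returns text instead of printing it.
struct student { std::string do_something() const { return "student work"; } };
struct teacher { std::string do_something() const { return "teacher work"; } };
struct crew    { std::string do_something() const { return "crew work"; } };
using person = std::variant<student, teacher, crew>;

// Visits every variant and collects what each stored object returns.
std::vector<std::string> do_all(const std::vector<person>& people)
{
    std::vector<std::string> results;
    for (const auto& p : people)
    {
        results.push_back(
            std::visit([](const auto& s) { return s.do_something(); }, p));
    }
    return results;
}

// Inside the generic lambda, the concrete type is available at compile time.
bool is_teacher(const person& p)
{
    return std::visit([](const auto& s)
    {
        using T = std::decay_t<decltype(s)>;
        return std::is_same_v<T, teacher>;
    }, p);
}
```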

C++: “auto” return type deduction

Before C++14, when implementing a function template whose return type depended on its template arguments, programmers could not simply name that type and had to do something like this:

template <typename A, typename B>
auto do_something(const A& a, const B& b) -> decltype(a.do_something(b))
{
  return a.do_something(b);
}

Programmers had to use decltype to tell the compiler: “The return type of this method is the return type of the do_something method of object a.” The auto keyword was used to inform the compiler: “The return type of this function is declared at the end.”

Since C++14, coders can do something much simpler:

template <typename A, typename B>
auto do_something(const A& a, const B& b)
{
  return a.do_something(b);
}

Starting with C++14, the compiler deduces the return type of functions that use auto as the return type.
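To see the deduction at work, here is a small sketch that needs C++14 or later. The function add is a hypothetical example, and the static_assert lines check at compile time which type was deduced for each call.

```cpp
#include <string>
#include <type_traits>

// The return type is deduced from the expression a + b.
template <typename A, typename B>
auto add(const A& a, const B& b)
{
    return a + b;
}

// int + int deduces int, while int + double deduces double.
static_assert(std::is_same<decltype(add(1, 2)), int>::value,
              "int + int should deduce int");
static_assert(std::is_same<decltype(add(1, 2.5)), double>::value,
              "int + double should deduce double");
```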

Restrictions:

All return statements must deduce the same type. The example below will not compile because it returns an int in one branch and a double in the other:

auto f(int n)
{
    if (n == 1)
    {
        return 1;
    }

    return 2.0;
}
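Two possible ways around this restriction are sketched below. Either state the return type explicitly, so no deduction happens and the int is converted, or make every return statement produce the same type. The function names are only illustrative.

```cpp
#include <type_traits>

// Option 1: an explicit (trailing) return type. Nothing is deduced,
// so the int literal is implicitly converted to double.
auto f_explicit(int n) -> double
{
    if (n == 1)
    {
        return 1;
    }

    return 2.0;
}

// Option 2: keep the deduction, but return the same type everywhere.
auto f_deduced(int n)
{
    if (n == 1)
    {
        return 1.0;
    }

    return 2.0;
}

static_assert(std::is_same<decltype(f_deduced(0)), double>::value,
              "both return statements deduce double");
```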

For recursive functions, a return statement that does not depend on the recursive call must come first, so the compiler can deduce the return type before it reaches the recursive call, as in this example:

auto accumulator(int n)
{
    if (n == 0)
    {
        return 0;
    }

    return n + accumulator(n - 1);
}

Starting with C++20, a function can be declared like this and it will work properly:

auto do_something(const auto& a, const auto& b)
{
    return a.do_something(b);
}

When programmers define functions this way, that is, with one or more parameters declared as auto, the entire function is treated as a function template (the standard calls this an abbreviated function template). So, while this new construction might seem to add more “functionality” to the auto keyword, it is really just a more convenient way of declaring function templates.
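The equivalence can be sketched by writing both forms side by side. Scaler and do_something_abbreviated are hypothetical names used only for this illustration, and the abbreviated form is guarded so the block still compiles when the compiler is not in C++20 mode.

```cpp
// Hypothetical type with a do_something member, used to call the functions.
struct Scaler
{
    int factor;
    int do_something(int x) const { return factor * x; }
};

// Classic function template (C++14 return type deduction).
template <typename A, typename B>
auto do_something(const A& a, const B& b)
{
    return a.do_something(b);
}

#if __cplusplus >= 202002L
// C++20 abbreviated function template: each auto parameter introduces
// an invented template parameter, so this behaves like the version above.
auto do_something_abbreviated(const auto& a, const auto& b)
{
    return a.do_something(b);
}
#endif
```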

More posts about auto: