C++17: Structured bindings

When you work with a value of a compound type, you usually want to extract its internal fields in order to work with them.

When tuples were introduced (my post about tuples), the way to do this looked like this:

std::tuple<int, std::string, std::string> person { 20050316, "Isabel", "Pantoja" };
int birth_date = std::get<0>(person);
std::string& first_name = std::get<1>(person);
std::string& last_name = std::get<2>(person);

In this way, you were able to access any elements inside the tuple. Quite useful, but quite verbose as well.

std::tie can make this code much easier to read:

std::tuple<int, std::string, std::string> person { 20050316, "Isabel", "Pantoja" };

int birth_date;
std::string first_name;
std::string last_name;

std::tie(birth_date, first_name, last_name) = person;

The magic of std::tie here is that it builds a tuple of non-const lvalue references to the variables passed to it; assigning person to that tuple copies each element of person into the corresponding variable.

C++17 introduced “structured bindings”, a clearer and more elegant way of doing this without getting lost in what std::tie or std::get<N> do.

Structured bindings, as the name suggests, bind a set of names to the corresponding elements of a compound object. Currently these types are supported:

  • std::tuple<T...>
  • std::pair<T, U>
  • std::array<T, N> and C-style arrays
  • Structs whose non-static data members are all public!

How does it work?

The example above could be re-written like this:

std::tuple<int, std::string, std::string> person { 20050316, "Isabel", "Pantoja" };
auto& [birth_date, first_name, last_name] = person;

So, the compiler will “decompose” person into its elements and bind birth_date to the first element of person, first_name to the second and last_name to the third.

Notice that birth_date, first_name, and last_name refer directly to the elements of person because of auto&. If I had used auto instead, they would refer to the elements of a copy of person.
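The difference between auto and auto& is easy to miss, so here is a minimal sketch of it. The alias Person and the helpers bump_copy and bump_in_place are hypothetical names introduced only for this illustration.

```cpp
#include <string>
#include <tuple>

using Person = std::tuple<int, std::string, std::string>;

// Binds by value: the three names refer to a hidden copy of `person`.
int bump_copy(Person& person)
{
    auto [birth_date, first_name, last_name] = person;
    birth_date += 1;               // modifies the copy only
    return std::get<0>(person);    // the original is unchanged
}

// Binds by reference: the three names refer to the elements of `person`.
int bump_in_place(Person& person)
{
    auto& [birth_date, first_name, last_name] = person;
    birth_date += 1;               // modifies the original tuple
    return std::get<0>(person);
}
```

Calling bump_copy leaves the tuple untouched, while bump_in_place changes its first element.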

And you can do the same thing with arrays:

std::array<int, 4> values {{ 8, 5, 2, 9 }};
auto [v0, v1, v2, v3] = values;
std::cout << "Sum: " << (v0 + v1 + v2 + v3) << "\n";

int values2[] = { 5, 4, 3 };
auto [i0, i1, i2] = values2;
std::cout << i0 << "; " << i1 << "; " << i2 << "\n";

With std::pair:

auto pair = std::make_pair(19451019, "María Zambrana");
auto& [date, name] = pair;
std::cout << date << ": " << name << "\n";

Or even with structs!!!

struct Point
{
  int x;
  int y;
};

Point p { 10, 25 };
auto [x, y] = p;
std::cout << "(" << x << "; " << y << ")\n";

This feature makes iterating maps incredibly elegant.

Compare this pre-C++11 code:

for (std::map<std::string, std::string>::const_iterator it = translations.begin();
     it != translations.end();
     it++)
{
  std::cout << "English: " << it->first << "; Spanish: " << it->second << "\n";
}

In C++17, it could be written like this:

for (auto& [english, spanish] : translations)
{
  std::cout << "English: " << english << "; Spanish: " << spanish << "\n";
}

Amazing!
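Because the loop above binds with auto&, the bound names can also be used to modify the mapped values in place. A small sketch, assuming a hypothetical helper called mark_reviewed that is not part of the original example:

```cpp
#include <map>
#include <string>

// Hypothetical helper: tags every translation as reviewed, in place.
void mark_reviewed(std::map<std::string, std::string>& translations)
{
    // english names a const std::string (map keys are immutable);
    // spanish names the mapped value itself, so the map is modified.
    for (auto& [english, spanish] : translations)
    {
        spanish += " (ok)";
    }
}
```

If the loop only reads the elements, const auto& documents that intent better.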

C++17: std::string_view

C++17 shipped with a very useful non-owning, read-only string handling class template: std::basic_string_view<CharT, Traits>, with several instantiations mirroring those of the std::basic_string class template: std::string_view, std::wstring_view, std::u16string_view, and std::u32string_view (std::u8string_view arrived later, in C++20, together with char8_t).

A std::string_view instance contains only a pointer to the first character in a char array and the length of the string. The std::string_view does not own that array and thus is not responsible for the lifetime of the text it refers to.

Since it contains the length of the string it represents, the string does not need to end with a '\0' character. This property enables a lot of interesting features, like creating substring views much faster than with std::string (creating a substring just means computing a new pointer and length for the new std::string_view instance; no characters are copied).

How is a std::string_view instantiated?

std::string_view today = "Monday";  // string_view constructed from a const char*
std::string aux = "Tuesday";
std::string_view tomorrow = aux;  // string_view constructed from a std::string
std::string_view day_after_tomorrow = "Wednesday"sv;  // string_view literal; requires using namespace std::literals
constexpr std::string_view day_before_friday = "Thursday"; // compile-time string_view

How can I use a std::string_view?

A lot of classes from the C++ standard library have been updated to use the std::string_view as a first class citizen, so, for example, to print out a std::string_view instance, you could simply use the std::cout as usual:

std::string_view month = "September";
std::cout << "This month is " << month << "\n";

As I mentioned earlier, you can use std::string_view to perform several operations on the underlying text with better performance, for example, to create substrings:

#include <string_view>
#include <iostream>

int main()
{

    std::string_view fullName = "García Márquez, Gabriel";

    // I need to separate first name and last name

    // I find the index of the ","
    auto pos = fullName.find(",");
    if (pos == std::string_view::npos) // NOT FOUND
    {
        std::cerr << "Malformed full name\n";
        return -1;
    }

    std::string_view lastName = fullName.substr(0, pos);
    std::string_view firstName = fullName.substr(pos + 2);

    std::cout << "FIRST NAME: " << firstName;
    std::cout << "\nLAST NAME:  " << lastName << "\n";

    return 0;
}

To check whether a text starts or ends with something (note that starts_with() and ends_with() were introduced in C++20, not C++17):

std::string_view price = "125$";
if (price.ends_with("$")) // C++20
{
    price.remove_suffix(1); // removes 1 char at the end of the string_view
    std::cout << price << " dollar\n";
}
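If you are limited to C++17, the same check can be written with compare(). This is only a sketch; has_suffix and strip_dollar are hypothetical helper names, not standard library functions.

```cpp
#include <string_view>

// C++17 stand-in for C++20's std::string_view::ends_with (hypothetical name).
bool has_suffix(std::string_view text, std::string_view suffix)
{
    return text.size() >= suffix.size()
        && text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Returns the price without its trailing '$', if there is one.
// The parameter is a copy of the view, so the caller's view is not changed.
std::string_view strip_dollar(std::string_view price)
{
    if (has_suffix(price, "$"))
    {
        price.remove_suffix(1);
    }
    return price;
}
```

The size check comes first so that the subtraction never wraps around when the suffix is longer than the text.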

It exposes a lot of functionality similar to the std::string class, so its usage is almost the same.

How can I get the content of a std::string_view?

  • If you want to iterate the elements, you can use operator[], at() or iterators in the same way that you do it in strings.
  • If you want to get the underlying char array, you can access it through the data() method. Be aware that the underlying char array may lack the terminating null character, as mentioned above.
  • If you want to get a valid std::string from a std::string_view, there is no implicit conversion from one to the other, so you need to use an explicit std::string constructor:
std::string_view msg = "This will be a string";
std::string msg_as_string { msg };

std::cout << msg_as_string << "\n";
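To illustrate why data() has to be handled with care, here is a small sketch. The helpers first_word and first_word_copy are hypothetical names used only for this example.

```cpp
#include <string>
#include <string_view>

// Returns a view over the first word of `text` (no characters are copied).
std::string_view first_word(std::string_view text)
{
    // find() returns npos when there is no space; substr(0, npos) is the whole text.
    return text.substr(0, text.find(' '));
}

// Returns an owning, null-terminated copy of the first word.
std::string first_word_copy(std::string_view text)
{
    return std::string{first_word(text)};
}
```

The view returned by first_word("Guten Tag") has size 5, but its data() pointer still points into the original text, so treating it as a C string would read past the end of the view. When a null-terminated string is required, make a copy as first_word_copy does.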

Handle with care

And though you are probably already imagining the scenarios where a string_view can be used in your system, take into account that it can bring some problems because of its non-owning nature. For example, this code:

std::string_view say_hi()
{
  std::string_view hi = "Guten Tag";
  return hi;
}

Is a completely valid and correct usage of a std::string_view.

But this slightly different version:

std::string_view say_hi()
{
  std::string_view hi = "Guten Tag"s;
  return hi;
}

Returns a dangling view, and reading through it is undefined behavior. Notice the “s” at the end of the literal (it requires using namespace std::literals). It means “Guten Tag” will be a std::string and not a character array. Being a std::string, a temporary instance is created in that function and the std::string_view refers to the underlying char array of that instance. BUT, since it is a temporary, it is destroyed at the end of the statement that created it, destroying the char array and leaving a dangling pointer inside the std::string_view.

Subtle, and probably hard to find and debug.
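Two ways of staying safe are sketched below. The function names say_hi_owned and say_hi_view are hypothetical and only illustrate the idea: either return an owning std::string, or make sure the view refers to storage that outlives the function.

```cpp
#include <string>
#include <string_view>

// Safe: the caller receives its own std::string, which owns the characters.
std::string say_hi_owned()
{
    using namespace std::string_literals;
    return "Guten Tag"s;
}

// Also safe: a string literal has static storage duration,
// so the returned view never dangles.
std::string_view say_hi_view()
{
    return "Guten Tag";
}
```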

To read more about std::string_view, consult the ultimate reference:

https://en.cppreference.com/w/cpp/string/basic_string_view

C++17: [[nodiscard]] attribute

C++17 adds a new attribute called [[nodiscard]] to let the user know that the return value of a function or method should not be ignored: it should be handled properly or assigned to a variable.

For example, look at this code:

int sum(int a, int b)
{
  return a + b;
}

int main()
{
  sum(10, 20);
  return 0;
}

It produces no result or side effects, but if the programmer forgot to assign the return value to a variable by mistake, the error would not be immediately obvious.

Now, in this scenario:

#include <cstring>

char* getNewMessage()
{
  char* nm = new char[100];
  strcpy(nm, "Hello world");
  return nm;
}

int main()
{
  getNewMessage();
  return 0;
}

There is a memory leak produced because the returned value was not stored anywhere and there is no way to deallocate its memory.

Marking a function or method with [[nodiscard]] encourages the compiler to show a compilation warning when it is invoked and its return value is discarded.

You can also write an additional message with the [[nodiscard]] attribute. That message will be displayed if a warning is generated. Note that the message form, [[nodiscard("...")]], was standardized in C++20, so it needs a compiler and language mode that support it.

In my examples, we could mark my functions like this:

#include <cstring>

[[nodiscard]]
int sum(int a, int b)
{
  return a + b;
}

[[nodiscard("Release the memory using delete[]")]]
char* getNewMessage()
{
  char* nm = new char[100];
  strcpy(nm, "Hello world");
  return nm;
}

int main()
{
  sum(10, 20);
  getNewMessage();
  return 0;
}

And in this case, g++ returns the following compilation warnings:

In function 'int main()':
<source>:19:6: warning: ignoring return value of 'int sum(int, int)', declared with attribute 'nodiscard' [-Wunused-result]

<source>:20:16: warning: ignoring return value of 'char* getNewMessage()', declared with attribute 'nodiscard': 'Release the memory using delete[]' [-Wunused-result]

Though it can add verbosity to your method declarations, using it is a good idea because it prevents some errors from occurring.
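One way to reduce that verbosity is to put the attribute on a type instead of on every function: in C++17, [[nodiscard]] can also be applied to class and enumeration declarations, and then any function returning that type by value is covered. A minimal sketch, where status and set_volume are hypothetical names:

```cpp
// Every function returning `status` by value is treated as nodiscard.
enum class [[nodiscard]] status { ok, invalid_argument };

// Hypothetical setter: rejects values outside the range 0..100.
status set_volume(int& volume, int requested)
{
    if (requested < 0 || requested > 100)
    {
        return status::invalid_argument;
    }
    volume = requested;
    return status::ok;
}
```

A bare call such as set_volume(v, 50); is then expected to produce a warning. If ignoring the result is really intended, casting the call to void states that explicitly.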

More on [[nodiscard]]: https://en.cppreference.com/w/cpp/language/attributes

C++17: std::optional

The C++17 standard library ships with a very interesting class template: std::optional<T>.

The idea behind it is to make explicit the fact that a variable may or may not hold an actual value.

Before the existence of std::optional<T>, the only way to implement such semantics was through pointers or tagged unions (read about C++17 std::variant here).

For example, suppose I want to declare a struct person that stores a person’s first name, last name and nickname. Since not everybody has a nickname, I would have to implement that (in older C++) in this way:

struct person
{
  std::string first_name;
  std::string last_name;
  std::string* nickname; //no nickname if null
};

To make explicit that the nickname will be optional, I need to write a comment stating that “null” represents “no nickname” in this scenario.

And it works, but:

  • It is error prone because the user can easily do something like: p.nickname->length(); and run into undefined behavior when the nickname is null.
  • Since the nickname will be stored as a pointer, the instance needs to be created in heap, adding one indirection level and one additional dynamic allocation/deallocation only to support the desired behavior (or the programmers need to have the nickname manually handled by them and set a pointer to that nickname into this struct).
  • Because of the last reason, it is not at all obvious if the instance pointed to by said pointer should be explicitly released by the programmer or it will be released automatically by the struct itself.
  • The “optionalness” here is not explicit at all at code level.

std::optional<T> provides safeguards for all these things:

  • Its instances can be created on the stack, and the contained value is stored inside the optional object itself, so there is no extra allocation, deallocation or null pointer to deal with: RAII takes care of the lifetime (the standard does not allow std::optional to allocate dynamic memory for its contained value).
  • The “optionalness” of the attribute is completely explicit when used: Nothing is more explicit than marking as “optional” something… optional, is it?
  • Instances of std::optional<T> hide the direct access to the object, so accessing the actual value encourages the programmer to check first.
  • If we call value() on an instance that is not storing anything, a known exception is thrown instead of running into undefined behavior.

After refactoring, my code looks like this:

#include <optional>
#include <string>

struct person
{
  std::string first_name;
  std::string last_name;
  std::optional<std::string> nickname;
};

The code is pretty explicit and needs no further explanation or documentation about the optional character of “nickname”.

So let’s create two people, one with nickname and the other one with no nickname:

int main()
{
  person p1 { "John", "Doe", std::nullopt };
  person p2 { "Robert", "Balboa", "Rocky" };
}

In the first instance, I have used “std::nullopt”, which represents a std::optional<T> instance with no value (i.e.: an “absence of value”).

In the second case, I am implicitly invoking the std::optional<T> constructor that receives an actual value.

The verbose alternative would be:

int main()
{
    person p1 { "John", "Doe", std::optional<std::string> { } };
    person p2 { "Robert", "Balboa", std::optional<std::string> {"Rocky"} };
}

The default constructor represents an absence of value (std::nullopt) and the other constructor represents an instance storing an actual value.

Next I will overload the operator<< to work with my struct person, keeping in mind that if the person has a nickname, I want to print it out.

This could be a valid implementation:

std::ostream& operator<<(std::ostream& os, const person& p)
{
    os << p.last_name << ", " << p.first_name;
    
    if (p.nickname.has_value())
    {
        os << " (" << p.nickname.value() << ")";
    }
    
    return os;
}

The has_value() method returns true if the optional<T> instance is storing an actual value. The value can be retrieved using the value() method.

There is also an explicit operator bool that does the same thing as has_value(): it checks whether the instance stores an actual value.

There are also overloads of operator* and operator-> to access the stored value.

So, a less verbose implementation of my operator<< shown above would be:

std::ostream& operator<<(std::ostream& os, const person& p)
{
    os << p.last_name << ", " << p.first_name;
    
    if (p.nickname)
    {
        os << " (" << *(p.nickname) << ")";
    }
    
    return os;
}

Another way to retrieve the stored value, OR return an alternative value, is the “value_or()” method.

void print_nickname(const person& p)
{
    std::cout << p.first_name << " " << p.last_name << "'s nickname: "
              << p.nickname.value_or("[no nickname]") << "\n";
}

In this example, if nickname stores an actual value, value_or() returns it; otherwise it returns the fallback I passed: “[no nickname]”.

What will happen if I call optional<T>::value() when no value is actually stored? A std::bad_optional_access exception will be thrown:

try
{
    std::optional<int> op {};
    std::cout << op.value() << "\n";
}
catch (const std::bad_optional_access& e)
{
    std::cerr << e.what() << "\n";
}

Notice I have used the value() method instead of operator*. When operator* is used on an empty optional, no exception is thrown and the behavior is undefined.

So, use std::optional<T> in these scenarios:

  • You have some attributes or function arguments that may have no value and are, therefore, optional. std::optional<T> makes that decision explicit at the code level.
  • You have functions that may OR may not return something. For example, what will be the minimum integer found in an empty list of ints? So instead of returning an int with its minimum value (std::numeric_limits<int>::min()), it would be more accurate to return an std::optional<int>.
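The “minimum of an empty list” scenario from the list above could be sketched like this. The function name minimum is a hypothetical choice; std::min_element is the standard algorithm doing the actual work.

```cpp
#include <algorithm>
#include <optional>
#include <vector>

// Returns the smallest element, or std::nullopt when there is nothing to compare.
std::optional<int> minimum(const std::vector<int>& values)
{
    if (values.empty())
    {
        return std::nullopt;
    }
    return *std::min_element(values.begin(), values.end());
}
```

The caller can then decide what an empty list means, for example with minimum(v).value_or(0), instead of having to guess whether a sentinel value is a real result.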

Note that std::optional<T> does not support reference types (i.e. std::optional<T&>) so if you want to store an optional reference, probably you want to use a std::reference_wrapper<T> instead of type T (i.e. std::optional<std::reference_wrapper<T>>).
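A minimal sketch of that std::reference_wrapper approach follows. The function find_name and its parameters are hypothetical and exist only to show the shape of the code.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Returns an optional reference to the first element equal to `wanted`.
std::optional<std::reference_wrapper<std::string>>
find_name(std::vector<std::string>& names, const std::string& wanted)
{
    for (auto& name : names)
    {
        if (name == wanted)
        {
            return std::ref(name);   // refers to the element, no copy is made
        }
    }
    return std::nullopt;
}
```

The referenced object is reached through get(), for example found->get() = "Luisa"; which modifies the element stored in the vector.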

C++17: std::any

When trying to implement something that will store a value of an unknown data type (to be as generic as possible, for example), we had these possibilities before C++17:

  • Having a void* pointer to something that will be assigned at runtime. The problem with this approach is that it leaves all responsibility for managing the lifetime of the data pointed to by this void pointer to the programmer. Very error prone.
  • Having a union with a limited set of data types available. We can still use this approach with C++17 std::variant.
  • Having a base class (e.g. Object) and store pointers to instances derived of that class (à la Java).
  • Having an instance of template typename T (for example). Nice approach, but to make it useful and generic, we need to propagate the typename T throughout the generic code that will use ours. Probably verbose.

So, let’s welcome std::any.

std::any, as you may have guessed, is a class shipped in C++17 and declared in the header <any> that can store a value of any copy-constructible type, so these lines are completely valid:

std::any a = 123;
std::any b = "Hello";
std::any c = std::vector<int>{10, 20, 30};

Obviously, this is C++ and you as user need to know the data type of what you stored in an instance of std::any, so, to retrieve the stored value you have to use std::any_cast<T> as in this code:

#include <any>
#include <iostream>

int main()
{
    std::any number = 150;
    std::cout << std::any_cast<int>(number) << "\n";
}   

If you try to cast the value stored in an instance of std::any to anything but the actual type, a std::bad_any_cast exception is thrown. For example, if you try to cast that number to a string, you will get this runtime error:

terminate called after throwing an instance of 'std::bad_any_cast'
  what():  bad any_cast
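To avoid terminating the program, the exception can be caught like any other. A small sketch, where describe is a hypothetical helper name:

```cpp
#include <any>
#include <string>

// Returns the stored int as text, or a fallback message for any other content.
std::string describe(const std::any& value)
{
    try
    {
        return std::to_string(std::any_cast<int>(value));
    }
    catch (const std::bad_any_cast&)
    {
        return "not an int";
    }
}
```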

If the value stored in an instance of std::any is an instance of a class or struct, the destructor of that value is invoked when the instance of std::any goes out of scope.

Another really nice thing about std::any is that you can replace the existing value stored in an instance of it, with another value of any other type, for example:

std::any content = 125;
std::cout << std::any_cast<int>(content) << "\n";

content = std::string{"Hello world"};
std::cout << std::any_cast<std::string>(content) << "\n";

About lifetimes

Let’s consider this class:

struct A
{
  int n;
  A(int n) : n{n} { std::cout << "Constructor\n"; }
  ~A() { std::cout << "Destructor\n"; }
  A(A&& a) : n{a.n} { std::cout << "Move constructor\n"; }
  A(const A& a) : n{a.n} { std::cout << "Copy constructor\n"; }
  void print() const { std::cout << n << "\n"; }
};

This class stores an int, and prints it out with “print”. I wrote constructor, copy constructor, move constructor and destructor with logs telling me when the object will be created, copied, moved or destroyed.

So, let’s create a std::any instance with an instance of this class:

std::any some = A{4516};

This will be the output of such code:

Constructor
Move constructor
Destructor
Destructor

Why are two constructors and two destructors invoked if I only created one instance?

Because the instance of std::any stores a copy (ok, in this case a “moved version”) of the original object I created, and while the cost is trivial in my example, for a complex object it may not be.

How to avoid this problem?

Using std::make_any.

std::make_any is very similar to std::make_shared in the way it will take care of creating the object instead of copying/moving ours. The parameters passed to std::make_any are the ones you would pass to the object’s constructor.

So, I can modify my code to this:

auto some = std::make_any<A>(4517);

And the output will be:

Constructor
Destructor
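The same comparison can be made without reading log output by counting constructor calls. This is only a sketch: the struct counted and the two helper functions are hypothetical names, and the counter is a plain static member.

```cpp
#include <any>

struct counted
{
    static inline int constructions = 0;   // incremented by every constructor
    int n;
    explicit counted(int n) : n{n} { ++constructions; }
    counted(const counted& other) : n{other.n} { ++constructions; }
    counted(counted&& other) noexcept : n{other.n} { ++constructions; }
};

// Builds a temporary first, then moves it into the std::any.
int constructions_via_temporary()
{
    counted::constructions = 0;
    std::any a = counted{1};
    return counted::constructions;
}

// Lets std::make_any construct the object directly inside the std::any.
int constructions_via_make_any()
{
    counted::constructions = 0;
    auto a = std::make_any<counted>(1);
    return counted::constructions;
}
```

The first function reports two constructions (the temporary plus the moved-in object) and the second reports one, matching the two outputs shown above.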

Now, I want to invoke the method “print”:

auto some = std::make_any<A>(4517);
std::any_cast<A>(some).print();

And when I do that, the output is:

Constructor
Copy constructor
4517
Destructor
Destructor

Why was that extra copy created?

Because std::any_cast<A> returns a copy of the stored object. If I want to avoid the copy and use a reference, I need to ask for a reference explicitly in std::any_cast, like this:

auto some = std::make_any<A>(4517);
std::any_cast<A&>(some).print();

And the output will be:

Constructor
4517
Destructor

It is also possible to use std::any_cast<T> passing a pointer to an instance of std::any instead of a reference.

In that case, if the cast is possible, it returns a valid pointer to the stored T object; otherwise it returns nullptr. For example:

auto some = std::make_any<A>(4517);
std::any_cast<A>(&some)->print();
std::cout << std::any_cast<int>(&some) << "\n";

In this case, notice that I am passing a pointer to “some” instead of a reference. When this occurs, the implementation returns a pointer to the stored object if it is of the requested type (as in the second line) or a null pointer if it is not (as in the third line, where I am trying to cast my object of type A to int). Using this overloaded version with pointers avoids throwing an exception and allows you to check whether the returned pointer is null.
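The pointer overload makes it possible to probe for several candidate types without any exception handling. A minimal sketch, assuming a hypothetical helper called type_name_of that only knows about two types:

```cpp
#include <any>
#include <string>

// Reports which of the known types is stored, using the non-throwing pointer overload.
std::string type_name_of(const std::any& value)
{
    if (std::any_cast<int>(&value))         return "int";
    if (std::any_cast<std::string>(&value)) return "std::string";
    if (!value.has_value())                 return "empty";
    return "unknown";
}
```

Each any_cast call returns nullptr when the stored type does not match, so the checks fall through until one succeeds.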

std::any is a very good tool for storing things that we, as implementers of something reusable, do not know a priori; it could be used to store, for example, additional parameters passed to threads, objects of any type stored as extra information in UI widgets (similar to the Tag property in Windows.Forms.Control in .NET, for example), etc.

Performance-wise, std::any may need to store its contents in the heap. Where the contents are actually stored depends on the library implementation, and some of them (gcc’s standard library, for example) store small objects locally [thanks TheFlameFire]. It also needs to do some extra verification to return the values only if the cast is valid, so it is not as fast as having an object whose type is known at compile time.

C++17: std::variant

Let’s suppose I have a system that handles students, teachers and crew of a school.

To model that in an object oriented style, I would have a class hierarchy similar to this one:

class person
{
    std::string name;

public:
    template <typename String>
    person(String&& name) : name { forward<String>(name) }
    {
    }

    virtual ~person() { }
    const string& get_name() const { return name; }
    virtual void do_something() = 0;
};

class student : public person
{
public:
    using person::person;

    void do_homework()
    {
        cout << "Need access to Stack Overflow\n";
    }

    void do_something() override
    {
        cout << "I am doing something the students do\n";
    }
};

class teacher : public person
{
public:
    using person::person;

    void teach()
    {
        cout << "This is the unique truth\n";
    }

    void do_something() override
    {
        cout << "I am doing something the teachers do\n";
    }
};

class crew : public person
{
public:
    using person::person;

    void help_team()
    {
        cout << "I am helping teachers and students\n";
    }

    void do_something() override
    {
        cout << "I am doing something crew do\n";
    }
};

And my collection would be defined like this:

map<size_t, person*> people;

where the size_t ID would be the key of the map.

Since I do not want to deal with raw pointers, this would be a better definition:

map<size_t, unique_ptr<person>> people;

Now, I will insert some elements to my collection:

people.insert(make_pair(14, make_unique<student>("Phil Collins")));
people.insert(make_pair(25, make_unique<teacher>("Peter Gabriel")));
people.insert(make_pair(32, make_unique<crew>("Justin Bieber")));

To get the name of person 14, I should do something like:

people.find(14)->second->get_name(); //being 100% sure that person with ID 14 exists

And to do something specific implemented in a derived class, I need to downcast:

static_cast<crew&>(*people.find(32)->second).help_team();

Since C++11, the language has been evolving to a more generic and more template metaprogramming-like paradigm and has been getting away from the classical OOP design where inheritance and polymorphism are amongst the most important tools.

So, how could I implement something similar to the thing shown above without inheritance and polymorphism?

Let me introduce std::variant ! :)

C++17 introduced std::variant, which is basically a class template where you specify the possible types of the values that the variant instance can store, so, for my example, I could define something like:

using person = std::variant<student, teacher, crew>;

In this line, I am defining an alias person that represents a variant that can store a student, a teacher or a crew (think of variant as something like a type-safe union).

So, my map would be defined in this way:

map<size_t, person> people;

And my classes student, teacher, and crew could be defined as follows:

class student
{
    std::string name;

public:
    template <typename String>
    student(String&& name) : name { forward<String>(name) }
    {
    }

    const string& get_name() const { return name; }

    void do_homework()
    {
        cout << "Need access to Stack Overflow\n";
    }

    void do_something()
    {
        cout << "I am doing something the students do\n";
    }
};

class teacher
{
    std::string name;

public:
    template <typename String>
    teacher(String&& name) : name { forward<String>(name) }
    {
    }

    const string& get_name() const { return name; }

    void teach()
    {
        cout << "This is the unique truth\n";
    }

    void do_something()
    {
        cout << "I am doing something the teachers do\n";
    }
};

class crew
{
    std::string name;

public:
    template <typename String>
    crew(String&& name) : name { forward<String>(name) }
    {
    }

    const string& get_name() const { return name; }

    void help_team()
    {
        cout << "I am helping teachers and students\n";
    }

    void do_something()
    {
        cout << "I am doing something crew do\n";
    }
};

To keep my example clean and to demonstrate that I do not need inheritance and polymorphism, notice I am not defining a base class, nor am I defining virtual methods at all. In real production code, though, the coder could create a base class with no virtual methods and inherit from it to avoid code duplication.

Notice also I am not using any pointer (raw or smart), so the map will contain actual values, removing one level of indirection and letting the compiler optimize based on that knowledge.

So, let me add some objects to the map:

people.insert(make_pair(14, student { "Phil Collins" }));
people.insert(make_pair(25, teacher { "Peter Gabriel" }));
people.insert(make_pair(32, crew { "Justin Bieber" }));

To get the person with id 14:

auto& the_variant = people.find(14)->second;

To get the “student” inside that variant object, I need to use the function get<T>:

auto& the_student = get<student>(the_variant);
cout << the_student.get_name() <<  "\n";

If I try to get an object that is not of the type stored in the variant, the system will throw a std::bad_variant_access exception, for example if I try to do this with the variant from the example above:

auto& the_student = get<teacher>(the_variant);
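When the stored type is not known in advance, std::holds_alternative and std::get_if allow checking before accessing, so no exception is involved. The sketch below uses deliberately simplified stand-in structs (just a name field) rather than the classes defined above; is_teacher and student_name_or are hypothetical helpers.

```cpp
#include <string>
#include <variant>

// Simplified stand-ins for the classes in the article.
struct student { std::string name; };
struct teacher { std::string name; };
using person = std::variant<student, teacher>;

// True when the variant currently stores a teacher.
bool is_teacher(const person& p)
{
    return std::holds_alternative<teacher>(p);
}

// Returns the student's name, or `fallback` when the variant stores something else.
std::string student_name_or(const person& p, const std::string& fallback)
{
    if (const student* s = std::get_if<student>(&p))   // nullptr on type mismatch
    {
        return s->name;
    }
    return fallback;
}
```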

To execute a specific method of a given class, I do not need to do any downcasting because I already have the object of the given type, so, instead of:

static_cast<crew&>(*people.find(32)->second).help_team();

I would do:

get<crew>(people.find(32)->second).help_team();

which is far more straightforward and cleaner.

Now, given that I have a method called “do_something” in all my classes, I want to be able to invoke it no matter the type of the object stored in the variant.

In the polymorphic world, I would do something like this:

for (auto& p : people)
{
    p.second->do_something();
}

To do the same with variants, there is a function called std::visit.

What visit does is access the variant object and invoke the callable passed as its first argument with the object stored in the variant. So, given my example, I could do something like:

auto& the_variant = people.find(14)->second;
visit([](auto& s)
{
    s.do_something();
}, the_variant);

The magic is in the “auto” part here. When you “visit” a variant, the compiler instantiates the generic lambda once for each type specified in the variant declaration, in my case 3 (one for student, one for teacher and one for crew), and at runtime executes the one matching the type of the value stored in the variant. So, to execute do_something() for all objects in the map, I need to do something like:

for (auto& p : people)
{
    visit([](auto& s)
    {
        s.do_something();
    }, p.second);
}

It is beautiful, isn’t it? Polymorphic-like behavior without inheritance, virtual methods or an extra level of indirection.
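A visitor can also return a value, and it can branch on the concrete type at compile time with if constexpr. The sketch below again uses simplified stand-in structs instead of the full classes; role_of and name_of are hypothetical helpers.

```cpp
#include <string>
#include <type_traits>
#include <variant>

// Simplified stand-ins for the classes in the article.
struct student { std::string name; };
struct teacher { std::string name; };
struct crew    { std::string name; };
using person = std::variant<student, teacher, crew>;

// Type-specific behavior: the branch is selected at compile time per alternative.
std::string role_of(const person& p)
{
    return std::visit([](const auto& who) -> std::string
    {
        using T = std::decay_t<decltype(who)>;
        if constexpr (std::is_same_v<T, student>)      return "student";
        else if constexpr (std::is_same_v<T, teacher>) return "teacher";
        else                                           return "crew";
    }, p);
}

// Common behavior: works because every alternative has a `name` member.
std::string name_of(const person& p)
{
    return std::visit([](const auto& who) { return who.name; }, p);
}
```

All instantiations of the lambda must return the same type, which is why role_of spells out the return type explicitly.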