C++: reference_wrapper

This small program:

#include <iostream>
#include <functional>
  
void add(int a, int b, int& r)
{
    r = a + b;
}
 
int main()
{
    int result = 0;
     
    using namespace std::placeholders;
    auto f = std::bind(add, _1, 20, result);
     
    f(80);
     
    std::cout << result << std::endl;
    return 0;
}

This program supposedly adds 80 to 20 and prints the result. It compiles perfectly, but when it is executed, it prints out… 0!

Why?

Because std::bind takes its arguments by value and, thus, the “result” variable is copied before being passed to the bound function add. Why?

Because std::bind cannot know whether the arguments will still be valid when the actual invocation is performed (programmers could bind local variables as arguments, pass the resulting std::function object to another function, and invoke it from there).

The solution? Pretty simple:

int main()
{
    int result = 0;

    using namespace std::placeholders;
    auto f = std::bind(add, _1, 20, std::ref(result));
     
    f(80);
     
    std::cout << result << std::endl;
    return 0;
}

The function std::ref was used, which passes the parameter by reference to the bound function.

What does this function std::ref do?

It is a template function that returns a std::reference_wrapper object. A std::reference_wrapper is a class template that wraps a reference in a concrete object.

It can also be done in this way:

int main()
{
    int result = 0;

    using namespace std::placeholders;
    std::reference_wrapper<int> result_ref(result);
    auto f = std::bind(add, _1, 20, result_ref);

    f(80);

    std::cout << result << std::endl;
    return 0;
}

and everything would continue working as expected.

As can be seen, programmers can pass a std::reference_wrapper by value, and everything will work properly because its copy constructor copies the reference (actually, std::reference_wrapper implementations do not store a reference but a pointer to the referenced data; their methods simply expose it as a reference).
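To make the mechanics clearer, here is a minimal, simplified sketch of how such a wrapper could be written (the real std::reference_wrapper has additional machinery, for example to support invoking wrapped callables):

#include <iostream>

// A minimal, illustrative reference wrapper: it stores a pointer internally,
// but exposes the wrapped object as a reference. Not the real std::reference_wrapper.
template <typename T>
class my_ref
{
public:
    explicit my_ref(T& value) : _ptr(&value) { }

    operator T&() const { return *_ptr; } // implicit conversion back to a real reference
    T& get() const { return *_ptr; }

private:
    T* _ptr; // copying my_ref just copies this pointer
};

int main()
{
    int x = 10;
    my_ref<int> r(x);
    my_ref<int> copy = r; // copying the wrapper still refers to the same int

    copy.get() = 42;
    std::cout << x << std::endl; // prints 42
    return 0;
}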

Another nice use of this is when programmers need a container of references (the actual objects are stored in another container or elsewhere, and programmers do not need or want to keep copies or raw pointers to them). For example, having these classes:

class A { };
class B : public A { };

If programmers want to have local variables pointing to these objects and store such variables in a container:

int main()
{
  A a, c;
  B b, d;

  std::vector<A> v = { a, b, c, d };
}

Good? No! Not good at all! The programmers here are storing instances of class A in the vector, so every instance of B will be sliced and copied as an instance of A (losing its specific attributes, methods, and all the polymorphic behavior it could have).

One solution? Storing pointers:

int main()
{
  A a, c; 
  B b, d;
 
  std::vector<A*> v = { &a, &b, &c, &d };
}

It works, but it is not clear whether the container consumers will be responsible for freeing the objects.

Other solution? Using references:

int main()
{
  A a, c;
  B b, d;

  std::vector<A&> v = { a, b, c, d };
}

Looks nice, but it does not compile, because a vector cannot hold reference types.

Real solution: Using std::reference_wrapper:

int main()
{
  A a, c; 
  B b, d;
 
  std::vector<std::reference_wrapper<A>> v = { a, b, c, d };
}
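For instance, if A declared a virtual method (the name() method below is hypothetical and not part of the classes above), the references stored in the vector would preserve the polymorphic behavior:

#include <functional>
#include <iostream>
#include <vector>

// Hypothetical versions of A and B with a virtual method, just for illustration.
class A
{
public:
    virtual ~A() = default;
    virtual const char* name() const { return "A"; }
};

class B : public A
{
public:
    const char* name() const override { return "B"; }
};

int main()
{
    A a, c;
    B b, d;

    std::vector<std::reference_wrapper<A>> v = { a, b, c, d };

    for (const A& item : v)                    // each element converts back to A&
        std::cout << item.name() << std::endl; // prints A B A B: no slicing
    return 0;
}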

Someone could argue: In what scenario is this thing useful?

If some developers are creating a UI frame using Java Swing, they probably create a subclass of the JFrame class, specify their visual components as member variables, and also add them to the JFrame‘s component list. Implementing something similar in C++ using std::reference_wrapper instances would be quite elegant.

C++11: enable_if

std::enable_if is another feature taken from the Boost C++ library that now ships with every C++11 compliant compiler.

As its name says, the template struct enable_if enables a function only if a condition (passed as a type trait) is true; otherwise, that overload is removed from overload resolution. In this way, you can declare several “overloads” of a function and enable or disable them depending on the requirements you need. The nice part is that the disabled overloads will not end up in your binary code, because the compiler simply ignores them.
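A minimal sketch of the idea, using the type traits shipped in <type_traits> (the describe() function is just an illustrative example):

#include <iostream>
#include <type_traits>

// Enabled only when T is an integral type
template <typename T>
typename std::enable_if<std::is_integral<T>::value, void>::type
describe(T)
{
    std::cout << "integral" << std::endl;
}

// Enabled only when T is a floating-point type
template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, void>::type
describe(T)
{
    std::cout << "floating point" << std::endl;
}

int main()
{
    describe(42);   // picks the integral overload
    describe(3.14); // picks the floating-point overload
    return 0;
}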

Continue reading “C++11: enable_if”

C++: std::function and std::bind

std::function and std::bind were originally part of the Boost C++ Library, but they were incorporated into the C++11 standard.

std::function is a standard library template class that provides a very convenient wrapper for a simple function, a functor, a method, or a lambda expression.

For example, if programmers want to store several functions, methods, functors, or lambda expressions in a vector, they could write something like this:

#include <functional>
#include <iostream>
#include <string>
#include <vector>
 
void execute(const std::vector<std::function<void ()>>& fs)
{
    for (auto& f : fs)
        f();
}
 
void plain_old_func()
{
    std::cout << "I'm an old plain function" << std::endl;
}
 
class functor final
{
public:
    void operator()() const
    {
        std::cout << "I'm a functor" << std::endl;
    }
};
 
int main()
{
    std::vector<std::function<void ()>> x;
    x.push_back(plain_old_func);
     
    functor functor_instance;
    x.push_back(functor_instance);
    x.push_back([] ()
    {
        std::cout << "HI, I'm a lambda expression" << std::endl;
    });
     
    execute(x);
}

As it can be seen, in this declaration:

std::vector<std::function<void ()>> x;

a vector of functions is being declared. The void () part means that the functions stored there do not receive any arguments and do not return anything (i.e., they have void as the return type). If programmers wanted to define a function that receives two integers and returns an integer, they could declare std::function as:

int my_func(int a, int b) { return a + b; }
 
std::function<int (int, int)> f = my_func;

The standard library also includes a function called std::bind. std::bind is a template function that returns a std::function object which, as the name suggests, binds a set of arguments to a function.

In the first code example, the functions stored in the vector do not receive any arguments, but programmers might want to store a function that accepts arguments in the same vector. They can do this using std::bind.

Having this function:

void show_text(const std::string& t)
{
    std::cout << "TEXT: " << t << std::endl;
}

How can it be added to the vector of functions in the first code listing (if even possible, because they have different signatures)? It can be added like this in the main() function:

std::function<void ()> f = std::bind(show_text, "Bound function");
x.push_back(f);

The code above shows that std::bind takes a pointer to a function (it can also be a lambda expression or a functor) and a list of parameters to pass to that function. As a result, std::bind returns a new function object with a different signature because all the parameters for the function have already been specified.

For example, this code:

#include <functional>
#include <iostream>
  
int multiply(int a, int b)
{
    return a * b;
}
 
int main()
{
    using namespace std::placeholders;

    auto f = std::bind(multiply, 5, _1);
    for (int i = 0; i < 10; i++)
    {
        std::cout << "5 * " << i << " = " << f(i) << std::endl;
    }

    return 0;
}

demonstrates another usage of std::bind. The first parameter is a pointer to the multiply function. The second parameter is the value passed as the first argument to multiply. The third parameter is called a “placeholder.” A placeholder specifies which parameter in the bound function will be filled by a runtime argument. Inside the for loop, f is called with only one parameter, and the second argument is provided dynamically.

Thanks to placeholders, even the order of arguments can be modified. For example:

#include <functional>
#include <string>
#include <iostream>
  
void show(const std::string& a, const std::string& b, const std::string& c)
{
    std::cout << a << "; " << b << "; " << c << std::endl;
}
 
int main()
{
    using namespace std::placeholders;

    auto x = std::bind(show, _1, _2, _3);
    auto y = std::bind(show, _3, _1, _2);
    auto z = std::bind(show, "hello", _2, _1);
     
    x("one", "two", "three");
    y("one", "two", "three");
    z("one", "two");
     
    return 0;
}

The output is:

one; two; three
three; one; two
hello; two; one

std::bind can also be used to wrap a method of a given object (i.e., an already instantiated one) into a function. For example, if programmers want to wrap the say_something method from the following struct:

struct Messenger
{
    void say_something(const std::string& msg) const
    {
        std::cout << "Message: " << msg << std::endl;
    }
};

into a std::function declared as follows:

using my_function = std::function<void (const std::string&)>;

The call to std::bind would look like this:

Messenger my_messenger; /* an actual instance of the class */

my_function 
    a_simple_function = std::bind(
        &Messenger::say_something, /* pointer to the method */
        &my_messenger,             /* pointer to the object */
        std::placeholders::_1      /* placeholder for the first argument of the method, as usual */
);

a_simple_function("Hello"); // will invoke the method Messenger::say_something on the object my_messenger

C++11: unordered maps

The STL ships with a sorted map template class that is commonly implemented as a balanced binary tree.

The good things about this are the fast average search time (O(log2 N) if implemented as a balanced binary tree, where N is the number of elements added to the map) and that, when the map is iterated, the elements are retrieved in a well-defined order.

C++11 introduces an unordered map implemented as a hash table. The good thing about this is the constant average access time for each element (O(1)), but the bad things are that the elements are not retrieved in any particular order and that the memory footprint of the whole container can (I am just speculating here) be greater than the map’s.
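A minimal usage sketch (the data is, of course, made up):

#include <iostream>
#include <string>
#include <unordered_map>

int main()
{
    std::unordered_map<std::string, int> ages = {
        { "Alice", 30 },
        { "Bob",   25 },
        { "Carol", 35 }
    };

    std::cout << ages["Bob"] << std::endl; // average O(1) lookup

    // Iteration order is unspecified: do not rely on it
    for (const auto& entry : ages)
        std::cout << entry.first << ": " << entry.second << std::endl;

    return 0;
}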

Continue reading “C++11: unordered maps”

C++: Smart pointers, part 1

This is the first of several posts I wrote related to smart pointers:

  1. Smart pointers
  2. unique_ptr
  3. More on unique_ptr
  4. shared_ptr
  5. weak_ptr

Memory management in C is error-prone because keeping track of every block of memory allocated and deallocated can be confusing and stressful.

Although C++ has the same manual memory management as C, it provides additional features that make memory management easier:

  • When an object is instantiated on the stack (e.g., Object o;), the C++ runtime ensures that the object’s destructor is invoked when the object goes out of scope (when the end of the enclosing block is reached, a premature ‘return’ is encountered, or an exception is thrown), thereby releasing all memory and resources allocated for that object. This very nice feature is called RAII.
  • (Ab)using the feature of operator overloading, we can create classes that simulate pointer behaviour. These classes are called smart pointers; a minimal sketch is shown below.
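A minimal, illustrative sketch of such a class (named scoped_ptr here; it is not one of the standard smart pointers, which are covered in the following posts) combines both ideas:

#include <iostream>

// A minimal, illustrative smart pointer: it owns a heap-allocated object and
// releases it automatically when it goes out of scope (RAII).
template <typename T>
class scoped_ptr
{
public:
    explicit scoped_ptr(T* p) : _p(p) { }
    ~scoped_ptr() { delete _p; }            // destructor frees the memory

    T& operator*() const { return *_p; }    // operator overloading simulates
    T* operator->() const { return _p; }    // pointer behaviour

    scoped_ptr(const scoped_ptr&) = delete;            // no copies: single owner
    scoped_ptr& operator=(const scoped_ptr&) = delete;

private:
    T* _p;
};

int main()
{
    scoped_ptr<int> p(new int(42));
    std::cout << *p << std::endl;
    return 0; // p goes out of scope here and the int is deleted automatically
}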
Continue reading “C++: Smart pointers, part 1”

C++: Primitive types

A primitive type is a data type whose values are simple in nature, such as numbers, characters, or boolean values. Primitive types serve as the most fundamental building blocks in any programming language and form the basis for more complex data types. The following are the primitive types available in C++:

bool

It is stored internally in one byte, and the values that a variable of this type can represent are true or false. All boolean operations return a value of this type. This type was not available in early C, so many operations that return integer values instead of boolean ones can be used as boolean expressions. In such cases, the compiler assumes that 0 represents false, and any value different from 0 represents true. For example, the following two code excerpts have the same semantics:

int a = 2;
if (a != 0) //a != 0 evaluates to a boolean value. In this case, it evaluates to true
{
  printf("a is different than 0\n");
}

and

int a = 2;
if (a) //a is an "int", but since it is different than 0, the compiler evaluates it as true
{
  printf("a is different than 0\n");
}

char

It is stored internally as a byte and represents a character. When this data type was created, there was no immediate need for international character support in the language, so it was completely sufficient to store all the characters needed to write in English. However, as the use of computers evolved, expanded, and became globally available, the need for international character support became evident, leading to the definition of new character encoding standards. When these new standards emerged, a new character data type was required because one byte was insufficient to represent all the symbols used in human languages (Chinese glyphs, for example, number more than 40,000). Despite this, char is still used as the standard character data type, and much legacy code still relies on character strings based on char. Some encoding algorithms, such as UTF-8, can store international characters using sequences of char characters, with UTF-8 storing Unicode characters in sequences of 1, 2, 3, or 4 char bytes.
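A small sketch showing this: the accented character below occupies two char elements when encoded as UTF-8, so the byte count and the character count differ.

#include <cstdio>
#include <cstring>

int main()
{
  // "á" (U+00E1) is encoded in UTF-8 as the two bytes 0xC3 0xA1,
  // so this string contains 2 characters but 3 char elements.
  const char* text = "a\xC3\xA1";
  std::printf("bytes: %zu\n", std::strlen(text)); // prints 3
  return 0;
}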

wchar_t

It is a wide-character type that represents a character but is stored internally using 16 or 32 bits, instead of the 8 bits used by the char type. The number of bits it uses depends on the computer architecture, operating system, and C++ compiler. Typically, Windows uses 16-bit characters, while UNIX systems use 32-bit characters. The encoding for wchar_t is not defined by the standard, leaving the choice to the compiler. Both char and wchar_t can be treated as integer types, allowing arithmetic operations on their values. Initially, wchar_t was a type alias (typedef), but modern compilers treat it as a built-in type by default. However, it can still be handled as a typedef to support legacy code.

These days, wchar_t is the default character type in Windows applications, although programmers can configure their projects to use char instead. wchar_t is the default in Windows because the lower-level Win32 API also uses this type by default. If char is selected, Windows converts any char sequence to a wchar_t sequence using a specified encoding.
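A quick sketch showing the wide literal prefix and the platform-dependent size of wchar_t:

#include <cstdio>
#include <cwchar>

int main()
{
  const wchar_t* wide = L"wide text"; // wide string literal (note the L prefix)

  std::printf("sizeof(wchar_t): %zu bytes\n", sizeof(wchar_t)); // 2 on Windows, 4 on most UNIX systems
  std::printf("characters: %zu\n", std::wcslen(wide));          // 9
  return 0;
}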

short

It is a ‘short integer,’ representing an integer with less precision than a ‘full-blown int.’ Though generally, short represents a signed integer with 16-bit precision (meaning it can represent values between -32,768 and 32,767), the decision of what precision to use was left to the compiler implementer. unsigned short is the unsigned version of this 16-bit precision integer, but it represents values between 0 and 65,535.

int

It is the most common integer data type and was originally used to represent a processor ‘word.’ On 16-bit platforms, it used to be a 16-bit precision integer, and on 32-bit platforms, it became a 32-bit precision number. This ‘rule’ was broken when 64-bit hardware became available, but the int data type still retained 32-bit precision. This means it can store numbers between −2,147,483,648 and 2,147,483,647, or between 0 and 4,294,967,295 when using the unsigned int version.

long and unsigned long

They represent ‘long integer numbers,’ and their precision depends on the compiler and the OS. On 16-bit OSes, they used to represent 32-bit precision integers. On 32-bit hardware, they also represent 32-bit precision, and on 64-bit OSes, they have 32-bit precision on Windows and 64-bit precision on UNIX systems.

long long and unsigned long long

They represent 64-bit integers and are part of the standard since C++11.
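The platform dependence described above can be checked with sizeof; the sizes in the comments are what a typical 64-bit Linux system reports and may differ elsewhere:

#include <cstdio>

int main()
{
  std::printf("short:     %zu bytes\n", sizeof(short));     // 2
  std::printf("int:       %zu bytes\n", sizeof(int));       // 4
  std::printf("long:      %zu bytes\n", sizeof(long));      // 8 here; 4 on 64-bit Windows
  std::printf("long long: %zu bytes\n", sizeof(long long)); // 8
  return 0;
}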

float

They represent single-precision floating-point numbers. They are stored in 32 bits (as defined by IEEE 754-2008) and can represent values approximately between 1.18 × 10^-38 and 3.4 × 10^38, with around 6 to 7 significant digits of precision.

double

They represent double-precision floating-point numbers. They are stored in 64 bits and can represent values approximately between 2.225 × 10^-308 and 1.798 × 10^308, with about 15 to 16 significant digits of precision.
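A quick way to see the difference in precision (the exact digits printed may vary slightly between platforms):

#include <cstdio>

int main()
{
  float  f = 3.14159265358979323846f; // only ~7 significant digits survive
  double d = 3.14159265358979323846;  // ~16 significant digits survive

  std::printf("float : %.20f\n", f);
  std::printf("double: %.20f\n", d);
  return 0;
}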

C99 exact-width integer types

C99 also introduced a set of exact-width integer types that represent signed and unsigned integers with precisions of 8, 16, 32, and 64 bits, independent of the compiler, OS, or processor architecture. They are:

  • 8-bit precision: int8_t and uint8_t
  • 16-bit precision: int16_t and uint16_t
  • 32-bit precision: int32_t and uint32_t
  • 64-bit precision: int64_t and uint64_t

These exact-width integer types are not built-in types; they are simply aliases (typedefs) of the primitive types described above. They are widely supported by modern compilers, including GCC, Clang, and MSVC.

C++11 introduced stricter sizes for character types as well (char8_t arrived later, in C++20):

  • 8-bit char: char8_t (C++20)
  • 16-bit char: char16_t
  • 32-bit char: char32_t

The exact-width integer types above are declared in the following header (the charN_t character types are built-in keywords and require no header):

#include <cstdint>

void

Though not exactly a data type, void represents:

  • The absence of parameters in a function when declared as an argument.
  • The absence of a return value in a function.
  • When used with pointers, it represents a pointer to a memory address without any information about the data type at that address.

std::nullptr_t

Introduced in C++11 to represent a null pointer, std::nullptr_t allows for better type safety with null pointer constants. I wrote more about it in this post: nullptr .

C++: C++-style listener list

Sometimes, it is useful to create a class to handle listeners that will be notified when something occurs in a given context. This is a common pattern (the Observer pattern) used in Java Swing, where an event triggers the invocation of one or more functions waiting for that event to occur.

In the example below, written in Java, the class instances perform some task, and the programmers want to be notified when the task has been completed:

public class Task
{
  public void doSomething() { }
 
  public void addTaskListener(TaskListener t);
}
 
public interface TaskListener
{
  void taskFinished(TaskEvent e);
}
 
public static void main(String... args)
{
  Task t = new Task();
  final String name = "TASK 123";

  t.addTaskListener(new TaskListener()
  {
    public void taskFinished(TaskEvent e)
    {
      System.out.println("Task finished: " + name);
    }
  });

  t.addTaskListener(new TaskListener()
  {
    public void taskFinished(TaskEvent e)
    {
      System.out.println("This is a second listener");
    }
  });
  t.doSomething();
}

How can this be implemented as idiomatically as possible in C++?

The first and most basic approach would be to use function pointers. However, a plain function pointer cannot capture any state, whereas an anonymous class in Java, as in the example, has access to the attributes of the class where it was defined as well as to all local variables marked as final.

So, what would the C++ way of doing this look like?

The caller can be implemented as follows using C++ lambdas:

int main()
{
  Task t;
  std::string name = "TASK 123";
  t.addTaskListener([&name]
  {
    std::cout << "Task finished: " << name << std::endl;
  });

  t.addTaskListener([]
  {
    std::cout << "This is a second listener" << std::endl;
  });
  t.doSomething();
}

This is concise, elegant, and powerful: the name local variable can be captured by the lambda function as a closure.

The Task implementation should have a vector of listeners and should be able to access them when the task is successfully executed:

class Task
{
  private:
   std::vector<TaskListener> listeners;
 
  public:
    void addTaskListener(TaskListener lis)
    {
      listeners.push_back(lis);
    }
 
    void doSomething()
    { 
      // ... the actual work goes here ...
      invokeListeners();
    }
 
  private:
    void invokeListeners()
    {
      for (const TaskListener& lis : listeners)
        lis();
    }
};

The problem is: How should TaskListener be declared?

Could it be a template parameter of a class template?

Answer: No.

Why?

Because each lambda function (as shown in the second example) is, under the hood, a class with a functor; so, there is no way to declare it as one single class and use it for two different classes (two lambda functions with different closures are implemented by the compiler as two unrelated classes).

As a second idea, the addTaskListener method could be implemented this way:

template <typename TaskListener>
void addTaskListener(TaskListener t)
{
  listeners.push_back(t);
}

However, in that case, another new problem arises: how could the listeners vector be declared in a way that allows programmers to store one element of a given type and another of a different type?

The correct solution is to use the std::function abstraction.

std::function is a template class that can wrap a function, a functor, or a method, making it very suitable for this problem.

Thus, in the example, TaskListener could be only an alias to a std::function:

#include <functional>
 
using TaskListener = std::function<void ()>;

The parameterized type void () specifies that the function does not receive any arguments and returns void.
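With that alias in place, callables of very different kinds (a capturing lambda, a plain function, a functor) can all be stored in the same vector; a minimal, self-contained sketch:

#include <functional>
#include <iostream>
#include <string>
#include <vector>

using TaskListener = std::function<void ()>;

void plain_listener() { std::cout << "plain function listener" << std::endl; }

int main()
{
    std::vector<TaskListener> listeners;

    std::string name = "TASK 123";
    listeners.push_back([&name] { std::cout << "Task finished: " << name << std::endl; });
    listeners.push_back(plain_listener); // a different kind of callable in the same vector

    for (const TaskListener& lis : listeners)
        lis();
}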

More on std::function here.

C++: Variadic templates (functions)

A very interesting feature introduced in C++11 is called “variadic templates,” which, in short, are template classes and functions that can accept a variable number of parameterized types. They can be useful for several things, for example:

  • Providing type-safe replacements for functions with a variable number of arguments (stdarg.h).
  • Allowing the user to create instances of n-tuples.
  • Creating type-safe containers with elements of various types.

This post takes a look at variadic template functions.

For example, consider a function called show() that accepts any number of parameters and displays them separated by whitespace.

Thus, the following calls would be valid:

show(1, 2, 3); // that would output 1 2 3
show("Today is", dayOfTheWeek); // that would output "Today is Tuesday"
show(p1, p2, p3, p4, p5);

In plain old C, a function with a similar, though not identical, signature could be implemented using the variable argument functions mechanism provided by C, along with the functions, types, and macros defined in the stdarg.h library. However, there are two problems with that approach:

  1. The number of arguments must be provided explicitly or implicitly. A good example of this is the C printf function. When declaring its signature, the number of format specifiers used implicitly defines the number of parameters that need to be accessed. For example, printf("%d %s", a, b); knows that there are two variables, and printf("%d %d %d", x, y, z); knows that there are three.
  2. The function is not type-safe in the sense that the programmer is responsible for determining the type of each variadic argument (that’s what the “%s” or “%d” in printf are used for).

For example:

void show(int n, ...)
{
    va_list x;
    va_start(x, n);
    for (int i = 0; i < n; i++)
    {
      // retrieve the parameter with va_arg and show it. There is no type info here.
    }

    va_end(x);
}

With the example above, since the type of each parameter after n is not specified, the supported types should be documented somehow; otherwise, the behavior will not be deterministic.

Before C++11, there were two partial solutions to this problem: one would be to create several overloaded functions, similar to this implementation:

template <typename T1>
void show(const T1& t1)
{
  std::cout << t1 << std::endl;
}

template <typename T1, typename T2>
void show(const T1& t1, const T2& t2)
{
  std::cout << t1 << " " << t2 << std::endl;
}

template <typename T1, typename T2, typename T3>
void show(const T1& t1, const T2& t2, const T3& t3)
{
  std::cout << t1 << " " << t2 << " "  << t3 << std::endl;
}

… and so on.

The other solution would be to have a base class (similar to the Java object model) and implement a method similar to this one:

void show(const Object* o1, const Object* o2 = nullptr, const Object* o3 = nullptr, const Object* o4 = nullptr, const Object* o5 = nullptr, const Object* o6 = nullptr)
{
  std::cout << o1->toString() << " ";
  if (o2)
      std::cout << o2->toString() << " ";
  if (o3)
    std::cout << o3->toString() << " ";
  if (o4)
    std::cout << o4->toString() << " ";
  if (o5)
    std::cout << o5->toString() << " ";
  if (o6)
    std::cout << o6->toString() << " ";

  std::cout << std::endl;
}

Though they would work, both approaches have their weaknesses. Both will support only a fixed number of arguments. In the first implementation, the programmer must provide N overloads to support N parameters, and in the second implementation, the programmer must provide a class hierarchy to make it work.

Variadic templates perform a type expansion similar to the template-based implementation (my first solution above), but this is done by the compiler instead of the programmer.

What is interesting about such expansion is that it is performed recursively.

This code implements the show function using variadic templates:

template <typename T>
void show(const T& value)
{
  std::cout << value << std::endl;
}

template <typename U, typename... T>
void show(const U& head, const T&... tail)
{
   std::cout << head << " ";
   show(tail...);
}

The first overload is the base case: it is invoked either when a single parameter is passed to show() or when the recursive expansion has reduced the parameter list to its last argument.

The second overload is very interesting because we declare a function that takes two elements: U and T. U represents a concrete type (the type of the element to be actually printed), while T represents a list of types (notice the ... syntax). The argument list of the function is also interesting: head will be a const reference to a value of type U, and ...tail will represent a set of const references to several types.

Now, look at the implementation. We take the head and display it using std::cout, and then invoke another overload of show() by passing the tail list of parameters. The tail... syntax is called parameter pack expansion, and it is similar to taking all the arguments that tail represents and “expanding” them as individual parameters in the function call. At this point, if the list of parameters contains more than one type, the compiler will instantiate a new overload of this function. If the list of parameters contains only one type, the compiler will invoke the first overload already defined.
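Putting the two overloads together, a call with arguments of mixed types expands as spelled out in the comments below:

#include <iostream>

template <typename T>
void show(const T& value)
{
  std::cout << value << std::endl;
}

template <typename U, typename... T>
void show(const U& head, const T&... tail)
{
   std::cout << head << " ";
   show(tail...);
}

int main()
{
  // show(1, "two", 3.5) expands roughly like this:
  //   show(1, "two", 3.5) -> prints "1 "   and calls show("two", 3.5)
  //   show("two", 3.5)    -> prints "two " and calls show(3.5)
  //   show(3.5)           -> base case: prints "3.5" followed by a newline
  show(1, "two", 3.5);
  return 0;
}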

Amazing, isn’t it?

This feature is heavily used in variadic template-based classes in the Standard Library, such as std::tuple (C++11) and std::variant (C++17).

C++: nullptr

In C and C++, the preprocessor definition NULL is/was used to explicitly indicate that a pointer is not pointing anywhere right now.

The problem with NULL is that underneath it is just a plain 0, something like this:

#define NULL 0

The following code excerpt shows how this can turn into a problem:

#include <iostream>
 
void m(int x)
{
    std::cout << x << std::endl;
}
 
void m(const char* x)
{
    std::cout << (x == NULL ? "NULL" : x) << std::endl;
}
 
int main()
{
  m(12);
  m(NULL);
  return 0;
}

When that code is compiled, the following error is displayed:

test.cpp: In function 'int main()':
test.cpp:19:9: error: call of overloaded 'm(NULL)' is ambiguous
test.cpp:19:9: note: candidates are:
test.cpp:5:6: note: void m(int)
test.cpp:10:6: note: void m(const char*)

Why does the compiler consider the m(NULL) ambiguous?

Because it cannot decide if NULL is actually a pointer (because it is a 0) or an integer (because NULL is… a 0).

C++11 introduces the literal nullptr, of type std::nullptr_t: a literal that unambiguously represents a pointer pointing nowhere, eliminating the notion of a null pointer being a pointer to memory address 0.

It has these advantages:

  • It is a strongly typed null pointer, like Java or C# null.
  • No ambiguity when trying to pass a null pointer to something accepting, say, an int.
  • It is a literal and a keyword, so it cannot be defined or used as a symbolic name.
  • It cannot be implicitly converted to a numeric type, as NULL can.

So, if the main function of the example above is rewritten as follows:

int main()
{
  m(12);
  m(nullptr);
  return 0;
}

The code will compile and execute with no problems.

Since the type of nullptr is std::nullptr_t, which is a native type, programmers can also write a specific overload when the nullptr literal is passed as a parameter, as in the following example:

void m(int x)
{
    std::cout << x << std::endl;
}
 
void m(const char* x)
{
    std::cout << x << std::endl;
}
 
void m(std::nullptr_t)
{
    std::cout << "nullptr!" << std::endl;
}
 
int main()
{
  m(12);
  m(nullptr);
  return 0;
}

C++: Range-based for loop

Before C++11, when programmers wanted to display the elements stored in a vector, they had to write something like this:

template <typename T>
void show(const T& x)
{
    for (typename T::const_iterator i = x.begin(); i != x.end(); ++i)
        std::cout << *i << std::endl;
}

That function works with any collection that provides begin() and end() methods and an iterator with operator++(), operator!=(), and operator*() properly implemented.

C++11 and later versions ship with a very useful range-based for loop that makes iterations easier to write and read. This new for loop works also with plain-old arrays.

So, the code above can be rewritten like this in modern C++:

template <typename T>
void show(const T& x)
{
    for (auto& i : x)
        std::cout << i << std::endl;
}

As shown, the syntax is very similar to Java’s “enhanced-for” loop, and the resulting code is much easier to write and understand compared to the old version.

In the next example, you can see how this for loop can be used with several containers and arrays:

#include <vector>
#include <array>
#include <string>
#include <iostream>
  
template <typename T>
void show(const T& t)
{
    for (auto& i : t)
        std::cout << i << std::endl;
}
 
int main()
{
    int ints[] = { 10, 20, 30, 40, 50 };
    show(ints);
 
    std::cout << "*****" << std::endl;
 
    std::vector<std::string> s = { 
            "Monday", "Tuesday",
            "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"
    };
 
    show(s);
     
    std::cout << "*****" << std::endl;
     
    std::array<std::string, 12> m = {
            "January", "February",
            "March", "April", "May",
            "June", "July", "August",
            "September", "October",
            "November", "December"
    };
 
    show(m);                            
    std::cout << "*****" << std::endl;
 
    return 0;   
}

Custom classes can also be iterated using the new for loop if the required methods are implemented, as in this example:

#include <vector>
#include <array>
#include <iostream>
 
template <typename T>
void show(const T& t)
{
    for (auto& i : t)
        std::cout << i << std::endl;
}
 
 
class RangeIterator
{
    private:
        int _index;
         
    public:
        explicit RangeIterator(int index) : _index(index) { }
         
        bool operator!=(const RangeIterator& x) const
        {
            return _index != x._index;
        }
         
        const int& operator*() const
        {
            return _index;
        }
         
        int& operator++()
        {
            return ++_index;
        }
};
 
template <int N, int M>
class Range
{
    public:
        using const_iterator = const RangeIterator;
         
        const_iterator begin() const
        {
            return const_iterator{N};
        }
         
        const_iterator end() const
        {
            return const_iterator{M};
        }
};
 
int main()
{
    Range<10, 20> r;
    show(r);
 
    return 0;   
}

Programmers can even iterate over a range of numbers (like the classic for loop) in this very elegant way:

int main()
{
    for (auto i : Range<10, 20>{})
    {
        std::cout << i << std::endl;
    }
}

The class to be iterated needs to have begin() and end() methods that return an iterator.

An iterator is basically a class that implements operator++(), operator!=(), and operator*() to advance to the next element, check whether the end has been reached, and return the current element, respectively. Its semantics mimic pointer arithmetic to behave similarly to how a plain old array can be iterated.
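Conceptually (this is a rough sketch, not the exact wording of the standard), the compiler rewrites a range-based for loop into an iterator loop like the one below:

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> c = { 1, 2, 3 };

    // for (auto& i : c) { std::cout << i << std::endl; }
    // is conceptually rewritten into something like:
    {
        auto it = c.begin();  // the iterators are obtained once
        auto last = c.end();
        for (; it != last; ++it)
        {
            auto& i = *it;    // the loop variable refers to the current element
            std::cout << i << std::endl;
        }
    }
    return 0;
}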

C++: Move semantics

This example involves a class A and a container called List<T>. As shown below, the container is essentially a wrapper around std::vector.

There is also a function called getNObjects that returns a list containing N instances of class A.

#include <vector>
#include <string>
#include <iostream>
 
class A
{
public:
    A() = default;
    ~A() = default;
    A(const A& a) { std::cout << "copy ctor" << std::endl ; }
     
    A& operator=(const A&)
    {
        std::cout << "operator=" << std::endl;
        return *this;
    }
};
 
template <typename T>
class List
{
private:
    std::vector<T>* _vec;
     
public:
    List() : _vec(new std::vector<T>()) { }
    ~List() { delete _vec; }
     
    List(const List<T>& list)
        : _vec(new std::vector<T>(*(list._vec)))
    {
    }
     
    List<T>& operator=(const List<T>& list)
    {
        delete _vec;
        _vec = new std::vector<T>(*(list._vec));
        return *this;
    }
     
    void add(const T& a)
    {
        _vec->push_back(a);
    }
     
    int getCount() const
    {
        return static_cast<int>(_vec->size());
    }
     
    const T& operator[](int i) const
    {
        return (*_vec)[i];
    }
};
  
List<A> getNObjects(int n)
{
    List<A> list;
    A a;
    for (int i = 0; i< n; i++)
        list.add(a);
 
    std::cout << "Before returning: ********" << std::endl;
    return list;
}
 
int main()
{
    List<A> list1;
    list1 = getNObjects(10);    
    return 0;
}

When this code runs, it will produce an output like this:

...
...
copy ctor
copy ctor
Before returning: ********
copy ctor
copy ctor
copy ctor
copy ctor
copy ctor
copy ctor
copy ctor
copy ctor
copy ctor
copy ctor

The number of calls to the copy constructor equals the number of objects in the list!

Why?

Because when the getNObjects() function returns a list, all its attributes are copied (i.e., the internal vector is copied) to the list that receives the result (list1), and then the local list inside the function is destroyed (triggering the destructor for each object in the list). Though this is logically correct, it results in poor performance due to many unnecessary copies and destructions.

Starting from C++11, a new type of reference is available to address this problem: rvalue references. An rvalue reference binds to a temporary object (rvalue), which is typically the result of an expression that is not bound to a variable. Rvalue references are denoted using the symbol &&.

With rvalue references, programmers can create move constructors and move assignment operators, which improve performance when returning or copying objects in cases like this example.

How does this work?

In the example, the List<T> class contains a pointer to a vector. What happens if, instead of copying every object in the std::vector, the programmers “move” the vector pointer from the local list inside the function to the list that receives the result? This would save a lot of processing time by avoiding unnecessary copies and destructions.

Thus, the move constructor and move assignment operator work as follows: They receive an rvalue reference to the list being moved, “steal” its data, and take ownership of it. Taking ownership means that the object receiving the data is responsible for releasing all resources originally managed by the moved-from object (achieved by setting the original object’s pointer to nullptr).

Here’s how the move constructor and move assignment operator can be implemented for the List<T> class:

List(List<T>&& list) //move constructor
    : _vec(list._vec)
{
    list._vec = nullptr; //releasing ownership
}
     
List<T>& operator=(List<T>&& list)
{
    delete _vec;
    _vec = list._vec;
    list._vec = nullptr; //releasing ownership
    return *this;
}

With these changes, the output would be:

...
...
copy ctor
copy ctor
Before returning: ********

All the copy constructor calls after the “Before returning” line are avoided.

Isn’t that great?

What other uses do rvalue references have?

Here’s an example of a simple swap function for integers:

void my_swap(int& a, int& b)
{
  int c = a;
  a = b;
  b = c;
}

Straightforward enough. But what if, instead of swapping two integers, we needed to swap two large objects (such as vectors, linked lists, or other complex types)?

template <typename T>
void my_swap(T& a, T& b)
{
  T c = a;
  a = b;
  b = c;
}

If the copy constructor of class T is slow (like the std::vector copy constructor), this version of my_swap can have very poor performance.

Here’s an example demonstrating the issue:

#include <iostream>
#include <string>
  
class B
{
    private: int _x;
    public:
        B(int x) : _x(x) { std::cout << "ctor" << std::endl; }
         
        B(const B& b) : _x(b._x)
        {
            std::cout << "copy ctor" << std::endl;
        }
         
        B& operator=(const B& b)
        {
            _x = b._x;
            std::cout << "operator=" << std::endl;
            return *this;
        }
 
        friend std::ostream& operator<<(std::ostream& os, const B& b)
        {
            os << b._x;
            return os;
        }
         
};
 
template <typename T>
void my_swap(T& a, T& b)
{
    T c = a; //copy ctor, possibly slow
    a = b;   //operator=, possibly slow
    b = c;   //operator=, possibly slow
}
 
int main()
{
    B a(1);
    B b(2);
    my_swap(a, b);
    std::cout << a << "; " << b << std::endl;
    return 0;
}

The output is:

ctor
ctor
copy ctor
operator=
operator=
2; 1

The class B is simple, but if the copy constructor and assignment operator are slow, my_swap‘s performance will suffer.

To add move semantics to class B, a move constructor and a move assignment operator must be implemented:

B(B&& b)  : _x(b._x)
{
    std::cout << "move ctor" << std::endl;
}
         
B& operator=(B&& b)
{
    _x = b._x;
    std::cout << "move operator=" << std::endl;
    return *this;
}

However, the move constructor and move assignment operator will not be invoked automatically. Inside my_swap, a, b, and c are lvalues (named variables), so the compiler selects the copy versions of the constructor and assignment operator.

This problem can be fixed by explicitly telling the compiler to use move semantics using the function template std::move():

template <typename T>
void my_swap(T& a, T& b)
{
    T c = std::move(a); //move ctor, fast
    a = std::move(b);   //move operator=, fast
    b = std::move(c);   //move operator=, fast
}

The std::move function casts an lvalue to an rvalue reference, signaling to the compiler that it should use the move constructor and assignment operator.

The updated output is:

ctor
ctor
move ctor
move operator=
move operator=
2; 1

The entire standard library has been updated to support move semantics.
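As an aside, std::move itself performs no moving at all: it is essentially a cast. A simplified, illustrative version (named my_move here to make clear it is not the real library code) could look like this:

#include <iostream>
#include <string>
#include <type_traits>

// Simplified sketch: cast the argument to an rvalue reference so that the
// move constructor / move assignment operator are selected by overload resolution.
template <typename T>
typename std::remove_reference<T>::type&& my_move(T&& t) noexcept
{
    return static_cast<typename std::remove_reference<T>::type&&>(t);
}

int main()
{
    std::string a = "hello";
    std::string b = my_move(a); // b steals a's contents, just like with std::move
    std::cout << b << std::endl;
    return 0;
}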

Perfect forwarding is another feature built on top of rvalue references.

C++: Lambda expressions

Having this Person class:

class Person
{
  private:
    std::string firstName;
    std::string lastName;
    int id;

  public:
    Person(const std::string& fn, const std::string& ln, int i)
    : firstName{fn}
    , lastName{ln}
    , id{i}
    {
    }

    const std::string& getFirstName() const { return firstName; }
    const std::string& getLastName() const { return lastName; }
    int getID() const { return id; }
};

The programmers need to store several instances of this class in a vector:

std::vector<Person> people;
people.push_back(Person{"Davor", "Loayza", 62341});
people.push_back(Person{"Eva", "Lopez", 12345});
people.push_back(Person{"Julio", "Sanchez", 54321});
people.push_back(Person{"Adan", "Ramones", 70000});

If they want to sort this vector by person ID, a PersonComparator must be implemented to be used in the std::sort algorithm from the standard library:

class PersonComparator
{
  public:
     bool operator()(const Person& p1, const Person& p2) const
     {
        return p1.getID() < p2.getID();
     }
};

PersonComparator pc;
std::sort(people.begin(), people.end(), pc);

Before C++11, the programmers needed to create a separate class (or alternatively a function) to use the sort algorithm (actually to use any standard library algorithm).

C++11 introduced “lambda expressions”, which are a nice way to implement that functionality to be passed to the algorithm exactly when it is going to be used. So, instead of defining the PersonComparator as shown above, the same functionality could be achieved by implementing it in this way:

std::sort(people.begin(), people.end(), [](const Person& p1, const Person& p2)
{
  return p1.getID() < p2.getID();
});

Quite simple and easier to read. The “[]” square brackets (the capture list) specify which external variables will be available inside the lambda. “[]” means: “I do not want my lambda function to capture anything”; “[=]” means: “capture everything by value” (thanks Jeff for your clarification on this!!); “[&]” means: “capture everything by reference”.
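A small example of the difference between capturing by value and by reference (counter is just an illustrative local variable):

#include <iostream>

int main()
{
  int counter = 0;

  auto by_value = [=] { std::cout << "captured copy: " << counter << std::endl; };
  auto by_reference = [&] { ++counter; }; // modifies the original variable

  by_reference();
  by_reference();
  by_value(); // prints 0: the copy was taken when the lambda was created

  std::cout << "counter is now " << counter << std::endl; // prints 2
  return 0;
}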

Given the vector declared above, what if the programmers want to show all instances inside it? They could do this before C++11:

std::ostream& operator<<(std::ostream& os, const Person& p)
{
    os << "(" << p.getID() << ") " << p.getLastName() << "; " << p.getFirstName();
    return os;
}

class show_all
{
public:
    void operator()(const Person& p) const
    { 
        std::cout << p << std::endl;
    }
};

show_all sa;
std::for_each(people.begin(), people.end(), sa);

And with lambdas the example could be implemented in this way:

std::for_each(people.begin(), people.end(), [](const Person& p)
{
    std::cout << p << std::endl;
});

C++: ‘auto’

Starting with C++11, C++ contains a lot of improvements to the core language as well as a lot of additions to the standard library.

The aims for this new version, according to Bjarne Stroustrup, were to make C++ a better language for systems programming and library building, and to make it easier to learn and teach.

auto is an already existing keyword inherited from C that was used to mark a variable as automatic (a variable that is automatically allocated and deallocated when it goes out of scope). All variables in C and C++ are auto by default, so this keyword was rarely used explicitly. Recycling it and changing its semantics in C++ was a very pragmatic decision: it avoided introducing a new keyword and thus breaking old code.

Since C++11, auto is used to infer the type of the variable that is being declared and initialized. For example:

int a = 8;
double pi = 3.141592654;
float x = 2.14f;

in C++11 and later, the code above can be declared in this way:

auto a = 8;
auto pi = 3.141592654;
auto x = 2.14f;

In the last example, the compiler will infer that a is an integer, pi is a double and x is a float.

Someone could say: “come on, I do not see any advantage in this because it is clearly obvious that ‘x‘ is a float and it is easy for me to infer the type instead of letting the compiler do it”, and yes, though the user will always be able to infer the type, doing so is not always as evident or easy as it seems. For example, if someone wants to iterate over a std::vector in a template function, as in the code below:

template <typename T>
void show(const std::vector<T>& vec)
{
    for (auto i = vec.begin(); i != vec.end(); ++i) // notice the 'auto' here
        std::cout << *i << std::endl;
}

the following declaration:

auto i = vec.begin();

in ‘old’ C++ would have to be written as:

typename std::vector<T>::const_iterator i = vec.begin();

So, in this case, the auto keyword is very useful and makes the code even easier to read and more intuitive.

Anyway, there are restrictions on its usage:

  • Before C++20, it is not possible to use auto as the type of a function or method parameter.
  • All variables declared in a single auto statement must deduce to the same type; for example, this is invalid:

auto x = 2, b = true, c = "hello"; // invalid: the initializers deduce to different types

Starting with C++14, the return type of functions can be deduced automatically (you can read more about that here). auto can also be used in lambda expression parameters (more info here). Starting with C++20, the parameters of a function can also be declared auto.
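A few sketches of those later additions (the first two need a C++14 compiler, the last one C++20; the names are just illustrative):

#include <iostream>

// C++14: the return type is deduced from the return statement
auto add(int a, int b)
{
    return a + b;
}

// C++14: generic lambda, auto in the parameter list
auto print = [](const auto& value) { std::cout << value << std::endl; };

// C++20: auto directly as a function parameter (an abbreviated function template)
void show_twice(const auto& value)
{
    std::cout << value << " " << value << std::endl;
}

int main()
{
    print(add(2, 3)); // 5
    show_twice("hi"); // hi hi
    return 0;
}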


C++: Pimpl

Imagine we have this class defined in the reader.dll DLL:

class DLLEXPORT Reader
{
  public:
    Reader(const std::string& filename);
    ~Reader();        

    std::string readLine() const;
    bool        isEndOfFile()    const;        

  private:
    FILE* file;
};

This class allows its user to read from a file only once.

What if the user wants to use this same class to read from a file multiple times? It can be modified as follows:

class DLLEXPORT Reader
{
  public:
    Reader(const std::string& filename);
    ~Reader();

    std::string readLine() const;
    bool isEndOfFile() const;
    void restart();

  private:
    std::string filename;
    FILE* file;
};

What has been done here is adding the file name as a class attribute, allowing the file to be opened multiple times. Additionally, a ‘restart’ method was introduced.

Below is a function that uses the first version of the reader.dll DLL.

void showFile(const std::string& file)
{
  Reader reader(file);
  while (!reader.isEndOfFile())
  {
    std::cout << reader.readLine() << std::endl;
  }
}

The problem arises when users attempt to link their code with the second version of the reader.dll. The program may malfunction, crash, or fail entirely. Why?

Although the API of the second version is compatible with the first (meaning the code will link perfectly), the ABIs are not. The ABI, or ‘Application Binary Interface’, defines how binaries are linked. Why are the ABIs incompatible? Because the ‘filename’ attribute was added before the ‘file’ attribute, the class layout and size changed: code compiled against the old header still accesses ‘file’ at its old offset, which now falls where ‘filename’ is located. Since these are different types, the program will behave unpredictably.

This issue occurs because the class header explicitly declares class attributes, which is a well-known encapsulation problem in C++. A similar problem can occur even without adding or removing methods if, for instance, private attributes are replaced (e.g., changing FILE* to std::fstream).

The ‘pimpl idiom’ (also known as the ‘opaque pointer’ or ‘cheshire cat’ idiom) is a C++ technique to avoid this problem. The idea is to include a pointer to a struct in the class interface (.h) to store the class attributes, but define the struct inside the .cpp file, keeping it hidden from the interface. Doing this resolves several issues:

  • ABI compatibility is maintained because the class attributes are not exposed in the .h file and are used only internally within the DLL.
  • It provides better encapsulation (the .h files only expose what the user needs to know).
  • The sizeof(Reader) (in this example) remains the same regardless of how many attributes the class has, as they are hidden within the Pimpl. This is crucial because it prevents memory layout shifts when the implementation changes.
  • If only the implementation changes, the project using our .h does not need to be recompiled since the .h remains unchanged.

So, how would the example look?

VERSION 1: Interface: “Reader.h”

class ReaderImpl; // forward declaration

class DLLEXPORT Reader
{
  public:
    Reader(const std::string& filename);
    ~Reader();

    std::string readLine() const;
    bool isEndOfFile() const;

  private:
    ReaderImpl* pImpl; // pointer to the class attrs
};

Implementation: “Reader.cpp”

#include "Reader.h"

//Here we define the struct to use
struct ReaderImpl
{
  FILE* file;
};

Reader::Reader(const std::string& n)
{
  pImpl = new ReaderImpl{};
  pImpl->file = fopen(n.c_str(), "r");
}

Reader::~Reader()
{
  fclose(pImpl->file);
  delete pImpl;
}

std::string Reader::readLine() const
{
  char aux[256];
  fgets(aux, 256, pImpl->file);
  return {aux};
}

bool Reader::isEndOfFile() const
{
  return feof(pImpl->file);
}

VERSION 2: Interface: “Reader.h”
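The interface only gains the restart() declaration; the opaque pointer member is unchanged, so the class layout (and the ABI) stays the same:

class ReaderImpl; // forward declaration

class DLLEXPORT Reader
{
  public:
    Reader(const std::string& filename);
    ~Reader();

    std::string readLine() const;
    bool isEndOfFile() const;
    void restart(); // the only visible change in version 2

  private:
    ReaderImpl* pImpl; // still a single opaque pointer
};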

Implementation: “Reader.cpp”

#include "Reader.h"

struct ReaderImpl
{
  std::string filename; //new attribute for version 2
  FILE* file;
};

Reader::Reader(const std::string& n)
{
  pImpl = new ReaderImpl{};
  pImpl->filename = n;
  pImpl->file = fopen(n.c_str(), "r");
}

Reader::~Reader()
{
  fclose(pImpl->file);
  delete pImpl;
}

std::string Reader::readLine() const
{
  char aux[256];
  fgets(aux, 256, pImpl->file);
  return {aux};
}

bool Reader::isEndOfFile() const
{
  return feof(pImpl->file);
}

void Reader::restart()
{
  fclose(pImpl->file);
  pImpl->file = fopen(pImpl->filename.c_str(), "r");
}

If the programmers of the reader.dll had used the ‘pimpl idiom’ from the beginning, the new Reader.dll would not have affected its consumers at all. This is because the new version would have maintained both API and ABI backwards compatibility.