C++11: unordered maps

The STL ships with a sorted map template class that is commonly implemented as a balanced binary tree.

The good thing on this is the fast search average time ( O(log2N) if implemented as a balanced binary tree, where N is the number of elements added into the map) and that when the map is iterated, the elements are retrieved following an established order.

C++11 introduces an unordered map implemented as a hash table. The good thing on this is the constant access time for each element ( O(1) ) but the bad things are that the elements are not retrieved in order and the memory print of the whole container can (I am just speculating here) be greater that the map’s one.

The API of both kind of maps is quite similar; for example, look at this code:

#include <iostream>
#include <string>
#include <map>
#include <unordered_map>

using namespace std;

template <class T>
void show(const T& x)
{
	for (const auto& i : x)
		cout << i.first << "; " << i.second << endl;
}

template <template <class...> class T>
void test()
{
	T<string, int> x;
	x.insert(make_pair("one", 1));
	x.insert(make_pair("two", 2));
	x.insert(make_pair("three", 3));
	x.insert(make_pair("four", 4));
	
	show(x);
}



int main()
{
  test<map>();
  cout << "*******************" << endl;
  test<unordered_map>();
}

The output of the program is:


four; 4
one; 1
three; 3
two; 2
*******************
one; 1
three; 3
two; 2
four; 4

As you can see,

  • Both maps are created and used in the same way.
  • The output of using the unordered_map is not sorted.

Everything is perfect until this point because we are using primitive types and strings provided by the STL.

But using unordered maps is not so smooth if we want to use our own classes or structs as keys.

In Java, for example, we need to override the equals and hashCode methods of the class we want to use as key; if we do not do that, the code will compile but will not work as we want to.

In C++ we need to do something similar, we need to implement a function object to return a hash code and other function object to compare two objects of our class or struct and pass the classes of our function objects as template arguments when instantiating our unordered map.

The STL unordered map is declared as:

template <typename KEY, typename VALUE, typename HASH = hash<KEY>, typename EQUAL = equal_to<K> >
class unordered_map
{ ..... };

The hash template argument is a function object class that has an implementation similar to this one:

template <typename T>
class hash
{
    public:
        long operator()(const T& o) const { return 0; }
};

and provides specializations for all the primitive types and for the std::string class.

The equal_to class template is implemented similar to this:

template <typename T>
class equal_to
{
  public:
    bool operator()(const T& a, const T& b) const { return a == b; }
};

So, if the class we want to use as key overloads the operator==, we do not need to provide an argument for the EQUAL template argument.

Consider the example of having the struct article_id and the class article implemented as follows:

struct article_id
{
  string brand;
  int id;
  
  article_id(const string& b, int id) : brand(b), id(id) { }
};

ostream& operator<<(ostream& os, const article_id& a)
{
  return os << "(" << a.brand << ") " << a.id;
}

class article
{
  string brand, model, description;

  public:
    article(const string& b, const string& m, const string& d)
       : brand(b), model(m), description(d) { }

    const string& get_brand() const { return brand; }
    const string& get_model() const { return model; }
    const string& get_description() const { return description; }
};

ostream& operator<<(ostream& os, const article& a)
{
  return os << "Brand: " << a.get_brand() << "; Model: " << a.get_model()
         << "; Description: " << a.get_description();
}

I want to have an unordered map with article_id instances as keys and article instances as values.
So, I need to create the classes described above:


class article_hash
{
  public:
    long operator()(const article_id& x) const
    {
    	hash<string> z;
    	return z(x.brand) * 100 + x.id;
    }
};

class article_equal_to
{
  public:
     bool operator()(const article_id& a, const article_id& b) const
     {
     	return a.brand == b.brand && a.id == b.id;
     }
};

Having those, we can already use the unordered_map with our article_id class as key:

int main()
{
   unordered_map<article_id, article, article_hash, article_equal_to> x;
   x.insert(make_pair(article_id("acme", 1), article("acme", "tv", "Acme TV")));
   x.insert(make_pair(article_id("foo", 4), article("foo", "phone", "fooPhone")));
   show(x);
   return 0;
}

Nice, but a slightly sophisticated version of our article_hash and article_equal_to classes would be if we would implement them as specializations of the std::hash and std::equal_to template classes:

namespace std
{
	template <>
	class hash<article_id>
	{
	  public:
		long operator()(const article_id& x) const
		{
			hash<string> z;
			return z(x.brand) * 100 + x.id;
		}
	};
	
	template <>
	class equal_to<article_id>
	{
	  public:
		 bool operator()(const article_id& a, const article_id& b) const
		 {
			return a.brand == b.brand && a.id == b.id;
		 }
	};
}

And, as we are providing specializations of the default classes, we do not need to explicit them when instantiating our unordered map:

int main()
{
   unordered_map<article_id, article> x;
   x.insert(make_pair(article_id("acme", 1), article("acme", "tv", "Acme TV")));
   x.insert(make_pair(article_id("foo", 4), article("foo", "phone", "fooPhone")));
   show(x);
   return 0;
}

Really, really cool! :)

Advertisements

9 thoughts on “C++11: unordered maps

  1. Though I think adding unordered maps to C++ was a good thing, they are often overused (at least in Java they were, since it lacked a Tree). They have two key limitations:

    – They require a good hash algorithm. If your hashing is poor the performance can degenerate, in worse case, to O(N) for lookup/insertion

    – They have a bad iteration performance. The time to iterate relates to the number of buckets, not how many items are in it. Any place where you need to frequently iterate over the elements this is a problem.

    1. You are completely right. Actually I over(ab)used of Java and C# maps implementing caches. Anyway, you will always have some pros and cons to choose your data structures; and having hash tables in STL gives us more where to pick from :)

      Thanks for your comment! :)

    1. In my example I tried to explain that the unordered_map<K, V> provides the same functionality than the map<K, V> so, what better way of doing it that using the same template method that receives a map or an unordered map in its template arguments :)

      So I tried to implement something like:

      template < template <typename K, typename V> class T>
      void test() { }

      and invoke:

      test<unordered_map>();
      test<map>();

      (as I finally did).

      But such declaration does not compile, because when you define something like:

      template < template < typename K, typename V> class T>….

      you are saying the compiler that your T template argument is a template class with two arguments, so, in the calling you do not need to say test<map<K, V> >(); just test<map>();

      So, then… why this does not compile if I create instances of both types of maps providing two template arguments: K and V?

      Because the complete declaration of a map is:

      template <typename K, typename V, typename COMP = less<K>, typename ALLOC = allocator<pair<const K, V> > class map; //4 template arguments

      And of an unordered_map is:

      template <typename K, typename V, typename HASH = hash<K>, typename EQUAL = equal_to<K>, typename ALLOC = allocator<pair<const K, V> > class unordered_map; //5 template arguments

      So, although we just write two arguments, the map uses 4 and the unordered_map uses 5; so… my method simply would work because of that.

      Solution?

      Use variadic templates :)

      So, if instead of defining:

      template < template <typename K, typename V> class T>
      void test() { }

      I define it as:

      template < template <typename …> class T>
      void test() { }

      I solve all my problems! :) In the latter declaration I say: I want to receive a template class with any number of arguments :)

      and so, when I invoke test<map>(); the compiler will instantiate my function with the map and its 4 template arguments; and when I invoke test<unordered_map>(); the compiler will instantiate my function with the unordered_map and its 5 template arguments.

      Thanks for reading me! :)

      1. many thanks for the reply and particularly for a great explanation! please continue the good work.

      2. I was wondering about that too. Cool! Variadic takes time to getting used to.

        Great blog by the way!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s