Imagine we have this class defined in the reader.dll DLL:
class DLLEXPORT Reader
{
public:
    Reader(const std::string& filename);
    ~Reader();
    std::string readLine() const;
    bool isEndOfFile() const;
private:
    FILE* file;
};
This class allows its user to read from a file only once.
What if the user wants to use this same class to read from a file multiple times? It can be modified as follows:
class DLLEXPORT Reader
{
public:
    Reader(const std::string& filename);
    ~Reader();
    std::string readLine() const;
    bool isEndOfFile() const;
    void restart();
private:
    std::string filename;
    FILE* file;
};
The file name is now stored as a class attribute so that the file can be reopened, and a ‘restart’ method was introduced to do so.
Below is a function that uses the first version of the reader.dll DLL.
void showFile(const std::string& file)
{
    Reader reader(file);
    while (!reader.isEndOfFile())
    {
        std::cout << reader.readLine() << std::endl;
    }
}
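The problem arises when a program that was compiled against the first version is run with the second version of reader.dll without being recompiled. The program may malfunction or crash. Why?
Although the API of the second version is compatible with the first (meaning the existing source code compiles and links unchanged), the ABIs are not. The ABI, or ‘Application Binary Interface’, describes how compiled binaries interact: object sizes and layouts, calling conventions, and symbol names. Why are the ABIs incompatible? Because the ‘filename’ attribute was added before the ‘file’ attribute, a Reader object is now larger and ‘file’ sits at a different offset. The caller, compiled against the old header, still reserves only the old sizeof(Reader) bytes for the object, while the constructor and methods in the new DLL read and write the larger layout. They therefore touch memory the caller never allocated for the object, and any caller code that accesses ‘file’ at its old offset finds part of ‘filename’ there instead. Since these are different types, the program will behave unpredictably.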
This issue occurs because a C++ class header has to declare the private attributes of the class, so they become part of the binary layout that callers are compiled against. This is a well-known encapsulation problem in C++. The same problem can occur even when no methods are added or removed if, for instance, a private attribute is replaced by one of a different size (e.g., changing FILE* to std::fstream).
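As a rough illustration of the layout change described above, the sketch below mirrors only the data members of the two Reader versions in two hypothetical structs (ReaderV1Layout and ReaderV2Layout are illustrative names, not part of reader.dll) and compares their sizes. The exact numbers depend on the compiler and standard library; the only thing the sketch relies on is that the second layout is larger than the first.

```cpp
#include <cstddef>  // std::size_t
#include <cstdio>   // FILE
#include <string>

// Hypothetical stand-ins: only the private data members of each version.
struct ReaderV1Layout
{
    FILE* file;
};

struct ReaderV2Layout
{
    std::string filename; // new in version 2, declared before 'file'
    FILE* file;
};

// Bytes a caller built against version 1 reserves for a Reader object.
constexpr std::size_t bytesReservedByOldCaller = sizeof(ReaderV1Layout);

// Bytes the version 2 constructor and methods expect to be available.
constexpr std::size_t bytesUsedByNewDll = sizeof(ReaderV2Layout);

// How many bytes the new DLL would touch beyond the caller's object.
constexpr std::size_t overrunInBytes()
{
    return bytesUsedByNewDll - bytesReservedByOldCaller;
}

// Fails to compile if the two layouts ever had the same size.
static_assert(overrunInBytes() > 0,
              "version 2 objects are larger than version 1 objects");
```

Any non-zero overrun means the new DLL writes past the storage an old caller set aside, which is the kind of mismatch that leads to the unpredictable behavior described above.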
The ‘pimpl idiom’ (also known as the ‘opaque pointer’ or ‘cheshire cat’ idiom) is a C++ technique to avoid this problem. The idea is to include a pointer to a struct in the class interface (.h) to store the class attributes, but define the struct inside the .cpp file, keeping it hidden from the interface. Doing this resolves several issues:
- ABI compatibility is maintained because the class attributes are not exposed in the .h file and are used only internally within the DLL.
- It provides better encapsulation (the .h files only expose what the user needs to know).
- sizeof(Reader) (in this example) remains the same, regardless of how many attributes the class has, as they are hidden within the Pimpl. This is crucial because it prevents memory layout shifts when the implementation changes.
- If only the implementation changes, the project using our .h does not need to be recompiled since the .h remains unchanged.
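The size guarantee in the list above can be checked at compile time. The following is a minimal sketch, assuming a mainstream compiler on which a class with a single pointer member is exactly pointer-sized. PimplReader and PimplReaderImpl are hypothetical names used only for this illustration, and the header and .cpp parts are shown in one file for brevity.

```cpp
#include <cstdio>   // FILE
#include <string>

// ---- what would go in the header ----
struct PimplReaderImpl; // hypothetical name; only declared here

class PimplReader
{
public:
    PimplReader();
    ~PimplReader();
    // Copying is disabled so two objects never delete the same pImpl.
    PimplReader(const PimplReader&) = delete;
    PimplReader& operator=(const PimplReader&) = delete;
private:
    PimplReaderImpl* pImpl; // the only data member callers ever see
};

// ---- what would go in the .cpp ----
struct PimplReaderImpl
{
    std::string filename;
    FILE* file = nullptr;
    char buffer[256] = {}; // extra state added freely; callers never notice
};

PimplReader::PimplReader() : pImpl(new PimplReaderImpl{}) {}
PimplReader::~PimplReader() { delete pImpl; }

// The public class stays pointer-sized no matter how large the hidden struct is.
static_assert(sizeof(PimplReader) == sizeof(PimplReaderImpl*),
              "public object holds nothing but the pointer");
static_assert(sizeof(PimplReaderImpl) > sizeof(PimplReader),
              "hidden state is larger than the public object");
```

Adding or removing members of PimplReaderImpl changes only the second size, so code compiled against the header keeps reserving the right amount of memory.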
So, how would the example look?
VERSION 1: Interface: “Reader.h”
#include <string>
struct ReaderImpl; // forward declaration; 'struct' matches the definition in Reader.cpp
class DLLEXPORT Reader
{
public:
    Reader(const std::string& filename);
    ~Reader();
    std::string readLine() const;
    bool isEndOfFile() const;
private:
    ReaderImpl* pImpl; // pointer to the class attrs
};
Implementation: “Reader.cpp”
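#include "Reader.h"
#include <cstdio>
// Here we define the struct that holds the class attributes
struct ReaderImpl
{
    FILE* file;
};
Reader::Reader(const std::string& n)
{
    pImpl = new ReaderImpl{};
    pImpl->file = fopen(n.c_str(), "r"); // nullptr if the file cannot be opened
}
Reader::~Reader()
{
    if (pImpl->file) // fclose(nullptr) is undefined behavior
        fclose(pImpl->file);
    delete pImpl;
}
std::string Reader::readLine() const
{
    char aux[256];
    // fgets returns nullptr at end of file or on error and leaves aux unusable
    if (!pImpl->file || !fgets(aux, sizeof(aux), pImpl->file))
        return {};
    return {aux};
}
bool Reader::isEndOfFile() const
{
    // A file that failed to open is treated as already finished
    return !pImpl->file || feof(pImpl->file) != 0;
}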
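VERSION 2: Interface: “Reader.h” (the only change is the new ‘restart’ declaration; the data members are untouched)
#include <string>
struct ReaderImpl; // forward declaration
class DLLEXPORT Reader
{
public:
    Reader(const std::string& filename);
    ~Reader();
    std::string readLine() const;
    bool isEndOfFile() const;
    void restart(); // new in version 2
private:
    ReaderImpl* pImpl; // pointer to the class attrs
};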
Implementation: “Reader.cpp”
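#include "Reader.h"
#include <cstdio>
struct ReaderImpl
{
    std::string filename; // new attribute for version 2
    FILE* file;
};
Reader::Reader(const std::string& n)
{
    pImpl = new ReaderImpl{};
    pImpl->filename = n;
    pImpl->file = fopen(n.c_str(), "r"); // nullptr if the file cannot be opened
}
Reader::~Reader()
{
    if (pImpl->file) // fclose(nullptr) is undefined behavior
        fclose(pImpl->file);
    delete pImpl;
}
std::string Reader::readLine() const
{
    char aux[256];
    // fgets returns nullptr at end of file or on error and leaves aux unusable
    if (!pImpl->file || !fgets(aux, sizeof(aux), pImpl->file))
        return {};
    return {aux};
}
bool Reader::isEndOfFile() const
{
    // A file that failed to open is treated as already finished
    return !pImpl->file || feof(pImpl->file) != 0;
}
void Reader::restart()
{
    // Close the current handle (if any) and reopen the file from the beginning
    if (pImpl->file)
        fclose(pImpl->file);
    pImpl->file = fopen(pImpl->filename.c_str(), "r");
}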
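If the programmers of reader.dll had used the ‘pimpl idiom’ from the beginning, the new reader.dll would not have affected its consumers at all. The new attribute lives only inside the DLL, and adding a non-virtual method such as ‘restart’ does not change the size or layout of the class, so the new version would have maintained both API and ABI backwards compatibility.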
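To see the second version from the caller's side, here is a self-contained sketch that puts the pimpl-based interface, its implementation, and a small caller in one file. It makes a few assumptions that are not in the original: DLLEXPORT is omitted so the file builds on its own, copying is disabled to avoid deleting pImpl twice, and firstLineTwice is a hypothetical helper that writes its own two-line demo file.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// ---- Reader.h: everything the caller is compiled against ----
struct ReaderImpl;

class Reader
{
public:
    explicit Reader(const std::string& filename);
    ~Reader();
    Reader(const Reader&) = delete;            // assumption: copies are not needed
    Reader& operator=(const Reader&) = delete;
    std::string readLine() const;
    bool isEndOfFile() const;
    void restart();
private:
    ReaderImpl* pImpl;
};

// ---- Reader.cpp: hidden inside the DLL ----
struct ReaderImpl
{
    std::string filename;
    FILE* file;
};

Reader::Reader(const std::string& n)
    : pImpl(new ReaderImpl{n, std::fopen(n.c_str(), "r")}) {}

Reader::~Reader()
{
    if (pImpl->file) std::fclose(pImpl->file);
    delete pImpl;
}

std::string Reader::readLine() const
{
    char aux[256];
    if (!pImpl->file || !std::fgets(aux, sizeof(aux), pImpl->file))
        return std::string();
    return std::string(aux);
}

bool Reader::isEndOfFile() const
{
    return !pImpl->file || std::feof(pImpl->file) != 0;
}

void Reader::restart()
{
    if (pImpl->file) std::fclose(pImpl->file);
    pImpl->file = std::fopen(pImpl->filename.c_str(), "r");
}

// ---- caller code (hypothetical helper) ----
// Writes a two-line file, reads its first line, restarts, and reads it again.
std::string firstLineTwice(const std::string& path)
{
    FILE* out = std::fopen(path.c_str(), "w");
    assert(out != nullptr && "demo file could not be created");
    std::fputs("alpha\nbeta\n", out);
    std::fclose(out);

    Reader reader(path);
    std::string first = reader.readLine();
    reader.restart();                 // back to the beginning of the file
    return first + reader.readLine(); // the first line, read a second time
}
```

The caller section only uses what the header part declares, which is why swapping in a different ReaderImpl would not require rebuilding it.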