C++: Primitive types

A primitive type is a data type whose values are simple in nature, such as numbers, characters, or boolean values. Primitive types serve as the most fundamental building blocks in any programming language and form the basis for more complex data types. The following are the primitive types available in C++:

bool

It is typically stored in one byte (the exact size is left to the implementation), and the values that a variable of this type can represent are true or false. Comparison and logical operations return a value of this type. This type was not available in early C, so many operations that return integer values instead of boolean ones can be used as boolean expressions. In such cases, the compiler treats 0 as false and any nonzero value as true. For example, the following two code excerpts have the same semantics:

int a = 2;
if (a != 0) //a != 0 evaluates to a boolean value. In this case, it evaluates to true
{
  printf("a is different than 0\n");
}

and

int a = 2;
if (a) //a is an "int", but since it is different than 0, the compiler evaluates it as true
{
  printf("a is different than 0\n");
}

char

It is stored internally as a byte and represents a character. When this data type was created, there was no immediate need for international character support in the language, so one byte was completely sufficient to store all the characters needed to write in English. However, as the use of computers evolved, expanded, and became globally available, the need for international character support became evident, leading to the definition of new character encoding standards. When these new standards emerged, a new character data type was required because one byte was insufficient to represent all the symbols used in human languages (Chinese glyphs, for example, number more than 40,000). Despite this, char is still used as the standard character data type, and much legacy code still relies on character strings based on char. Some encodings, such as UTF-8, can store international characters using sequences of char values: UTF-8 stores each Unicode character in a sequence of 1, 2, 3, or 4 bytes.

wchar_t

It is a wide-character type that represents a character but is stored internally using 16 or 32 bits, instead of the 8 bits used by the char type. The number of bits it uses depends on the computer architecture, operating system, and C++ compiler. Typically, Windows uses 16-bit characters, while UNIX systems use 32-bit characters. The encoding for wchar_t is not defined by the standard, leaving the choice to the compiler. Both char and wchar_t can be treated as integer types, allowing arithmetic operations on their values. Initially, wchar_t was a type alias (typedef), but modern compilers treat it as a built-in type by default. However, it can still be handled as a typedef to support legacy code.

These days, wchar_t is the default character type in Windows applications, although programmers can configure their projects to use char instead. wchar_t is the default in Windows because the lower-level Win32 API also uses this type by default. If char is selected, Windows converts any char sequence to a wchar_t sequence using a specified encoding.

short

It is a ‘short integer,’ representing an integer with less precision than a ‘full-blown int.’ The standard only guarantees that short has at least 16 bits and leaves the exact precision to the compiler implementer. In practice, short is generally a signed integer with 16-bit precision, meaning it can represent values between -32,768 and 32,767. unsigned short is the unsigned version of this integer, and with 16-bit precision it represents values between 0 and 65,535.

int

It is the most common integer data type and was originally used to represent a processor ‘word.’ On 16-bit platforms, it used to be a 16-bit precision integer, and on 32-bit platforms, it became a 32-bit precision number. This ‘rule’ was broken when 64-bit hardware became available, but the int data type still retained 32-bit precision. This means it can store numbers between −2,147,483,648 and 2,147,483,647, or between 0 and 4,294,967,295 when using the unsigned int version.

long and unsigned long

They represent ‘long integer numbers,’ and their precision depends on the compiler and the OS. On 16-bit OSes, they used to represent 32-bit precision integers. On 32-bit hardware, they also represent 32-bit precision, and on 64-bit OSes, they have 32-bit precision on Windows and 64-bit precision on UNIX systems.

long long and unsigned long long

They represent integers with at least 64 bits of precision (exactly 64 bits on common platforms) and have been part of the standard since C++11.

float

It represents single-precision floating-point numbers. On most platforms it is stored in 32 bits (as defined by IEEE 754-2008) and can represent positive normalized values approximately between 1.18 × 10^-38 and 3.4 × 10^38, as well as their negatives and zero, with around 6 to 7 significant digits of precision.

double

It represents double-precision floating-point numbers. On most platforms it is stored in 64 bits and can represent positive normalized values approximately between 2.225 × 10^-308 and 1.798 × 10^308, as well as their negatives and zero, with about 15 to 16 significant digits of precision.

C99 exact-width integer types

C99 also introduced a set of exact-width integer types that represent signed and unsigned integers with precisions of 8, 16, 32, and 64 bits, independent of the compiler, OS, or processor architecture. They are:

  • 8-bit precision: int8_t and uint8_t
  • 16-bit precision: int16_t and uint16_t
  • 32-bit precision: int32_t and uint32_t
  • 64-bit precision: int64_t and uint64_t

These exact-width integer types are not built-in types; they are simply aliases (typedefs) of the primitive types described above. They are widely supported by modern compilers, including GCC, Clang, and MSVC.

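Newer C++ standards introduced stricter character types as well:

  • 8-bit char: char8_t (added in C++20)
  • 16-bit char: char16_t (added in C++11)
  • 32-bit char: char32_t (added in C++11)

Unlike the integer aliases, these character types are built-in keywords and require no header. The exact-width integer types are declared in the following header: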

#include <cstdint>

void

Though not a data type in the usual sense (no variable of type void can be declared), void represents:

  • The absence of parameters in a function when used as its parameter list.
  • The absence of a return value in a function.
  • When used with pointers, it represents a pointer to a memory address without any information about the data type at that address.

std::nullptr_t

Introduced in C++11, std::nullptr_t is the type of the null pointer literal nullptr. It is declared in the <cstddef> header and allows for better type safety than the older null pointer constants 0 and NULL. I wrote more about it in this post: nullptr.

4 thoughts on “C++: Primitive types”

  1. While “long long” is a de facto standard, until C++11 it was just an extension (extremely popular one, yes :). It is now finally an official part of the Standard, together with exact-width integer types (well, right, they are optional part of <cstdint>, but still…).

    1. Generally it occupies 1 byte. I know that you could perfectly fit a bool value in a bit, but the smallest memory unit to access is a byte.

      Anyway, if you need to work with sets of bits or you need to allocate them dynamically, use “sizeof(bool)” to get sure.
