C++: Primitive types

A primitive type is a data type whose values have a very simple nature (a number, a character or a truth value). Primitive types are the most basic building blocks of a programming language and the basis for more complex data types. These are the primitive types available in C++:

bool

It is typically stored in one byte (the exact size, sizeof(bool), is implementation-defined), and a variable of this type can hold the values true or false. Comparison and logical operations return a value of this type. Early C had no boolean type, so integer values can also be used wherever a boolean expression is expected: the compiler treats 0 as false and any other value as true. For example, the following two code excerpts have the same semantics:

int a = 2;
if (a != 0) //a != 0 evaluates to a boolean value. In this case, it evaluates to true
  printf("a is different than 0\n");

and

int a = 2;
if (a) //a is an "int", but since it is different than 0, the compiler evaluates it as true
  printf("a is different than 0\n");
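The same implicit conversion can be packaged in a small function. This is only a sketch: the helper names `is_nonzero` and `bool_size` are made up for illustration and are not part of any library.

```cpp
#include <cstddef>

// Hypothetical helper: relies on the implicit int -> bool conversion
// (0 becomes false, any other value becomes true).
bool is_nonzero(int value) {
    return value;
}

// Hypothetical helper: sizeof(bool) is implementation-defined, so query it
// instead of assuming it is 1.
std::size_t bool_size() {
    return sizeof(bool);
}
```

With these definitions, `is_nonzero(2)` and `is_nonzero(-5)` are both true, while `is_nonzero(0)` is false.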

char

It is stored in one byte and represents a character. When this data type was created, there was no immediate need for international character support in the language, so one byte was enough to store every character needed to write English text. As computers spread worldwide, support for international characters became necessary and new character encoding standards were defined. These standards called for a new character data type, because one byte cannot represent all the symbols used in all human languages (there are more than 40000 Chinese glyphs, for example). Despite this, char is still the standard character data type and a lot of legacy code uses char-based strings. Some encodings store international characters as sequences of char values; UTF-8, for example, stores each Unicode character in 1, 2, 3 or 4 char values.
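To make the "1, 2, 3 or 4 char values" rule concrete, here is a minimal sketch of a UTF-8 encoder. The function name `encode_utf8` is hypothetical, it assumes a C++11 compiler (for char32_t), and it skips validation that a production encoder would need, such as rejecting surrogate code points.

```cpp
#include <string>

// Hypothetical helper: encodes one Unicode code point as a UTF-8 byte sequence.
// Returns an empty string for values above U+10FFFF.
std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {           // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For instance, `encode_utf8(0x41)` (the letter A) yields a single char, while `encode_utf8(0x20AC)` (the euro sign) yields the three bytes E2 82 AC.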

wchar_t

It is a wide-character type: it represents a character but is stored using 16 or 32 bits instead of the 8 bits of char. The number of bits depends on the computer architecture, the operating system and the C++ compiler being used. Commonly, Windows uses 16-bit wide characters and UNIX-like systems use 32-bit ones. The standard does not define the encoding used to represent a wchar_t character; that decision was deliberately left to the compiler. Both char and wchar_t can be treated as integer data types, so the programmer can perform arithmetic on values of these types. Originally, wchar_t was not a built-in data type but just a type alias (typedef); current compilers treat it as a built-in type by default, but some let the user ask for the typedef behavior instead (this is needed to support legacy code).
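The following sketch queries the width of wchar_t on the current compiler and performs arithmetic on character values. The helper names are invented for this example, and the "next letter" behavior assumes an ASCII-compatible character set.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper: number of bits wchar_t occupies on this platform
// (commonly 16 on Windows and 32 on UNIX-like systems).
std::size_t wchar_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// Characters are integers underneath, so adding 1 gives the next code.
char next_char(char c) {
    return static_cast<char>(c + 1);
}

wchar_t next_wchar(wchar_t c) {
    return static_cast<wchar_t>(c + 1);
}
```

Under the ASCII assumption, `next_char('a')` is `'b'` and `next_wchar(L'a')` is `L'b'`.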

short

It is a “short integer”: an integer type that may have a smaller range than a “full-blown” int. short is generally a signed 16-bit integer (it can represent values between -32768 and 32767), but the exact width was left to the compiler implementor; the standard only requires at least 16 bits. unsigned short is the unsigned version; with 16 bits it represents values between 0 and 65535.
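Because the width is up to the compiler, the actual limits are best queried rather than hard-coded. The sketch below uses std::numeric_limits from the standard library; the wrapper function names are made up for illustration.

```cpp
#include <limits>

// Smallest and largest values a short can hold on this compiler.
short short_lowest() {
    return std::numeric_limits<short>::min();
}

short short_highest() {
    return std::numeric_limits<short>::max();
}

// Unsigned arithmetic wraps around: incrementing the maximum value yields 0.
unsigned short next_unsigned_short(unsigned short v) {
    return static_cast<unsigned short>(v + 1);
}
```

On a compiler with a 16-bit short, `short_highest()` is 32767 and `next_unsigned_short(65535)` wraps around to 0.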

int

It is the most common integer data type and was meant to represent a processor “word”: on 16-bit platforms it used to be a 16-bit integer, and on 32-bit platforms it is a 32-bit integer. This “rule” was broken when 64-bit hardware became available, because int typically still has 32 bits there; that means it can store numbers between −2147483648 and 2147483647, or between 0 and 4294967295 when using unsigned int instead.
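The ranges quoted for 16-bit and 32-bit integers follow directly from the bit width. As a sketch, assuming two's-complement representation (required since C++20 and near-universal before that) and a width between 2 and 63 bits, they can be computed like this; the function names are hypothetical.

```cpp
// Smallest value of a signed integer with `bits` bits: -2^(bits-1).
long long signed_min(int bits) {
    return -(1LL << (bits - 1));
}

// Largest value of a signed integer with `bits` bits: 2^(bits-1) - 1.
long long signed_max(int bits) {
    return (1LL << (bits - 1)) - 1;
}

// Largest value of an unsigned integer with `bits` bits: 2^bits - 1.
unsigned long long unsigned_max(int bits) {
    return (1ULL << bits) - 1;
}
```

`signed_max(32)` gives 2147483647 and `unsigned_max(32)` gives 4294967295, matching the int and unsigned int limits above.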

long and unsigned long

They represent “long integer numbers”, and their width depends on the compiler and the OS. On 16-bit and 32-bit systems they are 32-bit integers; on 64-bit systems they have 32 bits on Windows and 64 bits on UNIX-like systems.
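These size combinations are often referred to by the names ILP32, LP64 and LLP64, which do not appear in the text above and are used here only as labels. The sketch below reports which one the current compiler follows; the function name `data_model` is hypothetical and the checks assume 8-bit bytes.

```cpp
#include <string>

// Hypothetical helper: names the size combination of int, long and pointers.
std::string data_model() {
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void*) == 4) {
        return "ILP32";  // typical 32-bit systems
    }
    if (sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void*) == 8) {
        return "LP64";   // typical 64-bit UNIX-like systems
    }
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void*) == 8) {
        return "LLP64";  // 64-bit Windows
    }
    return "other";
}
```

Code that must behave the same under all of these models should avoid assuming a particular width for long.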

long long and unsigned long long

They represent integers that are at least 64 bits wide; in practice they are 64-bit integers on current platforms.
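A typical use is holding results that do not fit in 32 bits. The following sketch assumes a C++11 compiler (for static_assert); the function name `wide_product` is made up for illustration.

```cpp
#include <climits>

// Compile-time check that long long provides at least 64 bits.
static_assert(sizeof(long long) * CHAR_BIT >= 64,
              "long long must be at least 64 bits wide");

// Hypothetical helper: multiplies two ints without overflowing a 32-bit int,
// by widening one operand to long long before the multiplication.
long long wide_product(int a, int b) {
    return static_cast<long long>(a) * b;
}
```

For example, `wide_product(100000, 100000)` returns 10000000000, a value too large for a 32-bit int.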

float

It represents a single-precision floating-point number. It is stored in 32 bits (the binary32 format defined in IEEE 754-2008) and can represent magnitudes between about 1.18 × 10^−38 and 3.4 × 10^38, with around 7 significant decimal digits of precision.
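The limited precision is easy to observe with whole numbers. This sketch assumes float is the IEEE 754 32-bit format, which has a 24-bit significand; the helper name `survives_float` is hypothetical.

```cpp
// Hypothetical helper: true if n comes back unchanged after being stored
// in a float. The volatile store forces rounding to float precision.
bool survives_float(int n) {
    volatile float f = static_cast<float>(n);
    return static_cast<long long>(f) == n;
}
```

Under that assumption, `survives_float(16777216)` (2^24) is true, but `survives_float(16777217)` is false because 2^24 + 1 needs 25 significant bits.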

double

It represents a double-precision floating-point number. It is stored in 64 bits and can represent magnitudes between 2.2250738585072014 × 10^−308 and 1.7976931348623157 × 10^308, with approximately 16 significant decimal digits of precision.
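The limits of float and double can be read from std::numeric_limits instead of being memorized. The struct `FloatingRange` and the function `range_of` below are hypothetical names used only for this sketch.

```cpp
#include <limits>

// Hypothetical summary of a floating-point type's limits.
struct FloatingRange {
    double smallest_normal;  // smallest positive normalized value
    double largest;          // largest finite value
    int guaranteed_digits;   // decimal digits that always survive a round trip
};

template <typename T>
FloatingRange range_of() {
    FloatingRange r;
    r.smallest_normal = std::numeric_limits<T>::min();
    r.largest = std::numeric_limits<T>::max();
    r.guaranteed_digits = std::numeric_limits<T>::digits10;
    return r;
}
```

Calling `range_of<float>()` and `range_of<double>()` exposes the two ranges side by side. Note that digits10 reports the digits that are guaranteed to survive a round trip (6 and 15 on IEEE 754 platforms), which is slightly lower than the approximate figures of 7 and 16.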

C99 exact-width integer types

C99 also introduced a set of exact-width integer types that represent signed and unsigned integers with exactly 8, 16, 32 or 64 bits, independently of the compiler, OS or processor architecture. They are:

  • 8-bit precision: int8_t and uint8_t
  • 16-bit precision: int16_t and uint16_t
  • 32-bit precision: int32_t and uint32_t
  • 64-bit precision: int64_t and uint64_t

These exact-width integer types are not built-in types; they are just aliases (typedefs) of the primitive types described above. Not all compilers support them yet (for example, Microsoft only added stdint.h, the library header that declares them, in Visual Studio 2010).
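As a sketch of how these aliases behave, the code below checks their widths and shows 8-bit wraparound. It assumes a C++11 compiler that provides the <cstdint> header; the helper names `bit_width_of` and `add_u8` are made up for illustration.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical helper: number of bits occupied by type T.
template <typename T>
int bit_width_of() {
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// Hypothetical helper: adds two 8-bit unsigned values; the result is reduced
// modulo 256 when converted back to std::uint8_t.
std::uint8_t add_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}
```

Here `bit_width_of<std::int32_t>()` is 32 regardless of the platform, and `add_u8(200, 100)` returns 44 (300 modulo 256).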

4 thoughts on “C++: Primitive types”

  1. While “long long” is a de facto standard, until C++11 it was just an extension (extremely popular one, yes :). It is now finally an official part of the Standard, together with exact-width integer types (well, right, they are optional part of <cstdint>, but still…).

    1. Generally it occupies 1 byte. I know that you could perfectly fit a bool value in a bit, but the smallest memory unit to access is a byte.

      Anyway, if you need to work with sets of bits or you need to allocate them dynamically, use “sizeof(bool)” to be sure.
