Why must a short be converted to an int before arithmetic operations in C and C++?

Posted by: admin November 30, 2017

Questions:

From the answers I got from this question, it appears that C++ inherited this requirement for conversion of short into int when performing arithmetic operations from C. May I pick your brains as to why this was introduced in C in the first place? Why not just do these operations as short?

For example (taken from dyp’s suggestion in the comments):

short s = 1, t = 2 ;
auto  x = s + t ;

x will have type int.
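
The claim can be checked at compile time. This is a minimal sketch that reuses s, t and x from the snippet above; the static_assert checks are an addition for illustration and assume a C++11 or later compiler.

```cpp
#include <type_traits>

short s = 1, t = 2;
auto  x = s + t;  // both operands are promoted to int before the addition

// decltype inspects the type of an expression without evaluating it.
static_assert(std::is_same<decltype(s + t), int>::value,
              "short + short yields int");
static_assert(std::is_same<decltype(x), int>::value,
              "auto deduces int, not short");
```

If either assertion were false, the translation unit would fail to compile.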

Answers:

If we look at the Rationale for International Standard—Programming Languages—C in section 6.3.1.8 Usual arithmetic conversions it says (emphasis mine going forward):

The rules in the Standard for these conversions are slight
modifications of those in K&R: the modifications accommodate the added
types and the value preserving rules. Explicit license was added to
perform calculations in a “wider” type than absolutely necessary,
since this can sometimes produce smaller and faster code, not to
mention the correct answer more often. Calculations can also be
performed in a “narrower” type by the as if rule so long as the same
end result is obtained. Explicit casting can always be used to obtain
a value in a desired type.
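
To illustrate the "correct answer more often" remark, here is a small sketch. The helper name midpoint is hypothetical and not from the Rationale, and the static_assert spells out the assumption that int is wide enough to hold the sum of any two short values.

```cpp
#include <climits>

// Assumption for this sketch: int can hold the sum of any two shorts.
static_assert(INT_MAX / 2 >= SHRT_MAX && INT_MIN / 2 <= SHRT_MIN,
              "int must be able to hold short + short");

// midpoint is a hypothetical helper used only for illustration.
short midpoint(short a, short b) {
    // a + b is computed in int, so 30000 + 30000 == 60000 is representable.
    // Halving brings the value back into short's range before narrowing.
    return static_cast<short>((a + b) / 2);
}
```

With a 16-bit short, the intermediate value 60000 would not fit if the addition were carried out in short, yet midpoint(30000, 30000) still gives 30000 because the sum is formed in int.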

Section 6.3.1.8 of the draft C99 standard covers the usual arithmetic conversions, which are applied to the operands of arithmetic expressions. For example, section 6.5.6 Additive operators says:

If both operands have arithmetic type, the usual arithmetic
conversions are performed on them.

We find similar text in section 6.5.5 Multiplicative operators as well. For a short operand, the integer promotions from section 6.3.1.1 Boolean, characters, and integers are applied first; that section says:

If an int can represent all values of the original type, the value is
converted to an int; otherwise, it is converted to an unsigned int.
These are called the integer promotions. All other types are
unchanged by the integer promotions.
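
C++ has equivalent promotion rules, so the effect can be sketched with type traits. The alias promoted_t is a hypothetical name introduced here; it relies on the fact that unary plus applies the integer promotions to its operand. The unsigned short case is written to hold whether or not short is narrower than int.

```cpp
#include <climits>
#include <type_traits>
#include <utility>

// promoted_t is a hypothetical alias for this sketch. Unary plus applies the
// integer promotions, so decltype(+value) names the promoted type.
template <typename T>
using promoted_t = decltype(+std::declval<T>());

// int can always represent every value of these types, so they become int.
static_assert(std::is_same<promoted_t<short>, int>::value, "short -> int");
static_assert(std::is_same<promoted_t<signed char>, int>::value, "signed char -> int");
static_assert(std::is_same<promoted_t<bool>, int>::value, "bool -> int");

// unsigned short becomes int only when int can hold all of its values.
using expected_ushort =
    std::conditional<(USHRT_MAX <= INT_MAX), int, unsigned int>::type;
static_assert(std::is_same<promoted_t<unsigned short>, expected_ushort>::value,
              "unsigned short -> int or unsigned int");

// Types at least as wide as int are left unchanged.
static_assert(std::is_same<promoted_t<unsigned int>, unsigned int>::value, "unchanged");
static_assert(std::is_same<promoted_t<long>, long>::value, "unchanged");
```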

The discussion of integer promotions in section 6.3.1.1 of the same Rationale is actually more interesting. I am going to quote selectively because it is too long to quote in full:

Implementations fell into two major camps which may be characterized
as unsigned preserving and value preserving.

[…]

The unsigned preserving approach calls for promoting the two smaller
unsigned types to unsigned int. This is a simple rule, and yields a
type which is independent of execution environment.

The value preserving approach calls for promoting those types to
signed int if that type can properly represent all the values of the
original type, and otherwise for promoting those types to unsigned
int. Thus, if the execution environment represents short as something
smaller than int, unsigned short becomes int; otherwise it becomes
unsigned int.

This can have some rather unexpected results in some cases, as the question “Inconsistent behaviour of implicit conversion between unsigned and bigger signed types” demonstrates, and there are plenty more examples like that. In most cases, though, it results in the operations working as expected.
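
One possible illustration of such a surprise follows. It is my own sketch, not the example from the linked question, and the helper names are hypothetical. It assumes long long can represent every unsigned int value, which the static_assert checks.

```cpp
#include <climits>

// Assumption for this sketch: long long can hold every unsigned int value.
static_assert(LLONG_MAX >= UINT_MAX, "long long must be wider than unsigned int");

// Hypothetical helpers for illustration only.

// The operands are promoted before the subtraction. Where int can represent
// every unsigned short value, the arithmetic is signed and yields -1.
long long ushort_difference() {
    unsigned short a = 1, b = 2;
    return a - b;
}

// unsigned int is not promoted, so the subtraction wraps around to UINT_MAX.
long long uint_difference() {
    unsigned int a = 1, b = 2;
    return a - b;
}
```

On a platform where short is narrower than int, the two functions return -1 and UINT_MAX respectively, even though both subtract "unsigned" values. Where short and int have the same width, unsigned short promotes to unsigned int and both wrap around.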

Answers:

It’s not a feature of the language so much as a limitation of the physical processor architectures on which the code runs. The int type in C is usually the size of your standard CPU register. More silicon takes up more space and more power, so in many cases arithmetic can only be done on the “natural size” data types. This is not universally true, but most architectures still have this limitation. In other words, when adding two 8-bit numbers, what actually goes on in the processor is some type of 32-bit arithmetic followed by either a simple bit mask or another appropriate type conversion.
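
The "wide arithmetic followed by a mask" description can be sketched at the language level. The helper names are hypothetical, and the sketch assumes 8-bit bytes, which the static_assert checks. It shows what the language rules require, not what any particular CPU does.

```cpp
#include <climits>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Hypothetical helpers. In both, a + b is evaluated as an int addition.

int add_keep_wide(unsigned char a, unsigned char b) {
    return a + b;  // 200 + 100 stays 300
}

unsigned char add_then_narrow(unsigned char a, unsigned char b) {
    // Converting back to unsigned char keeps only the low 8 bits,
    // which is the same as masking the int result with 0xFF.
    return static_cast<unsigned char>(a + b);
}
```

For the inputs 200 and 100, the wide result is 300, while the narrowed result is 300 & 0xFF, which is 44.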

Answers:

float, short and char types are treated by the standard as a sort of “storage types”, i.e. sub-ranges that you can use to save some space but that are not going to buy you any speed, because their size is “unnatural” for the CPU.

On certain CPUs this is not true, but good compilers are smart enough to notice that if you, for example, add a constant to an unsigned char and store the result back in an unsigned char, there is no need to go through the unsigned char -> int conversion.
For example, with g++ the code generated for the inner loop of

void incbuf(unsigned char *buf, int size) {
    for (int i=0; i<size; i++) {
        buf[i] = buf[i] + 1;
    }
}

is just

.L3:
    addb    $1, (%rdi,%rax)
    addq    $1, %rax
    cmpl    %eax, %esi
    jg  .L3
.L1:

where you can see that an unsigned char addition instruction (addb) is used.

The same happens if you’re doing your computations between short ints and storing the result in short ints.

Answers:

The linked question seems to cover it pretty well: the CPU just doesn’t do arithmetic at that size. A 32-bit CPU has its native arithmetic operations set up for 32-bit registers. The processor prefers to work in its favorite size, and for operations like this, copying a small value into a native-size register is cheap. (On the x86 architecture, the 32-bit registers are named as if they were extended versions of the 16-bit registers: eax for ax, ebx for bx, and so on; see x86 integer instructions.)

For some extremely common operations, particularly vector/float arithmetic, there may be specialized instructions that operate on a different register type or size. For something like a short, widening by (up to) 16 bits (sign-extending a signed value, zero-filling an unsigned one) has very little performance cost, and adding specialized instructions is probably not worth the time or space on the die (if you want to get really physical about why; I’m not sure they would take actual space, but it does get way more complex).
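
As a rough sketch of what that widening does to the value, the hypothetical helpers below convert a short and an unsigned short to int. The sketch assumes int is wider than short, which the static_assert checks. Either way the value is preserved. At the bit level a negative short is typically widened by copying its sign bit into the upper bits, while an unsigned short gets zeros there.

```cpp
#include <climits>

static_assert(INT_MAX > USHRT_MAX, "this sketch assumes int is wider than short");

// Hypothetical helpers for illustration only.

// Widening a signed short keeps its value, including negative values.
int widen_signed(short v) {
    return v;
}

// Widening an unsigned short also keeps its value; the result is never negative.
int widen_unsigned(unsigned short v) {
    return v;
}
```

For example, widen_signed(-2) is still -2, and widen_unsigned(65534) is 65534 rather than a negative number.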