Bitwise operations are performed on ints

bits We had an interesting situation recently, where someone (not from our team) used the following piece of code to set to zero the first bit a byte value:

[code language=”cpp”]unsigned char func(unsigned char input)
{
unsigned char result = input << 1 >> 1;
return result;
}[/code]

This won’t work. Why? It’s simple. All bitwise operations performed on values smaller than an integer are by default performed on integers, 32bits. This may sound silly, but actually it isn’t.

Current processors are usually optimized for 32bit operations, even the 64bit ones. We might be tempted to say that 32bits are enough for everyone. Of course, they are not enough, but for most needs, 32 bit integers will do. So it makes perfect sense to optimize most operations for 4 bytes integers.

What happens if you only try to use one byte? Well, most compilers will probably optimize your code by hiding your byte inside an int, and align it properly at 4-bytes boundary, because fetching from memory is a lot easier if the memory access is aligned to 4, 8 or even 16 bytes boundary. This is why when using the sizeof operator on the following structure:

[code language=”cpp”]struct {
unsigned char a;
int b;
} y;
[/code]

the result will be 8 instead of 5, as one might expect.

As a cautionary tale, for performance reasons, bitwise operations are usually performed on 32 bit integers if not specified otherwise. So the code that we showed initially will fail, returning the same value it received as a parameter. Instead, a normal implementation of the code would be:

[code language=”cpp”]unsigned char func(unsigned char input)
{
unsigned char result = input & 0x7F;
return result;
}[/code]

Why am I telling you all these? Well, I think it’s an important thing to have in mind when doing bitwise operations. From my experience this happens under C, C++ and C#, I suspect Java would behave in a similar way; it makes sense for the managed languages to copy the C/C++ behavior, as they probably translate the code in one simple machine-code statement, instead of implementing more fancy restrictions. Also, I’m probably telling you this because it’s well known that chicks dig bitwise operators.

The picture is from this quite insightful article about bits and bytes.

Comments

Bitwise operations are performed on ints — 3 Comments

  1. why would somebody play with the bits of a char in the first place? a char is a char … a byte is a byte …

    • unsigned char is 1 byte. Unless you have a favorite language that has ‘unsigned char’ as type, and has this unsigned char on two bytes.