I came across something peculiar that I've been puzzling over. I played around with the source a bit, trying to find exactly what conditions generate this kind of code. I'm not entirely sure why, but with code something like this:
typedef unsigned int uint;
uint T(int a, int b)
{
int d;
int r;
// Interesting assembly code generated by:
d = b - a; // Difference
r = d * 8; // Multiply by power of 2
return r; // Convert to unsigned
}
You get assembly code that looks like this:
// Note: Turn on "Optimize for Speed"
?T@@YAIHH@Z PROC NEAR ; T, COMDAT
; Line 22
mov ecx, DWORD PTR _a$[esp-4]
mov eax, ecx; a
shl eax, 29 ; ? ; 0000001dH
sub eax, ecx; ? "-a"
mov ecx, DWORD PTR _b$[esp-4]
add eax, ecx; d = b - a [(-a) + b]
shl eax, 3 ; r = d * 8
; Line 23
ret 0
?T@@YAIHH@Z ENDP
I don't really understand why there is a SHL EAX, 29 instruction. Note that we are multiplying by 8, which accounts for the shift by 3 near the end (8 = 2^3). This code is only generated when multiplying by a power of 2, and the two shift constants always add up to 32. Further, it only seems to be generated when all three conditions marked in the source comments hold.
So, why is SHL EAX, 29 particularly odd here? Well, it fills the lower 29 bits with 0s, and the old lower 3 bits become the new upper 3 bits. Now, the SUB and ADD instructions that follow work in such a way that the lower bits can affect the upper bits, but not the other way around (carries and borrows only propagate upward). That is, those new upper 3 bits can only affect the upper 3 bits of the result. But then you have the SHL EAX, 3, which shifts those 3 bits out and discards them.
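One way to check that reasoning is to mimic the listing in C. This is only a sketch: uint32_t stands in for EAX so that wraparound is well defined, and the function names (direct, speed_listing, neg_listing) are made up for illustration. The third function replaces the SHL EAX, 29 with a plain zero, which is the same as computing -a directly.

```c
#include <stdint.h>

/* What the source asks for, in unsigned 32-bit arithmetic: (b - a) * 8. */
uint32_t direct(uint32_t a, uint32_t b)
{
    return (b - a) * 8u;
}

/* Mirrors the "Optimize for Speed" listing one instruction at a time. */
uint32_t speed_listing(uint32_t a, uint32_t b)
{
    uint32_t eax = a;   /* mov eax, ecx                     */
    eax <<= 29;         /* shl eax, 29                      */
    eax -= a;           /* sub eax, ecx                     */
    eax += b;           /* add eax, ecx  (ecx now holds b)  */
    eax <<= 3;          /* shl eax, 3                       */
    return eax;
}

/* Same sequence, but starting from 0 instead of a << 29. */
uint32_t neg_listing(uint32_t a, uint32_t b)
{
    uint32_t eax = 0;   /* pretend the shl eax, 29 left nothing behind */
    eax -= a;           /* 0 - a, i.e. -a                              */
    eax += b;
    eax <<= 3;
    return eax;
}
```

All three should agree for every input, since the only bits that differ before the final shift are the top 3, and those are shifted out.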
The only possible reason that I can guess at for the SHL EAX, 29 is to affect the flags register. I'm thinking maybe it does something special to the carry or overflow flags. Of course, these flags aren't used anywhere here, so why bother generating special (and extra) code to affect them? Plus, the standard says that unsigned arithmetic simply wraps, so the upper bits of a result that overflows get discarded (signed overflow is undefined, so nothing there requires the flags either). The only way I can think of to use the carry or overflow flags might be with relational operators like <, <=, >, >=. (Whether carry or overflow is consulted depends on whether the type is unsigned or signed.)
If you instead optimize for size, you get:
// Note: Turn on "Optimize for Size"
?T@@YAIHH@Z PROC NEAR ; T, COMDAT
; Line 22
mov eax, DWORD PTR _a$[esp-4]
imul eax, 536870911 ; 1fffffffH
add eax, DWORD PTR _b$[esp-4]
shl eax, 3
; Line 23
ret 0
?T@@YAIHH@Z ENDP ; T
I'm not too sure where that IMUL comes from.
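One thing that can at least be checked: 536870911 is 1fffffffH, which is 2^29 - 1, so multiplying by it gives the same 32-bit value as the SHL EAX, 29 followed by SUB in the speed-optimized listing. Here is a small C sketch of that, again using uint32_t to model the register; the function names are hypothetical.

```c
#include <stdint.h>

/* The shl eax, 29 / sub eax, ecx pair from the speed listing. */
uint32_t shift_then_sub(uint32_t a)
{
    return (a << 29) - a;       /* a * 2^29 - a, modulo 2^32 */
}

/* The single imul from the size listing. */
uint32_t imul_constant(uint32_t a)
{
    return a * 536870911u;      /* a * (2^29 - 1), modulo 2^32 */
}

/* Mirrors the "Optimize for Size" listing one instruction at a time. */
uint32_t size_listing(uint32_t a, uint32_t b)
{
    uint32_t eax = a;           /* mov eax, a          */
    eax *= 536870911u;          /* imul eax, 536870911 */
    eax += b;                   /* add eax, b          */
    eax <<= 3;                  /* shl eax, 3          */
    return eax;
}
```

So the two listings appear to be doing the same arithmetic, just with the multiply by 2^29 - 1 written two different ways. That still doesn't explain why the compiler picks this form over a plain SUB.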
Another oddity is that if the return type is changed from uint to int, then it simply generates:
?T@@YAHHH@Z PROC NEAR ; T, COMDAT
; Line 22
mov eax, DWORD PTR _b$[esp-4]
mov ecx, DWORD PTR _a$[esp-4]
sub eax, ecx
shl eax, 3
; Line 23
ret 0
?T@@YAHHH@Z ENDP ; T
You also get this same (simple) generated code by eliminating the local variables, even under different optimization settings, such as Speed, Size, or Disabled (Debug). The Disabled/Debug version does generate a few extra instructions for stack maintenance, but they don't affect the calculation at all.
That is, changing the code to:
uint T(int a, int b)
{
return (b-a) * 8;
}
produces the same results as changing the return type to int did.