From time to time, I hear it from low-level coder monkeys posing as either
tech gurus or teachers that using the shift operators (<<
and >>
in C
syntax) instead of multiplication and division in cases when the
factor/divisor is an integer power of 2 results in faster code. While
I've always been skeptical about such speculations – and I've been reassured
several times by many sources, including MSDN forums and
Stack Overflow – I haven't tried it for myself, especially in managed
languages such as C# that are compiled to a byte code first.
Although Mono is not the reference implementation of C# and the .Net virtual machine, it's the one that runs on my notebook and allows for easy static compilation which makes it possible for me to inspect the machine code generated from the .Net executable file. First, I wrote a simple program that reads a byte from the standard input, divides it by 2, and writes the result to the standard output (mainly to avoid optimization that would replace division with compile-time evaluation).
using System;
class Program
{
static void Main()
{
int a = Console.Read();
int b = a / 2;
Console.WriteLine(b);
}
}
I compiled it with the Mono C# compiler and verified that it works (T = 84 in ASCII).
$ mcs monodiv.cs
$ file monodiv.exe
monodiv.exe: PE32 executable (console) Intel 80386 Mono/.Net assembly, for MS Windows
$ echo T | ./monodiv.exe
42
Dumping the .Net bytecode reveals that the first pass of compilation uses division.
$ monodis monodiv.exe
...
.method private static hidebysig
default void Main () cil managed
{
// Method begins at RVA 0x2058
.entrypoint
// Code size 17 (0x11)
.maxstack 2
.locals init (
int32 V_0,
int32 V_1)
IL_0000: call int32 class [mscorlib]System.Console::Read()
IL_0005: stloc.0
IL_0006: ldloc.0
IL_0007: ldc.i4.2
IL_0008: div
IL_0009: stloc.1
IL_000a: ldloc.1
IL_000b: call void class [mscorlib]System.Console::WriteLine(int32)
IL_0010: ret
} // end of method Program::Main
Finally, transforming the bytecode into machine code assures us again that
premature optimization is the root of all evil, as the code executed by the
CPU at runtime contains the good old shift right (shr
) opcode.
$ mono --aot=full monodiv.exe
Mono Ahead of Time compiler - compiling assembly /home/dnet/_projekt/monodiv/monodiv.exe
Code: 38 Info: 4 Ex Info: 6 Unwind Info: 9 Class Info: 30 PLT: 3 GOT Info: 14 GOT: 48 Offsets: 47
Compiled 2 out of 2 methods (100%)
Methods without GOT slots: 2 (100%)
Direct calls: 0 (100%)
JIT time: 0 ms, Generation time: 0 ms, Assembly+Link time: 0 ms.
$ file monodiv.exe.so
monodiv.exe.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
$ objdump -d monodiv.exe.so
monodiv.exe.so: file format elf64-x86-64
Disassembly of section .text:
...
0000000000001020 <Program_Main>:
1020: 48 83 ec 08 sub $0x8,%rsp
1024: e8 17 00 00 00 callq 1040 <plt_System_Console_Read>
1029: 48 8b f8 mov %rax,%rdi
102c: c1 ef 1f shr $0x1f,%edi
102f: 03 f8 add %eax,%edi
1031: d1 ff sar %edi
1033: e8 12 00 00 00 callq 104a <plt_System_Console_WriteLine_int>
1038: 48 83 c4 08 add $0x8,%rsp
103c: c3 retq
103d: 00 00 add %al,(%rax)
...