Thanks for posting.
I liked the reference to Douglas Crockford's JavaScript: The Good Parts and his follow-up jab at JavaScript.
Good insights into treating your code as a manuscript that is meant to be read by others. I agreed with the curly brace comments for separating the function arguments from the body of the function.
I really appreciated the parts about not justifying argument lists or comments at some arbitrary indentation like:
int FairlyLongFunctionName(int argument1,
                           int argument2);
It always creates a nightmare when refactoring a codebase by changing the function name size.
Or not separating the list on different sides of the screen (which I am guilty of fairly often):
int ReallyLongFunctionNameThatShouldBeShortened(int argument1,
                                                int argument2);
I used to be really good about trying not to spill past 80 chars on a line. In fact, when I was a newer programmer, I used to print my code out and edit it on paper with a pen. Then I'd go in and make the changes. I found myself constantly fiddling with how to break the code across multiple lines when long functions were encountered. It also made for annoying refactors when function names or the number of arguments changed, which is part of the reason I stopped doing it.
I still need to work on my unit test writing as well, which the video reminded me of.
-Brett
Without having watched the presentation yet (now I really want to get into it), the last example you have with the really long function name, at least for me, indicates that the function does too much and needs to be refactored.
As for having too many parameters, except in rare cases (like in NAS2D::Renderer's drawing functions which make sense to have a lot of parameters for drawing functions), generally I will break off many of those parameters into a struct or public class and take a reference to that instead. E.g.:
void someContrivedFunction(int with, int lots, const std::string& of, const std::string& parameters, float that, float are, float better, char* off, char* in, char* a, char* structure)
{}
would become:
/**
* This can now be stuffed into a header somewhere so
* it's out of the way. Also, the struct can be better
* documented versus having a giant documentation block
* for the function parameter list.
*/
struct StructOfRelatedData
{
    int Int1;
    int Int2;
    const std::string String1;
    const std::string String2;
    float Float1;
    float Float2;
    float Float3;
    char* Buffer1;
    char* Buffer2;
    char* Buffer3;
    char* Buffer4;
};
/**
* Much cleaner interface. Also, less likely to run into cache misses
* or whatever other issues come about with functions that have huge
* parameter lists.
*/
void someContrivedFunction(StructOfRelatedData& _s)
{}
Again, this code is very much contrived and assumes that the data moved into a struct is used in multiple places.
Hell, even the NAS2D::Renderer draw functions I mentioned could be better served by a different design paradigm (e.g., having a separate function call that sets drawing colors versus putting the color information in the function parameter list), but now we're getting into design and architecture choices of an API.
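To illustrate that stateful-color idea (purely a hypothetical sketch, not NAS2D's actual API; Renderer, Color, setDrawColor, and drawRect are all made-up names here), the color becomes renderer state that is set once, instead of being repeated in every draw call:

```cpp
#include <cstdint>

struct Color { uint8_t r, g, b; };

class Renderer {
public:
    // Set once; subsequent draw calls use this color
    void setDrawColor(Color c) { drawColor = c; }
    Color currentDrawColor() const { return drawColor; }

    void drawRect(int x, int y, int w, int h) {
        // ... would draw using drawColor ...
    }

private:
    Color drawColor{255, 255, 255};
};
```

Usage would then look like `renderer.setDrawColor({30, 50, 10});` followed by several draw calls, which keeps each draw call's parameter list focused on geometry.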
Thank you Brett for the code examples! :)
(I think that was the transcript that Leeor required. ;) )
I suppose we should add an example of one of the better ways to write and justify parameter lists:
void SomeRatherLongOverlyVerboseFunctionName(
    int param1,
    int param2
) {
    // ...
}
The parameter list is then kept together as one unit, with consistent indentation that depends only on structure, and not on name lengths. In particular, if someone renames the function to have a different name length, you don't have to re-justify the parameters to maintain a consistent indentation, and the parameter list is not split across both sides of the screen.
Leeor, that's a good point about passing a struct reference, rather than a bunch of individual parameters. There are many cases where that's the best way to do it. I suppose it depends on context though. If the parameters would be naturally bundled together, it makes sense to exploit that. If they are not naturally bundled together, or would need to be bundled together just for this one function call, then maybe not. I'd probably prefer a long parameter list to bundling together parameters in a forced artificial way. Though certainly long parameter lists can be a sign that something about the design is not so great.
As an example, here's a method with different levels of parameter bundling:
void DrawRect(int x1, int y1, int x2, int y2, unsigned char red, unsigned char green, unsigned char blue);
void DrawRect(Point p1, Point p2, Color color);
void DrawRect(Rect rect, Color color);
void SomeMethod() {
    /* Parameter meaning is not so clear at point of call */
    DrawRect(5, 10, 10, 20, 30, 50, 10);

    /* These two are more self documenting, without needing comments */
    DrawRect(Point(5, 10), Point(10, 20), Color(30, 50, 10));
    DrawRect(Rect(Point(5, 10), Point(10, 20)), Color(30, 50, 10));
}
On a related matter, when constructing a Rect, there is a choice of parameters, which might not be immediately clear:
/* Parameter list types are identical, so you can't have both overloads. Choose only one. */
Rect::Rect(int x1, int y1, int x2, int y2);
Rect::Rect(int x1, int y1, int width, int height);
/* These options are more clear, and can both be available at the same time */
Rect::Rect(Point p1, Point p2);
Rect::Rect(Point p1, Vector size);
The 80 character limit does seem a bit archaic. Even if you're using a terminal, modern systems still allow terminals of varying sizes. Granted, I default mine to 80 characters wide, so there's that. At any rate, I don't particularly mind a limit of 120 characters, or pushing past the margin just a little, but by and large, I prefer code to be largely left justified.
His comment about how people read was noteworthy. We read in columns, like newspaper articles. I've heard elsewhere that's much easier on the eye, and takes less effort to move from one line to the next without losing your place. Though I suspect that's less of a problem with code, since the greatly varying line lengths of most code already create a visual indicator that helps your eyes track which line is which.
At any rate, I've long found these widescreen computer monitors a bit silly. I spend most of my time on the computer reading, and yeah, I prefer to read in a column rather than going all the way across a wide screen monitor. I've long felt computer monitors were being expanded in the wrong direction. I'm sure widescreen is just fine for movies, but it's not what I want for reading. Curiously, I once got a monitor at work that could be swivelled 90 degrees, so it was tall rather than wide. One of the selling features was being able to see more lines of code at one time. The only problem was that the monitor's viewing angle was not the same between top-bottom and left-right. In normal widescreen mode it seemed fine, with a decent viewing angle to the sides, but when you rotated it 90 degrees, it had a rather narrow viewing angle to the sides, which affected color display. In the end, the lackluster viewing angle meant I never bothered to use it in tall mode. Would have been nice though.
Hmm, I was reading something recently about mixing levels of abstraction, and how that can lead to hard to read code. Some of the examples involved if statement expressions.
In terms of adding comments to label if/else if branches, I think that would mostly be necessary if you were mixing levels of abstraction. One solution might be to extract the checks into a method, where the name of the method provides a higher level of abstraction, and hence labelling of the if/else if branches.
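A quick sketch of that extraction idea (isValidUsername and greet are hypothetical names, not from the video): the low-level checks move into a named method, and the name then labels the branch instead of a comment.

```cpp
#include <string>

// The checks themselves live at a low level of abstraction; the method
// name presents them at a higher level.
bool isValidUsername(const std::string& name) {
    return !name.empty() && name.size() <= 32 && name.find(' ') == std::string::npos;
}

const char* greet(const std::string& name) {
    if (isValidUsername(name)) { // the method name documents this branch
        return "Welcome";
    }
    return "Invalid username";
}
```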
The main difference between pointers and references is that references are non-nullable and can not be re-seated (the reference itself is constant). It's a compile error if you try to assign null to a reference. Instead of documenting that a pointer can not accept null, consider using a reference instead. A reference implies the value refers to something valid.
Indeed, you have to go out of your way to trick the compiler into getting a null value stored in a reference variable. One way is to first create a pointer, set it to null, and then create a reference that points to the same thing as the pointer. Failing that, you'd need an ugly cast, or a really awful union, to subvert the type system.
Obj& ref = 0; // Error!

Obj* ptr = 0;
Obj& ref = *ptr; // Okay! (Does not check if ptr is null)
Obj& ref = *(Obj*)0; // Okay! (Casting has subverted the type checks)

union {
    Obj* ptr;
    // References can not be stored directly in a union, so wrap in a struct
    struct Ref {
        // Explicitly declare a constructor, since otherwise it is deleted due to the reference
        Ref();
        Obj& ref;
    } refWrapper;
} unionVar;
unionVar.ptr = 0; // Okay! (unionVar.refWrapper.ref is now also null)
Another indication you might want to use a reference is if the pointer itself is constant. References are only ever set when they are defined; they can not be re-assigned. This corresponds to a const pointer (as opposed to a pointer to const).
Or, put the other way, the two reasons for using a pointer rather than a reference are to support null values, or to be able to re-assign the pointer dynamically. If you don't need either feature, I'd recommend using a reference over a pointer.
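A minimal sketch of both properties (referenceDemo is just an illustrative name): a reference must be bound when defined, and later assignments write through it rather than re-seating it.

```cpp
inline int referenceDemo() {
    int a = 1;
    int b = 2;
    int& ref = a;       // must be initialized here; a bare "int& ref;" would not compile
    ref = b;            // copies b's value into a; ref is NOT rebound to b
    ref = 5;            // writes through ref, so a becomes 5
    return a * 10 + b;  // a == 5, b == 2
}
```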
It's probably a good idea to use division rather than a shift when the actual intent really is to divide. Although easy to forget due to its prevalence, the shift is an optimization, and one that isn't clear to non-programmers, or people unfamiliar with binary numbers. Using a division really is more clear. Plus, I don't know of any compiler that can't optimize a division by a constant power of 2 into a bit shift.
There is however one gotcha. The rounding mode (towards 0) for signed integers does not match up with a simple bit shift. If you try to divide by a constant power of 2, and only care about positive numbers, but forget to declare the variable as unsigned, you'll get a less efficient opcode sequence with extra instructions to adjust for the rounding mode.
int divideBy2(int value) {
    return value / 2;
}
_Z9divideBy2i:
.LFB1564:
    .cfi_startproc
    mov eax, edi ; Copy the value (we need some scratch space to work with the sign bit)
    shr eax, 31  ; Shift right (logical), moving the sign bit to lowest bit position
    add eax, edi ; Add 1 (round up towards 0) if value was negative, otherwise add 0 (positive)
    sar eax      ; Shift right (arithmetic), divide by 2, maintain signedness of result
    ret          ; Return result (in eax register)
    .cfi_endproc
For unsigned values, a rounding mode of towards 0 matches a rounding mode of round down, which is a simple bit shift.
unsigned int divideBy2(unsigned int value) {
    return value / 2;
}
_Z9divideBy2j:
.LFB1565:
    .cfi_startproc
    mov eax, edi ; Copy the value (this could be removed if the method was inlined)
    shr eax      ; Shift right (logical): Unsigned divide by 2
    ret          ; Return result (in eax register)
    .cfi_endproc
Ahh, that makes sense. Frontload the work. I was stuck thinking of a single divide rather than an array divide.
Thanks for the pop count link, that was a good review. The bit manipulations have a rather interesting structure.
As for the algorithm complexity, if you were being pedantic, any algorithm is constant time when you fix the input size. As the pop count instruction works on a fixed number of bits (the register size), it is of course constant time. Though you may notice from the algorithm, the actual number of instructions depends on the number of bits involved in the counting. Looks to be logarithmic in the number of bits, which is pretty efficient. And of course, having a machine opcode to do it is always handy. Looks like it can be done quite efficiently at the hardware level.
I do rather enjoy the conversion (A % B) => (A & (B - 1)) when B is a power of 2. I use a slight variant of that for rounding up to the next multiple of a power of 2: roundUp(value, multiple) = (value + (multiple - 1)) & ~(multiple - 1)
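A direct transcription of that roundUp variant, in case anyone wants to play with it (only valid when multiple is a power of 2):

```cpp
#include <cstdint>

// Rounds value up to the next multiple of 'multiple'.
// Requires 'multiple' to be a power of 2; otherwise the mask trick breaks.
constexpr uint32_t roundUp(uint32_t value, uint32_t multiple) {
    return (value + (multiple - 1)) & ~(multiple - 1);
}
```

For example, roundUp(13, 8) yields 16, while roundUp(16, 8) stays at 16 since it is already a multiple.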
I tried your suggestion to pass a constexpr through a non-constexpr function. It seems the g++ optimizer is too good. I was not able to fool it with my simple test:
unsigned int addOne(unsigned int x) {
    return x + 1;
}
unsigned int divideByTwo(unsigned int x) {
    return x / addOne(1);
}
The result of g++ -c -S -masm=intel -O2 constExprArg.cpp is:
_Z6addOnej:
.LFB0:
    .cfi_startproc
    lea eax, 1[rdi]
    ret
    .cfi_endproc
...
_Z11divideByTwoj:
.LFB1:
    .cfi_startproc
    mov eax, edi
    shr eax ; Shift right (logical): Divide by 2
    ret
    .cfi_endproc
You're right about truncating the constants, though I also believe you'll need to adjust the number of constants based on the bit size. From the Wikipedia article you linked on Hamming Weight (https://en.wikipedia.org/wiki/Hamming_weight):
For processors lacking those features, the best solutions known are based on adding counts in a tree pattern.
The height of that tree depends on the number of bits at the bottom.
If you were to look at the first algorithm listed:
//types and constants used in the functions below
//uint64_t is an unsigned 64-bit integer variable type (defined in C99 version of C language)
const uint64_t m1 = 0x5555555555555555; //binary: 0101...
const uint64_t m2 = 0x3333333333333333; //binary: 00110011..
const uint64_t m4 = 0x0f0f0f0f0f0f0f0f; //binary: 4 zeros, 4 ones ...
const uint64_t m8 = 0x00ff00ff00ff00ff; //binary: 8 zeros, 8 ones ...
const uint64_t m16 = 0x0000ffff0000ffff; //binary: 16 zeros, 16 ones ...
const uint64_t m32 = 0x00000000ffffffff; //binary: 32 zeros, 32 ones
const uint64_t h01 = 0x0101010101010101; //the sum of 256 to the power of 0,1,2,3...
//This is a naive implementation, shown for comparison,
//and to help in understanding the better functions.
//This algorithm uses 24 arithmetic operations (shift, add, and).
int popcount64a(uint64_t x)
{
    x = (x & m1 ) + ((x >> 1) & m1 ); //put count of each 2 bits into those 2 bits
    x = (x & m2 ) + ((x >> 2) & m2 ); //put count of each 4 bits into those 4 bits
    x = (x & m4 ) + ((x >> 4) & m4 ); //put count of each 8 bits into those 8 bits
    x = (x & m8 ) + ((x >> 8) & m8 ); //put count of each 16 bits into those 16 bits
    x = (x & m16) + ((x >> 16) & m16); //put count of each 32 bits into those 32 bits
    x = (x & m32) + ((x >> 32) & m32); //put count of each 64 bits into those 64 bits
    return x;
}
The constant m32 would be useless as a bitmask on 32-bit operations, as would the x >> 32 operation. Accounting for the halving/truncating of the constants, and removal of dead operations, the 32-bit equivalent should be:
const uint32_t m1 = 0x55555555; //binary: 0101...
const uint32_t m2 = 0x33333333; //binary: 00110011..
const uint32_t m4 = 0x0f0f0f0f; //binary: 4 zeros, 4 ones ...
const uint32_t m8 = 0x00ff00ff; //binary: 8 zeros, 8 ones ...
const uint32_t m16 = 0x0000ffff; //binary: 16 zeros, 16 ones ...
int popcount32a(uint32_t x)
{
    x = (x & m1 ) + ((x >> 1) & m1 ); //put count of each 2 bits into those 2 bits
    x = (x & m2 ) + ((x >> 2) & m2 ); //put count of each 4 bits into those 4 bits
    x = (x & m4 ) + ((x >> 4) & m4 ); //put count of each 8 bits into those 8 bits
    x = (x & m8 ) + ((x >> 8) & m8 ); //put count of each 16 bits into those 16 bits
    x = (x & m16) + ((x >> 16) & m16); //put count of each 32 bits into those 32 bits
    return x;
}
That is, one less tree operation for half the number of bits. Or one extra tree operation if you double the number of bits.
Well, I think I got wildly off topic on this thread :P
That's not the most optimized form of the algorithm. Here:
constexpr uint64 m1 = 0x5555555555555555;
constexpr uint64 m2 = 0x3333333333333333;
constexpr uint64 m3 = 0x0F0F0F0F0F0F0F0F;
constexpr uint64 m4 = 0x0101010101010101;
uint64 PopCount64(uint64 x) {
    x = x - ((x >> 1) & m1);
    x = (x & m2) + ((x >> 2) & m2);
    x = (((x + (x >> 4)) & m3) * m4) >> ((64 /* bit depth */) - 8);
    return x;
}
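For what it's worth, here's the same routine written with standard uint64_t (rather than the uint64 alias above), plus a naive one-bit-at-a-time loop to sanity check it against:

```cpp
#include <cstdint>

constexpr uint64_t m1 = 0x5555555555555555;
constexpr uint64_t m2 = 0x3333333333333333;
constexpr uint64_t m3 = 0x0F0F0F0F0F0F0F0F;
constexpr uint64_t m4 = 0x0101010101010101;

uint64_t PopCount64(uint64_t x) {
    x = x - ((x >> 1) & m1);                // count of each 2 bits
    x = (x & m2) + ((x >> 2) & m2);         // count of each 4 bits
    x = (((x + (x >> 4)) & m3) * m4) >> 56; // byte counts, summed via multiply
    return x;
}

// Naive loop, for comparison
uint64_t PopCountNaive(uint64_t x) {
    uint64_t count = 0;
    for (; x != 0; x >>= 1) {
        count += x & 1;
    }
    return count;
}
```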