Outpost Universe Forums

Off Topic => Computers & Programming General => Topic started by: Hooman on September 18, 2019, 02:09:01 AM

Title: Static initializers will murder your family
Post by: Hooman on September 18, 2019, 02:09:01 AM
Hehe, saw this and found it kind of funny:
Static initializers will murder your family (https://meowni.ca/posts/static-initializers/)

We ran into a static initialization order problem with op2ext a little while back. It was causing crashes in the test code on the continuous integration servers. It was somewhat random when it crashed, making it hard to know quite what was causing it.
Title: Re: Static initializers will murder your family
Post by: leeor_net on September 18, 2019, 06:27:06 PM
Holy shit, that post is awesome! I don't know if C++17 or c++20 will improve any of the issues but the writing style is fabulous. :D
Title: Re: Static initializers will murder your family
Post by: Hooman on September 19, 2019, 05:20:53 PM
Yeah, no kidding.

Quote
As Douglas Adams, the inventor of C++ said, ...
Lol
Title: Re: Static initializers will murder your family
Post by: Arklon on September 19, 2019, 10:19:43 PM
It should be noted that constexpr guarantees it will be static initialized at compile time rather than runtime.
Title: Re: Static initializers will murder your family
Post by: Hooman on September 20, 2019, 07:00:24 AM
Yes, constexpr is wonderful.

Now if only we could constexpr Outpost 2. Then we can have our fun before even playing the game. :P
Title: Re: Static initializers will murder your family
Post by: Vagabond on September 20, 2019, 08:24:26 PM
I consider the static initialization order as it stands in op2ext a decent sized flaw in how we have programmed it. It bothers me, but I don't know how to fix it yet.

I'd like to read the article, but hardly have enough time to program and keep up with the op2ext changes right now as other things going on in life. Hopefully will get to it later.

-Brett
Title: Re: Static initializers will murder your family
Post by: lordpalandus on September 20, 2019, 10:49:18 PM
Question: I don't fully understand the problem, but can't you use a preprocessor to force specific files to compile in a specific order?
Title: Re: Static initializers will murder your family
Post by: Hooman on September 21, 2019, 02:29:32 AM
That's not really what the pre-processor is for. At least not if you're talking about how it's typically used with header files.

In C++ each compilation unit (.cpp file, and recursively all the header files it includes), are compiled into an object file. Once all the source files are compiled into object files, they are linked into an executable file (or a DLL, or a static library, depending on project type).

During the link step, the order in which the object files are linked together is not specified. It might be the order the object files were specified on the command line to the compiler. Depending on the build system, that order might simply match the order of the directory listing returned by the filesystem, or it might be the order source code was added to the project file, or it might just be alphabetical order. At any rate, the order is not specified by the C++ standard.

The order of linking of the object files will affect the initialization order of global variables in those object files. If file A is linked before file B, you might expect the globals in file A to be initialized before the globals in file B. If you reverse the ordering of the linking, you might expect to reverse the order in which global variables are initialized.

This could be a problem if the initialization of a global in file A depends on the value of a global in file B.

Code: [Select]
// A.cpp
ObjectTypeA objectA(objectB.getValue());


// B.cpp
ObjectTypeB objectB;

If objectA is initialized first, then its constructor call will try to use the not yet initialized objectB in the expressions objectB.getValue(). That will likely result in accessing uninitialized memory, which is quite likely to result in a crash.



There is a workaround, though it can seem a bit hacky at times. You put all global variables in the same .cpp file. The order of initialization within a compilation unit is well defined. The variables are initialized in the order they are defined. That means you could write the following:

Code: [Select]
// Globals.cpp
ObjectTypeB objectB;
ObjectTypeA objectA(objectB.getValue());



So could you pre-process stuff and fix the problem? I suppose you could for this particular case. You could do something really ugly and non-standard, like include .cpp files instead of .h files. That could combine the whole project into one compilation unit, wherein initialization order is well defined, based on definition order. You'd then need to carefully order your #include lines:
Code: [Select]
// CombinedFiles.cpp
// Include .cpp files (not .h files!)
#include "B.cpp"
#include "A.cpp"  // A depends on B, so must come after it

Of course then you lose the firewall aspect of having multiple compilation units. If you change anything, even a comment, you now have to recompile the entire project since it's all one compilation unit. Considering how slow C++ compilers typically are, even for medium sized projects, that's a major pain. For a large project, a full recompile could take a few hours. If you keep code separated into different source files, you only need to recompile a section when something in that compilation unit changes. The linking step at the end is pretty fast.





Another problem where globals can get you in trouble with initialization, and where re-ordering things won't help you, is when a global ends up making a call to itself from its constructor. That can happen in all sorts of strange and convoluted ways, and if often quite indirect.

Case in point, I just noticed something today in op2ext. We have a global logging object. The logging object logs messages to a file. When it is constructed, it tries to open the log file for output, which could fail. If it fails, it reports the error to the user with a message box. Over time, use of the message box grew, so now it is used to report errors in several places in the code. At some point, probably weeks later, it was decided that these errors shouldn't just be displayed, but also logged.

So now, if the logger fails to open it's log file, it reports the error, which tries to log the error.

Amazingly it doesn't crash, but that's a bit to do with luck. The logging object is initialized just enough before reporting the error that the reporting system can use it without crashing, though it can't write anything to the non-opened file. I'm pretty sure it's still undefined behaviour though.

The structure looks something like this:
Code: [Select]
// Logger.cpp
Logger::Logger() : logFile(logFileName) {
  if (!logFile.open()) {
    PostErrorMessage("Failed to open log file");
  }
}

// LotsOtherFiles.cpp
// ...
// PostErrorMessage("Blah!");
// ...
// PostErrorMessage("Blah!");
// ...
// PostErrorMessage("Blah!");


// PostErrorMessage.cpp
PostErrorMessage(std::string message) {
  auto formattedMessage = getTimeString() + getExtraDetail() + message
  logger.Log(formattedMessage); // Added a few weeks later after PostErrorMessage became a general reporting tool
  MessageBox(formattedMessage);
}



Edit: I would just like to add that I was fully responsible for adding the logger.Log(formattedMessage); line. It seemed like a good idea at the time.
Title: Re: Static initializers will murder your family
Post by: lordpalandus on September 22, 2019, 02:35:19 PM
I'm just surprised that you cannot state what thing to compile in a specific order. Everything else about programming is sequential; do this, then this, then this, finish with this. I'd have thought that you could force the compiler to compile things in a specific order. Or maybe you can, but its a pain in the ass to setup and do.
Title: Re: Static initializers will murder your family
Post by: Vagabond on September 22, 2019, 08:52:43 PM
From Hooman:
Quote
I would just like to add that I was fully responsible for adding the logger.Log(formattedMessage); line. It seemed like a good idea at the time.

Hey, I was going to take credit for that line of code.

@lordpalandus,

I think to some degree it does compile in a specific order, but if the required behaviour is not specified in the C++ standard, each compiler is allowed to implement their solution in their own way. So you cannot be sure it will work between each vendor's compiler (or between different versions of the same vendor's compiler). It is best to avoid depending on this sort of behaviour in your code because it makes it very brittle. Someone can correct this paragraph since I'm probably mis-speaking a decent amount.

-Brett
Title: Re: Static initializers will murder your family
Post by: TechCor on September 23, 2019, 09:09:44 PM
I'm curious why you would want to rely on static init for anything complex. Initializing a system should be a direct function call, such as Initialize().

Of course, I'm not a fan of globals to begin with and try to keep them to a minimum.
Title: Re: Static initializers will murder your family
Post by: lordpalandus on September 23, 2019, 10:14:50 PM
I love global variables personally, but it makes saving them to file a real chore.
Title: Re: Static initializers will murder your family
Post by: Hooman on September 24, 2019, 06:15:29 AM
Quote
Hey, I was going to take credit for that line of code.

Ok, fine, you can have it! :P

...

Actually, if we want to play the blame game... ;)
Code: [Select]
git blame <previous commit SHA> -- srcStatic/GlobalDefines.cpp
Quote
... somewhere in PostErrorMessage
...
f52c2711 GlobalDefines.cpp           (Brett208 2018-12-21 15:34:42 -0500 29)    logger.Log(formattedMessage);
...

It was you that wrote it!  ;D

Don't you just love version history. It never lies.  ;)

... Though I'm pretty sure I was the one that suggested that line.



I normally try to avoid globals. Though with something like logging, you really don't want to be passing a logging object around to *every* function in your app that might want to log some event or problem.

In the end, we changed the logging to use global functions, and added protective layers around the underlying objects so they aren't really global anymore. At least their visibility isn't global. They may still be statically initialized if so desired, though it would be awkward to activate a logger at that point.

Global functions seem much less offensive than global objects.