I've got this ongoing project, which more or less tries to force export symbols from Outpost2.exe. Using this project I've been able to access code and data from Outpost2.exe much like it was written in a separate .cpp file, or declared as an extern variable. There are some downsides to certain aspects of this project, but so far they've proved minor, and don't tend to occur in practice.
Previously, this sort of thing was typically done by hardcoding pointer values, or storing the pointers in int or DWORD variables, and it used lots of ugly type casting (or inline assembly that isn't type checked). The pointer casting was especially noticeable for function pointers, which usually had rather long notations, the kind you quickly forget after not using them for a while. Accessing variables this way also required dereferencing, so it was obvious that a pointer was being used. The code ended up looking like this:
int* tickPtr = (int*)0x56EB1C; // Data pointer cast
int (*funcPtr)(int value1, int value2) = (int (*)(int, int))0x400120; // Function pointer cast
void SomeFunction()
{
    int value = *tickPtr; // Access data through pointer
    (*funcPtr)(3, 4); // Call through function pointer
    funcPtr(3, 4); // Call through function pointer (short form)
}
This method was also somewhat limited in what kinds of functions you could set up pointers to. Declaring a pointer to a member function of some class is quite a pain to do. This may be partly because the internal representation of pointers to members is sometimes larger than 4 bytes. In any case, the compiler gives an error message if you try to type cast an int into one of those pointers. Knowing the internal representation, you can always get around this with other hacks, such as using a union, but this makes already ugly notation much uglier.
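A quick way to see the size issue on a given compiler is to compare sizeof for a plain function pointer against pointers to member functions. This is only a sketch: the class names are made up for illustration, and the sizes are implementation-defined, so no particular values are claimed here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical classes, used only to have something to point at.
class Single
{
public:
    void Method() {}
};

class BaseA { public: int a; void MethodA() {} };
class BaseB { public: int b; void MethodB() {} };

// A class with two bases; some compilers use a larger member pointer here.
class Multiple : public BaseA, public BaseB
{
public:
    void Method() {}
};

typedef void (*PlainFunctionPtr)();
typedef void (Single::*SingleMethodPtr)();
typedef void (Multiple::*MultipleMethodPtr)();

// These helpers just report what the current compiler uses.
std::size_t PlainPointerSize()      { return sizeof(PlainFunctionPtr); }
std::size_t SingleMethodPtrSize()   { return sizeof(SingleMethodPtr); }
std::size_t MultipleMethodPtrSize() { return sizeof(MultipleMethodPtr); }
```

Printing the three values from a small test program shows whether the member pointers carry extra information beyond a bare code address on that compiler.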
Here is normal member function pointer access:
class SomeClass
{
public:
    void ClassMethod();
};
void (SomeClass::*methodPtr)() = &SomeClass::ClassMethod; // Setting up a function pointer
void SomeFunction2()
{
    SomeClass someClassInstance;
    SomeClass* someClassPtr = &someClassInstance; // Initialize a pointer to the class
    (someClassInstance.*methodPtr)(); // Call class method through method function pointer
    (someClassPtr->*methodPtr)(); // Call class method through method function pointer, using class pointer
}
Keep in mind where the "*" goes.
Now, if you try to type cast an int to a pointer of that form, like this:
int (SomeClass::*methodPtr2)() = (int (SomeClass::*)())0x400140; // Attempted function pointer cast
You'll end up getting the following compiler error message:
Main.cpp(35) : error C2440: 'type cast' : cannot convert from 'const int' to 'int (__thiscall SomeClass::*)(void)'
There are no conversions from integral values to pointer-to-member values
So you'd need to set the pointer using something beyond a simple type cast.
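As a rough illustration of "something beyond a simple type cast", a pointer to member can be filled in by copying raw bytes into it. In this sketch the bytes are taken from a genuine pointer to member, so it is safe to run anywhere. Building those bytes from a hardcoded address instead depends on the compiler's internal layout, which is not shown here. The Counter class and function names are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class, used only for demonstration.
class Counter
{
public:
    Counter() : value(0) {}
    int Increment() { return ++value; }
    int value;
};

typedef int (Counter::*CounterMethod)();

// Rebuild a pointer-to-member from raw bytes instead of using a type cast.
CounterMethod RebuildFromBytes(const unsigned char* bytes)
{
    CounterMethod rebuilt;
    std::memcpy(&rebuilt, bytes, sizeof(rebuilt));
    return rebuilt;
}

int CallThroughRebuiltPointer(Counter& counter)
{
    // Take the bytes of a real pointer-to-member...
    CounterMethod original = &Counter::Increment;
    unsigned char bytes[sizeof(CounterMethod)];
    std::memcpy(bytes, &original, sizeof(bytes));

    // ...and call through a pointer reconstructed from those bytes.
    CounterMethod rebuilt = RebuildFromBytes(bytes);
    return (counter.*rebuilt)();
}
```

The compiler accepts this because no conversion from an integral type is involved; the price is that nothing checks whether the bytes form a valid pointer to member.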
As a side note, I found through experimentation that this:
class SomeClass2;
int (*SomeClass2::methodPtr3)() = (int (*)())0x400140;
Produces the following compiler error message:
Main.cpp(41) : fatal error C1001: INTERNAL COMPILER ERROR
(compiler file 'msc1.cpp', line 1794)
Please choose the Technical Support command on the Visual C++
Help menu, or open the Technical Support help file for more information
Notice the misplaced "*". Also, I used a forward declaration for a class, whose definition was never supplied. Please note that "msc1.cpp" is not one of my files. Probably an indication that we're getting into some fringe area of the language/compiler, where not too many people have trodden before.
As another side note, I once got a core dump from a compiler on Unix. Might have been with g++. (Not exactly the kind of encouragement I needed during the wee hours of the morning, while trying to get an assignment done by 9:00 AM).
Anyways, back on topic. We'd like some way to call class member functions inside Outpost2.exe. Assume we know the address of the function, its calling convention, number of parameters, and return type. If this were code in a separate .cpp file, we'd just have included its header file to make use of it. Now remember that C++ compilers compile each .cpp file separately. By simply writing a header file for the class, and including it, we can get something to compile into a .obj file. It's not until the link stage, when the linker goes looking for the code, that we get an error about an undefined symbol. For example:
class A
{
public:
    void F1();
};
void G(A* a)
{
    a->F1();
}
Will compile, and then fail to link, since we haven't defined the class member function F1(). Compiling this produces the following compiler output:
Compiling...
Main.cpp
Linking...
Main.obj : error LNK2001: unresolved external symbol "public: void __thiscall A::F1(void)" (?F1@A@@QAEXXZ)
Notice the separate "Compiling..." and "Linking..." steps. You can also verify that the compiler produced a Main.obj in the intermediate files folder. If you were to turn on assembly listing, you could even see that it produced code to call that function F1, but the assembly listing is really messy and hard to read, so we won't do that here. Also, note that the decorated symbol name is given in the error message. We'll make use of this later.
Now what is a symbol? It's essentially a named memory address. Since we already know what address we want, we just need a way of defining that symbol to have that address. Once we do that, we can get the linker to stop complaining, and produce an exe or dll. Granted, the C++ compiler doesn't seem to have a way to assign those symbols a value, but your typical assembler will.
What you can do, is create an assembly file (NASM format used here), and define that symbol to have the address of the code using something like:
symbolName EQU numericValue
We also need the symbol to be accessible from outside the .obj file the assembler will produce, otherwise the linker won't see it. We do this by defining the symbol to be global. We end up with something like this:
global ?F1@A@@QAEXXZ
?F1@A@@QAEXXZ EQU 0x400140
Here I actually filled in the symbol name, and some (random) numeric value.
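Once there are more than a handful of symbols, the two assembly lines per symbol could be generated from a table. The sketch below does that in C++, emitting exactly the "global" and "EQU" lines shown above. The SymbolAddress struct and GenerateNasmSymbols function are hypothetical helpers, not part of the project.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical table entry: a decorated name and the address to bind it to.
struct SymbolAddress
{
    const char* decoratedName;
    unsigned long address;
};

// Produce NASM source text: one "global" line and one "EQU" line per symbol.
std::string GenerateNasmSymbols(const SymbolAddress* symbols, int count)
{
    std::ostringstream out;
    for (int i = 0; i < count; i++)
    {
        out << "global " << symbols[i].decoratedName << "\n";
        out << symbols[i].decoratedName << " EQU 0x"
            << std::hex << std::uppercase << symbols[i].address
            << std::dec << "\n";
    }
    return out.str();
}
```

Writing the returned string to a .asm file gives something that can be fed to the assembler like a hand-written file. The decorated names still have to be copied from the linker's error messages.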
Assemble the .asm file using NASM, and then link it all together to produce an exe or dll. NASM is a free download from SourceForge, and pretty easy to set up and use. If you include the assembly file in your project, it can be built along with the rest of your project, but you'll probably have to set up a custom build rule to tell your IDE how to build it. In MSVC this can be done by right-clicking on the file, and choosing Settings. From there, you can enter a command line under the Custom Build tab to assemble the file. I use the following settings:
Description: Performing Custom Build Step on $(InputPath)
Commands: NASMw -f win32 -o "$(InputDir)\Build\$(InputName).obj" "$(InputPath)"
Outputs: $(InputDir)\Build\$(InputName).obj
The -f sets the output format to win32 (COFF), and the -o sets the output file name. The rest is just where the input and output files are located (and I used macros provided by MSVC). I put the output file in a "Build" subfolder from where the source file is. This is not the usual place for output files, but my project was set up a bit differently. I'll probably get to that in a later post.
There is probably a similar setup for CodeBlocks, although I don't know it. I did notice that the default assembler used by CodeBlocks is MASM, not NASM. The syntax is not compatible, so you'd either have to convert the assembly file, or set up the IDE to use a different assembler.
If all that fails, you can always assemble the file yourself from the command line using Nasmw (the Windows version of NASM), and if you can't get your IDE to include the extra .obj file in the link step, you can also call Link yourself from the command line. It's usually a good idea to place Nasmw.exe somewhere in your path, or a bin folder where your IDE will look for it, so you don't always have to specify its location.
So what does all this accomplish? Well, you can build header files in C++ describing all the code and data in Outpost2.exe, and then just include them in your source files, enabling you to make direct calls to the internal functions of Outpost2.exe, and access all the variables, just as if it were code from a separate .cpp file. Some code to dump a list of all tech names to a file at tick 3 might look like this (untested):
if (tethysGame.tick == 3)
{
    int i;
    ofstream techList("TechList.txt");
    for (i = 0; i < research.numTechs; i++)
    {
        techList << i << ") " << research.techInfo[i]->techName << endl;
    }
}
Note the casing of tethysGame, and the use of .tick, instead of .Tick(). Here, tethysGame is a global class instance, and we are reading a member variable. This was done by declaring a TethysGame class (in a namespace, to prevent redefinitions with the class of the same name which is exported by Outpost2.exe) in a header file, and then an extern variable of that type using "extern TethysGame tethysGame;". The address of this variable was then placed in the assembly file, along with the decorated name of this global variable, enabling access in this manner. Similarly, a header file was created for the Research class, and a global extern variable named "research", as well as a TechInfo struct, whose fields are being accessed to find the tech name. (Code very similar to this was used to produce the tech list I posted a few days ago). Compare that to the older method of writing (untested):
if (*(int*)(0x56EB1C) == 3)
{
    int i;
    ofstream techList("TechList.txt");
    for (i = 0; i < *(int*)0x56C230; i++)
    {
        techList << i << ") " << *(char**)((*(int**)0x56C234)[i] + 0x28) << endl;
    }
}
It took me a few minutes to get the casting to work so it'd compile.
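For comparison, the declarations behind the cleaner version might look roughly like the sketch below. The field layout is inferred only from the offsets visible in the cast-heavy version (numTechs first, the techInfo pointer 4 bytes later, techName at offset 0x28). The namespace name, the padding field, and the single-field TethysGame are assumptions. So that the sketch links and runs on its own, the extern variables get stand-in definitions here, whereas in the real setup their addresses would come from the assembly file.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

namespace OP2Internal // hypothetical namespace name
{
    struct TethysGame
    {
        int tick; // only the field used above; the real layout is not shown
    };

    struct TechInfo
    {
        char unknown[0x28];   // fields not described here (assumed padding)
        const char* techName; // offset 0x28, matching the "+ 0x28" in the old code
    };

    struct Research
    {
        int numTechs;
        TechInfo** techInfo;
    };

    extern TethysGame tethysGame;
    extern Research research;

    // Stand-in definitions so this sketch is self-contained.
    TethysGame tethysGame = { 0 };
    Research research = { 0, nullptr };
}

static_assert(offsetof(OP2Internal::TechInfo, techName) == 0x28,
              "techName should sit at offset 0x28");

// Same loop as above, writing to a string instead of a file.
std::string DumpTechListIfTick(int wantedTick)
{
    using namespace OP2Internal;
    std::ostringstream techList;
    if (tethysGame.tick == wantedTick)
    {
        for (int i = 0; i < research.numTechs; i++)
        {
            techList << i << ") " << research.techInfo[i]->techName << "\n";
        }
    }
    return techList.str();
}
```

The static_assert is a cheap guard: if a field is added or resized by mistake, the build fails instead of silently reading the wrong memory.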
The previous example was simple data access. What about calling class member functions? Well, not only can you call class member functions, but once you have all of a class's member variables declared, so that its size is correctly determined by the compiler, you can also declare local variables of existing class types. The compiler will even call a constructor function located inside Outpost2.exe to initialize your object. This lets you code something like:
MultiplayerPreGameSetupWnd preGameSetupWnd;
preGameSetupWnd.ShowHostGame(hostGameParameters);
Complete with IntelliSense, giving you the list of member functions as you type that ".". (Yes, that's pretty much what's in the NetPatch I've been working on).
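Since local variables only work if the declared size matches the real one, a compile-time check on the size and on key offsets can catch a mis-declared field early. The sketch below shows the idea on a made-up struct. ExampleWnd, its fields, and the numbers 0x24 and 0x20 are purely hypothetical and do not describe any real class in Outpost2.exe.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class layout, with unknown regions filled in by padding.
struct ExampleWnd
{
    int  firstField;    // offset 0x00
    char unknown[0x1C]; // offsets 0x04..0x1F, purpose not known
    int  lastField;     // offset 0x20
};

// The size and offset the declaration is expected to have (assumed values).
const std::size_t ExpectedExampleWndSize = 0x24;

static_assert(sizeof(ExampleWnd) == ExpectedExampleWndSize,
              "ExampleWnd does not have the expected size");
static_assert(offsetof(ExampleWnd, lastField) == 0x20,
              "lastField is not at the expected offset");
```

If a field is declared with the wrong type, the build stops with a readable message instead of the object silently overrunning its stack space.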
This can be taken even further too. You can also declare your own classes derived from classes that exist in Outpost2.exe. Here's a somewhat larger example doing exactly that:
class CheatView : public CommandPaneView
{
public:
    // Member variables
    // ----------------
    // vtbl
    // ----------------
    UICommandButton disasterButton[4];
    CreateUnitCommand spawnDisasterCommand[4];
public:
    // Virtual member functions (inherited)
    virtual void UpdateView() {};
    virtual void OnAddView()
    {
        // Setup action parameters
        spawnDisasterCommand[0].unitType = mapMeteor;
        spawnDisasterCommand[0].cargo = (map_id)1;
        spawnDisasterCommand[1].unitType = mapLightning;
        spawnDisasterCommand[1].cargo = (map_id)1;
        spawnDisasterCommand[2].unitType = mapVortex;
        spawnDisasterCommand[2].cargo = (map_id)1;
        spawnDisasterCommand[3].unitType = mapEarthquake;
        spawnDisasterCommand[3].cargo = (map_id)1;
        // Set the button actions
        disasterButton[0].command = &spawnDisasterCommand[0];
        disasterButton[1].command = &spawnDisasterCommand[1];
        disasterButton[2].command = &spawnDisasterCommand[2];
        disasterButton[3].command = &spawnDisasterCommand[3];
        // Add the buttons to the display
        AddButtons(this, 4,
            &disasterButton[0], "&Meteor", "Create &Meteor",
            &disasterButton[1], "&Storm", "Create &Storm",
            &disasterButton[2], "&Vortex", "Create &Vortex",
            &disasterButton[3], "&Earthquake", "Create &Earthquake");
    };
    virtual void OnRemoveView() {};
    virtual bool IsNewView() { return 1; };
    virtual void Draw(Rect* drawRect, GFXClippedSurface* surface) {};
    virtual void SetReportPageIndex() {};
    virtual bool DoesUnitSelectionChangeCauseUpdate() { return 0; };
    virtual void OnAction() {};
    virtual int GetSelectedReportButtonIndex() { return 0; };
};
This is a section of code I used for the DLL I posted a while back with the custom user interface that let you create disasters, and target them with the mouse. I did something similar for a custom mouse control class, as well as for a custom filter class that checked for the hotkey to display my hidden user interface. (And no, the code doesn't work in multiplayer).
Limitations
------------
There are a few limitations with this approach. Symbols used for relative jumps and calls are treated a bit differently than symbols used to access absolute data, such as variables and virtual function table entries. With relative jumps and calls, you need to adjust the address of the symbol by the load address of the DLL, or it will link to the wrong address. For absolute data addresses, there is no adjustment. This means the same symbol cannot be accessed in both ways. If you want a virtual function to be inheritable by a derived class, you'll need the absolute address associated with the symbol. That way, the correct absolute address will be placed in the virtual function table of the derived class. If this is done, then you can no longer call the original function in a relative way. This can occur if you prefix the method with the class name using the scope resolution operator ( :: ), such as when chaining to a base class function from an overridden virtual function. In practice, I've only run into this problem once in a very minor way, and there was an easy workaround.
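To see why relative references are sensitive to where the calling code ends up, it helps to look at the arithmetic of an x86 near call (opcode E8), which stores the distance from the end of the instruction to the target. The sketch below only models that arithmetic. It does not reproduce what the linker actually does with these symbols, and apart from 0x400140, which is the example value used earlier, any addresses fed to it are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// A near call is 1 opcode byte (E8) followed by a 4-byte displacement.
const std::uint32_t CallInstructionLength = 5;

// Displacement stored in the instruction: target relative to the next instruction.
std::uint32_t EncodeRel32(std::uint32_t callSite, std::uint32_t target)
{
    return target - (callSite + CallInstructionLength);
}

// Address the CPU actually jumps to when it executes the call.
std::uint32_t ResolveRel32(std::uint32_t callSite, std::uint32_t displacement)
{
    return callSite + CallInstructionLength + displacement;
}
```

If a displacement is computed for a call site at address S but the code really runs at S plus some load address, ResolveRel32 lands that same amount past the intended target. An absolute reference, such as a virtual function table entry, stores the target itself and has no such dependence.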
The other potential limitation I noticed is with virtual destructors. The virtual function table will point to either a Scalar Deleting Destructor, or to a Vector Deleting Destructor, which are both different from the normal destructor. These all have separate decorated symbol names too, so the correct function address needs to be associated with the correct symbol. The limitation here comes from a possible mismatch between the type of destructor you need to call virtually and the function that was actually implemented. The Scalar and Vector deleting destructors take two bit flags packed in a 4 byte int. The lowest bit, if set, means the object should free itself after running any destructor code. The next bit is used by the Vector deleting destructor only, and is used to check if an array of objects needs to be destructed, rather than just one object. The potential problem, then, is that the virtual function table might only have a scalar deleting destructor, which might be called if you try to delete[] an array of that object type. This will likely cause a crash. However, I've had trouble trying to get the compiler to generate dangerous code to demonstrate this case, and I'm not exactly sure what conditions would actually cause this sort of thing to happen. Plus, I don't see much reason to go deleting arrays of the built-in object types, or really doing much of anything that would cause issues with the destructors.
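The flag handling described above can be modeled in a few lines. This is only a toy model built from that description, not actual compiler output. The function and struct names are hypothetical, and how the real vector version finds the array length is left out (it is passed in as a parameter here).

```cpp
#include <cassert>

const unsigned int FlagFreeMemory = 1; // lowest bit: free the memory afterwards
const unsigned int FlagIsArray    = 2; // next bit: an array is being destructed

// What a deleting destructor call ends up doing, in this model.
struct DestructResult
{
    int  objectsDestructed;
    bool memoryFreed;
};

// Scalar version: ignores the array bit and always destructs one object.
DestructResult ModelScalarDeletingDestructor(unsigned int flags)
{
    DestructResult result;
    result.objectsDestructed = 1;
    result.memoryFreed = (flags & FlagFreeMemory) != 0;
    return result;
}

// Vector version: checks the array bit to decide how many objects to destruct.
DestructResult ModelVectorDeletingDestructor(unsigned int flags, int arrayCount)
{
    DestructResult result;
    result.objectsDestructed = (flags & FlagIsArray) ? arrayCount : 1;
    result.memoryFreed = (flags & FlagFreeMemory) != 0;
    return result;
}
```

With both bits set and an array of 4 objects, the vector model destructs all 4 while the scalar model destructs only 1 before freeing the memory, which is the mismatch that could cause trouble.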
I'll begin posting source to parts of this project as I clean it up.