It's a beautiful idea, really. And yes, simpler. Though I've always viewed it as pretty close to disassembling: it seems like you'd need to know a lot of the same information to do it correctly. The one difference I see is not needing to distinguish between code and data. Considering that problem is not solvable in the general case, that could actually be a pretty big deal.
Right, and it would still take a ton of manual work. A lot of it would be defining the sizes of all global objects in the .(r)data section. That's needed in order to associate a base relocation with an object when it points inside the object but past its base address, and those are pretty common. And, of course, defining decorated symbol names, at least for things you actually need to use or replace in your own code. And your own code needs to have headers written for all this stuff.
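To illustrate why the object sizes matter, here is a minimal sketch (not the actual tool's code) of mapping a relocation target back to "symbol + offset". The `GlobalObject` type, the `resolve_target` name, and the addresses are all hypothetical, made up for the example.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional, Tuple


class GlobalObject(NamedTuple):
    """Hypothetical record for one global: name, start address, size in bytes."""
    name: str
    address: int
    size: int


def resolve_target(objects: List[GlobalObject], target: int) -> Optional[Tuple[str, int]]:
    """Map a relocation target address to (symbol name, offset into symbol).

    `objects` must be sorted by address. Returns None when the target does
    not fall inside any known object, which is exactly the case where a
    missing or wrong size would leave the relocation unassociated.
    """
    addresses = [obj.address for obj in objects]
    # Find the last object whose start address is <= target.
    index = bisect_right(addresses, target) - 1
    if index < 0:
        return None
    obj = objects[index]
    offset = target - obj.address
    if offset >= obj.size:
        return None  # Past the end of the nearest preceding object.
    return obj.name, offset


# Made-up addresses, purely for illustration.
objects = [
    GlobalObject("g_table", 0x1000, 0x40),
    GlobalObject("g_flag", 0x1040, 4),
]
inside_table = resolve_target(objects, 0x1010)
past_flag = resolve_target(objects, 0x1044)
```

Without the size of `g_table`, there would be no way to tell that 0x1010 belongs to it rather than to some unnamed object starting in between.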
How far along was the reverse linker? Was it usable in certain cases? What was unfinished?
I had a pretty decent auto-analysis scheme working well enough to use on Outpost2.exe. It was intended to output a text representation of the exe with the reconstructed symbol and relocation tables, so that can be manually curated before being fed back in to produce the obj. LLVM (which is what I'm basing it on) already has well-defined APIs for YAML, including a COFF YAML representation, so that'd be the most convenient format to use. Outputting that YAML file after auto-analysis is all it does right now.
So, obviously it still needs to write out the obj, which would actually not be nearly as much work as parsing the input exe, the auto-analysis, etc. The auto-analysis needs to support some exe features I haven't bothered with yet, like SEH and delayed imports. Outpost2.exe doesn't use those, though, so I can probably just not care. There are also still some bugs: my code to deal with common symbols (meaning references either to the base address or to somewhere inside the exe headers, before the first section) isn't working for some reason, but again that's not a concern with Outpost2.exe. I need to write an IDA plugin that also outputs a YAML file, so you can use IDA as a much better curation tool than editing the YAML by hand, or use its auto-analysis, which is better than anything I'd ever have time to write myself. Or, just forget about using my auto-analysis at all, and just make the plugin.
It also needed code written to handle loading the YAML contents back in. I'd imagine it being able to take multiple YAML files as inputs at once, so you could have manual symbol definitions overlaid on the auto-analysis definitions, i.e. a symbol (address) defined in file 1 would take precedence over definitions of it in files 2, 3, etc.
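The precedence rule is simple enough to sketch. This is only an illustration of the idea, assuming each input file has already been loaded into a dict keyed by address; the `overlay_symbols` name and the sample symbols are hypothetical, and the real thing would work on LLVM's COFF YAML structures instead.

```python
from typing import Dict


def overlay_symbols(*layers: Dict[int, str]) -> Dict[int, str]:
    """Merge symbol maps so that earlier layers win on conflicting addresses.

    Each layer maps an address to a symbol name. Layer 1 takes precedence
    over layer 2, which takes precedence over layer 3, and so on.
    """
    merged: Dict[int, str] = {}
    for layer in layers:
        for address, name in layer.items():
            # setdefault keeps the existing entry, so the first definition
            # seen for an address is the one that survives.
            merged.setdefault(address, name)
    return merged


# Made-up definitions: a hand-curated file and an auto-analysis file.
manual = {0x401000: "CuratedName"}
auto = {0x401000: "sub_401000", 0x402000: "sub_402000"}
merged = overlay_symbols(manual, auto)
```

Here the hand-curated name for 0x401000 survives, while 0x402000 falls through to the auto-analysis definition.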
To address having to manually define decorated symbol names and make headers for your code to use anything: since I was already working off of LLVM, I might as well use libclang to make a tool that parses source code for specially formatted comments (or whatever) that define where each symbol is in Outpost2.exe, and outputs another YAML file with MS-compatible decorated symbol names etc. So your workflow to do both things just involves writing the headers, and the tool deals with generating decorated symbol names in the right format for use with the reverse linker. Instead of, you know, dealing with the insanity of ForcedExports.asm.
It seems like it should be pretty easy to parse C++ comments using libclang and to use COFF YAML, just not as easy as deciding to be lazy instead.
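As a rough picture of the annotated-header idea, here is a small sketch. It uses a plain regex as a stand-in for libclang, and the `// @addr 0x...` comment format, the function name, and the sample declarations are all assumptions invented for the example; nothing like this format was actually settled on. It also does not attempt name decoration, which is the part that really needs a proper C++ parser.

```python
import re
from typing import List, Tuple

# Hypothetical annotation format: a whole-line comment such as
#   // @addr 0x00401000
# placed directly above the declaration it describes.
ANNOTATION = re.compile(r"//\s*@addr\s+(0x[0-9A-Fa-f]+)")


def extract_annotations(source: str) -> List[Tuple[int, str]]:
    """Return (address, declaration text) pairs for annotated declarations."""
    results: List[Tuple[int, str]] = []
    pending = None  # Address from the most recent annotation, if any.
    for line in source.splitlines():
        stripped = line.strip()
        match = ANNOTATION.fullmatch(stripped)
        if match:
            pending = int(match.group(1), 16)
            continue
        if pending is not None and stripped:
            # The first non-blank line after an annotation is taken to be
            # the declaration it refers to.
            results.append((pending, stripped))
            pending = None
    return results


# Made-up header contents, purely for illustration.
header = """\
// @addr 0x00401000
int __fastcall DoThing(int a);

// plain comment, no annotation
void Other();
"""
annotations = extract_annotations(header)
```

A real version would hand the declaration to the compiler front end to get the MS-decorated name, then emit the address and name pair into YAML.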
I shelved it because the linker I was basing this on, LLD, had most of its codebase rewritten. The rewrite did actually look like a huge improvement in code simplicity and speed, but I was just too lazy to feel like doing code archaeology all over again, considering I had already spent 3 weeks doing nothing but reading over LLVM's codebase before I even started writing any code.