I did a bit more tweaking to the OP2Helper project before uploading it.
I filtered out the OP2Helper.lib file, which should never have been added to source control in the first place. This project is a static library, so the lib file is the output, much like a DLL or EXE is the output of other projects.
git filter-branch --index-filter 'git rm --cached --ignore-unmatch -r Lib' --prune-empty --tag-name-filter cat -- --all
Note the Outpost2DLL project does contain a lib file, but that is used as an input to link to Outpost2.exe. The Outpost2DLL.lib file would essentially be an output when compiling Outpost2.exe, which we don't have source code for. This lib file had to be generated through other means, from the Outpost2 executable file itself. That is why the Outpost2DLL project contains a .lib file in the repository, but other projects don't.
I searched history to find possibly missed traces of .lib files, perhaps outside of the Lib folder. To get a fairly raw dump of what the repository is holding, I found rev-list to be useful:
git rev-list --objects --all
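To narrow that dump down to just the .lib files, the output can be piped through a case-insensitive filter. This is only a minimal sketch; the helper name only_lib_paths is made up for illustration and is not part of git:

```shell
# Hypothetical helper (the name is an assumption, not a git command):
# read "<hash> <path>" lines on stdin and keep only those whose path
# ends in .lib, ignoring case.
only_lib_paths() {
    grep -i '\.lib$'
}

# Intended use, from inside the repository:
#   git rev-list --objects --all | only_lib_paths
```

No output means no object reachable from any ref still has a path ending in .lib.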
I also came across a way to get a list of the 10 largest files in the repository. It's a good way to find out whether, somewhere in history, you've accidentally committed something big that is eating up a lot of repository space. First run git gc to generate pack files. Then combine rev-list with verify-pack. This outputs a list of hashes and file names:
git gc
git rev-list --objects --all | grep "$(git verify-pack -v .git/objects/pack/*.idx | sort -k 3 -n | tail -10 | awk '{print$1}')"
I found a command that lists hashes, file sizes, and file names, all sorted by size. It's a bit more complicated, but the output is very nice:
git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | awk '/^blob/ {print substr($0,6)}' | sort --numeric-sort --key=2 | cut --complement --characters=13-40 | numfmt --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest
I took a look at the tags. It seems two branches were created for the SDK-V2 tag. In Subversion, tags are actually full branches, and you can commit to them, though it warns you not to. Nevertheless, someone did indeed commit to the SDK-V2 branch, which is why the tag showed up twice during the conversion.
I tagged the original branch point as "v2.0.0" and the most recent SVN version as "v2.1.0". For the commit that was erroneously added on top of the tag branch in SVN, I tracked back to its proper branch point, created a new branch, and cherry-picked the changes over to the new branch as the first commit. I then updated the code to make the style a bit more consistent with the rest of the SDK as a second commit. The changes were then merged into the master branch as a merge commit.
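A rough sketch of that branch, cherry-pick, and merge sequence follows. The function names and all arguments are placeholders for illustration, not the actual commits or branch names used, and it assumes the main branch is called master:

```shell
# Hypothetical helpers; names and arguments are illustrative only.

# Create a new branch at the proper branch point and replay the stray
# commit onto it.
#   $1 = branch point commit, $2 = stray commit, $3 = new branch name
replay_commit() {
    git checkout -q -b "$3" "$1" &&
    git cherry-pick "$2"
}

# Merge the named branch into master, forcing a merge commit even when
# a fast-forward would be possible.
#   $1 = branch to merge
merge_into_master() {
    git checkout -q master &&
    git merge --no-ff -m "Merge branch '$1'" "$1"
}
```

Any follow-up commits, such as a style cleanup, would be made on the new branch between the two calls. Tags can be attached to specific commits with git tag followed by the tag name and the commit.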
During all this, I found my editor was trying to make changes to unexpected lines. Eventually I figured out it was cleaning up sloppy whitespace, such as spaces and tabs at the end of lines. Rather than blend the changes in with the rest of the merge, or fight it indefinitely, I opted to clean up the whitespace for the whole project in an additional commit.
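For reference, the same kind of cleanup can be done from the command line instead of an editor. This is a minimal sketch, not the method actually used here; the helper name strip_trailing_ws is made up:

```shell
# Hypothetical helper (the name is an assumption): read text on stdin and
# remove spaces and tabs at the end of each line. [[:blank:]] matches only
# space and tab, so a carriage return from CRLF line endings is left alone.
strip_trailing_ws() {
    sed 's/[[:blank:]]*$//'
}

# One possible way to apply it to every tracked text file that has trailing
# whitespace (assumes GNU sed for -i; git grep -I skips binary files):
#   git grep -lIz '[[:blank:]]$' | xargs -0 sed -i 's/[[:blank:]]*$//'
```

Reviewing the result with git diff before committing is a good idea, since a bulk edit like this touches a lot of files at once.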
I also investigated the BaseBuilderV2 branch. I found the code had all been committed to trunk in SVN, seemingly by accident, in various out-of-order stages at later dates. Hence the branch contained no additional code. After verifying it really was all already on master, I chose to just discard the branch.
Well, that was a long, convoluted learning process. The OP2Helper conversion is finally up on GitHub though.
Now that I know much more about what I'm doing, I think I might want to revisit the Outpost2DLL project. I encountered some of the same problems there, with the noisy history from the project being moved around during SVN repository layout updates. It might be worth redoing that conversion before people start using it.