I've been thinking more on how to accept input from the console for the OP2MapImager. This is the first legitimate console application I've helped designed, so please bear with my learning curve.
Application Assumptions
- OP2MapImager executable must be placed in the same directory as the OP2 Tile Set Bitmaps for the associated map
- User must unpack tile sets and maps from their associated .VOL files manually. (Plan on eventually automating this)
Currently I was considering something like this
OP2MapImager [OutputFileType = PNG [PNG|JPG|BMP]] [PercentScaled = 25[%]] <MapFilename.[map|OP2]>
Examples
OP2MapImager PNG 25 Ashes.map
OP2MapImager 50 C:\Users\UserName\Documents\Ashes.map
OP2MapImager ..\Ashes.map
OP2MapImager PNG Ashes.map bhsurv.map atwmon.map
After doing some thinking and reading, I would like to change the syntax. I think it would be much easier to parse the data if we use the -Short and --Long syntax for parameters. This would allow the end user to re-arrange the order of parameters at will, and simplifies the code to figure out which paramter they are placing in which order. It also seems more standardized.
OP2MapImager [-if = PNG [PNG|JPG|BMP]] [-tw = 8] filename...
USAGE:
-? --Help
-if --ImageFormat File format of the output image
-tw --TileWidth = 8 Integer value indicating the length of pixels in each rendered tile.
32 = full size. 1 = one pixel per tile.
-od --OutputDirectory Set a different directory for outputting map renders. Create the directory if required.
-o --Overwrite = true Indicates if new renders will overwrite files with the same name. Will output an error if unable to filename is already taken.
-v --Version Software version information
My sources for Command Line Input (CLI) Syntax Documentation:
- https://technet.microsoft.com/en-us/library/ee156811.aspx
- https://www.ibm.com/support/knowledgecenter/SSZJPZ_8.5.0/com.ibm.swg.im.iis.common.doc/common/command_conventions.html
- http://docopt.org/
- http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html#tag_12_01
Anyways, I'm open to suggestions on changing any of the syntax.
Formatting Input for Map Scale???
Initially I was thinking of accepting a percentage value for scale. Hooman pointed out this would require handling fractional tile sizes. For example, if the user selected 17% scale, each tile would be 5.44 pixels in length. I would rather not deal with handling fractional pixel placement. So I thought we could just force round the percentage. In this case, you could either force the image to have 5 or 6 pixels per tile. This is not very precise. As a user I would expect to get an image that was exactly 17% in size if that is what they asked for.
So, what if we allow the user to set the pixel width of each tile instead of the scale. In this case you would input 5 and get a render that was 5 pixels in length per tile. There could be 32 discrete scale choices (although the bottom end at 1 or 2 or 3 pixels per tile may be rather useless). This seems cleaner to me than percentages though.
Thoughts?...
@leeor_net,
Thanks! It sounds like you are comfortable with switches (-S / --Scale) so I will look at going with this format.
I tinkered with the code some today. Looks like I was pulling tile set indices incorrectly from MapData.h. I fixed this and managed to get a perfect render! Below is Twin Valleys in scale factor of 8 (8 pixels per tile).
Unfortunately, I get an out of range on vector access for any map that is not 64 x 256 tiles in size. Well I didn't test every size, but the several I did outside this size didn't work. I'm getting a negative tileInfoIndex value which throws the exception when accessing the TileInfo vector.
I'm guessing the function Hooman and I wrote to translate an (x,y) coordinate into an index within the 1D array may be messed up and only working with this size of map??? The code seems to throw the out of range exception at different (x,y) coordinates based on the map size. I don't have the brainpower to verify/troubleshoot bitwise operations right now (and even if I did, I may not have paid enough attention to what Hooman was teaching me to solve on my own...). Also, this doesn't seem to match up with the culprit, since it is the tileInfoIndex and not the cellIndex throwing the error.
size_t MapData::GetCellIndex(unsigned int x, unsigned int y)
{
unsigned int lowerX = x & 0x1F; // ... 0001 1111
unsigned int upperX = x >> 5; // ... 1110 0000
return (upperX * mapHeader.mapTileHeight + y) * 32 + lowerX;
}
Twin Valleys ( Scale Factor of 8 )
(http://forum.outpost2.net/index.php?action=dlattach;topic=5965.0;attach=807;image)
Hooman,
I spent some time learning more about post build events in Visual Studio. Below is the result. Now Visual Studio will check if the code was built in release mode, and if so create a folder called 'OP2MapImager X.X' inside the release folder. Then it populates the new folder with all the files required for the build.
The user will still have to manually change the version number in the code, folder name, and ReadMe before zipping the folder. So, I guess it eliminates some of manual labor.
Thank you for the reply in the other topic on expanding VolDecompress. I haven't had time to implement it yet, but should be next on the list to look at. I would prefer moving away from the WIN API code, but I'm not sure it makes sense for me to just rewrite a smattering of the functions this way and leave the rest with WIN API and COM interoperability.
x86 vs x64 compatibility
I think the WIN API code used in OP2VolDecompress is not compatible with x64 builds. I can only seem to get it to compile in x86.
xcopy /y /d "$(ProjectDir)FreeImage\FreeImage.dll" "$(OutDir)"
if $(ConfigurationName) == Release (
mkdir "$(OutDir)OP2MapImager X.X"
xcopy /y /d "$(ProjectDir)well00*.bmp" "$(OutDir)OP2MapImager X.X"
xcopy /y /d "$(ProjectDir)FreeImage license-gplv3.txt" "$(OutDir)OP2MapImager X.X"
xcopy /y /d "$(ProjectDir)ReadMe.txt" "$(OutDir)OP2MapImager X.X"
xcopy /y /d "$(OutDir)FreeImage.dll" "$(OutDir)OP2MapImager X.X"
xcopy /y /d "$(OutDir)$(TargetFileName)" "$(OutDir)OP2MapImager X.X"
)
For the X.X, perhaps your script can set a shell variable using the "svn info" command. The command on it's own will list a number of values, but you can select one specifically with the --show-item argument. I would recommend experimenting with the "last-changed-revision" value. You could tag each release with the SVN revision number. I believe by setting the path/url to the root of your project, it will include the most recent revision change to that project (ignoring more recent repository changes outside of your project). Of course if you depend on another project, it might be easier to just use "revision" (the global value). Another possibly useful value is "last-changed-date".
info: Display information about a local or remote item.
usage: info [TARGET[@REV]...]
Print information about each TARGET (default: '.').
TARGET may be either a working-copy path or URL. If specified, REV
determines in which revision the target is first looked up.
With --show-item, print only the value of one item of information
about TARGET. One of the following items can be selected:
kind the kind of TARGET
url the URL of TARGET in the repository
relative-url the repository-relative URL
repos-root-url the repository root URL
repos-uuid the repository UUID
revision the revision of TARGET (defaults to BASE
for working copy paths and HEAD for URLs)
last-changed-revision the most recent revision in which TARGET
was changed
last-changed-date the date of the last-changed revision
last-changed-author the author of the last-changed revision
wc-root the root of TARGET's working copy
...
Once you have some kind of version tag, I would also recommend zipping the package as part of the build step.
If the release package preparation is slow, it may be nice to have it separated out. Looks like you're only doing it for release mode, which should be good enough. If later you decide to add testing, which might also be done in release mode, the packaging could be move to a later step. That's why I suggested a separate release package project that depends on the actual project. The actual project would still be the default project for builds, so the release package build would only run when requested, and project dependency rules would ensure the actual build was also up-to-date when it runs. Not sure if this makes sense to do; just wanted to be clear on the idea.
The Win API code does have differences between 32-bit and 64-bit. Mostly though, it uses auto-expanding type names, so they're the appropriate type on each platform. Depending on how things are coded, it can often be source compatible between the two, though not likely to be binary compatible.
You shouldn't need to use the Win API. You can depend on the standard C++ library. It's mostly just file access code that depends on the Win API, which there is reasonably straightforward replacements to in the standard C++ library.
I didn't see an easy way to do binary memory stream in the standard C++ library. It may make sense to still use your own StreamReader classes. A custom FileStreamReader class can just delegate to standard C++ library file access classes. A custom MemorySreamReader class can just be built in raw C++ without any standard library or operating specific code.
The existing code for OP2Editor used the Win API for the FileStreamReader. The MemoryStreamReader was simple C++ code without use of libraries or the Win API. The code was quite simple in both cases.
Another Update,
I spent some time learning more about the command prompt and writing batch programs, which is essentially what Visual Studio C++ Post Build Events are.
There is not a built in zip console command in Windows (strange I thought), so I settled on 7-Zip's console application. 7Zip is a fairly well known, free, and easy to use alternative. I placed 7za.exe in the repository, which means that others will be able to run the post build event without having to fetch the application manually (if they are running windows).
I had to download the console application portion of TortoiseSVN. Unfortunately, it does not default to installing the console application when you install TortoiseSVN. So, if others pull the source code, the post build script will blow up until they do the same if not already installed. Perhaps there is a way to place the executable in folder like I did with 7-zip?
The code below is the full post build script. Basically it
- saves the filename and directory to a local variable
- creates a directory for all release files
- copies all required files to the release directory
- zips the release directory using 7-zip (I'm using the standard zip format, not 7-zip's arguably better format)
- deletes the initially created directory
xcopy /y /d "$(ProjectDir)FreeImage\FreeImage.dll" "$(OutDir)"
REM SET zipName="OP2MapImager Ver1.0" svn info --show-item revision --no-newline
SET zipName="OP2MapImager Ver1.0"
SET directoryName=$(OutDir)%25zipName%25
if $(ConfigurationName) == Release (
mkdir %25directoryName%25
xcopy /y /d "$(ProjectDir)well00*.bmp" %25directoryName%25
xcopy /y /d "$(ProjectDir)FreeImage license-gplv3.txt" %25directoryName%25
xcopy /y /d "$(ProjectDir)ReadMe.txt" %25directoryName%25
xcopy /y /d "$(OutDir)FreeImage.dll" %25directoryName%25
xcopy /y /d "$(OutDir)$(TargetFileName)" %25directoryName%25
7za a -tzip $(OutDir)%25zipName%25.zip %25directoryName%25
REM DEL /q %25directoryName%25\*.*
RMDIR %25directoryName%25 /s /q
)
The only piece I need help on is setting the revision number from SVN. I cannot figure out how to set the contents of a local variable as a concatenation of a string and the output of the SVN console command.
Successfully working SVN console command:
svn info --show-item revision --no-newline
I want it to read something like:
SET zipName="OP2MapImager Ver1.0" svn info --show-item revision --no-newline
@Hooman,
Thanks for the reply in the other thread on vol decompression. Now that I'm done fiddling with the post build event (besides the issue noted), I'll head back over and try to implement the tips you gave.
This may be relevant:
https://stackoverflow.com/questions/17546016/how-can-you-zip-or-unzip-from-the-command-prompt-using-only-windows-built-in-ca (https://stackoverflow.com/questions/17546016/how-can-you-zip-or-unzip-from-the-command-prompt-using-only-windows-built-in-ca)
It seems there is a way to compress files to ZIP using Power Shell with newer versions of Windows. Power Shell can be called from a regular shell or batch file. I noticed an alternate methods towards the end that may work for older versions of Windows.
For the variable, this might help:
https://stackoverflow.com/questions/2323292/windows-batch-assign-output-of-a-program-to-a-variable (https://stackoverflow.com/questions/2323292/windows-batch-assign-output-of-a-program-to-a-variable)
Looks a bit sick, but seems to work.
On Linux, you could just use either of the following:
export variableName="Some fixed string: `command args`"
export variableName="Some fixed string: $(command args)"
I finished the first of 3 books on the Windows command line/batch scripting. I figured out enough to pull in the SVN revision number and append it to the end of the directory name. The author separated his book into 3 volumes about 85 pages each. They sell for $3.00 each on Amazon for an electronic copy. I would have been willing to pay $9.00 for the book all together. https://www.amazon.com/Learn-Command-Line-Batch-Script-ebook/dp/B00V9D3QVM/
New post build script is below. New release workflow at the bottom of the ReadMe is posted below that.
REM Place FreeImage DLL into debug or release directory for running application within Visual Studio.
xcopy /y /d "$(ProjectDir)FreeImage\FreeImage.dll" "$(OutDir)"
REM Pull repository revision number and set into final release directory name.
FOR /f %25%25i IN ('svn info --show-item revision --no-newline') DO SET svnVersion=%25%25i
SET zipNameBegining="OP2MapImager Ver1.0."
SET zipName=%25zipNameBegining%25%25svnVersion%25
SET directoryName=$(OutDir)%25zipName%25
if $(ConfigurationName) == Release (
mkdir %25directoryName%25
xcopy /y /d "$(ProjectDir)well00*.bmp" %25directoryName%25
xcopy /y /d "$(ProjectDir)FreeImage license-gplv3.txt" %25directoryName%25
xcopy /y /d "$(ProjectDir)ReadMe.txt" %25directoryName%25
xcopy /y /d "$(OutDir)FreeImage.dll" %25directoryName%25
xcopy /y /d "$(OutDir)$(TargetFileName)" %25directoryName%25
REM Use 7Zip to zip final version
7za a -tzip $(OutDir)%25zipName%25.zip %25directoryName%25
DEL /q %25directoryName%25\*.*
RMDIR %25directoryName%25 /s /q
)
1. If changing Major/Minor revision number, set new version number at top of OP2MapImager main.cpp AND in post build event batch script.
2. Run SVN Commit and then SVN Update to merge committed changes and properly update revision number of repository.
3. Set Solution Configuration to Release.
4. Set Solution Platform to x86. (x64 is not currently supported due to some WINAPI code within OP2Utility VolDecompress).
5. Compile Code.
6. The following files will automatically be copied into the zipped directory'OP2MapImager 1.0.XXXX' (XXXX is svn revision number):
* OP2MapImager.exe (From Release Directory)
* FreeImage.dll (x32 version)
* Well0000.BMP-Well0012.BMP (Reformated BMPs that a normal image editor may open.)
* ReadMe.txt (this file)
* FreeImage liscense-gplv3.txt
7. Place zip file on the Outpost Universe Website.
I think you might be better served by adding appropriate error handling, rather than flags to avoid special case exceptions.
Perhaps it would be better to add 2 public functions to ResourceManager, one that returns all map files in the directory and the other to return all save files in the directory? These respective functions could then remove wellpallet.map or OP2SGAME10.OP2 before returning their values. Currently ResourceManager allows returning all files with a given extension, which is what is creating the problem. It might not be apparent to the user that they will not get wellpallet.map when trying to get all the files of a given extension if they don't see the flags when scanning the header file.
Another aspect that I wasn't sure about was returning filenames vs StreamReaders. I have almost 100 unique maps in my Outpost 2 copy. It seemed strange to pass out 100 StreamReaders simultaneously, so I elected to pass out filenames if you ask for more than a single file. Maybe hundreds of StreamReaders isn't a big deal though? It could cause problems though if someone in the future is using the ResourceManager to get listings of files in archives for modification I suppose.
So, I could remove the it could look like:
SeekableStreamReader* getResourceStream(const string& filename, bool accessArchives = true);
void getAllFilenamesOfType(vector<string>& filenamesOut, const string& directory, const string& extension, bool accessArchives = true);
void getAllSavedGames(vector<string>& filenamesOut, const string& directory);
void getAllMaps(vector<string>& filenamesOut, const string& directory, bool accessArchives = true);
Currently there is a public function in ResourceManager:
void getAllStreamsOfFileType(vector<SeekableStreamReader*> seekableStreamReadersOut, const string& directory, const string& extension, bool accessArchives = true);
I think it would be best to just remove this and have the user just get filenames and ask for each stream one file at a time. I'm currently not using the function in the MapImager anyways.
In short, if you're not writing a batch file, or a command that will be run by the shell, use forward slash. It will work everywhere
Thanks for fixing. I'll start to incorporate this into the future.
Perhaps a future pair programming session?
I'm interested, but it will be a month or so before it is something I would be able to dedicate a block of time to.
Perhaps it would be better to add 2 public functions to ResourceManager, one that returns all map files in the directory and the other to return all save files in the directory?
That seems a bit too specific for ResourceManager. Maybe.
It might be better to offer a shell glob search function from ResourceManager. Then users can search for "*.map", "*.OP2", or even better "SGAME0?.OP2". The last search pattern would save you the trouble of one of those flags.
As for the "wellpallet.map" file, does it actually crash, or just produce an ugly unexpected map? If it crashes, you need better exception handling. I think the map imager should just produce an error line stating a file couldn't be processed, maybe some details about the error, and then continue on.
On another note, "wellpallet.map" should be "wellpalette.map". And if it's not a real map file, and can't be processed as such, maybe it should be renamed to have a different extension. What does it even do anyway? There is already tile group info in the editor, so I assume it's not just duplicating that.
Another aspect that I wasn't sure about was returning filenames vs StreamReaders. I have almost 100 unique maps in my Outpost 2 copy. It seemed strange to pass out 100 StreamReaders simultaneously, so I elected to pass out filenames if you ask for more than a single file. Maybe hundreds of StreamReaders isn't a big deal though? It could cause problems though if someone in the future is using the ResourceManager to get listings of files in archives for modification I suppose.
Open files is a finite system resource. Don't just return hundreds of open files without a very good reason. It might not even work if the file limit is set too low. I'd say dump that code, as programmers should not be encouraged to do things that way.
This limit doesn't so much apply to the memory streams, or archive file contents, since they all share one actual file, but it still seems like a bad pattern.
I'm thinking you should avoid the out parameters. Not sure if this design is about syntax, or efficiency.
Assuming efficiency, you do want to avoid copying large structs and arrays around, but a vector is really just a small data type that references an array. It should be possible to pass the vector type by value, and still get by reference transfer of the underlying array data.
Relevant: Return Value Optimization (RVO)
In regards to syntax, I think it would be more clear to simply return the vector. The function user could then declare a variable and assign it to the result of the function, rather than having to declare the variable up front to be passed in.
vector<type> localVar = functionReturningVector();
I implemented a function called getAllFilenames that accepts a string that is turned into a regex search for the filename. This gave me fine grained control to avoid the last saved game and anyone in the future to do similar things without bloating the number of functions available. I also deleted the flags for the specific filenames.
I was looking at wellpallet.map, and it isn't being distributed by the base download of the game, so it shouldn't typically be a problem. I'm not sure where/when it got added to my copy of Outpost 2. It does render into an image without throwing an error. I moved the for loop that checks for it and deletes it into the MapImager code so ResourceManager doesn't have to worry about it. Something that didn't really need to be worried about in the first place, but it is taken care of now.
Here are the public members of ResourceManager now:
SeekableStreamReader* getResourceStream(const string& filename, bool accessArchives = true);
void getAllFilenames(vector<string>& filenamesOut, const string& directory, const string& filenameRegexStr, bool accessArcives = true);
void getAllFilenamesOfType(vector<string>& filenamesOut, const string& directory, const string& extension, bool accessArchives = true);
// Searches .vol and .clm archives for file and then extracts it.
// returns true if EITHER the file is extracted OR
// if BOTH overwrite == false AND the file already exists in the directory.
bool extractFile(const string& filename, bool overwrite = false);
// Searches all .vol and .clm files and extracts any file with the given extension.
void extractAllOfFileType(const string& directory, const string& extension, bool overwrite = false);
I'm thinking you should avoid the out parameters. Not sure if this design is about syntax, or efficiency.
I am trying to return a reference to the vector since it doesn't make since to me to copy a class in this instance. I could return a pointer, but then have to explicitly delete the object later. This way I just create it upfront and don't have to delete it.
I read up some on the Named Return Value Optimization. So basically C++ will often compile into returning a reference even though the specification is to return a copy of a class from a function? Kind of messed up???
This limit doesn't so much apply to the memory streams, or archive file contents, since they all share one actual file, but it still seems like a bad pattern.
Agreed, I deleted the function.
When I chose filesystem, I was hoping it would allow for being cross platform. The problem is that it is still in draft form, so they may change the implementation whenever it is actually released. and I didn't know if compilers on other operating systems support it, so I still pushed all the filesystem specific code into XFile.
Using XFile is fine. What I'm saying is that within XFile, you have a lot of "#if defined PLATFORM" style code. A lot of it appears to be preemptive, assuming there will be a problem before it even has a chance. As the calls go to <filesystem>, which itself should be cross platform, you should be able to just assume platform specific differences will have already been taken care of. If they're not, a compile error on that functionality is probably more useful than a generic error message about the platform not being supported. I'd suggest removing the the guards.
I use primarily g++ on Linux, and on occasion clang. I do have <experimental/filesystem> available.
Currently the default folder is wherever OP2MapImager.exe is located. It does make more sense to switc this so ResourceManager uses the folder where you told it to find the archive files as the default directory.
Yes, I think the ResourceManger should store the path it gets passed when finding the VOL and CLM files.
I need a way to distinguish between the directory, relative or absolute, and the REGEX expression that the user wants to use to search for multiple filenames though right?
You can convert relative paths (including regex/glob) to absolute by prepending a root folder. This might be the path to the VOL and CLM files. Paths are specified to the ResourceManager as relative to its own folder, which are made absolute before being passed to the OS. This avoids issues where changes to the current working directory cause unintended side effects.
You can avoid prepending if the path is already absolute. Though whether to accept such paths is another issue. In some cases accessing files outside of some root folder might be a security risk. In such cases, it may make sense to throw an exception rather than process the path.
if (directoryStr == "" || directoryStr == "/" || directoryStr == "\\" || directoryStr == " ")
directory = current_path();
Here you might consider raising an exception for the "/" and "\\" cases. For the other two cases, the if is possibly meaningless, since the default handling for relative paths is to consider them relative to the current working directory (current_path()).
Also, it's concerning you're checking for " " (a single space). This strikes me as papering over a bug or bad design somewhere else in the code.
I use primarily g++ on Linux, and on occasion clang. I do have <experimental/filesystem> available.
Sounds good. I'll take the if Defined statements out then.
You can convert relative paths (including regex/glob) to absolute by prepending a root folder. This might be the path to the VOL and CLM files. Paths are specified to the ResourceManager as relative to its own folder, which are made absolute before being passed to the OS. This avoids issues where changes to the current working directory cause unintended side effects.
You can avoid prepending if the path is already absolute. Though whether to accept such paths is another issue. In some cases accessing files outside of some root folder might be a security risk. In such cases, it may make sense to throw an exception rather than process the path.
I want to ensure the directory does not include any REGEX and allow the filenames to include REGEX. So the user cannot search through multiple directories using REGEX, but they can search for multiple files in a directory. I'm not sure how to definitively separate out relative and or absolute directories from filenames that may include REGEX, so I think it is best to provide the directory and the REGEX defined filenames as separate variables.
I have no problems with the OP2MapImager searching anywhere on a user's computer that they specify for maps, archive files, and well files through either relative or absolute directories. I don't have any problem allowing them to save the renders anywhere they want. If this is a concern though, then I guess we could look into limiting to only saving files in My Documents or something?
if (directoryStr == "" || directoryStr == "/" || directoryStr == "\\" || directoryStr == " ")
directory = current_path();
Here you might consider raising an exception for the "/" and "\\" cases. For the other two cases, the if is possibly meaningless, since the default handling for relative paths is to consider them relative to the current working directory (current_path()).
Also, it's concerning you're checking for " " (a single space). This strikes me as papering over a bug or bad design somewhere else in the code.
I need a way to receive from the user that they are attempting to search the current directory for maps. If they pass
OP2MapImager /
or
OP2MapImager \
it will image all maps in the current directory. I'm not really experienced with the command prompt. If there is a better way to handle, I'd be happy to know about it.
I did just check, and you cannot pass an empty string through a command prompt argument, so an empty string or a single space aren't really an option. I wasn't sure when first writing so I threw them both in.
All of XFile is pretty unpolished right now. I was learning how to use the filesystem library code when writing it and didn't spend a lot of time cleaning it up. Since filesystem should work on both Linux and Windows, and you seem comfortable using it even in draft form, it would be good to spend some time cleaning it up.
Update. I pushed some changes to the code base to try and do some cleaning.
I want to simplify the function getFilesFromDirectory. However, if directory_iterator is used with a path that is empty (""), then it will not iterate over the current directory. I have to set the directory via current_path in order to make it work. That is causing the weird check the path against the directory in the first 3 lines.
Does it make sense to allow iterating over the directory if a path is passed into the function that contains a filename? If no, then I could simplify by just throwing an error if a filename exists at the end of the path.
I can move the line to check if the path string is a forward/backwards slash to the mapImager main.cpp file. Then it would be an implementation detail for how the user tells OP2MapImager how to iterate over the current directory. This might make more sense than placing it in XFile. This would apply to both versions of getFilesFromDirectory.
vector<string> XFile::getFilesFromDirectory(const string& pathStr, const string& fileType)
{
path p(pathStr);
path directory(p.remove_filename());
if (p == directory)
directory = current_path();
if (pathStr == "/" || pathStr == "\\")
directory = current_path();
vector<string> filenames;
for (auto& directoryEntry : directory_iterator(directory))
{
if (directoryEntry.path().extension() == fileType )
filenames.push_back(directoryEntry.path().string());
}
return filenames;
}
I want to ensure the directory does not include any REGEX and allow the filenames to include REGEX. So the user cannot search through multiple directories using REGEX, but they can search for multiple files in a directory.
Why? I don't see a reason for this restriction.
I'm not sure how to definitively separate out relative and or absolute directories from filenames that may include REGEX, so I think it is best to provide the directory and the REGEX defined filenames as separate variables.
Absolute/relative is an orthogonal concern to fixed/glob/regex.
A relative path is not complete. It can not be used to access files or folders. Either you must prepend a root path, or the OS will prepend the current working directory as the root path, and from there it will match using the absolute path.
You need a fixed path to open a file or folder. If instead you have a glob/regex, it must be used to search for a match. Any matches then become the fixed path to operate on, usually one at a time in a loop.
It makes perfect sense to use a glob or regex or match multiple folders, even if the filename portion ends up being fixed. Ex: "**/makefile" to find any "makefile" in any of the subfolders rooted at the current working directory. The "**" part will match zero or more folders. It's not found in every environment. It's common in the Ruby and Javascript worlds, but doesn't work in the command shell.
I have no problems with the OP2MapImager searching anywhere on a user's computer that they specify for maps, archive files, and well files through either relative or absolute directories. I don't have any problem allowing them to save the renders anywhere they want. If this is a concern though, then I guess we could look into limiting to only saving files in My Documents or something?
Don't worry about it for your map imager, but it may be a concern for a more general utility library. If such a component is used in a game that support multiplayer, and the other computers can influence which files are accessed, that can be a problem. Consider a game that support resource downloading, such as map files. Maybe it takes a map filename such as "custom.map", or "campaign/01.map", which by default will be in the game's "map/" folder. If instead, the path is "/etc/passwd", or (some arbitrary number of) "../../etc/passwd", or "symLinkToRoot/etc/passwd", then you could potentially trick the client into uploading its (Linux) password file. A contrived example here, since the host distributing the map is usually the one choosing the map. You could extend the idea though to the resources accessed by the map. Maybe the map references "/etc/passwd", so when you send the map to other players, their computers may try to access "/etc/passwd", which might get loaded into the game and somehow leaked back to the host.
You can defeat the absolute path attack by unconditionally adding a non-empty prefix. You can defeat the ".." attack by expanding the path to a canonical form, and ensuring the prefix still matches and there are no ".." left (or hack it, and just disallow any ".."). You might also want to turn off following symlinks in case a name links to a file or folder in a parent directory. This is sometimes handled by expanding to a canonical path, which may translate symlinks to their targets.
I need a way to receive from the user that they are attempting to search the current directory for maps.
You could check the count of the number of arguments, and if absent set a default path of "" for the current folder. Failing that, the user could specify:
Or even:
I want to simplify the function getFilesFromDirectory. However, if directory_iterator is used with a path that is empty (""), then it will not iterate over the current directory.
That is indeed unfortunate, and very unexpected. This may warrant further investigation.
if (pathStr == "/" || pathStr == "\\")
directory = current_path();
This is very unusual. Such a path means absolute, not relative, and so it's unexpected that it would end up being based off the current working directory. I may have to take a closer look at this code.
Have you exported a makefile for the relevant projects?
I played more with filesystem. filesystem::is_directory returns false upon passing a blank string (""). So, based on this, I'm thinking that filesystem is purposely not iterating over the current directory when a blank string is provided as the path. When I use a dot (.) or ./, everything works properly though. Reading the specification document for filesystem would possibly answer the question definitively, but I'm happy just using ./.
I had forgotten that ./ is used for the current relative directory. :-[ I reworked the OP2MapImager to take either . or ./ for indicating the current directory. Not allowing "" for a relative directory will keep someone from iterating over 100 maps by accident just because they left off the filename by accident. It also plays nicer with filesystem.
It didn't occur to me you could actually pass a blank string or a single space as an argument by using quotes. Experimenting with it a little teased out a bug when you pass in an empty string. I've fixed this bug on the development version.
Some point in the near future I'll release a new version of the OP2MapImager with both of these issues fixed.
So, that simplifies getFilesFromDirectory to the code below (minus the 3 commented out lines in the middle)
// For current relative directory use . or ./. An empty string is not considered a valid directory.
vector<string> XFile::getFilesFromDirectory(const string& directoryStr, const string& fileType)
{
path directory(path(directoryStr).remove_filename());
// Brett208 4Aug17: creating a path with an empty string will prevent the directory_iterator from finding any files in the current relative path on MSVC.
//if (directoryStr == "")
// directory = path(".");
vector<string> filenames;
for (auto& directoryEntry : directory_iterator(directory))
{
if (directoryEntry.path().extension() == fileType )
filenames.push_back(directoryEntry.path().string());
}
return filenames;
}
Have you exported a makefile for the relevant projects?
No, but we will need to decide how to build the project on Linux at some point. I was considering just creating a completely new solution/project in whatever IDE you prefer on Linux and manually configuring required properties. We will have to load in the FreeImage.dll designed for Linux and create a separate post build event that complies with whatever IDE you want to use since I'm doubting it accepts Windows batch scripts. I've read up enough to know sort of what a makefile is and does, but not enough to use in practice. However it is done, I'm sure to learn a lot in the process.
Oh course first the Windows.h header needs to be teased out of the Archive code.
I was thinking to create a Console Application that finished defining the required functions to manipulate archive files first. This way we can test the results of changes to the archive code as windows.h is removed. Then perhaps circle around to the Linux build of everything.
MSVC has an option to export a Makefile. It might be easier to use that than generate a new one by hand. Though the NAS2D makefile is pretty generic, so maybe we could just reuse that.
Teasing out the "#include <windows.h>" will be a bit of a pain. It might be easier if we used regular stream code rather than memory mapped files. Perhaps converting that should be the first step.
I played around with the <experimental/filesystem> code today on Linux. Indeed, the directory_iterator does not like empty path strings. It also doesn't like shell globs, and it doesn't like filenames either. It must be a non-empty folder name.
The filesystem library does handle converting paths between various path/string/char* formats. That could allow some slight simplification to the code. For instance, directory_iterator can take any of those 3 types, which means you don't need to convert a string to a path before calling it. You also don't need to explicitly convert a path to a string before adding it to the vector.
After playing around a bit, I built out my example to be similar to your code, minus the filename extension checks. Here's what it looked like:
std::vector<std::string> getFilesFromDirectory(const std::string& dir)
{
auto pathStr = dir.length() > 0 ? dir : "./";
std::vector<std::string> filenames;
for (auto& entry : fs::directory_iterator(pathStr))
{
filenames.push_back(entry.path());
}
return filenames;
}
Oh, and I'd say just assume the folder name passed is an actual folder, and not a file. Trying to strip a filename from the path and use the containing folder is a bit unexpected and could lead to subtle bugs in library use. I'd say just let the path feed through to the directory_iterator, where it will cause an exception if it refers to a file rather than a folder.
leeor_net,
It looks like a clean library. The goal of using filesystem is that it may eventually become part of the standard library, which means many people would become familiar with it. Of course, since it is experimental right now, maybe using it isn't worth the squeeze.
Hooman,
Are you using an alias of fs for the namespace std::experimental::filesystem? If not, it looks like they named the namespace differently between the two compilers.
for (auto& entry : fs::directory_iterator(pathStr))
Produces
Error (active) E0276 name followed by '::' must be a class or namespace name
I'm getting E0304 from the line of code below. If I change it to entry.path().string(), then it compiles. So it looks like there are some differences between the two compilers implementation.
filenames.push_back(entry.path());
Error (active) E0304 no instance of overloaded function "std::vector<_Ty, _Alloc>::push_back [with _Ty=std::string, _Alloc=std::allocator<std::string>]" matches the argument list
Here is the function rewritten to work in VS17. Since I have a using statement, declaring experimental::filesystem namespace for directory_iterator isn't necessary, but left it to point out the difference.
std::vector<std::string> getFilesFromDirectory(const std::string& dir)
{
auto pathStr = dir.length() > 0 ? dir : "./";
std::vector<std::string> filenames;
for (auto& entry : experimental::filesystem::directory_iterator(pathStr))
{
filenames.push_back(entry.path().string());
}
return filenames;
}
Also, if we want to accept empty strings as meaning the same thing as ./ or ., we should update the following two functions to read this way.
bool XFile::isDirectory(const string& pathStr)
{
if (pathStr.length() == 0)
return true;
return is_directory(pathStr);
}
bool XFile::pathsAreEqual(string pathStr1, string pathStr2)
{
if (pathStr1.length() == 0)
pathStr1 = "./";
if (pathStr2.length() == 0)
pathStr2 = "./";
path p1(pathStr1);
path p2(pathStr2);
return p1 == p2;
}
If this all looks good, I'll commit the changes.
-Brett
Ahh yes, I forgot the namespace line:
#include <experimental/filesystem>
namespace fs = std::experimental::filesystem;
The blank path was a problem for directory_iterator. I don't know if it's an issue for is_directory or path object comparisons. I had no trouble setting a path object to the empty string. I only had problems when I tried to use an empty path object with directory_iterator.
Maybe check what happens when is_directory is passed an empty string. As for comparing path objects, I'm wondering how robust the comparison is. A simple string compare won't be very effective in some cases, though I don't know if it does a string compare or something more involved. What if you compare "." and "./"? What if you compare "./" and "./subFolder/../"? In some cases it may make sense to do a simple string compare, to see if two paths are the same and in the same form. In other cases, you might want to know if two paths refer to the same thing, even if they are given in different forms.
Just committed changes to the repo.
There are now 3 versions of getFilesFromDirectory. The one Hooman just wrote, one using regex for the filename, and one that uses the file extension. The regex and file extension overloads first call the generic getFilesFromDirectory and then remove files that do not meet the regex or file extension requirement. This way we do not have to repeat the line of code forcing a blank string into "./" and the directory iterator in each function.
vector<string> XFile::getFilesFromDirectory(const string& directory, const regex& filenameRegex)
{
vector<string> filenames = getFilesFromDirectory(directory);
for (int i = filenames.size() - 1; i >= 0; i--)
{
if (!regex_search(filenames[i], filenameRegex))
filenames.erase(filenames.begin() + i);
}
return filenames;
}
I have already tested is_directory with a blank string. It will return false, which makes sense in light of the directory_iterator not working with a blank string.
I played some with comparing paths. Using == against filesystem::path basically compares if the underlying string is the same. So == will return "." and "./" as different (false). However, the filesystem::equivalent function will return "." and "./" as being the same (true).
So, I rewrote the function as pathsAreEquivalent. The areas I was using pathsAreEqual in the code base, I was using it to compare if the filename itself was the same. Now in the code base, I just changed it to compare (==) the raw strings or used strcmp in the archive code since it uses char* instead of std::string.
// Will throw an exception if passed only a filename as path1 or path2. Lead filenames with relative or full directory.
bool XFile::pathsAreEquivalent(string path1, string path2)
{
if (path1.length() == 0)
path1 = "./";
if (path2.length() == 0)
path2 = "./";
return equivalent(path(path1), path(path2));
}
Test Data:
string d1 = ".";
string d2 = "./";
string d3 = "./Debug/../";
string d4 = "";
bool equal = XFile::pathsAreEquivalent(d1, d2); // TRUE
equal = XFile::pathsAreEquivalent(d1, d3); // TRUE
equal = XFile::pathsAreEquivalent(d2, d3); // TRUE
equal = XFile::pathsAreEquivalent(d1, d4); // TRUE
There are now 3 versions of getFilesFromDirectory. The one Hooman just wrote, one using regex for the filename, and one that uses the file extension. The regex and file extension overloads first call the generic getFilesFromDirectory and then remove files that do not meet the regex or file extension requirement. This way we do not have to repeat the line of code forcing a blank string into "./" and the directory iterator in each function.
Ahh, I wasn't expecting you to snaffle my function. That's fine. Sounds like you've simplified the code. Though I should point out it could be considerably less efficient to build the full file list, and then filter it to remove items you don't want. That is especially true of a large folder with only a few files of interest.
Seems like a template function could be written that takes some sort of filter function. That could eliminate code duplication, runtime costs of building a list and then filtering, and also dispatch cost for the filter function. A default filter function could be set that doesn't remove any filenames. Not quite sure how it would work though.
return equivalent(path(path1), path(path2));
What is "equivalent"?
I like how you have test code. Is it wrapped in a test suite? I was thinking having a few unit tests could be good for some of these projects.
The case sensitivity thing is more an issue of the file system than the OS. Think how Linux accesses NTFS or FAT partitions, or what happens when Windows accesses a file over a network share that's hosted on Linux. Sometimes there is a bit of support to paper over differences, sometimes not.
The VOL files are essentially their own file system. Whether they are case sensitive or not is up to the design of VOL files. I suppose I could check the Outpost 2 code to see if it uses case insensitive comparisons. Mostly though I'd suggest not bothering. It's not likely to be an issue in practice, and if it is, the user will probably be the best person to handle it.
Did you notice the bug in my current release before or after asking that question? :)
Ehh, no. Being on Linux.... It was just a curious question.
In FreeImage, the Rescale function is:
FIBITMAP* FreeImage_Rescale(FIBITMAP *src, int dst_width, int dst_height, FREE_IMAGE_FILTER filter)
In your code, you're calling it without the last parameter, which defaults to FILTER_CATMULLROM. The available filters are:
FI_ENUM(FREE_IMAGE_FILTER) {
FILTER_BOX = 0, // Box, pulse, Fourier window, 1st order (constant) b-spline
FILTER_BICUBIC = 1, // Mitchell & Netravali's two-param cubic filter
FILTER_BILINEAR = 2, // Bilinear filter
FILTER_BSPLINE = 3, // 4th order (cubic) b-spline
FILTER_CATMULLROM = 4, // Catmull-Rom spline, Overhauser spline
FILTER_LANCZOS3 = 5 // Lanczos3 filter
};
It seems the further down the list you go, the better the quality is supposed to be. It actually depends on a number of factors though, both about the source image, and the scale factor. In particular, if you scale down by more than 1/2, some algorithms can skip over source data entirely, which may be undesirable. Also, some of them may use surrounding pixels in the formula, which may not be desirable due to the tile nature of the source image. You could end up with pixels from an adjacent tile bleeding into a scaled down tile. For these reasons, I think the first option might actually be the most desirable for our application, as it should have a hard cutoff at tile boundaries, and not discard data when scaling significantly, but rather average all the data into the output. Further, the further down the list you go, the more the computational requirements tend to increase. Hence, for both speed and quality, it might might sense to try the first option. I suspect there will be no noticeable difference in either speed or quality, partly for the reasons you've stated, though I thought I'd ask if you've tried already.
We could speed up rendering large batches of low pixels per tile by caching different scales of the tiles so they didn't have to rescale each time another render is taken.
Yes, I was thinking of that. I was actually looking at the code wondering if you might have already done it.
This has some side effects though if you are going to render say both stock tileset maps and Greenworld tileset maps in the same session, since they use the same name for wells.
I had also thought of this too. If you include the VOL file as part of the path, as if it were a folder, it might solve the name issue. Given enough tilesets, it could create a memory issue, though I doubt that will ever become a problem. Could just have the user convert images in smaller batches, or have the rendering discard old cached items if the current map uses a different tileset.
Thanks for the timing info. It seems the large maps take a fairly significant portion of the time. Does that also hold for large 32x32 renders? I suspect there is a fair bit of overhead for each tile draw, beyond just copying the pixels. That may lower efficiency as tile sizes drop.