I'm looking for a general rule of thumb on when it is or is not appropriate to assume RVO kicks in when returning a non-trivial object (such as a class or container of some sort).
For example, I wrote a function to instantiate an archive and return a pointer to the new ArchiveFile class as listed below:
ArchiveFile* archive = ConsoleHelper::openArchive(archiveFilename);
ArchiveFile* ConsoleHelper::openArchive(const string& archivePath)
{
if (XFile::extensionMatches(archivePath, "VOL"))
return new Archives::VolFile(archivePath.c_str());
else
return new Archives::ClmFile(archivePath.c_str());
}
However, this would be cleaner to work with if I rewrote it like the following:
ArchiveFile archive = ConsoleHelper::openArchive(archiveFilename);
ArchiveFile ConsoleHelper::openArchive(const string& archivePath)
{
if (XFile::extensionMatches(archivePath, "VOL"))
return Archives::VolFile(archivePath.c_str());
else
return Archives::ClmFile(archivePath.c_str());
}
I like this because I don't have to worry about dealing with the pointer or destroying the ArchiveFile later. Can I expect the compiler to optimize this by returning a reference to the Archive instead of a new copy?
-Brett
It's somewhat compiler dependent but you can generally expect a decent optimizing compiler to find a better way to do this besides constructing an object, copying said object, then destroying it. However, since you're passing by value to a base class, you run the risk of object slicing (https://en.wikipedia.org/wiki/Object_slicing) (partial copy of an object).
To avoid having to worry about memory management by returning pointers (and prevent object slicing), you can use a smart pointer. Something along these lines:
typedef std::shared_ptr<Archive> ArchivePtr;
ArchivePtr ConsoleHelper::openArchive(const string& archivePath)
{
if (XFile::extensionMatches(archivePath, "VOL"))
{
return std::make_shared<Archives::VolFile(archivePath.c_str())>();
}
else
{
return std::make_shared<Archives::ClmFile(archivePath.c_str())>();
}
}
By using smart pointers you can eliminate the need for memory handling and have a clean way of creating derived types without having to use raw pointers.
SIDE NOTE 1: You'll notice that I included the braces -- I would recommend you adopt this as a coding best practice to avoid potential mistakes that are easy to overlook (https://en.wikipedia.org/wiki/Unreachable_code#goto_fail_bug). If it appears too verbos you can condense it as such:
if (XFile::extensionMatches(archivePath, "VOL")) { return std::make_shared<Archives::VolFile(archivePath.c_str())>(); }
Though I tend to reserve this for very short statements:
if(someCondition){ return; }
SIDE NOTE 2: Determining file type by extension is dubious. It probably works in most cases but it lends itself to people doing stupid things like changing extensions. A better way to do this would be to check for some sort of magic value (like a header string). I'm not sure if vol/clm have such identifiers so this may not be possible -- in which case you have to assume that the user isn't an idiot and if they are they deserve whatever explosion ensues though I would add some sort of a try/catch block in an effort to fail gracefully vs. simply crashing.
It's somewhat compiler dependent but you can generally expect a decent optimizing compiler to find a better way to do this besides constructing an object, copying said object, then destroying it. However, since you're passing by value to a base class, you run the risk of object slicing (partial copy of an object).
To avoid having to worry about memory management by returning pointers (and prevent object slicing), you can use a smart pointer.
I started reading up on object slicing after your post. The concept is making my head hurt :|. I bought a book on more advanced C++ than the beginner book I had earlier but haven't made it to the chapter on smart pointers yet. I'll try and read up on them some more this weekend.
Thanks for the tip on using them.
SIDE NOTE 1: You'll notice that I included the braces -- I would recommend you adopt this as a coding best practice to avoid potential mistakes that are easy to overlook.
if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
goto fail;
goto fail;
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
goto fail;
Spotting the mistake in the code above seems very easy visually to me. I'd say removing the braces reduces vertical used space and does not make it any more difficult to read. Since people have reported bugs like this, I suppose it happens in real life. Although I cannot think of a single time that I have expanded an if statement to include 2 lines of code and not thought about the fact that a set of braces needs to be added.
I used to always include the braces, but after a long time, I found it just easier to read and more succinct to not use them for single lines of code after an if statement.
Perhaps not having the braces makes things more difficult when merging conflicting changes using a repository with other programmers, but I don't have any real examples or experience here to pull from?
if(someCondition){ return; }
If would typically write the above statement as
if(someCondition)
return;
I find including content that is typically spread across multiple lines of code less readable when condensed to one line.
SIDE NOTE 2: Determining file type by extension is dubious. It probably works in most cases but it lends itself to people doing stupid things like changing extensions. A better way to do this would be to check for some sort of magic value (like a header string).
There are magic values. vol files all start with "VOL" and CLM files all start with "OP2 Clump File Version 1.0".
I could look at designing some sort of function to stream into a string the first X bytes of the file into. I'm not sure that I want to support this feature though. Outpost 2 is unable to find and read CLM or VOL files if they do not contain their extensions. Either way you are right that I'm missing some sort of error exception. See below for a correction.
ArchiveFile* ConsoleHelper::openArchive(const string& archivePath)
{
if (XFile::extensionMatches(archivePath, "VOL"))
return new Archives::VolFile(archivePath.c_str());
if (XFile::extensionMatches(archivePath, "CLM"))
return new Archives::ClmFile(archivePath.c_str());
throw invalid_argument("Provided filename is not an archive file (.VOL/.CLM)");
}
Leeor_net is right about object slicing. You'll want to return the value by pointer or reference, rather than by value. If you're going to use smart pointers, I think unique_ptr might be appropriate here. Admittedly though, I have little experience so far with using the smart pointer classes.
If your main objective is to use the dot syntax (archive.method), instead of the arrow syntax (archive->method), using a reference is sufficient. You can convert at multiple locations, such as the new statement, the return statement, or at the point of call. Remember that references use value like notation, even though they are pointers under the hood:
Archive* ptr = new VolFile(fileName);
Archive& ref = *ptr; // You could inline this above
If your main object is to get rid of the delete, you're kind of out of luck with the current design. You have a function that returns a pointer to an object of either type. This has to be done by base class pointer. Since the object is returned by pointer/reference form the function, it's can't be a local variable stored on the stack. That means it's either a heap variable (new/delete), or a global (generally bad practice to use). At best you can hide the new/delete by using smart pointers.
A topic of interest for smart pointers is Resource Acquisition Is Initialization (RAII) (https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization).
To avoid the new/delete, and global variables, you'd need a code structure where some function creates the appropriate object on the stack as a local variable, which can pass that object into other methods for future processing, but never returns it. In that way, the lifetime of the object never escapes the function that created it. This isn't always appropriate.
I have no qualms about checking file extensions in your code to guess the archive type. At any rate, the archive code already checks the file headers after trying to open the files. The code will throw an exception if it's opened by an archive class that doesn't match the file contents. If you wanted, you could use try/catch blocks to try opening the file as both types, and going with the first that succeeds, assuming one of them does. That would make it independent of the extension, though the code is a bit uglier, and this is supporting a use case people should really be avoiding, namely badly named files with misleading extensions.
A lot of languages have keyword delimited blocks, such as being/end:
if someBooleanCondition then begin
blockOfCode
end
If you're used to that format, than the C++ equivalent makes a lot of sense:
if (someBooleanCondition) {
blockOfCode;
}
Some people loathe this format. I used to. Though I began to try it after programming in languages that delimit blocks with keywords. It makes a big difference in the amount of code you have to scroll through. It takes one less line of vertical space, so it's more compact than having braces at matching indentation levels. It also passes all the lint validations for checkers that ensure "if" statements always having matching braces. The linters check this because bugs like the one leeor_net posted are surprisingly common. I too used to be of the opinion that it's quite visually obvious, though it's still possible to get wrong during refactoring due to competing demands for attention when making complicated changes involving many lines of code. This type of bug is not something that is generally written as is. It gets introduced during code changes, since without the braces, the structure of the code is very dependent on how many lines of code you have in a block.
Just got a unique_ptr working in the code. I had to format it as below with the curly brackets instead of the format Leeor had recommended. Not sure why. I ran out of time to research the subject today. I had tried using make_unique to create the pointer but it didn't like this.
unique_ptr<ArchiveFile> ConsoleHelper::openArchiveUniquePtr(const string& archivePath)
{
if (XFile::extensionMatches(archivePath, "VOL"))
return unique_ptr<VolFile>{new VolFile(archivePath.c_str())};
if (XFile::extensionMatches(archivePath, "CLM"))
return unique_ptr<ClmFile>{new ClmFile(archivePath.c_str())};
throw invalid_argument("Provided filename is not an archive file (.VOL/.CLM)");
}
-Brett
These might be an interesting and relevant read:
https://stackoverflow.com/questions/37514509/advantages-of-using-make-unique-over-new-operator (https://stackoverflow.com/questions/37514509/advantages-of-using-make-unique-over-new-operator)
https://stackoverflow.com/questions/22571202/differences-between-stdmake-unique-and-stdunique-ptr (https://stackoverflow.com/questions/22571202/differences-between-stdmake-unique-and-stdunique-ptr)
I think I see the difficulty. You need to specify the type (VolFile) to call the proper constructor, but you're returning a type based on the base class (ArchiveFile). The question is, are unique_ptr's covariant if their template parameters are? I'm guessing no, even though you'd want them to be here. I don't think there is an easy type safe way for the compiler to assume this, particularly since the compiler wouldn't have special info to know the smart pointers are really pointer types. This is all just me guessing here. I assume you tried like this?
return make_unique<VolFile>(archivePath.c_str());
unique_ptr<ArchiveFile> ConsoleHelper::openArchiveUniquePtr(const string& archivePath)
{
if (XFile::extensionMatches(archivePath, "VOL"))
return unique_ptr<VolFile>{new VolFile(archivePath.c_str())};
if (XFile::extensionMatches(archivePath, "CLM"))
return unique_ptr<ClmFile>{new ClmFile(archivePath.c_str())};
throw invalid_argument("Provided filename is not an archive file (.VOL/.CLM)");
}
In C++11 this is the equivalent of:
unique_ptr<VolFile> volPtr = new VolFile(archivePath.c_str());
return volPtr;
See this article (http://www.informit.com/articles/article.aspx?p=1852519) about C++11 initialization forms;
Hooman's suggestion of using make_unique should be correct.
It is working now. Thanks for the tips. I wasn't understanding that you simply feed make_unique the values required for the constructor. I was trying to feed make_unique a new object instead. The right syntax felt weird.
Incorrect
return make_unique<VolFile>(VolFile(archivePath.c_str()));
Correct
return make_unique<VolFile>(archivePath.c_str());
I don't have time right now, but hopefully sometime soon I'll have enough time to clean up all the calls to the function to the unique pointer and push to the repo.
-Brett
Although unique_ptr is a C++11 feature, the "make_unique" function is a C++14 feature. It got left out of the C++11 standard, which seems to have been an oversight. To get code to compile with g++, you'll need to pass the appropriate flag:
I didn't realize make_unique is C++14. I use VS2017 with default settings, which I think leave C++14 features in the code.
it should be perfectly fine to construct a unique_ptr of the derived type, and then return it as a unique_ptr to the base class. Everything will just work automatically.
Good to know. This is working from my couple of tests without issue in practice as well.
I had one more related question.
So, now that I have decided to return a unique_ptr to the ArchiveFile, I will want to pass the unique_ptr to a couple of subclasses to perform work with it.
for example
unique_ptr<ArchiveFile> archive = ConsoleHelper::openArchive(archiveFilename);
checkFilesAvailableToRemove(*archive, filesToRemove, consoleArgs.consoleSettings.quiet);
Would it be preferred to pass the ArchiveFile as a reference to a private class function as opposed to passing the pointer itself? I don't think in this case the private function actually cares at all the location/status of ArchiveFile as long as it can consume it? It would be up to the function calling checkFilesAvailableToRemove to ensure it is passing a valid ArchiveFile...
So the declaration for checkFilesAvailableToRemove would read as:
void checkFilesAvailableToRemove(ArchiveFile& archive, const vector<string>& filesToRemove, bool quiet);
-Brett