Two issues cloud this topic. The main problem typically experienced in C/C++ is that local variables have a static size, fixed at compile time. You need heap memory for dynamically sized data, and heap memory is managed through pointers. This is an implementation detail. It's possible for a language to implement dynamically sized local variables; it's just not done in standard C++ (C99 did add variable-length arrays, but C++ never adopted them). The other issue that comes up is lifetime. Local variables are tied to the activation record (stack frame) of a function call. If the lifetime of a variable is not tied to the lifetime of a function call, it needs to be stored outside of that function's activation record. In a way this is the real use of pointers, as the issue goes beyond simple implementation details.
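// Dynamic size
void SolveSomeProblem(int size)
{
    int* workingMemory = new int[size]; // Size is only known at run time
    ...
    delete[] workingMemory; // Release the array before returning, or it leaks
}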
struct Struct {
    // Fields ...
};
// Dynamic lifetime
Struct* InitNewStruct()
{
    Struct* s = new Struct;
    // Initialize fields ...
    return s;
}
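To make the two cases concrete, here is a minimal sketch that can be compiled and called. The names Sample, SumOfSquares, MakeSample and DestroySample are made up for illustration, and the field layout is just an assumption. The first function needs the heap only because of size; the second needs it because the object has to outlive the call, which also means somebody else has to delete it later.

```cpp
#include <cassert>

// Hypothetical record type, for illustration only.
struct Sample {
    int id;
    int value;
};

// Dynamic size: the element count is only known at run time, so the
// working array lives on the heap and is released before returning.
int SumOfSquares(int count)
{
    assert(count >= 0);
    int* working = new int[count];
    for (int i = 0; i < count; ++i)
        working[i] = i * i;
    int total = 0;
    for (int i = 0; i < count; ++i)
        total += working[i];
    delete[] working; // Lifetime still ends with the function
    return total;
}

// Dynamic lifetime: the object outlives this call, and the caller owns it.
Sample* MakeSample(int id)
{
    Sample* s = new Sample;
    s->id = id;
    s->value = id * 10;
    return s;
}

// The owner ends the lifetime explicitly, whenever it is done.
void DestroySample(Sample* s)
{
    delete s;
}
```

Note that SumOfSquares never lets the pointer escape, so only the size argument applies there. MakeSample is the case where a local variable could not do the job at all.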
Sirbomber is right about avoiding pointers if possible. Avoid both pointers and dynamic memory allocation if you can, but don't go far out of your way to do so. They solve a specific set of problems, so use them when you need to. Dynamic memory allocation is more costly than allocating space for local variables on the stack. If you ever find yourself allocating heap memory with a hardcoded constant size, stop and think about it for a moment, and make sure the dynamic lifetime argument actually applies.
void DoNotDoThis()
{
    char* buffer = static_cast<char*>(malloc(1024)); // Pointless use of fixed size heap memory, tied to life of function (C++ needs the cast, C does not)
    // Use buffer somehow ...
    free(buffer); // This function is also not exception safe, since this free is skipped if an exception is thrown (memory leak)
}
void BetterWay()
{
    char buffer[1024]; // Local variable, allocated on stack
    // Use buffer somehow ...
}
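The exception safety point can be observed directly. The sketch below is only an illustration: Tracked is a hypothetical helper that counts live instances, and it uses new/delete on a class rather than malloc/free so that cleanup is visible through the destructor. The heap version deliberately leaks one small object when it throws, which is exactly the problem being demonstrated.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper that counts live instances so cleanup is observable.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Heap object tied to the function: the delete is skipped on failure.
void HeapVersion(bool fail)
{
    Tracked* t = new Tracked;
    if (fail)
        throw std::runtime_error("failed"); // Jumps past the delete below
    delete t;
}

// Local object: destroyed during stack unwinding, even on failure.
void StackVersion(bool fail)
{
    Tracked t;
    if (fail)
        throw std::runtime_error("failed");
}

// Returns how many Tracked objects are still alive after a failed call.
int LeakedAfterFailure(void (*fn)(bool))
{
    assert(fn != nullptr);
    int before = Tracked::live;
    try {
        fn(true);
    } catch (const std::runtime_error&) {
        // Swallow the error; only the leftover object count matters here.
    }
    return Tracked::live - before;
}
```

Calling LeakedAfterFailure(StackVersion) should report nothing left behind, while LeakedAfterFailure(HeapVersion) should report one object that was never destroyed.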
I recommend preferring reference types over pointer types when possible. It's tempting to think of references as syntactic sugar for pointers, and that isn't far off. The main differences between the two, aside from syntax, are nullability and re-assignability. A pointer can be null, and can have its value changed to point to different objects. A reference must be bound to an object (expected to be non-null) when it comes into scope, and cannot be re-assigned. This simplifies reasoning about references, and means you don't need null checks in code that uses them. Both can be unsafe if abused, pointing to invalid memory or to an object after it has been destructed.
Obj obj; // Local variable
Obj *obj1 = 0; // Ok
//Obj &obj2 = 0; // Error (reference can't be assigned null)
Obj &obj3 = obj; // Ok (pointing to existing constructed object)
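Obj &obj4 = *obj1; // Compiles, but undefined behavior (obj1 is still null here, so this indirectly binds the reference to null)
obj1 = &obj; // Ok (re-assign to existing constructed object)
//obj4 = &obj; // Error (a reference can't be re-assigned to another object; note "obj4 = obj" would compile, but copies into the referenced object)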
obj1->a; // Pointer field access
obj3.a; // Reference field access
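Here is a small sketch of what those two differences mean in practice. The function names are hypothetical, and I'm assuming C++11 for nullptr. The pointer version has to decide what to do about null; the reference version doesn't. The last function shows that assigning to a reference writes through to the object it is bound to, while assigning to a pointer re-points it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Pointer parameter: null is a legal argument, so it has to be handled.
std::size_t LengthOrZero(const std::string* text)
{
    if (text == nullptr)
        return 0;
    return text->size();
}

// Reference parameter: the caller must supply an object, so no null check.
std::size_t Length(const std::string& text)
{
    return text.size();
}

// Assignment through a reference versus re-pointing a pointer.
int RebindDemo()
{
    int a = 1;
    int b = 2;

    int& ref = a;       // ref is bound to a for its whole lifetime
    ref = b;            // Copies b's value into a; ref still refers to a
    assert(&ref == &a);

    int* ptr = &a;
    ptr = &b;           // ptr now points at b; a is untouched
    *ptr = 5;           // Writes to b

    return a * 10 + b;  // a is 2 and b is 5 at this point
}
```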
Passing large objects to functions should be done by reference (or pointer) whenever possible, for efficiency reasons. If you want the function to modify your object, then you also need to use a reference or a pointer for correctness reasons. If you don't want the function to change your object, you can pass the object as a reference to a const, or a pointer to a const. This of course restricts the function to only reading fields and calling const methods on the object. If the function needs to modify the object passed in, but not have the changes reflected in the caller, a copy needs to be made, possibly by making use of pass-by-value.
void f(SomeLargeClass obj); // By-value, copies object at point of call
void g(SomeLargeClass &obj); // By-reference, reference syntax
void h(SomeLargeClass *obj); // By-reference, pointer syntax
void SomeFunction()
{
    SomeLargeClass obj;
    f(obj); // By-value, object is copied
    g(obj); // By-reference, but syntax is the same as by-value
    h(&obj); // By-reference, clearly indicated at point of call
}
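The effect on the caller's object is easy to check with a tiny stand-in type. Counter and the four functions below are hypothetical, just enough to show which calls change the caller's copy and which don't.

```cpp
#include <cassert>

// Hypothetical stand-in for SomeLargeClass.
struct Counter {
    int count = 0;
    int Get() const { return count; }   // const method: usable through const&
    void Increment() { ++count; }       // non-const method: modifies the object
};

// By-value: works on a private copy, the caller's object is unchanged.
int IncrementCopy(Counter c)
{
    c.Increment();
    return c.Get();
}

// By-reference: modifies the caller's object.
void IncrementRef(Counter& c)
{
    c.Increment();
}

// By-pointer: modifies the caller's object, but null must be ruled out.
void IncrementPtr(Counter* c)
{
    assert(c != nullptr);
    c->Increment();
}

// Reference to const: no copy is made, and c.Increment() would not compile here.
int Read(const Counter& c)
{
    return c.Get();
}
```

After IncrementCopy the caller's count is unchanged, after IncrementRef and IncrementPtr it has gone up by one each, and Read can only look.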
A benefit of references is that a const reference can bind to a temporary value that might not have a memory address. Consider passing the result of "5 + 5" to a function. This works if the parameter is declared as "const int&", but not if it's a pointer type, and not if it's a non-const reference.
void f(const int& value);
...
f(5 + 5); // Ok
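For comparison, here is a sketch with both parameter styles side by side. The function names are made up. The pointer version needs a named variable to take the address of, while the const reference version accepts the expression directly.

```cpp
#include <cassert>

// Reference to const: can bind to a temporary such as the result of 5 + 5.
int DoubleByConstRef(const int& value)
{
    return value * 2;
}

// Pointer to const: needs the address of an actual object.
int DoubleByPointer(const int* value)
{
    assert(value != nullptr);
    return *value * 2;
}

int TemporaryDemo()
{
    int viaRef = DoubleByConstRef(5 + 5);   // Ok, temporary binds to const int&

    int sum = 5 + 5;                        // Pointer needs a named object first
    int viaPtr = DoubleByPointer(&sum);
    // DoubleByPointer(&(5 + 5));           // Error: can't take the address of a temporary

    return viaRef + viaPtr;                 // 20 + 20
}
```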
Pointers to void essentially disable type checking. It makes sense for something like malloc to return a void*, and similarly for other functions that deal with generic memory allocation without regard to the type. In general, avoid using void* if possible. Why disable type checking without a reason? Similarly, why cast without a reason? Type casts override type checking (and also silence warnings), so avoid them too when possible. In C, it's better for something like malloc to return a void* than to have every point of call do a cast to the appropriate type (C++ is stricter here, and does require an explicit cast from void*). It's also worth pointing out that C++ broke casting down into several distinct kinds for added safety and checking, although the syntax is a bit more verbose. If curious, look into static_cast, dynamic_cast, const_cast, and reinterpret_cast.
S1* s1 = malloc(sizeof(S1)); // No cast needed in C (C++ requires one), type is now S1*
S2* s2 = malloc(sizeof(S2)); // No cast needed in C (C++ requires one), type is now S2*
S3* s3 = (S3*)s2; // Cast type S2* to S3* (potentially dangerous)
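As a rough illustration of two of the C++ casts, here is a sketch with a hypothetical Shape hierarchy. static_cast covers the conversion from void* that C++ requires for malloc, and dynamic_cast is a checked downcast that gives you a null pointer instead of silently reinterpreting the wrong type, which is what a C-style cast would do.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical class hierarchy, for illustration only.
struct Shape {
    virtual ~Shape() {}   // A virtual function is required for dynamic_cast
};
struct Circle : Shape {
    int radius = 3;
};
struct Square : Shape {
    int side = 4;
};

// static_cast: explicit, compile-time checked conversion from void*.
// The caller releases the memory with std::free.
int* AllocateInts(std::size_t count)
{
    assert(count > 0);
    return static_cast<int*>(std::malloc(count * sizeof(int)));
}

// dynamic_cast: run-time checked downcast, null if shape is not a Circle.
int RadiusOrMinusOne(Shape* shape)
{
    Circle* circle = dynamic_cast<Circle*>(shape);
    return circle ? circle->radius : -1;
}
```

Passing a Square to RadiusOrMinusOne yields -1 rather than garbage, which is the added checking the more verbose syntax buys you.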
Note that C++ added type-safe memory allocators, which you should use. C++ code should use new/delete (single object) or new[]/delete[] (array) rather than malloc/free, and each allocation must be released with its matching form. Also, as an extension of the guideline given earlier, use new[]/delete[] when you need dynamic memory for an array whose size is only known at run time. Use new/delete when you need a single object with a dynamic lifetime that isn't tied to the scope of the function that creates the object.
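Part of what "type safe" means here is that new gives you a pointer of the right type with the constructor already run, and delete runs the destructor, neither of which malloc/free do. The sketch below uses a hypothetical Widget type with counters so this can be checked, and pairs each form of new with its matching delete.

```cpp
#include <cassert>

// Hypothetical type that counts constructor and destructor calls.
struct Widget {
    static int constructed;
    static int destroyed;
    int value;
    Widget() : value(42) { ++constructed; }
    ~Widget() { ++destroyed; }
};
int Widget::constructed = 0;
int Widget::destroyed = 0;

// Single object: new pairs with delete.
int SingleObjectValue()
{
    Widget* w = new Widget;   // Typed as Widget*, constructor has run
    int v = w->value;
    delete w;                 // Destructor runs, then memory is released
    return v;
}

// Array with a run-time size: new[] pairs with delete[].
void ArrayRoundTrip(int count)
{
    assert(count >= 0);
    Widget* ws = new Widget[count];   // One constructor call per element
    delete[] ws;                      // One destructor call per element
}
```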