Topic: Covariance and Contravariance

Hooman
Covariance and Contravariance
« on: May 25, 2018, 05:03:34 PM »
I was pondering a problem with streams and serializers which got me to look up some old topics. There's an article on Wikipedia on Covariance and Contravariance which was quite interesting.

Variance refers to the relative ordering of complex types compared to the relative ordering of their component types. The ordering here is the subtype relation: A >= B means A is a supertype of B. Roughly speaking, covariance preserves the relative ordering, while contravariance reverses it. A complex type might result from forming a pointer to another type, adding a const qualifier, or building a function type that uses the original types as inputs or outputs.

Covariant: 
A >= B  implies  ComplexType<A> >= ComplexType<B>
Contravariant: 
A >= B  implies  ComplexType<A> <= ComplexType<B>



Classic examples involve inheritance of classes and Liskov's substitution principle:
Code: [Select]
class A {};
class B : public A {};

class C {
public:
  virtual A* Func();
};

class D : public C {
public:
  virtual B* Func() override;
};

Class D overrides a method from class C. As such it needs to have the same signature. However, the return type is different. This is known as a covariant return type, and is allowed in C++.

Here, the simple types are A and B, which are relatively ordered due to their inheritance hierarchy (A >= B). The complex types, which use A and B as component types, are the types of C::Func and D::Func. Those types are, respectively, A* () and B* (), or in other words, functions taking no arguments and returning a pointer to a class. Don't get confused by the inheritance hierarchy between C and D; that's a different matter.

A caller of C::Func expects to get back a value of type A*. If the code instead called D::Func, it would get back a value of type B*, which is fine, since a B* is compatible with an A* (i.e. a B is-an A). Here we have A* ()  >=  B* ().
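
A minimal sketch of that substitution. The Name method, the static objects, and the CallThroughBase helper are assumptions added only so the result can be observed; they are not part of the example above.

```cpp
#include <cassert>
#include <cstring>

class A {
public:
  virtual ~A() = default;
  virtual const char* Name() const { return "A"; }  // hypothetical helper
};
class B : public A {
public:
  const char* Name() const override { return "B"; }
};

class C {
public:
  virtual ~C() = default;
  virtual A* Func() { static A a; return &a; }
};

class D : public C {
public:
  // Covariant return type: B* where the base declared A*.
  B* Func() override { static B b; return &b; }
};

// Written against C only; it has no idea that D exists.
const char* CallThroughBase(C& c) {
  A* result = c.Func();  // when c is really a D, a B* arrives here as an A*
  return result->Name();
}

void DemoCovariantReturn() {
  C c;
  D d;
  assert(std::strcmp(CallThroughBase(c), "A") == 0);
  assert(std::strcmp(CallThroughBase(d), "B") == 0);
  B* direct = d.Func();  // callers that know about D get a B* without a cast
  assert(direct != nullptr);
}
```

Calling DemoCovariantReturn() from main exercises both paths: old code keeps working through C&, and new code gets the more specific pointer type.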



Example:
Code: [Select]
class A {};
class B : public A {};

class C {
public:
  virtual void Func(B*);
};

class D : public C {
public:
  virtual void Func(A*) override;  // Error: C++ does not allow this as an override.
};

Unfortunately, C++ doesn't allow the above code. It's not a valid override because the function signature has changed. The function signature includes the argument types, but not the return type. If the override keyword is removed, the code compiles, but D::Func becomes a new, unrelated virtual function that hides C::Func inside D rather than overriding it, resulting in two different virtual functions of the same name. A bit disappointing, but consider for a moment what would happen if it did work. Note the reversed order of B* and A* in this example, where the base class C uses B* (the derived type), and the derived class D uses A* (the base type).
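
A small sketch of what happens once the override keyword is dropped. The int return values (1 and 2) are assumptions added purely so the two functions can be told apart.

```cpp
#include <cassert>

class A {};
class B : public A {};

class C {
public:
  virtual ~C() = default;
  virtual int Func(B*) { return 1; }
};

class D : public C {
public:
  // No override keyword: this compiles, but it declares a second,
  // unrelated virtual function that hides C::Func in D's scope.
  virtual int Func(A*) { return 2; }
};

void DemoHiding() {
  D d;
  B b;
  C& asBase = d;
  assert(asBase.Func(&b) == 1);  // still C::Func(B*); nothing overrode it
  assert(d.Func(&b) == 2);       // D::Func(A*), found by name lookup in D
}
```

So code written against C never reaches the new function, which is exactly what an override was supposed to prevent.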

Code that calls C::Func would expect to be able to pass in a B*. If instead it was changed to call D::Func, it could still pass in a B*, since a B is-an A. Hence everything still works. The new replacement function can take any object of the old type, plus other objects of the more generic base type. Here we have void (A*)  <=  void (B*).



"Be liberal in what you accept, and conservative in what you produce."

It turns out that to preserve the Liskov substitution principle, functions need to be covariant with respect to their output, and contravariant with respect to their input.

Note that programming languages often allow for out parameters, or in/out parameters, rather than just the typical in parameters. Out parameters are indeed outputs, and so a compatible function would need to be covariant with respect to out parameters, while remaining contravariant with respect to in parameters. If a parameter is both in and out, then a compatible function needs to be invariant for that parameter type.

C# allows specifying in or out on parameters. The Interface Definition Language (IDL) for COM also allows such specification. C++, however, does not. Any in/out declaration in IDL would be translated to a comment for that parameter in C++. This lack of specification may be why C++ does not allow covariance/contravariance of function arguments: without a way to specify in/out, the compiler can't reasonably check whether covariance or contravariance would be required for each argument.



If you consider the case of passing a buffer to be filled with data, such as in a Read method, the buffer is an out parameter. Covariance is required for override safety. On the other hand, a buffer passed to a Write method is an in parameter. Contravariance is required for override safety.

Code: [Select]
void Read(/* out */ char* buffer, size_t size);
void Write(/* in */ const char* buffer, size_t size);

On a related note, the C++ standard defines how const and volatile qualifiers (cv-qualifiers) affect covariant return types. Interestingly, it only defines this for class hierarchies, not for built-in data types. With GCC, you get a warning if you try to override a method and change the const qualifier on a primitive return type. It does of course work, extended to the primitive type in the obvious way.


Challenge problem:
If you were to order the following two types (>=), which should be the supertype? (And why?)
char *
const char *

I'll continue next time with the answer, and hopefully get into how this applies to templates, and how things are different there.
« Last Edit: May 25, 2018, 05:10:01 PM by Hooman »

Vagabond
Re: Covariance and Contravariance
« Reply #1 on: May 27, 2018, 03:33:55 AM »
This is interesting. I had never really thought of Covariance and Contravariance, or that they would be treated differently in different programming languages.

I like using the ref and out keywords in C#. It makes for clean documentation of function arguments. Since a class instance passed as a function argument is accessed through a reference in C#, you only have to use them for changing a value type like int. If I am specifically altering a class instance passed as an argument into a function, I will often use the ref keyword, even though it isn't needed, to point out explicitly that the argument will be altered.

I guess in C++ one cannot distinguish between out and ref on an argument besides documenting with a comment. They would both just look like variableType &variableName.

-Brett

Hooman
Re: Covariance and Contravariance
« Reply #2 on: June 30, 2018, 07:13:24 AM »
A quick reply to the challenge:

With regard to classes, a supertype specifies a minimum contract that all subtypes must fulfill. Subclasses are allowed to support additional operations, so long as they support all required operations.

In terms of a const char * used to represent a buffer, the basic operation is being able to read the buffer. A char * also supports reading the buffer, while additionally supporting writing to the buffer.

The answer, then, is that const char * is akin to being a supertype of char *. (Though this terminology isn't used in C++, and C++ only supports covariant return types for pointers and references to class types, not for primitive types.)
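
A minimal sketch of that ordering, using the implicit pointer conversions C++ already has. CountChar is a hypothetical reader introduced for illustration; it only needs the "minimum contract" of read access.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// A char* can be used wherever a const char* is expected...
static_assert(std::is_convertible<char*, const char*>::value,
              "char* -> const char* is implicit");
// ...but not the other way around.
static_assert(!std::is_convertible<const char*, char*>::value,
              "const char* -> char* needs a cast");

// Hypothetical reader: only reads the buffer, so it asks for const char*.
std::size_t CountChar(const char* buffer, std::size_t size, char wanted) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < size; ++i) {
    if (buffer[i] == wanted) ++count;
  }
  return count;
}

void DemoConstSubstitution() {
  char writable[] = "banana";  // decays to char*
  writable[0] = 'B';           // writing is the extra operation char* supports
  // The char* buffer is passed where a const char* is expected.
  assert(CountChar(writable, 6, 'a') == 3);
}
```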



Worth noting is that for a return type to be covariant, according to the C++ standard, the class type in the new return type must be equally or less cv-qualified (const/volatile qualified) than the one in the original return type. That is kind of like saying it must be a sub-type in terms of const-ness.
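
A sketch of the cv-qualification rule for covariant return types, assuming made-up classes: the Get method, the value member, and the static objects are illustration only.

```cpp
#include <cassert>

class A {
public:
  int value = 0;
};
class B : public A {};

class C {
public:
  virtual ~C() = default;
  virtual const A* Get() { static A a; return &a; }
};

class D : public C {
public:
  // Accepted: B derives from A, and B is less cv-qualified than const A.
  B* Get() override { static B b; return &b; }
  // The reverse case (base returns A*, override returns const B*)
  // is rejected by the compiler, because it would add const.
};

void DemoCvCovariance() {
  D d;
  C& asBase = d;
  const A* readOnly = asBase.Get();  // callers of C only ever read
  B* writable = d.Get();             // callers of D may also write
  writable->value = 7;
  assert(readOnly == writable);      // both point at the same object
  assert(readOnly->value == 7);
}
```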



Finally, I wanted to leave a link to this article:
Covariance and Contravariance in C++

In particular, the article discusses covariance and contravariance in terms of C++ templates. It seems C++ templates are invariant, at least by default. With a bit of code (such as converting constructors), that can change.
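
A quick sketch of both situations using standard library types. The classes A and B are assumed as in the earlier examples; std::vector stands in for a template with no converting constructor, and std::shared_ptr for one that adds covariance with a bit of code.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <vector>

class A {
public:
  virtual ~A() = default;
};
class B : public A {};

// The component types are ordered...
static_assert(std::is_convertible<B*, A*>::value, "B* converts to A*");
// ...but a plain template ignores that ordering (invariant).
static_assert(!std::is_convertible<std::vector<B*>, std::vector<A*>>::value,
              "vector<B*> does not convert to vector<A*>");
// shared_ptr has a converting constructor, so it behaves covariantly.
static_assert(std::is_convertible<std::shared_ptr<B>, std::shared_ptr<A>>::value,
              "shared_ptr<B> converts to shared_ptr<A>");
static_assert(!std::is_convertible<std::shared_ptr<A>, std::shared_ptr<B>>::value,
              "but not the other way around");

void DemoTemplateVariance() {
  std::shared_ptr<B> derived = std::make_shared<B>();
  std::shared_ptr<A> base = derived;  // implicit, like B* -> A*
  assert(base.get() == derived.get());
  assert(base.use_count() == 2);      // both handles share ownership
}
```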

The article includes a table of common STL types. Interestingly, it lists std::function as both covariant in the return value, and contravariant in the function arguments. It seems both covariance and contravariance are possible in C++, for templates, with a bit of work.
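
A short sketch of std::function showing both directions at once. MakeB and AcceptsA are hypothetical free functions made up for this example.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

class A {
public:
  virtual ~A() = default;
};
class B : public A {};

B* MakeB() { static B b; return &b; }              // produces the derived type
bool AcceptsA(A* a) { return a != nullptr; }       // accepts the base type

// Covariant in the return value: B*() can stand in for A*().
static_assert(std::is_convertible<std::function<B*()>,
                                  std::function<A*()>>::value, "");
// Contravariant in the argument: bool(A*) can stand in for bool(B*)...
static_assert(std::is_convertible<std::function<bool(A*)>,
                                  std::function<bool(B*)>>::value, "");
// ...but bool(B*) cannot stand in for bool(A*).
static_assert(!std::is_convertible<std::function<bool(B*)>,
                                   std::function<bool(A*)>>::value, "");

void DemoFunctionVariance() {
  std::function<A*()> producer = MakeB;         // returns more than promised
  std::function<bool(B*)> consumer = AcceptsA;  // accepts more than required
  B b;
  assert(producer() == MakeB());
  assert(consumer(&b));
}
```

This is the "liberal in what you accept, conservative in what you produce" rule again, only expressed through a template's converting constructor instead of a virtual override.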