Home » c# » Problem understanding C# type inference as described in the language specification

Problem understanding C# type inference as described in the language specification

Posted by: admin November 29, 2017 Leave a comment

Questions:

The C# language specification describes type inference in Section §7.5.2. There is a detail in it that I don’t understand. Consider the following case:

// declaration
void Method<T>(T obj, Func<string, T> func);

// call
Method("obj", s => (object) s);

Both the Microsoft and Mono C# compilers correctly infer T = object, but my understanding of the algorithm in the specification would yield T = string and then fail. Here is how I understand it:

The first phase

  • If Ei is an anonymous function, an explicit parameter type inference (§7.5.2.7) is made from Ei to Ti

    ⇒ has no effect, because the lambda expression has no explicit parameter types. Right?

  • Otherwise, if Ei has a type U and xi is a value parameter then a lower-bound inference is made from U to Ti.

    ⇒ the first parameter is of static type string, so this adds string to the lower bounds for T, right?

The second phase

  • All unfixed type variables Xi which do not depend on (§7.5.2.5) any Xj are fixed (§7.5.2.10).

    T is unfixed; T doesn’t depend on anything… so T should be fixed, right?

§7.5.2.11 Fixing

  • The set of candidate types Uj starts out as the set of all types in the set of bounds for Xi.

    ⇒ { string (lower bound) }

  • We then examine each bound for Xi in turn: […] For each lower bound U of Xi all types Uj to which there is not an implicit conversion from U are removed from the candidate set. […]

    ⇒ doesn’t remove anything from the candidate set, right?

  • If among the remaining candidate types Uj there is a unique type V from which there is an implicit conversion to all the other candidate types, then Xi is fixed to V.

    ⇒ Since there is only one candidate type, this is vacuously true, so Xi is fixed to string. Right?


So where am I going wrong?

Answers:

UPDATE: My initial investigation on the bus this morning was incomplete and wrong. The text of the first phase specification is correct. The implementation is correct.

The spec is wrong in that it gets the order of events wrong in the second phase. We should be specifying that we make output type inferences before we fix the non-dependent parameters.

Man, this stuff is complicated. I have rewritten this section of the spec more times than I can remember.

I have seen this problem before, and I distinctly recall making revisions such that the incorrect term “type variable” was replaced everywhere with “type parameter”. (Type parameters are not storage locations whose contents can vary, so it makes no sense to call them variables.) I think at the same time I noted that the ordering was wrong. Probably what happened was we accidentally shipped an older version of the spec on the web. Many apologies.

I will work with Mads to get the spec updated to match the implementation. I think the correct wording of the second phase should go something like this:

  • If no unfixed type parameters exist then type inference succeeds.
  • Otherwise, if there exists one or more arguments Ei with
    corresponding parameter type Ti such that
    the output type of Ei with type Ti contains at least one unfixed
    type parameter Xj, and
    none of the input types of Ei with type Ti contains any unfixed
    type parameter Xj,
    then an output type inference is made from all such Ei to Ti.

Whether or not the previous step actually made an inference, we
must now fix at least one type parameter, as follows:

  • If there exists one or more type parameters Xi such that
    Xi is unfixed, and
    Xi has a non-empty set of bounds, and
    Xi does not depend on any Xj
    then each such Xi is fixed. If any fixing operation fails then
    type inference fails.
  • Otherwise, if there exists one or more type parameters Xi such that
    Xi is unfixed, and
    Xi has a non-empty set of bounds, and
    there is at least one type parameter Xj that depends on Xi
    then each such Xi is fixed. If any fixing operation fails then
    type inference fails.
  • Otherwise, we are unable to make progress and there are
    unfixed parameters. Type inference fails.

If type inference neither fails nor succeeds then the second phase is repeated.

The idea here is that we want to ensure that the algorithm never goes into an infinite loop. At every repetition of the second phase it either succeeds, fails, or makes progress. It cannot possibly loop more times than there are type parameters to fix to types.

Thanks for bringing this to my attention.

Leave a Reply

Your email address will not be published. Required fields are marked *