Improving the Quality of Signatures with mypy

Preliminary

The Python Interface files (.pyi) of PySide are generated by a few scripts. When the .pyi files were introduced in 2017, a minimal syntax check was possible because these files could be run in Python itself.

Some changes to the format of .pyi files made that impossible, leaving PySide’s .pyi files largely unchecked for years. The generator could only check that all functions were parsed correctly.

The introduction of the mypy tool as a rigorous error checker for the generated files brought many improvements, but also some surprises.

Running the mypy Tests

The mypy tests are run automatically by the Qt Company CI framework (COIN). When you have mypy installed, the tests are also run when building with tests. In debug mode, this can take more than 30 seconds, therefore we provide the build option

--skip-mypy-test

which can be used when rebuilding repeatedly. But note that mypy has a good cache that suppresses analysis of unchanged .pyi files.

Types of mypy Errors

Duplication Errors

Many functions have multiple signatures, which are later translated to multiple typing.overload versions in the .pyi file. Due to the mapping of C++ functions to Python, it sometimes happens that similar C++ functions become Python duplicates. Exact duplicates were simple to filter out, but mypy still finds duplicates which differ only in parameter names. This is now handled by the function remove_ambiguous_signatures() in module layout, which compares the so-called annotations, that is, the signatures with parameter names ignored.
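The idea of comparing signatures without their parameter names can be sketched as follows. This is only an illustration built on the standard inspect module, not the code of remove_ambiguous_signatures(); the helper names annotation_key and drop_name_only_duplicates and the sample functions are hypothetical.

```python
import inspect


def annotation_key(sig: inspect.Signature) -> tuple:
    """Build a key from parameter kinds and annotations, ignoring names.

    Hypothetical helper, not part of PySide.
    """
    params = tuple((p.kind, p.annotation) for p in sig.parameters.values())
    return params, sig.return_annotation


def drop_name_only_duplicates(signatures):
    """Keep the first signature for every distinct annotation key."""
    seen = set()
    result = []
    for sig in signatures:
        key = annotation_key(sig)
        if key not in seen:
            seen.add(key)
            result.append(sig)
    return result


# Sample overload candidates: f1 and f2 differ only in a parameter name.
def f1(self, index: int) -> str: ...
def f2(self, row: int) -> str: ...
def f3(self, name: str) -> str: ...


sigs = [inspect.signature(f) for f in (f1, f2, f3)]
unique = drop_name_only_duplicates(sigs)
print([str(s) for s in unique])
```

In this sketch the first of several equivalent signatures wins; which one the real generator keeps is not stated here.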

Shadowing Errors

A quite subtle error type is the shadowing of multiple signatures. This is due to the sequential nature of .pyi files:

* In ``C++``, the order of functions does not matter at all. The best fit is
  automatically used.

* In Python stub files, the alternatives of multiple signatures are sequentially
  checked in ``@typing.overload`` chains of functions.
  This can produce shadowing when an annotation contains another.

An example: PySide6.QtCore.QCborSimpleType is shadowed by int when int is listed first. This follows from the method resolution order, mro():

* int.mro()              [<class 'int'>, <class 'object'>]

* QCborSimpleType.mro()  [<enum 'QCborSimpleType'>, <enum 'IntEnum'>,
                          <class 'int'>, <enum 'ReprEnum'>,
                          <enum 'Enum'>, <class 'object'>]

You see that the mro() has an ordering effect on the multiple signatures. The enum inherits from int and must therefore come before the int entry. The whole task of bringing the multiple signatures into a conflict-free order is a kind of topological sorting.
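A small overload chain in the conflict-free order might look like the sketch below. SimpleType is a hypothetical stand-in for QCborSimpleType and describe is an invented function; only the ordering of the two overloads matters. With the int overload listed first, the enum overload could never be selected.

```python
import enum
from typing import Union, overload


class SimpleType(enum.IntEnum):  # hypothetical stand-in for QCborSimpleType
    Null = 22


# The more derived annotation (the enum) comes first, the base (int) second.
@overload
def describe(value: SimpleType) -> str: ...
@overload
def describe(value: int) -> str: ...


def describe(value: Union[SimpleType, int]) -> str:
    # Test the more derived type first, mirroring the overload order.
    if isinstance(value, SimpleType):
        return "simple type"
    return "int"


print(describe(SimpleType.Null), describe(3))
```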

We build a sorting key using the length of the mro of the argument annotations and some additional heuristics. They can be inspected in function get_ordering_key() that is called by sort_by_inheritance() in module layout.
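As a rough sketch of the MRO-length part of such a key, the following sorts annotation tuples so that more derived types come first. It deliberately omits the additional heuristics and is not the code of get_ordering_key(); mro_depth_key and SimpleType are hypothetical names.

```python
import enum


class SimpleType(enum.IntEnum):  # hypothetical stand-in for QCborSimpleType
    Null = 22


def mro_depth_key(annotations):
    """Sort key: a longer MRO (more derived type) sorts earlier."""
    return tuple(-len(t.mro()) for t in annotations)


# Each tuple holds the argument annotations of one overload.
overloads = [(int,), (SimpleType,), (object,)]
ordered = sorted(overloads, key=mro_depth_key)
print([args[0].__name__ for args in ordered])
```

The enum has the longest MRO and is placed before int, which in turn precedes object.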

Unsolvable Errors

mypy points out some errors that we cannot solve. Our only option is to disable these errors partially or even completely. They are marked in the .pyi files, see below.

Contradiction to Qt

mypy also reports errors where Qt has a different opinion. The error codes “override” and “overload-overlap” needed to be disabled because we cannot change what Qt thinks is right.

Examples:

Error code "override" cannot be fixed because the problem
is situated in Qt itself:

    Signature of "open" incompatible with supertype "QFile"

Error code "overload-overlap" also cannot be fixed because
we have no chance to modify return-types:

    Overloaded function signatures 1 and 6 overlap with
    incompatible return types

They are globally disabled by the comment:

# mypy: disable-error-code="override, overload-overlap"

Other errors like “misc” are too broad to be prematurely disabled. See below how we handle them.

Disagreement with __add__ and __iadd__

There are internal rules for Python which can only be recognized when mypy points them out as “misc”. There are functions which come in pairs:

__add__, __iadd__, __sub__, __isub__, __mul__, __imul__, ...

and more. There is this rule:

If __add__ and __iadd__ exist in a type, the signatures must be the same.

In 95 % of the cases this rule is fulfilled, but in a few it is not. We therefore have to check these pairs, and if they disagree, we generate a disabling mypy inline comment “# type: ignore[misc]”. You can see this functionality in ExactEnumerator.klass of module enum_sig.
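The pair check can be illustrated with a short sketch. It assumes that "the same" means equal argument annotations after "self", and it only reports the disagreeing pairs; it is not the code in ExactEnumerator.klass. The names PAIRS, arg_annotations, mismatched_pairs and Point are hypothetical.

```python
import inspect

# A few of the operator pairs; the real list is longer.
PAIRS = [("__add__", "__iadd__"), ("__sub__", "__isub__"), ("__mul__", "__imul__")]


def arg_annotations(func):
    """Return the annotations of all parameters after "self"."""
    params = list(inspect.signature(func).parameters.values())
    return [p.annotation for p in params[1:]]


def mismatched_pairs(cls):
    """Return the (plain, in-place) pairs whose argument annotations differ."""
    members = vars(cls)
    result = []
    for plain, inplace in PAIRS:
        if plain in members and inplace in members:
            if arg_annotations(members[plain]) != arg_annotations(members[inplace]):
                result.append((plain, inplace))
    return result


class Point:
    def __add__(self, other: "Point") -> "Point": return self
    def __iadd__(self, other: int) -> "Point": return self   # disagrees
    def __sub__(self, other: "Point") -> "Point": return self
    def __isub__(self, other: "Point") -> "Point": return self


# A generator would attach "# type: ignore[misc]" for each reported pair.
print(mismatched_pairs(Point))
```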

Disagreement with inconsistent overloads

If there is a mixed overloading of methods and static or class methods, mypy believes this is an error. In a first version, we fixed this rare situation by suppressing this “misc” error. But when moving to correct positional-only parameters (PEP 570) this suppression created unsolvable follow-up errors. The cleaner solution was to remove the static methods and prefer the normal methods.

See function is_inconsistent_overload() of module layout which checks if “self” is always or never an argument.
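The "always or never" test on "self" can be sketched as below. This is an illustration under the assumption that signatures are available as inspect.Signature objects; has_mixed_self and the sample functions are hypothetical and do not reproduce is_inconsistent_overload().

```python
import inspect


def has_mixed_self(signatures):
    """True if some, but not all, signatures start with a "self" parameter."""
    with_self = sum(
        1 for sig in signatures
        if next(iter(sig.parameters), None) == "self"
    )
    return 0 < with_self < len(signatures)


# One normal method and one static variant of the same overloaded name.
def as_method(self, x: int) -> None: ...
def as_static(x: int, y: int) -> None: ...


sigs = [inspect.signature(as_method), inspect.signature(as_static)]

# When the overloads are mixed, keep only the normal methods.
if has_mixed_self(sigs):
    kept = [s for s in sigs if "self" in s.parameters]
else:
    kept = sigs
print([str(s) for s in kept])
```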

Conclusion and Future

This effort has brought the reported mypy errors from 601 down to zero, which is a real improvement. But more can be done. Although we now know that we are generating syntactically and semantically quite correct files, we still do not know whether the real types really fulfil the requirements of mypy.

There is a stubtest module in mypy which we might use to do even more tests. These would check whether the implementation and the stub files agree.

Literature