Improving collecting run-time type information for code insight

Permanently deleted user

Created March 20, 2014 19:18

Hello,

I've been using PyCharm for some months now, and it's without a doubt the best Python IDE I've ever used!
That being said, it still has some flaws regarding code completion (naturally, because of Python's dynamic typing system).

In PyCharm 2.7, if I remember correctly, you introduced a feature for collecting run-time type information for code insight, which is used with the debugger. This is a great feature, but I think it can be improved significantly.

This tool, in its current implementation, collects type information only for parameters passed to a function and for values returned from a function, but nothing else.

That is, if a variable is assigned an object that was created dynamically, it seems that no type information is collected for it.

Let me show you an example using SQLAlchemy (which is the main reason I wrote this post).
Let's take a look at a simple database definition using SQLAlchemy's SQL Expression Language:

from sqlalchemy import create_engine
from sqlalchemy import Table, Column, Integer, MetaData

engine = create_engine('sqlite:///:memory:')
metadata = MetaData()

table = Table('some_table', metadata, Column('some_col', Integer, primary_key=True))
table.create(engine)

ins = table.insert()

Now, if we take a look at the last line, 'ins' is a variable pointing to an Insert object, which has a lot of methods. We could say for example:

ins.bind(engine)

But there are no code completion suggestions for this variable, since the static type inference could not infer its type (you may want to take a look at the insert() method definition to understand why).

But, if I enable 'Collect run-time types information for code insight' in the 'Python Debugger' settings, I would expect that when I run the debugger, the type information for the variable 'ins' will be stored, and I will have code completion suggestions. However, that's not what happens.

But if I change this last line into the following few lines:

def foo(a):
    return a

ins = foo(table.insert())

And then run the debugger, the type information for the parameter 'a' will be collected, and since 'a' is also returned, I will also get type information for the 'ins' variable.
That's pretty weird behavior...

I would appreciate it if you would consider changing that behavior and also collect type information for variables that are not function parameters.

Thanks!

4 comments

Dirk Dittert

Created March 24, 2014 22:46

Python really is a very dynamic language. Even with all that type inference and debugger tricks, there's no way for PyCharm to tell, just by analyzing source code, which calls will or will not succeed or what the correct return type will be in all those cases. For example, you could make Python objects respond dynamically to certain method calls based on the current system time. There's no way for PyCharm to know about that without the code being executed.
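
A minimal sketch of that kind of time-dependent behavior, to make the point concrete. The class `TimeDependent`, its `clock` parameter, and the `greet` method are hypothetical names made up for this illustration:

```python
import time


class TimeDependent:
    """Hypothetical class whose available methods depend on the clock."""

    def __init__(self, clock=time.time):
        # The clock is injectable only so the behavior can be demonstrated
        # deterministically; by default the real system time is used.
        self._clock = clock

    def __getattr__(self, name):
        # __getattr__ is called only when normal attribute lookup fails.
        # Here 'greet' exists only while the current second is even.
        if name == "greet" and int(self._clock()) % 2 == 0:
            return lambda: "hello"
        raise AttributeError(name)


obj = TimeDependent()
try:
    result = obj.greet()  # succeeds or fails depending on when it runs
except AttributeError:
    result = None
```

Nothing in the source text of `TimeDependent` says whether `obj.greet()` will succeed, so a static analyzer cannot offer it reliably as a completion.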

If you always want to be certain about method signatures and return types, you'd have to use a language that provides this type information at compile time (e.g. Java, .NET, Kotlin).

You can also help PyCharm by adding type information in cases where it doesn't guess correctly: please see the online help for type annotations in docstrings.

And you're right: it would be great if more dynamic type information could be tracked by the debugger. But I guess there's only so much JetBrains can do about that...
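
As a rough illustration of docstring type hints, here is a small sketch using the Sphinx-style `:type` and `:rtype:` fields. The `Insert` class below is a hypothetical stand-in, not SQLAlchemy's real class, and `make_insert` and `run` are made-up names:

```python
class Insert:
    """Hypothetical stand-in for a third-party statement class."""

    def values(self, **kwargs):
        return dict(kwargs)


def make_insert():
    """
    Build a statement object.

    :rtype: Insert
    """
    return Insert()


def run(stmt):
    """
    Use the statement; the :type field tells the IDE what 'stmt' is.

    :type stmt: Insert
    :rtype: dict
    """
    return stmt.values(some_col=1)


row = run(make_insert())
```

The docstring fields have no effect at run time; they only give the IDE a declared type to use for completion.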

Permanently deleted user

Created March 24, 2014 23:07

Hi Dirk.
I appreciate your answer!

I'm familiar with the reasons why it's hard to infer all the types statically. My post was mainly to suggest an improvement to the run-time type collector. I think it could be improved significantly if it also stored the information for local variables, and not just for parameters passed to a function. I mean, it already has that information at run time, so why not cache it, the same way it is cached for function parameters and return values?
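
To show that this information is available at run time, here is a minimal sketch that records the types of a function's local variables with `sys.settrace`. It is only an illustration of the idea, not how PyCharm's collector works; `observed`, `collect_types`, and `example` are hypothetical names, and it covers function locals only, not module-level variables:

```python
import sys
from collections import defaultdict

# Maps (function name, variable name) -> set of type names seen at run time.
observed = defaultdict(set)


def _record_locals(frame, event, arg):
    # Local trace function: when the frame returns, snapshot its locals.
    if event == "return":
        for name, value in frame.f_locals.items():
            observed[(frame.f_code.co_name, name)].add(type(value).__name__)
    return _record_locals


def _tracer(frame, event, arg):
    # Global trace function: called for each new Python frame.
    if event == "call":
        return _record_locals
    return None


def collect_types(func, *args, **kwargs):
    """Run func under the tracer, then restore the previous tracer."""
    previous = sys.gettrace()
    sys.settrace(_tracer)
    try:
        return func(*args, **kwargs)
    finally:
        sys.settrace(previous)


def example():
    ins = {"some_col": 1}  # stands in for a dynamically created object
    count = len(ins)
    return count


collect_types(example)
```

After the call, `observed` holds an entry for each local of `example`, which is the kind of data a collector could cache for code insight.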

About function type annotations - I'm familiar with the concept, but many times I use 3rd-party libraries which were not written by me, so there's not much I can do (I'm not really going to go over all that code and annotate it :)). Furthermore, function type annotations do not allow for generic type annotations. Sometimes you define a function which receives a parameter, and you want to annotate the function's return type as that parameter's type. Or, a better example: you define a class which has methods, the class is initialized by passing another type (class) to its constructor (__init__), and you want to annotate all of that class's methods to say they return the same type as the one passed to the constructor (just like generics in Java or C#).
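
For illustration, the two cases described above can be sketched with the standard `typing` module. This assumes a Python version that ships `typing` (it was standardized after this thread was written), and `identity` and `Factory` are hypothetical names:

```python
from typing import Generic, List, Type, TypeVar

T = TypeVar("T")


def identity(value: T) -> T:
    """The return type is whatever type the argument has."""
    return value


class Factory(Generic[T]):
    """Hypothetical class initialized with another class."""

    def __init__(self, cls: Type[T]) -> None:
        self._cls = cls

    def create(self) -> T:
        # Returns an instance of the class passed to the constructor.
        return self._cls()

    def create_many(self, n: int) -> List[T]:
        return [self._cls() for _ in range(n)]


text = Factory(str).create()  # a type checker can infer 'str' here
```

The annotations are not enforced at run time; they only describe the relationship between the constructor argument and the method return types.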

Dirk Dittert

Created March 25, 2014 11:19

I certainly do agree with your intentions. Making code analysis better is always a good thing. As I understand the documentation, it seems to be possible to declare generics with Foo[T], with T to Z being reserved for generics. I have never used that, though.

Permanently deleted user

Created March 25, 2014 18:31

Apparently I didn't pay much attention to the part of the documentation that deals with generics. The solution they give there is very nice! I didn't know it existed, so thanks for that.

I'll have to read more about it to make sure it has everything I need, but at first glance it's pretty good.

Anyway, it can be a great solution when you're developing your own modules, but when you use a 3rd-party library, like SQLAlchemy, there's not much you can do about it, and that's the case when you most need the auto-complete... If you want to always have good auto-complete in PyCharm, you'll have to make the entire Python community conform to PyCharm's annotation rules, and of course that's not gonna happen.
