Can anyone explain tree-related classes?

Ok, my head is spinning trying to figure out what the billion tree-related classes are for. E.g., what the hell is a stub? Is that the same thing as a base class with most of the stuff filled in? These classes are critical, and not a single one of them has even three words saying what it is for. Surely one person could write one sentence for each of these classes and save even internal JetBrains people lots and lots of time. I'm reading through plug-ins like crazy trying to abstract what the hell all of the shit does. I've gone down a million different paths trying to figure out how to implement a simple "this node references this other node" mechanism. I think I finally figured out that you can't create a PSI node for a token, which is bogus because then I have to create internal nodes as parents of identifier nodes just to say they are PsiNamedElement.

I understand that this tool has to be incredibly general, hence the many class names and levels of indirection, but if you don't tell me what all of these classes are for, it's really impossible to build a plug-in. I'm also facing different versions, each of which changes the API, as I look through plug-ins.




Stubs are actually pretty well documented here: IntelliJ has to be not only general but efficient too.

It's true that the documentation is pretty spread across various spots and that there's very little in the code. The main central point is here: I'd recommend reading at a minimum:

...if you haven't already. The last one there explains the various levels of tree tokens pretty well. If you have more specific questions feel free to ask them here and I and others can try to answer them, or if you post the code somewhere I can take a look at it.


What exactly are you trying to achieve? Are you building a custom language, or are you trying to extend an existing language?


Hi Colin, yep, I've been reading those again and again. The tree diagram is what confirmed for me that there are no PSI elements associated with tokens from the AST. I don't understand how that can be when I need a variable reference, for example, to implement an interface for named items. I'm writing this email quickly before I finish up for the day, so I don't have all the names in front of me. Part of my confusion is what the heck an IElementType is. Is it a token type? An AST node type? A PSI node type? All of these getValue(), getNode(), and getElementType() calls are really confusing me. It is really pointer spaghetti, and I can't keep track of all these type names in my head. We really need a simple description of these critical PSI and AST related classes and interfaces. It just can't take that long.

Dmitry has replied, so I will try to explain what I'm trying to do there.
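
For anyone landing here with the same IElementType / getNode() / getElementType() confusion, here is a rough toy model of how the three layers seem to relate. It is plain Java, not the IntelliJ classes: every name in it (ElementType, AstNode, PsiWrapper, FunctionPsi, createElement) is made up for illustration. The assumption behind it is that one type tag is shared by token types and rule types, that each AST node carries such a tag, and that a PSI element wraps one AST node. In the real SDK the corresponding calls, as far as I know, are PsiElement.getNode(), ASTNode.getElementType(), and ParserDefinition.createElement(ASTNode).

```java
import java.util.ArrayList;
import java.util.List;

public class TreeLayersSketch {
    /** Toy type tag: one shared instance per token kind or grammar rule. */
    static final class ElementType {
        final String name;
        ElementType(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    static final ElementType ID = new ElementType("ID");      // a token type (leaf)
    static final ElementType FUNC = new ElementType("func");  // a rule type (interior node)

    /** Toy AST node: a type tag plus either token text (leaf) or children (interior). */
    static final class AstNode {
        private final ElementType type;
        private final String tokenText;  // null for interior nodes
        private final List<AstNode> children = new ArrayList<>();

        AstNode(ElementType type, String tokenText) {
            this.type = type;
            this.tokenText = tokenText;
        }
        AstNode add(AstNode child) { children.add(child); return this; }
        ElementType getElementType() { return type; }
        List<AstNode> getChildren() { return children; }
        String getText() {
            if (tokenText != null) return tokenText;
            StringBuilder sb = new StringBuilder();
            for (AstNode c : children) sb.append(c.getText());
            return sb.toString();
        }
    }

    /** Toy PSI element: wraps exactly one AST node and adds language-level behavior. */
    static class PsiWrapper {
        private final AstNode node;
        PsiWrapper(AstNode node) { this.node = node; }
        AstNode getNode() { return node; }
    }

    /** Hypothetical language-specific wrapper for an interior "func" node. */
    static final class FunctionPsi extends PsiWrapper {
        FunctionPsi(AstNode node) { super(node); }
        /** Reads the name from the first ID leaf among the direct children. */
        String getName() {
            for (AstNode c : getNode().getChildren()) {
                if (c.getElementType() == ID) return c.getText();
            }
            return null;
        }
    }

    /** Factory keyed on the type tag: picks a wrapper class per node type. */
    static PsiWrapper createElement(AstNode node) {
        if (node.getElementType() == FUNC) return new FunctionPsi(node);
        return new PsiWrapper(node);
    }

    public static void main(String[] args) {
        AstNode func = new AstNode(FUNC, null).add(new AstNode(ID, "f"));
        PsiWrapper psi = createElement(func);
        System.out.println(psi.getNode().getElementType());  // prints "func"
        System.out.println(((FunctionPsi) psi).getName());   // prints "f"
    }
}
```

In this model the type tag answers "what kind of node is this?" for both leaves and interior nodes, and the wrapper class is chosen from that tag. Whether the real classes behave exactly this way in every version is something to check against the SDK docs.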


Hi Dmitry,

Essentially I'm trying to learn the core API necessary to build a plug-in for a completely new simple programming language from scratch so that I can automatically generate a framework from an ANTLR grammar. I.e., first I'm trying to learn what it takes and then I will automate it.  I believe I have a general mechanism in my mind to make the parsing stuff work.  After I was able to identify what was causing a null pointer this afternoon, I was able to get a simple structured view together.  I managed to abstract from multiple plug-ins what I needed minimally. See attached. It shows the PSI tree structure I have for this trivial programming language for which I also attach the grammar.

Now I'm getting to the hard part :). All I want to do is make variable and function references to satisfy the needs of your search, refactor, and jump to definition functionality.  I'm trying to do the reference / definition relationship stuff.  

I'm also trying to understand the relationship between tokens in the trees and PSI nodes. To start, I assume there is an AST node associated with the token? Can I make a PSI node from that? If not, how do I implement PsiNamedElement? Or can that only be done on interior tree nodes? Do I need to create a special kind of PSI node for each kind of reference, such as variable and function references? When I set a definition, is it on the leaf node associated with the identifier for the function definition, or is it on the interior node called "func" associated with my func rule in the grammar? Is this enough for you to get me started? And given the PSI nodes you see in the tree there, can you make any recommendations about what special types I need in order to implement the def/ref stuff? Every time I try to look through source code, I get lost in all of the interface and class names associated with trees, particularly because nodes tend to wrap other nodes. (That is all totally appropriate given the scope of the problem; it's just that I don't know what any of them are for ;)).
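
To make the def/ref question concrete, here is a minimal sketch of one possible arrangement, under the assumption that the definition lives on the interior "func" node and the identifier leaf only supplies the name. It is plain Java with made-up names (Node, FunctionDef, FunctionRef), not IntelliJ code, and the resolve strategy (scan the top-level "func" nodes of the file) is a simplification chosen for the example. The real interfaces this loosely mirrors are PsiNamedElement.getName(), PsiNameIdentifierOwner.getNameIdentifier(), and PsiReference.resolve().

```java
import java.util.ArrayList;
import java.util.List;

public class DefRefSketch {
    /** Toy tree node: a kind ("file", "func", "call", "ID"), optional token text, parent, children. */
    static final class Node {
        final String kind;
        final String text;  // token text for leaves, null for interior nodes
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String kind, String text) { this.kind = kind; this.text = text; }
        Node add(Node child) { child.parent = this; children.add(child); return this; }
        Node firstChild(String wantedKind) {
            for (Node c : children) if (c.kind.equals(wantedKind)) return c;
            return null;
        }
    }

    /** Definition side: the interior "func" node is the named element. */
    static final class FunctionDef {
        final Node node;
        FunctionDef(Node funcNode) { this.node = funcNode; }
        /** The ID leaf that carries the name. */
        Node getNameIdentifier() { return node.firstChild("ID"); }
        String getName() {
            Node id = getNameIdentifier();
            return id == null ? null : id.text;
        }
    }

    /** Reference side: attached to the interior "call" node, resolved by name. */
    static final class FunctionRef {
        final Node node;
        FunctionRef(Node callNode) { this.node = callNode; }
        String getReferencedName() {
            Node id = node.firstChild("ID");
            return id == null ? null : id.text;
        }
        /** Walks up to the root, then scans its direct "func" children for a matching name. */
        FunctionDef resolve() {
            Node root = node;
            while (root.parent != null) root = root.parent;
            String wanted = getReferencedName();
            for (Node c : root.children) {
                if (!c.kind.equals("func")) continue;
                FunctionDef def = new FunctionDef(c);
                if (wanted != null && wanted.equals(def.getName())) return def;
            }
            return null;  // unresolved reference
        }
    }

    public static void main(String[] args) {
        Node def = new Node("func", null).add(new Node("ID", "f"));
        Node call = new Node("call", null).add(new Node("ID", "f"));
        new Node("file", null).add(def).add(call);  // links both nodes under one root
        FunctionDef target = new FunctionRef(call).resolve();
        System.out.println(target != null && target.node == def);  // prints "true"
    }
}
```

With this split, one reference class per kind of thing being referenced (functions here, variables would be analogous) is enough for the sketch; whether the actual plug-in needs separate PSI classes per reference kind depends on how the grammar rules are shaped.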



