Choosing the right Visitor/Processor/Traverser paradigm for "compiling" a PsiFile?

There seem to be many classes (PsiElementVisitor, PsiElementProcessor, SyntaxTraverser) that all provide mechanisms for traversing a PsiFile. One can also manually recurse through either PsiElements or ASTNodes at any time.

I'm trying to build a "compiler" that reads an entire PsiFile (probably top-down) and processes it into a binary format. Which of the above paradigms is best suited for this purpose? Does JetBrains have examples? I found the Kotlin plugin a very helpful reference for parsing and constructing a PSI, but I've had less luck finding good examples of processing/compiling a PSI.


Thanks, and sorry that this is such a high-level question.

1 comment

This is complicated to answer, because it depends on a number of factors.

For a plugin I'm writing, I am building out support for a language (XQuery), and that support is steadily getting closer to being a compiler for the language (so I can implement references/resolve, static analysis, etc.).

The approach I've taken for the PSI is to have a PSI class for each EBNF symbol, and a corresponding AST interface. I then refer to the interfaces instead of the implementation classes. To minimize the number of PSI objects, I don't create objects for symbols that only forward to a lower element (e.g. an ADD symbol that only has a left-hand side).

I then have various data model classes (type system, integer values, decimal values, imports, functions, variables, etc.) that the PSI classes implement. This allows me to easily check for a variable and query its properties without knowing the details of what is implementing it (e.g. parameter vs local vs global). The PSI classes implement this functionality via the PSI/AST tree APIs (wrapped in Kotlin sequences).
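To illustrate the idea of checking for a data model interface rather than a concrete PSI class, here is a minimal sketch in plain Java. All names (VariableDeclaration, FunctionParameter, GlobalVariable, describe) are hypothetical stand-ins, not the plugin's actual classes, and the element classes are ordinary objects rather than real PsiElement implementations.

```java
class VariableModelSketch {
    // Hypothetical data model interface; real PSI classes would implement it
    // on top of the PSI/AST tree APIs.
    interface VariableDeclaration {
        String variableName();
        boolean isGlobal();
    }

    // Stand-in for a PSI class representing a function parameter.
    static final class FunctionParameter implements VariableDeclaration {
        private final String name;
        FunctionParameter(String name) { this.name = name; }
        public String variableName() { return name; }
        public boolean isGlobal() { return false; }
    }

    // Stand-in for a PSI class representing a global variable declaration.
    static final class GlobalVariable implements VariableDeclaration {
        private final String name;
        GlobalVariable(String name) { this.name = name; }
        public String variableName() { return name; }
        public boolean isGlobal() { return true; }
    }

    // Stand-in for an element that is not a variable at all.
    static final class Comment {}

    // The caller only tests for the interface; it never needs to know
    // which concrete class is behind it.
    static String describe(Object element) {
        if (element instanceof VariableDeclaration) {
            VariableDeclaration v = (VariableDeclaration) element;
            return (v.isGlobal() ? "global " : "local ") + "$" + v.variableName();
        }
        return "not a variable";
    }

    public static void main(String[] args) {
        System.out.println(describe(new FunctionParameter("x")));
        System.out.println(describe(new GlobalVariable("config")));
        System.out.println(describe(new Comment()));
    }
}
```

The point of the design is that adding another kind of variable only means adding another implementer; code like describe stays unchanged.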

In addition to this, I have a "walk tree" sequence that traverses the PSI tree, going into the children first, then the siblings. This is similar to the PsiElementVisitor logic, but does not need any additional visitor logic for each element (as I can check if the element is an instance of one of the AST interfaces). There is also a reverse version that traverses siblings first, then the parent node (stopping if the parent is a PsiDirectory). I use this API for things like "find the variable that this name reference refers to": I walk the tree in reverse order, with special handling for things like for loops that declare multiple variables.
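The two walks described above can be sketched roughly as follows. This is plain Java over a hypothetical Node class rather than real PsiElement objects, it returns lists instead of lazy sequences, and it assumes that "siblings first" in the reverse walk means the preceding siblings. The node kinds ("let", "ref", "directory") are made up for the example, and the special handling for constructs such as for loops is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class TreeWalkSketch {
    // Hypothetical stand-in for a PSI element.
    static class Node {
        final String kind; // e.g. "file", "let", "ref", "directory"
        final String name; // variable name, or null
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String kind, String name) { this.kind = kind; this.name = name; }

        Node add(Node child) { child.parent = this; children.add(child); return this; }

        Node prevSibling() {
            if (parent == null) return null;
            int i = parent.children.indexOf(this);
            return i > 0 ? parent.children.get(i - 1) : null;
        }
    }

    // Forward walk: a node, then its children, then its following siblings.
    static List<Node> walkTree(Node root) {
        List<Node> out = new ArrayList<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            out.add(n);
            // Push children right to left so the leftmost child is visited first.
            for (int i = n.children.size() - 1; i >= 0; i--) stack.push(n.children.get(i));
        }
        return out;
    }

    // Reverse walk: preceding siblings first, then the parent,
    // stopping when the next node would be a "directory".
    static List<Node> walkReverse(Node start) {
        List<Node> out = new ArrayList<>();
        Node current = start;
        while (true) {
            Node prev = current.prevSibling();
            Node next = prev != null ? prev : current.parent;
            if (next == null || next.kind.equals("directory")) break;
            out.add(next);
            current = next;
        }
        return out;
    }

    // Resolve a name reference to the nearest earlier declaration with the same name.
    static Node resolve(Node ref) {
        for (Node n : walkReverse(ref)) {
            if (n.kind.equals("let") && ref.name.equals(n.name)) return n;
        }
        return null;
    }

    public static void main(String[] args) {
        Node ref = new Node("ref", "x");
        Node file = new Node("file", null)
                .add(new Node("let", "x"))
                .add(new Node("let", "y"))
                .add(new Node("expr", null).add(ref));
        new Node("directory", null).add(file);
        Node decl = resolve(ref);
        System.out.println(decl.kind + " " + decl.name);
    }
}
```

Because both walks yield plain elements, the caller filters with instanceof-style checks (here, the kind field) instead of overriding a visit method per element type.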

One place to start would be to look at the intellij-community source code. You could also try implementing a small part of the logic using each of the different mechanisms, then assess which one works best for what you are doing.

