Custom Language Support Contributor Implementation
Hi!
I am trying to create a plugin for the Jack language (a toy language from the nand2tetris course). So far I've followed the Simple Language Plugin tutorial (SLPT) and have been peeking into the Rust plugin source code. I am currently trying to implement a completion contributor, which is covered very briefly in SLPT and is pretty complex in the Rust plugin, so I'm asking you for help. The problem: when I type Jack code and debug completion (following the answer to "Q: My completion is not working! How do I debug it?"), the apparent position (the myPosition variable) in every CompletionContributor#fillCompletionVariants call is somehow an identifier, and its parent is almost always a PsiErrorElement: "x" expected, got 'IntellijIdeaRulezzz'. So I think I have a problem either in my BNF structure or in my Flex file. However, I think I followed the tutorial pretty carefully (its complexity is really low, though, so maybe I transitioned poorly).
My question is: what could cause every position to be psiElement(identifier) and the parent of every position to be a PsiErrorElement?
For context, here is my BNF file:
{
parserClass="ge.freeuni.jack.lang.parser.JackParser"
extends="com.intellij.extapi.psi.ASTWrapperPsiElement"
psiClassPrefix="Jack"
psiImplClassSuffix="Impl"
psiPackage="ge.freeuni.jack.lang.psi"
psiImplPackage="ge.freeuni.jack.lang.psi.impl"
elementTypeHolderClass="ge.freeuni.jack.lang.psi.JackTypes"
elementTypeClass="ge.freeuni.jack.lang.psi.JackElementType"
tokenTypeClass="ge.freeuni.jack.lang.psi.JackTokenType"
tokens = [
LBRACE = '{'
RBRACE = '}'
LBRACK = '['
RBRACK = ']'
LPAREN = '('
RPAREN = ')'
COLON = ':'
SEMICOLON = ';'
COMMA = ','
EQ = '='
EXCL = '!'
PLUS = '+'
MINUS = '-'
AND = '&'
OR = '|'
LT = '<'
MUL = '*'
DIV = '/'
GT = '>'
DOT = '.'
NOT = '~'
]
}
File ::= ClassDecl
ClassDecl ::= class identifier LBRACE Defs* Subroutines* RBRACE
private Defs ::= VarScope Type identifier SEMICOLON
VarScope ::= field | static
Type ::= int | boolean | char
Subroutines ::= FuncScope RetType identifier LPAREN Params RPAREN FuncBody
FuncScope ::= constructor | method | function
RetType ::= Type | void
Params ::= Type identifier ParamsTail*
ParamsTail ::= COMMA Type identifier
FuncBody ::= LBRACE LocalVars* Statements* FuncRet RBRACE
Statements ::= do identifier
FuncRet ::= return identifier?
LocalVars ::= var Type identifier IdTail* SEMICOLON
IdTail ::= COMMA identifier
And my Flex file:
package ge.freeuni.jack.lang;
import com.intellij.lexer.FlexLexer;
import com.intellij.psi.tree.IElementType;
import ge.freeuni.jack.lang.psi.JackElementType;
import ge.freeuni.jack.lang.psi.JackTypes;
import com.intellij.psi.TokenType;
import static com.intellij.psi.TokenType.*;
import static ge.freeuni.jack.lang.psi.JackTypes.*;
%%
%{
public _JackLexer() {
this((java.io.Reader)null);
}
%}
%public
%class _JackLexer
%implements FlexLexer
%unicode
%function advance
%type IElementType
EOL_WS = \n | \r | \r\n
LINE_WS = [\ \t]
WHITE_SPACE_CHAR = {EOL_WS} | {LINE_WS}
WHITE_SPACE = {WHITE_SPACE_CHAR}+
IDENTIFIER = [_a-zA-Z][_a-zA-Z0-9]*
%%
<YYINITIAL> {
  "{"            { return LBRACE; }
  "}"            { return RBRACE; }
  "["            { return LBRACK; }
  "]"            { return RBRACK; }
  "("            { return LPAREN; }
  ")"            { return RPAREN; }
  ":"            { return COLON; }
  ";"            { return SEMICOLON; }
  ","            { return COMMA; }
  "="            { return EQ; }
  "!"            { return EXCL; }
  "+"            { return PLUS; }
  "-"            { return MINUS; }
  "&"            { return AND; }
  "|"            { return OR; }
  "<"            { return LT; }
  "*"            { return MUL; }
  "/"            { return DIV; }
  ">"            { return GT; }
  "."            { return DOT; }
  "~"            { return NOT; }
  "class"        { return CLASS; }
  "field"        { return FIELD; }
  "static"       { return STATIC; }
  "int"          { return INT; }
  "char"         { return CHAR; }
  "boolean"      { return BOOLEAN; }
  "constructor"  { return CONSTRUCTOR; }
  "function"     { return FUNCTION; }
  "method"       { return METHOD; }
  {IDENTIFIER}   { return IDENTIFIER; }
  {WHITE_SPACE}  { return WHITE_SPACE; }
}
[^] {return BAD_CHARACTER;}
As you might quickly spot, I haven't tried to incorporate string or comment detection yet; I'm just trying to get the completion part working first.
Any help will be appreciated!
Hi,
Could you please provide the issue context?
Karol Lewandowski
I tried to invoke completion in two places:
1. When I'm beginning to type in the file.
Here I was expecting the class keyword to pop up (so the caret position is basically 0,1 or 0,2).
2. While defining a variable of a class (or standing inside a class).
Here, if I haven't started typing, I expected the suggestions field, static, constructor, method, and function; if invoked after typing fi, I expected just field to pop up.
This is my completion contributor code:
However, I don't think the completion provider can do anything, considering that every element is treated as psiElement(identifier) and every parent is a PsiErrorElement.
I see the issue now.
The IntellijIdeaRulezzz dummy identifier is inserted before completion in a copy of the edited document so that the PSI tree is correct and doesn't contain PsiError elements - it works for most cases.
In case you don't need any identifier at the current context or you need another dummy text, it is possible to change the default "IntellijIdeaRulezzz" dummy identifier to something else, depending on your needs. It can be done in CompletionContributor.beforeCompletion(CompletionInitializationContext context), e.g.:
context.setDummyIdentifier(""); // or other text
Karol Lewandowski
Okay, I'm going to look into that, but can you explain why this happens in my case? Am I missing something, or do I have to do something else so that the exact token is the current element? For example, at the start of the file, when I start typing, shouldn't the current element be psiElement(JackTypes.CLASS)?
Karol Lewandowski
I tried to do what you told me; however, as expected, it only changed the text of the LeafPsiElement. I'm adding a screenshot of the state of execution. This is where all the confusion happens: for some reason even the first element (as you can see, I have only typed 'cl') is of type psiElement(identifier), and everything afterwards is also classified as psiElement(identifier). I can't seem to wrap my head around why.
To add more proof: when I register basic completion for the identifier element, it starts to suggest the content provided by the CompletionProvider.
I even thought that using only YYINITIAL in the .flex file might be the problem, but highlighting seems to work fine, and the Rust plugin also uses only the YYINITIAL state (outside of string literals and comments, of course).
I'm sorry, but I missed your key point. I thought you were getting an error element in the completion position, but you get the identifier, which is fine.
You get the identifier token type because the lexer recognizes any characters other than keywords as identifiers. When registering completion providers, you should pass the pattern that matches the current context, not the context you want after completion.
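To make the point above concrete, here is a minimal plain-Java sketch that imitates what the lexer sees during completion. It is not the generated _JackLexer and uses no IntelliJ APIs: the class name DummyIdentifierDemo and its helper methods are hypothetical, and the whole-word classification is a simplifying assumption standing in for JFlex's longest-match rule. The keyword list and the identifier pattern are taken from the Flex file earlier in the thread.

```java
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

public class DummyIdentifierDemo {
    // Keywords copied from the .flex file in this thread.
    static final List<String> KEYWORDS = List.of(
            "class", "field", "static", "int", "char", "boolean",
            "constructor", "function", "method");

    // Same shape as the IDENTIFIER macro in the .flex file.
    static final Pattern IDENTIFIER = Pattern.compile("[_a-zA-Z][_a-zA-Z0-9]*");

    // Default dummy identifier inserted at the caret in a copy of the document.
    static final String DUMMY = "IntellijIdeaRulezzz";

    /** Toy stand-in for the lexer: classifies one whole word (longest match). */
    static String tokenType(String word) {
        if (KEYWORDS.contains(word)) {
            return word.toUpperCase(Locale.ROOT);
        }
        if (IDENTIFIER.matcher(word).matches()) {
            return "IDENTIFIER";
        }
        return "BAD_CHARACTER";
    }

    /** Token type at the caret as completion sees it: typed prefix + dummy text. */
    static String tokenTypeAtCaret(String typedPrefix, String dummy) {
        return tokenType(typedPrefix + dummy);
    }

    public static void main(String[] args) {
        // Nothing typed yet: the dummy text alone is an identifier.
        System.out.println(tokenTypeAtCaret("", DUMMY));
        // "cl" typed: "clIntellijIdeaRulezzz" is still an identifier.
        System.out.println(tokenTypeAtCaret("cl", DUMMY));
        // Even a fully typed keyword becomes part of a longer identifier.
        System.out.println(tokenTypeAtCaret("class", DUMMY));
        // An empty dummy does not help while the keyword is incomplete.
        System.out.println(tokenTypeAtCaret("cl", ""));
        // Only the finished keyword in the real document is lexed as CLASS.
        System.out.println(tokenType("class"));
    }
}
```

In this sketch, every call except the last one yields IDENTIFIER, which is why a pattern that waits for a CLASS token at the completion position can never match.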
In the first provider:
You try to complete the "class" keyword, but the pattern requires a class keyword token that doesn't exist yet, so the pattern can never match. The pattern should reflect the context in which you want this completion to be invoked. It's OK if the current position from the completion point of view is an identifier, as it is just a temporary state in a copy of the document used for completion. If a class can be defined only at the top level, then you can create a pattern that matches a PSI element with a file parent. Maybe there is a better pattern for this; I don't know the language. This is just an example, and creating patterns usually requires experimenting, as completion will sometimes be invoked in a position that matches the pattern but is unexpected.
Similarly, in the second case - if you want to complete keywords that are part of a class, then try something like "psiElement().withTreeParent(psiElement(CLASS))".
Karol Lewandowski
Thanks, now I understand what I have to do.
Can I ask one last question?
When I view the structure of Kotlin files in PsiViewer, I notice the following: if I am standing inside a class, as soon as I type "val", a new element named PROPERTY is added to the tree, and it contains psiElement(val) and a PsiErrorElement as its children. In my case, when I start declaring a new property in Jack, it doesn't become a single element until I type out the whole declaration (field int foo;). So how is the first behavior achieved? It would be a lot easier to provide completion with that kind of structure.
This is a very good and important question.
You can add "pins" to your grammar rules in the BNF file. Pins allow you to match a given rule even if it's not finished or is invalid, e.g., the property may look like:
It may be defined as:
The "pin=1" attribute means that as soon as "val" is matched, the "property" rule is considered matched (and is included in the built syntax tree) even if the rest of it is incomplete or invalid; the missing or incorrect parts are reported as error elements. After the rule has been processed, the parser skips tokens for as long as the "recoverWhile" predicate rule matches. I guess it's not easy to understand this from a single example, so I recommend reading the Grammar-Kit documentation, especially the sections on pin and recoverWhile.
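The original snippets are not preserved in this thread, so the following is only a sketch of how a pinned rule with recovery might look for Jack's field declarations. It reuses the VarScope, Type, identifier, SEMICOLON, and keyword tokens from the grammar posted above; the rule names ClassVarDecl and ClassVarDecl_recover are hypothetical, and the choice of recovery tokens is an assumption about where a new declaration can start.

```bnf
// Assumed replacement for the private Defs rule. It is not private, so it
// produces its own PSI node. pin=1: once VarScope (field | static) is
// matched, the node is created even if the rest is missing.
ClassVarDecl ::= VarScope Type identifier SEMICOLON {
  pin=1
  recoverWhile=ClassVarDecl_recover
}

// Predicate rule: tokens are skipped while it matches, i.e. until the parser
// reaches something that can start the next declaration or close the class.
private ClassVarDecl_recover ::= !(field | static | constructor | method | function | RBRACE)
```

With a rule of this shape, typing only "field" inside a class body should already yield a ClassVarDecl node containing the keyword and an error element, which gives completion patterns a stable parent to match against.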
Karol Lewandowski
Thank you so much. I have read the documentation, but I thought pin and recovery were only for error handling. With your example everything is clear now. Thank you yet again!
Karol Lewandowski
Sorry for bothering you again. I abstained from starting a new post (maybe that wasn't a good idea, but still).
After your clarifications I managed to get a better PSI tree, but there is still something bugging me. It probably has to do with the "recoverWhile" rule; I couldn't quite figure out its usage from the documentation, although I got the hang of pins. So my question now is this. In Kotlin's PSI tree, when you create an empty class, it looks like this:
When I start typing inside this class, a new PsiErrorElement appears between the "{" and "}" elements. But when I do the same in my Jack language (basically the same structure; I can provide screenshots if necessary), the right brace "}" drops out of the CLASS_BODY element and, depending on my recoverWhile rule, becomes a child of either the CLASS element or even the FILE element. Once the pin is reached for Jack's property rule, the right brace "}" pops back in where it belongs and everything seems normal. Am I missing something in the pin rules, or does it have something to do with "recoverWhile"? How is the Kotlin behavior achieved?
P.S. I can also provide my updated .bnf file for more context.
My first question is: does it matter in your case where the right brace is? If not, I suggest not trying to reflect another language's behavior but constructing grammar that will work for your features. If the completion works as expected in the current state, I would leave it as is.
The Kotlin parser is implemented manually from scratch (https://github.com/JetBrains/kotlin/blob/master/compiler/psi/src/org/jetbrains/kotlin/parsing/KotlinParsing.java), so it may be hard to achieve exactly the same behavior with Grammar-Kit.
If you need this behavior for some reason, please share the updated BNF and provide screenshots of the code and PSI tree. It will be much easier to understand what is happening.
Oh, I think you are right, I just thought I was doing something wrong when that was happening. It's not mandatory for right brace to behave like that at all.