Custom Language Support Contributor Implementation

Answered

Hi! 

I am trying to create a plugin for the Jack language (the toy language from the nand2tetris course). So far I have followed the Simple Language Plugin tutorial and have been peeking into the Rust plugin's source code. Right now I am trying to implement a completion contributor, which the tutorial covers only briefly and which is pretty complex in the Rust plugin, so I'm asking for help here.

The problem: when I type Jack code and debug completion (following the answer to "Q: My completion is not working! How do I debug it?"), the apparent position (the myPosition variable) in every CompletionContributor#fillCompletionVariants call is an identifier, and its parent is almost always a PsiErrorElement: "x" expected, got 'IntellijIdeaRulezzz'. So I suspect a problem in either my BNF structure or my Flex file. I think I followed the tutorial carefully, but its grammar is very simple, so I may have made mistakes when moving to a more complex one.

My question is: what could cause every position to be psiElement(identifier) and every position's parent to be a PsiErrorElement?

For context, here is my BNF file:

{
  parserClass="ge.freeuni.jack.lang.parser.JackParser"

  extends="com.intellij.extapi.psi.ASTWrapperPsiElement"

  psiClassPrefix="Jack"
  psiImplClassSuffix="Impl"
  psiPackage="ge.freeuni.jack.lang.psi"
  psiImplPackage="ge.freeuni.jack.lang.psi.impl"

  elementTypeHolderClass="ge.freeuni.jack.lang.psi.JackTypes"
  elementTypeClass="ge.freeuni.jack.lang.psi.JackElementType"
  tokenTypeClass="ge.freeuni.jack.lang.psi.JackTokenType"

  tokens = [
    LBRACE = '{'
    RBRACE = '}'
    LBRACK = '['
    RBRACK = ']'
    LPAREN = '('
    RPAREN = ')'
    COLON = ':'
    SEMICOLON = ';'
    COMMA = ','
    EQ = '='
    EXCL = '!'
    PLUS = '+'
    MINUS = '-'
    AND = '&'
    OR = '|'
    LT = '<'
    MUL = '*'
    DIV = '/'
    GT = '>'
    DOT = '.'
    NOT = '~'
  ]
}


File ::= ClassDecl
ClassDecl ::= class identifier LBRACE Defs* Subroutines* RBRACE
private Defs ::= VarScope Type identifier SEMICOLON
VarScope ::= field | static
Type ::= int | boolean | char

Subroutines ::= FuncScope RetType identifier LPAREN Params RPAREN FuncBody
FuncScope ::= constructor | method | function
RetType ::= Type | void
Params ::= Type identifier ParamsTail*
ParamsTail ::= COMMA Type identifier
FuncBody ::= LBRACE LocalVars* Statements* FuncRet RBRACE
Statements ::= do identifier
FuncRet ::= return identifier?
LocalVars ::= var Type identifier IdTail* SEMICOLON
IdTail ::= COMMA identifier

and my flex file:

package ge.freeuni.jack.lang;

import com.intellij.lexer.FlexLexer;
import com.intellij.psi.tree.IElementType;
import ge.freeuni.jack.lang.psi.JackElementType;
import ge.freeuni.jack.lang.psi.JackTypes;
import com.intellij.psi.TokenType;

import static com.intellij.psi.TokenType.*;
import static ge.freeuni.jack.lang.psi.JackTypes.*;

%%

%{
  public _JackLexer() {
    this((java.io.Reader)null);
  }
%}

%public
%class _JackLexer
%implements FlexLexer
%unicode
%function advance
%type IElementType

EOL_WS = \n | \r | \r\n
LINE_WS = [\ \t]
WHITE_SPACE_CHAR = {EOL_WS} | {LINE_WS}
WHITE_SPACE = {WHITE_SPACE_CHAR}+

IDENTIFIER = [_a-zA-Z][_a-zA-Z0-9]*

%%

<YYINITIAL> {

"{" { return LBRACE; }
"}" { return RBRACE; }
"[" { return LBRACK; }
"]" { return RBRACK; }
"(" { return LPAREN; }
")" { return RPAREN; }
":" { return COLON; }
";" { return SEMICOLON; }
"," { return COMMA; }
"=" { return EQ; }
"!" { return EXCL; }
"+" { return PLUS; }
"-" { return MINUS; }
"&" { return AND; }
"|" { return OR; }
"<" { return LT; }
"*" { return MUL; }
"/" { return DIV; }
">" { return GT; }
"." { return DOT; }
"~" { return NOT; }

"class" { return CLASS; }
"field" { return FIELD; }
"static" { return STATIC; }
"int" { return INT; }
"char" { return CHAR; }
"boolean" { return BOOLEAN; }
"constructor" { return CONSTRUCTOR; }
"function" { return FUNCTION; }
"method" { return METHOD; }
"void" { return VOID; }
"var" { return VAR; }
"do" { return DO; }
"return" { return RETURN; }
{IDENTIFIER} { return IDENTIFIER; }
{WHITE_SPACE} { return WHITE_SPACE; }
}

[^] {return BAD_CHARACTER;}

 

As you might quickly spot, I haven't tried to incorporate string or comment detection yet; I'm just trying to get the completion part working first.

Any help will be appreciated!

12 comments

Hi,

Could you please provide the issue context?

  • What is the code you invoke completion in? I mean the Jack language code.
  • What is the caret position when you invoke completion?
  • What are expected tokens to be completed?
  • What are the completion contributor and providers implementations?

 


Karol Lewandowski

I tried to invoke completion in two places:

1. At the very beginning of the file, right after I start typing:

cl

Here I expected the class keyword to pop up (so the caret position is basically 0,1 or 0,2).

2. While defining a class variable (or anywhere inside a class body):

class Foo {
    fi // <- this should be: field int k;
}

Here, if I haven't typed anything yet, I expected the suggestions field, static, constructor, method and function; if invoked after typing fi, I expected only field to pop up.

This is my completion contributor code:

class JackCompletion : CompletionContributor() {
    init {
        extend(BASIC, psiElement(JackTypes.CLASS_DECL), object : CompletionProvider<CompletionParameters>() {
            override fun addCompletions(
                parameters: CompletionParameters,
                context: ProcessingContext,
                result: CompletionResultSet
            ) {
                result.addElement(LookupElementBuilder.create("class"))
            }
        })

        extend(BASIC, memberPattern(), object : CompletionProvider<CompletionParameters>() {
            override fun addCompletions(
                parameters: CompletionParameters,
                context: ProcessingContext,
                result: CompletionResultSet
            ) {
                for (elem in arrayOf("field", "static", "constructor", "method", "function"))
                    result.addElement(LookupElementBuilder.create(elem))
            }
        })
    }
}

However, I don't think the completion provider can do anything, given that every element is treated as psiElement(identifier) and every parent is a PsiErrorElement.


I see the issue now.

The IntellijIdeaRulezzz dummy identifier is inserted before completion in a copy of the edited document so that the PSI tree is correct and doesn't contain PsiError elements - it works for most cases.

In case you don't need any identifier at the current context or you need another dummy text, it is possible to change the default "IntellijIdeaRulezzz" dummy identifier to something else, depending on your needs. It can be done in CompletionContributor.beforeCompletion(CompletionInitializationContext context), e.g.:
context.setDummyIdentifier(""); // or other text
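As a minimal sketch of where that call could live, assuming a contributor class named JackCompletionContributor (a hypothetical name, not taken from the plugin in question). It swaps in the platform's trimmed dummy identifier constant; whether that or any other text suits the Jack grammar is something to verify by experiment:

```kotlin
import com.intellij.codeInsight.completion.CompletionContributor
import com.intellij.codeInsight.completion.CompletionInitializationContext

// Hypothetical class name, used only for illustration.
class JackCompletionContributor : CompletionContributor() {
    override fun beforeCompletion(context: CompletionInitializationContext) {
        // Replace the default dummy text with the variant that has no trailing space.
        // Any other string could be passed here instead, depending on the grammar.
        context.setDummyIdentifier(CompletionInitializationContext.DUMMY_IDENTIFIER_TRIMMED)
    }
}
```

This only changes the text inserted into the copy of the document; the element at the caret is still whatever the lexer produces for the typed prefix plus that text.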


Karol Lewandowski

Okay, I'm going to look into that, but can you explain why this happens in my case? Am I missing something, or do I have to do something else so that the exact token is the current element? I.e., at the start of the file, when I start typing, shouldn't the current element be psiElement(JackTypes.CLASS)?


Karol Lewandowski

I tried what you suggested; however, as expected, it only changed the text of the LeafPsiElement. I'm adding a screenshot of the execution state. This is where all the confusion comes from: for some reason even the first element (as you can see, I have only typed 'cl') is of type

IElementType IDENTIFIER = new JackTokenType("identifier");

and everything afterwards is also classified as psiElement(identifier). I can't wrap my head around why.

As further evidence, when I register basic completion for the identifier element, it suggests the content provided by the CompletionProvider, i.e.

extend(BASIC, psiElement(IDENTIFIER), ....some keywords...) <- works every time, anywhere in my Jack files

I even thought that using only YYINITIAL in the .flex file might be the problem, but highlighting works fine, and the Rust plugin also uses only the YYINITIAL state (outside of string literals and comments, of course).


I'm sorry, but I missed your key point. I thought you were getting an error element in the completion position, but you get the identifier, which is fine.

You get the identifier token type because the lexer treats any word that is not exactly a keyword as an identifier, and a partially typed keyword (plus the dummy identifier) is such a word. When registering completion providers, you should pass a pattern that matches the current context, not the context you want to have after completion.
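To illustrate, here is a simplified, self-contained model of how the keyword and IDENTIFIER rules from the flex file classify a whole word. It is not the generated JFlex lexer; the names classify, KEYWORDS and DUMMY are made up for this sketch, and DUMMY holds only the word part of the default dummy identifier:

```kotlin
// Keywords listed in the flex file above.
val KEYWORDS = setOf(
    "class", "field", "static", "int", "char", "boolean",
    "constructor", "function", "method"
)

// Same shape as the IDENTIFIER macro in the flex file.
val IDENTIFIER = Regex("[_a-zA-Z][_a-zA-Z0-9]*")

// Word part of the dummy identifier inserted at the caret during completion.
const val DUMMY = "IntellijIdeaRulezzz"

// A longest-match lexer takes the whole word as one token, so a keyword rule
// only wins when the word is exactly that keyword.
fun classify(word: String): String = when {
    word in KEYWORDS -> "keyword"
    IDENTIFIER.matches(word) -> "identifier"
    else -> "bad character"
}

fun main() {
    println(classify("class"))         // keyword
    println(classify("cl" + DUMMY))    // identifier
    println(classify("class" + DUMMY)) // identifier
    println(classify(DUMMY))           // identifier
}
```

So during completion the leaf at the caret is an identifier regardless of which keyword the user is in the middle of typing.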

In the first provider:

extend(BASIC, psiElement(JackTypes.CLASS_DECL), object : CompletionProvider<CompletionParameters>() {
    override fun addCompletions(
        parameters: CompletionParameters,
        context: ProcessingContext,
        result: CompletionResultSet
    ) {
        result.addElement(LookupElementBuilder.create("class"))
    }
})

You try to complete the "class" keyword, but the pattern matches a class declaration element that doesn't exist yet, so the pattern can never match. The pattern should describe the context in which you want this completion to be invoked. It's OK if the current position is an identifier from the completion point of view, as that is just a temporary state in the copy of the document used for completion. If a class can be defined only at the top level, you can create a pattern that matches a PSI element with a file parent. Maybe there is a better pattern for this; I don't know the language. This is just an example, and creating patterns usually requires experimenting, as completion will sometimes be invoked in a position that matches the pattern but is unexpected.

Similarly, in the second case: if you want to complete keywords that are part of a class, then try something like "psiElement().withTreeParent(psiElement(CLASS))".
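A rough sketch of what such context-based patterns could look like for the contributor posted earlier. KeywordProvider is a hypothetical helper introduced here, and the patterns assume that the generated JackTypes.IDENTIFIER and JackTypes.CLASS_DECL element types exist and that a CLASS_DECL node is still present in the tree while its body is incomplete, which depends on the grammar. Treat it as a starting point for experiments rather than a verified solution:

```kotlin
import com.intellij.codeInsight.completion.CompletionContributor
import com.intellij.codeInsight.completion.CompletionParameters
import com.intellij.codeInsight.completion.CompletionProvider
import com.intellij.codeInsight.completion.CompletionResultSet
import com.intellij.codeInsight.completion.CompletionType
import com.intellij.codeInsight.lookup.LookupElementBuilder
import com.intellij.patterns.PlatformPatterns.psiElement
import com.intellij.util.ProcessingContext
import ge.freeuni.jack.lang.psi.JackTypes

// Hypothetical helper: offers a fixed list of keywords.
private class KeywordProvider(private val keywords: List<String>) :
    CompletionProvider<CompletionParameters>() {
    override fun addCompletions(
        parameters: CompletionParameters,
        context: ProcessingContext,
        result: CompletionResultSet
    ) {
        keywords.forEach { result.addElement(LookupElementBuilder.create(it)) }
    }
}

class JackCompletion : CompletionContributor() {
    init {
        // The position is the identifier leaf built from the typed prefix plus dummy text.
        // Assumption: it sits somewhere below a CLASS_DECL node.
        val insideClass = psiElement(JackTypes.IDENTIFIER)
            .inside(psiElement(JackTypes.CLASS_DECL))
        extend(
            CompletionType.BASIC, insideClass,
            KeywordProvider(listOf("field", "static", "constructor", "method", "function"))
        )

        // Any identifier leaf that is not below a class declaration is treated as top level.
        val topLevel = psiElement(JackTypes.IDENTIFIER)
            .andNot(psiElement().inside(psiElement(JackTypes.CLASS_DECL)))
        extend(CompletionType.BASIC, topLevel, KeywordProvider(listOf("class")))
    }
}
```

Note that insideClass is deliberately loose: it would also match at the class name itself, for example, so it would likely need narrowing once the PSI tree has been inspected at each caret position.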


Karol Lewandowski

Thanks, now I understand what I have to do.

Can I ask one last question? 

When I view the structure of Kotlin files with PsiViewer, I noticed the following: if I am inside a class, as soon as I type "val", a new element named PROPERTY is added to the tree, with psiElement(val) and a PsiErrorElement as its children. In my case, when I start declaring a new property in Jack, it doesn't become a single element until I type out the whole declaration (field int foo;). So how is the first behavior achieved? It would be a lot easier to provide completion with that kind of structure.


This is a very good and important question.

You can add "pins" to your grammar rules in the BNF file. Pins allow you to match a given rule even if it's not finished or is invalid, e.g., the property may look like:

val propertyName: String = "value"

It may be defined as:

property ::= VAL property_name [COLON type] [ASSIGN value] {
  pin=1
  recoverWhile=property_recover_rule
}

The "pin=1" attribute means that as soon as the first item ("val") is matched, the parser commits to the "property" rule: the element is included in the built syntax tree even if the rest of the rule fails to match, and the missing or incorrect parts are reported as error elements. After the rule has been processed, tokens are skipped for as long as the "recoverWhile" predicate matches, so that parsing can resume at a sensible point. I guess it's not easy to understand this with a single example, so I recommend reading the Grammar-Kit documentation, especially:


Karol Lewandowski

Thank you so much. I had read the documentation, but I thought pin and recovery were only for error handling. With your example everything is clear now, thank you yet again!


Karol Lewandowski

Sorry for bothering you again; I held off on starting a new post (maybe that wasn't a good idea).

After your clarifications I managed to get a better PSI tree, but something is still bugging me. It probably has to do with the "recoverWhile" rule; I couldn't quite figure out its usage from the documentation, although I got the hang of pins. So my question now is this: in Kotlin's PSI tree, an empty class looks like this:

File
-- Package
-- imports
-- class
---- psiElement(class)
---- psiElement(identifier)
---- class_body
------ psi( { )
------ psi( } )

When I start typing inside this class, a new PsiErrorElement appears between the "{" and "}" elements. When I do the same in my Jack language (basically the same structure; I can provide screenshots if necessary), the right brace "}" drops out of the CLASS_BODY element and, depending on my recoverWhile rule, becomes a child of either the CLASS element or even the FILE element. Once the pin is reached for Jack's property rule, the right brace pops back in where it belongs and everything looks normal. Am I missing something in the pin rules, or does it have to do with "recoverWhile"? How is the Kotlin behavior achieved?

P.S. I can also provide my updated .bnf file for more context.


My first question is: does it matter in your case where the right brace is? If not, I suggest not trying to reflect another language's behavior but constructing grammar that will work for your features. If the completion works as expected in the current state, I would leave it as is.

The Kotlin parser is implemented manually from scratch (https://github.com/JetBrains/kotlin/blob/master/compiler/psi/src/org/jetbrains/kotlin/parsing/KotlinParsing.java), so it may be hard to achieve exactly the same behavior with Grammar-Kit.

If you need this behavior for some reason, please share the updated BNF and provide screenshots of the code and PSI tree. It will be much easier to understand what is happening.


Oh, I think you are right. I just assumed I was doing something wrong when that happened. The right brace doesn't need to behave like that at all.