[ANN] Grammar-Kit: generate parser from grammar
The Grammar-Kit plugin is now available in the plugin repository:
http://plugins.jetbrains.net/plugin/?id=6606
Source code and quick documentation:
https://github.com/JetBrains/Grammar-Kit
Purpose:
Editing/refactoring support for BNF-like grammars.
Readable PsiBuilder-based parser & PSI hierarchy generation.
Hello Gregory,
What is the best place to report problems with Grammar-Kit: the tracker on GitHub, which looks pretty unused (a single issue, which I created myself today), or the JetBrains YouTrack?
By the way, it's a great tool, and I'm really looking forward to using it to build a full-fledged parser for my R integration for IntelliJ (https://code.google.com/p/r4intellij/).
Best,
Holger Brandl
Where can I find notes about using the plugin in real life?
The plugin sources are not very helpful: the code has no comments.
So far I've found out that the plugin should be used as follows to create a custom language plugin:
1. Create a grammar file to define the PSI nodes.
2. Generate the PSI structure.
3. Create a lexer .flex file.
4. Generate the lexer using JFlex.
So I have the following questions about Grammar-Kit usage:
1. Where can I find recommendations on creating the base files used by the code that Grammar-Kit generates?
I'm trying to use the Grammar-Kit plugin sources as a basis for analysis. The PSI structure base files (BnfNamedElement, BnfCompositeElement, etc.) are more or less clear, and I've created similar classes, but I'm stuck on the parser. Judging by its name (and structure), the class 'org.intellij.grammar.parser.GeneratedParserUtilBase' is generated somehow. How can I create my own parser base class, which has to be used by the code the Grammar-Kit plugin generates?
2. Where can I find clearer documentation on what each attribute in a .bnf file means?
Thanks a lot.
There are quick documentation and HOWTO pages right on GitHub:
https://github.com/JetBrains/Grammar-Kit/blob/master/README.md
https://github.com/JetBrains/Grammar-Kit/blob/master/HOWTO.md
GeneratedParserUtilBase serves as the runtime and should simply be copied from the Grammar-Kit sources.
This is described on the documentation page.
Grammar-Kit itself is a real-life example of its own usage.
Feel free to ask more questions and I will expand documentation pages in the areas that are unclear.
I've found out that a token constant is not generated in the corresponding interface if it is not used in any rule. I defined all the tokens I'll use in the lexer but have implemented only part of the rules so far, so the tokens that are defined but not yet used are missing from the Tokens interface.
Is this a feature or a bug?
For now I've created a fake rule with all the tokens to force their generation.
Alexei, I also used a fake rule. But in the end (almost) all tokens should be used in PSI rules, so fake rules should become obsolete.
The latest binaries on GitHub have the following:
1. A "tokens" list attribute. To avoid rules that simply define tokens, do the following (recommended):
{
  // old way, still supported but may be dropped in future
  OP_MUL = "*"
  OP_DIV = "/"
  // new way, all the tokens will be generated
  tokens = [
    OP_MUL = "*"
    OP_DIV = "/"
  ]
}
2. Ctrl-Q (Quick Documentation) on an attribute shows its description.
Currently the descriptions are very limited, but I plan to expand them when I have more time.
Please help me with my grammar rules.
I think I don't understand how to use the recoverUntil attribute for a rule properly.
Here is my grammar: http://code.google.com/p/ice-framework-idea-plugin/source/browse/src/grammar/IceGrammar.bnf
What I want to achieve:
Given this (correct) code:
module {
  class A {
  };
  interface B {
  };
  class C {
  };
};
(PsiTree: Module [ Module body [ Class 'A' , Interface 'B', Class 'C' ]])
If I insert a space into the word 'interface', then class C is not parsed at all.
So I get the following PSI tree: Module [ Module body [ Class 'A' ]]
How should I fix the rules to get the following PSI tree in this situation?
Module [Module body [ Class 'A' , <possible dummy element>, Class 'C'] ]
Thanks a lot
Consider the following case:
repetition ::= ( element ) *
recoverUntil should always be used on the "element" and not on the "repetition" itself.
So in your case:
module_body ::= (module | data_definition | constant | forward_declaration ) * {recoverUntil=data_def_recover }
private data_def_recover ::= !(module_start | '[' | 'const' | 'class' | 'interface' | 'enum' | 'exception' | 'struct' | 'sequence' | 'dictionary' | '}' | ';')
should be changed to:
module_body ::= module_body_element *
private module_body_element ::= module | data_definition | constant | forward_declaration {recoverUntil=data_def_recover }
private data_def_recover ::= !('[' | 'class' | 'const' | 'dictionary' | 'enum' | 'exception' | 'interface' | 'module' | 'sequence' | 'struct' | '}' | ';')
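To illustrate why the placement matters, here is a rough sketch in plain Python. It is not Grammar-Kit's generated code and does not use its runtime; the function names, the token list, and the reduced stop set (only 'class' and 'interface') are assumptions made up for this example. It only mimics the idea: recovery attached to the repetition runs once, after the loop has already given up, while recovery attached to the element runs after each failed element, so the loop can continue.

```python
# Hypothetical, reduced recovery set: tokens at which skipping stops.
STOP_TOKENS = {"class", "interface"}


def parse_element(tokens, pos):
    """Try to parse `class NAME { } ;` or `interface NAME { } ;` at pos.

    Returns ((kind, name), new_pos) on success, or None on failure.
    """
    if pos + 5 <= len(tokens) and tokens[pos] in STOP_TOKENS:
        kind, name, lbrace, rbrace, semi = tokens[pos:pos + 5]
        if (lbrace, rbrace, semi) == ("{", "}", ";"):
            return (kind, name), pos + 5
    return None


def skip_until_stop(tokens, pos):
    """Advance pos until a stop token or the end of input is reached."""
    while pos < len(tokens) and tokens[pos] not in STOP_TOKENS:
        pos += 1
    return pos


def parse_body_recover_on_repetition(tokens):
    """Recovery on the repetition: the loop ends at the first bad element."""
    nodes, pos = [], 0
    while True:
        result = parse_element(tokens, pos)
        if result is None:
            break
        node, pos = result
        nodes.append(node)
    # Recovery runs only here, once the repetition is already finished,
    # so nothing after the bad element is parsed by this rule.
    skip_until_stop(tokens, pos)
    return nodes


def parse_body_recover_on_element(tokens):
    """Recovery on each element: bad input becomes an error node."""
    nodes, pos = [], 0
    while pos < len(tokens):
        result = parse_element(tokens, pos)
        if result is not None:
            node, pos = result
            nodes.append(node)
            continue
        # Consume at least one token so the loop always makes progress,
        # then skip to the next stop token and record what was skipped.
        start = pos
        pos = skip_until_stop(tokens, pos + 1)
        nodes.append(("error", " ".join(tokens[start:pos])))
    return nodes


if __name__ == "__main__":
    broken = "class A { } ; inter face B { } ; class C { } ;".split()
    print(parse_body_recover_on_repetition(broken))
    print(parse_body_recover_on_element(broken))
```

With the broken 'inter face' input, the first function yields only class A, while the second yields class A, an error node covering the skipped tokens, and class C, which corresponds to the tree asked for above.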
Unfortunately, the Grammar-Kit plugin's own grammar already uses the new syntax with tokens and cannot be built directly by the previous version (1.5) of the plugin without a workaround :)
I had to remove the tokens= string to try to make it work.
Other issues with building the plugin with the old version are:
- "import com.intellij.psi.PsiReference[];" in the head of generated classes
- error: no suitable method found for tokenize(StringTokenizer)
method StringUtil.tokenize(String,StringTokenizer) is not applicable
(actual and formal argument lists differ in length)
method StringUtil.tokenize(String,String) is not applicable
(actual and formal argument lists differ in length)
- error: cannot find symbol method performInplaceRefactoring(<null>)
So please build a new version of the plugin as soon as possible.
The binaries from https://github.com/JetBrains/Grammar-Kit/tree/master/binaries should already contain these changes.