Grammar-Kit language support - JFlex or BNF

Answered

Created August 13, 2024 20:45

Hi,

So I've been working on a language support plugin developed with Grammar-Kit. It's the first IntelliJ plugin I've ever developed, so I'm kind of diving in at the deep end here. Unfortunately, I don't have any experience whatsoever with BNF or JFlex.

I've tried to follow the tutorial, but it's kind of ‘raw’ in the sense that it just instructs you what to do but doesn't really explain why. And this is where I'm having trouble getting my head around certain things.

So the language I'm planning to support with this plugin is the Grafana Alloy syntax. It's not that much of a language, so it's probably a good starting point before doing anything more complex.

https://grafana.com/docs/alloy/latest/get-started/configuration-syntax/

It also supports expressions, has constants that can be defined, and has predefined components, which in turn have their own structure with optional and required attributes. My plan is to implement code completion for all the different components.

Now, here's the deal. I have been able to get the BNF structure defined. When I run it, it nicely creates a PSI structure exactly as I expect it to be. It identifies what a component is, what the label of a component is, the component's blocks, the attributes within the blocks, the attribute name, the attribute value, etc.

What I have trouble understanding is what I should define in JFlex and to what extent I should implement further rules there.

For example, in my BNF I define what a COMPONENT is: it's a combination of a BLOCKNAME, BLOCKLABEL and BLOCKBODY. However, in the generated JFlex, this BLOCK isn't returned anywhere because it's defined as a rule, not a BNF token. When I look at, for example, the code completion implementation, it seems to rely on PSI elements, which I already have, so I guess I don't need to worry about that.

When I look at syntax highlighting, it does seem to rely on the types that are returned from the JFlex rules. So if I want to have a complete COMPONENT BLOCK colored differently, I would need to redefine what a block is in a JFlex rule so that I can return the appropriate type, even though it's already identifiable by the PSI elements? This seems like double work and doesn't make sense to me.

So I guess my question is: how do I decide which rules I need to implement in JFlex versus BNF? What thought process should I follow?

1 comment

Karol Lewandowski

Created August 21, 2024 11:54

Hi Raoul,

There is no duplication. Parsing is split into two phases:

1. Text tokenization.
2. Building a syntax tree.

The first one is handled by the lexer, which is implemented with JFlex. This phase simply splits the input text into tokens like numbers, identifiers, strings, boolean values, etc. It doesn't define any rules related to syntax.
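
To make the first phase concrete, here is a minimal sketch in plain Java. It is a hand-written stand-in for what a lexer does, not JFlex output and not the IntelliJ API: the class name `AlloyLexerSketch`, the token type names, and the regular expressions are all assumptions chosen for illustration, covering only a small slice of Alloy-like input. Note that it only classifies characters into tokens and knows nothing about components or blocks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlloyLexerSketch {
    // Hypothetical token types; a real plugin would return IElementType constants instead.
    enum TokenType { IDENTIFIER, STRING, NUMBER, LBRACE, RBRACE, EQ, DOT, BAD }

    static final class Token {
        final TokenType type;
        final String text;
        Token(TokenType type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + ":" + text; }
    }

    // One named group per token type, plus WS for whitespace, which is skipped.
    // The final BAD alternative catches any character no other rule accepts.
    private static final Pattern TOKEN = Pattern.compile(
            "(?<WS>\\s+)"
            + "|(?<IDENTIFIER>[A-Za-z_][A-Za-z0-9_]*)"
            + "|(?<STRING>\"(?:[^\"\\\\]|\\\\.)*\")"
            + "|(?<NUMBER>[0-9]+(?:\\.[0-9]+)?)"
            + "|(?<LBRACE>\\{)"
            + "|(?<RBRACE>\\})"
            + "|(?<EQ>=)"
            + "|(?<DOT>\\.)"
            + "|(?<BAD>.)",
            Pattern.DOTALL);

    // Splits the input into a flat token list; no syntax rules are applied here.
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            if (m.group("WS") != null) {
                continue; // whitespace separates tokens but is not emitted
            }
            for (TokenType type : TokenType.values()) {
                String text = m.group(type.name());
                if (text != null) {
                    tokens.add(new Token(type, text));
                    break;
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("prometheus.scrape \"default\" { interval = 15 }"));
    }
}
```

For the sample input, the result is a flat list of nine tokens (identifier, dot, identifier, string, left brace, identifier, equals sign, number, right brace). Nothing in that list says "this is a component"; that knowledge belongs to the second phase.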

The second phase builds a syntax tree from the tokens provided by the lexer. If the consumed tokens are valid, the syntax tree is built, and when an unexpected token is encountered, an error element is added. This phase is related to the language syntax.
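
The second phase can be sketched the same way. The following is a hand-written recursive-descent parser in plain Java, not the parser Grammar-Kit generates and not the PSI API. The rule names COMPONENT, BLOCKNAME, BLOCKLABEL and BLOCKBODY come from the question; everything else (the `AlloyParserSketch` class, the token types, the attribute rule, the optional label, and the one-token error recovery) is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class AlloyParserSketch {
    // Hypothetical token types, as a lexer would deliver them.
    enum TokenType { IDENTIFIER, STRING, NUMBER, LBRACE, RBRACE, EQ, DOT }

    static final class Token {
        final TokenType type;
        final String text;
        Token(TokenType type, String text) { this.type = type; this.text = text; }
    }

    // Tree node: leaves carry text, inner nodes carry children.
    static final class Node {
        final String name;
        final String text;
        final List<Node> children = new ArrayList<>();
        Node(String name, String text) { this.name = name; this.text = text; }
        @Override public String toString() {
            if (children.isEmpty()) return text == null ? name : name + "(" + text + ")";
            List<String> parts = new ArrayList<>();
            for (Node child : children) parts.add(child.toString());
            return name + "[" + String.join(", ", parts) + "]";
        }
    }

    private final List<Token> tokens;
    private int pos = 0;

    private AlloyParserSketch(List<Token> tokens) { this.tokens = tokens; }

    static Node parse(List<Token> tokens) { return new AlloyParserSketch(tokens).parseComponent(); }

    private boolean at(TokenType type) { return pos < tokens.size() && tokens.get(pos).type == type; }
    private TokenType peek(int offset) { return pos + offset < tokens.size() ? tokens.get(pos + offset).type : null; }
    private Token next() { return tokens.get(pos++); }
    private Node error(String message) { return new Node("ERROR", message); }

    // COMPONENT ::= BLOCKNAME BLOCKLABEL? BLOCKBODY   (optional label is an assumption)
    private Node parseComponent() {
        Node component = new Node("COMPONENT", null);
        component.children.add(parseBlockName());
        if (at(TokenType.STRING)) component.children.add(new Node("BLOCKLABEL", next().text));
        component.children.add(parseBlockBody());
        return component;
    }

    // BLOCKNAME ::= IDENTIFIER (DOT IDENTIFIER)*
    private Node parseBlockName() {
        if (!at(TokenType.IDENTIFIER)) return error("expected block name");
        StringBuilder name = new StringBuilder(next().text);
        while (at(TokenType.DOT)) {
            next();
            if (!at(TokenType.IDENTIFIER)) return error("expected identifier after '.'");
            name.append('.').append(next().text);
        }
        return new Node("BLOCKNAME", name.toString());
    }

    // BLOCKBODY ::= LBRACE ATTRIBUTE* RBRACE
    private Node parseBlockBody() {
        Node body = new Node("BLOCKBODY", null);
        if (!at(TokenType.LBRACE)) { body.children.add(error("expected '{'")); return body; }
        next();
        while (pos < tokens.size() && !at(TokenType.RBRACE)) body.children.add(parseAttribute());
        if (at(TokenType.RBRACE)) next(); else body.children.add(error("expected '}'"));
        return body;
    }

    // ATTRIBUTE ::= IDENTIFIER EQ (STRING | NUMBER)
    private Node parseAttribute() {
        boolean valueAhead = peek(2) == TokenType.STRING || peek(2) == TokenType.NUMBER;
        if (at(TokenType.IDENTIFIER) && peek(1) == TokenType.EQ && valueAhead) {
            Node attribute = new Node("ATTRIBUTE", null);
            attribute.children.add(new Node("ATTRIBUTE_NAME", next().text));
            next(); // skip '='
            attribute.children.add(new Node("ATTRIBUTE_VALUE", next().text));
            return attribute;
        }
        // Unexpected token: record an error node and consume it so parsing always advances.
        return error("unexpected token '" + next().text + "'");
    }

    public static void main(String[] args) {
        List<Token> tokens = List.of(
                new Token(TokenType.IDENTIFIER, "prometheus"), new Token(TokenType.DOT, "."),
                new Token(TokenType.IDENTIFIER, "scrape"), new Token(TokenType.STRING, "\"default\""),
                new Token(TokenType.LBRACE, "{"), new Token(TokenType.IDENTIFIER, "interval"),
                new Token(TokenType.EQ, "="), new Token(TokenType.NUMBER, "15"),
                new Token(TokenType.RBRACE, "}"));
        System.out.println(parse(tokens));
    }
}
```

Here the structure (component, name, label, body, attributes) only exists in the tree, while the input is still the same flat token list. A stray token inside the body ends up as an ERROR node in the tree rather than aborting the parse, which mirrors the error element described above.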
