How to produce a slightly different psiTree from astTree?


I am developing a custom language with JFlex and Grammar-Kit. I am struggling with a part of the grammar that is somewhat dynamic, and I have not found out how to handle it.

Let me explain with an example:

actor ::= ACTOR actorDef
actorDef ::= ID

connection ::= CONNECTION actorRef _someOtherStuff_
actorRef ::= ID

where ACTOR, CONNECTION and ID are IElementTypes, and _someOtherStuff_ stands for further content that I omitted. ID holds the identifier string of a variable.

Where is the dynamic part? When the ID of actorRef is the same as in a preceding actorDef, everything is okay. Example:

CONNECTION User _someOtherStuff_

My PsiTree looks like this

            ID (User)
            ID (User)

But when the ID of actorRef is not the same as in a preceding actorDef, I have to change the actorRef into an actorDef. Example:

CONNECTION UserB _someOtherStuff_

and the PsiTree should look like that

            ID (User)
            ID (User)

Well, that's what I want. I tried two different ways to achieve that.

First, I hooked into MyParserDefinition#createElement at the point where an actorRef is being created. There I searched the PSI tree for an actorDef and compared the IDs. If the IDs were not the same, I returned a new CompositeElement with type actorDef. That did not work, because the new node lacked the other attributes such as children and parent. Moreover, I could not copy that information from the original node to the newly created one, because I ran into StackOverflowErrors (those operations call createElement as well).

Second, I tried to use the external modifier from Grammar-Kit to hook into the parsing. But I could not find out how it works, and/or I'm too stupid to get the point.

So, what is the right approach here? My thinking was simply that I have an AST and need to create a slightly different PsiTree from it.
Does anyone have a hint for my problem, or understand my problem at all, or even read the whole question up to here ;)?


If I understand correctly, you want to switch the elementType of the ID node from actorRef to actorDef if and only if an actor with that ID has not already been parsed.

I would do the following:

{
  stubParserClass = "some.package.MyParserUtilThatExtendsGPUB"
}
actor ::= ACTOR actorDef
actorDef ::= <<rememberID>> ID
connection ::= CONNECTION actorRefOrDef _someOtherStuff_
private actorRefOrDef ::= (<<isKnownID>> actorRef | actorDef) // &<<isKnownID>> may be used as well
actorRef ::= ID

Then I would add the rememberID and isKnownID methods to the parser util class MyParserUtilThatExtendsGPUB as follows:

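import com.intellij.lang.PsiBuilder;
import com.intellij.openapi.util.Key;

import java.util.HashSet;
import java.util.Set;

// ID is the token type from your generated element types (import not shown).
public class MyParserUtilThatExtendsGPUB extends copied.from.grammar.kit.GeneratedParserUtilBase {

  // Remembers the identifier at the current position; always succeeds and consumes nothing.
  public static boolean rememberID(PsiBuilder builder, int level) {
    if (builder.getTokenType() == ID) getKnownIDs(builder).add(builder.getTokenText());
    return true;
  }

  // Succeeds only if the current token is an ID that was remembered earlier.
  public static boolean isKnownID(PsiBuilder builder, int level) {
    return builder.getTokenType() == ID && getKnownIDs(builder).contains(builder.getTokenText());
  }

  private static final Key<Set<String>> KNOWN_IDS = Key.create("KNOWN_IDS");

  // The set is stored in the builder's user data, so it lives for one parse.
  private static Set<String> getKnownIDs(PsiBuilder builder) {
    Set<String> set = builder.getUserDataUnprotected(KNOWN_IDS);
    if (set == null) builder.putUserDataUnprotected(KNOWN_IDS, set = new HashSet<String>());
    return set;
  }
}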


Now every actorDef will remember its ID, and each following actorRefOrDef will choose one way or the other of parsing its ID.
External rules and expressions can solve any problem you can possibly solve manually :)
Sometimes I like to wrap external expressions in a predicate, like &<<isKnownID>>, for better PSI generation.
This way it is more apparent to the generator that this external processing does not consume any tokens.
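
To make the remember/check behaviour concrete, here is a minimal stand-alone sketch in plain Java. It does not use the IntelliJ or Grammar-Kit APIs at all: the class ActorIdSketch, its classify method and the {keyword, id} statement format are made up for illustration, and the set of known IDs is a plain field instead of builder user data. It only models which alternative of actorRefOrDef would be taken for each ID.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the remember/check idea; not Grammar-Kit API.
public class ActorIdSketch {
    // Plays the role of the set kept in the PsiBuilder's user data.
    private final Set<String> knownIds = new HashSet<>();

    // Mirrors rememberID: records the identifier and never fails.
    boolean rememberId(String id) {
        knownIds.add(id);
        return true;
    }

    // Mirrors isKnownID: true only for identifiers recorded earlier.
    boolean isKnownId(String id) {
        return knownIds.contains(id);
    }

    // Each statement is {keyword, id} with keyword "ACTOR" or "CONNECTION".
    // Returns, per statement, which rule the ID would be parsed as.
    public static List<String> classify(String[][] statements) {
        ActorIdSketch state = new ActorIdSketch();
        List<String> result = new ArrayList<>();
        for (String[] statement : statements) {
            String keyword = statement[0];
            String id = statement[1];
            if (keyword.equals("ACTOR")) {
                // actor ::= ACTOR actorDef, and actorDef remembers its ID
                state.rememberId(id);
                result.add("actorDef(" + id + ")");
            } else if (state.isKnownId(id)) {
                // first alternative: <<isKnownID>> actorRef
                result.add("actorRef(" + id + ")");
            } else {
                // second alternative: actorDef, which remembers the ID as well
                state.rememberId(id);
                result.add("actorDef(" + id + ")");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] input = {
            {"ACTOR", "User"},
            {"CONNECTION", "User"},
            {"CONNECTION", "UserB"},
            {"CONNECTION", "UserB"},
        };
        // prints [actorDef(User), actorRef(User), actorDef(UserB), actorRef(UserB)]
        System.out.println(classify(input));
    }
}
```

Note that the first CONNECTION UserB yields an actorDef and the second one an actorRef, because actorDef itself calls rememberID in the grammar above.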


Hi Gregory
That's exactly what I was looking for. Apparently, I overlooked the <<XY>> stuff in the documentation...
Thank you very much and greetings

