Reusing an existing language parser provided by a native library

Answered

Created July 27, 2022 10:20

I have my own language, and I would like to develop a plugin to support it on the IDEA platform. I have read the documentation, but I already have my own parser/lexer and an API for manipulating my own AST, all packaged as a native library (DLL/SO).

How can I map my own AST to the IDEA AST (node by node: my_node -> ASTNode) without a scanning and parsing stage? I am familiar with mapping native code and structs to Java; my question is about where this fits into the platform architecture.

Another question: how can I get from a PSI element/ASTNode to the full underlying text buffer, so that I can retrieve text regions by my own coordinates? Do I need to save a reference to the buffer during the scanning/parsing stage?

Thanks

4 comments

Karol Lewandowski

Created July 28, 2022 09:38

Hi,

It sounds similar to what the ANTLR4 IntelliJ Adaptor does (except that it doesn't operate on native libraries). I suggest checking this project and considering creating your own adaptors that adapt the program structure provided by your native libraries to IntelliJ PSI.
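As a rough illustration of the adaptor idea, a finished native tree can be replayed depth-first as open/leaf/close events that a tree builder on the IDE side consumes. This is only a sketch: NativeNode, TreeSink, LoggingSink, and TreeReplayer are hypothetical names invented here, not IntelliJ or ANTLR API, and the native binding is assumed to expose each node's kind and children.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a node returned by the native parser binding. */
final class NativeNode {
    final String kind;
    final List<NativeNode> children = new ArrayList<>();

    NativeNode(String kind, NativeNode... kids) {
        this.kind = kind;
        for (NativeNode k : kids) {
            children.add(k);
        }
    }

    boolean isLeaf() {
        return children.isEmpty();
    }
}

/** Hypothetical event sink; in a plugin, a tree builder would play this role. */
interface TreeSink {
    void open(String kind);   // a composite node starts
    void leaf(String kind);   // a leaf node is consumed
    void close(String kind);  // the composite node ends
}

/** Records events as strings so the replay order can be inspected. */
final class LoggingSink implements TreeSink {
    final List<String> events = new ArrayList<>();

    public void open(String kind)  { events.add("open " + kind); }
    public void leaf(String kind)  { events.add("leaf " + kind); }
    public void close(String kind) { events.add("close " + kind); }
}

final class TreeReplayer {
    /** Walks the native tree depth-first and reports each node to the sink. */
    static void replay(NativeNode node, TreeSink sink) {
        if (node.isLeaf()) {
            sink.leaf(node.kind);
            return;
        }
        sink.open(node.kind);
        for (NativeNode child : node.children) {
            replay(child, sink);
        }
        sink.close(node.kind);
    }
}
```

The events arrive in document order, so a sink only needs to track the currently open composite nodes, not the native tree itself.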

Regarding the second question, I'm not sure I understand it. Once a PSI tree is built, you can access the underlying text with PsiElement.getText().
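If "own coordinates" means line/column pairs reported by the native parser, they would first have to be converted into character offsets into the file text. A minimal sketch of that conversion, assuming 1-based lines and columns, '\n' line separators, and columns counted in Java chars (a native parser that counts bytes would need an extra conversion step); LineIndex is a hypothetical helper, not IntelliJ API:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: maps 1-based (line, column) pairs to 0-based offsets. */
final class LineIndex {
    private final String text;
    private final List<Integer> lineStarts = new ArrayList<>();

    LineIndex(String text) {
        this.text = text;
        lineStarts.add(0); // line 1 starts at offset 0
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                lineStarts.add(i + 1); // the next line starts after the separator
            }
        }
    }

    /** Returns the 0-based offset of the given 1-based line and column. */
    int offset(int line, int column) {
        if (line < 1 || line > lineStarts.size() || column < 1) {
            throw new IndexOutOfBoundsException("line " + line + ", column " + column);
        }
        int result = lineStarts.get(line - 1) + (column - 1);
        if (result > text.length()) {
            throw new IndexOutOfBoundsException("line " + line + ", column " + column);
        }
        return result;
    }

    /** Returns the text from the start position up to, but excluding, the end position. */
    String region(int startLine, int startColumn, int endLine, int endColumn) {
        return text.substring(offset(startLine, startColumn), offset(endLine, endColumn));
    }
}
```

For example, with the text "let x = 1\nprint x\n", region(2, 1, 2, 6) returns "print". In a plugin, the text passed to the constructor could come from the containing file of a PSI element.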

Yurivolo

Created July 31, 2022 15:45

Thank you for the answer. I did roughly what you advised. However, I am using an SGLR (scannerless generalized LR) parser, which has no distinct lexer. For now my dummy lexer produces a single lexeme: the whole text. I also get no notifications about parser state; after parsing I have only the parse tree, which I can traverse and whose node attributes I can read. I need to build the PSI AST from mine. I have built the tree (I can see it in PsiViewer), but the source text is not connected to my nodes. Which interfaces do I need to implement to integrate my tree into the IDEA infrastructure?

Thank you.

Karol Lewandowski

Created August 03, 2022 07:45

Hi,

The lexer implementation is required.

There is no good solution for your case, as lexers are executed synchronously in the UI thread to support many IDE features and must be as fast as possible. Implementing the lexer on top of the parser (which would be a workaround) will degrade IDE performance and is not recommended. The recommended approach is to implement a lexer in addition to your scannerless parser.
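To make the workaround concrete: IntelliJ expects a lexer's tokens to cover the file text contiguously, so tokens derived from a finished parse tree would need filler tokens for any text that lies between leaves. The following is a minimal sketch, assuming the native tree can report each leaf's start and end offsets; Token, LeafTokens, and the "GAP" type are hypothetical names, not IntelliJ API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical token: a type name plus a half-open [start, end) offset range. */
final class Token {
    final String type;
    final int start;
    final int end;

    Token(String type, int start, int end) {
        this.type = type;
        this.start = start;
        this.end = end;
    }
}

final class LeafTokens {
    /**
     * Turns the leaf ranges of a parse tree into a token list that covers
     * [0, textLength) without gaps. Text not claimed by any leaf (for example
     * whitespace) becomes a "GAP" token.
     */
    static List<Token> cover(int textLength, List<Token> leaves) {
        List<Token> sorted = new ArrayList<>(leaves);
        sorted.sort(Comparator.comparingInt(t -> t.start));

        List<Token> out = new ArrayList<>();
        int pos = 0; // end offset of the last emitted token
        for (Token leaf : sorted) {
            if (leaf.start < pos || leaf.end <= leaf.start || leaf.end > textLength) {
                throw new IllegalArgumentException("overlapping, empty, or out-of-range leaf");
            }
            if (leaf.start > pos) {
                out.add(new Token("GAP", pos, leaf.start)); // fill the hole before this leaf
            }
            out.add(leaf);
            pos = leaf.end;
        }
        if (pos < textLength) {
            out.add(new Token("GAP", pos, textLength)); // trailing text after the last leaf
        }
        return out;
    }
}
```

The sketch also shows where the cost comes from: the whole parse has to finish before the first token is available, whereas a standalone lexer can start producing tokens immediately.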

Yurivolo

Created August 06, 2022 10:48

A lexer for what? To my lexer, all input is a set of tokens separated by whitespace. I have implemented such a lexer, but obviously it is not synchronized with the PSI AST that I mapped from my own, and it yields garbage leaves that my original AST already contains. All I need is to map my AST to the PSI tree. My AST has all the information PSI needs to navigate over the input text or the AST, and I can return it on demand at any time. The SGLR implementation I use is written in C and has a tiny memory and performance overhead.
