Create PSI from binary file

Answered

I am working on creating a PSI tree from binary data. The binary files in question have a known format and contain any number of "records". The goal is to read the bytes of the file, creating the PSI elements/tree as we go, and then to support features such as the Structure tool window.

For example, the PSI tree for one of these files might simply have a root file element and 10 child record elements.

I understand the process of creating a plugin that supports a new language for a new file type, and of defining the lexer and parser for that language. However, I am hoping to accomplish this task without a lexer or parser (I am just reading bytes), going straight from the binary data to PSI.

Any suggestions on how best to implement this within the IntelliJ framework? I am particularly struggling with where in the process the reading of the binary data from the file (and the PSI creation) should occur, because it seems like I am trying to do manually something that is handled "automatically" when you have a lexer and parser for a language and a file containing ASCII text.

Should this be done in IFileElementType.parseContents? Any help is appreciated.

6 comments
Comment actions Permalink

IDEA is primarily a code text editor.

What will happen to the parsed PSI tree? Is it going to be displayed as text in the editor window? Edited, too? Saved back to binary?

BTW, about the purpose of the lexer and parser:

  • the lexer splits the text into a flat stream of tokens
  • the parser then converts the lexer output into the full AST with complex elements, on which the PSI tree is built.
1
Comment actions Permalink

"What will happen to the parsed PSI tree? Is it going to be displayed as text in the editor window? Edited, too? Saved back to binary?"

No editing, no saving back to binary, just viewing. Ultimately, you could imagine some type of graphical editor that is displayed when double-clicking one of these binary files in the project, where the content shown to the user is based on the PSI tree built from the data.

But perhaps an interim solution would be to just display the PSI tree as text in the standard text editor window.

0
Comment actions Permalink

To display binary data as readable text, you still need to convert it to a string.

To format (difficult!) or highlight text in IDEA, you'd need to build a PSI tree.

If there are PSI parsers / formatters available for this file format, it is easier to use those and output formatted text to a plain text editor.
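
As a rough illustration of the "convert it to a string" step, here is a minimal plain-Java sketch that renders raw bytes as hex-dump text that a plain text editor could show. The class name HexText and the line layout (8-digit offset, then the bytes) are made up for this example and are not part of any IntelliJ API.

```java
/** Hypothetical helper: renders raw bytes as readable hex-dump text. */
final class HexText {
    private HexText() {}

    /** Returns one line per chunk: an 8-digit hex offset, then up to bytesPerLine bytes. */
    static String toText(byte[] data, int bytesPerLine) {
        if (bytesPerLine <= 0) {
            throw new IllegalArgumentException("bytesPerLine must be positive");
        }
        StringBuilder sb = new StringBuilder();
        for (int start = 0; start < data.length; start += bytesPerLine) {
            // Offset of the first byte on this line.
            sb.append(String.format("%08x ", start));
            int end = Math.min(start + bytesPerLine, data.length);
            for (int i = start; i < end; i++) {
                // Mask to 0..255 so negative bytes are not sign-extended.
                sb.append(String.format(" %02x", data[i] & 0xff));
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```

A real viewer would presumably print decoded record fields rather than raw hex, but the text it produces could be fed to the editor in the same way.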

0
Comment actions Permalink

"If there are PSI parsers / formatters available for this file format, it is easier to use those and output formatted text to a plain text editor."

Right, and I actually started with this implementation because it made the most sense when I was looking at other implementations within the IntelliJ codebase. It also served as a kind of 'proof of concept' for this project.

But instead of going from binary data -> formatted text -> PSI, I am wondering about the possibility of going directly from binary data -> PSI, cutting out that middle step.

0
Comment actions Permalink

Having PSI for binary files is possible. For example, Java .class files are handled like this.

You should create your own FileViewProviderFactory extension (like ClassFileViewProviderFactory) that returns your own class extending SingleRootFileViewProvider, and override its createFile method to return your own PsiFileEx (or even PsiBinaryFileImpl) inheritor. That file can hold the parsed representation (obtained with the help of VirtualFile#contentsToByteArray) and expose any methods you like to query or operate on that representation.
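
To make the "parsed representation" part more concrete, here is a minimal plain-Java sketch of a model object that such a PsiFile inheritor might hold. It deliberately uses no IntelliJ classes, so it can be tried standalone; in a plugin the byte array would come from VirtualFile#contentsToByteArray. The class name RecordFileModel and the record layout (a 4-byte big-endian length followed by that many payload bytes) are assumptions for illustration only, since the real file format is not described in this thread.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the parsed representation kept inside a custom PsiFile. */
final class RecordFileModel {
    /** One record: where it starts in the file, plus its payload bytes. */
    static final class Record {
        final int offset;     // offset of the record's length prefix
        final byte[] payload;

        Record(int offset, byte[] payload) {
            this.offset = offset;
            this.payload = payload;
        }
    }

    private final byte[] content;
    private List<Record> records; // parsed lazily on first access

    RecordFileModel(byte[] content) {
        this.content = content.clone();
    }

    /** Returns the records, parsing the bytes the first time this is called. */
    synchronized List<Record> getRecords() {
        if (records == null) {
            records = Collections.unmodifiableList(parse(content));
        }
        return records;
    }

    // ASSUMED format: [4-byte big-endian length][payload] repeated until end of file.
    private static List<Record> parse(byte[] bytes) {
        List<Record> result = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(bytes); // ByteBuffer is big-endian by default
        while (buf.remaining() >= 4) {
            int offset = buf.position();
            int length = buf.getInt();
            if (length < 0 || length > buf.remaining()) {
                throw new IllegalArgumentException(
                        "Bad record length " + length + " at offset " + offset);
            }
            byte[] payload = new byte[length];
            buf.get(payload);
            result.add(new Record(offset, payload));
        }
        if (buf.hasRemaining()) {
            throw new IllegalArgumentException("Trailing bytes at offset " + buf.position());
        }
        return result;
    }
}
```

Keeping each record's offset alongside its payload is a design choice that could make it easier to map records to child elements later, for example as entries in the Structure tool window.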

1
