Looking for Lexer doc

Hello, I'm working on coloring for Clojure, so I'm implementing a Lexer.

Where is the documentation on the various interfaces that need to be
implemented for a custom language? I assume there should be JavaDoc
somewhere.

Thanks
Peter


What do you mean by various interfaces?

And what version are you targeting? 7 or 8?


Hi Jay, thanks for helping

I am building my custom language piece by piece. For example, now I am
working on brace matching. Next I will do coloring, folding, references
etc.

As I understand it, each feature is implemented as a Factory of some
sort, that is registered as an Extension in plugin.xml.

These Factories share common components such as the Lexer and the
Parser. These components are also implementations of appropriate
interfaces.

So I have two questions:

1) Where is the documentation on these interfaces? I found the Javadoc
included with the code, but it is fairly sparse. For example, the Lexer
has a getState() method, but the JavaDoc just says "return the state".
Is the state just internally meaningful to the Lexer, or do other
components make use of it?

2) How do all these components fit together? How do they find each
other? For example, the BraceMatcher is registered as an extension, but
requires the Lexer. How does it find it? Is there a document that
describes the architecture of a custom language plugin and how
everything finds what it needs?

I am using the Groovy plugin source code as a reference, but it is large
and complex, and it uses many undocumented helper classes, so it is hard
to find this information.

Again, thanks
Peter



Oops, sorry, I forgot. I am targeting version 8 and beyond.

Thanks
P




Hi Peter

Well, first of all, the link that Yann gave is truly useful. It is a little outdated, but it still conveys the right idea of what is going on.

Regarding your questions:
1) There is no good, thorough documentation. It does seem strange to me that you want to implement the Lexer yourself, though. Usually it is a JFlex-generated Lexer built with the IDEA skeleton.
2) BraceMatcher does not need the Lexer. It just defines which token types form pairs to highlight. If no such token types occur in the file, nothing fails.
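
To make point 2 concrete, here is a minimal, self-contained sketch. It does not use the IDEA API: `TokenType`, `PAIRS` and `findMatch` are hypothetical names invented for illustration. The idea it tries to show is that the brace matcher only contributes a table of paired token types, while some other component (here a stand-in for the editor) works on tokens that have already been lexed.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;

public class BracePairSketch {
    // Hypothetical token types for a Clojure-like language (not IDEA classes).
    enum TokenType { LPAREN, RPAREN, LBRACKET, RBRACKET, SYMBOL }

    // All the "brace matcher" declares: which opening type pairs with which closing type.
    static final Map<TokenType, TokenType> PAIRS = Map.of(
            TokenType.LPAREN, TokenType.RPAREN,
            TokenType.LBRACKET, TokenType.RBRACKET);

    // Stand-in for the editor side: given token types that were lexed elsewhere,
    // return the index of the closer matching the opener at index `open`, or -1.
    static int findMatch(List<TokenType> tokens, int open) {
        Deque<TokenType> expected = new ArrayDeque<>();
        for (int i = open; i < tokens.size(); i++) {
            TokenType t = tokens.get(i);
            TokenType closer = PAIRS.get(t);
            if (closer != null) {
                expected.push(closer);               // an opener: remember its closer
            } else if (PAIRS.containsValue(t)) {
                if (expected.isEmpty() || expected.pop() != t) return -1; // mismatch
                if (expected.isEmpty()) return i;    // the opener at `open` is now closed
            } else if (expected.isEmpty()) {
                return -1;                           // the start token is not a brace
            }
        }
        return -1;                                   // unbalanced, or no braces at all
    }
}
```

If the token list contains no paired types at all, `findMatch` simply returns -1, which mirrors the remark that nothing fails when the file has no such tokens.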

In general, the custom language plugin looks like this:
— you register a file type
— you return your Language from that file type
— based on the Language ID, your extensions are loaded
— to highlight the file, the syntax highlighter is loaded. Here is Groovy's definition:

<syntaxHighlighter key="Groovy" implementationClass="org.jetbrains.plugins.groovy.highlighter.GroovySyntaxHighlighter"/>


 

GroovySyntaxHighlighter has a getHighlightingLexer method that returns a Lexer. The Lexer breaks the file into tokens, and GroovySyntaxHighlighter defines the color for each token type.
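
The division of labor described above can be sketched roughly like this. It is a toy, not the IDEA API: `lex`, `Token`, `COLORS` and `colorOf` are made-up names, the color strings are placeholders, and the hand-written lexer only stands in for what a JFlex-generated one would do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class HighlightSketch {
    // Hypothetical token types (not IDEA classes).
    enum TokenType { PAREN, NUMBER, SYMBOL, WHITESPACE }

    // One lexer token: its type and the text range [start, end) it covers.
    static final class Token {
        final TokenType type;
        final int start;
        final int end;
        Token(TokenType type, int start, int end) {
            this.type = type; this.start = start; this.end = end;
        }
    }

    // The lexer's job: break the text into tokens. It knows nothing about colors.
    static List<Token> lex(String text) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int start = i;
            char c = text.charAt(i);
            TokenType type;
            if (c == '(' || c == ')') {
                type = TokenType.PAREN;
                i++;
            } else if (Character.isWhitespace(c)) {
                type = TokenType.WHITESPACE;
                while (i < text.length() && Character.isWhitespace(text.charAt(i))) i++;
            } else {
                // Anything up to the next paren or whitespace is one token.
                type = Character.isDigit(c) ? TokenType.NUMBER : TokenType.SYMBOL;
                while (i < text.length() && !Character.isWhitespace(text.charAt(i))
                        && text.charAt(i) != '(' && text.charAt(i) != ')') i++;
            }
            out.add(new Token(type, start, i));
        }
        return out;
    }

    // The highlighter's job: map a token type to a color (placeholder names).
    static final Map<TokenType, String> COLORS = Map.of(
            TokenType.PAREN, "gray",
            TokenType.NUMBER, "blue",
            TokenType.SYMBOL, "black");

    static String colorOf(TokenType type) {
        return COLORS.getOrDefault(type, "default");
    }
}
```

Lexing `(+ 1 x)` with this sketch and calling `colorOf` on each token's type gives one color per text range, which is all the editor needs in order to paint the file.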

— to parse the file, IDEA asks the ParserDefinition. Again, from Groovy's plugin.xml:

<lang.parserDefinition language="Groovy" implementationClass="org.jetbrains.plugins.groovy.lang.parser.GroovyParserDefinition"/>


 

The ParserDefinition is first asked to return a Lexer, then the Parser, which is fed the Lexer's output (a sequence of tokens). The Parser's output is the PSI tree, which has the Lexer tokens as its leaf elements (or something like this).
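
As a rough illustration of that Lexer to Parser to tree flow, here is a small self-contained sketch for s-expressions. It is not PSI and not the IDEA API: `Node`, `lex`, `parse` and `leaves` are hypothetical names, and unlike a real plugin this toy drops whitespace instead of keeping it in the tree.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParsePipelineSketch {
    // Tree node: a leaf wraps exactly one lexer token; a composite has children.
    static final class Node {
        final String tokenText;                      // non-null only for leaves
        final List<Node> children = new ArrayList<>();
        Node(String tokenText) { this.tokenText = tokenText; }
        boolean isLeaf() { return tokenText != null; }
    }

    // Step 1, the lexer: text in, flat token sequence out (whitespace skipped).
    static List<String> lex(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile("[()]|[^\\s()]+").matcher(text);
        while (m.find()) tokens.add(m.group());
        return tokens;
    }

    // Step 2, the parser: token sequence in, tree out.
    static Node parse(List<String> tokens) {
        Node root = new Node(null);
        int pos = 0;
        while (pos < tokens.size()) pos = parseInto(root, tokens, pos);
        return root;
    }

    // Adds nodes to `parent` until a ")" closes it; returns the next position.
    private static int parseInto(Node parent, List<String> tokens, int pos) {
        while (pos < tokens.size()) {
            String t = tokens.get(pos);
            if (t.equals("(")) {
                Node list = new Node(null);
                list.children.add(new Node(t));      // the paren itself stays a leaf
                pos = parseInto(list, tokens, pos + 1);
                parent.children.add(list);
            } else if (t.equals(")")) {
                parent.children.add(new Node(t));
                return pos + 1;
            } else {
                parent.children.add(new Node(t));
                pos++;
            }
        }
        return pos;
    }

    // Collects leaf texts left to right, to check that leaves are the lexer tokens.
    static List<String> leaves(Node node) {
        List<String> out = new ArrayList<>();
        collect(node, out);
        return out;
    }

    private static void collect(Node node, List<String> out) {
        if (node.isLeaf()) out.add(node.tokenText);
        else for (Node child : node.children) collect(child, out);
    }
}
```

For any input, `leaves(parse(lex(text)))` equals `lex(text)` in this sketch, which is the "tokens end up as leaf elements" property in miniature.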

For generating your Lexer, see the link that Yann gave.
