Different types of lexers

Hello!

I am wondering about the different types of lexers.

They all implement the Lexer interface. I tried to write my own lexer that returns the whole text as one token, but I'm getting an out-of-memory exception, I suppose because of some value I return. So the first question is: what is the scheme of lexer usage? Which return values are passed back to the lexer?

The second thing I noticed is lexers built on top of other lexers. JavaScript lexing uses LayeredLexer. Is there any documentation? Ruby and some internal languages use MergingLexer. What are these for, and in which situations should they be used?

Thanks,
Jay

2 comments

- A lexer should return a null token type when it reaches the end of the input.
- MergingLexerAdapter is used when some tokens from the base lexer should be merged during lexing (sometimes it is hard to write a regular expression for this in the base lexer).
- Sometimes a single token in the base language is a set of tokens in another language. In that case a layered lexer is used: it activates the necessary lexer while that token is being processed and switches it off once the token has been processed.
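To make the first two points concrete, here is a minimal sketch in plain Java. It does not use the real platform classes: `MiniLexer`, `WholeTextLexer`, `CharLexer`, `MergingLexer` and `LexerDriver` are all hypothetical names invented for this illustration, and the interface is a cut-down stand-in for the real Lexer interface. It shows a caller that keeps asking for tokens until the token type is null, a whole-text lexer that honors that contract, and a wrapper that merges adjacent tokens of one type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the platform's lexer interface.
interface MiniLexer {
    void start(CharSequence buffer);
    String getTokenType();   // null means "end of input"
    int getTokenStart();
    int getTokenEnd();
    void advance();
}

// Returns the whole buffer as one token, then reports end of input.
final class WholeTextLexer implements MiniLexer {
    private CharSequence buffer = "";
    private boolean consumed = true;

    public void start(CharSequence buffer) {
        this.buffer = buffer;
        this.consumed = buffer.length() == 0;
    }
    public String getTokenType() { return consumed ? null : "TEXT"; }
    public int getTokenStart() { return consumed ? buffer.length() : 0; }
    public int getTokenEnd() { return buffer.length(); }
    public void advance() { consumed = true; }
}

// Base lexer for the merging example: one token per character.
final class CharLexer implements MiniLexer {
    private CharSequence buffer = "";
    private int pos;

    public void start(CharSequence buffer) { this.buffer = buffer; this.pos = 0; }
    public String getTokenType() {
        if (pos >= buffer.length()) return null;
        return Character.isWhitespace(buffer.charAt(pos)) ? "WS" : "CHAR";
    }
    public int getTokenStart() { return pos; }
    public int getTokenEnd() { return Math.min(pos + 1, buffer.length()); }
    public void advance() { if (pos < buffer.length()) pos++; }
}

// Wraps a base lexer and merges runs of adjacent tokens of one type.
final class MergingLexer implements MiniLexer {
    private final MiniLexer base;
    private final String mergeType;
    private String type;
    private int start, end;

    MergingLexer(MiniLexer base, String mergeType) {
        this.base = base;
        this.mergeType = mergeType;
    }
    public void start(CharSequence buffer) { base.start(buffer); fetch(); }
    public String getTokenType() { return type; }
    public int getTokenStart() { return start; }
    public int getTokenEnd() { return end; }
    public void advance() { fetch(); }

    // Reads one token from the base lexer, extending it over a mergeable run.
    private void fetch() {
        type = base.getTokenType();
        if (type == null) return;          // propagate end of input
        start = base.getTokenStart();
        end = base.getTokenEnd();
        base.advance();
        while (type.equals(mergeType) && mergeType.equals(base.getTokenType())) {
            end = base.getTokenEnd();
            base.advance();
        }
    }
}

// The caller's side of the contract: loop until the token type is null.
final class LexerDriver {
    static List<String> tokenize(MiniLexer lexer, CharSequence text) {
        List<String> out = new ArrayList<>();
        lexer.start(text);
        while (lexer.getTokenType() != null) {
            out.add(lexer.getTokenType() + ":"
                    + text.subSequence(lexer.getTokenStart(), lexer.getTokenEnd()));
            lexer.advance();
        }
        return out;
    }
}
```

With a driver loop like this, a lexer that never reports a null token type would keep producing tokens forever, which would be consistent with the out-of-memory error described in the question. Calling `LexerDriver.tokenize(new MergingLexer(new CharLexer(), "CHAR"), "ab c")` merges the two leading characters into one `CHAR` token while leaving the whitespace token separate.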

--
Best regards,
Maxim Mossienko
IntelliJ Labs / JetBrains Inc.
http://www.intellij.com
"Develop with pleasure!"


Thank you very much!

I'll try to add a lexer that switches between several lexers depending on start and stop tokens.
