Enhance Spellchecker

Answered

Hi!

I would like to change the default behavior of the spellchecker.

In my opinion the spellchecker is too simplistic: it always checks the whole word as it is. It would be smarter if it also checked whether one (unknown) word consists of two or more (known) words.

Examples:

phpinfo => Typo (BUT php, info or phpInfo is OK)

emailserver => Typo (BUT email, server is OK)

emaildnsserver => Typo (BUT email, dns, server is OK)

 

How can I override this?

Where can I find the logic that splits identifiers into separate words by camelCase?

6 comments
Comment actions Permalink

You can provide a custom com.intellij.spellchecker.tokenizer.SpellcheckingStrategy and/or a custom bundled dictionary via com.intellij.spellchecker.BundledDictionaryProvider.

Comment actions Permalink

I've created a class:

package application;

import com.intellij.psi.PsiElement;
import com.intellij.spellchecker.tokenizer.SpellcheckingStrategy;
import com.intellij.spellchecker.tokenizer.Tokenizer;
import org.jetbrains.annotations.NotNull;

public class spellchecker extends SpellcheckingStrategy {

    @NotNull
    @Override
    public Tokenizer getTokenizer(PsiElement element) {
        // Debug output to see whether this strategy is called at all
        System.out.println("+++");
        System.out.println(element);
        // The empty tokenizer produces no tokens, so nothing gets spellchecked
        return EMPTY_TOKENIZER;
    }

}

And registered it:

<extensions defaultExtensionNs="com.intellij">
    <spellchecker.support language="HTML" implementationClass="application.spellchecker"/>
</extensions>

But I don't see any debug output. *scratches head*

I have also registered an "ApplicationStart" component, which does get called, so the plugin is working in general.

 

Comment actions Permalink

That's because there is already an existing strategy for HTML: com.intellij.spellchecker.xml.HtmlSpellcheckingStrategy

Try adding order="first" to the extension point (EP) declaration in your plugin.xml.

Comment actions Permalink

I ended up adding a new custom inspection (it extends SpellCheckingInspection).

To solve my problem I needed some recursion and smart algorithms. Not the easiest task :-D

First I try to find as many valid words as I can and put them in a child-parent word tree (...). But a lot of substrings count as valid words that way. Example:

Word: "emailserver"

validWordsTree: {em=null, ai=em, ls=ai, er=erv, erv=emails, ail=em, server=email, ails=em, ema=null, il=ema, ilse=ema, rv=ilse, email=null, emails=null}

As you can see, many "words" with only two letters are valid. So I decided to use a minimum length of 3 characters in my plugin to reduce false positives:

validWordsTree: {ema=null, ilse=ema, email=null, server=email, emails=null, erv=emails}
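For readers who want to try the idea, here is a minimal sketch of such a check. It is not the plugin's actual code: instead of the child-parent tree it uses a simple dynamic-programming pass over the word, and the names CompoundWordCheck, isCompound, MIN_PART_LENGTH and the isKnown predicate are all made up for illustration. In a real plugin, isKnown would presumably delegate to the spellchecker.

```java
import java.util.function.Predicate;

// Hypothetical helper for illustration; not part of the IntelliJ Platform API.
final class CompoundWordCheck {

    // Minimum length of each part, matching the 3-character limit chosen above.
    static final int MIN_PART_LENGTH = 3;

    /**
     * Returns true if the word can be split completely into two or more parts,
     * each at least MIN_PART_LENGTH long and each accepted by isKnown.
     */
    static boolean isCompound(String word, Predicate<String> isKnown) {
        int n = word.length();
        // reachable[i] is true if word[0..i) can be covered by known parts.
        boolean[] reachable = new boolean[n + 1];
        reachable[0] = true;

        for (int end = MIN_PART_LENGTH; end <= n; end++) {
            for (int start = 0; start <= end - MIN_PART_LENGTH; start++) {
                if (!reachable[start]) {
                    continue;
                }
                // The whole word as a single part is not a compound.
                if (start == 0 && end == n) {
                    continue;
                }
                if (isKnown.test(word.substring(start, end))) {
                    reachable[end] = true;
                    break;
                }
            }
        }
        return reachable[n];
    }
}
```

With a stand-in dictionary such as Set.of("email", "server", "dns"), the call CompoundWordCheck.isCompound("emailserver", dict::contains) returns true, while "email" on its own returns false because it is a single part.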

 

This works well so far. But I found that some words are reported as invalid although they should be valid. Example "dnsserver":

isValidWord: dnsserver: false
isValidWord: dns: false (<= Should be true)
isValidWord: dnss: false
isValidWord: dnsse: false
isValidWord: dnsser: false
isValidWord: dnsserv: false
isValidWord: dnsserve: false
isValidWord: dnsserver: false

I check these with:

private boolean isValidWord(String word) {
    return !myManager.hasProblem(word);
}

So the question is: why is "dns" or "php" not a valid word according to the check "!myManager.hasProblem(word)"?

These words are valid on their own.

Comment actions Permalink

TL;DR: the question was:

Why is "dns" or "php" not a valid word according to the check "!myManager.hasProblem(word)"?

These words are valid on their own.

Comment actions Permalink

So basically you now have the default spellchecking inspection running and yours in addition? Or do you suppress the default inspection's "false positives" in your plugin?

And what is "myManager"? Please always share the full code.
