TinyMCE JMySpell and Jazzy spellchecker implementation


In my previous post I was asked about the suggestion “May be JMySpell is a better backend.” At the time I didn’t know the answer, because the JMySpell library was new to me.

So I downloaded JMySpell and implemented a spellchecker with it to find out. JSpellChecker now supports three spellchecker engines (Google, Jazzy and JMySpell), and I can briefly compare them.

The Google engine is the easiest to start with (you don’t need any dictionaries), but there is no guarantee that the service will stay stable or that you will not exceed its request limit. Jazzy looks like the most advanced engine, but preparing dictionaries for it is not easy (Mozilla dictionaries can be used after some manipulation). The advantage of JMySpell is that it can use OpenOffice dictionaries, which you can download from the OpenOffice wiki. In my view Jazzy offers more settings on its spellchecker object than JMySpell, but I have not done a deep comparison of spell-checking quality.

Here is an excerpt from the official JMySpell site:

This allows us to use the dictionaries from OpenOffice.org in Java applications, whether they’re J2SE applications or J2EE web applications. Since at the moment there is only one 100% Java Open-Source spell checker (Jazzy), and the inclusion of dictionaries, particularly the Spanish dictionary, is difficult, the objective of this project is to fill this gap.

Here is the list of changes for the https://sourceforge.net/projects/jspellchecker/ project:

  1. Code refactored to support different spellchecker implementations. The new class TinyMCESpellCheckerServlet reads the request, delegates spell-checking to abstract methods and writes the response as JSON.
  2. A JMySpell spellchecker implementation has been added (JMySpellCheckerServlet).
  3. Servlet paths have changed (please update your tiny_mce config scripts):
    • the Jazzy servlet path is now “/jazzy-spellchecker”
    • the Google servlet path is now “/google-spellchecker”
    • a new JMySpell servlet path has been added: “/jmyspell-spellchecker”
  4. The location of dictionaries in the “spellchecker” application has been reorganized:
    • Jazzy dictionaries should now be located under “/WEB-INF/dictionaries/jazzy” (previously “/WEB-INF/dictionaries”)
    • JMySpell dictionaries should be located under “/WEB-INF/dictionaries/jmyspell” as zip files named after the language codes used in the TinyMCE config script
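The refactoring in item 1 is essentially a template method: the shared class handles the request and response, and each engine only fills in the spell-checking itself. Here is a minimal standalone sketch of that idea. It is not the project's actual code: the class and method names (SpellCheckHandlerSketch, findMisspelledWords, findSuggestions, FixedWordListEngine) are hypothetical, the servlet API is left out so the example runs on its own, and the "checkWords" / "getSuggestions" method names and the JSON shape are my assumption about what the TinyMCE 3 spellchecker plugin sends and expects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the shared base class (no servlet API, so it runs standalone). */
abstract class SpellCheckHandlerSketch {

    /** Engine-specific: return the words from the input that are misspelled. */
    protected abstract List<String> findMisspelledWords(String lang, List<String> words);

    /** Engine-specific: return suggestions for a single word. */
    protected abstract List<String> findSuggestions(String lang, String word);

    /** Shared part: dispatch on the (assumed) method name and render a JSON response. */
    final String handle(String method, String lang, List<String> words) {
        List<String> result;
        if ("checkWords".equals(method)) {
            result = findMisspelledWords(lang, words);
        } else if ("getSuggestions".equals(method) && !words.isEmpty()) {
            result = findSuggestions(lang, words.get(0));
        } else {
            return "{\"id\":null,\"result\":null,\"error\":\"unsupported request\"}";
        }
        return "{\"id\":null,\"result\":" + toJsonArray(result) + ",\"error\":null}";
    }

    /** Render a list of strings as a JSON array of string literals. */
    private static String toJsonArray(List<String> values) {
        StringBuilder json = new StringBuilder("[");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) json.append(',');
            json.append('"').append(escape(values.get(i))).append('"');
        }
        return json.append(']').toString();
    }

    /** Escape quotes, backslashes and control characters for a JSON string. */
    private static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') out.append('\\').append(c);
            else if (c < 0x20) out.append(String.format("\\u%04x", (int) c));
            else out.append(c);
        }
        return out.toString();
    }
}

/** Toy engine backed by a fixed word list, standing in for Jazzy, JMySpell or Google. */
final class FixedWordListEngine extends SpellCheckHandlerSketch {
    private final Set<String> known;

    FixedWordListEngine(String... words) {
        known = new HashSet<>(Arrays.asList(words));
    }

    @Override
    protected List<String> findMisspelledWords(String lang, List<String> words) {
        List<String> misspelled = new ArrayList<>();
        for (String w : words) {
            if (!known.contains(w)) misspelled.add(w);
        }
        return misspelled;
    }

    @Override
    protected List<String> findSuggestions(String lang, String word) {
        return new ArrayList<>(); // a real engine would consult its dictionary here
    }
}
```

With this sketch, `new FixedWordListEngine("hello", "world").handle("checkWords", "en", Arrays.asList("hello", "wrold"))` returns `{"id":null,"result":["wrold"],"error":null}`. Adding another engine means writing one more subclass, which is the point of the refactoring.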

Here is an example configuration:


tinyMCE.init({
 theme : "advanced",
 mode : "textareas",
 plugins : "spellchecker",
 theme_advanced_buttons3_add : "spellchecker",
 spellchecker_languages : "+English=en-us,Swedish=sv",
 // spellcheck URL; for Jazzy use /spellchecker/jazzy-spellchecker
 spellchecker_rpc_url : "/spellchecker/jmyspell-spellchecker"
});

The following files should be present in the “/WEB-INF/dictionaries/jmyspell” directory:

  • en-us.zip
  • sv.zip
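To make the naming rule concrete, here is a small sketch that derives the expected zip paths from a spellchecker_languages value. It only mirrors the convention described in this post (language code plus ".zip" under the jmyspell directory); the class and method names are hypothetical and are not part of JSpellChecker.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists the dictionary zips a given TinyMCE language setting implies. */
final class JMySpellDictionaryPaths {

    // Directory named in this post for JMySpell dictionaries.
    static final String DICTIONARY_DIR = "/WEB-INF/dictionaries/jmyspell";

    /** Parses a value such as "+English=en-us,Swedish=sv" into expected zip paths. */
    static List<String> expectedZipPaths(String spellcheckerLanguages) {
        List<String> paths = new ArrayList<>();
        for (String entry : spellcheckerLanguages.split(",")) {
            int eq = entry.indexOf('=');
            if (eq < 0) continue; // skip entries without a "Name=code" pair
            String code = entry.substring(eq + 1).trim();
            if (!code.isEmpty()) {
                paths.add(DICTIONARY_DIR + "/" + code + ".zip");
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        for (String path : expectedZipPaths("+English=en-us,Swedish=sv")) {
            System.out.println(path);
        }
    }
}
```

For the configuration above this prints the two paths ending in en-us.zip and sv.zip, matching the file list.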

So far I have tested the JMySpell en-us dictionary from TinyMCE and it works fine. I’m looking forward to doing more QA on the JMySpell spellchecker engine, probably with help from the community, since I’m not a linguist. The authors of JMySpell state that their library implements the MySpell algorithm (which OpenOffice used as its spellchecker engine before version 2.02). OpenOffice currently uses the Hunspell spellchecker library. A few words about Hunspell:

  • Improved suggestion using n-gram similarity, rule and dictionary based pronunciation data.
  • Morphological analysis, stemming and generation.
  • Hunspell is based on MySpell and works also with MySpell dictionaries.
https://www.facebook.com/achorniy

Posted in Software Development
21 comments on “TinyMCE JMySpell and Jazzy spellchecker implementation”
  1. Andrey Chorniy says:

    they are available via SVN
    svn co https://jspellchecker.svn.sourceforge.net/svnroot/jspellchecker jspellchecker

    you can find that and more info in the previous spellchecker-related post at
    https://achorniy.wordpress.com/2009/08/11/tinymce-spellchecker-in-java/

    • For those who don’t have an SVN client: (register and) log in to SourceForge and download the tarball from the root SVN directory.

    • Krish says:

      Hi Andrey…i was able to successfully incorporate it however it does not seem to work in IE properly. It does not highlight all the misspelled words. I tried to figure out the problem it seems the problem is with the _markWords function of editor_plugin.js file(in spellchecker plugin folder of tinymce). Specifically the condition if(rx.test(v)) evaluates to true and false differently in both the browsers. I was wondering if you were able to make it work in IE also?

      • Andrey Chorniy says:

        It certainly works in IE; it was actually tested in IE first. I suppose the problem could be with a newer TinyMCE version. Initially I used:
        majorVersion : ‘3’,
        minorVersion : ‘2.0.1’,
        releaseDate : ‘2008-09-17’,

  2. Hello,
    I’ve found that not all dictionaries from OO work.
    I’ve been using Italian and German from this page http://wiki.services.openoffice.org/wiki/Dictionaries and to make them work I had to update the .aff file inside the .zip bundle by moving the “SET xxxxxx” directive to the first line (after comments).
    JMySpell assumes that SET is on the first line, but sometimes it isn’t (you can also find it in the middle of the file).

    cheers Lorenzo
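The workaround described in this comment can be sketched as a small helper that moves the first SET line to just after the leading comment block of an .aff file. This is only an illustration of the manual edit, under the assumption that comments start with “#”; the class name is hypothetical and it is not part of JMySpell or JSpellChecker.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: moves the SET directive of an .aff file to the top. */
final class AffSetDirectiveFixer {

    /** Returns a copy of the lines with the first "SET ..." line placed after leading comments. */
    static List<String> moveSetToTop(List<String> lines) {
        List<String> result = new ArrayList<>(lines);

        // Find the first SET directive, wherever it is in the file.
        int setIndex = -1;
        for (int i = 0; i < result.size(); i++) {
            if (result.get(i).startsWith("SET ")) {
                setIndex = i;
                break;
            }
        }
        if (setIndex < 0) return result; // no SET directive: leave the file alone

        // Skip the leading block of comment ('#') and blank lines.
        int insertAt = 0;
        while (insertAt < result.size()
                && (result.get(insertAt).startsWith("#") || result.get(insertAt).trim().isEmpty())) {
            insertAt++;
        }
        if (setIndex == insertAt) return result; // already in place

        // Move the directive up; all other lines keep their relative order.
        String setLine = result.remove(setIndex);
        result.add(insertAt, setLine);
        return result;
    }
}
```

When reading and writing the real file, decoding it as ISO-8859-1 keeps every byte intact, since the actual encoding is only known once the SET line has been read.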

  4. yuraz says:

    Could you please explain in more detail, step by step, JMySpell and TinyMCE implementation?

  5. trialot says:

    I suppose that the .zip files are not needed and that the .aff and .dic files can be used (myspell does the same in unix installations).

    Can you confirm?

    Furthermore, a hunspell java api is existent. Easy to extend the project to hunspell. Is that coming? Or can I propose a change (i will not test that, since using myspell/google) and leave the testing to someone interested…..

    A comment (!)….open office is using .oxt extension files. Just a TIP : download them and rename them as .zip file. That is all (or clean up some files…).

    And…when having yast or similar stuff….just use yast to install lots of packages (just make symlinks on the linux box). Easy to upgrade….

  6. Mitch says:

    Hi..thanks for this..it works great..Is there anyway to add an ‘add to dictionary’ feature or are the dictionaries locked down? I need to be able to spell check medical terms.

    • Andrey Chorniy says:

      Which “spellchecker” implementation do you use?
      With “JMySpellCheckerServlet” it will probably be easiest to find new dictionaries, since it uses OpenOffice dictionaries.

  8. Ryan Veteze says:

    hey, thanks for this. it was pretty difficult to dig out the tinymce 4 stuff but i eventually found it.

  9. Ayub Khan says:

    Could you please help with arabic language. Not able to configure arabic dictionary.

    • Please describe which integration you are using (Lucene-based, JMySpell, etc.).
      I have never worked with Arabic before; you should first check that the underlying spellchecking engine supports Arabic.

  10. Ayub says:

    I am using Jazzy spell checker. I downloaded the word list from http://sourceforge.net/projects/arabic-spell/ and created the folder structure /WebContent/jazzy/ar/ar.0 (ar.0 is the wordlist file).

    If I put a few words in ar.0 and try to spell check those words, it works. However the entire wordlist file is huge (60MB+) and the spell checker hangs during spell check, even when trying to spell check 2 words with the full wordlist file as the dictionary.

    I found jortho to be good as well, however it is tightly coupled with JTextComponent.

    Please let me know if you know of any reliable spell check engine + arabic dictionary which I can use with TinyMCE

  11. Well, I think your case shows that the Jazzy-based spellchecker implementation may not work well for huge dictionaries (or for Arabic words).
    I don’t know for certain, but I think it is worth trying the Lucene-based spellchecker implementation. Lucene itself was designed to process huge amounts of data, so possibly its implementation will be able to handle that.

  12. jim Theodoridis says:

    Very useful article!!
    Jazzy does not return any suggestion
