I am developing an open-source tool to edit the dictionary, thesaurus and hyphenation files of OpenOffice, LibreOffice, Firefox, Thunderbird and SeaMonkey.
My hope was that someone would eventually use it to fix the en_GB speller in Thunderbird, which was full of typos and missing words, since no one had volunteered for the task.
A couple of months went by and, since no one volunteered, I took on the task myself.
I want to create the ultimate speller, as good as or better than the ones used by commercial software. It covers several fields of knowledge, from simple to complex words.
To make sure the words I add are correct, I have been looking them up in credible sources such as:
1) Oxford Dictionaries;
2) Collins Dictionary;
3) Macmillan Dictionary;
4) Wiktionary (used with caution);
5) Wikipedia (used with caution);
6) Physical dictionaries.
I am also involved in several projects and hobbies that have their own jargon, which lets me test the words used there: I have been pasting text from credible sites and e-mails to find missing words. This is how I have added some jargon and technical words.
I have been advised to use scripts to update the dictionary, but I am adding the words by hand with copy/paste after checking them in the dictionaries mentioned above. This takes longer and is harder, but the results are much better and more accurate.
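For anyone who wants to automate the "find missing words" step (not the adding itself), here is a rough Python sketch. It assumes a Hunspell-style word list in which the first line is an entry count and each entry is either "word" or "word/FLAGS". The function names and the sample data are hypothetical. It does not apply the affix rules from the .aff file, so inflected forms are reported even when the real speller would accept them; every candidate still has to be checked by hand.

```python
import re


def load_words(dic_lines):
    """Collect base words from Hunspell-style .dic lines ("word" or "word/FLAGS")."""
    words = set()
    for line in dic_lines:
        entry = line.strip()
        if not entry or entry.isdigit():
            continue  # skip blank lines and the leading entry-count line
        words.add(entry.split("/", 1)[0])  # drop the affix flags, if any
    return words


def missing_words(text, words):
    """Return the tokens of text that are not in words, in first-seen order."""
    missing = []
    # A token is a run of letters, optionally joined by apostrophes (e.g. speller's).
    for token in re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text):
        # Accept an exact match, or a lower-cased match for sentence-initial capitals.
        if token in words or token.lower() in words:
            continue
        if token not in missing:
            missing.append(token)
    return missing


if __name__ == "__main__":
    # Hypothetical sample data standing in for a real .dic file and pasted text.
    sample_dic = ["4", "colour/SM", "the", "of", "London"]
    sample_text = "The colour of London fog confuses the speller's dictionary."
    for word in missing_words(sample_text, load_words(sample_dic)):
        print(word)
```

With the sample data, "confuses" and "speller's" are listed alongside the genuinely unknown words, which shows why the output is only a list of candidates to look up, not a list of words to add.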
On 25.Aug.2013 I released a "forked" V2.00. In January 2014 my version was officially adopted by Apache OpenOffice, and Mozilla followed in May 2014. More than 10,000 words have been added since I took on the project.
About ize/ise: As in other languages, including my own, some valid words can be spelled in more than one way. Since Oxford accepts both spellings for some words, I kept both and let the user decide which one to use. A good example in Portuguese is the pair "loira" and "loura" (blonde), which mean the same thing but are spelled differently.
Main difficulties developing this dictionary:
1) Names of places and persons;
2) Words ending with 's;