Thursday, April 05, 2007

Localising MediaWiki Software

This is an answer to a question that came up on the MediaWiki i18n list: how to localize MediaWiki. The best way is to do it in one place. For now Betawiki helps with that; one day we will hopefully have that feature integrated into Incubator. Nevertheless, it takes a lot of time when you have to translate the messages one by one: open a page, write a few words, save.

The time needed drops a lot when you work with a CAT tool (computer-assisted translation tool) like OmegaT.

For now, working with OmegaT is possible, but it again needs Nikerabbit's help: he has to extract the messages, only one person can work on them at a time, and afterwards the messages have to be uploaded again with his help. That is not the right way at the moment, since he already has to deal with many things on Betawiki, so no, I don't want him to work more than necessary.

There is a feature on the way called WikiRead-WikiWrite (for OmegaT). It should have been there already, but due to the programmer's health problems it is not. We have now asked another programmer whether he has time to do it, since otherwise we would lose the funding and also the tool. It is a situation we never wanted to happen, but that's life - there is always an unpredictable part in it.

WikiRead-WikiWrite for OmegaT is supposed to let the CAT tool fetch a wiki page and load it into OmegaT, so that by translating you build up a translation memory. This translation memory is useful for translating similar texts and for terminology research. We need consistent terminology within MediaWiki at some stage, so yes, it makes sense to use it.
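To make the idea of a translation memory a bit more concrete, here is a minimal sketch in Python. It is not how OmegaT or WikiRead-WikiWrite work internally; the class name, the similarity threshold and the example strings are all assumptions chosen for illustration. It only shows the principle: every translated segment is stored, and a later, similar segment gets the stored translation back as a suggestion.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory (hypothetical, for illustration only)."""

    def __init__(self):
        self.entries = {}  # source segment -> translated segment

    def add(self, source, target):
        """Remember a finished translation."""
        self.entries[source] = target

    def lookup(self, source, threshold=0.75):
        """Return (score, stored_source, target) for the best match
        at or above the threshold, or None if nothing is close enough."""
        best = None
        for stored, target in self.entries.items():
            # Case-insensitive similarity between 0.0 and 1.0
            score = SequenceMatcher(None, source.lower(), stored.lower()).ratio()
            if score >= threshold and (best is None or score > best[0]):
                best = (score, stored, target)
        return best


tm = TranslationMemory()
tm.add("Save page", "Seite speichern")      # illustrative strings, not real message keys
tm.add("Show preview", "Vorschau zeigen")

exact = tm.lookup("Save page")         # identical segment
fuzzy = tm.lookup("Save this page")    # similar segment, suggestion from "Save page"
miss = tm.lookup("Delete account")     # nothing similar stored yet
```

A real CAT tool does much more (segmentation, tags, glossaries), but the fuzzy suggestion is the part that keeps terminology consistent across similar messages.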

Once we have that possibility, OmegaT can read all the pages to be translated at once; one translates offline and then stores the result to the target. Translating offline, without having to open and save the pages one by one, takes only a third of the time needed online. Mind you: if you need to translate only one or two messages, it makes sense to do it directly on Betawiki; if you have to deal with a whole series of messages, OmegaT will help you do a better job.

In a second stage we imagine a wikidata application where the translation memories are stored. It is not so different from what we have now on OmegaWiki. What must be different is:

  • no limitation on the length of the entry at syntrans level
  • the definition field becomes a notes field and may be empty
  • license information must be properly stored according to the collection
  • possibility to store and retrieve translation memories

This means that any open source software can have its repository there, and also that strings found in multiple open source applications can be re-used quite easily, so localization takes less work.
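The four requirements listed above, and the re-use across applications, can be sketched roughly like this. It is only a thought experiment in Python, not the OmegaWiki or wikidata schema; the names Entry, Collection and find_reusable, the project names and the license strings are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    source: str                                        # original string, no length limit
    translations: dict = field(default_factory=dict)   # language code -> translated text
    notes: Optional[str] = None                        # notes field, may stay empty


@dataclass
class Collection:
    name: str        # e.g. one software project (hypothetical)
    license: str     # license information kept per collection
    entries: dict = field(default_factory=dict)        # source string -> Entry

    def add(self, source, lang, text, notes=None):
        """Store one translation; creates the entry if it is new."""
        entry = self.entries.setdefault(source, Entry(source, notes=notes))
        entry.translations[lang] = text


def find_reusable(collections, source, lang):
    """List (collection name, license, translation) for every collection
    that already has `source` translated into `lang`."""
    hits = []
    for c in collections:
        entry = c.entries.get(source)
        if entry and lang in entry.translations:
            hits.append((c.name, c.license, entry.translations[lang]))
    return hits


wiki = Collection("ExampleWiki", "GPL-2.0-or-later")
wiki.add("Cancel", "de", "Abbrechen")
editor = Collection("ExampleEditor", "MIT")
editor.add("Open file", "de", "Datei öffnen")

# A third project needing "Cancel" in German finds it already translated,
# together with the license it was stored under.
hits = find_reusable([wiki, editor], "Cancel", "de")
```

Returning the license next to each hit is deliberate: whether a string can be copied into another project depends on the collection it comes from.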

Of course, having OmegaT connect directly to download the translation memory from there would be even better, but that needs additional programming within OmegaT. I don't have a clue how much work that is ... well, whoever knows how to code will be able to tell us, I suppose.

I know we are quite a step away from that scenario, even if WikiRead-WikiWrite will hopefully be there soon ...

Well, if you want to give me a special birthday present and make a huge improvement to localization efforts: consider working on that bit.

I will probably need to re-edit this post, since I am not sure I really considered everything (I suppose not ... it's a period that often requires me to interrupt what I am doing).