Tuesday, March 27, 2007

The language (prevention) committee???

I haven't been here for some days and had to catch up on quite a bunch of e-mails, and what do I find? People complain about the requirement to have a localized UI for a new language project, mainly talking about Wikipedia in a new language. We are also accused of using the policy requirements for our own neat reasons ... uhm ... what neat reasons of our own should we have, if not making sure a new Wikipedia can have an easy life???

Today most new wikipedias are really small projects with a quite restricted number of potential editors.

Now let us go back to the day the Neapolitan Wikipedia was requested ... I had already thought about a Wikipedia in Neapolitan, and had planned to start it in autumn 2006, but everything went differently: in summer 2005 people in the discussion group napulitano@yahoogroups.com started to say things like "it would be great to have a Neapolitan wikipedia" and at a certain stage I simply decided to open the new language request and see if there was enough support for it. Within just a few days we reached the number of votes needed and the Wikipedia was created fairly fast. The new project was there with a Neapolitan UI and me as temporary admin. Of the 10 native speakers who supported the new project, not all became active. In the end we were three stable editors.

Soon we found out that the MediaWiki software did not like a particularity of Neapolitan. In Neapolitan there are words that require a doubled apostrophe, like d''o, and MediaWiki interprets two apostrophes in a row as "write in italics". So we asked for this issue to be solved, but to date we still have to use a workaround, inserting & # 3 9 ; & # 3 9 ; (written with spaces to make sure the blog does not convert it to HTML). New users are sometimes driven away. The hurdle to contributing on nap.wikipedia is high: people speak Neapolitan every day, but they have never written it in their lives. When they timidly start to contribute they of course make errors, since the only spelling they know is Italian and so they apply Italian rules to Neapolitan pronunciation, and then they find out that there are even issues in writing a simple doubled apostrophe. Then again: some super Neapolitans who seem to know everything come to a collaborative project and complain about mistakes ... well, whenever they are asked to contribute their knowledge and to share it, to make sure written Neapolitan has a chance to survive, they vanish into the dark ... so what? Well, probably their Neapolitan is not good enough after all and they are worrying about ... what??? ... and they probably feel big since they wrote an anonymous note ... well: you can probably guess what I think about people who are not man/woman enough to say who they are and who only complain instead of doing something.
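To make the workaround a bit more concrete, here is a minimal sketch in Python of how the replacement could be automated before text is pasted into the wiki. It is only an illustration, not something we actually run on nap.wikipedia: the function name is made up, and it assumes the input is plain Neapolitan text without intentional italics markup (which also uses two apostrophes and would be escaped too).

```python
import re

# Numeric HTML entity for the apostrophe character.
APOSTROPHE_ENTITY = "&#39;"


def escape_double_apostrophes(text: str) -> str:
    """Replace every run of exactly two apostrophes with two HTML entities.

    Hypothetical helper: longer runs (three or more apostrophes) are left
    untouched, so bold markup is not affected.
    """
    # The lookbehind/lookahead make sure the pair is not part of a longer run.
    return re.sub(r"(?<!')''(?!')", APOSTROPHE_ENTITY * 2, text)


if __name__ == "__main__":
    print(escape_double_apostrophes("d''o"))
```

A single apostrophe is left alone as well, so ordinary elisions are not changed.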

So: one requirement of the language committee is the creation of a certain amount of content by a certain number of people, in order to make sure issues related to the language can be dealt with before they arise on the new wikipedia. When they arise before the creation of the new wikipedia, people will find it easier to know how to ask and how to deal with things, since they are on Incubator, where they find a cross-project community and plenty of people who can help and direct them to the right contacts. Besides that: this requirement also helps to make sure that people really want to work on their language, and that it is not just a wish of the moment that they drop after two weeks.

Another requirement is the localization of the user interface. Well, the user interface of a Wikipedia can be edited only by an admin. The funny situation I had on the Neapolitan Wikipedia at the beginning was that I was the only temporary admin: a person with quite a lot of technical wiki skills, but lacking in Neapolitan, since it is not my mother tongue. So the first bit of the localization was done like this:

* I translated a sentence
* sent the link to the Neapolitan discussion group
* got the corrections back via e-mail
* re-edited the sentence
* sent the link again to the discussion group for proof reading
* eventually had to edit again etc.

Now that was very time-consuming, and so only a small part got done, because as soon as people became aware of nap.wikipedia they also started to vandalize the project ... well, only the admin can delete spam, only the admin can do certain stuff. So everything slowed down quite a bit on the localization end.

Then two more admins, and a bit later another one, were elected, and I got bureaucrat rights so I could make them admins. But still we are only four people who can deal with the UI localization - one of us only comes along when we call him - he really does not have a minute of time and it is always difficult to get hold of him; a second one has not been well and even now cannot edit every day. The only one able to do a bit more is Eric, and I would like to tell him: thank you, you are great!!!

Even if other people would like to help with the localization: they cannot help ... they would have to become an admin first.

On Betawiki the story is different: anybody gets edit rights for the various UI languages - thank you Nikerabbit for all the programming work you do - and there was one very special person who cared about the community around Betawiki: Gangleri ... who knows where he is - we cannot reach him. Please, should you read this post: all of us would very much like to know that you are fine. Gangleri did such great work ... well: MediaWiki should thank him ... he really worked all day for weeks, or rather months. Now Betawiki could be implemented on Incubator and people would have fewer problems with the localization, but it does not have priority. So for now we kindly ask people to do that job on Betawiki.

By working on Betawiki they help not only their own Wikipedia, but also all wikis that might be created in that language. That means: they do much more for their language than by doing it on the wikipedia bit by bit.

What I would like to say with this: please do understand that behind each requirement in the new language creation policy there are hours of work and consideration, and many discussions on why one requirement makes sense and another one did not (and is therefore not included in the policy).

We are here to help wikipedias survive, to help them build a reasonable and sustainable community around themselves, and to make sure they encounter as few problems as possible when they have to live on their own. Many people underestimate the work it takes to build up a new wikipedia. Most new projects are less sourced language entities with not too many potential contributors. If we want to help them we need to consider our own experiences, and we need to make sure they will not encounter the same problems that we had.

Wikipedias in less sourced languages with only a small number of potential editors: we want you among us and we want you to succeed and become brilliant projects.

There would be so much more to say ... but I hope that you, who are reading this, understand what we are about and will help us in our job.

Thank you!

Monday, March 26, 2007

Pisa ... again, but different this time

On the way back home on the motorway I decided to finally get used to my PDA and start writing my blog on it. Until now I had always been in Pisa for conferences or meetings, but not this time. Yesterday meant "back into the classroom" and teaching for me, at the SSML. It was a strange feeling to have the register with the names of the students in my hand and to check who was present and who was not. It was the first time in 12 years, and I noticed that I had been missing it for quite some time. Showing students new things, explaining etc., yes, it is definitely fun. Of course it was quite different from years ago: I did not give German language lessons, but explained how to use OmegaT and how to use Wikipedia and its sister projects to search for terminology. We looked at OmegaWiki and how to add a definition or translation. We also took the time to talk about things like networking on portals like LinkedIn. Last but not least we also had a look at IRC as a means of communication, since in future chat will be used more and more by translators. In the end I asked, like I always do, what I could do better. It seems all were happy with the new things they learnt. To sum up, I would say: it was a great experience for all of us.

Friday, March 23, 2007

Banks ... are they out of their minds? (Monte Paschi di Siena)

So: some time ago I made a bank transfer to send money to an Italian current account. I had all the details and my husband went to the bank to carry it out. When he came back I saw the fee for one transfer: EUR 5.50 ... I knew this amount from transfers abroad, so I thought they had made a mistake ... I called, and the answer was: no, that is the fee for transfers to accounts in Italy made at the branch. Back then I already wrote an e-mail of complaint to customer service at MPS headquarters. But of course ... they do not feel obliged to answer ... anyway: in my opinion charging that amount is fraud, and Commissioner Neelie Smit-Kroes at the European Commission, as well as the various consumer protection associations, should be very interested (they will receive a copy of the e-mail sent to the bank plus some additional notes). I have not had the time to do it lately ... I swear I will. And: I wonder whether MPS customers in Italy simply accept this price because they do not know any different, or whether really nobody notices. And what about the various businesses that have their accounts there and have more than one transfer to make? I do not consider the MPS online system secure, since it is protected by only two passwords - there are banks with a higher level of security (password + a TAN that changes for every single operation).

But back to the services. In 2004, not knowing certain things, I paid some taxes late (oh well, things in Italy are different from Germany), so I knew that a payment request would come from the Agenzia delle Entrate, and it duly arrived when I was not even here in Maiori. So: I filled in the form and my husband went to the bank to pay - he was told that the form we had received from the Agenzia delle Entrate was not valid and that he had to fill in the form - that light blue pre-printed one - by hand. OK, we did that immediately, and to make sure that the form sent by the Agenzia delle Entrate really was not valid we asked the bank to certify this in writing, since without that I am logically the one responsible for any incorrectly made payment: the clerk refused, and naturally so did the manager ... and I had better not dwell on their manners ...

Conclusion: it is time to change banks - as soon as time allows.

A note to Monte dei Paschi di Siena: do you really believe that with this kind of service you will manage to survive for long? And what do you do with the money from all those transfers that bring you in EUR 5.50 each? I mean, looking at the service: it is certainly not used to improve it - so what do you do with our money? Remember: we are customers, and we are not the ones who should be grateful that you are so kind as to take our money for "services".

At the European level you still have a lot to learn ... and how did that advertisement go some time ago: "... because you are not just a number ..." or something like that ...

There are banks where the current account, transfers and other services cost nothing ... and I am not saying here that I expect to pay nothing, but a fair price for fair service - compared with what you do I wonder: are you able to manage money, or do you have black holes, like in space, that swallow everything?????

Consider this post "polite" - I have not expressed my true feelings here ...

Monday, March 19, 2007

Amalfi Coast - finding the treasure

I hardly ever post personal notes here, well today I will. This is in answer to an article I read on Positanonews in English: City Slickers and Country Cousins.

What is it that makes people fall in love with this very particular part of the world? Well ... I still remember the day I first came here in the summer of 1986. We arrived in the afternoon after a long trip, first by train to Naples and then from there by car. The first glance at the coast on leaving the motorway in Vietri sul Mare was simply incredible ... the sun reflected like many shimmering stars on the sea. The blue of that sea and the green of the mountains ... an incredible feeling came up ... from one second to the next, not having seen more than that one glance, I was sure that this was the place to live, the place to stay until the end of days. This was the moment when I felt like finally coming home after an endless trip from wherever in the world. Still today, whenever I have to leave this unique place, even if it is only for a few days, I feel as if a part of me is ripped away and remains here. And when I come back home and I see that first glance (yes, I always prefer to drive some more kms and enter the coast in Vietri), what was ripped off when I left comes back and that very particular feeling is renewed. Mind you: for some strange reason I never took a photo of that view.

I suppose it happens with most people just that way: it is not something particular that makes us love this unique part of the world - it is that one thousandth of a second that changes everything. Afterwards we say that we like it here because of this and that, but the moment we decide to stay is the very first one. Some people who come here on holiday probably have a similar experience, but less strong and less convincing - and yes, I am happy about that, there would not be enough space on the coast for all who see the Divine Coast to live here - of course: we hope all of them come back and take a part of the coast back home in their hearts.

Of course there are also people who don't feel all this and just see the coast like a nice place to pass a holiday - well: these are the ones who are not made for it - they will find their place somewhere else and I very much hope for them that when they find it they decide to really take the step ... it is worth each second.

Back to the article on Positanonews: well yes ... many of these people might feel things similar to what I felt, and still feel, each time I see the sea, the sun on the water, the mountains.

Now you who read this are curious? Well: one day, if you decide to come here you will see ... and then, please let us know your thoughts about this very particular place.

Monday, March 12, 2007

When Open Source, Machine Translation and Computer Assisted Translation meet

Nice title, right? Well what scenario is possible there? Where are the potentials? What are the problems?

Ok, so let's start: right now there is more need for translation in Open Source and Open Content than humans can handle. All we need is probably already available and just needs some adjustment and additional features.

Imagine a piece of software like Mozilla Firefox that needs both its UI and its manual translated. Well: the UI should be done using a CAT tool, but taking into account some basic facts like: translators are not programmers, and each language is very different - this must be considered somewhere - and yes, I know they are working on it. Then take the manual and the help files: they need translation and, over time, updates. A workflow I could imagine is:

1) Machine translation of the manual/help files using Apertium
2) Alignment of source and target text and load it in OmegaT for proof reading and translation of the parts that are completely "out of order".
3) Creation of the final documents + feedback to Apertium

The second time the translation process starts, it would then go like this:
1) Pretranslation of the manual/help files using OmegaT (sentences that remain the same receive 100% matches).
2) Machine translation of the new parts using Apertium
3) Proof reading within OmegaT
4) Creation of final documents + feedback to Apertium
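
The second pass above can be sketched roughly as follows. This is only an illustration in Python under stated assumptions: the translation memory is a plain dictionary, `machine_translate` is a stand-in for whatever machine translation engine is used, and none of the names below are real OmegaT or Apertium interfaces.

```python
from typing import Callable, Dict, List, Tuple

# (source sentence, proposed translation, origin of the proposal)
Proposal = Tuple[str, str, str]


def pretranslate(
    segments: List[str],
    memory: Dict[str, str],
    machine_translate: Callable[[str], str],
) -> List[Proposal]:
    """Propose a translation for every segment.

    Sentences already in the translation memory are reused as 100% matches;
    everything else goes to the machine translation stand-in and is marked
    "mt" so that a human proofreads it.
    """
    proposals: List[Proposal] = []
    for source in segments:
        if source in memory:
            proposals.append((source, memory[source], "tm"))
        else:
            proposals.append((source, machine_translate(source), "mt"))
    return proposals


def approve(memory: Dict[str, str], source: str, final_text: str) -> None:
    """Store a proofread translation so the next pass gets a 100% match."""
    memory[source] = final_text


if __name__ == "__main__":
    # Hypothetical data; the fake engine only tags the text.
    memory = {"Open the file.": "Apri il file."}
    fake_engine = lambda text: "[MT] " + text
    for row in pretranslate(["Open the file.", "Save the file."], memory, fake_engine):
        print(row)
```

The point of keeping the origin label is that only the "mt" rows need a close look by the proofreader, and every approved sentence makes the next update of the manual cheaper.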

To do that, Apertium needs to be programmed for the language pairs needed - Apertium's approach is to build a separate machine translation system for each pair, from one language to the other. As you can see when you try out the Spanish-Catalan version, which is already very mature, you get a factually correct translation where only about 4-5% needs some manual changes.

I also showed this to a translator who deals professionally with these languages and to one wikipedian (some time ago). Both of them confirmed that the translation is factually correct. So it is suited for manuals and encyclopaedic entries. Of course, if you put an article done with machine translation online, you need to mark it as such until it is proofread.

Of course it could be that some specific terminology for software is currently missing in its engine ... well, that needs to be added. For now Apertium has its own dictionaries that take care of this, but we would very much like to see the content created within OmegaWiki - the reason is simple: on one hand people who need Apertium to work better and better will create their dictionaries there, and on the other hand the work they do can also be used for other purposes, such as spell checkers, offline bilingual dictionaries, the dictionaries for the OLPC laptop, other software localization projects etc. So: the work is done only once and then re-used. If we get various projects to work together, all of them will get better results with less work. This means their time is used more effectively and they can do more good in the same time.

When it comes to OmegaWiki there is one feature that we are still missing, but that is already planned: inflections. These inflections are needed for Apertium, they are needed for spellcheckers and for various other applications. So again: co-operation among various projects becomes a means to make sure they will go ahead and survive in future.

One last note: please don't consider only our everyday languages where in some way you normally find people to co-operate with (and where we really also should try to consider their time investment with the value it actually has and where exactly this consideration and respect often is missing) - consider all those rare, less sourced, non governmental languages or better language entities (just to introduce this term) that will be helped a lot by such ways of working.