Tuesday, May 29, 2007

Localisation PhD Scholarship at the University of Limerick

Streamlining Quality Assurance in Digital Content Localisation

Symantec Ireland and the Localisation Research Centre (LRC) at the Department of Computer Science and Information Systems (CSIS), University of Limerick, have agreed to offer a 3-year funded position for a suitable candidate to work on a collaborative research project with Symantec and the LRC, leading to a PhD degree.

Candidates should forward their application (cover letter, CV) to the LRC, CSIS, University of Limerick, Limerick, Ireland. Note that the closing date for the receipt of applications has been extended to 22 June 2007.

More details on http://www.localisation.ie/resources/Research/symantecphd.htm


just copying and pasting .....

Monday, May 28, 2007

News ... in Neapolitan language ... no, not a wiki this time

About two weeks ago a friend from the United States called me. His family is from Campania, and therefore he is very much interested in the Neapolitan language. That is: for him Neapolitan is the true mother tongue. We already have a Wikipedia in Neapolitan as well as a few websites. With OmegaWiki we will now be able to build dictionaries in various language combinations, spellcheckers and some other nice surprises :-)

Talking with him, the idea of making news available in Neapolitan became stronger - we had already thought about it before, but the final step was missing. Said and done: we talked with Michele Cinque and Antonio D'Urso from positanonews about it - and they found it a great idea. One prerequisite of mine was: we either need somebody who really writes well, or we need help proofreading my writings and/or translations. And yes: Carmine Colacino is again helping out here - thank you! This means: my texts are going to be proofread by him :-)

So that is how it came to be that we now have a news section in Neapolitan on positanonews, the first registered online newspaper in Campania. Besides it being the first such news section in an online newspaper, I also thought it would make sense to have texts for facilitated reading, for all those who would like to look up a word. The technique is the same I use here on the blog: links go through a snap.com account so that you get a preview of the page a link leads to. Well: some more links will be inserted over time, and I will then concentrate on the more difficult words, because the normal ones are probably quite easily understood.

To give some more "food" to readers I will try to link to relevant Wikipedia articles whenever they exist.

Of course, as always, there will be critics :-) and yes, I very much like criticism as long as it is constructive and people try to help improve things. We very much encourage others to write there as well. It may be anything related: a family's history, background information, actual news, or the presentation of cultural events and music.

This is definitely a way to give space to all to write about life, news, culture and of course also encyclopaedic articles on Wikipedia. If you feel like writing: please just do. We know that it is not easy to write in Neapolitan but it is also true that if you don't try, you will not learn how to do it.

Thank you for your time, and we hope to see you soon in the news section for Neapolitan news on positanonews.it as well as on the Neapolitan Wikipedia.

A big thank you to the director of positanonews, Michele Cinque, for giving us this unique possibility, and to Antonio D'Urso for making it possible from the technical side. Again: thanks to Carmine Colacino for taking the time to copy edit my texts and translations.

Sunday, May 27, 2007

An Encyclopaedia and Neapolitan music

Yesterday evening I was at the hotel Domina Royal in Positano for the presentation of an encyclopaedia. Well, it was an exceptional evening, and not only because of the contents of the programme.

The encyclopaedia is the "Nuova Enciclopedia Illustrata della Musica Napoletana" (New Illustrated Encyclopaedia of Neapolitan Music) by Pietro Gargano. It will be great to finally have something to verify the contents about songs we have on the Neapolitan Wikipedia, and to add references to this unique work, which is not yet finished. But even just looking at the first volume - of which, of course, I simply had to get a copy :-) - we will be able to check a lot of data we already have, and it will help to make at least some of that data available in Neapolitan.

Such a huge corpus of text would be difficult to translate manually - it would take a huge amount of time and would probably be too expensive for quite a limited market. My thoughts here go very much in the direction of Apertium and the creation of a dictionary that allows for machine translation from Italian to Neapolitan, and maybe also to English. This would help here, since the terminology used is quite restricted and the whole corpus is the work of a single author. So it would be easier to get high-quality results and then just proofread the text. Going that way, such particular works can be made available quite easily in other languages and can be offered through print-on-demand services.

Well: it is for now a dream ... that hopefully one day can become true.

I am still thinking about how some of the songs were presented ... it would be great to show you the whole presentation online. Who knows whether the person who recorded it with the video camera will help to do this.

And yes, I met some very interesting people - and I had a really great evening. I would like to thank Michele Cinque from Positanonews (news in Italian and English) who invited me to this presentation.

Saturday, May 26, 2007

How to create a spellchecker ... that is the problem.

Considering that I need to write more and more often in Neapolitan, having a spell checker would be nice, but .... oh yes, that is a really big but ...

So you search for OpenOffice.org + spell checking in Google ... you reach a nice project page that leads you to Hunspell. Then you are there, you can download a bunch of files (that don't say anything about how to create such a dictionary) and you think ... well, let me look at Mozilla. So you go there: and you are redirected to OOo ... one project sends you to the other :-(

That is: you get into a nice cycle. There is no way to find out what a wordlist for such a spellchecker should look like or, better, how it should be created. Maybe it should be obvious to the whole world how to do that ... well, it is not to me ... grrrrrrrrrr ...
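For what it's worth, here is a minimal sketch of what the format seems to be (an assumption pieced together elsewhere, not the missing official documentation): a Hunspell dictionary consists of two plain-text files, a .dic wordlist whose first line is the number of entries followed by one word per line, and an .aff file with affix rules. Building the bare .dic part from a sample text can be as simple as this - the Neapolitan sentence is just an illustration:

```python
import re
from collections import Counter

def build_dic(corpus_text, min_count=1):
    """Build the lines of a minimal Hunspell .dic file:
    first line is the number of entries, then one word per line."""
    # Extract letter-only tokens (Unicode letters, no digits/underscore)
    words = re.findall(r"[^\W\d_]+", corpus_text.lower())
    counts = Counter(words)
    entries = sorted(w for w, c in counts.items() if c >= min_count)
    return [str(len(entries))] + entries

lines = build_dic("Napule è mille culure, Napule è mille paure")
print("\n".join(lines))
```

An .aff file containing little more than a `SET UTF-8` line should be enough to start with; affix rules for declensions and conjugations can come later.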

Friday, May 25, 2007

EU grant brings down fees to €1,600 for postgraduate Global Computing and Localisation programmes

Just pasting this from a press release I received - interesting for translators.

Limerick, Ireland, 25 May 2007: The University of Limerick has just announced that the Irish Higher Education Authority (HEA) is making significant funding available to its recently launched new postgraduate programmes in Global Computing and Localisation. These grants will benefit students from all EU countries who will realize savings of approximately €3,650 against the standard fees.
The US$9 billion localisation industry has an increased demand for professional localisers with a solid technical and business-oriented background. In close cooperation with industrial and academic experts, the University of Limerick, which was the first to offer dedicated postgraduate localisation programmes in 1997, is now responding to this demand by offering two new postgraduate programmes in localisation, starting in September 2007.
The Graduate Diploma in Localisation Technology and the Master of Science in Global Computing and Localisation will be offered on a full-time and part-time (one-day-a-week) basis.
Reinhard Schäler, Director of the Localisation Research Centre (LRC) at the University and Course Coordinator for these programmes, said “We are extremely pleased to see that the Irish Higher Education Authority sees localisation as a strategic postgraduate skill and has decided to support it so generously. These grants will make a significant difference to students wishing to pursue our programmes.”
Ireland has been a world centre of localisation since the mid 1980s, with companies such as Microsoft, Oracle, Symantec and Google locating their European Headquarters here. International academic and business leaders have described the University of Limerick as “the Mecca of Localisation”, “teaching the best minds in the internationalisation and localisation business”. UL has been offering postgraduate courses in localisation for the past decade, is the home of the EU-funded Localisation Tools Laboratory and Showcase (LOTS) and is the publisher of the peer-reviewed and indexed Localisation Focus – The International Journal of Localisation. More information on these programmes is available on www.localisation.ie/education.

Wednesday, May 16, 2007

Real-time translation ... her name is Eleda or Iansa

Hi, if you go into certain IRC channels these days you will find two new users: eleda and iansa ... well, these are two really nice ladies (or rather one lady who uses two different names) who, depending on the languages she knows best or just a bit, does a really nice real-time translation job in the chat.

The language combinations she knows best are Spanish-Catalan-Spanish and Spanish-Portuguese-Spanish. Her English-German-English .... well, she still needs to learn a lot of grammar and vocabulary, and the same goes for English-Dutch-English. There are also other language combinations, but sorry, I cannot list them all. Fact is: she can already do quite a nice job and will become better and better over time.

Now you are wondering who this nice lady is, right? Well, eleda can be found in the Apertium IRC channel; if she is offline, just ask spectei where she's gone ... he will probably know. But who is she? Well, she's the first translation bot for IRC that uses Apertium as its backend. She was programmed by Francis Tyers - thank you, Francis, a first step in a direction that hopefully will help us a lot in the future :-)

This means: the better the Apertium terminology is, the better the bot will do its work. We will have the possibility to add Apertium XML lines to OmegaWiki - for now this is unfortunately not yet possible.

Anyway: imagine that bot getting better and better ... and imagine having it at your disposal during chat, in e-mail (in Thunderbird, for example) and ready in the background for websites (in Firefox, for example).

Ah yes, there is that website translation tool around that produces all sorts of strange translations. Well: when it comes to Eleda, or better Apertium, we can change that. It is up to us to contribute translations and grammar rules :-) so, if you want to: you just can ...

Commands:
@help - shows you the options
@follow + nick - follows the person writing and shows you the translation in a private chat window
@unfollow + nick - stops eleda from translating what that specific person writes

Of course: typing must be correct ;-)

Thursday, May 03, 2007

Big ... bigger ... the biggest ... encyclopaedic articles ...

I already wrote about this some time ago ... it is the never ending struggle and fight between two worlds:

- any length is fine, also just one sentence because this sentence gives basic information
- only long articles are good

Uhmmm ... well: I would like to invite you into a book store ... wow ... more than 29,000 entries there. You find anything there ... general encyclopaedias in just one volume, where many of the entries have a length of just one sentence (like: Maiori is a city in the south of Italy, on the Amalfi Coast, in the province of Salerno, region Campania). There are specialised encyclopaedias with very specific articles - let's take an example, maybe about biology. There are the really big ones like the Britannica. Now, each of them has a certain kind of target audience.

What is Wikipedia's audience? The general reader who could be happy enough knowing that Maiori is a town in Italy, the highly specialised one who wants to know all about a specific animal we maybe don't all know, or those who want to read those huge articles? Who is our audience? All of them, or just the "elite" of encyclopaedia readers who would say that one sentence about a town somewhere in the world is not enough? Uhmmm ... but what is Wikipedia's basic scope: "Provision of information in the field of general encyclopedic knowledge via the Internet." (I am quoting from the article about Wikipedia on the English Wikipedia.)

Reading that, and considering what a wiki is: all the debates about what is better, best, worse, worst etc. become unnecessary. If we are NPOV, all of these versions are equally valuable, and Wikipedia, having its unique goal and being a wiki, can combine all three of them in one. Isn't that incredible? So why limit what can be added? Even a small sentence can be of value ... and even a town or place that seems irrelevant to one person can be relevant to somebody else. Who are we to say (just as an example): this city or river may not go into Wikipedia as a stub since it is not relevant enough ... who tells us what is relevant or not? Wouldn't that be against the NPOV policy?

Oh yes, now I hear some shouting: but then there are 5,000 articles about cities that are just stubs, and the Wikipedia seems bigger than it is ... well: go to the library, take one of those general encyclopaedias and look into it ... remember: for a kid, one sentence telling where a city is, is often enough - if there is more: even better, but that one sentence can be a huge help when they study.

Imagine one thing: at the moment I am writing this, I don't have a clue how many articles there are on nap.wikipedia ... I never really cared about numbers ... you don't believe it? Ask people who know me ... I simply don't look at that stuff. I contribute to wikis because I like to do it - it is irrelevant where and what and when. It is irrelevant how many edits I make ... I don't know how many of mine are around. What counts is that we do what we do because we love to do it.

I repeat: each article, even one of only one sentence, can be of high value for somebody searching for information ... please don't exclude the small ones, and stop counting numbers ... it will help you a lot. We are not in competition - we are co-operating projects, that's all there is to it.

Apertium and OmegaWiki

Apertium is a machine translation tool. OmegaWiki is a dictionary ... both can benefit from each other. For now, people who work on Apertium dictionaries don't work on OmegaWiki ... well, in the future this could change, and people who work on OmegaWiki could become Apertium contributors :-)

How? Well, this morning the Apertium lead, Mikel Forcada, confirmed that it is fine for them if we add a field to OmegaWiki that can contain data structured in Apertium format. Of course, we will need to follow their paradigms when we create that short structural piece, but that should not be all too difficult. There is aimaz, a contributor to Apertium, who is considering a "Paradigm Crunching Program" - this evening I will try to understand more about it and how it could possibly connect to all the other things we have in mind.
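To give an idea of what "data structured in Apertium format" looks like: in Apertium's bilingual .dix dictionaries, each translation pair is an XML entry with a left (source) and a right (target) side plus part-of-speech symbols. A minimal hand-made sketch for an Italian-Neapolitan pair - the entry itself is my illustration, not taken from any existing dictionary:

```xml
<dictionary>
  <sdefs>
    <!-- symbol definition: "n" marks a noun -->
    <sdef n="n"/>
  </sdefs>
  <section id="main" type="standard">
    <!-- Italian "cuore" (heart) on the left, Neapolitan "core" on the right -->
    <e><p><l>cuore<s n="n"/></l><r>core<s n="n"/></r></p></e>
  </section>
</dictionary>
```

A field in OmegaWiki would presumably only need to hold such `<e>` lines; the surrounding skeleton stays in the Apertium language pair itself.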

Anyway: the connection between OmegaWiki and Apertium will make a difference for many less-resourced languages. I am thinking of languages like Piedmontese and Neapolitan, and of course also Bavarian and others around the world: we will be able to translate texts automatically - Wikipedia is particularly well suited, since these are texts without "feelings" that need to be transmitted. That way, the few people who can really write in these languages can concentrate on proofreading.

Maybe you are now wondering how good such a machine translation is: well, that very much depends on the amount of data available for the specific language combination. When it comes to Catalan-Spanish I have seen extraordinarily good results.

Machine translation will never really substitute translators, but it will be a big help for certain kinds of texts and for certain languages in particular.

Tuesday, May 01, 2007

Understanding the OmegaWiki MySQL database

Well, I am not all the way through it yet, but I am finding what I need bit by bit. What I am wondering is: why is there no documentation? Why did I have to download the whole bunch to see what is where and find it out myself? That takes time, and the way the database is structured probably leaves people who would like to take a look - and eventually contribute queries - quite annoyed ...

So let's see how I, a person who is a translator, not a programmer, and who doesn't know too much about MySQL, found out where to look.

First step: I downloaded the MySQL database dump from the page on OmegaWiki. Then I needed software to open it. Zdenek from dict.info told me that XAMPP would be good for this, so I downloaded the portable version. I chose the Windows installer, which guided me through the options. I activated "Install Apache as service" and "Install MySQL as service".

Once it is installed, you launch it. Then you open your browser, enter http://localhost, and you get a start page. From there I launched phpMyAdmin (on the left-hand side) and created a database called OW.

The MySQL database dump I downloaded - the one with the OmegaWiki data - was extracted and then copied into the xampp/mysql/bin folder. Then you need to launch the Windows command prompt. From there you go into the directory where the OmegaWiki data is and launch the command mysql -u root -p -f ow; this copies the OmegaWiki database into your local OW database.
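Put together, the command sequence looks roughly like this - a sketch only, since the XAMPP path and the dump's file name are assumptions on my part, and it needs XAMPP's local MySQL server to be running:

```shell
:: From the Windows command prompt (cmd.exe); the path and the
:: dump file name are examples - adjust them to your setup.
cd C:\xampp\mysql\bin

:: Create the empty target database (creating it in phpMyAdmin works too):
mysql -u root -p -e "CREATE DATABASE IF NOT EXISTS ow"

:: Load the OmegaWiki dump into it; -f ("force") keeps going
:: even when single statements fail:
mysql -u root -p -f ow < omegawiki-dump.sql
```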

Then, within phpMyAdmin, I chose the OW database, and from there a series of trials started to find out what is where.

... to be finished ... as soon as there is some time ...