Saturday, December 08, 2007

When Neapolitans become Indians and Pulcinella dances his raindance ...

You are wondering about that title, right? So let me tell you the story. On our group for Neapolitan language we have people from many places in the world. Not all speak Italian, most speak English, many speak or learn Neapolitan ... what a mixture, right? Well ... discussions are always a bit particular since we have to use the language which most of us can understand and write - otherwise: would it make sense to discuss just in a handful of people? Uhmmm ... Wikipedians who are about NPOV will understand that the more people are involved the better it is for any kind of project and discussion.

There was this member of the group ... he complained about us writing in English, talks about the birth (?!?) of a new language called Naplish (never heard of this ... could be some Neapolitan indigenous dialect???) and says that there are only two natives in the group????? Moment ... but ... besides the two we all know who live in Naples also he lives there ... ehmmm ... I'd say they are at least three then, right? And the rest of the many of us who live in Neapolitan speaking regions??????????

As if that were not enough ... first we are told that we should not write in English, but then the one who writes in English is actually the one complaining ... uhmmm ... does he perhaps believe we don't understand him when he writes his very own version of Neapolitan??? Yes, he has his very own version that does not follow the actual grammar and spelling rules ... but: considering that we know how to talk ... well: we can understand what he writes when we apply Italian spelling rules to Neapolitan pronunciation (what a mess ... right? ... well, that's Neapolitan ...).

He talks about two major difficulties ... let me quote from that mail:

"On one side the complexity of a language counting infinite variations on a
rather vast territory. Very often one may find deep lexical, phonetic and
syntactic discrepancies at very short distances.
On the other side our intellectuals and institutions live isolated in their
ivory towers, out of touch with the 'indians confined in their cultural reserve' where the language is still actively used."

Wow ... are there languages that are not complex? Without variations??? As far as I know only dead languages are without variations (but still complex) ... so what? Or am I completely ... ehmmm ... no ... ?!? I mean, even if you take someone speaking Italian who lives in one region and then you take another one who lives in another region ... they both speak with variations, even grammatical variations ... even different words used to talk about the same thing, but it is all considered to be Italian ... so why should this be different for Neapolitan, which has much older roots and therefore more influences from outside?

"Our intellectuals" ... uhm ... who are these? I mean, those few people who write Neapolitan are intellectuals in his opinion? And so I would be one of them? Gosh guys ... I'm an intellectual and I did not even know that ... that is hilarious ... And then: they live isolated in ivory towers? Wow ... I'd like one ... that might solve our space problems ... and again: we would then not be in touch with the Indians (that is native Neapolitans) who live in cultural reserves (cities, small places etc.?) where the language is actively used?!?

Wow ... I am married to an Indian then ... and I live in the middle of a reserve ... hardly anybody here speaks Italian (well they know how to speak it, but not with people who live here - Italian is for strangers and even if I am German: I am not considered a stranger anymore ... they speak Neapolitan with me ...) or did I then become an Indian myself? (Indians, the real ones, please don't be upset about me using this - you have a great culture, please be proud of it and help to make your language and culture survive!)

Uhmmm and Pulcinella? Imagine him dancing a rain dance in his Neapolitan reserve ... well, hopefully rain will come and wash some negative thoughts away ... and Pulcinella will make people laugh like he has done for centuries ... have to write to a theatre company ... that's really something for them.

Quoting another piece:
"To aggravate the situation, these days you may find media and techniques which may tend to transform a language into an industrial product. The best
translation software will never be able to translate the nostalgic despair of
Santa Chiara or the magic wits of A rumba ré scugnizze."

First of all don't read the last sentence - there are 4 errors in 4 words ... if you want to know why, subscribe to the Napulitano newsgroup and read the explanations.

Then again: what I don't understand is that people seem to believe that machine translation is used and then the text is left as it is ... that would be plain stupid. Well, probably they don't do their homework about how professionals work and believe they are just better typists using Babelfish ... Instead: machine translation is a help so that you don't have to type in all that stuff - and: machine translation is not suitable for every kind of text, certainly not poems and songs ... oh ... I forgot that our dear writer believes that Neapolitan is used only in literature and music ... but then again, when you go into the computer shop here in the city and you ask for what you want using the proper word and pronunciation they often don't understand you: you need to know the Neapolitan one :-) (which very often is based on English spelling with Neapolitan pronunciation ... a bit like anywhere).

Next quote:
"Considering this situation, we should publish easy, basic texts with the support of available technology, thus showing the fluency of an idiom still in use and evolution."

Wow ... he is a member of the group where I posted many bits and pieces from nap.wikipedia linking to it ... and I sent in all the short and easy-to-read articles we published on Positanonews with "easy reading" links to OmegaWiki where people can find the word explained and translated into other languages ... or are my mails invisible???? (Please note that due to time problems the last article was not tagged that way.) Well over 4800 reads of one article show me that my mails seem to be somewhat visible ... or maybe there is some kind of magic wand for certain people that does not let certain information through and filters it out??? Or is it Pulcinella doing one of his tricks ... see, Pulcinella loves it when people struggle and fight (he is a bit naughty sometimes and helps things to go in a certain funny way ;-) )

Next quote (with some omissions):
"As a little start, let me follow the example of ... (anonymizing dots), adapted to my native expression:"

And then The Divine Comedy by Dante (!!!!!!!), which is really one of the easiest texts of Italian (or better: not really Italian) literature:

"Italian:
Nel mezzo del cammin di nostra vita
mi ritrovai per una selva oscura
ché la diritta via era smarrita. ....

Neapolitan:
Propie a mità ra vita mia
Ie me truvaie mmieze a na buscaglia scura,
roppe ch'eva perze a via maesta. ...."
(some attention to orthography is needed, sorry)

"English:
Half-way through my life
I found myself in a dark forest
after missing the right way. ....."

I mean, calling a text by Dante (born in 1265, died in 1321) a simple text ... well ... I suppose Dante would be really upset ... I hope he will not turn up in somebody's dreams and tease and prick him ... that could hurt ... well, he could use Pulcinella for that ... really ... and knowing how Pulcinella normally behaves :-D ...

Well let's say I prefer my Indians and Pulcinella dancing his raindance to using Dante to show simplicity of language ...

Friday, December 07, 2007

Translating Wikipedia articles (2)

As I said yesterday, I am coming back to this topic today.

Apertium is already used in some projects, one of which is the Occitan Wikipedia. For those who are not familiar with wikis: there you have the possibility to compare the version that has not been proofread with the proofread one, and that is what you will see by clicking here.

What you see on the left hand side is the text as it was after the machine translation and on the right hand side the proofread version of the text. The changes are highlighted in green on the left and in blue on the right hand side. There are even some parts of the text that were not changed at all.
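For those who would like to try such a comparison on their own texts, here is a minimal sketch in Python. It is not what the wiki software does internally, only an illustration with the standard `difflib` module; the function name `changed_words` and the two English sample sentences are made up for this example.

```python
import difflib


def changed_words(machine_text, proofread_text):
    """Return (removed, added): words the proofreader took out and put in."""
    raw = machine_text.split()
    fixed = proofread_text.split()
    removed, added = [], []
    # Compare word by word; autojunk=False keeps frequent words in the comparison.
    matcher = difflib.SequenceMatcher(a=raw, b=fixed, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(raw[i1:i2])   # the part highlighted on the left
        if tag in ("replace", "insert"):
            added.extend(fixed[j1:j2])   # the part highlighted on the right
    return removed, added


# Made-up sample sentences, not taken from any real translation.
machine = "the house is very big and old"
proofread = "the house is really big and old"
removed, added = changed_words(machine, proofread)
print("removed:", removed)
print("added:", added)
```

Counting the removed and added words gives a rough idea of how much proofreading a machine-translated text needed.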

The work on the glossary and the grammar rules (well I am not using the specific terminology here to make things understandable for all) has been going on for approximately one year now.

At a certain stage the problems arise from missing vocabulary and not so much from the rules. Of course these translations will probably never be 100% perfect, but the quality depends very much on us adding terminology and classifying it.

Compared to the above result, what you would see for Spanish-Catalan is much better, since that pair has been under development for years.

You can find further reading about co-operation between Wikipedia and Apertium on the Apertium Wiki.

Language pairs that are right now available are:

  • Spanish←→Catalan
  • Spanish←→Galician
  • Spanish←→Portuguese (pt and pt_BR)
  • Catalan←→English
  • Catalan←→French
  • Catalan←→Occitan (oc and oc@aran)
  • Romanian→Spanish

Many other language pairs are under development. Of course: you may start on any language combination that is comfortable for you. Please keep in mind: the more similar two languages are the easier it is to program the rules, the faster the translation engine will produce good translations.

If you want to start to work on wordlists, please write me at: s.cretella (at) voxhumanitatis.org and tell me which language pair you are interested in. You can also reach me by skype at: sabinecretella

I will upload a wordlist to Google Docs and give you access. Please let me know if you have difficulties working online (that is, if you work with a dial-in connection).

The Apertium Chat is on Freenode.

One more thing: I just received criticism claiming that machine translation would flatten the language. Well, any translated text, in particular when it comes to literary translations, is post-edited by a second person. The translation is never published directly, since during translation - and you can be the best translator in the world - there are always some bits and pieces that sound a little strange or that do not really carry the scene over into the other culture. And please allow me to introduce the concept of cultural localization here, which will be explained in one of the future posts and which was coined by Dr. Martin Benjamin, who is a member of the advisory board of Vox Humanitatis. The concept of cultural localization then immediately became part of the scope of the association.

And since I am adding notes here: please remember that the Fundraiser of the Wikimedia Foundation is still running and that you can help by donating and telling others that the fundraiser is on. For more information and to donate please click here.

Thursday, December 06, 2007

Translating Wikipedia articles ...

... into less resourced languages. Well, the time has come that we can start to think about how to go about a faster creation of content for the many small Wikipedias. As you all know, often we have just a handful of people creating and translating and then adapting articles. Well ... combining various Open Source and Open Content projects we can now go one step further in the direction of fast content creation, but that does not mean: stub upload. This is a completely different way of doing things.

Apertium is a machine translation tool that works really well with similar languages. About a year ago I had a translation from Spanish to Catalan done by Apertium through the online interface (http://xixona.dlsi.ua.es/apertium/) and asked some people from the Catalan Wikipedia to have a look at it. They told me that of course it was not perfect, but that it would be easy to proofread it and much faster than actually translating it. In March I ran a similar test during a master's course in translation studies in Pisa. I asked one of the students, who was bilingual in Spanish and Catalan, to have a look at the outcome of the machine translation of a general text. The grammar was almost perfect, and so was the terminology. There were just 5 corrections in a bit more than half a page (A4).

Now what does this mean for us: if we have a bilingual wordlist for two similar languages under a free license, we can pass it on to the Apertium people. From there we are a step closer to getting machine translation for that specific language combination on its way.

One note in between for the Apertium people who might read this: please don't mind me not using specific terminology to describe what needs to be done. It could become too techy.

So the next step is to identify what a term is and how it needs to be handled. For example, a verb needs to be declared as such, and then it needs a tag that indicates which conjugation scheme applies. This has to be done for all word types, that is verbs, nouns, adjectives etc. After that, grammar rules need to be considered. Step by step the level of correctness will improve, and the time invested in completing the wordlists (which will be available as a Google Docs spreadsheet) and in adding all the additional information will help to save a lot of time later. That is: now it will take longer, but once the engine has "learnt" how to deal with the terminology and grammar for that specific language combination, creating content will become much faster. This will help the small projects, because the few editors can concentrate on proofreading and adapting, and it will result in faster content growth of quite high quality.
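To make the tagging step a bit more concrete, here is a small sketch in Python of what a wordlist row could look like and how one could check which rows still lack information. This is only an illustration: the field names, the scheme label "regular -ar" and the helper functions are hypothetical and are not Apertium's actual dictionary format.

```python
from dataclasses import dataclass
from typing import Optional

# Word types named in the post; a real wordlist would need more of them.
WORD_TYPES = {"verb", "noun", "adjective"}


@dataclass
class Entry:
    source: str                      # word in the source language
    target: str                      # word in the target language
    word_type: Optional[str] = None  # e.g. "verb"; None means not yet declared
    scheme: Optional[str] = None     # hypothetical label for the conjugation scheme


def missing_information(entry):
    """List what still has to be added before the entry is usable."""
    problems = []
    if entry.word_type not in WORD_TYPES:
        problems.append("word type")
    if entry.word_type == "verb" and not entry.scheme:
        problems.append("conjugation scheme")
    return problems


def untagged(entries):
    """Return (source word, missing items) for every incomplete entry."""
    report = []
    for entry in entries:
        problems = missing_information(entry)
        if problems:
            report.append((entry.source, problems))
    return report


# Spanish-Catalan sample rows; the scheme label is an invented placeholder.
wordlist = [
    Entry("cantar", "cantar", word_type="verb", scheme="regular -ar"),
    Entry("hablar", "parlar", word_type="verb"),
    Entry("casa", "casa"),
]
for word, problems in untagged(wordlist):
    print(word, "is missing:", ", ".join(problems))
```

A check like this could be run over the spreadsheet export from time to time, so that contributors see at a glance which words still need a word type or a scheme.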

This project, which is going to take care of less resourced languages, will be one of the first led through Vox Humanitatis. Should you be interested in helping with the wordlists, please let us know which language combination you would like to work on (that is, starting from English right now and step by step from others, since most of the terminology is there in English). We will get you access to the online document. If you need to work offline, please let us know. You can contact me by e-mail: s.cretella (at) voxhumanitatis.org

I just received a list of the supported language combinations as well as an example for Catalan-Occitan and some notes on evaluation of machine translation co-operating with a Wikipedia community. This means I have quite some further stuff to tell you. I'll post that info tomorrow, otherwise this blog would become too long.

Please also note that the documents will be released under CC-BY license and therefore they can be integrated into any wiktionary.

Naples' airport ... a very particular publicity

When on Sunday, 2nd December, I was waiting for boarding to go to Barcelona, not having a book with me, I took some photos around, and one is special: the Italian mineral water producer Ferrarelle has an advertisement in which each line is written in a different language, in this case using "e 'o tiempo". It was the first time I saw Neapolitan put on the same level as all the other European languages.

Thank you Ferrarelle!

(And yes, I don't mind giving a commercial company relevance if they do something like that).

Wednesday, December 05, 2007

Local languages applied - Catalan


During the last three days I was in Barcelona at the European Forum on Science Journalism, but more about that during the following days. Now I want to talk about a language that has made its comeback into every day life and is doing really well.

I was in Barcelona quite a long time ago, just for a transfer to change planes and reach Malaga. Back then, I remember, the signs at the airport were in Spanish and English. Today when you come out of the airport you see them in Catalan, English, Spanish. Now you will say: but what's so special, right? Well, I already knew that Catalan now is "official", but knowing it is one thing and experiencing it is a different one. Imagine the Naples airport with signs in Neapolitan, English and Italian or the Turin airport with Piedmontese, English and Italian ... it gives a very particular feeling to see that. In Catalonia people are very proud of their language and culture. When you talk to them they will tell you that it is relevant to use it for anything, at home and in business and of course at school or universities.

I was not sure about what to do: going by bus to the city centre or taking a taxi, but considering that probably Sunday around noon was the only moment when I could have a short walk in the centre I chose the bus. It turned out to be the right decision ...

When I reached the centre I saw something that looked a bit like a market and of course: I had to go there and see what was on. It was an exhibition of goods with labels in Catalan and there was an information stand. So I looked a bit around to see what they had. I started to talk with some young people who could not really understand English and so they called a man who then spoke French. Well my French is far from being perfect, but we managed to talk and so I found out he was the husband of the president of ADEC (Associació en Defensa de l'Etiquetatge en Català) and that they are actively promoting Catalan and with that typical Catalan products.

They had publicity material and I took some of them with me. I also got the contact to the association itself - that is: I will need to write them, but I should do that in Catalan - so if there is someone out there who can help, it would be great.

Besides all the other particular things and also similarities I found between the Amalfi Coast's traditions and typical things in Barcelona, one thing immediately came to mind: if this is Catalonia and it is so distinct in how things go and are done, in architecture etc., what might other regions of Spain be like? I mean: I had that feeling of wanting to know more.

Now what does that tell us: I was looking at Barcelona with the eyes of a tourist, and seeing all these particularities with all these particular names brought me to the conclusion that besides Catalonia there are many other regions of Spain to be explored. This was local languages applied - unification using diversification. By underlining the differences, Catalonia helps the other regions in Spain to become more relevant as well. Tourists will come back over and over again, because they will feel that there is really a lot to discover.

Now imagine how that would be when a tourist comes to Italy. Having different languages which distinguish the local products from the national ones will help them discover Italy - if you have the same name written on every product you need to read further to understand where things are from, and the local product and its particularities lose much of their marketing force. Instead of just buying one bottle of wine from Italy when they go back home, they will buy various bottles of wine: from Sicily, Tuscany, Campania, Puglia, Veneto, Piedmont (just to name some). That is: it would increase demand and therefore the economy would have better prospects.

In Catalonia they went so far as to have Catalan also in universities, used for research, and in schools. Kids grow up with Catalan and Spanish on one level and immediately also start with English. This means these kids will be used to thinking in various ways and will be able to easily connect to other cultures. They will be better communicators. And this will again lead to economic advantages for the region.

Of course we cannot apply everything at once, but step by step reconsidering what local culture really is about, we can follow this really great example Catalonia is giving us.

The same is valid for the whole world - for all regions, all people, any culture.

And yes: sooner or later I'll be back in Barcelona and discover it more deeply. During the next days I'll (time permitting) tell you more about the Museum, the conference, people I met ...

Monday, November 26, 2007

Neapolitan reaches 1000 entries on OmegaWiki

This is a very short post - just to tell you that now there are more than 1000 words in Neapolitan language on OmegaWiki (and no, I did not add number 1000 since I am still hurrying after hundreds of things to do). Thanks to all who contributed to make the dictionary grow :-)

Friday, November 23, 2007

Ubuntu and D-Link 504T

I already had difficulties last time getting my internet connection to work properly with Ubuntu, so I expected the same this time, but: this time the connection was there when I used the command line, but did not work when I wanted to start Firefox. Well, this meant a firmware update of the modem. So I downloaded the new version, and there was another file that I downloaded "by chance" - a how-to for resetting the ADSL modem ... well, it was one of those feelings: yes, I definitely needed it. I did the whole reset routine so many times and ... well, I went to bed frustrated. This morning I first tried to access the modem with my old Windows computer, no way - so the last chance was to use Ubuntu ... and now: I have a working internet connection with this Ubuntu computer, but no internet connection with my Windows computer - for whatever reason :-( I hope that at least by this evening I will have everything more or less running on this computer and then I will take care of the other one ... or maybe in the meantime, after shutting Windows down and restarting it, it will work again for some strange reason.

Thursday, November 22, 2007

Choosing what to do first or ...

... just having to react???

These days are particular, busy: I should do hundreds of things at the same time, since for one month I did not really do much.

In addition to the ordinary work, wordsandmore.org also seems to be getting more attention. The strange thing is that this happened because spammers had taken over the website during the period I was away and people became aware of it. It is great that they have now started to check the website and hopefully will also start to contribute. Since these neat, mainly Chinese, spammers (yes, that's ironic) used accounts to create their spam, we have now closed the wiki almost completely, and people who want to co-operate need to ask for an account. Well, it is not difficult to create one, and it seems they don't mind sending an e-mail to become members.

Still some spam accounts have not been blocked, since we are doing this step by step when we see that the accounts are used.

I hope this is a step in the right direction and the w&m website starts to create its communities.

Besides that I received the logo for vox humanitatis and had to set it up, otherwise the others could not have gone ahead with inviting people to join us.

The Wikimedia fundraiser is on and there's not much for me to do there - what I am actually doing is inserting a footer in each e-mail that tells people about it. So if you who read would like to donate, you can do this on the fundraising website.

Destinazione Italia - Gerard will go crazy with me these days. I start to see which files are ready and I am interrupted either by Skype or by phone or by one of all these other neat things to do ... this needs doing ...

I received the confirmation for the conference in Barcelona - that is 3 and 4 of December - I'd say it is a good moment to talk about quality and Wikipedia and hopefully be able to make people aware of the fundraiser as well.

Christmas: only during the last few days did I notice that the 24th of December is a Monday - that means that we can go to Germany for Christmas ... and I am happy about that, since leaving my stepfather there alone ... no, I would not like that. It's the second Christmas without my mom ... so the organisation will be mainly up to me :-)

My laptop is getting worse and worse here ... so I am waiting for the new one that was already sent from Germany. My first completely Windows-free laptop. I'll use Ubuntu - my reseller already installed it :-) Well, Vista is not an option: it would hinder quite some of the work I do, quite some software I need would not work and also most of my hardware would not be accepted. An Acer Extensa - first I wanted a Dell, but in Italy they only sell Windows on Dell computers and Dell Germany does not ship to Italy ...

Already time to prepare and get the kids from school - I should also have a look around the cribs and we were invited by a group called 'E Scugnizzi - so next week on Monday or Wednesday evening we will take our tour - first looking at the cribs that are steadily growing and then the music.

And then there is the article to be posted on Positanonews ... it is already here, but I don't have time to tag it for easy reading ... probably this time I will just publish it, since there is not enough time to do everything.

Well ... time to go and get the earthquakes ... and when I come back I hopefully can get some mails and files ready.

Wednesday, November 07, 2007

When things don't go as expected ...

Well, many of you know that I was supposed to deal with the Fundraiser 2007 and then at some stage, in October, I disappeared ... many probably asked themselves what happened and did not find an answer (I just sent a note to a very small group of people about what was going on): well, my husband was in hospital and came out again some days ago. In the meantime my kids were ill and had to take antibiotics, and as the last member of the family I had the same cold they had and have just finished taking my medicine. Again I got a call from the school that Marco was not well and that I should take him home ... I went there and took him home - well: he is not really ill, just some coughing ... but enough to create some trouble at school.

I already tried to get back on my track ... like so often, things simply don't want to go like I want them to go and therefore I am not sure how much I will really be able to contribute during the next weeks.

Thanks to all those who supported me during this period, thanks to all who are actively helping the fundraiser.

Probably I will be able to do only very limited things, but I can see that things are going ahead and that is the most relevant thing.

Again: thanks to all who helped and are helping. I hope things will be back to normal, soon.

Sunday, September 30, 2007

Wikimania 2008 ... where is it going to be?

Well, these days many of us talk privately about it ... no, I am not going to tell you what others say, I am going to tell you where I would "feel" it right.

Alexandria

A place where cultures and people meet, it is somewhat of a central point when it comes to connecting the modern with the ancient world. It held the biggest library of the ancient world, a place where wisdom was collected ... wisdom that reaches up to our days.

Isn't this a perfect merge? The antique world of knowledge combined with free knowledge for all?

The antique centre of wisdom meets the centre of wisdom of the present and future ... I find it unique. And besides many other facts that also are advantageous for Alexandria, this is my main point ... it can and probably will take us to the next level.

No, well, one other thing is probably really relevant, besides the "feeling": Wikimania 2008 in Alexandria can attract people from the Middle East and can also contribute to peace when people start to co-operate on projects about knowledge. The wiki world is a very particular one and I believe that many of you will agree when I say: it can change a life and how people think, since we all feel or felt it ourselves.

I personally favour Alexandria for Wikimania 2008.

Sunday, September 23, 2007

Wikimedian of the hour ....


Well, it is not about being the best, the biggest, the whatever ... it is about contributing actively to the fundraiser ... a test-feature that shows a photo of the Wikimedian of the hour is running on two wikipedias for now - the Piemontese and Neapolitan Wikipedias. Besides that we decided that it made sense to have the donation page in our languages as well. Piemontese is already there ... Neapolitan still needs translation and then proof reading.

Now let's come to the point: the sense of this exercise is to get more people to look at the donation page during the fundraiser period. We all know that a picture or graphic attracts our eyes more than just a written line. The Wikimedia Foundation will need more and more funds since it keeps growing exponentially. This means we need and want to reach user groups that were not reached before, and there are plenty of them. Of course we cannot do everything "right now" since the time left until the next fundraiser is short, but at least we should start to do something. Showing pictures of the community creates a different feeling of "being part of real people". It welcomes people in a different way.

These "Wikimedian of the hour" pictures can be used anywhere - also in the village pumps for example. Well, I would like to see you ... yes, you who are reading ... among us as well. You are a Wikimedian, so you should be there. Of course, those who want to remain anonymous can send me their picture and I will simply upload it without information about who it is. Pictures I receive by e-mail for publication on Commons and flickr are released under the cc-by-sa 3.0 and GFDL licenses.

If you don't know how to get that picture of yourself, well you can do a collage or just a screenshot like the one included in this blog.

The picture cycle gets updated regularly by a bot that replaces the current pictures whenever new ones are added.

If you are on one of the projects that would already like to have the Wikimedian of the hour online, please let me know. I can pass you the file for the upload or, if needed, upload it with my bot.

I believe that for small projects (not only Wikipedia, but also Wiktionaries, Wikisource etc.) it will not only have a fundraising effect, but may also attract more people to look at and hopefully contribute to these projects.

You can reach me by e-mail at scretella (at) wikimedia (dot) org

So I hope to see your picture online soon or have it in my mailbox. :-)

Friday, September 21, 2007

Two who love wikimedia projects ....

... and regularly contribute were on holiday on the Amalfi Coast in Maiori ... it was great to meet them after knowing each other for such a long time from various projects. It was funny to listen to French and understand it and to answer in German or just talk German :-). This is also how my twins got their "Zuckertüte". I suppose they had fun here on the coast, and they visited a place where I have never been: the top of Mount Vesuvius ...
When we went to Positano to have a look at the shops and the particular way the houses are built, and where I, this time, had the chance to meet a well known artist from the Amalfi Coast (who after the fundraiser will get an article on nap.wikipedia), we met with Michele Cinque in a bar and so I took the chance to take some pictures ... of course also for our WikiLove campaign.

What? You don't know what that is? Well, we are trying to get pictures of many people who love our projects, to be used for drawing more attention to the upcoming fundraiser. There is a flickr group and a category on Commons. Well, yes ... it still needs to grow and I wanted to take my photo today, but again my husband is not here to take it ... and really, one in pyjamas when I wake up as he comes home after work would not give the best impression ;-)
And what about you??? Is your picture already there? No? .... So what about adding it on Commons or flickr? So many will be able to see you during the fundraiser period ...

Have fun!!!

Thursday, September 20, 2007

Wikimedia Foundation and China ... Beijing and Fundraiser ...

Really this is something I had already been playing with for some days, and now it happened that Karl Siu added me as a friend on Facebook and so I saw his question: "Should XXIX Olympic Games in Beijing be boycotted?" ... My first answer is no - by boycotting them you don't achieve anything, just some more problems are created. Instead of boycotting we should support them.

And what does this have to do with the Wikimedia Foundation and the Fundraiser? At a first glance nothing ... at a second: it can make a huge difference, depending on how we bring the message over and how our community would like to adopt such a message.

What do journalists very likely anywhere in the world use to find background information on the news? Wikipedia, right? Which kind of information will reporters from anywhere in the world need? Well: all that is in some way connected to the Olympic games ....

Now what would happen, if we start a project now, making it public to the press, that is about creating background information on the Beijing Olympic Games? That is making Wikipedia the most relevant information resource for background information for that event? What if we actively ask journalists to tell us which kind of information they would like to see and that the improvement on the articles and new articles are based on these questions?

What if these articles then are translated (or written) in as many languages as possible?

I believe that we have enough people who can help with such a giant project ... I believe that we have such a great community that has enough of that knowledge or is able to research the needed bits and pieces ... and that will save journalists loads of time, right?

Now ... do you believe that journalists who are going to save quite some time would also help us in some way? I would say: the likelihood is very high. So what could they do for us? Well: help us to talk to people ... that's their job ... they can help us in different ways by telling the world that such a project is starting (now?) ...

1) they can and will attract readers and writers
2) they can talk about the fundraiser and ask people to remember that (donations) - don't forget: these people know how to get messages over ... so they could do it ... right?
3) the newspapers could grant some space to get our message in
4) people who read Wikipedia and follow the project that should then be very active during the fundraiser eventually (and hopefully) will see that the fundraiser is on and they donate ...

So: it is, in the end, all connected.

Maybe this is the way to involve a huge part of our community indirectly in the fundraiser?

And: it could also be a good opportunity to open a formal contact with the Chinese Olympic Committee ...

Btw. the press agency for Beijing will also follow the Earthrace - an event that wants to make biodiesel more attractive by attempting the world record for circumnavigating the globe. There are some youtube videos around ... I also saw a presentation video a good week ago ... can't find the link right now - just search for Earthrace on youtube and you will find it ... it's quite interesting.

Sunday, September 16, 2007

Fundraiser 2007 - Responsiveness of Community

Many of you probably know that I am dealing with the Fundraiser 2007 of the Wikimedia Foundation ... well ... there is one thing I feel a bit strange about: it seems as if general messages in village pumps and mailing lists where we ask for help simply don't get through ... or people simply don't read ... now there is a last attempt to be made and that is to contact people one by one ... that is, going through the projects and asking active people for help. In some way it makes me feel like spamming around and I don't actually feel comfortable with that, but on the other hand it seems to be the only chance we have ... uhmmm ... will go and do that now ... don't know if this translation of a saying is correct in English: when the prophet does not come to the mountain, take the mountain to the prophet ... have a great Sunday!

p.s. and yes, I already added a fundraiser button to my blog ;-)

Tuesday, September 11, 2007

Sardinian - Sassarese languages or language and dialect?

Well, there is a nice website that can help us with that question ... and that is from the institution that cares about this officially - the Region of Sardinia.

When it comes to the Limba Sarda Comuna used on the actual Sardinian wikipedia there is no doubt that the language exists, but we must appreciate that it is an artificial language that was created out of the living languages of Sardinia. The website of the Region of Sardinia states:

Limba sarda comuna: una lingua realmente esistente: Sa Limba sarda comuna è naturale per il 92,8 per cento, è in posizione mediana rispetto a tutti i dialetti del sardo e può ancora essere migliorata per farla diventare la lingua ufficiale dei sardi.

Limba sarda comuna: a language that in fact exists: Sa Limba sarda comuna is 92,8 per cent natural, it is in an intermediate position compared to all Sardinian dialects and can still be improved to have it become the official language of the Sardinian people

So they still want to improve the language ... nice ... 92,8 per cent of it is natural, which means 7,2 per cent is not natural. If I compare these percentages with what translators work with every day, that is the "matches" we get in our CAT tools, then 92,8 per cent is a low percentage of being "natural". It seems to be high, but in fact it is not ...

Let's say I translate any kind of text (a sentence for example) and my analysis software tells me that the text is up to 93% identical to another sentence I translated before: this means that I cannot leave the sentence as is, because I will need to change at least one word in the sentence to make it a proper translation of what is there.

Just to give you an example:
The house on the hill is green - that is what was translated before. Now I get such a 92,8 per cent match with a sentence like: the tree on the hill is green. If I left it as is: it would state something completely different.

You can also look at it like this:
The house on the hill is nice and green. - that is 100% English
The house on the hill is nice and vert. - that is approx. 89 % English + 11% French
(it is just a matter of playing with the amount of words to get the 92,8%)
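To make the arithmetic behind that example visible, here is a minimal sketch in Python. It assumes the simplest possible measure, a word-by-word comparison of two sentences; the function name match_percentage is made up for this illustration, and real CAT tools compute their fuzzy matches in their own ways, so take it as a rough picture only.

```python
def match_percentage(reference: str, candidate: str) -> float:
    """Share of word positions (in percent) where both sentences agree.

    Hypothetical helper for illustration: a naive position-by-position
    comparison, not the fuzzy-match algorithm of any particular CAT tool.
    """
    ref_words = reference.lower().split()
    cand_words = candidate.lower().split()
    longest = max(len(ref_words), len(cand_words))
    if longest == 0:
        return 100.0  # two empty sentences are trivially identical
    same = sum(1 for r, c in zip(ref_words, cand_words) if r == c)
    return 100.0 * same / longest


english = "The house on the hill is nice and green."
mixed = "The house on the hill is nice and vert."
share = match_percentage(english, mixed)  # 8 of 9 words agree
```

With nine words and one of them swapped, the share is 8/9, so just under 89%. With fourteen words and one swapped it would be 13/14, roughly 92,9%, which is about the figure we are talking about: one word in every fourteen is not "natural".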

So what these 92,8% tell us: even if a huge part of it is considered to be built out of the "natural language part" it is still an artificial language.

But what is a language and what is a dialect? Well: that very much depends on the POV from which you look at things. But ISO determined some rules to understand what a language is and what not. That is, before you can get an ISO 639 code for a language you need to prove that this language complies with the standard. Of course there are living languages that don't have an ISO code, because up to now nobody cared for them - I am just thinking about Griko Salentino, a language spoken and written in Italy - but if people care about that language, they will ask for it.

What is a dialect ...

a) a language without an army
b) a way of expressing oneself orally that developed out of a language and that has some differences, for example in pronunciation, some expressions etc., while having the same basics when it comes to grammar (just to mention one example)

So could
Campidanese (ISO 639-3: sro)
Gallurese (ISO 639-3: sdn)
Logudorese (ISO 639-3: src)
Sassarese (ISO 639-3: sdc)
be dialects of the Common Sardinian Language? Well ... from a purely logical POV this is not possible, because they were there long before the Common Sardinian Language was created ...

When they requested their ISO 639 code, they complied with the requirements of the International Organization for Standardization, and therefore, having that code, they are considered to be languages on an international level.

Please let me repeat: there are languages that don't have one, but these can request a code ...

When it comes to the language committee we had to draw a line somewhere and this line should not come from us, that is: it is NOT up to the members of the language committee to decide what a language is or not. We needed some kind of standard to apply and the clearest one was and still is the ISO standard. So if somebody wants to complain and say that the four languages above are in fact dialects of Sardinian and not languages, we should kindly invite them to create their papers and contact ISO directly to have the ISO 639-3 language code taken away ... it is NOT up to the language committee to take such decisions.

Another thing people should then also consider doing: UNESCO also states that these four languages are languages and they are in the red book of endangered languages - so if someone states that they are not languages and he/she is so sure about it: they should also contact UNESCO. It is NOT up to the language committee to take such decisions as to delete four languages from the endangered languages list ...

Sorry for being so ironic, but: when such discussions about what is and what is not a language come up ... well: before you come to us, please go to the INTERNATIONAL bodies that deal with the question.

We are only normal people who base their decisions on standards and can tell people where to go to request their code, but we can neither create that code nor influence what is recognised on an international level. (Nor do we want to do that).

Now to the question of sc.wikipedia ... I remember that, at the beginning, sc.wikipedia tried to host all of the Sardinian languages, then someone came up and decided to make sc.wikipedia a Limba Sarda Comuna wikipedia only. Well: the Limba Sarda Comuna is being used by Sardinian Authorities to facilitate their work.

In any case the code "sc" stands for the macrolanguage Sardinian and not for the Limba Sarda Comuna, so there is no reason why it should have the right to claim that code for the language. That is, the Limba Sarda Comuna, like any other language in the world that wants recognition by ISO, must request its own ISO 639 code. It is not an option to simply say: now let's take that one since it is there ... well, the one that is there stands for something else.

The question of the actual sc.wikipedia came up because of people telling us that Sassarese is not a language, but a dialect of Sardinian and that the Limba Sarda Comuna (Common Sardinian Language) is the only "right language" of Sardinia.

Well again: it is not us who are going to decide on Sassarese and the other three being or not being a language - we rely on ISO 639-3 codes since we had to draw a line and avoid simply asserting things. It is not us who are going to decide if the Limba Sarda Comuna is going to get an ISO 639 code. If you, who read this, are interested in this matter, it is up to you to get things on their way.

See: the decision to base whatever we do on ISO 639-3 was one of the wisest decisions ever taken within the language committee ... imagine which fights (almost all politically based) we would have if we did not do this.

Just to make things clear - I repeat it again:

a) we do NOT decide if something is a language or not
b) we base our decisions on ISO 639-3
c) we actually need a solution for various scripts used for one language
d) we would love to see Multilingual Mediawiki there since it could be used to create easily sustainable communities
e) we are not going to go ahead on discussing if Sassarese is a language or not (it has a code)
f) we will need to find a solution for Limba Sarda Comuna which does NOT have an ISO 639 code and is using the sc code in an improper way.

Thank you for your patience and understanding.

Saturday, September 08, 2007

Fundraiser 2007 - communication with projects

Well, dealing with the Fundraiser 2007 I am trying to involve the whole of the communities. But that seems easier than it really is ... one thinks: oh well, there are the village pumps and you just go around them ... or you go through the mailing lists (but not all projects have one) ... or in the worst case you use the various chat rooms ... well no, it does not really work ... a really well structured communication in this specific moment is not possible - and in some way we should think about a solution.

Village pumps:

I am getting to them step by step - there is no all-comprehensive page - but even if there was, there is one huge problem: they are structured very differently from one project to the other. Often I posted "somewhere" without even understanding if it was the right place ... some have extra pages for messages written in a different language from theirs (but then again: there you don't reach a maximum number of people who maybe would help). Some have different sectors for different themes ... but again: there you don't reach all potential people, just the part of them that goes to the "general" village pump page. Uhmmmmmm ...... not sure how to sort this out ... and no: I would be against a page for special "foundation information" since again it would be read by only a part of the users/editors and these would probably be more or less the same ones who read foundation list ....

Now I asked for a bot that can help us to add new sections to specific pages within the pywikipediabot framework - this would at least make one part of it easier, not having to go around all projects, but still the "how to communicate effectively" problem remains ....

For now, until we maybe get a better solution, I would like to ask people from the various projects to check the page where I list the village pumps and change the links I have there to the page where they want to have the messages added - for all projects please - this will help all of us to live an easier wiki-life. During the next days I will then make another round around the various Village Pumps asking people to correct the link on the page above if necessary.

I am sorry, but for now I don't see a different way to get this on the way.

And please: all links to all projects are needed - also the smallest ones ... they all have the same relevance.

Thanks!!!

Thursday, September 06, 2007

Buttons and Banners for Wikimedia Fundraiser 2007

Yes, I am going to tell you also here :-)

We are organising the fundraiser 2007 and we will need quite a bunch of help to get things on their way. So one of the first things we are doing is to care about buttons and banners that we can then add to our blogs, websites, user profiles wherever so that people anywhere in the world see it and can help.

So if you like creating buttons and banners and wish to help with the translations of the buttons and banners, have a look at meta:

Description:
http://meta.wikimedia.org/wiki/Fundraising_2007/Buttons_and_banners_to_be_translated

First examples and translations:
http://meta.wikimedia.org/wiki/Fundraising_2007/web_buttons

Thanks and have a great day!

Monday, August 27, 2007

2000 reads ... crazy ... but they are definitely there ...

The article about Freecol reached 2000 reads today. It was published on the 14th of July and is still getting approximately 50 reads a day. Looking at the statistics and Google analytics, most of these reads are unique reads - this means almost 2000 people looked at it. Mainly in Italy and the United States, but also in Germany and other parts of the world.

I noted that also the newer articles, like the one about the photo contest and the one about I Musicastoria are getting quite many reads.

What does this mean? I mean I can understand the photo contest, because it is about the Amalfi Coast and I can understand the amount with I Musicastoria which is a group that has really great music and is quite known around here ... but why the game? Probably these amounts of reads that are really high compared to those of articles in Italian, tell us two things.

Besides the fact that officially people still consider their native language, in this case Neapolitan, as a second class language, there is a high interest in it and probably people love to read and use their language ... maybe something is changing and they start to see it as relevant again ... it seems as if they desperately search for contents ... and what does this mean for our small Wikipedias? Probably there are many many readers that wait for us to write ... often these Wikipedias are the only projects that provide contents in less resourced languages and therefore their role is quite different from what Wikipedia actually should be.

The other notion is that if so many people look at a computer game they eventually want one in their language ... this was the reason for us to go ahead and localize another game that is for now available for computers and will be available for mobile phones soon. If you want to play a game in Neapolitan or Piedmontese language, just have a look at Sudoku Mania and should you want to contribute with your localization: please contact me at s.cretella (at) wordsandmore (dot) org. I would really like to help you with the localization which is easiest done using OmegaT, since it already has a filter for java .bundle and html files.

Friday, August 24, 2007

.po files and OmegaT ... (Abiword)

Well, like so often things happen when you don't expect them to happen. We have almost 40°C here and I suppose that on the balcony it is well over that temperature ... no way of really being able to work ... that will need to be done during the night when the temperature is lower ...

So I thought, well, let me see how much work it would be to localize that game into Neapolitan ... then I started, like so often, to talk to Gerard and we were talking also about the localization of software into Neapolitan ... being just that handful of people, OpenOffice is not possible, and for some strange reason Abiword came up and so I had a look at it. I saw the explanation of how to create a "clean" .po file ... stuff that was somewhat too complicated for me to deal with (or better: I did not feel like trying) and so I thought: wouldn't there maybe be an easier way, in order to avoid doing the same over and over again ...??? Well, I saw there were already localized .po files online, one of them en_GB.po, and that was the moment when I saw that maybe the way we had been dealing with .po files, trying to create a filter or whatever for files where the segment with the translation is empty, is the wrong way to go ... .po files have a regular structure. Seeing that English to English file with both segments filled, it became clear that it should not be too difficult to build a filter - but: not really being an OmegaT programmer I am not able to do that, so how to use the present filters? ... Well, first I tried with XML, but that one did not work and then with html ... and that one works ...

How did I create a file with translatable html contents? Well: search and replace ... that's all ... it could be done with a simple macro or a short script in whatever language that can process the .po files, provided that both segments in the .po file contain text.

This is what my modified .po file looks like now ...

at the beginning there is the usual html declaration and the html tag

then a translation unit looks like this:

#. DLG_Para_LabelAt
#: po/tmp/ap_String_Id.h.h:97
msgid "&At:"
msgstr ">&At:< div "

and of course the closing html tag at the end (sorry, I cannot put this in here since blogger gets a hiccup). Also the space in front of "div" must be taken out (blogger hiccup also here).

this means I just created a "div" tag around the text that may not be touched ... and the strangest thing is: it really works ...

So this is a pseudo filter for .po files and now I will be able to localize the software creating a translation memory, and then, for future updates, the work will be much less. Of course, anyone who is interested in getting that .html file, which can easily (with search/substitute) be re-converted to a normal .po, can contact me, but please give me some time to answer ... it is really one of these periods of the year.
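For those who prefer a short script to manual search and replace, here is a minimal sketch in Python of how the round trip could look. Please note the assumptions: since blogger garbled my example above, the exact markup here (a "div" element placed directly around the msgstr text, plus a bare html tag at the beginning and end, without the declaration) is only a guess for illustration, the function names are made up, and it only handles single-line msgstr entries that already contain text.

```python
import re

# Assumption: every msgstr is on one line and is not empty, e.g. msgstr "&At:"
MSGSTR = re.compile(r'^msgstr "(.+)"$', re.MULTILINE)
WRAPPED = re.compile(r'^msgstr "<div>(.+)</div>"$', re.MULTILINE)

HEADER = "<html>\n"
FOOTER = "\n</html>\n"


def po_to_pseudo_html(po_text: str) -> str:
    """Wrap each msgstr text in a div and add the html tags around the file."""
    body = MSGSTR.sub(r'msgstr "<div>\1</div>"', po_text)
    return HEADER + body + FOOTER


def pseudo_html_to_po(html_text: str) -> str:
    """Undo po_to_pseudo_html: drop the html tags and the div wrappers."""
    body = html_text
    if body.startswith(HEADER):
        body = body[len(HEADER):]
    if body.endswith(FOOTER):
        body = body[:-len(FOOTER)]
    return WRAPPED.sub(r'msgstr "\1"', body)


sample = (
    "#. DLG_Para_LabelAt\n"
    "#: po/tmp/ap_String_Id.h.h:97\n"
    'msgid "&At:"\n'
    'msgstr "&At:"\n'
)
converted = po_to_pseudo_html(sample)
restored = pseudo_html_to_po(converted)
```

The sketch does no HTML escaping at all, exactly like plain search and replace, so entries containing characters that are special in html would need a closer look before relying on it.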

Less resourced languages meet ... getting some projects on the way

Berto from the Piedmontese Wikipedia, who also has i-iter.org that deals with less resourced languages stayed here in Maiori for some days and so we had plenty of time to talk and consider many strategies on how to protect less resourced languages and the very specific culture of the various regions in the world. Well there is still much to be worked out, but one thing became clear: we are going to work much closer together than before and we will find a structure on how to make the most out of the efforts of so many people who care about the same goals.

So yes, the first Piedmontese-Neapolitan meet-up produced some first results.

This is just a note to let you know: something is going on ... so stay tuned for more news :-)

Saturday, August 18, 2007

A game in Piemontese with an article in Neapolitan

Approximately a month ago I wrote an article about the game Berto localized into Piemontese, Freecol and something incredible is happening ... I mean why should people who speak and read Neapolitan find an article about a game in Piemontese sooooo interesting? Well it is .... the article up to date has over 1550 reads (you can see that below the article) and gets further approx. 50 reads a day.

We don't have a clue on how often our small Wikipedias are read, but considering these figures: there are many readers for our languages, even if for now not writers.

What it also tells me: indeed it is time to move on to software localization for our languages ... in particular games, browsers, stuff you use often ... I am wondering how the handful of Wikipedia editors that we are will be able to deal with that ... we desperately need more people writing ... it doesn't matter too much that everything is written correctly, it is just relevant that people start.

Berto: thanks for that huge amount of work you did. It enters an incredible market niche that is unfortunately always underestimated.

People find their languages fun - and they do want to read about such stuff - and thanks to a game two cultures meet ... exchange ... will co-operate ... that is something that I find incredibly exciting.

Thursday, August 16, 2007

nap.wikipedia: a first so-called stable version article

... it is not perfect in terms of wikification, the photos still were not uploaded to commons and some other things that need to be done, but it is a proofread article that was created initially for Positanonews, but then we decided to put only two paragraphs of it on the newspaper website (one of which is not on Wikipedia) and add the whole of the information on the Neapolitan Wikipedia. In this way I have a double use and double effect: people reading the article in the newspaper will reach the Neapolitan Wikipedia :-) and eventually there are some new ones that could become contributors.

Musicastoria, the band, is known in Austria and Germany where they already gave concerts and also in Yemen, well of course they are not known as much as here in Campania, but like so often: it is a matter of connections.

They are relevant to Neapolitan, its culture and its language since they go into the small villages and ask old people to sing their old songs, tell their old stories ... all in local language. There are not many of such groups around and I hope I will be able to involve them with some free projects.

Who knows if we will also get our first double licensed GFDL and CC-BY-SA 2.5 Neapolitan song from them ... it would be simply great :-) ... well, I will try to get it ;-)

The article on nap.wikipedia will need some time to become more complete ... but once it is: I want to translate it into English (many Italian Americans are from this region and they will love to know about them) and German. If you wish to help out with other languages, of course you are welcome :-)

Even a stub saying that

Musicastoria is a band from Vietri sul Mare, Italy, playing ethnic music of the region of Campania and Southern Italy in general, collecting material through testimonials by old people.

in any language would be great.

If you want to listen to them, you can do this here (no complete songs, but 1 minute parts of some of them):

Tuesday, August 14, 2007

Pictures on nap.wikipedia

As I already said in the nap.wikipedia beer parlour and on the discussion list: I will now start to delete images without license information. I waited for quite some time before starting because I know they were copied from other wikimedia projects, but this does not help ... either we need license information on them or they need to be deleted. There are too many of them to inform the people who uploaded them about each single picture.

I would also propose to only use Commons. So if you have opinions on this: let me know here, on nap.wikipedia or in the discussion group napulitano@yahoogroups.org.

This is just to add to what was already said, since some people may read the blog but not be really active on nap.wikipedia - so: they can contact me and remedy the situation for the pictures they uploaded.

Thanks, Sabine

Monday, August 13, 2007

Length of articles in small Wikipedias

Again I had a discussion about how big a Wikipedia article in small Wikipedias needs to be ... well, thanks to Magnus Manske on the wiki-de mailing list I found a link that led me to a Brockhaus version of 1911 (thanks for sending that one in just today). These kinds of encyclopaedias are still produced and for many they are enough (or even more than enough). So the length of an entry can be really short. Just look at the Examples.

What does this mean for us: well, such short articles are very much wanted on our small Wikipedias and again I repeat: nobody can say that one or the other Wikipedia is good or bad only because they don't have long articles ... we could and should decide that in projects in less resourced languages where you only have a handful of writers the "one line article", maybe including also a picture, is enough to be called an article and not a stub. Why that? Well: 5 to 10 active people will never be able to create a project with all long articles, but they are well able to create over time an encyclopaedia like the one referred to above.

Some time ago I stopped adding one sentence articles to the nap.wikipedia ... I suppose it is time to go ahead adding them - and: not adding the stub template, since we cannot and do not want to be compared with the big ones (that are compared to 50 and more volume paper encyclopaedias).

That said ... as soon as time allows I'll go ahead to prepare the lists I have here.

Sunday, August 12, 2007

Spelling etc. on nap.wikipedia

Now we have that bunch of editors, but we don't have "fixed rules" on how to write things. This needs discussion, but not only among Wikipedians who write, but among all these people that are interested in the Neapolitan Wikipedia and connected projects. This means we are going to discuss this in the mailing list for the Neapolitan language and I herewith invite all interested people to join us there. You can also send an e-mail to napulitano-subscribe@yahoogroups.com

We must make a point, otherwise we really get into trouble - and we must start to use something like "stable versions in terms of orthography and grammar". I also have the corrected templates here and I hopefully will manage today to create them as well as the first "stable article".

Neapolitan is a particular language, since there is no law that rules how it needs to be written, but there are grammar rules and dictionaries and more than 500 years of literature. Of course some terms will need adaptation to today's language - and I want a wider public to be involved in this, not only the very small active community, but also very much the passive one: that is our readers - and many of them are in the discussion group mentioned above.

I was already thinking about creating a spell checker for Firefox and OpenOffice.org, but my problem is: the day only has 24 hours ... I ask Carmine to proofread whenever he has time, the handful of editors is doing its best and I write some short news articles where I can connect directly to OmegaWiki and add terminology there that can then be used for the spell checker.

At this stage I am searching for someone who would like to start to create one with the very few words that are there. I really can't cope with it myself and people I know are not able to do it. So if there is somebody who would like to help us on that bit: the whole Neapolitan community would be really grateful. Having also a small file in place will lead to more data coming in step by step, since people will use the spell checker and give us their added wordlists. These of course then need to be checked and integrated, but it would at least start ... if we wait until whatever software has the functionality to do it we will need ages to go ahead. And we will need to repeat the same corrections over and over again - that simply does not make sense to me.

This spell checker should be under a CC-BY license so that we can use the list for integration in OmegaWiki and re-use the same work in various ways.

We already talked about course material etc. as well ... yes, indeed, it is time to create it ... I would like to see first of all written exercises that stimulate proper spelling and sentence building. But well ... it will take time ... we are only that very small group of people working on it.

Thanks to all who continuously help on nap.wikipedia with interwiki links, creating basic articles, taking out errors here and there and most of all: thanks to all who regularly dedicate their time to the Wikipedia.

I am now going to post exactly this text to the beer parlour on nap.wikipedia and send it to the discussion list.

Monday, July 30, 2007

Seagulls ....

Well, due to a lack of time, or better time for doing things alone without needing to take the kids with me, I started to go to the beach around six o'clock in the morning ... it's just great: hardly anybody around, just some pigeons and seagulls (btw. the linked picture is just any seagull ... didn't find the kind we have here on the coast), a fishing boat here or there ... but: nobody swimming ... you don't have to care about your direction and you just can go ahead ... and ahead ... and ahead ... well, some here in town probably think I am crazy, because even if we are in the middle of the summer and the water is really nicely warm, when you get out there is still the wind coming from the mountains and that is quite fresh ... that is, people here would say: be careful, you'll catch a cold ... well, up to now I did not and I will not :-)

But to the seagulls ... this morning I had a really funny situation ... I was in the water swimming and a seagull came flying over the water, just above me, looking down at me ... uhm I thought ... well, it went away in a wide circle, but a second one came along and did the same ... it looked down from the sky at what I was doing there ... then the first one came back - the same again ... it went on like this for some minutes ... the way they looked at me reminded me very much of my parrot, when he looked at something all too interesting, possibly food ... so did these two consider me to be seagull food ;-) well ... looking at myself I'd say: uhmmm ... they'd need some grappa afterwards :-P

Let's see if they will be there tomorrow morning ...

Saturday, July 21, 2007

Quotation ...

True friendship is a plant of slow growth, and must undergo and withstand the shocks of adversity before it is entitled to the appellation. (George Washington)

Friday, July 20, 2007

Articles with stable version for nap.wikipedia

Well, again this theme is on my mind ... stable versions ... on the Neapolitan wikipedia we have a very particular situation: only very few people are really able to write well and the others write "as they speak" often adopting the spelling of the Italian writing rules to Neapolitan or if they are Neapolitans who grew up in the states and are eventually of second or third generation you could even find some very particular words coined in the States (and well yes, that is still Neapolitan, just a different dialect of it), which of course does not work well. Even being many of them native speakers we have problems when it comes to written versions. Yes, there are rules, but Neapolitan is not taught at school nor you can easily find courses around where you can learn the language. Also when I write I always have to ask for proof reading by Carmine, well for me it is not even my mothertongue ... some people are frightened to write since they know they are not able to write correctly, but eventually they would write and start to contribute if they knew that the issue of having spelling errors in the end is not soooo big - this is something that can be sorted out.

What would be really helpful is a "stable version" function so that people can find proofread examples they can rely on. This will help them to contribute on nap.wikipedia. For us it is not so much about having long pages with loads of content - for us it is still about "getting people to write" in their native language.

Templates, the problem with the double quote and everything that is too wiki specific is still too much for many ... I am also considering not adding templates for cities etc. to new pages for the moment - it is confusing to people. I notice that more and more when I talk with people who could become good authors on nap.wikipedia. Most of those who write Neapolitan properly are of my generation or older, and only a few of them can really dig into wiki syntax etc.

This means: we have to give them a really low entry level ... and we have to assure people who edit: there is a stable version that you cannot do any harm to, so you can write, even if not 100% correctly ... and you can then learn from the stable versions and from the corrections made to your writing.
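Just to make the idea a bit more concrete, here is a minimal sketch in Python of what I mean by a "stable version". Everything in it (the `Page` class, `edit`, `mark_stable`, `stable_text`) is my own invention for illustration; it is not how MediaWiki or any existing extension actually does it.

```python
from typing import List, Optional


class Page:
    """Hypothetical wiki page that keeps every revision plus a stable marker."""

    def __init__(self, title: str) -> None:
        self.title = title
        self.revisions: List[str] = []            # every edit ever made, oldest first
        self.stable_index: Optional[int] = None   # revision a proofreader approved

    def edit(self, text: str) -> int:
        """Anyone can add a new revision; the stable version is left untouched."""
        self.revisions.append(text)
        return len(self.revisions) - 1

    def mark_stable(self, index: int) -> None:
        """A proofreader flags one existing revision as the reliable one."""
        if not 0 <= index < len(self.revisions):
            raise IndexError("no such revision")
        self.stable_index = index

    def stable_text(self) -> Optional[str]:
        """What a reader looking for proofread examples would see."""
        if self.stable_index is None:
            return None
        return self.revisions[self.stable_index]

    def latest_text(self) -> Optional[str]:
        """The newest revision, proofread or not."""
        return self.revisions[-1] if self.revisions else None


page = Page("Example")
first = page.edit("proofread text")
page.mark_stable(first)
page.edit("new text with spelling errors")
```

After the second edit `page.stable_text()` still gives back the proofread text, while `page.latest_text()` shows the new, not yet corrected one. That is the whole point: a newcomer can write without doing any harm.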


It is indeed a long process to get through all these problems ... and for now I don't see an end to it.

Tuesday, June 26, 2007

It is too darn hot

Italy is hot, too darn hot. It is too hot to move.. It is too hot to cook. You drink four, five liters and .. you just sweat. When you go to the beach at 08:30 it is already too hot. So you swim, you get out, and again, it is too darn hot.

Even the computer agrees, it is too darn hot. Skype only works for a few minutes, then it overheats, it is too darn hot. So I put my brush under the computer, so there is now more room and the heat can get out.. It is too darn hot.. To keep Skype going I have to wave air to cool the darn thing.. it is too darn hot..

It is too hot to do anything.. so I am happy that someone wrote this for me.. it is too darn hot

Love and sunshine from the Amalfi coast :-)

Sunday, June 24, 2007

Elections and endorsements

Now that the endorsement period is over I can write this ...

Well, three board members are going to be elected ... then you have the possibility to endorse the three you would like to see and you go through the various presentations. I found some very young people who have really great ideas and in some way they would deserve their place as well ... but then you have to make a choice ... that is, you need to look at people, what they are able to do, how much background information they have and how much they deal with issues that are relevant to you.

Well, what is relevant to me is the support of the lesser resourced languages ... the understanding that sometimes just talk is not enough (still thinking about the double quote issue with nap.wikipedia). The fact that they can behave in such a way that they get results. And one very important thing to me is that they are honest with themselves and state privately the same thing they state publicly (and it makes no difference whether I have the same opinion or not - it is just about stating what you really believe).

So in the end my endorsements went to Erik (Eloquence), Kim and Yann. I believe all three of them are well able to deal with the "big issues" that are relevant for the huge majority, and at the same time they take the time to look at our issues, which are very "small" compared to the others but are really huge mountains for our small communities.

I also understand that by offering their time to the board they will not have much free time left ... so thank you for considering donating so much of your time to the community.

Saturday, June 16, 2007

OmegaWiki a translation dictionary? By all means: no

From a bar entry on it.wiktionary (yes, I was pointed to the bar on it.wikt) I understand that most people see OmegaWiki as just a translation dictionary. But: if you feel like that, then you did not look at it properly.

At this stage entries are not really complete, but they already have a huge part of the complete functionality that is to come. This means we have:

- lemma
- definition (lemma + definition together form a Defined Meaning)

In the annotations you find all additional info like

String properties:
- example sentence
- hyphenation

Text properties:
well: this needs still some work

URL properties:
also here: still some work needed ... this is to link to other projects

Option properties:
- part of speech
  - verb
  - noun
  - adverb
  - describing word
  - contraction
  - article
  - pronoun
  - preposition
  - conjunction
  - interjection

Then there are
- Relations – this means to which other words the actual one is related
- Incoming relations – this is which other words relate to the actual one
- Class membership – for now with only "lexical item" in there
- Collection membership – for example: we have, just to name some, the General Multilingual Environmental Thesaurus, Destinazione Italia, OLPC Children's Dictionary, ISO 639-3 languages as collections. A list of present collections and their completion status can be found here.

Under Relations you will find the typical "narrower and wider terms" like you find them in a thesaurus for example.

What is still missing is Etymology – that one is planned, but not there right now – once it is there you will find it within the annotations.

So these are the basics of what we have now – and yes, many entries just have lemma + definition + translations, but that is because for now we don't have all too many fans who care about the annotations :-)
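For those who prefer to see such a structure as code, here is a rough sketch in Python of how one entry could be modelled. The class and field names are my own assumptions made up for this post; this is not OmegaWiki's actual database schema, and the example definition is just a placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DefinedMeaning:
    """Hypothetical model: lemma + definition together form a Defined Meaning."""

    lemma: str
    definition: str
    language: str
    # Translations keyed by language name (illustrative only)
    translations: Dict[str, str] = field(default_factory=dict)
    # Annotations: string properties (example sentence, hyphenation)
    string_properties: Dict[str, str] = field(default_factory=dict)
    # Annotations: option properties (e.g. part of speech)
    option_properties: Dict[str, str] = field(default_factory=dict)
    # Relations, e.g. "broader term" -> list of related lemmas
    relations: Dict[str, List[str]] = field(default_factory=dict)
    class_membership: List[str] = field(default_factory=list)
    collection_membership: List[str] = field(default_factory=list)

    def is_translation_only(self) -> bool:
        """True when the entry has no annotations, relations or collections yet."""
        return not (
            self.string_properties
            or self.option_properties
            or self.relations
            or self.collection_membership
        )


monday = DefinedMeaning(
    lemma="lunedì",
    definition="The day of the week that follows Sunday.",
    language="Italian",
    translations={"English": "Monday"},
)
monday.option_properties["part of speech"] = "noun"
monday.class_membership.append("lexical item")
```

An entry with only lemma, definition and translations would answer `is_translation_only()` with True; as soon as somebody cares about the annotations, as with the part of speech above, it is more than a translation dictionary entry.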

But OmegaWiki is not only that ... it is already being used to create study material for students and to provide easy to read news articles for many. What does this mean? Well, the easiest thing is to have a look at two examples:
- http://www.positanonews.it/menu/default.asp?id=6090 for an example in Italian
- http://www.positanonews.it/menu/default.asp?id=6167 for an example in Neapolitan

You will find additional info on this theme and why we work with Positanonews in an earlier blog post of mine: Combining projects just for fun ... and unexpected results. And a list of articles that will grow by one a week is maintained on OmegaWiki.

During the Holland Open Gerard showed the articles to some people, and other universities also started talking about tagging their materials in such a way for their respective students.

This is just one of the many additional uses of the OmegaWiki data – maybe we will explain the others in a separate post. It would be somewhat of an overkill to put it all in one article.

There is also another part of the post in the bar on it.wikt that needs an answer ... well, people assumed that since I work mainly on OmegaWiki I don't care about Wiktionary anymore ... that is more of a personal question and not a factual one – so it will be answered separately.

Resignation as Bureaucrat and Admin from it.wiktionary ...

... or: when things get crazy and you have to draw a line

I still remember my first steps on it.wiktionary ... 16th of June 2004, yes, exactly three years ago – the first term I edited was lunedì (Monday). Almost the only user seen online in those days was Paulo. He wanted to pull up the project, but shortly after he saw that I took things seriously he went away.

On 30th August 2004 the idea of a universal Wiktionary was born – thanks to Gerard Meijssen, who then created articles of languages with loads of templates in there – well ... the whole story would be too much now, but thanks to some misunderstandings and a very long Skype conversation that followed, today we have OmegaWiki.

The Italian Wiktionary was always subject to loads of spam. So at a certain stage, I don't remember exactly when, I got adminship ... well, the only one working there then was me, so I was the only one who could have done some clean-up.

Well, the spam load got bigger and bigger and it became almost impossible for me to do good work there ... so I started to ask for help ... some registered, became admins, went away. Again I went ahead alone. If it was not for the pywikipediabot I would not have been able to add so many entries to it.wiktionary, since the main work on it was always to keep it clean from spam.

Then last year I became Bureaucrat ... well, I was not really present anymore, but I cared about a very particular user group that had started to edit it.wiktionary before that – most of them teens, great and wonderful young people dedicating their free time to free projects. I am not naming them one by one here, because I could forget someone and I don't want to do that.

Now many of you will ask yourselves why I did not immediately leave Wiktionary when I started working on OmegaWiki (formerly called Ultimate Wiktionary and then WiktionaryZ). Well ... it is a bit like with a baby ... you feed it, see it grow and then it goes, but in some way there is still part of your life connected to it.

But back to our very particular user group: they are mostly teens, some of them very young, but dedicated ... and they did good. Of course not always, but it's a wiki, right? So you do what you can when you can. I admire them – they did great work. They gave part of their life to the project. A good group of them became admins – they came at a time when others, who today claim to be the only ones able to give rules to the project, were present but did not care at all when I said: hey guys, I can't deal with all that anymore, I need help. No, I will never again allow people to say that the youth of this country is no good – they are great! You just need to trust them to do good and they will do it. They put their heart in it.

Now today I was called through my discussion page on OmegaWiki since someone asked for the desysop of one of the admins. He was accused of not being active, of not cleaning up spam and copyright violations ... ok, so I went to the chat (yes, I have the log, but anyway it is not relevant for me anymore ... so it doesn't make a difference whether I publish it). So I asked about several things, most of them not really so tragic, but a desysop of one of the most valuable people at the time he joined ... well, that was something to really talk about.

Now what I learnt is that some wikipedians (well, yes, the one asking for the desysop is quite new to Wiktionary) assume that when you become an admin you take on the responsibility to regularly clean up the wiki ... well, I would say: this is perceived completely wrong. If the admins were paid and had to sign a contract, well, then this would be a different thing, but they are not – they DONATE their free time. Ok, then I got the following as the reason for the "desysop": ... ma di aver lasciato wikizionario senza policy, senza controllo copyviol, senza importare decine di lemmi da altri progetti... (but to have left Wiktionary without policy, without checking copyviol, without importing tens of lemmas from other projects) .... UHMMMMMM ....

Is there a contract where it is written down that whoever is elected Admin (well, ok, Sysop) or Bureaucrat has to do this and otherwise gets desysopped? That is plain stupid – against any sense of collaborative projects, against any sense of humanity. Another nice sentence I got in these answers was (fairly at the beginning): "invasa di bambini che giocavano, può bastare come spiegazione?" (invaded by kids who were playing, isn't this enough of an explanation?)

I can't believe it ... well, maybe these teens were not perfect, but they did their best, nobody is perfect. I would very much like to remind some people of what the Wikimedia Foundation aims to do: provide free knowledge to all people in the world in their language.

Now does this stop with the contents, or are we here as well to transmit what Free Content, Free Software, Wikis and Collaborative Projects are to people, young and old, who are new to them? This too is a kind of education, of giving knowledge – this is social knowledge. Well, it seems that some do not like to invest in their future ... because instead of starting a desysop procedure this person could have talked to them and explained things to them, and if the explanation had been logical enough they probably would have agreed and co-operated. But now? How can people feel good working in such an environment? Does this really reflect the community spirit? I would clearly say: no.

I am not willing to co-operate on it.wiktionary in future since obviously under these circumstances I cannot be of any help. Therefore I resign as Bureaucrat and Sysop on the Italian Wiktionary. It is sad and part of my life is in that project ... I don't want to see it die ... it would be too much. So: I am not going to go back there ... well, with two exceptions, one of which is: just my user page will contain a link to this article and to the place where people can find me if they want to talk to me – that's all.

One word to the new Admins and Bureaucrats: according to what I learnt from the chat today it seems you signed some kind of contract by being elected Admin or Bureaucrat. I will have a look in one year's time and hopefully you will all be there happily doing what you are expected to do ... well: it is a responsibility you took on according to your new policies ... so the project is waiting for you – people are expecting you to do your job now.

There would be so much more to say .... please, if you work on other projects: don't follow this example – it destroys the community. We need to build community, bring over the spirit of free projects, of all free projects. The teens of today will be among the best editors of tomorrow. Never consider a kid, even a 5 year old, to be too young to do something – they are all well able, if they have an interest and they want to do it. Trust them to do good. They need it – they are our very own future.

And to the editors who worked on it.wiktionary during my short period ... well during the last three years: thank you so much for your efforts and help – and a very special thanks to our young ones – I find it really special that you dedicate your free time to free projects ... like I already said: you are our future and the day will come, if it is not already here, that we will learn from you.

Thanks for your attention. And yes, of course: you will always find me on the Neapolitan Wikipedia but mainly working on the connection of various free projects in order to maximise results with the donations in time people give us. Yes, we must value each minute highly.

Thank you!!!

Thursday, June 07, 2007

Combining projects just for fun ... and unexpected results

The story ;-)

Well, some time ago, when we started to work on the Destinazione Italia project with OmegaWiki, I understood that professors at the University of Bamberg need free news, but they need it tagged in such a way that the students can easily look up words with definitions and translations.

Then some weeks ... well, a couple of months ago ... I wrote my first article on the Italian Wikinews, about the tuna cages that are planned in Cetara, near Maiori, which led to a contact with Maria Rosaria Sannino from eCostiera since she received the article through a Google alert.

Well yes, we then made an appointment at the Bar Pineta with some other people from Agorà and Positanonews. That is where I met the director of Positanonews and understood that he is a very likeminded person, thinking very open and Open Content without really referring to it. So a natural Open Source person here on the coast ... that is indeed rare.

I started to take the headlines of Positanonews for the Costiera Amalfitana Wikia and I was offered to simply use the contents I prefer ... well, no, I don't do that, I prefer to link to it and not double contents ... well yes, during the next days I will need to work on the Wikia ... I did not have much time these days and still don't have much ...

Communication with Michele Cinque and his team developed really well and we reached two very particular results with this - on the one hand a news section in Neapolitan language and on the other facilitated reading articles (the list is maintained on OmegaWiki ... another one will be added by today).

The facilitated reading articles resulted from two things:

  • the need to link Neapolitan words to a dictionary in order to help people to understand the words and
  • the need of the University of Bamberg, that is news with links to the terminology so that students can read them more easily and have this terminology in such a fashion that it can be easily exported from a database
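To give an idea of what such tagging amounts to, here is a small sketch in Python. It is only an illustration under my own assumptions: the function name `tag_words`, the glossary and the example.org addresses are made up for this post, and the real articles are not necessarily produced this way.

```python
import html
import re
from typing import Dict


def tag_words(text: str, glossary: Dict[str, str]) -> str:
    """Wrap every glossary word found in text in a link to its dictionary entry.

    glossary maps a word to the (hypothetical) URL of its dictionary entry.
    """
    if not glossary:
        return html.escape(text)
    lookup = {word.lower(): url for word, url in glossary.items()}
    # Longest words first, so a longer entry wins over a shorter one it contains
    words = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    # With one capturing group, split() alternates plain text and matched words
    parts = pattern.split(text)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            url = html.escape(lookup[part.lower()], quote=True)
            out.append('<a href="%s">%s</a>' % (url, html.escape(part)))
        else:
            out.append(html.escape(part))
    return "".join(out)


glossary = {
    "gabbie": "https://example.org/dictionary/gabbia",  # placeholder address
    "tonni": "https://example.org/dictionary/tonno",    # placeholder address
}
tagged = tag_words("Le gabbie per i tonni a Cetara", glossary)
```

Because the glossary is kept apart from the text, the same word list can also be exported as terminology for the students, which is the second need mentioned above.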
Most of you know me quite well ... sometimes I just do things to show what is possible and then all sorts of uncalculated, that is unexpected, results derive from what I do.

So I created my first "Easy to read" article to show it to Martin ... now he really likes the idea and we are going ahead with this, adding one article a week. This means three very different groups of people are now connected to produce open contents for the community (in alphabetical order):
  • OmegaWiki
  • Positanonews
  • University of Bamberg
Yesterday I took a first article from the Italian Wikinews (if you are from Wikinews, please have a look whether the attribution is ok this way), which still needs to be tagged, in order to have some free content on national and international news in Italian under a free license, since of course Positanonews is centered mainly on regional news and all that comes from outside cannot be released under a free license unless the author agrees to it. So we can also add Wikinews to the list above, and we really have a good mix of projects from various sectors that all have the same spirit: do as much work in the Open Source world as possible and give people access to information and education.

Now you are possibly asking yourselves why I chose this combination: well, it was not chosen, it happened, just because people are open minded, have the spirit to pay it forward and do not only think about their own needs and wishes.

So thanks to all contributors from (in alphabetical order):
  • OmegaWiki
  • Positanonews
  • University of Bamberg
  • Wikinews

There will be other unexpected results ... they are already on their way ... and when the time is right I will tell you about them.

Tuesday, May 29, 2007

Localisation PhD Scholarship at the University of Limerick

Streamlining Quality Assurance in Digital Content Localisation

Symantec Ireland and the Localisation Research Centre (LRC) at the Department of Computer Science and Information Systems (CSIS), University of Limerick, have agreed to offer a 3-year funded position for a suitable candidate to work on a collaborative research project with Symantec and the LRC, leading to a PhD degree.

Candidates should forward their application (cover letter, CV) to the LRC, CSIS, University of Limerick, Limerick, Ireland. Note that the closing date for the receipt of applications has been extended to 22 June 2007.

More details on http://www.localisation.ie/resources/Research/symantecphd.htm


just copying and pasting .....

Monday, May 28, 2007

News ... in Neapolitan language ... no, not a wiki this time

It was just about two weeks ago or so that a friend from the United States called me. His family is from Campania and therefore he is very much interested in the Neapolitan language. That is: for him Neapolitan is the true mother tongue. We already have a Wikipedia in Neapolitan as well as a few websites. With OmegaWiki we will now be able to build dictionaries in various language combinations and spellcheckers and some other nice surprises :-)

So talking with him, the idea of making news available in Neapolitan became stronger - we had already thought about it before, but the final step was missing. Said and done: we talked with Michele Cinque and Antonio D'Urso from positanonews about it - and they found it a great idea. One prerequisite of mine was: either we need somebody who really writes well or we need help to proofread my writings and/or translations. And yes: Carmine Colacino again is helping out here - thank you! This means: my texts are going to be proofread by him :-)

So that is how it came to be that we now have a news section in Neapolitan language on positanonews, the first registered online newspaper in Campania. Besides it being the first news section of its kind in an online newspaper, I also thought that it would make sense to have texts for facilitated reading for all those who would like to look up a word. The technique is the same one I use here on the blog: a snap.com account, so that you can have a preview of the page the link leads to. Well: some more links will be inserted over time and I will concentrate on the more difficult words then, because the normal ones are probably quite easily understood.

To give some more "food" to readers I will try to link to relevant wikipedia articles whenever these are present.

Of course, like always there will be critics :-) and yes, I very much like criticism as long as it is constructive and people try to help to improve things. We very much encourage others to write there as well. It may be anything from a family's history, background information and current news to the presentation of cultural events and music.

This is definitely a way to give space to all to write about life, news, culture and of course also encyclopaedic articles on Wikipedia. If you feel like writing: please just do. We know that it is not easy to write in Neapolitan but it is also true that if you don't try, you will not learn how to do it.

Thank you for your time and we hope to see you soon in the news section for Neapolitan news on positanonews.it as well as on the Neapolitan Wikipedia.

A big thank you to the director of positanonews, Michele Cinque, for giving us this unique opportunity and to Antonio D'Urso for making it possible from the technical side. Again: thanks to Carmine Colacino for taking the time to copy edit my texts and translations.

Sunday, May 27, 2007

An Encyclopaedia and Neapolitan music

Yesterday evening I was at the hotel Domina Royal in Positano for the presentation of an Encyclopaedia. Well it was an exceptional evening, not only because of the contents of the programme.

It was the presentation of the "Nuova Enciclopedia Illustrata della Musica Napoletana" (New Illustrated Encyclopaedia of Neapolitan Music) by Pietro Gargano. It will be great to finally have something to verify the contents about songs we have on the Neapolitan Wikipedia and to add references to this unique work, which is not finished yet. But even just having a look at the first volume, of which of course I simply had to get one :-), we will be able to check a lot of the data we already have, and it will help to make at least some of the data available in Neapolitan.

Such a huge corpus of text would be difficult to translate manually - it would take really loads of time and it would maybe be too expensive for quite a limited market. My thoughts here go very much in the direction of Apertium and the creation of a dictionary that allows for machine translation from Italian to Neapolitan and maybe also to English. This would help here since the terminology used is quite restricted and the whole of the corpus is the work of one and the same author. So it would be easier to get high quality results and then just proofread the text. Going that way, such particular works can be made available quite easily in other languages and can be offered through print on demand services.

Well: for now it is a dream ... that hopefully one day can come true.

I am still thinking about how some of the songs were presented ... it would be great to show you the whole presentation online. Who knows if the person who recorded it with the video camera will help to do this.

And yes, I met some very interesting people - and I had a really great evening. I would like to thank Michele Cinque from Positanonews (news in Italian and English) who invited me to this presentation.