Monday, January 29, 2007

Monopolising the CAT-Market

When I read things like this, I realise that we are marketing OmegaT the wrong way: http://www.econtentmag.com/Articles/ArticleReader.aspx?ArticleID=19044

We must go out to companies and explain translation to them: how it is done, what a CAT tool is, and why it does not make sense to depend on a "closed technology".

Latin is dying out

Another interesting link, this one about Latin dying out ... if only I had time ;-) : http://news.bbc.co.uk/2/hi/europe/6308281.stm

PDF becomes a new ISO standard

Well, just to let you know about this ... I don't really have time to write. If you are interested in PDF, just click here and read more.

Sunday, January 28, 2007

Erin McKean about Dictionaries

Well ... I now understand a bit more about the New Oxford American Dictionary ... and if you are interested in dictionaries you should have a look at the presentation by Erin McKean. Have fun :-)

Sunday, January 21, 2007

Content creation for Wikipedias

The following text was originally written as an e-mail to the afrowikinews group. I believe it will interest more people, so I am posting it here as well.

*****
Well, if content creation for Wikipedias is a topic, let's take it up
and consider how it could be done.

Of course it depends very much on which Wikipedia we are talking about,
and on whether we are talking about "how to build a Wikipedia" with the
help of the community only or about "how to use funding". Or maybe a
combination of the two.

Now let's talk about the case in which funding is available.

When it comes to funding, you first have to consider whether the
sponsor would like to see certain information there - that could be
anything: from descriptions of cities and mountains to biology to
whatever. The most requested pages are normally background information
to the news - this means: people search for articles related to what
they read about in the newspaper, hear about in the news etc.

The best way would be to have people edit in the respective language.
Often you don't find enough people who are able to edit, so part of the
money should then be used to teach people how to create content. This
will take some preparation time, and a workshop may need to be
organised. People will probably need to travel for it, so this must be
arranged well in advance to allow them to plan properly. This really is
the ideal case.

When you don't get enough people who can edit, the second choice is
translation. In such a case co-operation is needed: you find people,
possibly professional translators or students in their last year of
translation studies, and you have them translate chosen texts. (The
students could even use the project as an internship, which implies
that you have enough connections to the relevant universities - well:
we are already building on these various networks.) Here too the ideal
would be to teach these translators some basics first - this is
difficult to achieve with professionals, but quite easy to achieve with
students in their last year - you can try to get the whole course
involved by simply agreeing on a "Wikipedia translation workshop"
within their study programme.

Why translators and students will do a good job: they can use their
translations as a reference - remember: they are GFDL - this means that
when they apply for a job they can give the link to their translations
- this will help them get started and ensures that they will do their
best for the project.

How to choose who does what - this applies to both writing and
translation - here again you have several possibilities depending on
the project and the Wikipedia.

1) People can choose what to work on based on criteria they are given -
that is, you trust them to do well.
2) People are assigned to specific themes and articles.
3) You choose the articles to be translated and assign them.
4) The articles are chosen by the organisation that is providing the
funding.
5) A mixture of these ways.

The first case is the most open one: it carries the highest risk, but
it can also get you the best results.
As for the case in which you decide who does what: if possible, it
makes sense to talk to the various people so that you get a feeling for
them - this helps to choose the right people for the right themes.

One thing that is being done with Cherokee is machine translation, and
for some Wikipedias even that could make sense in order to have
material to work on. There are other examples where machine translation
works very well and people can concentrate on fixing the 5% error rate.
Apertium is software that translates from one language to another
between similar languages - I once ran a trial with a text about a
city, translating it from Spanish to Catalan - the people who had a
look at that translation said that it was fairly good and that
correcting it was much faster than having to translate from scratch. So
that could be an option as well. Of course there are proprietary
software solutions that do this really well, but for now they are not
easily accessible.

But this last option is really only to be considered when there are not
enough native speakers available to reach a tipping point where other
people also start to contribute. Normally it is preferable to work with
people who sooner or later will always go back to their work to see
what happened to it ... and see: even if translators do not know how to
edit a wiki ... once they see their articles there and find an error, a
strange sentence, a typo or whatever, they will correct it ... the
reason is simple: they have a relationship with this text, they feel it
is theirs.

As you can see, there are many possibilities, and there are even more
than these - it is impossible to describe them all - how a project is
run and works out in the end always depends on a number of
circumstances.

When it comes to Neapolitan: we would be happy if we could get the
funding to create the bilingual wordlist needed for Apertium, so that
the handful of us who can edit articles could concentrate only on that
bit. Building the list bit by bit simply takes a lot of time, and it
would be very important for one person to be able to concentrate on
this alone for some time. Well, maybe one day :-)

While I was writing this e-mail, a mail from Delirium reached the
foundation list, in which he states the following:

-----
[... omitted ...]

This is also a problem for sources, because on some topics, even
very important ones, there is often only information available in one
language---for example many China-related topics have information only
available in Chinese. On en: this is not so much of a problem, because
we have many Chinese-speaking editors. But el: has few to no
Chinese-speaking editors, so cannot use such sources, except indirectly
by translating the en: article.

Of course one does the best with what one has, so the smaller Wikipedias
simply need more people to begin with, and then will probably have to
rely more heavily on translating articles from the larger Wikipedias.
-----

And really: the only thing I can do at this stage is thank Delirium - he
could not have expressed the problems of some wikipedias better.

Thanks for taking the time - as always, your ideas and thoughts are
welcome and will certainly help to improve our work.