
Linux.com

Feature

Open source translation tools

By Mayank Sharma on March 23, 2005 (8:00:00 AM)


Do you speak open source, but English is not your native language? You can turn to a growing list of user-friendly tools for translators, including specialized graphical tools like poEdit, KBabel, and Gtranslator.

Most of the popular open source applications use the GNU gettext framework, which specifies that applications store the strings of words they use (for constructing the interface, error messages, documentation, etc.) in portable object (PO) files. When a PO file does not have any translations, it is called a POT (PO template) file.
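To make the PO format concrete, here is a minimal sketch of what entries look like and how they might be read. The sample content, the file paths in the comments, and the French strings are hypothetical. The parser is deliberately simplified: it handles only single-line msgid/msgstr pairs and ignores the header entry, multi-line strings, and plural forms that real PO files contain.

```python
import re

# Hypothetical sample PO content; paths and French strings are illustrative only.
SAMPLE_PO = '''\
# Translator comment
#: src/main.c:42
msgid "Open File"
msgstr "Ouvrir le fichier"

#: src/main.c:57
msgid "Save"
msgstr ""
'''

# Matches a single-line msgid immediately followed by a single-line msgstr.
ENTRY_RE = re.compile(r'^msgid "(.*)"\nmsgstr "(.*)"$', re.MULTILINE)

def parse_simple_po(text):
    """Return a list of (msgid, msgstr) pairs from single-line entries only."""
    return ENTRY_RE.findall(text)

entries = parse_simple_po(SAMPLE_PO)
for msgid, msgstr in entries:
    state = "translated" if msgstr else "untranslated"
    print(f"{msgid!r}: {state}")
```

An entry whose msgstr is empty is still waiting for a translation; a file in which every msgstr is empty corresponds to the POT template described above.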

To convert the strings stored in PO files from English into a local language you need a font and a keyboard layout for your chosen language. This article shows how to install these in Fedora Core 2. You also need a Unicode-enabled browser (such as Konqueror, Mozilla, or Firefox), and of course the translation applications.

Once you have all your tools in place, see the KDE documentation translation page for more information on how to translate documentation using PO files.

Translating an application is a little more complex than simply translating its documentation. GNOME, OpenOffice.org, KDE, and Mozilla/Firefox have their own separate procedures for translation. OpenOffice.org requires its GSI (GutSchmidt's Interface) files to be translated, while Mozilla and Firefox have their translatable messages in .dtd and .properties files. One way to translate these applications is to convert everything into PO files and then use one of the PO editing tools.

SourceForge's Translation Project has various tools that can assist you as a translator. The oo2po and po2oo tools convert files from the GSI format to PO format and, after translation, back to GSI. The moz2po and po2moz tools perform similar functions with the Mozilla English Language Packs.

However, "we've encountered problems using oo2po and po2oo," says Kartik Mistry, developer/translator for the Utkarsh Gujarati Project. "So we directly edited the GSI file -- an intensive task for the team, as the GSI file's format is very sensitive; even one wrong space makes your compilation fail or gives you German strings. moz2po works fine, but po2moz has problems and generates extra strings. But the good thing is that the Translation Project team is working hard and with every new release the tools are improving."

Where do you ask translation questions?
While there is plenty of localized software and content available, what seems to be missing is a good multilingual forum board like LinuxQuestions.

The best and quickest way to translate PO files is to use KBabel. KBabel can do a rough translation using a glossary file, which you can generate using KBabeldict. This accelerates the process of translation by reducing the number of strings that need to be translated. Although many localization projects don't have them, KBabel can use dictionaries and spell-check your translations.
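The idea of a glossary-driven rough translation can be sketched as follows. This is not KBabel's actual algorithm, only an illustration under stated assumptions: the glossary contents, the function name rough_translate, and the 0.8 similarity cutoff are all hypothetical. Exact matches are filled in directly, near matches are offered as suggestions for a human to review, and everything else is left for manual translation.

```python
import difflib

# Hypothetical glossary mapping English strings to French; illustrative only.
GLOSSARY = {
    "Open File": "Ouvrir le fichier",
    "Save File": "Enregistrer le fichier",
    "Quit": "Quitter",
}

def rough_translate(msgids, glossary, cutoff=0.8):
    """Split msgids into exact hits, near-match suggestions, and leftovers."""
    exact, suggestions, untranslated = {}, {}, []
    known = list(glossary)
    for msgid in msgids:
        if msgid in glossary:
            exact[msgid] = glossary[msgid]
            continue
        # Look for the single most similar glossary key above the cutoff.
        close = difflib.get_close_matches(msgid, known, n=1, cutoff=cutoff)
        if close:
            suggestions[msgid] = (close[0], glossary[close[0]])
        else:
            untranslated.append(msgid)
    return exact, suggestions, untranslated

exact, suggestions, untranslated = rough_translate(
    ["Quit", "Open Files", "Preferences"], GLOSSARY
)
print(exact, suggestions, untranslated)
```

Suggestions are kept separate from exact matches on purpose: a near match still needs a translator's eye before it goes into the PO file.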

A useful tool bundled with KBabel is the CatalogManager. By comparing the POT files with the translated PO files, it gives you a quick status overview of how many files have been translated and to what extent.
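The kind of comparison such a status overview relies on can be approximated in a few lines. This is a rough sketch, not how CatalogManager is implemented: it assumes single-line msgid/msgstr entries, ignores fuzzy flags and plural forms, and uses hypothetical sample data and a hypothetical function name, translation_status.

```python
import re

# Matches a single-line msgid immediately followed by a single-line msgstr.
ENTRY_RE = re.compile(r'^msgid "(.*)"\nmsgstr "(.*)"$', re.MULTILINE)

def translation_status(pot_text, po_text):
    """Return (translated, total, percent) for a PO file against its POT."""
    # Every non-empty msgid in the template is a string that needs translating.
    template_ids = [msgid for msgid, _ in ENTRY_RE.findall(pot_text) if msgid]
    # A msgid counts as translated when its msgstr in the PO file is non-empty.
    translated = {msgid for msgid, msgstr in ENTRY_RE.findall(po_text) if msgstr}
    done = sum(1 for msgid in template_ids if msgid in translated)
    total = len(template_ids)
    percent = 100.0 * done / total if total else 0.0
    return done, total, percent

# Hypothetical template and partially translated catalog.
POT = 'msgid "Open"\nmsgstr ""\n\nmsgid "Save"\nmsgstr ""\n\nmsgid "Quit"\nmsgstr ""\n'
PO = 'msgid "Open"\nmsgstr "Ouvrir"\n\nmsgid "Save"\nmsgstr "Enregistrer"\n\nmsgid "Quit"\nmsgstr ""\n'

done, total, percent = translation_status(POT, PO)
print(f"{done} of {total} strings translated ({percent:.0f}%)")
```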

The GNOME PO Translator Guide gives you a good idea of how projects like GNOME deal with PO files. If you are planning to pass around your PO files, make sure they are Unicode-encoded. With computers becoming popular in non-English-speaking areas, Unicode has been accepted as the standard for encoding non-English (as well as English) scripts. There are various Unicode Transformation Formats (UTF) in which you can store Unicode data.
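As a small illustration of how the same Unicode text occupies different numbers of bytes in different UTFs, the sketch below encodes a six-character Gujarati word three ways. The choice of word and the variable names are only for illustration; the byte counts follow from each of these code points lying in the Basic Multilingual Plane.

```python
# Gujarati "namaste", written with escapes so the code points are unambiguous:
# NA, MA, SA, VIRAMA, TA, VOWEL SIGN E.
text = "\u0aa8\u0aae\u0ab8\u0acd\u0aa4\u0ac7"

utf8 = text.encode("utf-8")        # 3 bytes per code point in this range
utf16 = text.encode("utf-16-le")   # 2 bytes per code point, no byte-order mark
utf32 = text.encode("utf-32-le")   # 4 bytes per code point, no byte-order mark

print(len(text), len(utf8), len(utf16), len(utf32))

# Decoding with the matching codec recovers the original string.
assert utf8.decode("utf-8") == text
```

PO files declare their encoding in the header entry with a line such as "Content-Type: text/plain; charset=UTF-8", so whichever UTF you pick, the declared charset must match the bytes actually stored in the file.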

Conclusion

As the online collaborative model spreads into the field of localization, new Web-based translation tools like Pootle and Rosetta make a translator's life a lot easier by removing the setup that "offline" tools like KBabel require.

But once you have the right setup (fonts, keyboard) and the inclination to localize, there are plenty of things you can do apart from translating.

Mayank Sharma is a freelance technology writer and FLOSS migration consultant in New Delhi, India.


Comments

on Open source translation tools


not only KBabel...

Posted by: Anonymous Coward on March 23, 2005 08:28 PM
Poedit can automatically translate as well (and, subjectively, in a better way!) and it also has a catalog manager. It shows that the author is sadly only familiar with KBabel and not the other tools he writes about...

#

Where is the automatic translator?

Posted by: JelleB on March 23, 2005 08:39 PM
The strings in most applications are not very complex. There are online translation services that translate from and to several languages (Babelfish being the most famous). What I have not found yet is a tool that feeds the POT strings to one of these services. The results will not be very good, but it is better than nothing, and you only need to do that for unknown strings.

Having a central translations repository would be very nice too, if it suggests the closest matches to choose from. Translating is hard work, and if you want to rely on volunteers to do it, you might as well make it as easy as possible for them.

Yes, I know, if I want it I should write it myself. (I don't want it that bad, I guess.)

#

Short strings are dependent on context

Posted by: Anonymous Coward on March 23, 2005 09:49 PM
Many of the short strings are dependent on the context, and sometimes the original string is not that well thought out. That makes it too difficult for current automatic translation tools like the fish.

Translation is hard work, and one reason for that is the need to keep checking on the context. For example, one recent project used strings like "tag" and "issue" in three different meanings each. Other times, it was not possible to guess from the short text what was meant without actually figuring out which menu / dialog message it was for.


Feeding the strings through an automated translation service would probably be worse than leaving them in English.

#

Re:Short strings are dependent on context

Posted by: Anonymous Coward on March 24, 2005 12:53 AM
Well, as a SW translator, I can tell you that sometimes, it's better to leave the SW in English even if you compare the English version and the rigorously human-translated SW :)

#

Re:Short strings are dependent on context

Posted by: JelleB on March 25, 2005 05:32 AM
You are right that you need the context to do a good translation, and that a human translator is much better than a machine service. But the problem lies with manpower, not with finding the best quality translation. If a machine tool gets 90% correct, then that is 90% a human does not need to do. That means that the software may be useful in the translated language. The 10% incorrect strings are the itch for the next translator. If you do not provide the itch, the chances are much lower that somebody will scratch it.

Just MHO, take with regular dose of NaCl.

#

Re:Short strings are dependent on context

Posted by: Anonymous Coward on March 25, 2005 05:48 AM
Even if automated translators can perform 90% well (which I doubt), you could argue that no translation is better than a bad one from a user-image point of view - a user that sees no translation may realize (especially if there's a message) that the project isn't along to that stage yet; a user who sees a crappy translation will get a bad image of the project, especially if it causes them to accidentally rm -rf /. I certainly get a bad opinion of programs that are poorly translated into English...

#

Re:Short strings are dependent on context

Posted by: Anonymous Coward on March 25, 2005 03:25 PM
I definitely agree that no translation is better than a bad one, IN THE CASE OF SOFTWARE. Another reason is this: the interface may not be fully understandable to some people if it is all English, but at least it will be true to its meaning. A bad or incorrect translation, however, could be flat-out incorrect or misleading.

If the argument were to machine translate a website, with flowing text (i.e. normal sentences within the context of paragraphs, etc.), then I would find it acceptable to use machine translation to capture 90% of the meaning.

In software, however, messages are often there to instruct, lead, etc. the user. It is very dangerous to put the entire system at the mercy of machine-generated auto-translations.

Thanks.

#

Reporting errors

Posted by: Anonymous Coward on March 30, 2005 03:16 PM

The author might have already reported errors with oo2po to translate-devel@lists.sourceforge.net; I haven't checked. If not, please report them.



I use oo2po extensively. Personally, I don't know why anyone would ever edit the OpenOffice GSI file directly; it is just too error-prone. The whole of the OpenOffice.org 2.0 community translation now works with PO files generated by oo2po, which are integrated successfully using po2oo.

#




 