Forgot a tr()?
by Jasmin Blanchette
Qt's tr() mechanism for internationalization is easy to grasp, easy to use, and easy to misuse. This article gives some tips to ensure that all of an application's user-visible strings go through tr(), and that lupdate finds them all. Monolingual application developers are not left out: This article also talks about QRegExp, XML, and Swedish.

Common Blunders
[BLUNDER:] From Middle English BLUNDEREN, to go blindly, perhaps from
Old Swedish BLUNDRA, have one's eyes closed, from Old Norse BLUNDA.

-- American Heritage Dictionary

The tr() function is dual in nature. It is both an lupdate marker and a C++ function. Forgetting this duality can lead to blunders. Some blunders result in compiler errors, others are caught by lupdate, and a few go undetected and result in a message to qt-bugs@trolltech.com. The blunders of the last kind are those that interest us here. We'll start by looking at the problems that can arise, and then we'll present solutions.

  1. Forgetting a tr().

    It is easy to forget to enclose user-visible strings in a tr() call. Neither lupdate nor the C++ compiler can distinguish between user-visible and user-invisible strings. Only QPainter ends up knowing that, but by then it's too late: "Cannot open %1" is unrecognizable as "Cannot open C:\index.html".

  2. Using tr() on a variable.

    The arguments to tr() must be string literals. This works:

        QPushButton *ok = new QPushButton( tr("OK"), this );
    

    This doesn't:

        QCString str = "OK";
        QPushButton *ok = new QPushButton( tr(str), this );
    

    Sure enough, the string "OK" goes through tr(), but lupdate will miss the second example, and so will your Japanese translator. Although "OK" is okay in most languages, it won't do in Japanese.

    A more sophisticated version of this blunder involves temporary objects:

        tr( QString("Cannot open %1").arg(fileName) )
        tr( "Cannot open " + fileName )
    

    The correct idiom is

        tr( "Cannot open %1" ).arg( fileName )
    

  3. Using QT_TR_NOOP() without tr().

    The QT_TR_NOOP() macro and its lengthy cousin QT_TRANSLATE_NOOP() are rarely used, but when they are, they are often used wrongly. Unlike tr() and translate(), these macros perform no translation and simply mark the strings for translation. The only result of

        const char *strings[2] = {
            QT_TR_NOOP("OK"),
            QT_TR_NOOP("Cancel")
        };
        for ( int i = 0; i < 2; i++ ) {
            QButton *but = new QPushButton( strings[i], this );
        }
    

    is extra work for your Japanese translator, and a false sense of safety. A solution here is to enclose strings[i] in a tr() call. Another is to banish these funky macros. Reformed Pascal programmers have an advantage here: They know how to get along without string arrays.

  4. Calling tr() on an object.

    tr() is a static member function. The normal way to invoke it from outside MyWidget is MyWidget::tr(). The C++ compiler also lets myWidget->tr() pass, but lupdate isn't smart enough to deduce the type of myWidget. In Qt 3.1, lupdate will emit a warning in this case ("Cannot invoke tr() like this").

  5. Forgetting a Q_OBJECT.

    Every now and then, somebody comes to my office and says "tr() doesn't work!" I then spend half an hour trying to isolate the problem, recompiling Qt with qDebug() statements in the QTranslator code, to finally discover that the tr() call appeared in a class with no Q_OBJECT macro. The Q_OBJECT macro is necessary for signals, slots, properties, and tr(). Without it, tr() uses the name of the base class as the context, whereas lupdate still uses the name of the class in question. Recent versions of lupdate recognize this blunder and emit a warning ("Class MyClass lacks Q_OBJECT macro").

  6. Installing the QTranslator too late.

    Every now and then, somebody comes to my office and says "tr() doesn't work!" I then spend half an hour trying to isolate the problem, recompiling Qt with qDebug() statements in the QTranslator code, to finally discover that the QTranslator object was installed on the qApp object after the application's main window was created. Let's be careful.

Macro or Virtual Function?

The belief that tr() is a macro or a virtual member function of QObject is widespread. The truth is, tr() is a static function, and is therefore non-virtual, and certainly not a macro. Every class with a Q_OBJECT macro has tr() and trUtf8() declared and implemented automatically.

You might wonder why tr() isn't implemented once and for all in QObject, using className() to provide a context. The reason is this: Let A be a QObject subclass, and let B be an A subclass. Suppose that A's source code contains this:

    setText( tr("Hello") );

The string "Hello" is, to lupdate, in the context "A". However, when this code is executed for an instance of B, className() returns "B", so tr() would wrongly attempt to translate it in the context "B".
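The difference can be played through without Qt. The following is only a toy model in standard C++11: Catalogue, lookup(), dynamicTr(), and the greeting functions are invented for this illustration and are not part of Qt, and the real tr() takes no catalogue argument. It shows a per-class static tr() finding the string that was filed under "A", while a className()-based lookup misses it for an instance of B.

```cpp
#include <map>
#include <string>
#include <utility>

// Toy message catalogue keyed by (context, source text); it stands in
// for a .qm file. All names in this sketch are hypothetical.
typedef std::map<std::pair<std::string, std::string>, std::string> Catalogue;

inline std::string lookup(const Catalogue &cat, const std::string &context,
                          const std::string &source)
{
    Catalogue::const_iterator it = cat.find(std::make_pair(context, source));
    return it == cat.end() ? source : it->second;   // untranslated fallback
}

struct A {
    virtual ~A() {}
    virtual const char *className() const { return "A"; }

    // Roughly what Q_OBJECT arranges: a static tr() with a fixed context.
    static std::string tr(const Catalogue &cat, const std::string &s)
    { return lookup(cat, "A", s); }

    // The rejected design: a single tr() that asks className().
    std::string dynamicTr(const Catalogue &cat, const std::string &s) const
    { return lookup(cat, className(), s); }

    // Code that lives in A's source file and is inherited by B.
    std::string greetingStatic(const Catalogue &cat) const
    { return tr(cat, "Hello"); }
    std::string greetingDynamic(const Catalogue &cat) const
    { return dynamicTr(cat, "Hello"); }
};

struct B : A {
    const char *className() const override { return "B"; }
};
```

With a catalogue that maps ("A", "Hello") to a translation, calling greetingStatic() on a B object returns the translation, whereas greetingDynamic() looks under "B" and falls back to plain "Hello".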

It is important to fix all these blunders before the .ts files are sent for translation. The following sections will demonstrate how to spot and fix the tr() blunders.

Getting a Grep

The first tool to use to find occurrences of blunders #1, #2, and #3 is grep. Windows programmers can also try to use findstr, or download one of the many grep ports from a software tool provider such as Borland, MKS, or GNU.

The following command will find strings that aren't surrounded by tr() (blunder #1):

    grep -n '"' *.cpp | grep -v 'tr('

The -n option tells grep to print line number information. The -v option inverts the match, so the above reads, "Find all lines that contain a double quote (") but that don't contain tr(". This approach does find missing tr() calls, but it also finds many innocent strings that need no translation, such as widget names and XPM data. We'll soon see a better method to spot blunder #1.

Blunder #2, using tr() on a variable, is easier to spot:

    grep -n 'tr(' *.cpp | grep -v '"'

This is essentially the opposite of the above: "Find all lines that contain tr( but that don't contain "."

If the application uses qApp->translate(), use the greps above, using translate in place of tr.

Blunder #3, using QT_TR_NOOP() wrongly, can easily be prevented by looking at the code. Run grep with the -l (hyphen ell) option to obtain a list of files that use QT_TR_NOOP() or QT_TRANSLATE_NOOP():

    grep -l 'QT_TR' *.cpp

For small projects, grep will often be enough. Larger projects will benefit from the Swedish chef presented below.
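For readers who have neither grep nor findstr at hand, the two line filters above are simple enough to sketch in standard C++. The names Suspect and classifyLine() are made up for this example. Like the greps, the test is purely textual, so it shares their weaknesses: it also flags innocent strings, and it treats any "tr(" (even the tail of "str(") as a tr() call.

```cpp
#include <string>

// Hypothetical classification of one source line, mirroring the greps.
enum Suspect { None, MissingTr, TrOnVariable };

inline Suspect classifyLine(const std::string &line)
{
    const bool hasQuote = line.find('"') != std::string::npos;
    const bool hasTr = line.find("tr(") != std::string::npos;

    if (hasQuote && !hasTr)
        return MissingTr;       // grep -n '"' | grep -v 'tr('   (blunder #1)
    if (hasTr && !hasQuote)
        return TrOnVariable;    // grep -n 'tr(' | grep -v '"'   (blunder #2)
    return None;
}
```

A driver would read each .cpp file line by line, call classifyLine(), and print the file name and line number for anything other than None.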

Mock Swedish
Belbo orders Abu to change all words, make each "a" become "akka" and
each "o" become "ulla," for a paragraph to look almost Finnish.

-- Umberto Eco

The most convenient method to spot missing tr() calls -- the No. 1 blunder as well as blunder #1 -- is to run the application with a translation. But what if none is available yet?

The eye-eighteen-enn business's response to this problem takes the form of the Swedish chef character from the Muppet Show. The idea is to use a program to translate the original English application to the language spoken by the Swedish chef, which we'll call Mock Swedish.[1] The application is then run with the Mock Swedish translation enabled, and any English word that appears untranslated at this point is caused by bad tr() usage.

Here are some English strings from a Qt application and their Mock Swedish equivalents:

    English                     Mock Swedish
    &File                       &Feele-a
    Big pink pig                Beeg peenk peeg
    Qt Quarterly                Qt Qooerterly
    Internationalization        Interneshuneleezeshun
    Do you want to save %1?     Du yuoo vunt tu sefe-a %1? Bork Bork Bork!

The advantage of using the Swedish chef is that the result looks indisputably foreign -- making it easy to spot plain English words -- and yet, the application is still usable by the tester, who can make sense of most Mock Swedish.

There are, of course, no technical reasons to prefer a Swedish chef to a German hot-dog vendor or a Japanese sushi cook. Any language that looks reasonably foreign and that still is readable will do.

This table summarizes the spelling changes:

    an   -> un        -f  -> ff         th|   -> t
    au   -> oo        -ir -> ur         -tion -> shun
    a-   -> e         -i  -> ee or i    -u    -> oo
    en|  -> ee        -ow -> oo         v     -> f
    -ew  -> oo        |o  -> oo         w     -> v

The "|" sign stands for a word boundary, and "-" stands for mid-word. Thus, according to rule "-f", "f" is replaced by "ff" wherever it occurs, except in words like "fool" where "f" is preceded by a word boundary. A lex program that implements the Swedish chef is available from www.almac.co.uk/chef/chef/ftp_chef.html.
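To give a feel for how such rules turn into code, here is a minimal sketch in standard C++ that implements only five of them: "an", "-tion", "-i", "v", and "w". It is not the lex program mentioned above. The function name swedishChef() is invented, only lower-case letters are rewritten, and reading "ee or i" as "ee for the first mid-word i of a word, i afterwards" is an assumption of this sketch.

```cpp
#include <cctype>
#include <string>

// Partial, hypothetical Swedish chef: five rules from the table only.
std::string swedishChef(const std::string &in)
{
    std::string out;
    bool midWord = false;   // true if the previous character was a letter
    bool seenI = false;     // has this word already had an "i" -> "ee"?
    std::size_t i = 0;

    while (i < in.size()) {
        const char c = in[i];
        if (!std::isalpha(static_cast<unsigned char>(c))) {
            out += c;                       // spaces, digits, '%', '&', ...
            midWord = false;
            seenI = false;
            ++i;
            continue;
        }
        if (in.compare(i, 2, "an") == 0) {                      // an -> un
            out += "un";
            i += 2;
        } else if (midWord && in.compare(i, 4, "tion") == 0) {  // -tion -> shun
            out += "shun";
            i += 4;
        } else if (midWord && c == 'i') {                       // -i -> ee or i
            out += seenI ? "i" : "ee";
            seenI = true;
            ++i;
        } else if (c == 'v') {                                  // v -> f
            out += 'f';
            ++i;
        } else if (c == 'w') {                                  // w -> v
            out += 'v';
            ++i;
        } else {
            out += c;
            ++i;
        }
        midWord = true;
    }
    return out;
}
```

With this subset, "Big pink pig" becomes "Beeg peenk peeg", and placeholders such as %1 pass through untouched because non-letters are copied verbatim. Strings that depend on the remaining rules will come out less Swedish than in the table above.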

A few words are identical in English and Mock Swedish. The algorithm can be refined to ensure that the translated string is different from the original, by appending "bork" or by using some other cheap trick.

English: File Edit Preferences Help
Mock Swedish: Feele-a Ideet Prefferences Help bork
Backslang: Elif Tide Secnereferp Pleh
Caps Lock: fILE eDIT pREFERENCES hELP
Mixed Case: fIlE eDiT pReFeReNcEs hElP
No Vowels: Fl Dt Prfrncs Hlp
AltaVista German: Datei Bearbeiten Sie Präferenzen Hilfe
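Several of the disguises listed above take only a few lines each. The following standard C++ sketch shows Caps Lock, Mixed Case, and No Vowels, together with the "bork" fallback for strings that come out unchanged. The function names are invented for this illustration, and the handling of capitals in noVowels() (a dropped capital vowel passes its capital on to the next letter) is inferred from the "Dt" in the list rather than stated anywhere.

```cpp
#include <cctype>
#include <string>

// "Caps Lock": swap the case of every letter.
std::string capsLock(const std::string &in)
{
    std::string out = in;
    for (char &ch : out) {
        const unsigned char u = static_cast<unsigned char>(ch);
        if (std::islower(u))
            ch = static_cast<char>(std::toupper(u));
        else if (std::isupper(u))
            ch = static_cast<char>(std::tolower(u));
    }
    return out;
}

// "Mixed Case": alternate lower and upper case, restarting at each word.
std::string mixedCase(const std::string &in)
{
    std::string out;
    bool upper = false;
    for (char ch : in) {
        const unsigned char u = static_cast<unsigned char>(ch);
        if (!std::isalpha(u)) {
            out += ch;
            upper = false;      // next word starts in lower case again
            continue;
        }
        out += static_cast<char>(upper ? std::toupper(u) : std::tolower(u));
        upper = !upper;
    }
    return out;
}

// "No Vowels": drop a, e, i, o, u; keep the word's initial capital.
std::string noVowels(const std::string &in)
{
    std::string out;
    bool capitalizeNext = false;
    for (char ch : in) {
        const unsigned char u = static_cast<unsigned char>(ch);
        const char lower = static_cast<char>(std::tolower(u));
        if (lower == 'a' || lower == 'e' || lower == 'i' ||
            lower == 'o' || lower == 'u') {
            if (std::isupper(u))
                capitalizeNext = true;
            continue;
        }
        if (capitalizeNext && std::isalpha(u))
            out += static_cast<char>(std::toupper(u));
        else
            out += ch;
        capitalizeNext = false;
    }
    return out;
}

// The cheap trick: make sure the result differs from the original.
std::string ensureDifferent(const std::string &original,
                            const std::string &mocked)
{
    return mocked == original ? mocked + " bork" : mocked;
}
```

Because digits and punctuation are left alone, placeholders such as %1 and accelerators such as & survive all three disguises.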

Writing a .ts Converter
How to represent this in a standard ASCII file remains a mystery.
-- Paul M. Roberts

We will now write the program that parses an untranslated .ts file and generates a Mock Swedish .ts file. These files are XML files, so it is tempting to use Qt's SAX or DOM parsers to read them in. Instead we'll opt for a short solution using QRegExp, a class that holds no secrets for Qt Quarterly's early adopters. The task is fairly easy; we need to replace lines like

    <source>&File</source>
    <translation type="unfinished"></translation>

with

    <source>&File</source>
    <translation>&Feele-a</translation>

and save the result.

The bulk of the job -- reading the .ts file, putting in translations, and writing the result to a .ts.chef file -- is just 12 semicolons long thanks to QFile and QRegExp:

    // "." in QRegExp also matches newlines, so one pattern can span the
    // <source> line and the <translation> line that follows it.
    QRegExp rx( "(<source>(.*)</source>.*<translation)[^>]*>"
                "</translation>" );
    rx.setMinimal( TRUE );    // non-greedy: stop at the nearest match

    QFile in( fileName );
    if ( in.open(IO_ReadOnly) ) {
        QFile out( fileName + ".chef" );
        if ( out.open(IO_WriteOnly) ) {
            QString str = in.readAll();
            int pos = 0;
            while ( (pos = str.find(rx, pos)) != -1 ) {
                // cap(1) is everything up to "<translation",
                // cap(2) is the source text to be mocked
                QString newText = rx.cap(1) + ">" +
                        mock( rx.cap(2) ) + "</translation>";
                str.replace( pos, rx.matchedLength(),
                             newText );
                pos += newText.length();
            }
            out.writeBlock( str.utf8(), str.utf8().length() );
            out.close();
        }
        in.close();
    }

The function mock() is given the job of converting the English string to Muck Svedeesh, Gnalskcab, cAPS lOCK, mIxEd CaSe, N Vwls, or AltaVista-Deutscher, as preferred.
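The same substitution can be tried out without Qt. The sketch below redoes the search-and-replace with std::regex from standard C++11; it is an analogue for experimenting, not the converter above. The name fillTranslations() is invented, and the mock() shown here is just a case-swapping stand-in that copies XML entities such as &amp; verbatim, on the assumption that the input is well-formed XML.

```cpp
#include <cctype>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical stand-in for mock(): swaps letter case, but leaves
// anything between '&' and ';' alone so entities stay intact.
std::string mock(const std::string &s)
{
    std::string out;
    bool inEntity = false;
    for (char ch : s) {
        const unsigned char u = static_cast<unsigned char>(ch);
        if (ch == '&')
            inEntity = true;
        if (!inEntity && std::islower(u))
            out += static_cast<char>(std::toupper(u));
        else if (!inEntity && std::isupper(u))
            out += static_cast<char>(std::tolower(u));
        else
            out += ch;
        if (ch == ';')
            inEntity = false;
    }
    return out;
}

// Fill every empty <translation> with the mocked <source> text.
std::string fillTranslations(const std::string &ts)
{
    // [\s\S] stands in for a "." that also matches newlines, and "*?"
    // is the non-greedy form that setMinimal( TRUE ) selects in QRegExp.
    static const std::regex rx(
        "(<source>([\\s\\S]*?)</source>[\\s\\S]*?<translation)[^>]*>"
        "</translation>");

    std::string out;
    std::size_t last = 0;
    for (std::sregex_iterator it(ts.begin(), ts.end(), rx), end;
         it != end; ++it) {
        const std::smatch &m = *it;
        const std::size_t pos = static_cast<std::size_t>(m.position(0));
        out.append(ts, last, pos - last);           // text before the match
        out += m.str(1) + ">" + mock(m.str(2)) + "</translation>";
        last = pos + static_cast<std::size_t>(m.length(0));
    }
    out.append(ts, last, std::string::npos);        // text after last match
    return out;
}
```

Like the QRegExp version, this assumes an untranslated .ts file: a message that already has a translation would be skipped over and its source paired with the next empty translation. Some std::regex implementations also cope poorly with very large inputs, so this is best kept for experiments.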

Subclassing QTranslator
I have yet to see an interesting piece of code that comes from these OO people.
-- Alexander Stepanov

A completely different approach, which doesn't require the use of QRegExp or XML, is to subclass QTranslator. The QTranslator::findMessage() function is virtual and can be reimplemented as follows:

    QTranslatorMessage MockTranslator::findMessage(
            const char *context, const char *sourceText,
            const char *comment ) const
    {
        return QTranslatorMessage( context, sourceText, comment, mock(sourceText) );
    }

Once again, mock() does the real work.

When running the application with MockTranslator installed, blunders #1 and #3 result in untranslated English text in a sea of Swedish.

So far, we haven't invoked lupdate. This is a flaw, since we know that it is possible to fool lupdate by using tr() on a variable -- blunder #2. This can be tested as follows. Run lupdate to generate a .ts file, and then use the "search and replace" feature of a text editor to replace every occurrence of

    <translation type="unfinished"></translation>

with

    <translation>Mock me!</translation>

Convert the edited .ts file to a .qm file with lrelease, and load that file into a QTranslator subclass that reimplements findMessage() as follows:

    QTranslatorMessage MockTranslator::findMessage(
            const char *context, const char *sourceText,
            const char *comment ) const
    {
        QTranslatorMessage msg = QTranslator::findMessage(
                context, sourceText, comment );
        if ( msg.translation() == "Mock me!" )
            msg.setTranslation( mock(sourceText) );
        else
            qWarning( "Blunder #2 with %s, %s, %s", context,
                      sourceText, comment );
        return msg;
    }

This is the best approach to catch blunder #2.

When tr() is Not Enough

The lupdate tool supports C++ source files and Qt Designer .ui files. Some applications store user-visible strings in other places: in databases, configuration files, scripts, etc. There are a few approaches that can be used to integrate these sources into the lupdate, Qt Linguist, lrelease cycle.

The simplest way is to write a C++ program that extracts the strings from wherever they come from and generates a .ts file. The .ts file can then be translated using Qt Linguist, and converted into a .qm file. QApplication supports multiple .qm files simultaneously, so there's no problem in combining this technique with others. The disadvantage of this technique is that it bypasses lupdate's merging algorithm.
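As a rough illustration of the generating half of such a program, the standard C++ sketch below writes a single-context .ts file for a list of strings. The names xmlEscape() and makeTs() are invented, where the strings come from is left open, and the layout is only what an lupdate-generated file for Qt 3 looks like to the best of my knowledge, so compare the output with a real .ts file before relying on it.

```cpp
#include <string>
#include <vector>

// Escape the characters that are special in XML text.
std::string xmlEscape(const std::string &s)
{
    std::string out;
    for (char ch : s) {
        switch (ch) {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        default:   out += ch;       break;
        }
    }
    return out;
}

// Hypothetical helper: one context, one unfinished message per string.
std::string makeTs(const std::string &context,
                   const std::vector<std::string> &strings)
{
    std::string out = "<!DOCTYPE TS><TS>\n<context>\n    <name>"
                      + xmlEscape(context) + "</name>\n";
    for (const std::string &s : strings) {
        out += "    <message>\n        <source>" + xmlEscape(s)
             + "</source>\n"
               "        <translation type=\"unfinished\"></translation>\n"
               "    </message>\n";
    }
    out += "</context>\n</TS>\n";
    return out;
}
```

At run time, the application would then look these strings up with qApp->translate(), passing the same context name that was written to the file.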

An alternative is to write a program that generates fake C++ code, which can then be read by lupdate. Here's an example of such code:

    #if 0
    qApp->translate( "CustomerDB", "Agency" );
    qApp->translate( "CustomerDB", "Company" );
    qApp->translate( "CustomerDB", "Foreign" );
    qApp->translate( "CustomerDB", "International" );
    #endif

This approach adds a slight complication to your build system, but ensures that you benefit from lupdate's merging capabilities.
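A generator for such a file can be very small. The following standard C++ sketch produces the block shown above from a context name and a list of strings; cxxEscape() and fakeSource() are names invented for this example, and fetching the strings from the database or configuration file is left out.

```cpp
#include <string>
#include <vector>

// Escape a string so that it can sit inside a C++ string literal.
std::string cxxEscape(const std::string &s)
{
    std::string out;
    for (char ch : s) {
        switch (ch) {
        case '\\': out += "\\\\"; break;
        case '"':  out += "\\\""; break;
        case '\n': out += "\\n";  break;
        default:   out += ch;     break;
        }
    }
    return out;
}

// Hypothetical generator: one qApp->translate() line per string,
// wrapped in #if 0 so that the compiler never sees the calls.
std::string fakeSource(const std::string &context,
                       const std::vector<std::string> &strings)
{
    std::string out = "#if 0\n";
    for (const std::string &s : strings) {
        out += "qApp->translate( \"" + cxxEscape(context) + "\", \""
             + cxxEscape(s) + "\" );\n";
    }
    out += "#endif\n";
    return out;
}
```

The build system would write the result to a .cpp file and list that file among the sources that lupdate scans.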


[1] Mock Swedish bears no resemblance to real Swedish, unless the program that generates it is fed with Danish, Norwegian, or Swedish.


This document is licensed under the Creative Commons Attribution-Share Alike 2.5 license.

Copyright © 2002 Trolltech.