Trolltech | Documentation | Qt Quarterly

Achtung! Binary and Character Data
by Jasmin Blanchette
Qt's QString, QCString and QByteArray classes are all alternatives for old-fashioned and limited char arrays. Their proper use can make your applications faster, smaller, and more reliable. QString also brings the internationalization benefits of Unicode, and Qt's QDataStream and QTextStream classes make portable I/O easy. This article explains how to realize the benefits offered by these classes.

QByteArray is an Array of Chars

Qt provides the QMemArray<T> template class to store arrays of basic types and plain struct types. The class offers many convenient features, such as the ability to resize the array at any time. Once you start using it, you will abandon the archaic memcpy() and memset() functions. And if your application tries to access an element that is out of bounds, you will get a run-time warning.

Qt also provides QByteArray, which is really a QMemArray<char>. The array can contain any binary data, including '\0', and doesn't need to be '\0'-terminated. It is very similar to an old-fashioned C-style char array, only nicer.

QByteArray is an explicitly shared class. This makes the class more efficient in most cases, but can lead to subtle bugs. It is worth reading Shared Classes if you plan to use QByteArray extensively.
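To see why explicit sharing can bite, here is a minimal sketch in plain C++11 (no Qt required). SharedBytes is a hypothetical stand-in, not Qt's implementation; it only imitates the behavior described above, where plain assignment shares the buffer and copy() returns a deep copy.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an explicitly shared byte array.
// Copying a SharedBytes copies the handle, not the bytes.
class SharedBytes
{
public:
    explicit SharedBytes( std::size_t size )
        : d( std::make_shared<std::vector<char> >(size, '\0') ) {}

    char &operator[]( std::size_t i ) { return (*d)[i]; }
    std::size_t size() const { return d->size(); }

    // Deep copy: the result owns a private buffer.
    SharedBytes copy() const
    {
        SharedBytes result( 0 );
        result.d = std::make_shared<std::vector<char> >( *d );
        return result;
    }

private:
    std::shared_ptr<std::vector<char> > d;
};
```

With SharedBytes a( 3 ) and SharedBytes b = a, writing through b[0] also changes a[0], whereas an array obtained from a.copy() is unaffected.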

Be aware that the C++ standard lets compiler writers choose whether char is a signed char or an unsigned char. Programs that deal with char a lot are likely to assume that char is signed char at one point or another. Many compilers have command-line options for setting char's signedness; for example, MSVC's /J option and g++'s -funsigned-char option define char as unsigned char. You can use these options to test your programs.
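One defensive habit is to convert each char to unsigned char before treating it as a number. The sketch below makes no assumption about the compiler's choice; byteValue() and countHighBytes() are hypothetical helper names, not Qt functions.

```cpp
// Returns the value of a byte as a number in the range 0..255,
// whether the compiler treats char as signed or unsigned.
inline unsigned int byteValue( char c )
{
    return static_cast<unsigned char>( c );
}

// Counts the bytes that have their high bit set, such as the
// non-ASCII letters of a Latin-1 string.
inline int countHighBytes( const char *data, int size )
{
    int count = 0;
    for ( int i = 0; i < size; i++ ) {
        // "data[i] >= 0x80" would never be true where char is signed.
        if ( byteValue(data[i]) >= 0x80 )
            count++;
    }
    return count;
}
```

Code written this way behaves the same with and without -funsigned-char.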

QByteArray is the perfect class to store non-textual data in memory. If you don't like binary I/O and have loose space and memory constraints, you can easily read a whole file into a QByteArray and iterate over the array afterwards:

QFile file( "Friday's jam session.mp3" );
if ( file.open(IO_ReadOnly) ) {
    QByteArray array = file.readAll();
    file.close();
    for ( int i = 0; i < array.size(); i++ ) {
        // etc.
    }
}
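For comparison, here is a rough sketch of the same read-everything approach in plain C++, without Qt. The name readAll() is borrowed from QFile for illustration only; both helpers are hypothetical, and std::vector<char> stands in for QByteArray.

```cpp
#include <fstream>
#include <istream>
#include <iterator>
#include <sstream>   // std::istringstream, for in-memory use
#include <string>
#include <vector>

// Hypothetical helper: reads every remaining byte of a stream,
// '\0' bytes included.
std::vector<char> readAll( std::istream &in )
{
    return std::vector<char>( std::istreambuf_iterator<char>(in),
                              std::istreambuf_iterator<char>() );
}

// Hypothetical helper: reads a whole file. The binary flag stops the
// platform from translating line endings. A file that cannot be
// opened yields an empty array.
std::vector<char> readFile( const char *fileName )
{
    std::ifstream file( fileName, std::ios::in | std::ios::binary );
    return readAll( file );
}
```

Because readAll() takes any std::istream, it works equally well on a std::istringstream, which is convenient for testing.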

QCString is a String of Chars

The C vocabulary consisted of scientific and technical terms which it behooved no one but scientists and technicians to use.
-- George Orwell

The QCString class stores a classic C-style '\0'-terminated string of chars. QCString can be used to store any '\0'-terminated 8-bit string data, such as Latin-1, EUC-KR, and EBCDIC.

Thanks to inheritance, a QCString is also a QByteArray. The 5-character string "Hello" is also a 6-character array "Hello\0":

QCString s( "Hello" );
s.length(); // returns 5
s.size();   // returns 6

If you convert a QCString to a QByteArray, the '\0' terminator comes along:

QCString s( "Hello" );
QByteArray a = s.copy();
a.size();   // returns 6

(We call copy() to obtain a deep copy of the array.) You can truncate the array to get rid of the '\0':

a.truncate( a.size() - 1 );

If you convert a QByteArray to a QCString, care is also required: You must be certain that the QByteArray is '\0'-terminated. If it isn't, make it so:

int n = a.size();
a.resize( n + 1 );
a[n] = '\0';
QCString s = a.data();
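The same bookkeeping can be sketched without Qt, using std::vector<char> as a stand-in for QByteArray and std::string as a stand-in for QCString. The helper names toArray() and toCString() are made up for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// String -> array. The terminator is optional here, whereas a
// QCString always carries it along.
std::vector<char> toArray( const std::string &s, bool withTerminator )
{
    std::vector<char> a( s.begin(), s.end() );
    if ( withTerminator )
        a.push_back( '\0' );
    return a;
}

// Array -> string. Stops at the first '\0', or at the end of the
// array if there is none, so it never reads past the buffer.
std::string toCString( const std::vector<char> &a )
{
    std::string s;
    for ( std::size_t i = 0; i < a.size() && a[i] != '\0'; i++ )
        s += a[i];
    return s;
}
```

Checking the array bound in toCString() plays the same role as appending a '\0' before calling data(): either way, the conversion cannot run off the end of an unterminated array.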

Historically, QCString is the same as the old 8-bit QString in Qt 1.x, and was provided mostly to ease porting to Qt 2.0. There are very few contexts where QCString is preferable to QByteArray or QString. QCString's main benefit is that it requires only about half the memory of QString, but this is rarely an issue. In general, QString is the best class to use for user-visible strings, not least because it uses Unicode.

QString is a Unicode String

But why do you need to mix one- and two-byte characters anyway?
-- Ken Lunde

The QString class stores a 16-bit Unicode string. Unlike QCString, it isn't '\0'-terminated, and it can embed '\0's.

Conversion from QString to QByteArray or QCString, or vice versa, can be tricky. Qt provides operators and constructors to convert from one to the other conveniently, but that's not always enough.

  1. Conversion from QString to QCString.

    A QString composed entirely of Latin-1 characters (including ASCII) can easily be converted into a QCString:

    QString qstr( "Anders Ångström <anders@telia.se>" );
    QCString cstr = (const char *)qstr;
    

    But if the QString contains non-Latin-1 characters, the result is undefined.

    To disable this automatic (and perhaps undesirable) conversion, you can define the preprocessor symbol QT_NO_ASCII_CAST before including any Qt header. This will ensure that you must call latin1() explicitly:

    QCString cstr = qstr.latin1();
    

    If you use qmake, you can disable the conversion for your whole project by adding this line to the .pro file:

    DEFINES += QT_NO_ASCII_CAST
    

    There are other ways of converting a QString to a QCString (see the QString and QTextCodec class documentation); the utf8() function is one way, and it has the advantage of preserving all the information:

    QCString cstr = qstr.utf8();
    

  2. Conversion from QCString to QString.

    QCString doesn't mandate a particular encoding. If you store Latin-1 in a QCString, the conversion to Unicode is trivial:

    QCString cstr( "Carl Friedrich Gauß <gauss@gmx.de>" );
    QString qstr = cstr;
    

    If you use encodings other than Latin-1, this automatic conversion is dangerous. To disable it, define the preprocessor symbol QT_NO_CAST_ASCII (not to be confused with its cousin QT_NO_ASCII_CAST). You then need to call fromLatin1() explicitly to convert a Latin-1 QCString to a QString:

    QString qstr = QString::fromLatin1( cstr );
    

    A similar function exists for UTF-8-encoded strings.

  3. Conversion from QString to QByteArray.

    The easiest way to convert a QString into a QByteArray is the reductionist approach: Convert the QString into a QCString, then convert the QCString into a QByteArray:

    QString qstr( "Anders Ångström <anders@telia.se>" );
    QByteArray array = QCString( qstr );
    

    The resulting QByteArray will have a '\0' terminator.

  4. Conversion from QByteArray to QString.

    Qt converts a QByteArray into a QString automatically:

    QByteArray array;
    array.duplicate( "Hello\0World", 11 );  // deep copy; assign() would take ownership of the literal
    QString str = array;
    

    The constructor stops at the first '\0' character. In the above code snippet, str is the 5-character string "Hello", not the 11-character string "Hello\0World".
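To make the difference between a Latin-1 conversion and a UTF-8 conversion concrete, here is a rough sketch of both encodings in plain C++11. It is not Qt's implementation: toLatin1() and toUtf8() are hypothetical free functions, std::u16string stands in for QString, and replacing unrepresentable characters with '?' is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Encodes UTF-16 text as Latin-1. Code units above 0xFF have no
// Latin-1 equivalent; this sketch replaces them with '?', which
// loses information.
std::string toLatin1( const std::u16string &text )
{
    std::string out;
    for ( std::size_t i = 0; i < text.size(); i++ )
        out += ( text[i] <= 0xFF ) ? static_cast<char>( text[i] ) : '?';
    return out;
}

// Encodes UTF-16 text as UTF-8. For brevity, this sketch handles
// only the Basic Multilingual Plane (no surrogate pairs).
std::string toUtf8( const std::u16string &text )
{
    std::string out;
    for ( std::size_t i = 0; i < text.size(); i++ ) {
        unsigned int c = text[i];
        if ( c < 0x80 ) {
            out += static_cast<char>( c );
        } else if ( c < 0x800 ) {
            out += static_cast<char>( 0xC0 | (c >> 6) );
            out += static_cast<char>( 0x80 | (c & 0x3F) );
        } else {
            out += static_cast<char>( 0xE0 | (c >> 12) );
            out += static_cast<char>( 0x80 | ((c >> 6) & 0x3F) );
            out += static_cast<char>( 0x80 | (c & 0x3F) );
        }
    }
    return out;
}
```

A Latin-1 name such as "Ångström" survives both conversions, taking 8 bytes in Latin-1 and 10 in UTF-8. A euro sign (U+20AC) survives only the UTF-8 conversion.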

 ***
 
In summary, converting from one type to another can be risky because the computer cannot read the programmer's mind to ascertain the appropriate encodings. The following rules of thumb can help you achieve faster, more reliable code:
  1. Choose your data types carefully.
  2. Avoid unnecessary conversions.
  3. Check that the necessary conversions are correct.
Now let's see how these things relate to I/O.

QDataStream Streams Binary Data

We have two choices, either to attack I/O now and get it over with, or to postpone I/O until near the end.
Neither prospect is very attractive.

-- Donald E. Knuth

The QDataStream class can be used to read and write binary data in a file or on some other I/O "device." Here's how to write the 32-bit unsigned integer 0x92025428 in little endian in a file called "nitty":

QFile file( "nitty" );
if ( file.open(IO_WriteOnly) ) {
    QDataStream out( &file );
    out.setByteOrder( QDataStream::LittleEndian );
    out << (Q_UINT32)0x92025428;
}

And here's how to read a 16-bit signed integer in big endian (the default) from a file called "gritty":

QFile file( "gritty" );
if ( file.open(IO_ReadOnly) ) {
    QDataStream in( &file );
    Q_INT16 n;
    in >> n;
}
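A byte order is simply a rule for the order in which an integer's bytes are stored. The following sketch spells out both orders by hand in plain C++; putUInt32LE() and getInt16BE() are hypothetical helpers, and QDataStream's actual implementation may well differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends a 32-bit value, least significant byte first (little endian).
void putUInt32LE( std::vector<unsigned char> &out, uint32_t value )
{
    for ( int shift = 0; shift < 32; shift += 8 )
        out.push_back( static_cast<unsigned char>( (value >> shift) & 0xFF ) );
}

// Reads a 16-bit signed value stored most significant byte first
// (big endian). The caller must make sure that at least two bytes
// are available at "pos".
int16_t getInt16BE( const std::vector<unsigned char> &in, std::size_t pos )
{
    uint16_t raw = static_cast<uint16_t>( (in[pos] << 8) | in[pos + 1] );
    return static_cast<int16_t>( raw );
}
```

Writing 0x92025428 with putUInt32LE() produces the bytes 28 54 02 92, in that order. In big endian, the same value would be stored as 92 02 54 28.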

QDataStream supports a lot of types: Q_INT8, Q_INT16, Q_INT32, Q_UINT8, Q_UINT16, Q_UINT32, float, double, char *, QBitArray, QByteArray, QCString, QString, QVariant, and many more. The full list is available in Format of the QDataStream Operators.

Our first example will demonstrate how to save all the properties of a QObject on an I/O device (typically a file) and how to load them afterwards. Here's the code to write out the value of the properties:

const Q_UINT32 MagicNumber = 0x1A7D3EF6;

void writeProperties( QObject *obj, QIODevice *device )
{
    QDataStream out( device );
    out << (Q_UINT32)MagicNumber
        << (Q_UINT8)out.version();

    int n = obj->metaObject()->numProperties( true );
    for ( int i = 0; i < n; i++ ) {
        const QMetaProperty *prop =
                obj->metaObject()->property( i, true );
        if ( prop->writable() ) {
            out << prop->name()
                << obj->property( prop->name() );
        }
    }
}

The file starts with a 32-bit magic number, then an 8-bit version that specifies which version of QDataStream is used to write out the data. In Qt 3.1 QDataStream is at version 5; it is safe to assume that the version number fits in an 8-bit field.

When inputting or outputting complex types such as QVariant, it's very important to make sure that the same version of the stream is used for reading and writing. If you need both forward and backward compatibility, you can hardcode the version number in the application:

QDataStream out( device );
out.setVersion( 5 );
...
QDataStream in( device );
in.setVersion( 5 );

The disadvantage of hardcoding is that the application will not benefit from improvements in Qt 3.2 or later -- for example, the addition of a new color role in QColorGroup.

Here's how to read back the properties written with writeProperties():

void readProperties( QObject *obj, QIODevice *device )
{
    QDataStream in( device );
    Q_UINT32 magic;
    Q_UINT8 version;

    in >> magic;
    if ( magic != MagicNumber ) {
        qWarning( "Not property data" );
        return;
    }

    in >> version;
    if ( in.version() < version ) {
        qWarning( "Data is from a newer version" );
        return;
    }
    in.setVersion( version );

    QCString name;
    QVariant value;
    while ( !in.atEnd() ) {
        in >> name;
        if ( name.isNull() )
            break;
        in >> value;
        obj->setProperty( name, value );
    }
}
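Stripped of the Qt specifics, the header check boils down to the following sketch in plain C++. The names kMagic, HeaderStatus, writeHeader(), and readHeader() are invented for this illustration, and std::iostream stands in for QDataStream.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>   // std::stringstream, for in-memory use

const uint32_t kMagic = 0x1A7D3EF6;   // same value as MagicNumber

enum HeaderStatus { HeaderOk, NotOurFormat, TooNew };

// Writes the magic number (big endian) followed by a one-byte version.
void writeHeader( std::ostream &out, uint8_t version )
{
    for ( int shift = 24; shift >= 0; shift -= 8 )
        out.put( static_cast<char>( (kMagic >> shift) & 0xFF ) );
    out.put( static_cast<char>( version ) );
}

// Checks the header and reports the version stored in the stream.
// "supported" is the newest version that the reader understands.
HeaderStatus readHeader( std::istream &in, uint8_t supported,
                         uint8_t &version )
{
    uint32_t magic = 0;
    for ( int i = 0; i < 4; i++ )
        magic = ( magic << 8 ) | static_cast<unsigned char>( in.get() );
    if ( !in || magic != kMagic )
        return NotOurFormat;

    int v = in.get();
    if ( !in )
        return NotOurFormat;
    version = static_cast<uint8_t>( v );
    return ( version > supported ) ? TooNew : HeaderOk;
}
```

A reader that supports version 5 accepts data written as version 5 or earlier and rejects version 6 as too new, which mirrors the in.version() < version test in readProperties().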

In the following code snippet, we use writeProperties() and readProperties() to save a frame's properties across sessions:

QFrame *frame = new QFrame( parent );
QFile file( "properties" );
if ( file.open(IO_ReadOnly) )
    readProperties( frame, &file );
...
QFile file( "properties" );
if ( file.open(IO_WriteOnly) )
    writeProperties( frame, &file );
delete frame;

PNG file format

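Our second example reads the size of a PNG image. A PNG file starts with an 8-byte signature, followed by a series of chunks. Each chunk consists of a 32-bit big-endian length, a 4-byte chunk type, the chunk data, and a 32-bit CRC. The image's width and height are the first two fields of the image header ("IHDR") chunk. Here's the complete source code of a Qt program that prints out the width and height of the PNG images specified on the command line: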

#include <qbuffer.h>
#include <qcstring.h>
#include <qdatastream.h>
#include <qfile.h>
#include <qsize.h>
#include <string.h>   // for memcmp()

QSize getPngImageSize( QIODevice *device )
{
    Q_UINT32 width = 0;
    Q_UINT32 height = 0;
    char signature[8];
    char chunkType[4];

    QDataStream in( device );
    in.readRawBytes( signature, 8 );

    if ( memcmp(signature, "\211PNG\r\n\32\n", 8) == 0 ) {
        while ( !in.atEnd() ) {
            Q_UINT32 length;
            in >> length;
            in.readRawBytes( chunkType, 4 );
            if ( memcmp(chunkType, "IHDR", 4) == 0 ) {
                in >> width >> height;
                break;
            }

            // jump to the next chunk
            device->at( device->at() + length + 4 );
        }
    }
    return QSize( width, height );
}

int main( int argc, char **argv )
{
    if ( argc < 2 ) {
        qWarning( "Usage: pngsize file1.png etc." );
        return 1;
    }

    for ( int i = 1; i < argc; i++ ) {
        QFile file( QFile::decodeName(argv[i]) );
        if ( file.open(IO_ReadOnly) ) {
            QSize size = getPngImageSize( &file );
            qWarning( "%s: %d x %d", argv[i],
                      size.width(), size.height() );
        } else
            qWarning( "Cannot open file '%s'", argv[i] );
    }
    return 0;
}

(If speed isn't important, and Qt is configured with PNG support, you can let QImage do the hard work:

QSize size = QImage( argv[i] ).size();

But that doesn't demonstrate QDataStream.)

Now let's suppose that we have a PNG image embedded in our application using qembed or a similar tool:

static const char png_data[] = {
    0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a,
    0x00, 0x00, 0x00, 0x0d, 0x49, 0x48, 0x44, 0x52,
    ...
    0x44, 0x75, 0x74, 0x63, 0x68, 0x6d, 0x61, 0x6e,
    0x44, 0xae, 0x42, 0x60, 0x82
};

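How can we determine the size of that image without creating a QImage? The answer follows from these four facts:

  1. getPngImageSize() accepts any QIODevice, not just a QFile.
  2. QBuffer is a QIODevice that operates on a QByteArray.
  3. QByteArray::setRawData() makes a QByteArray refer to existing data without copying it.
  4. QByteArray::resetRawData() must be called when we are done, so that the array doesn't try to free data that it doesn't own.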

Here's how to transform these ideas into running code:

QByteArray array;
array.setRawData( png_data, sizeof(png_data) );
QBuffer buffer( array );
buffer.open( IO_ReadOnly );
QSize size = getPngImageSize( &buffer );
array.resetRawData( png_data, sizeof(png_data) );
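For comparison, the same chunk walk can be done directly on a memory buffer, with no I/O device at all. The sketch below is plain C++ and only an illustration: pngSizeFromMemory() and readUInt32BE() are hypothetical names, and unlike getPngImageSize() the function reports failure explicitly when the data is truncated.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reads a big-endian 32-bit value from four bytes.
static uint32_t readUInt32BE( const unsigned char *p )
{
    return ( uint32_t(p[0]) << 24 ) | ( uint32_t(p[1]) << 16 )
         | ( uint32_t(p[2]) << 8 ) | uint32_t(p[3]);
}

// Finds the IHDR chunk of a PNG image held in memory and extracts
// the width and height. Returns false if the data is not a PNG image
// or is truncated.
bool pngSizeFromMemory( const unsigned char *data, std::size_t size,
                        uint32_t &width, uint32_t &height )
{
    static const unsigned char signature[8] =
        { 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n' };
    if ( size < 8 || std::memcmp(data, signature, 8) != 0 )
        return false;

    std::size_t pos = 8;
    while ( size - pos >= 8 ) {   // room for length + chunk type
        uint32_t length = readUInt32BE( data + pos );
        if ( std::memcmp(data + pos + 4, "IHDR", 4) == 0 ) {
            if ( size - pos - 8 < 8 )
                return false;
            width = readUInt32BE( data + pos + 8 );
            height = readUInt32BE( data + pos + 12 );
            return true;
        }
        // Skip the chunk data and the 4-byte CRC that follows it.
        uint64_t skip = uint64_t( length ) + 4;
        if ( size - pos - 8 < skip )
            return false;
        pos += 8 + std::size_t( skip );
    }
    return false;
}
```

The Qt version is more flexible, because the same getPngImageSize() serves files and memory alike. The pointer-based version trades that flexibility for having no dependencies.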

QTextStream Streams Text

QTextStream reads and writes data in a text file or any other I/O device. For example, here's how to write "Hello world!\n" to a file called "greetings":

QFile file( "greetings" );
if ( file.open(IO_WriteOnly) ) {
    QTextStream out( &file );
    out << "Hello world!\n";
}

QTextStream's main feature is that it can use a QTextCodec to convert between the data on the byte-based I/O device and the 16-bit Unicode QString class. By default, it uses the 8-bit encoding prescribed by the locale (Latin-1 in most of Europe).

Unfortunately, QTextCodec support makes QTextStream much slower than binary I/O. Trolltech has plans to make QTextStream faster in Qt 4.0.

If you want QTextStreams to work seamlessly in any locale, you usually need to reflect on the type of data that you are dealing with. If you're writing a tool that reads or writes C++ source code, you probably want something like this:

stream.setEncoding( QTextStream::Latin1 );

If you write XML, you will want either

stream.setEncoding( QTextStream::UnicodeUTF8 );

or

stream.setEncoding( QTextStream::Unicode );

unless you specify another encoding explicitly in the <?xml?> header.

QTextStream can even operate on a QString. The two convenience subclasses QTextIStream and QTextOStream make this very easy:

QString str;
QTextOStream( &str ) << 22 << " " << "trees";
// str == "22 trees"

int n;
QString thing;
QTextIStream( &str ) >> n >> thing;
// n == 22, thing == "trees"

Much of what you already know from <iostream> will still work. And here's how to simulate cout with QTextOStream:

QTextOStream qout( stdout );
qout << oct << 25 << " = " << dec << 25 << endl;  // prints "31 = 25"
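For readers who want to compare, here are the standard-library counterparts of the string-based examples, as a small sketch. The function names describe(), parse(), and octalAndDecimal() are invented for the illustration; the stream classes and manipulators are the usual ones from <sstream> and <ios>.

```cpp
#include <sstream>
#include <string>

// Counterpart of the QTextOStream example: format into a string.
std::string describe( int count, const std::string &thing )
{
    std::ostringstream out;
    out << count << " " << thing;
    return out.str();
}

// Counterpart of the QTextIStream example: parse a number and a word.
// Returns false if the text doesn't have that shape.
bool parse( const std::string &text, int &count, std::string &thing )
{
    std::istringstream in( text );
    return !( in >> count >> thing ).fail();
}

// The oct and dec manipulators switch the base used for integers.
std::string octalAndDecimal( int value )
{
    std::ostringstream out;
    out << std::oct << value << " = " << std::dec << value;
    return out.str();
}
```

The main difference lies in what happens to the characters: the standard streams pass bytes through unchanged, whereas QTextStream runs them through a QTextCodec.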

The article Generating XML presents an application of QTextStream.


This document is licensed under the Creative Commons Attribution-Share Alike 2.5 license.

Copyright © 2003 Trolltech.