Unicode in Qt
Unicode is the standard for encoding text in almost all of the world's languages. It is now the native text encoding on most modern operating systems. The major exception is Microsoft Windows, which still has a dual system that supports both code pages and Unicode for applications.
These classes are relevant when working with string data. For information about rendering text, see the Rich Text Processing overview, and if your string data is in XML, see the XML Processing overview.
- QAnyStringView: Unified view on Latin-1, UTF-8, or UTF-16 strings with a read-only subset of the QString API
- QByteArray: Array of bytes
- QByteArrayList: List of byte arrays
- QByteArrayMatcher: Holds a sequence of bytes that can be quickly matched in a byte array
- QByteArrayView: View on an array of bytes with a read-only subset of the QByteArray API
- QChar: 16-bit Unicode character
- QCollator: Compares strings according to a localized collation algorithm
- QCollatorSortKey: Can be used to speed up string collation
- QLatin1Char: 8-bit ASCII/Latin-1 character
- QLatin1String: Thin wrapper around a US-ASCII/Latin-1 encoded string literal
- QLocale: Converts between numbers and their string representations in various languages
- QStaticByteArrayMatcher: Compile-time version of QByteArrayMatcher
- QString: Unicode character string
- QStringList: List of strings
- QStringMatcher: Holds a sequence of characters that can be quickly matched in a Unicode string
- QStringRef: Thin wrapper around QString substrings
- QStringTokenizer: Splits strings into tokens along given separators
- QStringView: Unified view on UTF-16 strings with a read-only subset of the QString API
- QTextBoundaryFinder: Way of finding Unicode text boundaries in a string
- QTextStream: Convenient interface for reading and writing text
- QUtf8StringView: Unified view on UTF-8 strings with a read-only subset of the QString API
The Unicode Consortium has a number of documents available, including:
- The current version of the standard
- A technical introduction to Unicode
- The home page for the standard
In Qt, and in most applications that use Qt, most or all user-visible strings are stored using Unicode. Qt provides:
- Translation to and from legacy encodings for file I/O: see QTextCodec and QTextStream.
- Support for locale-specific input methods and keyboards.
- A string class, QString, that stores Unicode characters and provides all the usual string operations. It supports migration from C strings, including fast translation to and from UTF-8, ISO-8859-1, and US-ASCII.
- Unicode-aware UI controls.
- Unicode-compliant text segmentation (QTextBoundaryFinder).
- Unicode-compliant line breaking and text rendering.
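QString can be constructed implicitly from a const char * (unless QT_NO_CAST_FROM_ASCII is defined), so code that passes a C string literal where a QString is expected will work. There is also a function, QObject::tr(), that provides translation support for user-visible strings.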
Qt provides a number of built-in QTextCodec classes, that is, classes that know how to translate between Unicode and legacy encodings to support programs that must talk to other programs or read/write files in legacy file formats.
By default, conversion to and from const char * uses UTF-8. However, applications can easily find codecs for other locales, and set any open file or network connection to use a particular codec.
Since US-ASCII and ISO-8859-1 are so common, there are also especially fast functions for mapping to and from them, such as QString::fromLatin1() and the QLatin1String wrapper. They can be used, for example, to build the file name when opening an application's icon.
Qt supports rendering text in most of the world's written languages. The exact list of supported writing systems depends on operating system support and on the fonts available on the target system.
See also Internationalization with Qt.
© 2022 The Qt Company Ltd. Documentation contributions included herein are the copyrights of their respective owners. The documentation provided herein is licensed under the terms of the GNU Free Documentation License version 1.3 as published by the Free Software Foundation. Qt and respective logos are trademarks of The Qt Company Ltd. in Finland and/or other countries worldwide. All other trademarks are property of their respective owners.