Doxygen
#include <cstdint>
#include <sstream>
#include "utf8.h"
#include "caseconvert.h"
#include "textstream.h"
Functions | |
uint8_t | getUTF8CharNumBytes (char c) |
Returns the number of bytes making up a single UTF8 character given the first byte in the sequence. | |
static uint32_t | decode_utf8 (const char *data, int numBytes) noexcept |
Decodes UTF-8 input data to a Unicode code point, given the number of bytes in the sequence. | |
static uint32_t | convertUTF8CharToUnicode (const char *s, size_t bytesLeft, int &len) |
std::string | getUTF8CharAt (const std::string &input, size_t pos) |
Returns the UTF8 character found at byte position pos in the input string. | |
uint32_t | getUnicodeForUTF8CharAt (const std::string &input, size_t pos) |
Returns the 32-bit Unicode value of the character at byte position pos in the UTF-8 encoded input. | |
static char | asciiToLower (uint32_t code) |
static char | asciiToUpper (uint32_t code) |
static std::string | caseConvert (const std::string &input, char(*asciiConversionFunc)(uint32_t code), const char *(*conversionFunc)(uint32_t code)) |
std::string | convertUTF8ToLower (const std::string &input) |
Converts the input string into a lower case version, also taking into account non-ASCII characters that have a lower case variant. | |
std::string | convertUTF8ToUpper (const std::string &input) |
Converts the input string into an upper case version, also taking into account non-ASCII characters that have an upper case variant. | |
const char * | writeUTF8Char (TextStream &t, const char *s) |
Writes the UTF8 character pointed to by s to stream t and returns a pointer to the next character. | |
bool | lastUTF8CharIsMultibyte (const std::string &input) |
Returns true iff the last character in input is a multibyte character. | |
bool | isUTF8CharUpperCase (const std::string &input, size_t pos) |
Returns true iff the input string at byte position pos holds an upper case character. | |
int | isUTF8NonBreakableSpace (const char *input) |
Check if the first character pointed at by input is a non-breakable whitespace character. | |
bool | isUTF8PunctuationCharacter (uint32_t unicode) |
Check if the given Unicode character represents a punctuation character. | |
static char asciiToLower | ( | uint32_t | code | ) |
inline static |
static char asciiToUpper | ( | uint32_t | code | ) |
inline static |
Definition at line 147 of file utf8.cpp.
Referenced by convertUTF8ToUpper().
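static std::string caseConvert | ( | const std::string & | input, |
char(* | asciiConversionFunc )(uint32_t code), |
const char *(* | conversionFunc )(uint32_t code) ) |
inline static |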
Definition at line 152 of file utf8.cpp.
References convertUTF8CharToUnicode().
Referenced by convertUTF8ToLower(), and convertUTF8ToUpper().
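static uint32_t convertUTF8CharToUnicode | ( | const char * | s, |
size_t | bytesLeft, |
int & | len ) |
inline static |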
Definition at line 69 of file utf8.cpp.
References decode_utf8().
Referenced by caseConvert(), getUnicodeForUTF8CharAt(), and isUTF8CharUpperCase().
std::string convertUTF8ToLower | ( | const std::string & | input | ) |
Converts the input string into a lower case version, also taking into account non-ASCII characters that have a lower case variant.
Definition at line 187 of file utf8.cpp.
References asciiToLower(), caseConvert(), and convertUnicodeToLower().
Referenced by SearchIndexInfo::add(), Index::addClassMemberNameToIndex(), Index::addFileMemberNameToIndex(), Index::addModuleMemberNameToIndex(), Index::addNamespaceMemberNameToIndex(), AnchorGenerator::generate(), QCString::lower(), FileNameFn::searchKey(), and SearchTerm::termEncoded().
std::string convertUTF8ToUpper | ( | const std::string & | input | ) |
Converts the input string into an upper case version, also taking into account non-ASCII characters that have an upper case variant.
Definition at line 192 of file utf8.cpp.
References asciiToUpper(), caseConvert(), and convertUnicodeToUpper().
Referenced by Translator::createNoun(), QCString::upper(), and writeAlphabeticalClassList().
static uint32_t decode_utf8 | ( | const char * | data, |
int | numBytes ) |
inline static noexcept |
Decodes UTF-8 input data to a Unicode code point, given the number of bytes in the sequence.
Definition at line 55 of file utf8.cpp.
Referenced by convertUTF8CharToUnicode().
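To illustrate what decoding "given the number of bytes" involves, here is a minimal sketch of the standard UTF-8 bit arithmetic. It is not the code from utf8.cpp: the name decodeCodePoint is hypothetical, and the sketch assumes the byte count is correct and the sequence is well-formed.

```cpp
#include <cstdint>

// Hypothetical sketch, not Doxygen's implementation: combines the payload
// bits of a 1- to 4-byte UTF-8 sequence into a code point.
// Assumes numBytes is correct and the bytes are well-formed UTF-8.
std::uint32_t decodeCodePoint(const char *data, int numBytes) noexcept
{
  const auto byte = [data](int i) {
    return static_cast<std::uint32_t>(static_cast<unsigned char>(data[i]));
  };
  if (numBytes <= 1) return byte(0); // plain ASCII
  // The lead byte of an n-byte sequence carries (7 - n) payload bits.
  std::uint32_t cp = byte(0) & (0xFFu >> (numBytes + 1));
  for (int i = 1; i < numBytes; ++i)
  {
    cp = (cp << 6) | (byte(i) & 0x3Fu); // each continuation byte adds 6 bits
  }
  return cp;
}
```

For instance, the two bytes 0xC3 0xA9 combine to U+00E9, and the three bytes 0xE2 0x82 0xAC combine to U+20AC.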
uint32_t getUnicodeForUTF8CharAt | ( | const std::string & | input, |
size_t | pos ) |
Returns the 32-bit Unicode value of the character at byte position pos in the UTF-8 encoded input.
Definition at line 135 of file utf8.cpp.
References convertUTF8CharToUnicode(), and getUTF8CharAt().
Referenced by AnchorGenerator::generate().
std::string getUTF8CharAt | ( | const std::string & | input, |
size_t | pos ) |
Returns the UTF8 character found at byte position pos in the input string.
The resulting string can be a multi byte sequence.
Definition at line 127 of file utf8.cpp.
References getUTF8CharNumBytes().
Referenced by SearchIndexInfo::add(), Index::addClassMemberNameToIndex(), Index::addFileMemberNameToIndex(), Index::addModuleMemberNameToIndex(), Index::addNamespaceMemberNameToIndex(), Translator::createNoun(), AnchorGenerator::generate(), getUnicodeForUTF8CharAt(), and writeAlphabeticalClassList().
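The following sketch shows one way to extract a character at a byte offset, as a possibly multi-byte substring. The name utf8CharAt and the choice to return an empty string for an out-of-range position or a truncated sequence are assumptions made for illustration; the behaviour of getUTF8CharAt() in those cases is not stated here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch, not Doxygen's code: returns the (possibly multi-byte)
// character that starts at byte offset pos. Returns an empty string if pos is
// out of range or the sequence is truncated (assumed behaviour).
std::string utf8CharAt(const std::string &input, std::size_t pos)
{
  if (pos >= input.size()) return std::string();
  const unsigned char lead = static_cast<unsigned char>(input[pos]);
  std::size_t len = 1;
  if      ((lead & 0xE0u) == 0xC0u) len = 2; // 110xxxxx
  else if ((lead & 0xF0u) == 0xE0u) len = 3; // 1110xxxx
  else if ((lead & 0xF8u) == 0xF0u) len = 4; // 11110xxx
  if (pos + len > input.size()) return std::string();
  return input.substr(pos, len);
}
```

Note that pos is a byte offset, not a character index: in "a\xC3\xA9z" the character after 'a' starts at offset 1 and the 'z' at offset 3.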
uint8_t getUTF8CharNumBytes | ( | char | c | ) |
Returns the number of bytes making up a single UTF8 character given the first byte in the sequence.
Definition at line 23 of file utf8.cpp.
Referenced by detab(), escapeCharsInString(), AnchorGenerator::generate(), getUTF8CharAt(), nextUTF8CharPosition(), updateColumnCount(), and writeUTF8Char().
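The length of a UTF-8 sequence can be read off the high bits of its first byte. The sketch below illustrates that mapping; leadByteLength is a hypothetical name, and returning 1 for continuation or invalid bytes is an assumption rather than the documented behaviour of getUTF8CharNumBytes().

```cpp
#include <cstdint>

// Hypothetical helper, not Doxygen's implementation: maps a UTF-8 lead byte
// to the length in bytes of the sequence it starts.
std::uint8_t leadByteLength(char c)
{
  const unsigned char uc = static_cast<unsigned char>(c);
  if ((uc & 0x80u) == 0x00u) return 1; // 0xxxxxxx: ASCII
  if ((uc & 0xE0u) == 0xC0u) return 2; // 110xxxxx
  if ((uc & 0xF0u) == 0xE0u) return 3; // 1110xxxx
  if ((uc & 0xF8u) == 0xF0u) return 4; // 11110xxx
  return 1; // continuation or invalid byte: treated as one byte (assumption)
}
```

The cast to unsigned char matters because plain char may be signed, in which case bytes above 0x7F would otherwise compare as negative values.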
bool isUTF8CharUpperCase | ( | const std::string & | input, |
size_t | pos ) |
Returns true iff the input string at byte position pos holds an upper case character.
Definition at line 218 of file utf8.cpp.
References convertUnicodeToLower(), and convertUTF8CharToUnicode().
Referenced by DefinitionImpl::_setBriefDescription().
int isUTF8NonBreakableSpace | ( | const char * | input | ) |
Check if the first character pointed at by input is a non-breakable whitespace character.
Returns the byte size of the character if there is a match, or 0 if not.
Definition at line 228 of file utf8.cpp.
Referenced by detab().
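As an illustration of the "byte size or 0" return convention, the sketch below checks for U+00A0 (NO-BREAK SPACE), which UTF-8 encodes as the two bytes 0xC2 0xA0. The name nonBreakableSpaceLength is hypothetical, and which characters isUTF8NonBreakableSpace() actually recognises is not stated here; the sketch assumes only U+00A0.

```cpp
// Hypothetical sketch: recognises only U+00A0 (NO-BREAK SPACE), encoded in
// UTF-8 as 0xC2 0xA0. Assumes input points at a NUL-terminated string.
int nonBreakableSpaceLength(const char *input)
{
  const unsigned char b0 = static_cast<unsigned char>(input[0]);
  if (b0 != 0xC2u) return 0; // also stops at the terminating NUL
  const unsigned char b1 = static_cast<unsigned char>(input[1]);
  return b1 == 0xA0u ? 2 : 0; // byte size on a match, 0 otherwise
}
```

Returning the byte size lets a caller test for the character and skip over it in one step, for example `if (int n = nonBreakableSpaceLength(p)) p += n;`.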
bool isUTF8PunctuationCharacter | ( | uint32_t | unicode | ) |
Check if the given Unicode character represents a punctuation character.
Definition at line 234 of file utf8.cpp.
References isPunctuationCharacter().
Referenced by AnchorGenerator::generate().
bool lastUTF8CharIsMultibyte | ( | const std::string & | input | ) |
Returns true iff the last character in input is a multibyte character.
Definition at line 212 of file utf8.cpp.
Referenced by DefinitionImpl::_setBriefDescription().
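In well-formed UTF-8, the final byte of a multibyte character is always a continuation byte of the form 10xxxxxx, so a check on the last byte is enough. The sketch below is written under that assumption; endsWithMultibyteChar is a hypothetical name, not the function documented above.

```cpp
#include <string>

// Hypothetical sketch: in well-formed UTF-8 the final byte of a multibyte
// character is a continuation byte (bit pattern 10xxxxxx).
bool endsWithMultibyteChar(const std::string &input)
{
  if (input.empty()) return false;
  const unsigned char last = static_cast<unsigned char>(input.back());
  return (last & 0xC0u) == 0x80u;
}
```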
const char * writeUTF8Char | ( | TextStream & | t, |
const char * | s ) |
Writes the UTF8 character pointed to by s to stream t and returns a pointer to the next character.
Definition at line 197 of file utf8.cpp.
References getUTF8CharNumBytes(), and TextStream::write().
Referenced by HtmlCodeGenerator::codify(), ManCodeGenerator::codify(), RTFCodeGenerator::codify(), HtmlDocVisitor::operator()(), HtmlDocVisitor::writeObfuscatedMailAddress(), and writeXMLCodeString().
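The "write one character, return a pointer to the next" contract makes it easy to walk a string character by character. The sketch below shows the pattern with std::ostream standing in for TextStream; writeOneUtf8Char and copyByChar are hypothetical names, and stopping at the terminating NUL of a truncated sequence is an assumption.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical sketch using std::ostream in place of Doxygen's TextStream.
// Assumes s points at a NUL-terminated UTF-8 string.
const char *writeOneUtf8Char(std::ostream &out, const char *s)
{
  const unsigned char lead = static_cast<unsigned char>(*s);
  int len = 1;
  if      ((lead & 0xE0u) == 0xC0u) len = 2;
  else if ((lead & 0xF0u) == 0xE0u) len = 3;
  else if ((lead & 0xF8u) == 0xF0u) len = 4;
  // Never read past the terminating NUL of a truncated sequence.
  int n = 0;
  while (n < len && s[n] != '\0') ++n;
  out.write(s, n);
  return s + n; // points at the first byte of the next character
}

// Usage: copy a whole string one character at a time.
std::string copyByChar(const char *s)
{
  std::ostringstream out;
  while (*s != '\0') s = writeOneUtf8Char(out, s);
  return out.str();
}
```

A caller that needs to treat some characters specially can inspect `*s` before each call and fall back to this helper for everything else.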