reg::PToken Class Reference

Class representing a token in the compiled regular expression token stream.

Public Types

enum class  Kind : uint16_t {
  End = 0x0000 , WhiteSpace = 0x1001 , Digit = 0x1002 , Alpha = 0x1003 ,
  AlphaNum = 0x1004 , CharClass = 0x2001 , NegCharClass = 0x2002 , BeginOfLine = 0x4001 ,
  EndOfLine = 0x4002 , BeginOfWord = 0x4003 , EndOfWord = 0x4004 , BeginCapture = 0x4005 ,
  EndCapture = 0x4006 , Any = 0x4007 , Star = 0x4008 , Optional = 0x4009 ,
  Character = 0x8000
}
 The kind of token.
 

Public Member Functions

const char * kindStr () const
 Returns a string representation of the token's kind (useful for debugging).
 
 PToken ()
 Creates a token of kind 'End'.
 
 PToken (Kind k)
 Creates a token of the given kind k.
 
 PToken (char c)
 Creates a token for an ASCII character.
 
 PToken (uint16_t v)
 Creates a token for a byte of a UTF-8 character.
 
 PToken (uint16_t from, uint16_t to)
 Creates a token representing a character range that runs from the character 'from' to the character 'to'.
 
void setValue (uint16_t value)
 Sets the value for a token.
 
Kind kind () const
 Returns the kind of the token.
 
uint16_t from () const
 Returns the 'from' part of the character range.
 
uint16_t to () const
 Returns the 'to' part of the character range.
 
uint16_t value () const
 Returns the value for this token.
 
char asciiValue () const
 Returns the value for this token as an ASCII character.
 
bool isRange () const
 Returns true iff this token represents a range of characters.
 
bool isCharClass () const
 Returns true iff this token is a positive or negative character class.
 

Private Attributes

uint32_t m_rep
 

Detailed Description

Class representing a token in the compiled regular expression token stream.

A token has a kind and an optional value whose meaning depends on the kind. It is also possible to store a (from,to) character range in a token.
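The packing described above can be sketched as follows. This is a minimal illustration which assumes, as the constructors and accessors documented on this page suggest, that the kind occupies the upper 16 bits of the 32-bit representation and the value the lower 16 bits. The free functions pack, kindBits, valueBits and packingExample are hypothetical names used only for this example; they are not part of regex.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers mirroring the layout assumed for PToken::m_rep:
// upper 16 bits = kind (or 'from'), lower 16 bits = value (or 'to').
constexpr uint32_t pack(uint16_t hi, uint16_t lo)
{
  return (static_cast<uint32_t>(hi) << 16) | lo;
}

constexpr uint16_t kindBits(uint32_t rep)  { return static_cast<uint16_t>(rep >> 16); }
constexpr uint16_t valueBits(uint32_t rep) { return static_cast<uint16_t>(rep & 0xFFFF); }

// Example: a literal 'a' uses kind Character (0x8000) with value 0x61.
inline void packingExample()
{
  constexpr uint32_t rep = pack(0x8000, 'a');
  assert(kindBits(rep)  == 0x8000);
  assert(valueBits(rep) == 0x61);
}
```

The same two halves are read back as 'from' and 'to' when the token holds a character range.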

Definition at line 58 of file regex.cpp.

Member Enumeration Documentation

◆ Kind

enum class reg::PToken::Kind : uint16_t
strong

The kind of token.

Ranges per bit mask:

  • 0x00FF from part of a range, except for 0x0000 which is the End marker
  • 0x1FFF built-in ranges
  • 0x2FFF user defined ranges
  • 0x4FFF special operations
  • 0x8000 literal character
Enumerator
End 
WhiteSpace 
Digit 
Alpha 
AlphaNum 
CharClass 
NegCharClass 
BeginOfLine 
EndOfLine 
BeginOfWord 
EndOfWord 
BeginCapture 
EndCapture 
Any 
Star 
Optional 
Character 

Definition at line 70 of file regex.cpp.

enum class Kind : uint16_t
{
  End          = 0x0000,
  WhiteSpace   = 0x1001, // \s range [ \t\r\n]
  Digit        = 0x1002, // \d range [0-9]
  Alpha        = 0x1003, // \a range [a-z_A-Z\x80-\xFF]
  AlphaNum     = 0x1004, // \w range [a-z_A-Z0-9\x80-\xFF]
  CharClass    = 0x2001, // []
  NegCharClass = 0x2002, // [^]
  BeginOfLine  = 0x4001, // ^
  EndOfLine    = 0x4002, // $
  BeginOfWord  = 0x4003, // <
  EndOfWord    = 0x4004, // >
  BeginCapture = 0x4005, // (
  EndCapture   = 0x4006, // )
  Any          = 0x4007, // .
  Star         = 0x4008, // *
  Optional     = 0x4009, // ?
  Character    = 0x8000  // c
};
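
The bit mask ranges listed for this enumeration can be turned into a small classifier. The sketch below is an assumption-laden illustration, not code from regex.cpp: Category and categoryOf are hypothetical names, and the thresholds are taken from the list of ranges and from the enumerator values shown on this page.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of the upper 16 bits of a token.
enum class Category { End, RangeFrom, BuiltinRange, UserRange, SpecialOp, Literal, Unknown };

inline Category categoryOf(uint16_t hi)
{
  if (hi == 0x0000) return Category::End;       // End marker
  if (hi <= 0x00FF) return Category::RangeFrom; // 'from' part of a range
  switch (hi & 0xF000)
  {
    case 0x1000: return Category::BuiltinRange; // WhiteSpace, Digit, Alpha, AlphaNum
    case 0x2000: return Category::UserRange;    // CharClass, NegCharClass
    case 0x4000: return Category::SpecialOp;    // BeginOfLine ... Optional
    case 0x8000: return Category::Literal;      // Character
    default:     return Category::Unknown;      // not covered by the listed ranges
  }
}
```

For example, categoryOf(0x1002) yields Category::BuiltinRange (the Digit kind), while categoryOf(0x0061) yields Category::RangeFrom.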

Constructor & Destructor Documentation

◆ PToken() [1/5]

reg::PToken::PToken ( )
inline

Creates a token of kind 'End'.

Definition at line 124 of file regex.cpp.

PToken() : m_rep(0) {}
uint32_t m_rep
Definition regex.cpp:165

References m_rep.

◆ PToken() [2/5]

reg::PToken::PToken ( Kind k)
inline explicit

Creates a token of the given kind k.

Definition at line 127 of file regex.cpp.

explicit PToken(Kind k) : m_rep(static_cast<uint32_t>(k)<<16) {}

References m_rep.

◆ PToken() [3/5]

reg::PToken::PToken ( char c)
inline

Creates a token for an ASCII character.

Definition at line 130 of file regex.cpp.

PToken(char c) : m_rep((static_cast<uint32_t>(Kind::Character)<<16) |
                       static_cast<uint32_t>(c)) {}

References Character, and m_rep.
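
One reason to keep the char overload to ASCII input can be sketched as follows. This is an illustration, not code from regex.cpp: repFromSignedChar and repFromU16 are hypothetical free functions that copy the expressions of the char and uint16_t constructors, and signed char is used on purpose so that the result does not depend on whether plain char is signed on a given platform. For a negative input the conversion to uint32_t sign-extends, which overwrites the kind bits; passing the same byte as a uint16_t does not.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kCharacter = 0x8000; // numeric value of Kind::Character

// Mirrors the expression in PToken(char), with an explicit signed char.
inline uint32_t repFromSignedChar(signed char c)
{
  return (kCharacter << 16) | static_cast<uint32_t>(c);
}

// Mirrors the expression in PToken(uint16_t).
inline uint32_t repFromU16(uint16_t v)
{
  return (kCharacter << 16) | static_cast<uint32_t>(v);
}

inline void signExtensionExample()
{
  // ASCII input: kind bits stay 0x8000.
  assert((repFromSignedChar('a') >> 16) == 0x8000u);
  // Byte 0xC3 seen as signed char -61: sign extension sets all upper bits.
  assert((repFromSignedChar(static_cast<signed char>(-61)) >> 16) == 0xFFFFu);
  // The same byte passed as uint16_t keeps the kind intact.
  assert((repFromU16(0x00C3) >> 16) == 0x8000u);
}
```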

◆ PToken() [4/5]

reg::PToken::PToken ( uint16_t v)
inline

Creates a token for a byte of a UTF-8 character.

Definition at line 134 of file regex.cpp.

PToken(uint16_t v) : m_rep((static_cast<uint32_t>(Kind::Character)<<16) |
                           static_cast<uint32_t>(v)) {}

References Character, and m_rep.

◆ PToken() [5/5]

reg::PToken::PToken ( uint16_t from,
uint16_t to )
inline

Creates a token representing a character range that runs from the character 'from' to the character 'to'.

Definition at line 138 of file regex.cpp.

PToken(uint16_t from, uint16_t to) : m_rep(static_cast<uint32_t>(from)<<16 | to) {}

References from(), m_rep, and to().

Member Function Documentation

◆ asciiValue()

char reg::PToken::asciiValue ( ) const
inline

Returns the value for this token as an ASCII character.

Definition at line 156 of file regex.cpp.

char asciiValue() const { return static_cast<char>(m_rep); }

References m_rep.

Referenced by reg::Ex::Private::compile(), reg::Ex::match(), and reg::Ex::Private::matchAt().

◆ from()

uint16_t reg::PToken::from ( ) const
inline

Returns the 'from' part of the character range.

Only valid if this token represents a range.

Definition at line 147 of file regex.cpp.

uint16_t from() const { return m_rep>>16; }

References m_rep.

Referenced by isRange(), reg::Ex::Private::matchAt(), and PToken().

◆ isCharClass()

bool reg::PToken::isCharClass ( ) const
inline

Returns true iff this token is a positive or negative character class.

Definition at line 162 of file regex.cpp.

bool isCharClass() const { return kind()==Kind::CharClass || kind()==Kind::NegCharClass; }

References CharClass, kind(), and NegCharClass.

Referenced by reg::Ex::Private::matchAt().

◆ isRange()

bool reg::PToken::isRange ( ) const
inline

Returns true iff this token represents a range of characters.

Definition at line 159 of file regex.cpp.

bool isRange() const { return m_rep!=0 && from()<=to(); }

References from(), m_rep, and to().
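
The effect of this test can be checked with a small stand-in. The sketch below is hypothetical: MiniToken, makeRange and makeKind are not part of regex.cpp; they only copy the from(), to() and isRange() expressions shown on this page so that the behavior for a few representative bit patterns can be observed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in that copies the expressions documented for PToken.
struct MiniToken
{
  uint32_t rep;
  uint16_t from() const { return static_cast<uint16_t>(rep >> 16); }
  uint16_t to() const   { return static_cast<uint16_t>(rep & 0xFFFF); }
  bool isRange() const  { return rep != 0 && from() <= to(); }
};

// Builds a (from,to) range token.
inline MiniToken makeRange(uint16_t from, uint16_t to)
{
  return MiniToken{ (static_cast<uint32_t>(from) << 16) | to };
}

// Builds a token with a kind in the upper half and a value in the lower half.
inline MiniToken makeKind(uint16_t kind, uint16_t value = 0)
{
  return MiniToken{ (static_cast<uint32_t>(kind) << 16) | value };
}

inline void isRangeExample()
{
  assert(makeRange('a', 'z').isRange());       // ordinary range
  assert(!MiniToken{0}.isRange());             // End marker is excluded explicitly
  assert(!makeKind(0x8000, 'a').isRange());    // Character: 0x8000 > 0x61
  assert(!makeKind(0x1002).isRange());         // Digit with value 0
}
```

In these cases a kind token fails the from()<=to() comparison because its upper half (0x1000 or larger) exceeds the small value stored in its lower half.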

◆ kind()

Kind reg::PToken::kind ( ) const
inline

Returns the kind of the token.

Definition at line 144 of file regex.cpp.

Kind kind() const { return static_cast<Kind>(m_rep>>16); }

References m_rep.

Referenced by reg::Ex::Private::compile(), isCharClass(), reg::Ex::match(), and reg::Ex::Private::matchAt().

◆ kindStr()

const char * reg::PToken::kindStr ( ) const
inline

Returns a string representation of the token's kind (useful for debugging).

Definition at line 92 of file regex.cpp.

const char *kindStr() const
{
  if ((m_rep>>16)>=0x1000 || m_rep==0)
  {
    switch(static_cast<Kind>((m_rep>>16)))
    {
      case Kind::End: return "End";
      case Kind::Alpha: return "Alpha";
      case Kind::AlphaNum: return "AlphaNum";
      case Kind::WhiteSpace: return "WhiteSpace";
      case Kind::Digit: return "Digit";
      case Kind::CharClass: return "CharClass";
      case Kind::NegCharClass: return "NegCharClass";
      case Kind::Character: return "Character";
      case Kind::BeginOfLine: return "BeginOfLine";
      case Kind::EndOfLine: return "EndOfLine";
      case Kind::BeginOfWord: return "BeginOfWord";
      case Kind::EndOfWord: return "EndOfWord";
      case Kind::BeginCapture: return "BeginCapture";
      case Kind::EndCapture: return "EndCapture";
      case Kind::Any: return "Any";
      case Kind::Star: return "Star";
      case Kind::Optional: return "Optional";
    }
  }
  else
  {
    return "Range";
  }
}

References Alpha, AlphaNum, Any, BeginCapture, BeginOfLine, BeginOfWord, Character, CharClass, Digit, End, EndCapture, EndOfLine, EndOfWord, m_rep, NegCharClass, Optional, Star, and WhiteSpace.

Referenced by reg::Ex::Private::matchAt().
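
The first test in kindStr() decides whether the upper 16 bits hold a kind or the 'from' part of a range. A reduced sketch of just that decision is given below; rangeOrKind is a hypothetical helper, not part of regex.cpp, and it returns the placeholder string "Kind" where the real function would go on to name the specific kind.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical reduction of the dispatch in kindStr(): an upper half of
// 0x1000 or more, or an all-zero representation, is treated as a kind;
// everything else is treated as a (from,to) range.
inline const char *rangeOrKind(uint32_t rep)
{
  return ((rep >> 16) >= 0x1000 || rep == 0) ? "Kind" : "Range";
}

inline void rangeOrKindExample()
{
  assert(std::strcmp(rangeOrKind(0x0061007Au), "Range") == 0); // range a..z
  assert(std::strcmp(rangeOrKind(0x80000061u), "Kind") == 0);  // Character 'a'
  assert(std::strcmp(rangeOrKind(0x00000000u), "Kind") == 0);  // End marker
}
```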

◆ setValue()

void reg::PToken::setValue ( uint16_t value)
inline

Sets the value for a token.

Definition at line 141 of file regex.cpp.

void setValue(uint16_t value) { m_rep = (m_rep & 0xFFFF0000) | value; }

References m_rep, and value().
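
The masking in setValue() keeps the upper 16 bits and replaces the lower 16 bits. The sketch below shows this on plain integers; withValue is a hypothetical free function that copies the expression above and is not part of regex.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the expression in setValue(): keep the upper half (kind),
// replace the lower half (value).
inline uint32_t withValue(uint32_t rep, uint16_t value)
{
  return (rep & 0xFFFF0000u) | value;
}

inline void setValueExample()
{
  // CharClass (0x2001) with value 0, updated to value 3.
  assert(withValue(0x20010000u, 3) == 0x20010003u);
  // Any previous value is discarded, not merged.
  assert(withValue(0x2001FFFFu, 0) == 0x20010000u);
}
```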

◆ to()

uint16_t reg::PToken::to ( ) const
inline

Returns the 'to' part of the character range.

Only valid if this token represents a range.

Definition at line 150 of file regex.cpp.

uint16_t to() const { return m_rep & 0xFFFF; }

References m_rep.

Referenced by isRange(), reg::Ex::Private::matchAt(), and PToken().

◆ value()

uint16_t reg::PToken::value ( ) const
inline

Returns the value for this token.

Definition at line 153 of file regex.cpp.

uint16_t value() const { return m_rep & 0xFFFF; }

References m_rep.

Referenced by reg::Ex::Private::compile(), reg::Ex::Private::matchAt(), and setValue().

Member Data Documentation

◆ m_rep

uint32_t reg::PToken::m_rep
private

The documentation for this class was generated from the following file: