Doxygen
reg::PToken Class Reference

Class representing a token in the compiled regular expression token stream.

Public Types

enum class  Kind : uint16_t {
  End = 0x0000 , WhiteSpace = 0x1001 , Digit = 0x1002 , Alpha = 0x1003 ,
  AlphaNum = 0x1004 , CharClass = 0x2001 , NegCharClass = 0x2002 , BeginOfLine = 0x4001 ,
  EndOfLine = 0x4002 , BeginOfWord = 0x4003 , EndOfWord = 0x4004 , BeginCapture = 0x4005 ,
  EndCapture = 0x4006 , Any = 0x4007 , Star = 0x4008 , Optional = 0x4009 ,
  Character = 0x8000
}
 The kind of token.
 

Public Member Functions

const char * kindStr () const
 Returns a string representation of the token's kind (useful for debugging).
 
 PToken ()
 Creates a token of kind 'End'.
 
 PToken (Kind k)
 Creates a token of the given kind k.
 
 PToken (char c)
 Creates a token for an ASCII character.
 
 PToken (uint16_t v)
 Creates a token for a byte of a UTF-8 character.
 
 PToken (uint16_t from, uint16_t to)
 Creates a token representing the character range from 'from' to 'to'.
 
void setValue (uint16_t value)
 Sets the value for a token.
 
Kind kind () const
 Returns the kind of the token.
 
uint16_t from () const
 Returns the 'from' part of the character range.
 
uint16_t to () const
 Returns the 'to' part of the character range.
 
uint16_t value () const
 Returns the value for this token.
 
char asciiValue () const
 Returns the value for this token as an ASCII character.
 
bool isRange () const
 Returns true iff this token represents a range of characters.
 
bool isCharClass () const
 Returns true iff this token is a positive or negative character class.
 

Private Attributes

uint32_t m_rep
 

Detailed Description

Class representing a token in the compiled regular expression token stream.

A token has a kind and an optional value whose meaning depends on the kind. It is also possible to store a (from,to) character range in a token.
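
The constructors and accessors documented below suggest that the kind (or the 'from' bound of a range) occupies the upper 16 bits of the 32-bit representation and the value (or the 'to' bound) occupies the lower 16 bits. The following is a minimal sketch of that layout only; pack, upperHalf, lowerHalf and checkLayout are hypothetical helpers introduced for illustration and are not part of regex.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not part of regex.cpp) mirroring the bit layout
// implied by the PToken constructors: upper 16 bits = kind or 'from',
// lower 16 bits = value or 'to'.
constexpr uint32_t pack(uint16_t upper, uint16_t lower)
{
  return (static_cast<uint32_t>(upper) << 16) | lower;
}

constexpr uint16_t upperHalf(uint32_t rep) { return static_cast<uint16_t>(rep >> 16); }
constexpr uint16_t lowerHalf(uint32_t rep) { return static_cast<uint16_t>(rep & 0xFFFF); }

// Exercises the layout with a literal token and a range token.
inline void checkLayout()
{
  // Literal 'a': kind Character (0x8000) combined with the value 0x61.
  const uint32_t literalA = pack(0x8000, 0x0061);
  assert(upperHalf(literalA) == 0x8000);
  assert(lowerHalf(literalA) == 0x0061);

  // Range [a-z]: 'from' 0x61 in the upper half, 'to' 0x7A in the lower half.
  const uint32_t rangeAZ = pack(0x0061, 0x007A);
  assert(upperHalf(rangeAZ) == 0x0061);
  assert(lowerHalf(rangeAZ) == 0x007A);
}
```

Calling checkLayout() passes silently if the packing and unpacking round-trip as described.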

Definition at line 58 of file regex.cpp.

Member Enumeration Documentation

◆ Kind

enum class reg::PToken::Kind : uint16_t
strong

The kind of token.

Ranges per bit mask:

  • 0x00FF: 'from' part of a range, except for 0x0000, which is the End marker
  • 0x1FFF: built-in ranges
  • 0x2FFF: user-defined ranges
  • 0x4FFF: special operations
  • 0x8000: literal character
Enumerator
End 
WhiteSpace 
Digit 
Alpha 
AlphaNum 
CharClass 
NegCharClass 
BeginOfLine 
EndOfLine 
BeginOfWord 
EndOfWord 
BeginCapture 
EndCapture 
Any 
Star 
Optional 
Character 

Definition at line 70 of file regex.cpp.

{
  End = 0x0000,
  WhiteSpace = 0x1001, // \s range [ \t\r\n]
  Digit = 0x1002, // \d range [0-9]
  Alpha = 0x1003, // \a range [a-z_A-Z\x80-\xFF]
  AlphaNum = 0x1004, // \w range [a-z_A-Z0-9\x80-\xFF]
  CharClass = 0x2001, // []
  NegCharClass = 0x2002, // [^]
  BeginOfLine = 0x4001, // ^
  EndOfLine = 0x4002, // $
  BeginOfWord = 0x4003, // <
  EndOfWord = 0x4004, // >
  BeginCapture = 0x4005, // (
  EndCapture = 0x4006, // )
  Any = 0x4007, // .
  Star = 0x4008, // *
  Optional = 0x4009, // ?
  Character = 0x8000 // c
};
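
The bit-mask grouping described above can be turned into a small classifier. This is only a sketch of one possible reading of those ranges: the Category enum and the categorize function are hypothetical names that do not exist in regex.cpp, and the sketch assumes the top nibble of the upper 16 bits identifies the group.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical grouping of token representations (not part of regex.cpp).
enum class Category { End, RangeFrom, BuiltinRange, UserRange, Special, Literal, Unknown };

// Classifies a 32-bit token representation by the upper 16 bits,
// following the documented bit-mask ranges.
inline Category categorize(uint32_t rep)
{
  if (rep == 0) return Category::End;                // all-zero token is the End marker
  const uint16_t upper = static_cast<uint16_t>(rep >> 16);
  if (upper <= 0x00FF) return Category::RangeFrom;   // upper half holds the 'from' bound
  switch (upper & 0xF000)
  {
    case 0x1000: return Category::BuiltinRange;      // WhiteSpace, Digit, Alpha, AlphaNum
    case 0x2000: return Category::UserRange;         // CharClass, NegCharClass
    case 0x4000: return Category::Special;           // anchors, captures, Any, Star, Optional
    case 0x8000: return Category::Literal;           // Character
    default:     return Category::Unknown;           // not covered by the documented masks
  }
}
```

For example, categorize(0x40080000) yields Category::Special, since 0x4008 is the code for Star.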

Constructor & Destructor Documentation

◆ PToken() [1/5]

reg::PToken::PToken ( )
inline

Creates a token of kind 'End'.

Definition at line 124 of file regex.cpp.

: m_rep(0) {}

◆ PToken() [2/5]

reg::PToken::PToken ( Kind k)
inline explicit

Creates a token of the given kind k.

Definition at line 127 of file regex.cpp.

: m_rep(static_cast<uint32_t>(k)<<16) {}

◆ PToken() [3/5]

reg::PToken::PToken ( char c)
inline

Creates a token for an ASCII character.

Definition at line 130 of file regex.cpp.

: m_rep((static_cast<uint32_t>(Kind::Character)<<16) |
        static_cast<uint32_t>(c)) {}

◆ PToken() [4/5]

reg::PToken::PToken ( uint16_t v)
inline

Creates a token for a byte of a UTF-8 character.

Definition at line 134 of file regex.cpp.

: m_rep((static_cast<uint32_t>(Kind::Character)<<16) |
        static_cast<uint32_t>(v)) {}

◆ PToken() [5/5]

reg::PToken::PToken ( uint16_t from,
uint16_t to )
inline

Creates a token representing the character range from 'from' to 'to'.

Definition at line 138 of file regex.cpp.

: m_rep(static_cast<uint32_t>(from)<<16 | to) {}

Member Function Documentation

◆ asciiValue()

char reg::PToken::asciiValue ( ) const
inline

Returns the value for this token as an ASCII character.

Definition at line 156 of file regex.cpp.

{ return static_cast<char>(m_rep); }

References m_rep.

Referenced by reg::Ex::Private::compile(), reg::Ex::match(), and reg::Ex::Private::matchAt().

◆ from()

uint16_t reg::PToken::from ( ) const
inline

Returns the 'from' part of the character range.

Only valid if this token represents a range.

Definition at line 147 of file regex.cpp.

{ return m_rep>>16; }

References m_rep.

Referenced by isRange(), and reg::Ex::Private::matchAt().

◆ isCharClass()

bool reg::PToken::isCharClass ( ) const
inline

Returns true iff this token is a positive or negative character class.

Definition at line 162 of file regex.cpp.

{ return kind()==Kind::CharClass || kind()==Kind::NegCharClass; }

References CharClass, kind(), and NegCharClass.

Referenced by reg::Ex::Private::matchAt().

◆ isRange()

bool reg::PToken::isRange ( ) const
inline

Returns true iff this token represents a range of characters.

Definition at line 159 of file regex.cpp.

{ return m_rep!=0 && from()<=to(); }

References from(), m_rep, and to().
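
To illustrate why this test separates ranges from other tokens, the sketch below copies only the range-related logic into a hypothetical stand-in type. MiniToken, makeRange and makeKind are illustrative names, not part of regex.cpp. For kind-carrying tokens the upper half is at least 0x1000, so from()<=to() fails as long as the stored value stays below the kind code, which is assumed here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in (not part of regex.cpp) that reproduces only the
// from()/to()/isRange() logic documented for reg::PToken.
struct MiniToken
{
  uint32_t rep;
  uint16_t from() const { return static_cast<uint16_t>(rep >> 16); }
  uint16_t to()   const { return static_cast<uint16_t>(rep & 0xFFFF); }
  bool isRange()  const { return rep != 0 && from() <= to(); }
};

// Builds a range token: 'from' in the upper half, 'to' in the lower half.
inline MiniToken makeRange(uint16_t from, uint16_t to)
{
  return MiniToken{ (static_cast<uint32_t>(from) << 16) | to };
}

// Builds a kind token: kind code in the upper half, optional value below it.
inline MiniToken makeKind(uint16_t kind, uint16_t value = 0)
{
  return MiniToken{ (static_cast<uint32_t>(kind) << 16) | value };
}
```

With these helpers, makeRange(0x61, 0x7A).isRange() is true for [a-z], while makeKind(0x8000, 0x61).isRange() is false for the literal 'a' because 0x8000 is greater than 0x61. The all-zero End token is excluded explicitly by the rep != 0 check.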

◆ kind()

Kind reg::PToken::kind ( ) const
inline

Returns the kind of the token.

Definition at line 144 of file regex.cpp.

{ return static_cast<Kind>(m_rep>>16); }

References m_rep.

Referenced by reg::Ex::Private::compile(), isCharClass(), reg::Ex::match(), and reg::Ex::Private::matchAt().

◆ kindStr()

const char * reg::PToken::kindStr ( ) const
inline

Returns a string representation of the token's kind (useful for debugging).

Definition at line 92 of file regex.cpp.

{
  if ((m_rep>>16)>=0x1000 || m_rep==0)
  {
    switch(static_cast<Kind>((m_rep>>16)))
    {
      case Kind::End: return "End";
      case Kind::Alpha: return "Alpha";
      case Kind::AlphaNum: return "AlphaNum";
      case Kind::WhiteSpace: return "WhiteSpace";
      case Kind::Digit: return "Digit";
      case Kind::CharClass: return "CharClass";
      case Kind::NegCharClass: return "NegCharClass";
      case Kind::Character: return "Character";
      case Kind::BeginOfLine: return "BeginOfLine";
      case Kind::EndOfLine: return "EndOfLine";
      case Kind::BeginOfWord: return "BeginOfWord";
      case Kind::EndOfWord: return "EndOfWord";
      case Kind::BeginCapture: return "BeginCapture";
      case Kind::EndCapture: return "EndCapture";
      case Kind::Any: return "Any";
      case Kind::Star: return "Star";
      case Kind::Optional: return "Optional";
    }
  }
  else
  {
    return "Range";
  }
}

References Alpha, AlphaNum, Any, BeginCapture, BeginOfLine, BeginOfWord, Character, CharClass, Digit, End, EndCapture, EndOfLine, EndOfWord, m_rep, NegCharClass, Optional, Star, and WhiteSpace.

Referenced by reg::Ex::Private::matchAt().

◆ setValue()

void reg::PToken::setValue ( uint16_t value)
inline

Sets the value for a token.

Definition at line 141 of file regex.cpp.

{ m_rep = (m_rep & 0xFFFF0000) | value; }

References m_rep, and value().
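
The masking in setValue() keeps the upper 16 bits (the kind) and overwrites only the lower 16 bits. The following sketch shows the same arithmetic on a plain integer; withValue is a hypothetical free function used for illustration and is not part of regex.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical free-function version of the setValue() arithmetic
// (not part of regex.cpp): keep the upper half, replace the lower half.
inline uint32_t withValue(uint32_t rep, uint16_t value)
{
  return (rep & 0xFFFF0000u) | value;
}

// Exercises the helper on a CharClass token (kind code 0x2001).
inline void checkWithValue()
{
  const uint32_t charClass = 0x20010000u;               // CharClass, value 0
  const uint32_t updated   = withValue(charClass, 3);   // store the value 3
  assert((updated >> 16) == 0x2001);                    // kind is unchanged
  assert((updated & 0xFFFF) == 3);                      // value is replaced
  assert(withValue(updated, 7) == 0x20010007u);         // old value is not OR-ed in
}
```

Because the old lower half is cleared before the new value is combined, repeated calls never accumulate stale bits.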

◆ to()

uint16_t reg::PToken::to ( ) const
inline

Returns the 'to' part of the character range.

Only valid if this token represents a range.

Definition at line 150 of file regex.cpp.

{ return m_rep & 0xFFFF; }

References m_rep.

Referenced by isRange(), and reg::Ex::Private::matchAt().

◆ value()

uint16_t reg::PToken::value ( ) const
inline

Returns the value for this token.

Definition at line 153 of file regex.cpp.

{ return m_rep & 0xFFFF; }

References m_rep.

Referenced by reg::Ex::Private::compile(), reg::Ex::Private::matchAt(), and setValue().

Member Data Documentation

◆ m_rep

uint32_t reg::PToken::m_rep
private

Definition at line 165 of file regex.cpp.

Referenced by asciiValue(), from(), isRange(), kind(), kindStr(), setValue(), to(), and value().


The documentation for this class was generated from the following file: