The Battle for Wesnoth  1.19.5+dev
tokenizer Class Reference

Class responsible for parsing the provided text into tokens and tracking information about the current token.

#include <tokenizer.hpp>

Public Member Functions

 tokenizer (std::istream &in)
 
 ~tokenizer ()
 
const token & next_token ()
 Reads characters off of in_ to return the next token type and its value.
 
const token & current_token () const
 
const std::string & textdomain () const
 
const std::string & get_file () const
 
int get_start_line () const
 

Private Types

enum  character_type { TOK_NONE = 0 , TOK_SPACE = 1 , TOK_NUMERIC = 2 , TOK_ALPHA = 3 }
 The different types of characters encountered while parsing. TOK_NONE is also the default for anything beyond standard ASCII.
 

Private Member Functions

 tokenizer ()
 
void next_char ()
 Increments the line number if the current character is a newline, then sets current_ to the next character that is not \r.
 
void next_char_skip_cr ()
 Sets current_ to the next character, skipping the \r in Windows-style \r\n line endings. The test_cvs_2018_1999023_2.cfg file also uses \r\n line endings for some reason; otherwise the check isn't needed on non-Windows platforms, since \r characters are removed from cfg files on upload.
 
int peek_char ()
 Returns the next character without advancing the current position in the istream.
 
character_type char_type (unsigned c) const
 
bool is_space (int c) const
 
bool is_num (int c) const
 
bool is_alnum (int c) const
 
void skip_comment ()
 Handles skipping over comments (inline and on a separate line), as well as the special processing needed for #textdomain and #line.
 
bool skip_command (char const *cmd)
 Returns true if the next characters are the ones from cmd followed by a space.
 

Private Attributes

int current_
 
int lineno_
 
int startlineno_
 
std::string textdomain_
 
std::string file_
 
token token_
 
buffered_istream in_
 
std::array< character_type, END_STANDARD_ASCII > char_types_
 

Detailed Description

Class responsible for parsing the provided text into tokens and tracking information about the current token.

It can also track the previous token when built with the DEBUG_TOKENIZER compiler define. Otherwise it does not keep track of the processing history.

Definition at line 98 of file tokenizer.hpp.

Member Enumeration Documentation

◆ character_type

The different types of characters encountered while parsing. TOK_NONE is also the default for anything beyond standard ASCII.

Enumerator
TOK_NONE 
TOK_SPACE 
TOK_NUMERIC 
TOK_ALPHA 

Definition at line 178 of file tokenizer.hpp.

Constructor & Destructor Documentation

◆ tokenizer() [1/2]

tokenizer::tokenizer ( std::istream &  in)

◆ ~tokenizer()

tokenizer::~tokenizer ( )

Definition at line 45 of file tokenizer.cpp.

References in_, and buffered_istream::stream().

◆ tokenizer() [2/2]

tokenizer::tokenizer ( )
private

Member Function Documentation

◆ char_type()

character_type tokenizer::char_type ( unsigned  c) const
inline private

Definition at line 186 of file tokenizer.hpp.

References c, char_types_, END_STANDARD_ASCII, and TOK_NONE.

Referenced by is_alnum(), is_num(), and is_space().
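
The lookup that char_type() performs can be sketched as a table indexed by character code. This is a minimal illustration, not the Wesnoth implementation: the value of END_STANDARD_ASCII (128), the helper build_char_types(), and the exact set of characters placed in each class are assumptions. The comparison used for is_alnum() follows from the enumerator values listed above, where TOK_NUMERIC and TOK_ALPHA both rank above TOK_SPACE.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Enumerator values as listed in this reference.
enum character_type { TOK_NONE = 0, TOK_SPACE = 1, TOK_NUMERIC = 2, TOK_ALPHA = 3 };

// Assumption: standard ASCII covers code points 0 through 127.
constexpr std::size_t END_STANDARD_ASCII = 128;

// Hypothetical helper that fills the lookup table once.
// Which characters land in each class is an assumption of this sketch.
inline std::array<character_type, END_STANDARD_ASCII> build_char_types()
{
	std::array<character_type, END_STANDARD_ASCII> table{}; // every entry starts as TOK_NONE
	for(int c = 0; c < static_cast<int>(END_STANDARD_ASCII); ++c) {
		character_type t = TOK_NONE;
		if((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_') {
			t = TOK_ALPHA;
		} else if(c >= '0' && c <= '9') {
			t = TOK_NUMERIC;
		} else if(c == ' ' || c == '\t') {
			t = TOK_SPACE;
		}
		table[static_cast<std::size_t>(c)] = t;
	}
	return table;
}

// Anything outside the table (including a negative EOF value converted
// to unsigned) is reported as TOK_NONE.
inline character_type char_type(unsigned c)
{
	static const auto table = build_char_types();
	return c < END_STANDARD_ASCII ? table[c] : TOK_NONE;
}

inline bool is_space(int c) { return char_type(static_cast<unsigned>(c)) == TOK_SPACE; }
inline bool is_num(int c)   { return char_type(static_cast<unsigned>(c)) == TOK_NUMERIC; }
inline bool is_alnum(int c) { return char_type(static_cast<unsigned>(c)) > TOK_SPACE; }

// Small usage example.
inline void demo_char_types()
{
	assert(is_alnum('a') && is_alnum('7'));
	assert(is_space('\t') && !is_alnum(' '));
	assert(char_type(200u) == TOK_NONE);
}
```

Taking the argument as unsigned means a single upper-bound check covers both out-of-range and negative inputs, so callers can pass the int returned by a stream read directly.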

◆ current_token()

const token& tokenizer::current_token ( ) const
inline

Definition at line 109 of file tokenizer.hpp.

References token_.

◆ get_file()

const std::string& tokenizer::get_file ( ) const
inline

Definition at line 126 of file tokenizer.hpp.

References file_.

◆ get_start_line()

int tokenizer::get_start_line ( ) const
inline

Definition at line 131 of file tokenizer.hpp.

References startlineno_.

◆ is_alnum()

bool tokenizer::is_alnum ( int  c) const
inline private

Definition at line 201 of file tokenizer.hpp.

References c, char_type(), and TOK_SPACE.

Referenced by next_token().

◆ is_num()

bool tokenizer::is_num ( int  c) const
inline private

Definition at line 196 of file tokenizer.hpp.

References c, char_type(), and TOK_NUMERIC.

Referenced by skip_comment().

◆ is_space()

bool tokenizer::is_space ( int  c) const
inline private

Definition at line 191 of file tokenizer.hpp.

References c, char_type(), and TOK_SPACE.

Referenced by next_token(), skip_command(), and skip_comment().

◆ next_char()

void tokenizer::next_char ( )
inline private

Increments the line number if the current character is a newline, then sets current_ to the next character that is not \r.

Definition at line 146 of file tokenizer.hpp.

References current_, lineno_, token::NEWLINE, and next_char_skip_cr().

Referenced by next_token().

◆ next_char_skip_cr()

void tokenizer::next_char_skip_cr ( )
inline private

Sets current_ to the next character, skipping the \r in Windows-style \r\n line endings. The test_cvs_2018_1999023_2.cfg file also uses \r\n line endings for some reason; otherwise the check isn't needed on non-Windows platforms, since \r characters are removed from cfg files on upload.

Definition at line 158 of file tokenizer.hpp.

References current_, buffered_istream::get(), and in_.

Referenced by next_char(), next_token(), skip_command(), skip_comment(), and tokenizer().
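
The interplay between next_char(), next_char_skip_cr() and peek_char() can be sketched as follows. This is an illustration under stated assumptions rather than the actual code: char_reader is a hypothetical stand-in for the tokenizer, a plain std::istream replaces buffered_istream, the line counter is assumed to start at 1, and the newline is written as '\n' where the real class refers to token::NEWLINE.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical stand-in for the tokenizer's character-reading state.
struct char_reader
{
	explicit char_reader(std::istream& in) : in_(in)
	{
		next_char_skip_cr(); // prime current_ with the first character
	}

	// Count the newline being left behind, then advance.
	void next_char()
	{
		if(current_ == '\n') {
			++lineno_;
		}
		next_char_skip_cr();
	}

	// Advance one character; if it is '\r', advance once more.
	void next_char_skip_cr()
	{
		current_ = in_.get();
		if(current_ == '\r') {
			current_ = in_.get();
		}
	}

	// Look at the next character without consuming it.
	int peek_char() { return in_.peek(); }

	std::istream& in_;
	int current_ = 0;
	int lineno_ = 1; // assumed starting line number
};

// Small usage example on input with a Windows-style line ending.
inline void demo_char_reader()
{
	std::istringstream in("a\r\nb");
	char_reader r(in);
	assert(r.current_ == 'a');
	r.next_char();                 // the '\r' is skipped, current_ is '\n'
	assert(r.current_ == '\n' && r.lineno_ == 1);
	assert(r.peek_char() == 'b');  // peeking does not consume
	r.next_char();                 // leaving '\n' increments the line count
	assert(r.current_ == 'b' && r.lineno_ == 2);
}
```

In this sketch the line number changes only when the reader moves past a newline, so a token that ends at the end of a line is still attributed to that line.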

◆ next_token()

const token & tokenizer::next_token ( )

◆ peek_char()

int tokenizer::peek_char ( )
inline private

Returns the next character without advancing the current position in the istream.

Definition at line 169 of file tokenizer.hpp.

References in_, and buffered_istream::peek().

Referenced by next_token().

◆ skip_command()

bool tokenizer::skip_command ( char const *  cmd)
private

Returns true if the next characters are the ones from cmd followed by a space.

Skips all the matching characters. Currently only used by #textdomain (specified in the WML) and #line (added by the preprocessor).

Definition at line 202 of file tokenizer.cpp.

References current_, is_space(), and next_char_skip_cr().

Referenced by skip_comment().
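
The matching described above can be sketched as a free function over a std::istream. This is an assumption-laden illustration, not the member function itself: advance() is a hypothetical helper standing in for next_char_skip_cr(), the caller is assumed to hold the character just before the command text in current, and "space" is assumed to mean a space or tab.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical helper standing in for next_char_skip_cr().
inline void advance(std::istream& in, int& current)
{
	current = in.get();
	if(current == '\r') {
		current = in.get();
	}
}

// Assumes `current` holds the character just before the command text.
inline bool skip_command(std::istream& in, int& current, const char* cmd)
{
	for(; *cmd != '\0'; ++cmd) {
		advance(in, current);
		if(current != *cmd) {
			return false; // mismatch: characters read so far stay consumed
		}
	}
	advance(in, current);
	if(current != ' ' && current != '\t') {
		return false; // the command must be followed by a space
	}
	advance(in, current); // step past the space
	return true;
}

// Small usage example.
inline void demo_skip_command()
{
	std::istringstream in("#textdomain wesnoth-lib");
	int current = in.get(); // '#'
	assert(skip_command(in, current, "textdomain"));
	assert(current == 'w'); // first character of the argument
}
```

On a mismatch this sketch does not rewind the stream, which is acceptable when the caller treats the remainder of the line as an ordinary comment.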

◆ skip_comment()

void tokenizer::skip_comment ( )
private

Handles skipping over comments (inline and on a separate line), as well as the special processing needed for #textdomain and #line.

Definition at line 222 of file tokenizer.cpp.

References current_, dst, file_, is_num(), is_space(), lineno_, token::NEWLINE, next_char_skip_cr(), skip_command(), and textdomain_.

Referenced by next_token().
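
The special-casing of #textdomain and #line can be sketched on a single comment line held in a string. This is only an illustration: comment_state and apply_comment() are hypothetical names, and the "#line <number> <file>" layout is an assumption inferred from the members referenced above (is_num(), lineno_ and file_), not a documented format.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical container for the fields the directives update.
struct comment_state
{
	std::string textdomain;
	std::string file;
	int lineno = 1;
};

// Returns true if the comment was one of the two recognised directives.
// Any other comment is ignored and leaves the state untouched.
inline bool apply_comment(const std::string& line, comment_state& st)
{
	const std::string td = "#textdomain ";
	const std::string ln = "#line ";

	if(line.compare(0, td.size(), td) == 0) {
		st.textdomain = line.substr(td.size());
		return true;
	}

	if(line.compare(0, ln.size(), ln) == 0) {
		std::size_t i = ln.size();
		const std::size_t digits_start = i;
		int n = 0;
		while(i < line.size() && line[i] >= '0' && line[i] <= '9') {
			n = n * 10 + (line[i] - '0');
			++i;
		}
		// Require at least one digit followed by a space (assumed layout).
		if(i == digits_start || i >= line.size() || line[i] != ' ') {
			return false;
		}
		st.lineno = n;
		st.file = line.substr(i + 1);
		return true;
	}

	return false;
}

// Small usage example.
inline void demo_apply_comment()
{
	comment_state st;
	assert(apply_comment("#textdomain wesnoth-help", st));
	assert(st.textdomain == "wesnoth-help");
	assert(apply_comment("#line 42 data/core/units.cfg", st));
	assert(st.lineno == 42 && st.file == "data/core/units.cfg");
	assert(!apply_comment("# an ordinary comment", st));
}
```

Working on a whole line keeps the sketch short; the member function documented here instead consumes characters one at a time through next_char_skip_cr().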

◆ textdomain()

const std::string& tokenizer::textdomain ( ) const
inline

Definition at line 121 of file tokenizer.hpp.

References textdomain_.

Member Data Documentation

◆ char_types_

std::array<character_type, END_STANDARD_ASCII> tokenizer::char_types_
private

Definition at line 224 of file tokenizer.hpp.

Referenced by char_type(), and tokenizer().

◆ current_

int tokenizer::current_
private

Definition at line 138 of file tokenizer.hpp.

Referenced by next_char(), next_char_skip_cr(), next_token(), skip_command(), and skip_comment().

◆ file_

std::string tokenizer::file_
private

Definition at line 218 of file tokenizer.hpp.

Referenced by get_file(), and skip_comment().

◆ in_

buffered_istream tokenizer::in_
private

Definition at line 223 of file tokenizer.hpp.

Referenced by next_char_skip_cr(), peek_char(), tokenizer(), and ~tokenizer().

◆ lineno_

int tokenizer::lineno_
private

Definition at line 139 of file tokenizer.hpp.

Referenced by next_char(), next_token(), and skip_comment().

◆ startlineno_

int tokenizer::startlineno_
private

Definition at line 140 of file tokenizer.hpp.

Referenced by get_start_line(), and next_token().

◆ textdomain_

std::string tokenizer::textdomain_
private

Definition at line 217 of file tokenizer.hpp.

Referenced by skip_comment(), and textdomain().

◆ token_

token tokenizer::token_
private

Definition at line 219 of file tokenizer.hpp.

Referenced by current_token(), and next_token().


The documentation for this class was generated from the following files: