Text
In the Erbsland Configuration Language (ELCL), text, or more specifically single-line text, not only serves as a value but also plays a special role in section and value names. Its syntax is therefore defined early, because it is used throughout this documentation.
text ::= DOUBLE_QUOTE ( text_character | text_escape )* DOUBLE_QUOTE
text_character ::= TEXT - (BACKSLASH | DOUBLE_QUOTE)
text_escape ::= BACKSLASH (BACKSLASH | DOUBLE_QUOTE | DOLLAR | [nN] | [rR] | [tT] |
                [uU] text_unicode )
text_unicode ::= ( HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT |
                 CU_BRACKET_OPEN HEX_DIGIT+ CU_BRACKET_CLOSE ) /* 1-8 hex digits */
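As a rough illustration, the grammar can be transcribed into a regular expression. This is only a sketch: the name `TEXT_PATTERN` is hypothetical, and `TEXT` is assumed here to mean any character other than the control characters U+0000 to U+001F and U+007F, with the tab character allowed. A pattern like this checks only the shape of a text; it cannot enforce rules such as the ban on the null character.

```python
import re

# Hypothetical transcription of the grammar into a regular expression.
# Assumption: TEXT excludes U+0000-U+001F (except tab, U+0009) and U+007F.
TEXT_PATTERN = re.compile(
    r'"(?:'
    r'[^"\\\x00-\x08\x0a-\x1f\x7f]'                    # text_character
    r'|\\(?:[\\"$nNrRtT]'                               # simple text_escape forms
    r'|[uU](?:[0-9a-fA-F]{4}|\{[0-9a-fA-F]{1,8}\}))'   # text_unicode forms
    r')*"'
)

for candidate in ['"This is text"', r'"This \ wrong"', r'"caf\u00e9 \u{1F600}"']:
    verdict = "matches" if TEXT_PATTERN.fullmatch(candidate) else "does not match"
    print(candidate, verdict)
```

`fullmatch` is used so that trailing characters after the closing quote count as a mismatch; a real parser would instead continue with whatever follows the text on the line.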
Important
A parser is not required to perform Unicode normalization on any parsed text.
Unicode text may be internally processed as UTF-8 encoded byte-data. It is the responsibility of the application to handle normalization or perform additional checks on the text, depending on the specific requirements of its use case.
Basic Rules
Format: Text consists of characters enclosed between two double quote characters (").
"This is text"
Regular Characters: Any Unicode character can be used in text, except for backslashes (which introduce escape sequences), double quotes (which mark the end of the text), and all control characters (except the tab character, which is allowed).
"This is wrong" # ERROR! Line breaks and other control characters aren't allowed in text.
"This "is" wrong" # ERROR! Double quotes must be escaped in text.
"This \ wrong" # ERROR! The backslash introduces escape sequences, so this is invalid.
Escape Sequence Rules
Escape Sequences: Each escape sequence inserts exactly one character into the text.
Format: An escape sequence begins with a backslash (\), followed by one or more characters.
Case-insensitive: Escape sequences are not case-sensitive.
\\: Inserts a single backslash (\).
\": Inserts a double quote (").
\$: Inserts a dollar sign ($).
\n: Inserts a newline control character (\n).
\r: Inserts a carriage return control character (\r).
\t: Inserts a tab control character (\t).
\uXXXX: Inserts a Unicode character, where XXXX represents exactly four hexadecimal digits that form the character’s code point.
\u{Y}: Inserts a Unicode character, where Y can be one to eight hexadecimal digits, forming the character’s code point. Zero padding is allowed.
Null is Forbidden: The “null” character cannot be inserted into text.
Unknown Sequences Rejected: Any escape sequences not explicitly listed here must be rejected.
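The escape sequence rules can be sketched as a small decoder. This is a minimal illustration, not a reference implementation: the names `unescape`, `decode_code_point`, and `CharacterError` are hypothetical, and rejecting surrogates and code points above U+10FFFF is an assumption that this section does not state. The function expects the characters between the two double quotes and does not check regular characters.

```python
SIMPLE_ESCAPES = {"\\": "\\", '"': '"', "$": "$", "n": "\n", "r": "\r", "t": "\t"}
HEX_DIGITS = set("0123456789abcdefABCDEF")


class CharacterError(ValueError):
    """Hypothetical stand-in for the 'Character' error code."""


def decode_code_point(digits: str) -> str:
    """Turn hexadecimal digits into a single character."""
    if not digits or any(d not in HEX_DIGITS for d in digits):
        raise CharacterError(f"invalid hex digits: {digits!r}")
    code_point = int(digits, 16)
    if code_point == 0:
        raise CharacterError("the null character is forbidden")
    # Assumption: surrogates and values beyond U+10FFFF are rejected as well.
    if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise CharacterError(f"invalid code point: {digits}")
    return chr(code_point)


def unescape(body: str) -> str:
    """Decode the escape sequences in the text between the quotes."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise CharacterError("backslash without an escape character")
        kind = body[i + 1]
        key = kind.lower()  # escape sequences are case-insensitive
        if key in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[key])
            i += 2
        elif key == "u":
            if body[i + 2 : i + 3] == "{":
                end = body.find("}", i + 3)
                if end == -1 or not 1 <= end - (i + 3) <= 8:
                    raise CharacterError("malformed \\u{...} escape")
                out.append(decode_code_point(body[i + 3 : end]))
                i = end + 1
            else:
                digits = body[i + 2 : i + 6]
                if len(digits) != 4:
                    raise CharacterError("\\u requires exactly four hex digits")
                out.append(decode_code_point(digits))
                i += 6
        else:
            # Any sequence not listed in the rules is rejected.
            raise CharacterError(f"unknown escape sequence: \\{kind}")
    return "".join(out)


print(unescape(r"Tab:\tEuro:\u20AC\N"))
```

The hex digits are validated by hand before calling `int(digits, 16)`, because `int` would otherwise accept forms such as a leading sign or underscores that the grammar does not allow.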
Features
Feature | Coverage |
---|---|
core | Text and all escape sequences are part of the core language. |
Errors
Error Code | Causes |
---|---|
Character | This error is raised for any illegal character or invalid escape sequence within the text. |
Syntax | Raised if the closing double quote character is missing at the end of the line or document. |
LimitExceeded | Raised if the text exceeds the maximum text size the parser can handle. |
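To show how the three error codes might surface in practice, here is a hedged sketch of a scanner that locates the closing double quote. Everything in it is an assumption made for illustration: the exception classes, the function `scan_text`, the limit `MAX_TEXT_BYTES` and its value, and the choice to measure the limit in UTF-8 bytes. Control characters are identified by the Unicode general category `Cc`, which is one possible reading of "control characters". Escape sequences are skipped but not validated here.

```python
import unicodedata


class ElclTextError(Exception):
    """Hypothetical base class for the error codes in the table."""


class CharacterError(ElclTextError):
    pass


class TextSyntaxError(ElclTextError):
    pass


class LimitExceededError(ElclTextError):
    pass


MAX_TEXT_BYTES = 1024  # hypothetical limit; the actual maximum is parser-specific


def scan_text(line: str, start: int = 0) -> tuple[str, int]:
    """Return the raw text body and the index just after the closing quote."""
    if line[start : start + 1] != '"':
        raise TextSyntaxError("text must start with a double quote")
    i = start + 1
    while i < len(line):
        ch = line[i]
        if ch == '"':
            body = line[start + 1 : i]
            if len(body.encode("utf-8")) > MAX_TEXT_BYTES:
                raise LimitExceededError("text exceeds the maximum text size")
            return body, i + 1
        if ch in "\r\n":
            break  # end of line reached before the closing quote
        if ch != "\t" and unicodedata.category(ch) == "Cc":
            raise CharacterError(f"illegal control character U+{ord(ch):04X}")
        # Skip the character after a backslash so that \" does not end the text.
        if ch == "\\" and line[i + 1 : i + 2] not in ("", "\r", "\n"):
            i += 2
        else:
            i += 1
    raise TextSyntaxError("missing closing double quote")


body, next_index = scan_text('"Hello \\"World\\"" # comment')
print(body, next_index)
```

A line break inside the text is reported here as a missing closing quote rather than as an illegal character, which follows the "end of the line" wording in the table; a different parser could reasonably draw that boundary elsewhere.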