Code Text Values

Code text values are a special format of text values that do not support escape sequences. Code text is enclosed within backticks (`), making it ideal for including short code snippets or text with many backslashes (\) without the need to escape special characters. It is important to note that code text is not a separate data type; it is simply a formatting style for text values.

code                ::= BACKTICK code_text+ BACKTICK

code_text           ::= TEXT - BACKTICK

For multi-line code, the format is slightly different: the text is enclosed in triple backticks (```), and every line of the content must follow a consistent indentation pattern.

multi_line_code      ::= ml_code_start ml_code_line* ml_code_end

ml_code_start       ::= "```" ( ALPHA FORMAT_DIGIT* )? end_of_line
ml_code_end         ::= indentation_pattern "```"
ml_code_line        ::= ( indentation_pattern TEXT+ )? end_of_line

In the following examples, you’ll see a single-line code text and a multi-line code text:

[main]
Python RegEx: `re.compile(r"(-*\*?)(\[)([ \t]*)(\.)?")`
Python Map:
    ```
    TEXT_ESCAPE_SUBSTITUTIONS = {
        "\\": "\\",
        "n": "\n",
        "r": "\r",
        "t": "\t",
        '"': '"',
        "$": "$",
    }
    ```

Rules for Single Line Code Text

  1. Format: Code text is enclosed between two backtick characters (`).

    [main]
    Code Text: `This is code text`
    
  2. No Escape Sequences: Escape sequences are not supported in code text. Any backslashes or other special characters are treated as literal characters.

    [main]
    No Escape: `\\\\\\\\\\\`
    

    Note

    There is no way to include the backtick character itself within single-line code text. If needed, consider using multi-line code text, which supports backticks within the content.

  3. Valid Characters: Any Unicode character is valid within code text, except for control characters other than the tab (\t) and the backtick (`), which always closes the code text.

    [main]
    Code: `re.compile(r"(-*\*?)(\[)([ \t]*)(\.)?")`
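
The three rules above can be sketched as a small Python function. This is only an illustration, not a reference implementation: the name parse_single_line_code, the use of ValueError with the error name as a message prefix, and the exact definition of "control character" (code points below U+0020 and U+007F to U+009F) are assumptions made for this example.

```python
def is_allowed(ch: str) -> bool:
    # Assumption: control characters are U+0000..U+001F and U+007F..U+009F.
    # The tab character is the one documented exception.
    if ch == "\t":
        return True
    code = ord(ch)
    return not (code < 0x20 or 0x7F <= code <= 0x9F)


def parse_single_line_code(source: str) -> str:
    """Return the content of a single-line code text that starts `source`."""
    if not source.startswith("`"):
        raise ValueError("Syntax: code text must start with a backtick")
    # The first backtick after the opening one always closes the text.
    end = source.find("`", 1)
    if end == -1:
        raise ValueError("Syntax: closing backtick is missing")
    content = source[1:end]
    for ch in content:
        if not is_allowed(ch):
            raise ValueError(f"Character: illegal character U+{ord(ch):04X}")
    # No escape processing happens here: backslashes stay literal.
    return content
```

Because nothing is unescaped, the value returned for the "No Escape" example is the same run of backslashes that appears between the backticks.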
    

Rules for Multi-line Code Text

  1. Beginning the Text: Multi-line code text begins with a sequence of three backtick characters (```), optionally followed by a language identifier. This opening sequence may be followed by spaces or a comment, but the line must then end with a line break.

    [main]
    code 1: ```
        text = "How are you?"
        ```
    code 2:      # Comments after the value separator.
        ```cpp   # Optional language identifier, followed by spacing or comments.
        if (text == "How are you?") {
            out << "I'm fine, thanks, how are you?\n"
        }
        ```
    
  2. Language Identifier: The language identifier must start with a letter (a-z, case-insensitive) and can be followed by 0 to 15 further characters: letters (a-z, case-insensitive), digits (0-9), hyphens (-), and underscores (_). The parser must treat the language identifier like a comment and ignore it.

    [main]
    code 2:
        ```java
        int sum = 5 + 3;
        System.out.println("Sum: " + sum);
        ```
    

    Note

    The language identifier is for syntax highlighting purposes only and is ignored by the parser.

  3. Content and Indentation: The content of the multi-line code text starts after the line break following the opening backticks. Each line must be indented by at least one space or tab character. Refer to Spacing for details on indentation.

    [main]
    code:
        ```
        r"(\n|\r\n)([ \t]+)(```)"
        ```
    
  4. Consistent Indentation: Each continued line of the multi-line code text must follow the exact sequence of spaces and tabs used at the start of the code text. This ensures that code requiring indentation retains its structure. See Spacing for more information.

    [main]
    code:
        ```cobol
                IDENTIFICATION DIVISION.
                PROGRAM-ID. HelloWorld.
                PROCEDURE DIVISION.
                    DISPLAY 'Hello, World!'.
                    STOP RUN.
        ```
    
  5. Ending the Text: Multi-line code text ends on a new line with the same indentation as the previous lines, followed immediately by a sequence of three backtick characters (```).

    [main]
    code:
        ```
        // Code can contain backtick (`) characters
        System.out.println("Even multiple ones, like here: ```");
                ``` // ← This is not the end.
        ``` # ← This is where the code ends.
    

    Note

    Unlike single-line code text, multi-line code text allows backtick characters within its content.

  6. Allowed Characters: Any Unicode character can be used in multi-line code text, except for control characters (with the exception of the tab character, which is allowed).

    [main]
    code: ```rust
        fn main() {
            let greeting = "こんにちは, 世界!";
            println!("{}", greeting);
        }
        ```
    
  7. No Escape Sequences: Escape sequences are not supported in multi-line code text. Any backslashes or other special characters are treated as literal characters.

    [main]
    code: ```
        \u{1f604}\n\u2191 is just a random sequence of characters.
        ```
    
  8. Line Breaks: Each line break in multi-line code text is converted into a single newline character (\n), regardless of the original line break style used in the configuration document.

    [main]
    code: ```
        print("""
        """)
        ```
    

    The result will always be: print("""↵""").

  9. Trimming Whitespace: Leading and trailing whitespace around the code text is removed, as described in Spacing.

    [main]
    code:
        ```
            TEXT
        ```
    

    The resulting text will be: ⎵⎵⎵⎵TEXT.
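
The indentation, ending, and line-break rules above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the reference parser: the function name parse_multi_line_code is hypothetical, the function only handles the lines that follow the opening line, blank lines are accepted without indentation, and when the opening backticks share a line with the value name, the first continued line is assumed to define the indentation pattern. Character validation and any further trimming described in Spacing are left out.

```python
import re
from typing import Optional

# Any of the three common line-break styles ends a line.
LINE_BREAK = re.compile(r"\r\n|\n|\r")


def parse_multi_line_code(body: str, indentation: Optional[str] = None) -> str:
    """Parse the lines that follow the opening ``` line.

    `indentation` is the whitespace in front of the opening backticks when
    they start on their own line; pass None when they follow the separator.
    """
    content = []
    for line in LINE_BREAK.split(body):
        if line.strip(" \t") == "":
            content.append("")  # assumption: blank lines need no indentation
            continue
        if line[0] not in " \t":
            raise ValueError("Indentation: continued line is not indented")
        if indentation is None:
            # Assumption: the first continued line defines the pattern.
            indentation = line[: len(line) - len(line.lstrip(" \t"))]
        if not line.startswith(indentation):
            raise ValueError("Indentation: pattern does not match")
        rest = line[len(indentation):]
        if rest.startswith("```"):
            # Pattern followed immediately by three backticks: the end.
            return "\n".join(content)
        content.append(rest)
    raise ValueError("Syntax: closing backticks are missing")
```

With an eight-space pattern, the body of the trimming example (a twelve-space TEXT line followed by the closing line) yields four spaces followed by TEXT, and a closing sequence that is indented deeper than the pattern is kept as content rather than ending the text.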

Features

Feature       Coverage
code          Code text values are a standard feature.
multi-line    Multi-line code text values are a standard feature.

Errors

Error Code       Causes
Character        Raised if any illegal characters are found within the text.
Syntax           Raised if the closing sequence of backtick characters is missing.
Indentation      No space or tab character is present before a continued code text.
                 The indentation pattern does not match the first entry for a continued code text.
LimitExceeded    Raised if the code text exceeds the maximum size the parser can handle.
                 Raised if the language identifier exceeds 16 characters.
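
As an illustration of how the language identifier rule and the LimitExceeded error fit together, the following Python sketch checks an identifier that has already been separated from the opening backticks. The function name check_language_identifier is hypothetical, and reporting a malformed identifier as a Syntax error is an assumption, since this section does not name the error for that case.

```python
import re

# One letter, then 0 to 15 letters, digits, hyphens, or underscores.
LANGUAGE_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_-]{0,15}")
MAX_IDENTIFIER_LENGTH = 16


def check_language_identifier(identifier: str) -> None:
    """Raise ValueError if `identifier` breaks the language identifier rule."""
    if len(identifier) > MAX_IDENTIFIER_LENGTH:
        raise ValueError("LimitExceeded: language identifier is too long")
    if LANGUAGE_IDENTIFIER.fullmatch(identifier) is None:
        # Assumption: a malformed identifier is reported as a syntax error.
        raise ValueError("Syntax: invalid language identifier")
    # A valid identifier is ignored by the parser, so nothing is returned.
```

An identifier such as cpp or objective-c passes the check, while one with 17 characters is rejected for its length before its content is examined.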