Skip to content

§3 Lexical grammar

The lexical grammar defines Bynk’s terminals: the tokens a source text is divided into, and the trivia discarded between them. Each production below is generated from the grammar (§2.1); this chapter states only the syntactic facts. Constraints beyond lexing (for example, the admissible range of a literal) are well-formedness rules and are deferred to §5.

identifier ::= /[A-Za-z][A-Za-z0-9_]*/

A letter followed by letters, digits, or underscores. Identifiers name declarations, parameters, fields, and bindings. A source word that matches a keyword is lexed as that keyword, not as an identifier.

constant_name ::= /[A-Z][A-Za-z0-9_]*/

An upper-case-initial name. Constant names denote sum-type variants and enum constants.

number_literal ::= /[0-9]+/

A run of decimal digits. A number literal is unsigned; a leading - is the unary negation operator (§4.6), not part of the token. An integer literal that does not fit a 64-bit signed integer is bynk.lex.integer_overflow.

float_literal ::= /[0-9]+\.[0-9]+([eE][+-]?[0-9]+)?|[0-9]+[eE][+-]?[0-9]+/

A Float literal (v0.21): a fraction with a digit required on both sides of the . (1.0, 0.5), an exponent (1e10, 1.5e-3), or both. 1. and .5 are rejected as bynk.parse.malformed_float_literal. Like number_literal the token is unsigned. A literal that does not fit a finite IEEE 754 double (1e999) is bynk.lex.float_literal_overflow — there is no way to write a non-finite Float literal.

1 is an Int; 1.0 (or any exponent form) is a Float. The digit-both-sides rule keeps method calls on numeric literals unambiguous under maximal munch: 2.5.round() lexes as 2.5 . round, and 1.toFloat() as 1 . toFloat. The compiler preserves the literal’s lexeme through emission and formatting — 1e10 does not normalise to 10000000000.

string_literal ::= """ (/[^"\\\n]/ | /\\[nt"\\]/ | string_interpolation)* """

A double-quoted string. The escape sequences \n, \t, \", and \\ are recognised; an unescaped newline does not appear within the token.

A string may also contain interpolation holes of the form \(expr) (v0.43): the text \( opens a hole whose body runs to its matching ) (parentheses balance, and a nested "…" inside a hole is skipped so its parens do not close the hole), and the body is an ordinary expression. \\( is the escape for a literal \(, so existing literals are unaffected (a bare \( was previously an invalid escape). A string containing one or more holes is an interpolated string; one with none is the plain string literal above. The hole rule and emission are specified in §5.2 well-typedness and §7 emission.

boolean_literal ::= "true" | "false"

The two Bool values, true and false.

unit_literal ::= "(" ")"

The unit value () — the single value of the unit type. It is lexically the empty parenthesis pair.

line_comment ::= "--" /[^\n]*/

A comment from -- to the end of the line. Bynk uses --, never //. Line comments are trivia (§3.4).

A doc-block is a --- … --- documentation block. It is an external token: it is not a grammar rule but a terminal recognised by the lexer and attached to the declaration that follows it. Like comments and whitespace, a doc-block is trivia (§3.4).

Between tokens the lexer discards trivia: whitespace (/\s+/), line comments (§3.3.1), and doc-blocks (§3.3.2). Trivia is insignificant to the syntactic grammar; it does not appear in the productions of §4. The complete token-and-trivia summary is part of the grammar appendix (§11).