Lexical Structure in JavaScript

Ajay Dhangar
3 min read · Sep 2, 2024

The lexical structure of a programming language is the set of elementary rules that specify how characters are combined into tokens. It is the lowest level of JavaScript's syntax and covers things like comments, identifiers, reserved words, literals, whitespace, and how statements are terminated. JavaScript is case-sensitive, and its source text is written in Unicode.

Tokens

A token is the smallest individual unit of a program. Before parsing, JavaScript breaks the source code into tokens so that the parser can work with meaningful units instead of raw characters. Tokens include identifiers, keywords, literals, and punctuators (operators and delimiters such as `+`, `=`, `(`, and `;`).
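To make the idea concrete, here is a minimal sketch of a tokenizer. The names `tokenize`, `TOKEN_PATTERN`, and `KEYWORDS` are hypothetical, and the pattern covers only a tiny subset of the language; real engines follow the full ECMAScript lexical grammar.

```javascript
// Hypothetical, heavily simplified tokenizer for illustration only.
// It recognizes identifiers, integer literals, double-quoted strings,
// and a handful of punctuators; anything else (such as whitespace) is skipped.
const TOKEN_PATTERN =
  /(?<identifier>[A-Za-z_$][A-Za-z0-9_$]*)|(?<number>\d+)|(?<string>"[^"]*")|(?<punctuator>[=+\-*\/;(){}])/g;

// A deliberately short keyword list (assumption: just enough for the demo).
const KEYWORDS = new Set(["let", "const", "var", "if", "else", "return", "function"]);

function tokenize(source) {
  const tokens = [];
  for (const match of source.matchAll(TOKEN_PATTERN)) {
    // Exactly one named group is defined for each match.
    const [group, value] = Object.entries(match.groups).find(([, text]) => text !== undefined);
    const type = group === "identifier" && KEYWORDS.has(value) ? "keyword" : group;
    tokens.push({ type, value });
  }
  return tokens;
}

console.log(tokenize("let total = 40 + 2;").map((t) => `${t.type}:${t.value}`).join(" "));
// keyword:let identifier:total punctuator:= number:40 punctuator:+ number:2 punctuator:;
```

The whitespace between tokens never shows up in the result; it only serves to keep `let` and `total` apart.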

Comments

Comments are ignored by the JavaScript interpreter. They are used to make the code more readable. JavaScript supports both single-line and multi-line comments.

Single-line Comments

Single-line comments start with `//` and continue until the end of the line.

// This is a single-line comment

Multi-line Comments

Multi-line comments start with `/*` and end with `*/`.

/*
This is a multi-line comment
*/
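As a small illustration (the variable names are made up for this example), comment markers only count when they appear outside string literals, and a `/* ... */` comment can sit in the middle of an expression:

```javascript
// Comment markers inside a string literal are part of the string, not comments.
const url = "http://example.com";

// A multi-line comment can also appear in the middle of an expression.
const total = 1 + /* this part is skipped */ 2;

console.log(url);   // http://example.com
console.log(total); // 3
```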

Literals

A literal is a data value that appears directly in a program. For example, `42` is a numeric literal and `"hello"` is a string literal.
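A few more kinds of literals, shown as a short sketch with illustrative variable names:

```javascript
const answer = 42;          // numeric literal (decimal integer)
const price = 9.99;         // numeric literal (decimal fraction)
const mask = 0xff;          // numeric literal (hexadecimal), equal to 255
const huge = 123n;          // BigInt literal
const greeting = "hello";   // string literal
const enabled = true;       // boolean literal
const nothing = null;       // null literal
const pattern = /ab+c/;     // regular expression literal

console.log(mask);                  // 255
console.log(typeof huge);           // bigint
console.log(pattern.test("abbbc")); // true
```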

Identifiers

An identifier is a name given to a variable, function, class, or property. Identifiers must start with a letter, an underscore (`_`), or a dollar sign (`$`); they cannot start with a digit. After the first character they can contain letters, digits, underscores, and dollar signs. "Letter" here includes Unicode letters, not only `a` to `z`. Identifiers are case-sensitive.
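The rules can be sketched with a few declarations and a rough shape check. The helper `looksLikeIdentifier` and the pattern `IDENTIFIER_SHAPE` are hypothetical; the check is an approximation that does not reject reserved words and ignores Unicode escape sequences in names.

```javascript
// Valid identifiers: they start with a letter, `_`, or `$`.
const userName = "Ada";
const _privateCount = 0;
const $element = null;
const item2 = "digits are fine after the first character";

// Hypothetical helper: a rough check of the identifier shape.
// It does not reject reserved words and ignores Unicode escape sequences.
const IDENTIFIER_SHAPE = /^[\p{ID_Start}$_][\p{ID_Continue}$\u200c\u200d]*$/u;

function looksLikeIdentifier(name) {
  return IDENTIFIER_SHAPE.test(name);
}

console.log(looksLikeIdentifier("userName")); // true
console.log(looksLikeIdentifier("2fast"));    // false
console.log(looksLikeIdentifier("my-var"));   // false
```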

Reserved Words

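Reserved words have special meanings in JavaScript and cannot be used as identifiers. They include `break`, `case`, `catch`, `class`, `const`, `continue`, `debugger`, `default`, `delete`, `do`, `else`, `enum`, `export`, `extends`, `false`, `finally`, `for`, `function`, `if`, `import`, `in`, `instanceof`, `new`, `null`, `return`, `super`, `switch`, `this`, `throw`, `true`, `try`, `typeof`, `var`, `void`, `while`, and `with`. A second group is reserved only in certain contexts: `let`, `static`, `implements`, `interface`, `package`, `private`, `protected`, and `public` in strict mode, `yield` in strict mode and generator functions, and `await` in modules and async functions. Words such as `async`, `of`, `get`, and `set` are contextual: they have special meaning only in specific positions and remain valid identifiers. `undefined` is a global property rather than a reserved word, and `abstract` and `final` were reserved only in older editions of the ECMAScript standard.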
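One way to see the restriction in action is to ask the engine itself. The helper `canBeVariableName` below is hypothetical; it relies on the `Function` constructor, which parses its body as non-strict code, so it should only ever be given trusted input.

```javascript
// Hypothetical helper: asks the engine whether `name` can be declared with `var`.
// The Function constructor parses its body as non-strict code, so this
// reflects the rules outside strict mode.
function canBeVariableName(name) {
  try {
    new Function(`var ${name};`);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

console.log(canBeVariableName("total")); // true
console.log(canBeVariableName("if"));    // false
console.log(canBeVariableName("class")); // false

// Reserved words are still allowed as property names.
const settings = { class: "primary", default: true };
console.log(settings.class); // primary
```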

Case Sensitivity

JavaScript is case-sensitive. This means that `myVar`, `MyVar`, and `MYVAR` are three different variables.
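A short sketch of what that means in practice:

```javascript
// Three distinct variables that differ only in letter case.
const myVar = "lower camel case";
const MyVar = "upper camel case";
const MYVAR = "all caps";

console.log(myVar === MyVar); // false
console.log(MyVar === MYVAR); // false

// Keywords are case-sensitive too: `If` is an ordinary identifier, not `if`.
const If = "not a keyword";
console.log(If); // not a keyword
```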

Whitespace

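JavaScript ignores extra spaces, tabs, and newlines between tokens, so you can format code freely. Whitespace still matters in two ways: it separates tokens that would otherwise run together (`typeof x` is not the same as `typeofx`), and it is preserved inside string literals. Newlines are a partial exception to "ignored", because a line break can decide where a statement ends when a semicolon is omitted.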
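For example, the two declarations below produce the same value despite very different spacing, while whitespace inside a string literal is kept as written (variable names are illustrative):

```javascript
const compact=1+2;
const spaced   =   1
    +   2;

console.log(compact); // 3
console.log(spaced);  // 3

// Whitespace inside a string literal is part of the value.
const padded = "a   b";
console.log(padded.length); // 5
```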

Semicolons

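Semicolons terminate statements in JavaScript. You can often omit them, because the engine applies automatic semicolon insertion (ASI) and adds a semicolon at certain line breaks. The rules do not always match what the layout of the code suggests, so it is good practice to write semicolons explicitly and avoid unexpected results.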

var x = 5; // Semicolon is used
var y = 10 // Semicolon is omitted
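A well-known case where an omitted semicolon causes a surprise is a line break directly after `return`: the engine ends the statement there. The function names in this sketch are illustrative.

```javascript
function withoutLineBreak() {
  return { ok: true };
}

function withLineBreak() {
  // A semicolon is inserted right after `return`, so the function returns
  // undefined. The braces on the next line are parsed as a block statement.
  return
  { ok: true };
}

console.log(withoutLineBreak()); // { ok: true }
console.log(withLineBreak());    // undefined
```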

Line Breaks

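Line terminators separate lines of source text, and they influence where a statement ends: when a semicolon is omitted, a line break may terminate the statement. The recognized line terminators are line feed (`\n`), carriage return (`\r`), line separator (U+2028), and paragraph separator (U+2029). A carriage return followed by a line feed (`\r\n`) counts as a single line break.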
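Line breaks also behave differently inside the two kinds of string syntax. As a brief sketch with illustrative names: a template literal keeps a line break as part of the string, while a backslash at the end of a line in a quoted string continues the string without adding anything.

```javascript
// A line break inside a template literal becomes a "\n" in the value.
const multiLine = `first line
second line`;
console.log(multiLine === "first line\nsecond line"); // true

// A backslash directly before the line break is a line continuation:
// the string carries on and no newline is added.
const joined = "first part, \
second part";
console.log(joined); // first part, second part
```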

Keywords

Keywords are the reserved words that the language syntax actively uses. They cannot be used as identifiers. Some of the keywords in JavaScript are `break`, `case`, `catch`, `class`, `const`, `continue`, `debugger`, `default`, `delete`, `do`, `else`, `export`, `extends`, `finally`, `for`, `function`, `if`, `import`, `in`, `instanceof`, `new`, `return`, `super`, `switch`, `this`, `throw`, `try`, `typeof`, `var`, `void`, `while`, and `with`. The words `let`, `static`, and `yield` also act as keywords, but they are reserved only in certain contexts such as strict mode.

Future Reserved Words

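Future reserved words are not currently used by the language but are set aside for possible future use. `enum` is reserved everywhere, while `implements`, `interface`, `package`, `private`, `protected`, and `public` are reserved only in strict mode. `await` and `static` are sometimes listed here as well, but both are already in use (`await` in async functions and modules, `static` in class bodies), so they are better described as keywords that are reserved only in certain contexts.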
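The difference between "always reserved" and "reserved in strict mode" can be checked directly. The helper `canDeclare` is hypothetical and uses the `Function` constructor, so it should only be given trusted input.

```javascript
// Hypothetical helper: checks whether `name` can be declared with `var`,
// either in non-strict code or in strict mode.
function canDeclare(name, { strict = false } = {}) {
  const prologue = strict ? '"use strict"; ' : "";
  try {
    new Function(`${prologue}var ${name};`);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

console.log(canDeclare("enum"));                        // false
console.log(canDeclare("interface"));                   // true
console.log(canDeclare("interface", { strict: true })); // false
```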

Unicode Characters

JavaScript source text is written in Unicode, a universal character standard that defines a consistent way to represent characters from different languages and scripts. Unicode characters can appear in strings, identifiers, and comments.
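A short sketch of Unicode in identifiers and strings (the names are illustrative). One detail worth knowing is that JavaScript strings are sequences of UTF-16 code units, so some characters have a `length` of 2.

```javascript
// Non-ASCII letters are valid in identifiers.
const π = 3.14159;
console.log(π); // 3.14159

// A \uXXXX escape in an identifier names the same variable: \u0061 is "a".
const \u0061bc = 1;
console.log(abc); // 1

// Escapes in strings produce the same characters as typing them directly.
console.log("\u0041" === "A"); // true

// Strings are sequences of UTF-16 code units, so a character outside the
// Basic Multilingual Plane counts as two units but one code point.
const face = "\u{1F600}";
console.log(face.length);      // 2
console.log([...face].length); // 1
```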

Escape Sequences

Escape sequences are used to represent special characters in strings. An escape sequence starts with a backslash (`\`) followed by a character or characters. Some common escape sequences in JavaScript are:

  • `\'`: Single quote
  • `\"`: Double quote
  • `\\`: Backslash
  • `\n`: Newline
  • `\r`: Carriage return
  • `\t`: Tab
  • `\b`: Backspace
  • `\f`: Form feed
  • `\uXXXX`: Unicode character (four hexadecimal digits)
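A few of these escape sequences in use, as a small sketch with illustrative variable names:

```javascript
const quote = 'It\'s easy';       // escaped single quote inside single quotes
const path = "C:\\temp";          // one backslash in the resulting string
const twoLines = "first\nsecond"; // newline between the two words
const columns = "name\tvalue";    // tab between the two words
const letterA = "\u0041";         // Unicode escape for "A"

console.log(quote);                       // It's easy
console.log(path);                        // C:\temp
console.log(path.length);                 // 7
console.log(twoLines.split("\n").length); // 2
console.log(letterA);                     // A
```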

Summary

  • Lexical structure is the set of rules that defines how characters are combined into tokens, the lowest level of JavaScript's syntax.
  • Tokens are the smallest individual units of a program.
  • Comments are ignored by the JavaScript interpreter and are used to make the code more readable.
  • Literals are data values that appear directly in a program.
