Source Map

source map is a crucial concept in the development process. During debugging, we often need to examine the bundled code, but the bundled code has undergone operations like compression, obfuscation, and merging, making it difficult to read and debug directly. This is where source map comes in.

source map is essentially a json file that contains the mapping relationship between the bundled code and the source code. During debugging, we can use the position information in the source map to locate the corresponding position in the source code, allowing us to quickly navigate to the source code location during debugging.

In rollup, the generation of source map is closely related to the output configuration options, such as the output.sourcemap option. When set to true, rollup will generate a source map for the bundled file. When set to 'inline', rollup will inline the source map into the bundled file in base64 encoded form. When set to 'hidden', rollup will generate a source map for the bundled file but won't inline it; instead, it will generate a separate source map file. This chapter will detail the principles of source map generation in rollup.

Principles of Code Mapping

Mapping essentially provides mapping relationships to the editor. In other words, we need to provide information that includes the character coordinates of the bundled code and their corresponding character coordinates in the source code, along with the source code file location. This is the position information mentioned later in the mapping record:

Line coordinate of the modified position
Column coordinate of the modified position
Path of the original copy
Line coordinate of the corresponding source code position
Column coordinate of the corresponding source code position

With these guiding information, the editor can easily navigate to the corresponding source code during debugging.

But do we need to map every character?

The answer is not necessarily. While mapping every character can achieve absolute precision, it would generate a large number of mapping relationships, resulting in an excessively large source map file. Let's explore the different code mapping methods.

Code Mapping Methods

The most basic point of code mapping is determining which characters need mapping processing. The mapping information in sourcemap can include the following types:

Mapping word boundaries: Records mapping information for code word boundaries.
Coordinate mapping for each character: The most precise mapping recording method, which records coordinate mapping information for each character in the code.
Mapping lexical position coordinates: Records mapping information for code lexical positions.
Mapping line coordinates: The most basic mapping recording method, which records mapping information for code line numbers. This cannot be further reduced, as it might lead to mapping anomalies.

All four methods can establish mapping relationships between the build output and source code during debugging. However, they differ in terms of mapping precision and the resulting sourcemap file size.

magic string is actually designed to facilitate code mapping. For a detailed introduction, you can refer to the article. Here, we won't elaborate on the internal implementation of magic string.

magic string doesn't exclusively use one of the above four methods. Instead, it uses different methods for different scenarios and provides users with controllable mapping precision to achieve optimal mapping results. For more details, refer to Determining mapping Information in magic string.

Implementation Principles

Relationship between `rollup` and `magic-string`

The generation of source map in rollup primarily relies on the magic string package.

Looking at the git commit history of both Rollup and magic-string, we can see that both repositories are authored by Rich Harris. The Rollup project has been using magic-string since its early stages (the third release version of magic-string, v0.5.1) to handle the mapping relationships after source code modifications. It seems that the magic-string package was likely designed specifically for the Rollup project to optimize source map generation.

`rollup`'s `mapping` Recording Method

rollup deeply integrates with magic-string. Beyond magic-string's default recording method, it additionally provides **lexical position`** marking.

So how is lexical position marking implemented?

rollup leverages swc's capabilities to generate a standard aststructure. Before generating theast, the source code needs to undergo lexical analysis.

Additional Note

The implementation of lexical analysis can be referenced in acorn's lexical analysis implementation. However, in rollup, lexical analysis is implemented by swc, which generates a non-standard ast structure. rollup needs to make some adjustments to swc's translated aststructure to make it standard. However, even though it's not a standardast structure, the lexical position information is still provided.

During the lexical analysis phase, each ast node is recorded with its corresponding lexical position in the source code. In other words, the lexical analysis phase provides start and end index information for each ast node corresponding to its position in the source code.

With the lexical position information provided by the ast node, rollup can record additional lexical position information during the initialization of ast nodes (i.e., during the DFS backtracking phase) for subsequent magic-string to generate corresponding mapping information.

class NodeBase extends ExpressionEntity implements ExpressionNode {
  initialise(): void {
    this.scope.context.magicString.addSourcemapLocation(this.start);
    this.scope.context.magicString.addSourcemapLocation(this.end);
  }
}

Source Map ​

Principles of Code Mapping ​

Code Mapping Methods ​

Implementation Principles ​

Relationship between rollup and magic-string ​

rollup's mapping Recording Method ​