Magic String

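How it works

Background

At its core, magic-string is a lightweight tool for quickly generating the mapping between modified source code and the original source. It is simple to use and suited to light source edits such as insertions, deletions, and replacements, after which it can still quickly generate a sourcemap.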

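Additional notes

By comparison, recast modifies source code by manipulating the abstract syntax tree (AST) produced by a parser. That adds the learning cost of working with ASTs and also brings a performance cost.

How other tools use it

In Vite, magic-string takes part in code rewriting inside plugins.
In Rollup, magic-string maintains the mapping between a chunk and the source code, so no matter how many times the source has been transformed, the original position in the source can still be found. Rollup also uses magic-string as the code modification tool for tree shaking.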

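Core design

Deciding what `mapping` information to record

The most basic question in source mapping is which characters need a mapping. The mapping information in a sourcemap can be recorded at the following granularities:

Word boundaries: a mapping is recorded at each word boundary in the code.
Every character: the most precise approach, recording a mapping for the coordinate of every single character.
Lexical positions: a mapping is recorded for each lexical position in the code.
Lines: the most basic approach, recording a mapping per line of code. Nothing less can be recorded without risking broken mappings.

All four approaches let a debugger relate the build output back to the source. They differ in mapping precision and in the size of the generated sourcemap. magic string implements all of them internally, with the following optimizations.

Only the position of each chunk's content is recorded. The positions of chunk.intro and chunk.outro are left out of the sourcemap; they only advance the line and column coordinates of the generated code.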

class Mappings {
  advance(str) {
    if (!str) return;

    const lines = str.split('\n');

    if (lines.length > 1) {
      for (let i = 0; i < lines.length - 1; i++) {
        this.generatedCodeLine++;
        this.raw[this.generatedCodeLine] = this.rawSegments = [];
      }
      this.generatedCodeColumn = 0;
    }

    this.generatedCodeColumn += lines[lines.length - 1].length;
  }
}
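
To see how `advance` moves the generated cursor without emitting any segment, here is a minimal standalone sketch. `CursorSketch` is a hypothetical name, not part of magic-string; it only mirrors the line and column bookkeeping shown above.

```javascript
// Hypothetical stand-in for the cursor bookkeeping in advance().
class CursorSketch {
  constructor() {
    this.generatedCodeLine = 0;
    this.generatedCodeColumn = 0;
    this.raw = [[]]; // one segment list per generated line
  }

  advance(str) {
    if (!str) return;
    const lines = str.split('\n');
    // Every newline starts a fresh, empty segment list.
    for (let i = 0; i < lines.length - 1; i++) {
      this.generatedCodeLine++;
      this.raw[this.generatedCodeLine] = [];
    }
    // After at least one newline the column restarts at 0.
    if (lines.length > 1) this.generatedCodeColumn = 0;
    this.generatedCodeColumn += lines[lines.length - 1].length;
  }
}

const cursor = new CursorSketch();
cursor.advance('/* banner */\n'); // e.g. an intro: moves the cursor, maps nothing
cursor.advance('let x');
console.log(cursor.generatedCodeLine, cursor.generatedCodeColumn); // 1 5
console.log(JSON.stringify(cursor.raw)); // [[],[]]
```

Both segment lists stay empty: intro and outro text shifts later mappings but contributes none of its own.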

If the content has been edited, the starting coordinate of every line of the content is recorded.

class Mappings {
  addEdit(sourceIndex, content, loc, nameIndex) {
    if (content.length) {
      let contentLineEnd = content.indexOf('\n', 0);
      let previousContentLineEnd = -1;
      while (contentLineEnd >= 0) {
        const segment = [
          this.generatedCodeColumn,
          sourceIndex,
          loc.line,
          loc.column
        ];
        if (nameIndex >= 0) {
          segment.push(nameIndex);
        }
        this.rawSegments.push(segment);

        this.generatedCodeLine += 1;
        this.raw[this.generatedCodeLine] = this.rawSegments = [];
        this.generatedCodeColumn = 0;

        previousContentLineEnd = contentLineEnd;
        contentLineEnd = content.indexOf('\n', contentLineEnd + 1);
      }

      const segment = [
        this.generatedCodeColumn,
        sourceIndex,
        loc.line,
        loc.column
      ];
      if (nameIndex >= 0) {
        segment.push(nameIndex);
      }
      this.rawSegments.push(segment);

      this.advance(content.slice(previousContentLineEnd + 1));
    } else if (this.pending) {
      this.rawSegments.push(this.pending);
      this.advance(content);
    }

    this.pending = null;
  }
}
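
The effect of `addEdit` on a multi-line replacement can be sketched as follows. `segmentsForEdit` is a hypothetical helper, it assumes non-empty content and a fixed source index of 0, and it ignores the `pending` and `nameIndex` handling shown above.

```javascript
// Hypothetical sketch: every generated line of an edited chunk gets one segment,
// and all of them point back at the same original location (the start of the edit).
function segmentsForEdit(content, loc, generatedColumn = 0) {
  return content.split('\n').map((_, lineIndex) => [
    // Only the first line continues at the current column; later lines start at 0.
    [lineIndex === 0 ? generatedColumn : 0, 0, loc.line, loc.column]
  ]);
}

const editSegments = segmentsForEdit('foo(\n  bar\n)', { line: 3, column: 8 }, 10);
console.log(JSON.stringify(editSegments));
// [[[10,0,3,8]],[[0,0,3,8]],[[0,0,3,8]]]
```

Because the replacement text has no counterpart in the original, pointing each line at the start of the edited range is the most precise statement that can be made about it.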

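If the content has not been edited, the starting coordinate of every line of the content is recorded, and in addition hires and sourcemapLocations control the precision at which the positions of individual characters in the content are recorded.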

class Mappings {
  addUneditedChunk(
    sourceIndex,
    chunk,
    original,
    loc,
    sourcemapLocations
  ) {
    let originalCharIndex = chunk.start;
    let first = true;
    // when iterating each char, check if it's in a word boundary
    let charInHiresBoundary = false;

    while (originalCharIndex < chunk.end) {
      if (
        this.hires ||
        first ||
        sourcemapLocations.has(originalCharIndex)
      ) {
        const segment = [
          this.generatedCodeColumn,
          sourceIndex,
          loc.line,
          loc.column
        ];

        if (this.hires === 'boundary') {
          // in hires "boundary", group segments per word boundary than per char
          if (wordRegex.test(original[originalCharIndex])) {
            // for first char in the boundary found, start the boundary by pushing a segment
            if (!charInHiresBoundary) {
              this.rawSegments.push(segment);
              charInHiresBoundary = true;
            }
          } else {
            // for non-word char, end the boundary by pushing a segment
            this.rawSegments.push(segment);
            charInHiresBoundary = false;
          }
        } else {
          this.rawSegments.push(segment);
        }
      }

      if (original[originalCharIndex] === '\n') {
        loc.line += 1;
        loc.column = 0;
        this.generatedCodeLine += 1;
        this.raw[this.generatedCodeLine] = this.rawSegments = [];
        this.generatedCodeColumn = 0;
        first = true;
      } else {
        loc.column += 1;
        this.generatedCodeColumn += 1;
        first = false;
      }

      originalCharIndex += 1;
    }

    this.pending = null;
  }
}

interface SourceMapOptions {
  /**
   * Whether the mapping should be high-resolution.
   * Hi-res mappings map every single character, meaning (for example) your devtools will always
   * be able to pinpoint the exact location of function calls and so on.
   * With lo-res mappings, devtools may only be able to identify the correct
   * line - but they're quicker to generate and less bulky.
   * You can also set `"boundary"` to generate a semi-hi-res mappings segmented per word boundary
   * instead of per character, suitable for string semantics that are separated by words.
   * If sourcemap locations have been specified with s.addSourceMapLocation(), they will be used here.
   */
  hires?: boolean | 'boundary';
}

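sourcemapLocations lets specific positions be marked for recording.
The hires option controls the precision of the recorded mappings.
1. When set to true, mappings are high-resolution: a mapping is recorded for every character, so function calls and similar constructs can be located exactly, at the cost of a very large set of mappings.
2. When set to false (the default), mappings are low-resolution: only lines are recorded, which is faster to generate and smaller.
3. When set to boundary, mappings are semi-high-resolution: word boundaries are recorded instead of every character, which suits string semantics that are separated by words.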
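
The difference between the three settings can be made concrete by counting segments for one unedited line. The sketch below follows the branching in `addUneditedChunk` shown earlier; `countSegments` is a hypothetical helper, `wordRegex` is assumed to be `/\w/`, and `sourcemapLocations` is left out.

```javascript
// Hypothetical helper: count the segments one unedited line would receive
// under each `hires` setting.
const wordRegex = /\w/;

function countSegments(line, hires) {
  let count = 0;
  let first = true;
  let charInHiresBoundary = false;
  for (const char of line) {
    if (hires || first) {
      if (hires === 'boundary') {
        if (wordRegex.test(char)) {
          // Only the first character of a word opens a segment.
          if (!charInHiresBoundary) {
            count++;
            charInHiresBoundary = true;
          }
        } else {
          // Every non-word character gets its own segment.
          count++;
          charInHiresBoundary = false;
        }
      } else {
        count++;
      }
    }
    first = false;
  }
  return count;
}

const sampleLine = 'foo(bar);';
console.log(countSegments(sampleLine, false)); // 1
console.log(countSegments(sampleLine, 'boundary')); // 5
console.log(countSegments(sampleLine, true)); // 9
```

For this line, boundary mode needs about half the segments of full hires while still separating `foo`, `(`, `bar`, `)`, and `;`.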

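How `rollup` uses `magic string` and processes source maps

The `magic string` instance in `rollup` and the source mapping flow

In Rollup, every module has its own magic string instance, which maintains the mapping between the modified code and the source code.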

class Module {
  async setSource({
    ast,
    code,
    customTransformCache,
    originalCode,
    originalSourcemap,
    resolvedIds,
    sourcemapChain,
    transformDependencies,
    transformFiles,
    ...moduleOptions
  }: TransformModuleJSON & {
    resolvedIds?: ResolvedIdMap;
    transformFiles?: EmittedFile[] | undefined;
  }): Promise<void> {
    // other logic omitted
    this.magicString = new MagicString(code, {
      filename: (this.excludeFromSourcemap ? null : fileName)!, // don't include plugin helpers in sourcemap
      indentExclusionRanges: []
    });
    // other logic omitted
  }
}

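The magic string instance is initialized in the setSource method. Note that Rollup calls setSource only after all plugins' load and transform hooks have run, so code here is already the output of every plugin. If any of them changed the structure of the code, mapping information has to be provided.

The simplest example is Rollup handling a .ts module through the @rollup/plugin-typescript plugin, which relies on the typescript package to transpile the .ts module into a .js module. Since this changes the code structure, the transpilation necessarily produces mapping information.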

// @rollup/plugin-typescript

function findTypescriptOutput(
  ts,
  parsedOptions,
  id,
  emittedFiles,
  tsCache
) {
  const emittedFileNames = ts.getOutputFileNames(
    parsedOptions,
    id,
    !ts.sys.useCaseSensitiveFileNames
  );
  const codeFile = emittedFileNames.find(isCodeOutputFile);
  const mapFile = emittedFileNames.find(isMapOutputFile);
  return {
    code: getEmittedFile(codeFile, emittedFiles, tsCache),
    map: getEmittedFile(mapFile, emittedFiles, tsCache),
    declarations: emittedFileNames.filter(
      name => name !== codeFile && name !== mapFile
    )
  };
}
function typescript() {
  return {
    name: 'typescript',
    async load(id) {
      if (!filter(id)) return null;
      this.addWatchFile(id);
      await watchProgramHelper.wait();
      const fileName = normalizePath(id);
      if (!parsedOptions.fileNames.includes(fileName)) {
        // Discovered new file that was not known when originally parsing the TypeScript config
        parsedOptions.fileNames.push(fileName);
      }
      const output = findTypescriptOutput(
        ts,
        parsedOptions,
        id,
        emittedFiles,
        tsCache
      );
      return output.code != null ? output : null;
    }
  };
}

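In its load hook, @rollup/plugin-typescript passes Rollup the mapping information that corresponds to transpiling the .ts module into a .js module.

So how does Rollup handle this map information?

A map produced by the load hook (if the code structure changed) is stored in module.originalSourcemap.

Maps produced by transform hooks (if the code structure changed) are stored in module.sourcemapChain.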

async function transform(
  source: SourceDescription,
  module: Module,
  pluginDriver: PluginDriver,
  log: LogHandler
): Promise<TransformModuleJSON> {
  const id = module.id;
  const sourcemapChain: DecodedSourceMapOrMissing[] = [];

  let originalSourcemap =
    source.map === null ? null : decodedSourcemap(source.map);
  const originalCode = source.code;
  let ast = source.ast;
  const transformDependencies: string[] = [];
  const emittedFiles: EmittedFile[] = [];
  let customTransformCache = false;
  const useCustomTransformCache = () => (customTransformCache = true);
  let pluginName = '';
  let currentSource = source.code;

  function transformReducer(
    this: PluginContext,
    previousCode: string,
    result: TransformResult,
    plugin: Plugin
  ): string {
    let code: string;
    let map:
      | string
      | ExistingRawSourceMap
      | { mappings: '' }
      | null
      | undefined;
    if (typeof result === 'string') {
      code = result;
    } else if (result && typeof result === 'object') {
      module.updateOptions(result);
      if (result.code == null) {
        if (result.map || result.ast) {
          log(
            LOGLEVEL_WARN,
            logNoTransformMapOrAstWithoutCode(plugin.name)
          );
        }
        return previousCode;
      }
      ({ code, map, ast } = result);
    } else {
      return previousCode;
    }

    // strict null check allows 'null' maps to not be pushed to the chain,
    // while 'undefined' gets the missing map warning
    if (map !== null) {
      sourcemapChain.push(
        decodedSourcemap(
          typeof map === 'string' ? JSON.parse(map) : map
        ) || {
          missing: true,
          plugin: plugin.name
        }
      );
    }

    currentSource = code;

    return code;
  }
  // other logic omitted
  return {
    ast,
    code,
    customTransformCache,
    originalCode,
    originalSourcemap,
    sourcemapChain,
    transformDependencies
  };
}
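
The strict null check in `transformReducer` is easy to miss, so here is a small sketch of just that decision. `pushToChain` is a hypothetical function, and the `decodedSourcemap` call is omitted for brevity.

```javascript
// Hypothetical reducer step: decide what a plugin's `map` contributes to the chain.
function pushToChain(sourcemapChain, pluginName, map) {
  // A `null` map is deliberately not pushed to the chain.
  if (map === null) return;
  // `undefined` (no map returned at all) is recorded as a missing map,
  // which later triggers the broken-sourcemap warning.
  sourcemapChain.push(map || { missing: true, plugin: pluginName });
}

const chain = [];
pushToChain(chain, 'plugin-a', null);
pushToChain(chain, 'plugin-b', undefined);
pushToChain(chain, 'plugin-c', { mappings: [[[0, 0, 0, 0]]], names: [], sources: ['a.js'] });

console.log(chain.length); // 2
console.log(JSON.stringify(chain[0])); // {"missing":true,"plugin":"plugin-b"}
```

The plugin names here are placeholders; the point is only that `null` and `undefined` lead to different entries in the chain.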

The Module instance does some light processing of the mapping information produced in the load phase (stored in originalSourcemap) and in the transform phase (stored in sourcemapChain), in preparation for generating the final mapping later.

class Module {
  async setSource({
    ast,
    code,
    customTransformCache,
    originalCode,
    originalSourcemap,
    resolvedIds,
    sourcemapChain,
    transformDependencies,
    transformFiles,
    ...moduleOptions
  }: TransformModuleJSON & {
    resolvedIds?: ResolvedIdMap;
    transformFiles?: EmittedFile[] | undefined;
  }): Promise<void> {
    // We need to call decodedSourcemap on the input in case they were hydrated from json in the cache and don't
    // have the lazy evaluation cache configured. Right now this isn't enforced by the type system because the
    // RollupCache stores `ExistingDecodedSourcemap` instead of `ExistingRawSourcemap`
    this.originalSourcemap = decodedSourcemap(originalSourcemap);
    this.sourcemapChain = sourcemapChain.map(mapOrMissing =>
      mapOrMissing.missing ? mapOrMissing : decodedSourcemap(mapOrMissing)
    );

    // If coming from cache and this value is already fully decoded, we want to re-encode here to save memory.
    resetSourcemapCache(this.originalSourcemap, this.sourcemapChain);
  }
}

Now back to the logic from before:

class Module {
  async setSource({
    ast,
    code,
    customTransformCache,
    originalCode,
    originalSourcemap,
    resolvedIds,
    sourcemapChain,
    transformDependencies,
    transformFiles,
    ...moduleOptions
  }: TransformModuleJSON & {
    resolvedIds?: ResolvedIdMap;
    transformFiles?: EmittedFile[] | undefined;
  }): Promise<void> {
    // other logic omitted
    this.magicString = new MagicString(code, {
      filename: (this.excludeFromSourcemap ? null : fileName)!, // don't include plugin helpers in sourcemap
      indentExclusionRanges: []
    });
    // other logic omitted
  }
}

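As established above, code in this logic holds the result of the load and transform hooks. If the shape of the source changed, the mapping information produced by the corresponding plugins is stored on the current module instance (module.originalSourcemap and module.sourcemapChain). All later edits are made against this version of code, which is not necessarily the original source.

The structure of `magic string` and how it maintains mappings

As the structure shows, a magic string instance maintains multiple chunk instances connected as a doubly linked list. Each chunk has three parts: intro (content added before the chunk), content (what the chunk currently contains), and outro (content added after the chunk). When the source is edited, magic string creates new chunk instances according to the edited range and stores the new content in the content property of the corresponding chunk.

Why a linked list

A mapping records the following position information:

The line of the position after modification
The column of the position after modification
The path of the original copy
The line of the corresponding position in the source
The column of the corresponding position in the source

Together these express the mapping between a position in the modified code and a position in the source at that path.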
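
As an illustration of these five pieces of information, the sketch below shows a decoded mappings array of the kind built in `raw` above. The file name, the sample values, and the `describeSegment` helper are made up for the example. Note that the generated line is not stored in the segment; it is the index of the segment's line in the outer array.

```javascript
// Decoded mappings: one entry per generated line, each a list of segments.
// Segment layout: [generatedColumn, sourceIndex, sourceLine, sourceColumn, nameIndex?]
const decoded = {
  sources: ['src/a.js'],
  names: ['a'],
  mappings: [
    [[0, 0, 0, 0], [6, 0, 0, 6, 0]], // generated line 0
    [] // generated line 1 has no mappings
  ]
};

// Hypothetical helper that spells a segment out in words.
function describeSegment(map, generatedLine, segment) {
  const [generatedColumn, sourceIndex, sourceLine, sourceColumn, nameIndex] = segment;
  const name = nameIndex === undefined ? '' : ` (${map.names[nameIndex]})`;
  return `${generatedLine}:${generatedColumn} -> ${map.sources[sourceIndex]} ${sourceLine}:${sourceColumn}${name}`;
}

console.log(describeSegment(decoded, 0, decoded.mappings[0][1]));
// 0:6 -> src/a.js 0:6 (a)
```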

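The first two items (line and column after modification) describe the modified code. They are easy to record by walking every character and tracking the line and column of the modified code.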

for (
  let originalCharIndex = 0;
  originalCharIndex < original.length;
  originalCharIndex++
) {
  if (original[originalCharIndex] === '\n') {
    this.generatedCodeLine += 1;
    this.raw[this.generatedCodeLine] = this.rawSegments = [];
    this.generatedCodeColumn = 0;
  } else {
    this.generatedCodeColumn += 1;
  }
}

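For Rollup, the path of the original copy is simply the file path of the source, which is readily available.
The corresponding line and column in the original copy describe the code before modification, and these are what magic string has to maintain.
So how does magic string maintain the mapping between the modified code and the original copy?
The key observation is that when the characters in [a, b) are edited, the only thing that stays fixed, whatever the length of the replacement, is the start position a.
magic string treats the length of the original copy of the code as the total length covered by all chunks. In other words, every later edit is made relative to the original copy. Because each edit keeps the start of the edited content aligned with the edited position in the original copy, the chunks preserve the mapping between edited content and the original copy, and it is easy to map the start position a in the original copy to the start position a' after the edit.
Note
[a, b) means that this edit targets the range [a, b) of the original copy, so that range can no longer be edited. If a second edit [a', b') falls partly inside it, that is (a' > a && a' < b) || (b' > a && b' < b), then [a', b') would effectively be an edit relative to the previous edit of [a, b), which is incorrect. In that case magic string throws the following error: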
```js
class MagicString {
  _splitChunk(chunk, index) {
    if (chunk.edited && chunk.content.length) {
      // zero-length edited chunks are a special case (overlapping replacements)
      const loc = getLocator(this.original)(index);
      throw new Error(
        `Cannot split a chunk that has already been edited (${loc.line}:${loc.column} – "${chunk.original}")`
      );
    }
    // other logic omitted
  }
}
```

How `magic string` splits a `chunk`

A simple example:

const magicString = new MagicString('const a = 1;');
magicString.update(6, 7, 'variableB');
console.log(magicString.toString()); // output: const variableB = 1;

On initialization, magic string has a single chunk instance that holds the information of the original copy of the code.

chunk(firstChunk | lastChunk)

const chunk = {
  start: 0,
  end: 12,
  original: 'const a = 1;',
  content: 'const a = 1;',
  intro: '',
  outro: '',
  next: null,
  previous: null,
  toString() {
    return this.intro + this.content + this.outro;
  }
};
magicString.firstChunk = chunk;
magicString.lastChunk = chunk;

The chunk is split at index 6, the start of the range [6, 7), which creates chunk2. magic string now holds two chunk instances.

chunk1(firstChunk) => chunk2(lastChunk)

const chunk2 = {
  start: 6,
  end: 12,
  original: 'a = 1;',
  content: 'a = 1;',
  intro: '',
  outro: '',
  next: null,
  previous: chunk1,
  toString() {
    return this.intro + this.content + this.outro;
  }
};
const chunk1 = {
  start: 0,
  end: 6,
  original: 'const ',
  content: 'const ',
  intro: '',
  outro: '',
  next: chunk2,
  previous: null,
  toString() {
    return this.intro + this.content + this.outro;
  }
};
magicString.firstChunk = chunk1;
magicString.lastChunk = chunk2;

The chunk is split again at index 7, the end of the range [6, 7), which creates chunk3. magic string now holds three chunk instances.

chunk1(firstChunk) => chunk2 => chunk3(lastChunk)

const chunk3 = {
  start: 7,
  end: 12,
  original: ' = 1;',
  content: ' = 1;',
  intro: '',
  outro: '',
  next: null,
  previous: chunk2,
  toString() {
    return this.intro + this.content + this.outro;
  }
};
const chunk2 = {
  start: 6,
  end: 7,
  original: 'a',
  content: 'a',
  intro: '',
  outro: '',
  next: chunk3,
  previous: chunk1,
  toString() {
    return this.intro + this.content + this.outro;
  }
};
const chunk1 = {
  start: 0,
  end: 6,
  original: 'const ',
  content: 'const ',
  intro: '',
  outro: '',
  next: chunk2,
  previous: null,
  toString() {
    return this.intro + this.content + this.outro;
  }
};
magicString.firstChunk = chunk1;
magicString.lastChunk = chunk3;

The update operation then edits every chunk inside the range; in this example there is only one.

const chunk2 = {
  start: 6,
  end: 7,
  original: 'a',
  content: 'a',
  intro: '',
  outro: '',
  next: chunk3,
  previous: chunk1,
  toString() {
    return this.intro + this.content + this.outro;
  }
};
chunk2.content = 'variableB';

Calling toString walks every chunk in the magic string and concatenates each chunk's intro, content, and outro to produce the final code.

const chunk3 = {
  start: 7,
  end: 12,
  original: ' = 1;',
  content: ' = 1;',
  intro: '',
  outro: '',
  next: null,
  previous: chunk2,
  toString() {
    return this.intro + this.content + this.outro;
  }
};
const chunk2 = {
  start: 6,
  end: 7,
  original: 'a',
  content: 'variableB',
  intro: '',
  outro: '',
  next: chunk3,
  previous: chunk1,
  toString() {
    return this.intro + this.content + this.outro;
  }
};
const chunk1 = {
  start: 0,
  end: 6,
  original: 'const ',
  content: 'const ',
  intro: '',
  outro: '',
  next: chunk2,
  previous: null,
  toString() {
    return this.intro + this.content + this.outro;
  }
};
magicString.firstChunk = chunk1;
magicString.lastChunk = chunk3;

const result = chunk1.toString() + chunk2.toString() + chunk3.toString();
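
The whole split, update, and toString flow can be condensed into a small model. This is a simplified sketch, not magic-string's actual implementation: `ChunkSketch` and `MiniMagicString` are hypothetical names, the list is singly linked, intro and outro are left out, and any split inside an edited chunk is rejected.

```javascript
// Hypothetical, heavily simplified model of chunk splitting.
class ChunkSketch {
  constructor(start, end, content) {
    this.start = start; // offsets always refer to the original code
    this.end = end;
    this.original = content;
    this.content = content;
    this.edited = false;
    this.next = null;
  }
}

class MiniMagicString {
  constructor(code) {
    this.firstChunk = new ChunkSketch(0, code.length, code);
  }

  // Make sure a chunk boundary exists at `index` of the original code.
  split(index) {
    for (let chunk = this.firstChunk; chunk; chunk = chunk.next) {
      if (index <= chunk.start || index >= chunk.end) continue;
      if (chunk.edited) {
        throw new Error('Cannot split a chunk that has already been edited');
      }
      const offset = index - chunk.start;
      const tail = new ChunkSketch(index, chunk.end, chunk.original.slice(offset));
      tail.next = chunk.next;
      chunk.end = index;
      chunk.original = chunk.content = chunk.original.slice(0, offset);
      chunk.next = tail;
      return;
    }
  }

  // Replace the original range [start, end); assumes start < end.
  update(start, end, content) {
    this.split(start);
    this.split(end);
    let first = true;
    for (let chunk = this.firstChunk; chunk; chunk = chunk.next) {
      if (chunk.start >= start && chunk.end <= end) {
        chunk.content = first ? content : '';
        chunk.edited = true;
        first = false;
      }
    }
    return this;
  }

  toString() {
    let out = '';
    for (let chunk = this.firstChunk; chunk; chunk = chunk.next) out += chunk.content;
    return out;
  }
}

const mini = new MiniMagicString('const a = 1;');
mini.update(6, 7, 'variableB');
console.log(mini.toString()); // const variableB = 1;

const overlapping = new MiniMagicString('const abc = 1;').update(6, 9, 'x');
let overlapError = null;
try {
  overlapping.update(7, 8, 'y'); // falls inside the already edited [6, 9)
} catch (error) {
  overlapError = error.message;
}
console.log(overlapError); // Cannot split a chunk that has already been edited
```

Since `start` and `end` never change once a chunk exists, every chunk keeps a fixed anchor in the original code regardless of how long its replacement content becomes.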

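What Magic String Bundle does

In Rollup, a chunk consists of one or more modules, and each module maintains its own magic string instance. A chunk therefore corresponds to one or more magic string instances, as illustrated below:

Each magic string is independent and describes the mapping between one module and its original copy of the code. To produce the mapping between a chunk and the original copies, the magic string instances have to be merged. That is the job of magic string bundle: from the chunk's point of view, it merges the independent magic string instances into a single result that represents the mapping between the chunk and the original copies of the code.

How `Rollup` integrates `magic string bundle`

Before going further, here is how Rollup integrates magic string bundle. Below is a simplified version of what Rollup does in the render chunks phase: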

import { Bundle as MagicStringBundle } from 'magic-string';
class Module {
  render(options: RenderOptions): {
    source: MagicString;
    usesTopLevelAwait: boolean;
  } {
    const source = this.magicString.clone();
    this.ast!.render(source, options);
    source.trim();
    const { usesTopLevelAwait } = this.astContext;
    if (
      usesTopLevelAwait &&
      options.format !== 'es' &&
      options.format !== 'system'
    ) {
      return error(
        logInvalidFormatForTopLevelAwait(this.id, options.format)
      );
    }
    return { source, usesTopLevelAwait };
  }
}
class Chunk {
  private renderModules(fileName: string) {
    const magicString = new MagicStringBundle({ separator: `${n}${n}` });
    for (const module of orderedModules) {
      let source: MagicString | undefined;
      if (module.isIncluded() || includedNamespaces.has(module)) {
        const rendered = module.render(renderOptions);
        ({ source } = rendered);
        renderedLength = source.length();
        if (renderedLength) {
          magicString.addSource(source);
        }
        const namespace = module.namespace;
        if (includedNamespaces.has(module)) {
          const rendered = namespace.renderBlock(renderOptions);
          if (namespace.renderFirst()) hoistedSource += n + rendered;
          else magicString.addSource(new MagicString(rendered));
        }
      }
    }
  }
}

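In renderModules, Rollup iterates over the modules in orderedModules and calls module.render, which prunes and edits the code (a magic string) to obtain the executable code (still a magic string). The magic string bundle then adds each module's optimized code one by one through addSource. Each of these magic strings represents the mapping between the module's modified code (after tree-shaking pruning and other edits) and the original copy of the code.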

class Bundle {
  addSource(source) {
    if (source instanceof MagicString) {
      return this.addSource({
        content: source,
        filename: source.filename,
        separator: this.separator
      });
    }

    if (!isObject(source) || !source.content) {
      throw new Error(
        'bundle.addSource() takes an object with a `content` property, which should be an instance of MagicString, and an optional `filename`'
      );
    }

    [
      'filename',
      'ignoreList',
      'indentExclusionRanges',
      'separator'
    ].forEach(option => {
      if (!hasOwnProp.call(source, option))
        source[option] = source.content[option];
    });

    if (source.separator === undefined) {
      // TODO there's a bunch of this sort of thing, needs cleaning up
      source.separator = this.separator;
    }

    if (source.filename) {
      if (
        !hasOwnProp.call(this.uniqueSourceIndexByFilename, source.filename)
      ) {
        this.uniqueSourceIndexByFilename[source.filename] =
          this.uniqueSources.length;
        this.uniqueSources.push({
          filename: source.filename,
          content: source.content.original
        });
      } else {
        const uniqueSource =
          this.uniqueSources[
            this.uniqueSourceIndexByFilename[source.filename]
          ];
        if (source.content.original !== uniqueSource.content) {
          throw new Error(
            `Illegal source: same filename (${source.filename}), different contents`
          );
        }
      }
    }

    this.sources.push(source);
    return this;
  }
}

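addSource is straightforward: it collects the optimized magic string instances of all modules in the chunk as the sources of the magic string bundle and stores them in the sources array. Meanwhile uniqueSourceIndexByFilename records the file paths of the original copies, in preparation for generating the mapping later.

How the `mapping` is generated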

class Bundle {
  generateDecodedMap(options = {}) {
    const names = [];
    let x_google_ignoreList = undefined;
    this.sources.forEach(source => {
      Object.keys(source.content.storedNames).forEach(name => {
        if (!~names.indexOf(name)) names.push(name);
      });
    });

    const mappings = new Mappings(options.hires);

    if (this.intro) {
      mappings.advance(this.intro);
    }

    this.sources.forEach((source, i) => {
      if (i > 0) {
        mappings.advance(this.separator);
      }

      const sourceIndex = source.filename
        ? this.uniqueSourceIndexByFilename[source.filename]
        : -1;
      const magicString = source.content;
      const locate = getLocator(magicString.original);

      if (magicString.intro) {
        mappings.advance(magicString.intro);
      }

      magicString.firstChunk.eachNext(chunk => {
        const loc = locate(chunk.start);

        if (chunk.intro.length) mappings.advance(chunk.intro);

        if (source.filename) {
          if (chunk.edited) {
            mappings.addEdit(
              sourceIndex,
              chunk.content,
              loc,
              chunk.storeName ? names.indexOf(chunk.original) : -1
            );
          } else {
            mappings.addUneditedChunk(
              sourceIndex,
              chunk,
              magicString.original,
              loc,
              magicString.sourcemapLocations
            );
          }
        } else {
          mappings.advance(chunk.content);
        }

        if (chunk.outro.length) mappings.advance(chunk.outro);
      });

      if (magicString.outro) {
        mappings.advance(magicString.outro);
      }

      if (source.ignoreList && sourceIndex !== -1) {
        if (x_google_ignoreList === undefined) {
          x_google_ignoreList = [];
        }
        x_google_ignoreList.push(sourceIndex);
      }
    });

    return {
      file: options.file ? options.file.split(/[/\\]/).pop() : undefined,
      sources: this.uniqueSources.map(source => {
        return options.file
          ? getRelativePath(options.file, source.filename)
          : source.filename;
      }),
      sourcesContent: this.uniqueSources.map(source => {
        return options.includeContent ? source.content : null;
      }),
      names,
      mappings: mappings.raw,
      x_google_ignoreList
    };
  }
}

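The logic is straightforward: the mapping state is initialized once per chunk, and the magic strings added through addSource above (each maintaining the mapping between a transformed module and the original copy of its code) are walked in order to produce the mapping between the chunk and the original copies of the code.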
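
The merging idea can be sketched without the chunk walk. The example below is a simplification under stated assumptions: `bundleSketch` is a hypothetical function, each source already carries decoded per-line mappings (one entry per line of its code), and sources are joined by a separator made only of newlines.

```javascript
// Hypothetical sketch: join rendered sources and re-index their mappings.
function bundleSketch(sources, blankLinesBetween = 1) {
  const separator = '\n'.repeat(blankLinesBetween + 1);
  const mappings = [];
  sources.forEach((source, sourceIndex) => {
    // The separator only advances the generated line; it adds empty mapping lines.
    if (sourceIndex > 0) {
      for (let i = 0; i < blankLinesBetween; i++) mappings.push([]);
    }
    for (const line of source.mappings) {
      // Rewrite the source index so it points into the bundle's `sources` array.
      mappings.push(
        line.map(([generatedColumn, , sourceLine, sourceColumn]) => [
          generatedColumn,
          sourceIndex,
          sourceLine,
          sourceColumn
        ])
      );
    }
  });
  return {
    code: sources.map(source => source.code).join(separator),
    sources: sources.map(source => source.filename),
    mappings
  };
}

const bundled = bundleSketch([
  { filename: 'a.js', code: 'const a = 1;', mappings: [[[0, 0, 0, 0]]] },
  { filename: 'b.js', code: 'const b = 2;', mappings: [[[0, 0, 0, 0]]] }
]);
console.log(bundled.code.split('\n').length); // 3
console.log(JSON.stringify(bundled.mappings)); // [[[0,0,0,0]],[],[[0,1,0,0]]]
```

The second module's segment keeps its own source line and column but lands on generated line 2 with source index 1, which is the shift that `mappings.advance(this.separator)` and `uniqueSourceIndexByFilename` achieve in the code above.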

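At this point, magic string bundle has merged multiple magic string instances into a single result, the mapping between the chunk and the original copies of the code. But that is not the end. As described earlier, the original copy of the code is what comes out of Rollup's load and transform hooks, and the information that maps this original copy back to the real source is stored in the module's originalSourcemap and sourcemapChain. These two pieces are needed to generate the final mapping.

How rollup generates the final `mapping`

In the render chunks logic above, Rollup iterates over all modules in the chunk and runs chunk.render, which creates a magic string bundle instance to collect the executable code (magic strings) contained in the chunk and to generate the mapping between the chunk and the original copies of the code.

Next comes the transform chunks logic, which triggers the renderChunk plugin hook. Note that renderChunk hooks may also change the code structure and may therefore produce a map. As with the transform hook, each newly produced map is collected into a sourcemapChain, in preparation for generating the final mapping.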

async function transformChunk(
  magicString: MagicStringBundle,
  fileName: string,
  usedModules: Module[],
  chunkGraph: Record<string, RenderedChunk>,
  options: NormalizedOutputOptions,
  outputPluginDriver: PluginDriver,
  log: LogHandler
) {
  let map: SourceMap | null = null;
  const sourcemapChain: DecodedSourceMapOrMissing[] = [];
  const code = await outputPluginDriver.hookReduceArg0(
    'renderChunk',
    [
      magicString.toString(),
      chunkGraph[fileName],
      options,
      { chunks: chunkGraph }
    ],
    (code, result, plugin) => {
      if (result == null) return code;

      if (typeof result === 'string')
        result = {
          code: result,
          map: undefined
        };

      // strict null check allows 'null' maps to not be pushed to the chain, while 'undefined' gets the missing map warning
      if (result.map !== null) {
        const map = decodedSourcemap(result.map);
        sourcemapChain.push(map || { missing: true, plugin: plugin.name });
      }

      return result.code;
    }
  );
  // other logic omitted
}

Rollup then merges all of the maps through the collapseSourcemaps method to produce the final mapping.

async function transformChunk(
  magicString: MagicStringBundle,
  fileName: string,
  usedModules: Module[],
  chunkGraph: Record<string, RenderedChunk>,
  options: NormalizedOutputOptions,
  outputPluginDriver: PluginDriver,
  log: LogHandler
) {
  // other logic omitted
  if (sourcemap) {
    timeStart('sourcemaps', 3);

    let resultingFile: string;
    if (file) resultingFile = resolve(sourcemapFile || file);
    else if (dir) resultingFile = resolve(dir, fileName);
    else resultingFile = resolve(fileName);

    const decodedMap = magicString.generateDecodedMap({});
    map = collapseSourcemaps(
      resultingFile,
      decodedMap,
      usedModules,
      sourcemapChain,
      sourcemapExcludeSources,
      log
    );
    for (
      let sourcesIndex = 0;
      sourcesIndex < map.sources.length;
      ++sourcesIndex
    ) {
      let sourcePath = map.sources[sourcesIndex];
      const sourcemapPath = `${resultingFile}.map`;
      const ignoreList = sourcemapIgnoreList(sourcePath, sourcemapPath);
      if (typeof ignoreList !== 'boolean') {
        error(
          logFailedValidation(
            'sourcemapIgnoreList function must return a boolean.'
          )
        );
      }
      if (ignoreList) {
        if (map.x_google_ignoreList === undefined) {
          map.x_google_ignoreList = [];
        }
        if (!map.x_google_ignoreList.includes(sourcesIndex)) {
          map.x_google_ignoreList.push(sourcesIndex);
        }
      }
      if (sourcemapPathTransform) {
        sourcePath = sourcemapPathTransform(sourcePath, sourcemapPath);
        if (typeof sourcePath !== 'string') {
          error(
            logFailedValidation(
              `sourcemapPathTransform function must return a string.`
            )
          );
        }
      }
      map.sources[sourcesIndex] = normalize(sourcePath);
    }

    timeEnd('sourcemaps', 3);
  }
  return {
    code,
    map
  };
}

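Here const decodedMap = magicString.generateDecodedMap({}) generates the mapping information of the magic string bundle, which represents the mapping between the chunk and the original copies of the code.

Flow chart of the magic string bundle generation phase

Taking a .ts module as an example, this generates the mapping information of the magic string bundle.

collapseSourcemaps merges the maps produced by the load, transform, and renderChunk plugin hooks into the final mapping. Its merging logic is shown next.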

function getCollapsedSourcemap(
  id: string,
  originalCode: string,
  originalSourcemap: ExistingDecodedSourceMap | null,
  sourcemapChain: readonly DecodedSourceMapOrMissing[],
  linkMap: (source: Source | Link, map: DecodedSourceMapOrMissing) => Link
): Source | Link {
  let source: Source | Link;

  if (originalSourcemap) {
    const sources = originalSourcemap.sources;
    const sourcesContent = originalSourcemap.sourcesContent || [];
    const directory = dirname(id) || '.';
    const sourceRoot = originalSourcemap.sourceRoot || '.';

    const baseSources = sources.map(
      (source, index) =>
        new Source(
          resolve(directory, sourceRoot, source),
          sourcesContent[index]
        )
    );
    source = new Link(originalSourcemap, baseSources);
  } else {
    source = new Source(id, originalCode);
  }
  return sourcemapChain.reduce(linkMap, source);
}
function getLinkMap(log: LogHandler) {
  return function linkMap(
    source: Source | Link,
    map: DecodedSourceMapOrMissing
  ): Link {
    if (!map.missing) {
      return new Link(map, [source]);
    }

    log(LOGLEVEL_WARN, logSourcemapBroken(map.plugin));

    return new Link(
      {
        mappings: [],
        names: []
      },
      [source]
    );
  };
}
function collapseSourcemaps(
  file: string,
  map: Omit<DecodedSourceMap, 'sourcesContent'> & {
    sourcesContent: (string | null)[];
  },
  modules: readonly Module[],
  bundleSourcemapChain: readonly DecodedSourceMapOrMissing[],
  excludeContent: boolean | undefined,
  log: LogHandler
): SourceMap {
  const linkMap = getLinkMap(log);
  const moduleSources = modules
    .filter(module => !module.excludeFromSourcemap)
    .map(module =>
      getCollapsedSourcemap(
        module.id,
        module.originalCode,
        module.originalSourcemap,
        module.sourcemapChain,
        linkMap
      )
    );

  const link = new Link(map, moduleSources);
  const source = bundleSourcemapChain.reduce(linkMap, link);
  // some logic omitted
}

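The code above organizes the maps produced by the load, transform, and renderChunk plugin hooks.

In the source, link instances correspond to the blue parts of the figure below, and source instances to the purple parts.

The figure makes it clear that each generated map is relative to the previous state. We do not need these intermediate maps, only the map from the final output to the source. By organizing the relationships between the maps and following the chain shown in the figure, the map from the final output to the source can be generated easily. This is what source.traceMappings() does in the code below.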

function collapseSourcemaps(
  file: string,
  map: Omit<DecodedSourceMap, 'sourcesContent'> & {
    sourcesContent: (string | null)[];
  },
  modules: readonly Module[],
  bundleSourcemapChain: readonly DecodedSourceMapOrMissing[],
  excludeContent: boolean | undefined,
  log: LogHandler
): SourceMap {
  // other logic omitted
  let { sources, sourcesContent, names, mappings } = source.traceMappings();

  if (file) {
    const directory = dirname(file);
    sources = sources.map((source: string) => relative(directory, source));
    file = basename(file);
  }

  sourcesContent = (excludeContent ? null : sourcesContent) as string[];

  for (const module of modules) {
    resetSourcemapCache(module.originalSourcemap, module.sourcemapChain);
  }

  return new SourceMap({ file, mappings, names, sources, sourcesContent });
}

Here is a brief look at the implementation of traceMappings.

class Source {
  traceSegment(
    line: number,
    column: number,
    name: string
  ): SourceMapSegmentObject {
    return { column, line, name, source: this };
  }
}

class Link {
  traceMappings() {
    const sources: string[] = [];
    const sourceIndexMap = new Map<string, number>();
    const sourcesContent: string[] = [];
    const names: string[] = [];
    const nameIndexMap = new Map<string, number>();

    const mappings = [];

    for (const line of this.mappings) {
      const tracedLine: SourceMapSegment[] = [];

      for (const segment of line) {
        if (segment.length === 1) continue;
        const source = this.sources[segment[1]];
        if (!source) continue;

        const traced = source.traceSegment(
          segment[2],
          segment[3],
          segment.length === 5 ? this.names[segment[4]] : ''
        );

        if (traced) {
          const {
            column,
            line,
            name,
            source: { content, filename }
          } = traced;
          let sourceIndex = sourceIndexMap.get(filename);
          if (sourceIndex === undefined) {
            sourceIndex = sources.length;
            sources.push(filename);
            sourceIndexMap.set(filename, sourceIndex);
            sourcesContent[sourceIndex] = content;
          } else if (sourcesContent[sourceIndex] == null) {
            sourcesContent[sourceIndex] = content;
          } else if (
            content != null &&
            sourcesContent[sourceIndex] !== content
          ) {
            return error(logConflictingSourcemapSources(filename));
          }

          const tracedSegment: SourceMapSegment = [
            segment[0],
            sourceIndex,
            line,
            column
          ];

          if (name) {
            let nameIndex = nameIndexMap.get(name);
            if (nameIndex === undefined) {
              nameIndex = names.length;
              names.push(name);
              nameIndexMap.set(name, nameIndex);
            }

            (tracedSegment as SourceMapSegment)[4] = nameIndex;
          }

          tracedLine.push(tracedSegment);
        }
      }

      mappings.push(tracedLine);
    }

    return { mappings, names, sources, sourcesContent };
  }
  traceSegment(
    line: number,
    column: number,
    name: string
  ): SourceMapSegmentObject | null {
    const segments = this.mappings[line];
    if (!segments) return null;

    // binary search through segments for the given column
    let searchStart = 0;
    let searchEnd = segments.length - 1;

    while (searchStart <= searchEnd) {
      const m = (searchStart + searchEnd) >> 1;
      const segment = segments[m];

      // If a sourcemap does not have sufficient resolution to contain a
      // necessary mapping, e.g. because it only contains line information, we
      // use the best approximation we could find
      if (segment[0] === column || searchStart === searchEnd) {
        if (segment.length == 1) return null;
        const source = this.sources[segment[1]];
        if (!source) return null;

        return source.traceSegment(
          segment[2],
          segment[3],
          segment.length === 5 ? this.names[segment[4]] : name
        );
      }
      if (segment[0] > column) {
        searchEnd = m - 1;
      } else {
        searchStart = m + 1;
      }
    }

    return null;
  }
}

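The logic is straightforward: iterate over every segment and recursively follow the Link chain to determine which segment of the next map the current segment corresponds to (a binary search speeds up the lookup). The recursion ends when it reaches a Source instance. At that point the position is relative to the original source, which is exactly what we need. After the traversal, the resulting mappings (the relationship between the chunk and the original source) are passed to the SourceMap constructor to produce the final map. In its constructor, SourceMap encodes these mappings.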
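To make the recursion concrete, here is a minimal sketch of tracing one position through two stacked transforms. The classes below are simplified stand-ins written for illustration (no names, no sourcesContent, no conflict checks), not Rollup's actual Source and Link.

```javascript
// Hypothetical, simplified stand-ins for Rollup's Source and Link classes.
class Source {
  constructor(filename) {
    this.filename = filename;
  }
  // A Source ends the chain: the position is already in the original code.
  traceSegment(line, column) {
    return { line, column, source: this.filename };
  }
}

class Link {
  // mappings[line] is a list of
  // [generatedColumn, sourceIndex, sourceLine, sourceColumn],
  // sorted by generatedColumn.
  constructor(mappings, sources) {
    this.mappings = mappings;
    this.sources = sources;
  }
  traceSegment(line, column) {
    const segments = this.mappings[line];
    if (!segments) return null;
    let lo = 0;
    let hi = segments.length - 1;
    while (lo <= hi) {
      const mid = (lo + hi) >> 1;
      const segment = segments[mid];
      // Exact hit, or the range has collapsed to the best approximation.
      if (segment[0] === column || lo === hi) {
        const source = this.sources[segment[1]];
        return source ? source.traceSegment(segment[2], segment[3]) : null;
      }
      if (segment[0] > column) hi = mid - 1;
      else lo = mid + 1;
    }
    return null;
  }
}

// Two transforms stacked on one file: original -> step1 -> step2.
const original = new Source('main.js');
// step1 position 0:0 maps to original position 2:4.
const step1 = new Link([[[0, 0, 2, 4]]], [original]);
// step2 position 1:6 maps to step1 position 0:0.
const step2 = new Link([[], [[0, 0, 0, 0], [6, 0, 0, 0]]], [step1]);

const traced = step2.traceSegment(1, 6);
console.log(traced); // { line: 2, column: 4, source: 'main.js' }
```

Each Link only knows about the map directly below it; the final coordinates emerge because every traceSegment call delegates to the next layer until a Source answers.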

import { encode } from '@jridgewell/sourcemap-codec';
class SourceMap {
  constructor(properties) {
    this.version = 3;
    this.file = properties.file;
    this.sources = properties.sources;
    this.sourcesContent = properties.sourcesContent;
    this.names = properties.names;
    this.mappings = encode(properties.mappings);
    if (typeof properties.x_google_ignoreList !== 'undefined') {
      this.x_google_ignoreList = properties.x_google_ignoreList;
    }
  }
}

At this point, the mapping work for the sourcemap is complete.

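Source Code Analysis

Overview

MagicString represents every modification as a chunk, and the chunks are connected in a doubly linked list. Each chunk consists of three parts: intro (content prepended to the chunk), content (the chunk's own content), and outro (content appended to the chunk). The final string is the concatenation of all chunks, as in the following example: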


firstChunk <=> chunk <=> lastChunk

newString =
    firstChunk.intro + firstChunk.content + firstChunk.outro +
    chunk.intro + chunk.content + chunk.outro +
    lastChunk.intro + lastChunk.content + lastChunk.outro;
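
The concatenation above can be sketched in a few lines. The Chunk class here is a hypothetical, stripped-down version that keeps only the fields needed for rendering; the helper names link and render are made up for this example.

```javascript
// Hypothetical, stripped-down Chunk: only what concatenation needs.
class Chunk {
  constructor(content) {
    this.intro = '';
    this.content = content;
    this.outro = '';
    this.previous = null;
    this.next = null;
  }
  toString() {
    return this.intro + this.content + this.outro;
  }
}

// Link chunks into a doubly linked list and return the head.
function link(chunks) {
  for (let i = 1; i < chunks.length; i++) {
    chunks[i - 1].next = chunks[i];
    chunks[i].previous = chunks[i - 1];
  }
  return chunks[0];
}

// Walk the list from the head and concatenate every chunk.
function render(firstChunk) {
  let out = '';
  for (let chunk = firstChunk; chunk; chunk = chunk.next) {
    out += chunk.toString();
  }
  return out;
}

const a = new Chunk('const ');
const b = new Chunk('x');
const c = new Chunk(' = 1;');
b.content = 'answer'; // "edit" the middle chunk
c.outro = '\n// done'; // append text after the last chunk
const rendered = render(link([a, b, c]));
console.log(rendered);
```

Because edits only touch the fields of individual chunks, the original string never has to be rebuilt until the final render.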

How It Works

Implementation of MagicString

Instantiation

export class MagicString {
  constructor(string, options = {}) {
    // Chunk instance covering the whole original source string
    const chunk = new Chunk(0, string.length, string);

    Object.defineProperties(this, {
      // Original string; later operations never change it.
      original: { writable: true, value: string },
      // Trailing string (content appended after the original string)
      outro: { writable: true, value: '' },
      // Leading string (content prepended before the original string)
      intro: { writable: true, value: '' },
      // First chunk instance
      firstChunk: { writable: true, value: chunk },
      // Last chunk instance
      lastChunk: { writable: true, value: chunk },
      // Most recently searched chunk, used to speed up chunk lookups.
      lastSearchedChunk: { writable: true, value: chunk },
      // Lookup table: start index -> chunk that starts there.
      byStart: { writable: true, value: {} },
      // Lookup table: end index -> chunk that ends there.
      byEnd: { writable: true, value: {} },
      // File name
      filename: { writable: true, value: options.filename },
      // Ranges excluded from indentation
      indentExclusionRanges: {
        writable: true,
        value: options.indentExclusionRanges
      },
      // Positions flagged for sourcemap mappings
      sourcemapLocations: { writable: true, value: new BitSet() },
      // Stored names
      storedNames: { writable: true, value: {} },
      // Indentation string
      indentStr: { writable: true, value: undefined },
      ignoreList: { writable: true, value: options.ignoreList }
    });

    if (DEBUG) {
      Object.defineProperty(this, 'stats', { value: new Stats() });
    }
    // Register the initial chunk in the lookup tables.
    this.byStart[0] = chunk;
    this.byEnd[string.length] = chunk;
  }
}

MagicString exposes many methods. We analyze update in detail as a representative example; the other methods work along similar lines.

MagicString.update

function update(start, end, content, options) {
  // The replacement content must be a string.
  if (typeof content !== 'string')
    throw new TypeError('replacement content must be a string');

  // Normalize negative start and end indices to non-negative ones.
  while (start < 0) start += this.original.length;
  while (end < 0) end += this.original.length;

  // Ensure the end index does not exceed the length of the original string.
  if (end > this.original.length)
    throw new Error('end is out of bounds');
  // A zero-length range cannot be overwritten, so an error is thrown.
  if (start === end)
    throw new Error(
      'Cannot overwrite a zero-length range – use appendLeft or prependRight instead'
    );

  if (DEBUG) this.stats.time('overwrite');

  /**
   * Split at start and at end so that one chunk begins at start
   * and one chunk ends at end, in preparation for the steps below.
   */
  this._split(start);
  this._split(end);

  // Handle the options argument (a bare `true` is a legacy form that triggers a warning).
  if (options === true) {
    if (!warned.storeName) {
      console.warn(
        'The final argument to magicString.overwrite(...) should be an options object. See https://github.com/rich-harris/magic-string'
      );
      warned.storeName = true;
    }

    options = { storeName: true };
  }

  // Read the storeName and overwrite options.
  const storeName = options !== undefined ? options.storeName : false;
  const overwrite = options !== undefined ? options.overwrite : false;

  // If storeName is set, record the original text in storedNames.
  if (storeName) {
    const original = this.original.slice(start, end);
    Object.defineProperty(this.storedNames, original, {
      writable: true,
      value: true,
      enumerable: true
    });
  }
  // Chunks at which the edited range starts and ends.
  const first = this.byStart[start];
  const last = this.byEnd[end];

  // If first exists, it alone carries the new content for the start..end range; every other chunk in the range is set to the empty string.
  if (first) {
    let chunk = first;
    while (chunk !== last) {
      if (chunk.next !== this.byStart[chunk.end]) {
        throw new Error('Cannot overwrite across a split point');
      }
      chunk = chunk.next;
      chunk.edit('', false);
    }

    first.edit(content, storeName, !overwrite);
  } else {
    // must be inserting at the end
    const newChunk = new Chunk(start, end, '').edit(
      content,
      storeName
    );

    // TODO last chunk in the array may not be the last chunk, if it's moved...
    last.next = newChunk;
    newChunk.previous = last;
  }

  if (DEBUG) this.stats.timeEnd('overwrite');
  return this;
}

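As shown, update splits chunks at the start and end indices. This is the core idea of the magic-string package: the complete source is represented by one or more chunks, and every operation first ensures that a chunk boundary exists at each index it touches. Next, let's see how chunks are split.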

MagicString._split

function _split(index) {
  // A chunk boundary already exists at this index, so no split is needed; return early.
  if (this.byStart[index] || this.byEnd[index]) return;

  if (DEBUG) this.stats.time('_split');

  // Start from lastSearchedChunk, which remembers where the previous search ended.
  let chunk = this.lastSearchedChunk;
  // Decide whether the target lies after or before the remembered chunk.
  const searchForward = index > chunk.end;

  while (chunk) {
    // If the index falls inside the current chunk, split that chunk.
    if (chunk.contains(index)) return this._splitChunk(chunk, index);
    // Otherwise keep searching: forward if the target is after lastSearchedChunk, backward if it is before.
    chunk = searchForward
      ? this.byStart[chunk.end]
      : this.byEnd[chunk.start];
  }
}

The actual splitting happens in this._splitChunk(chunk, index), which we examine next.

MagicString._splitChunk

function _splitChunk(chunk, index) {
  // A chunk that has been edited and still has content cannot be split, so an error is thrown.
  if (chunk.edited && chunk.content.length) {
    // zero-length edited chunks are a special case (overlapping replacements)
    const loc = getLocator(this.original)(index);
    throw new Error(
      `Cannot split a chunk that has already been edited (${loc.line}:${loc.column} – "${chunk.original}")`
    );
  }
  /**
   * Split the chunk at index and get back the new chunk that starts there,
   * i.e. newChunk.start === index.
   */
  const newChunk = chunk.split(index);
  // The original chunk now ends at index.
  this.byEnd[index] = chunk;
  // Register the new chunk by its start and end.
  this.byStart[index] = newChunk;
  this.byEnd[newChunk.end] = newChunk;

  // If the original chunk was the last one, the new chunk becomes the last chunk.
  if (chunk === this.lastChunk) this.lastChunk = newChunk;
  // Remember the chunk just handled (newChunk.previous) as the starting point of the next search.
  this.lastSearchedChunk = chunk;
  if (DEBUG) this.stats.timeEnd('_split');
  return true;
}

chunk.split

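Splits the original chunk. A new chunk covering [index, this.end) is created, the original chunk is shrunk to [this.start, index) with its content updated accordingly, and the new chunk is linked in right after the original one in the doubly linked list.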

function split(index) {
  const sliceIndex = index - this.start;
  // Split the original text.
  const originalBefore = this.original.slice(0, sliceIndex);
  const originalAfter = this.original.slice(sliceIndex);
  // Update the original text of this chunk.
  this.original = originalBefore;
  // Create the new chunk covering [index, this.end).
  const newChunk = new Chunk(index, this.end, originalAfter);
  newChunk.outro = this.outro;
  this.outro = '';
  // Shrink this chunk to [this.start, index).
  this.end = index;

  // This chunk was already edited and its content is empty (guaranteed by the guard in _splitChunk).
  if (this.edited) {
    // after split we should save the edit content record into the correct chunk
    // to make sure sourcemap correct
    // For example:
    // '  test'.trim()
    //     split   -> '  ' + 'test'
    //   ✔️ edit    -> '' + 'test'
    //   ✖️ edit    -> 'test' + ''
    // TODO is this block necessary?...
    newChunk.edit('', false);
    this.content = '';
  } else {
    this.content = originalBefore;
  }

  // Link newChunk in right after this chunk in the doubly linked list.
  newChunk.next = this.next;
  if (newChunk.next) newChunk.next.previous = newChunk;
  newChunk.previous = this;
  this.next = newChunk;

  // Return the newly created chunk.
  return newChunk;
}
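
The interplay of update, _split and chunk.split can be condensed into a small runnable sketch. MiniChunk and MiniString are hypothetical names. The sketch deliberately omits intro/outro, storeName, the lastSearchedChunk optimization and the guard against splitting an already edited chunk, so it illustrates the idea only and does not reproduce magic-string itself.

```javascript
// Hypothetical miniature of the chunk list described above.
class MiniChunk {
  constructor(start, end, text) {
    this.start = start;
    this.end = end;
    this.original = text;
    this.content = text;
    this.next = null;
  }
  // Split at an absolute index: this chunk keeps [start, index),
  // the returned chunk covers [index, end).
  split(index) {
    const sliceIndex = index - this.start;
    const tail = new MiniChunk(index, this.end, this.original.slice(sliceIndex));
    this.original = this.original.slice(0, sliceIndex);
    this.content = this.original;
    this.end = index;
    tail.next = this.next;
    this.next = tail;
    return tail;
  }
}

class MiniString {
  constructor(source) {
    this.first = new MiniChunk(0, source.length, source);
    this.byStart = { 0: this.first };
    this.byEnd = { [source.length]: this.first };
  }
  split(index) {
    // A boundary already exists at this index.
    if (this.byStart[index] || this.byEnd[index]) return;
    for (let chunk = this.first; chunk; chunk = chunk.next) {
      if (chunk.start < index && index < chunk.end) {
        const tail = chunk.split(index);
        this.byEnd[index] = chunk;
        this.byStart[index] = tail;
        this.byEnd[tail.end] = tail;
        return;
      }
    }
  }
  update(start, end, content) {
    this.split(start);
    this.split(end);
    // The first chunk in the range carries the replacement,
    // every following chunk in the range becomes empty.
    let chunk = this.byStart[start];
    chunk.content = content;
    while (chunk !== this.byEnd[end]) {
      chunk = chunk.next;
      chunk.content = '';
    }
    return this;
  }
  toString() {
    let out = '';
    for (let chunk = this.first; chunk; chunk = chunk.next) out += chunk.content;
    return out;
  }
}

const s = new MiniString('var x = 1;');
s.update(0, 3, 'const');
const updated = s.toString();
console.log(updated); // const x = 1;
```

Note that every index passed to update refers to the original string, which is why a second call such as s.update(4, 5, 'y') still targets the original x even though the output has already grown.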

With update covered, the final output is produced by the toString method.

The source of MagicString.toString is as follows:

export default class MagicString {
  toString() {
    let str = this.intro;

    let chunk = this.firstChunk;
    while (chunk) {
      str += chunk.toString();
      chunk = chunk.next;
    }

    return str + this.outro;
  }
}

export default class Chunk {
  toString() {
    return this.intro + this.content + this.outro;
  }
}

As the code shows, toString simply concatenates the intro, content, and outro of every chunk, wrapped in the MagicString instance's own intro and outro.

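How MagicString Generates Source Maps

Overview:

MagicString walks the generated code chunk by chunk, processes each chunk's content, and collects the results in a raw array with the following shape:

raw[line in the generated code] = [ [ column in the generated code, index of the source file (always 0 here), line in the original source, column in the original source ], ... ]

Finally, raw is passed to @jridgewell/sourcemap-codec, which applies Base64 VLQ encoding.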

import { encode } from '@jridgewell/sourcemap-codec';
encode(raw);

The result is the encoded mappings string of the sourcemap for the new code.
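
To show what the encoding step does to raw, here is a hand-written Base64 VLQ encoder. It is a sketch for illustration, based on the source map v3 encoding rules as I understand them, and is not the implementation in @jridgewell/sourcemap-codec; the function names encodeVlq and encodeMappings are made up for this example.

```javascript
const BASE64 =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Encode one signed integer as a Base64 VLQ string.
function encodeVlq(value) {
  // The sign lives in the lowest bit; the magnitude is shifted left by one.
  let vlq = value < 0 ? (-value << 1) | 1 : value << 1;
  let out = '';
  do {
    let digit = vlq & 31; // low five bits
    vlq >>>= 5;
    if (vlq > 0) digit |= 32; // continuation bit: more digits follow
    out += BASE64[digit];
  } while (vlq > 0);
  return out;
}

// Encode decoded mappings (lines -> segments -> fields) into one string.
function encodeMappings(decoded) {
  // Every field is stored as a delta to the previous segment. The
  // generated column resets on each line, the other fields carry over.
  const previous = [0, 0, 0, 0, 0];
  return decoded
    .map(line => {
      previous[0] = 0;
      return line
        .map(segment =>
          segment
            .map((field, i) => {
              const delta = field - previous[i];
              previous[i] = field;
              return encodeVlq(delta);
            })
            .join('')
        )
        .join(',');
    })
    .join(';');
}

// Same shape as the raw array: one entry per generated line.
const raw = [
  [[0, 0, 0, 0]],
  [[0, 0, 1, 0], [4, 0, 1, 6]]
];
const encoded = encodeMappings(raw);
console.log(encoded); // AAAA;AACA,IAAM
```

Lines are separated by `;` and segments by `,`. Storing deltas instead of absolute values keeps most numbers small, which is what makes the VLQ representation compact.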

Implementation details:

The source map is produced by the generateDecodedMap method, so let's look at what generateDecodedMap actually does.

function generateDecodedMap(options) {
  options = options || {};

  const sourceIndex = 0;
  const names = Object.keys(this.storedNames);
  const mappings = new Mappings(options.hires);
  // locate maps a character index to its { line, column } in the original source in O(log n) time.
  const locate = getLocator(this.original);

  if (this.intro) {
    mappings.advance(this.intro);
  }
  // Walk every chunk, starting from this.firstChunk.
  this.firstChunk.eachNext(chunk => {
    // Original-source { line, column } for the index chunk.start.
    const loc = locate(chunk.start);

    if (chunk.intro.length) mappings.advance(chunk.intro);

    if (chunk.edited) {
      mappings.addEdit(
        sourceIndex,
        chunk.content,
        loc,
        chunk.storeName ? names.indexOf(chunk.original) : -1
      );
    } else {
      mappings.addUneditedChunk(
        sourceIndex,
        chunk,
        this.original,
        loc,
        this.sourcemapLocations
      );
    }

    if (chunk.outro.length) mappings.advance(chunk.outro);
  });

  return {
    file: options.file ? options.file.split(/[/\\]/).pop() : undefined,
    sources: [
      options.source
        ? getRelativePath(options.file || '', options.source)
        : options.file || ''
    ],
    sourcesContent: options.includeContent ? [this.original] : undefined,
    names,
    mappings: mappings.raw,
    x_google_ignoreList: this.ignoreList ? [sourceIndex] : undefined
  };
}

function getLocator(source) {
  const originalLines = source.split('\n');
  const lineOffsets = [];
  // Split the source on \n; lineOffsets[i] records the character offset at which line i starts.
  for (let i = 0, pos = 0; i < originalLines.length; i++) {
    lineOffsets.push(pos);
    pos += originalLines[i].length + 1;
  }

  // Binary search: find the line and column of a character index in O(log n) time.
  return function locate(index) {
    let i = 0;
    let j = lineOffsets.length;
    while (i < j) {
      const m = (i + j) >> 1;
      if (index < lineOffsets[m]) {
        j = m;
      } else {
        i = m + 1;
      }
    }
    const line = i - 1;
    const column = index - lineOffsets[line];
    return { line, column };
  };
}

As the source above shows, a chunk's intro and outro are handled by Mappings.advance; an edited chunk goes through Mappings.addEdit, and an unedited chunk goes through Mappings.addUneditedChunk. Let's first look at what Mappings.advance does:

export default class Mappings {
  constructor(hires) {
    this.hires = hires;
    this.generatedCodeLine = 0;
    this.generatedCodeColumn = 0;
    this.raw = [];
    this.rawSegments = this.raw[this.generatedCodeLine] = [];
    this.pending = null;
  }
  advance(str) {
    if (!str) return;

    const lines = str.split('\n');

    if (lines.length > 1) {
      for (let i = 0; i < lines.length - 1; i++) {
        this.generatedCodeLine++;
        this.raw[this.generatedCodeLine] = this.rawSegments = [];
      }
      this.generatedCodeColumn = 0;
    }

    this.generatedCodeColumn += lines[lines.length - 1].length;
  }
}

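Note that no original-source coordinates are recorded for a chunk's intro and outro. advance only updates this.generatedCodeLine and this.generatedCodeColumn (the line and column of the cursor in the generated code). It emits no mapping itself; it only keeps the generated coordinates accurate for the content that follows.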
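
A short worked example makes the cursor movement visible. Cursor is a hypothetical class that reproduces only the line and column bookkeeping of advance and leaves out the raw segment arrays.

```javascript
// Hypothetical cursor mirroring the bookkeeping of Mappings.advance.
class Cursor {
  constructor() {
    this.line = 0;
    this.column = 0;
  }
  advance(str) {
    if (!str) return;
    const lines = str.split('\n');
    if (lines.length > 1) {
      this.line += lines.length - 1; // one new generated line per '\n'
      this.column = 0;
    }
    // Only the text after the last '\n' contributes to the column.
    this.column += lines[lines.length - 1].length;
  }
}

const cursor = new Cursor();

// 12 characters, no newline: stays on line 0, column moves to 12.
cursor.advance('/* banner */');
const afterBanner = { line: cursor.line, column: cursor.column };

// Two newlines followed by 4 characters: line 2, column 4.
cursor.advance('\n"use strict";\nlet ');
const afterPrologue = { line: cursor.line, column: cursor.column };

console.log(afterBanner, afterPrologue);
```

Any mapping emitted after these calls starts at generated position 2:4, even though the inserted text itself has no mapping.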

If the chunk has been edited, Mappings.addEdit is called. Here is what it does:

export default class Mappings {
  addEdit(sourceIndex, content, loc, nameIndex) {
    if (content.length) {
      let contentLineEnd = content.indexOf('\n', 0);
      let previousContentLineEnd = -1;
      while (contentLineEnd >= 0) {
        const segment = [
          this.generatedCodeColumn,
          sourceIndex,
          loc.line,
          loc.column
        ];
        if (nameIndex >= 0) {
          segment.push(nameIndex);
        }
        this.rawSegments.push(segment);

        this.generatedCodeLine += 1;
        this.raw[this.generatedCodeLine] = this.rawSegments = [];
        this.generatedCodeColumn = 0;

        previousContentLineEnd = contentLineEnd;
        contentLineEnd = content.indexOf('\n', contentLineEnd + 1);
      }

      const segment = [
        this.generatedCodeColumn,
        sourceIndex,
        loc.line,
        loc.column
      ];
      if (nameIndex >= 0) {
        segment.push(nameIndex);
      }
      this.rawSegments.push(segment);

      this.advance(content.slice(previousContentLineEnd + 1));
    } else if (this.pending) {
      this.rawSegments.push(this.pending);
      this.advance(content);
    }

    this.pending = null;
  }
}

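As the code shows, addEdit only maps the start of each line of the new content back to the chunk's original position (loc). Now compare this with the case where the content has not been edited: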

function addUneditedChunk(
  sourceIndex,
  chunk,
  original,
  loc,
  sourcemapLocations
) {
  let originalCharIndex = chunk.start;
  let first = true;
  // when iterating each char, check if it's in a word boundary
  let charInHiresBoundary = false;

  while (originalCharIndex < chunk.end) {
    if (
      this.hires ||
      first ||
      sourcemapLocations.has(originalCharIndex)
    ) {
      const segment = [
        this.generatedCodeColumn,
        sourceIndex,
        loc.line,
        loc.column
      ];

      if (this.hires === 'boundary') {
        // in hires "boundary", group segments per word boundary than per char
        if (wordRegex.test(original[originalCharIndex])) {
          // for first char in the boundary found, start the boundary by pushing a segment
          if (!charInHiresBoundary) {
            this.rawSegments.push(segment);
            charInHiresBoundary = true;
          }
        } else {
          // for non-word char, end the boundary by pushing a segment
          this.rawSegments.push(segment);
          charInHiresBoundary = false;
        }
      } else {
        this.rawSegments.push(segment);
      }
    }

    if (original[originalCharIndex] === '\n') {
      loc.line += 1;
      loc.column = 0;
      this.generatedCodeLine += 1;
      this.raw[this.generatedCodeLine] = this.rawSegments = [];
      this.generatedCodeColumn = 0;
      first = true;
    } else {
      loc.column += 1;
      this.generatedCodeColumn += 1;
      first = false;
    }

    originalCharIndex += 1;
  }

  this.pending = null;
}

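The principle is similar to that for edited chunks: both map the chunk's content back to the original source. The difference is that unedited chunks receive extra handling for mapping precision, controlled by the hires option and by sourcemapLocations.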
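
The effect of hires on the number of segments can be seen with a reduced version of the loop. mapUnedited is a hypothetical helper that ignores sourcemapLocations and the 'boundary' mode, and it assumes the chunk starts at line 0, column 0 in both the original and the generated code.

```javascript
// Simplified segment-emitting loop for an unedited chunk.
function mapUnedited(text, hires) {
  const raw = [[]];
  let genLine = 0;
  let genColumn = 0;
  let srcLine = 0;
  let srcColumn = 0;
  let first = true; // true at the start of every line

  for (let i = 0; i < text.length; i++) {
    // Low resolution: one segment per line start.
    // High resolution: one segment per character.
    if (hires || first) {
      raw[genLine].push([genColumn, 0, srcLine, srcColumn]);
    }
    if (text[i] === '\n') {
      srcLine += 1;
      srcColumn = 0;
      genLine += 1;
      genColumn = 0;
      raw[genLine] = [];
      first = true;
    } else {
      srcColumn += 1;
      genColumn += 1;
      first = false;
    }
  }
  return raw;
}

const text = 'ab\ncd';
const lowRes = mapUnedited(text, false);
const hiRes = mapUnedited(text, true);

console.log(JSON.stringify(lowRes)); // [[[0,0,0,0]],[[0,0,1,0]]]
console.log(hiRes[0].length, hiRes[1].length); // 3 2
```

The low-resolution map is much smaller but can only resolve positions to the start of a line, which is the size versus precision trade-off discussed earlier in the article.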

Contributors

XiSenao
