Incremental Build

In watch mode, rollup automatically uses its cache mechanism to implement incremental builds. With the cache enabled, rollup's incremental build resembles React's incremental update of the fiber tree: only the modules affected by a file change are recompiled, and unrelated modules are left alone. Let's first look at the decision logic behind rollup's cache mechanism.

Cache Hit Conditions

The specific implementation of rollup's cache hit logic is as follows:

const cachedModule = this.graph.cachedModules.get(id);
if (
  cachedModule &&
  !cachedModule.customTransformCache &&
  cachedModule.originalCode === sourceDescription.code &&
  !(await this.pluginDriver.hookFirst('shouldTransformCachedModule', [
    {
      ast: cachedModule.ast,
      code: cachedModule.code,
      id: cachedModule.id,
      meta: cachedModule.meta,
      moduleSideEffects: cachedModule.moduleSideEffects,
      resolvedSources: cachedModule.resolvedIds,
      syntheticNamedExports: cachedModule.syntheticNamedExports
    }
  ]))
) {
  if (cachedModule.transformFiles) {
    for (const emittedFile of cachedModule.transformFiles)
      this.pluginDriver.emitFile(emittedFile);
  }
  await module.setSource(cachedModule);
}

The following conditions must be met to reuse the cache:

- The cached module must exist. rollup looks up cachedModule by the module's resolved id. Since rollup does not currently support a persistent cache, cachedModule does not exist during the initial build, so the cache is not used.

```js
const cachedModule = this.graph.cachedModules.get(id);
```

- The module must not use a custom transform cache. If a plugin uses custom caching (i.e., it actively uses the this.cache object provided by the plugin context while transforming a module), that module is flagged with customTransformCache and skips the cache.
- The code must be unchanged. If the freshly loaded code differs from the cached originalCode, the cache is skipped.
- No plugin may veto the cache. rollup calls the shouldTransformCachedModule hook; if a plugin returns true for the module, the cache is skipped.
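The four conditions can be condensed into a single decision function. The following is a minimal sketch rather than rollup's actual code: `CachedModuleLike`, `CacheDecision`, and `decideCacheReuse` are hypothetical names, and the hook is modeled as a synchronous callback for brevity, whereas rollup awaits its plugin hooks.

```typescript
// Simplified stand-in for rollup's cached module record (assumption, not the real type).
interface CachedModuleLike {
  id: string;
  originalCode: string;
  customTransformCache: boolean;
}

interface CacheDecision {
  reuse: boolean;
  reason: 'hit' | 'no-cached-module' | 'custom-transform-cache' | 'code-changed' | 'plugin-veto';
}

// Applies the four conditions in the same order as the excerpt above.
function decideCacheReuse(
  cachedModules: Map<string, CachedModuleLike>,
  id: string,
  loadedCode: string,
  shouldTransformCachedModule: (cached: CachedModuleLike) => boolean = () => false
): CacheDecision {
  const cached = cachedModules.get(id);
  // 1. Nothing was cached for this id (for example during the initial build).
  if (!cached) return { reuse: false, reason: 'no-cached-module' };
  // 2. A plugin touched this.cache while transforming this module last time.
  if (cached.customTransformCache) return { reuse: false, reason: 'custom-transform-cache' };
  // 3. The loaded source differs from the source that produced the cached result.
  if (cached.originalCode !== loadedCode) return { reuse: false, reason: 'code-changed' };
  // 4. A plugin explicitly asked for the module to be transformed again.
  if (shouldTransformCachedModule(cached)) return { reuse: false, reason: 'plugin-veto' };
  return { reuse: true, reason: 'hit' };
}
```

Returning a reason alongside the boolean is only a convenience of this sketch; it makes it easy to see which condition caused a miss.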

Cache Timing

After rollup has built all modules, it triggers the buildEnd hook. It then decides whether to produce a cache based on the rawInputOptions.cache option.

export async function rollupInternal(
  rawInputOptions: RollupOptions,
  watcher: RollupWatcher | null
): Promise<RollupBuild> {
  // remove the cache object from the memory after graph creation (cache is not used anymore)
  const useCache = rawInputOptions.cache !== false;
  await catchUnfinishedHookActions(graph.pluginDriver, async () => {
    try {
      timeStart('initialize', 2);
      await graph.pluginDriver.hookParallel('buildStart', [inputOptions]);
      timeEnd('initialize', 2);
      await graph.build();
    } catch (error_: any) {
      const watchFiles = Object.keys(graph.watchFiles);
      if (watchFiles.length > 0) {
        error_.watchFiles = watchFiles;
      }
      await graph.pluginDriver.hookParallel('buildEnd', [error_]);
      await graph.pluginDriver.hookParallel('closeBundle', []);
      throw error_;
    }
    await graph.pluginDriver.hookParallel('buildEnd', []);
  });
  const result: RollupBuild = {
    cache: useCache ? graph.getCache() : undefined
  };
  return result;
}

By default, rollup produces a cache (i.e., rawInputOptions.cache !== false evaluates to true), collecting the required information through graph.getCache().

class Graph {
  getCache(): RollupCache {
    // handle plugin cache eviction
    for (const name in this.pluginCache) {
      const cache = this.pluginCache[name];
      let allDeleted = true;
      for (const [key, value] of Object.entries(cache)) {
        if (value[0] >= this.options.experimentalCacheExpiry)
          delete cache[key];
        else allDeleted = false;
      }
      if (allDeleted) delete this.pluginCache[name];
    }

    return {
      modules: this.modules.map(module => module.toJSON()),
      plugins: this.pluginCache
    };
  }
}

From the source code, we can see that rollup attempts to cache information about modules and plugins. Let's look at what specific information is cached.

Cache Module

The module's cache content is generated through the module.toJSON() method. Let's look at the implementation of the module.toJSON() method.

export default class Module {
  toJSON(): ModuleJSON {
    return {
      ast: this.info.ast!,
      attributes: this.info.attributes,
      code: this.info.code!,
      customTransformCache: this.customTransformCache,

      dependencies: Array.from(this.dependencies, getId),
      id: this.id,
      meta: this.info.meta,
      moduleSideEffects: this.info.moduleSideEffects,
      originalCode: this.originalCode,
      originalSourcemap: this.originalSourcemap,
      resolvedIds: this.resolvedIds,
      sourcemapChain: this.sourcemapChain,
      syntheticNamedExports: this.info.syntheticNamedExports,
      transformDependencies: this.transformDependencies,
      transformFiles: this.transformFiles
    };
  }
}

From the above source code, we can see the cached content. The information cached by rollup for modules mainly includes:

Ast

It's worth noting that the ast here is a plain estree-compatible ast, not the tree of ast node class instances that rollup builds internally. Therefore, even when the ast is cached, semantic analysis still has to be performed again.

When the cache is hit, the cached data will be used when calling the Module.setSource method.

type ProgramNode = RollupAstNode<estree.Program>;

class Module {
  async setSource({ ast }: { ast: ProgramNode }) {
    if (ast) {
      this.ast = new nodeConstructors[ast.type](
        programParent,
        this.scope
      ).parseNode(ast) as Program;
      this.info.ast = ast;
    } else {
      // Measuring asynchronous code does not provide reasonable results
      timeEnd('generate ast', 3);
      const astBuffer = await parseAsync(
        code,
        false,
        this.options.jsx !== false
      );
      timeStart('generate ast', 3);
      this.ast = convertProgram(astBuffer, programParent, this.scope);
    }
  }
}

We can see that rollup uses the cached standard estree ast to recursively instantiate its own ast node classes and assigns the resulting tree to module.ast. The estree ast itself is assigned to module.info.ast so that it can be cached again.

this.ast = new nodeConstructors[ast.type](
  programParent,
  this.scope
).parseNode(ast) as Program;

In rollup, the module instance stores two types of ast structures: one is the estree standard ast, stored in the module.info.ast variable; the other is the ast class instance generated by rollup based on the estree structure, stored in the module.ast variable. Subsequent semantic analysis and tree-shaking operations are performed on the ast class instance implemented by rollup.
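To illustrate the difference between the two forms, here is a minimal sketch of re-instantiating class nodes from a plain, JSON-serializable tree. The class names (`NodeBase`, `LiteralNode`, `ProgramNode`), the `PlainNode` shape, and the `instantiate` helper are hypothetical stand-ins, much smaller than rollup's real node classes; only the pattern of looking up a constructor by `type` and calling `parseNode` follows the excerpt above.

```typescript
// Plain ESTree-style data: this is the kind of structure a cache can hold.
interface PlainNode {
  type: string;
  body?: PlainNode[];
  value?: number;
}

// Hypothetical class hierarchy standing in for rollup's internal AST node classes.
abstract class NodeBase {
  constructor(readonly parent: NodeBase | null) {}
  abstract parseNode(plain: PlainNode): this;
}

class LiteralNode extends NodeBase {
  value = 0;
  parseNode(plain: PlainNode): this {
    this.value = plain.value ?? 0;
    return this;
  }
}

class ProgramNode extends NodeBase {
  body: NodeBase[] = [];
  parseNode(plain: PlainNode): this {
    // Recursively turn each plain child into a class instance whose parent is this node.
    this.body = (plain.body ?? []).map(child => instantiate(child, this));
    return this;
  }
}

const nodeConstructors: Record<string, new (parent: NodeBase | null) => NodeBase> = {
  Literal: LiteralNode,
  Program: ProgramNode
};

function instantiate(plain: PlainNode, parent: NodeBase | null): NodeBase {
  const Ctor = nodeConstructors[plain.type];
  if (!Ctor) throw new Error(`Unknown node type: ${plain.type}`);
  return new Ctor(parent).parseNode(plain);
}
```

The plain tree survives a JSON round trip, while the class instances (with parent links and methods) have to be rebuilt from it on every build, which is why analysis runs again even on a cache hit.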

If the cache is not hit (first build or cache disabled with rawInputOptions.cache = false), rollup will use swc's capabilities to parse the code into ast.

// Measuring asynchronous code does not provide reasonable results
timeEnd('generate ast', 3);
const astBuffer = await parseAsync(code, false, this.options.jsx !== false);
timeStart('generate ast', 3);
this.ast = convertProgram(astBuffer, programParent, this.scope);
// Make lazy and apply LRU cache to not hog the memory
Object.defineProperty(this.info, 'ast', {
  get: () => {
    if (this.graph.astLru.has(fileName)) {
      return this.graph.astLru.get(fileName)!;
    } else {
      const parsedAst = this.tryParse();
      // If the cache is not disabled, we need to keep the AST in memory
      // until the end when the cache is generated
      if (this.options.cache !== false) {
        Object.defineProperty(this.info, 'ast', {
          value: parsedAst
        });
        return parsedAst;
      }
      // Otherwise, we keep it in a small LRU cache to not hog too much
      // memory but allow the same AST to be requested several times.
      this.graph.astLru.set(fileName, parsedAst);
      return parsedAst;
    }
  }
});

For the detailed process of generating the estree ast using swc, please refer to Native Parser.

Note

rollup receives the ast from swc as an ArrayBuffer and instantiates its internal ast node classes by decoding that buffer in javascript. This is why rollup implements module.info.ast lazily: the plain javascript estree ast is only generated when it is needed (for example, for caching or when a plugin reuses the ast).

// Make lazy and apply LRU cache to not hog the memory
Object.defineProperty(this.info, 'ast', {
  get: () => {
    if (this.graph.astLru.has(fileName)) {
      return this.graph.astLru.get(fileName)!;
    } else {
      const parsedAst = this.tryParse();
      // If the cache is not disabled, we need to keep the AST in memory
      // until the end when the cache is generated
      if (this.options.cache !== false) {
        Object.defineProperty(this.info, 'ast', {
          value: parsedAst
        });
        return parsedAst;
      }
      // Otherwise, we keep it in a small LRU cache to not hog too much
      // memory but allow the same AST to be requested several times.
      this.graph.astLru.set(fileName, parsedAst);
      return parsedAst;
    }
  }
});
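The lazy-property pattern used here can be reduced to a small sketch. `defineLazyAst` is a hypothetical helper, and a plain `Map` stands in for the LRU cache (no eviction is implemented); the point is only to show a getter that parses on first access and, when the value must be kept, replaces itself with a plain data property.

```typescript
// Defines info.ast as a lazy property.
// keepInMemory mirrors the "cache is not disabled" branch of the excerpt above.
function defineLazyAst<T extends object>(
  info: { ast: T | null },
  parse: () => T,
  keepInMemory: boolean,
  lru: Map<string, T>, // stand-in for an LRU cache; eviction is omitted in this sketch
  key: string
): void {
  Object.defineProperty(info, 'ast', {
    configurable: true, // required so the getter can later be replaced by a value
    get: () => {
      const remembered = lru.get(key);
      if (remembered !== undefined) return remembered;
      const parsed = parse();
      if (keepInMemory) {
        // Swap the accessor for a data property so later reads skip the getter entirely.
        Object.defineProperty(info, 'ast', { value: parsed });
        return parsed;
      }
      // Otherwise remember the result in the shared cache only.
      lru.set(key, parsed);
      return parsed;
    }
  });
}
```

In both branches the expensive parse runs at most once per module as long as the result is still remembered; the difference is only where the result lives.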

Transformed Code

As the Cache Plugin section explains, caching the results of user plugins' transform hooks is an important way to accelerate builds.

Dependencies

rollup only caches the id of dependent modules, not the ast, code, and other information of dependent modules.

export function getId(m: { id: string | null }): string {
  return m.id!;
}
export default class Module {
  toJSON(): ModuleJSON {
    return {
      ast: this.info.ast!,
      attributes: this.info.attributes,
      code: this.info.code!,
      customTransformCache: this.customTransformCache,

      dependencies: Array.from(this.dependencies, getId),
      id: this.id,
      meta: this.info.meta,
      moduleSideEffects: this.info.moduleSideEffects,
      originalCode: this.originalCode,
      originalSourcemap: this.originalSourcemap,
      resolvedIds: this.resolvedIds,
      sourcemapChain: this.sourcemapChain,
      syntheticNamedExports: this.info.syntheticNamedExports,
      transformDependencies: this.transformDependencies,
      transformFiles: this.transformFiles
    };
  }
}

Assume module a depends on module b:

// a.js
import { b } from './b.js';

// b.js
export const b = 'module b';

When module a hits the cache, rollup reuses the resolveId results that were already computed for its dependencies. For module b, rollup therefore skips resolving './b.js' again, saving a call to the resolveId plugin hooks.

In other words, if a module hits the cache, the resolveId hooks are not executed for any of its dependent modules.
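A minimal sketch of this reuse, under simplifying assumptions: `resolveDependencies` is a hypothetical helper, resolved ids are modeled as plain strings (rollup stores richer objects), and the resolver is synchronous.

```typescript
type ResolvedIds = Record<string, string>;

// Resolves every import source of one module, falling back to the resolver
// only for sources that have no previously cached result.
function resolveDependencies(
  importerId: string,
  sources: string[],
  cachedResolvedIds: ResolvedIds | undefined,
  resolveId: (source: string, importer: string) => string
): ResolvedIds {
  // On a cache hit the previous results are copied over; on a miss we start empty.
  const resolvedIds: ResolvedIds = { ...cachedResolvedIds };
  for (const source of sources) {
    if (!(source in resolvedIds)) {
      resolvedIds[source] = resolveId(source, importerId);
    }
  }
  return resolvedIds;
}

// Usage: count how often the (potentially expensive) resolver actually runs.
const resolverCalls: string[] = [];
const countingResolver = (source: string, importer: string): string => {
  resolverCalls.push(`${importer} -> ${source}`);
  return '/src/' + source.replace('./', '');
};

// First build: nothing is cached, so './b.js' is resolved through the hook.
const coldResult = resolveDependencies('/src/a.js', ['./b.js'], undefined, countingResolver);
// Rebuild with module a hitting the cache: the earlier result is reused.
const warmResult = resolveDependencies('/src/a.js', ['./b.js'], coldResult, countingResolver);
```

After both calls the resolver has run only once, for the cold build.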

Sourcemap

From the Source Map chapter, we can see that rollup internally uses magic string to manage code changes, which makes it easy to generate sourcemap information quickly. Using magic string to maintain sourcemap information works well for simple code changes, but using it for complex code translation can be mentally burdensome for developers.

Therefore, some plugins do not rely on magic string and instead generate sourcemap information through other tools, which may come at a performance cost.

In summary, caching sourcemap information also contributes significantly to build efficiency in large projects.

Cache Plugin

The vite plugin system is compatible with the rollup plugin system, so the problems in rollup plugin design also exist in vite. In production environments, the parsing efficiency of plugins has a great impact on the efficiency of bundler building modules. Therefore, one of vite's optimization solutions is Warm Up Frequently Used Files, which alleviates the problem of low plugin execution efficiency by preheating to parse modules in advance.

This shows how important it is to cache the results of module transformation. By default, rollup assumes that plugins have no side effects.

Under this assumption, when the input is unchanged (i.e., the original code is unchanged), the plugin's result is deterministic (i.e., the transformed code is unchanged), so reusing the cached transformed code lets rollup skip executing the plugin and thereby improves build efficiency.

Of course, user plugins may also have side effects, so rollup provides the following ways to handle them:

- Return true from the shouldTransformCachedModule hook to skip the cache for specific modules.
- Use this.cache, which rollup provides in the plugin context of transform. If a plugin accesses this.cache while transforming a module, rollup sets customTransformCache for that module and does not reuse its cached transform result.

export function getTrackedPluginCache(
  pluginCache: PluginCache,
  onUse: () => void
): PluginCache {
  return {
    delete(id: string) {
      onUse();
      return pluginCache.delete(id);
    },
    get(id: string) {
      onUse();
      return pluginCache.get(id);
    },
    has(id: string) {
      onUse();
      return pluginCache.has(id);
    },
    set(id: string, value: any) {
      onUse();
      return pluginCache.set(id, value);
    }
  };
}
async function transform(
  source: SourceDescription,
  module: Module,
  pluginDriver: PluginDriver,
  log: LogHandler
): Promise<TransformModuleJSON> {
  let customTransformCache = false;
  const useCustomTransformCache = () => (customTransformCache = true);

  code = await pluginDriver.hookReduceArg0(
    'transform',
    [currentSource, id],
    transformReducer,
    (pluginContext, plugin): TransformPluginContext => {
      pluginName = plugin.name;
      return {
        ...pluginContext,
        cache: customTransformCache
          ? pluginContext.cache
          : getTrackedPluginCache(
              pluginContext.cache,
              useCustomTransformCache
            )
      };
    }
  );
}
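A plugin using both mechanisms might look roughly like the following sketch. The interfaces are minimal local stand-ins for the relevant parts of the plugin context (assumptions for illustration, not rollup's exported types), and the plugin name, the `.env.js` suffix, and the `__BUILD_MODE__` placeholder are hypothetical.

```typescript
// Local stand-ins mirroring the get/set/has/delete shape shown in the excerpt above.
interface PluginCacheLike {
  get(id: string): unknown;
  set(id: string, value: unknown): void;
  has(id: string): boolean;
  delete(id: string): boolean;
}

interface TransformContextLike {
  cache: PluginCacheLike;
}

const environmentSensitivePlugin = {
  name: 'environment-sensitive', // hypothetical plugin name

  // Opt specific modules out of the transform cache:
  // returning true asks for the module to be transformed again.
  shouldTransformCachedModule({ id }: { id: string }): boolean {
    return id.endsWith('.env.js');
  },

  // Manage caching manually through this.cache.
  // Touching this.cache is what flags the module with customTransformCache.
  transform(this: TransformContextLike, code: string, id: string): string {
    const key = `${id}\n${code}`; // key on the input so edits never return stale output
    if (this.cache.has(key)) return this.cache.get(key) as string;
    const result = code.replace(/__BUILD_MODE__/g, JSON.stringify('production'));
    this.cache.set(key, result);
    return result;
  }
};
```

Keying the custom cache on both the id and the code is a design choice of this sketch: once a plugin takes over caching, it is also responsible for invalidation.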

An interesting point is that this.cache is a hidden feature, and rollup does not mention it in the official documentation.

In watch mode, rollup re-instantiates the Graph class each time a file changes and, in doing so, increments the access counter of every pluginCache entry.

class Graph {
  constructor(
    private readonly options: NormalizedInputOptions,
    watcher: RollupWatcher | null
  ) {
    if (options.cache !== false) {
      if (options.cache?.modules) {
        for (const module of options.cache.modules)
          this.cachedModules.set(module.id, module);
      }
      this.pluginCache = options.cache?.plugins || Object.create(null);

      // increment access counter
      for (const name in this.pluginCache) {
        const cache = this.pluginCache[name];
        for (const value of Object.values(cache)) value[0]++;
      }
    }
  }
}

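Users can configure experimentalCacheExpiry to control how long plugin cache entries remain valid: every new Graph increments each entry's counter, and getCache() deletes the entries whose counter has reached experimentalCacheExpiry.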

class Graph {
  getCache(): RollupCache {
    for (const name in this.pluginCache) {
      const cache = this.pluginCache[name];
      let allDeleted = true;
      for (const [key, value] of Object.entries(cache)) {
        if (value[0] >= this.options.experimentalCacheExpiry)
          delete cache[key];
        else allDeleted = false;
      }
      if (allDeleted) delete this.pluginCache[name];
    }
  }
}
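The interplay between the counter increment and the eviction check can be simulated in a few lines. This is a sketch under assumptions: the function names are hypothetical, entries are modeled as `[counter, value]` pairs as suggested by `value[0]` in the excerpts, and the counter reset on read in `readEntry` is an assumption that is not visible in the code shown here.

```typescript
// Each entry is [buildsSinceLastUse, value].
type SerializablePluginCache = Record<string, [number, unknown]>;

// Mirrors the "increment access counter" loop in the Graph constructor.
function startBuild(cache: SerializablePluginCache): void {
  for (const entry of Object.values(cache)) entry[0]++;
}

// Mirrors the eviction loop in getCache().
function finishBuild(cache: SerializablePluginCache, cacheExpiry: number): void {
  for (const [key, entry] of Object.entries(cache)) {
    if (entry[0] >= cacheExpiry) delete cache[key];
  }
}

// Assumption: reading an entry marks it as used by resetting its counter.
function readEntry(cache: SerializablePluginCache, key: string): unknown {
  const entry = cache[key];
  if (!entry) return undefined;
  entry[0] = 0;
  return entry[1];
}

// Usage: one entry is read on every build, the other is never touched again.
const simulatedCache: SerializablePluginCache = {
  used: [0, 'still needed'],
  stale: [0, 'no longer read']
};
const survivedAfterTwoBuilds: boolean[] = [];
for (let build = 1; build <= 3; build++) {
  startBuild(simulatedCache);
  readEntry(simulatedCache, 'used');
  finishBuild(simulatedCache, 3); // expiry of 3 builds, chosen only for this example
  if (build === 2) survivedAfterTwoBuilds.push('stale' in simulatedCache);
}
```

With an expiry of 3, the untouched entry survives the first two builds and is removed at the end of the third, while the entry that is read on every build stays.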

Decision Logic

Take the following dependency relationship as an example: entry module A depends on modules B and C, module B depends on modules D and E, and module D depends on module F.

In watch mode, during the first build, starting from the entry module A, every module executes the resolveId, load, and transform hooks. Suppose module B is then modified.

The decision logic is then as follows:

1. Module A, as the entry module, executes resolveId on every build. In addition, every module executes the load hook to obtain its original code.
2. Cache decision logic:
- The cached module must exist: rollup looks up cachedModule by resolved id; it is the module that was cached during the previous build.

```js
const cachedModule = this.graph.cachedModules.get(id);
```

- If the cached module has a custom transform cache (cachedModule.customTransformCache), the cache is skipped.
- If the code has changed since the previous build, the cache is skipped.
- The shouldTransformCachedModule hook is called to decide whether the cache applies to the current module; if it returns true, the cache is skipped (by default the cache is used).
3. According to the cache strategy in 2, the entry module A satisfies all conditions, so it does not execute the transform hook.
4. Continue with the child dependency modules of module A, which are modules B and C. Since module A hits the cache, neither B nor C executes the resolveId hook; they only obtain their original code through the load hook.
5. For module C, the cache strategy in 2 yields a cache hit, so it does not execute the transform hook.
6. For module B, the cache strategy in 2 yields a miss because module B has been modified. Therefore, module B executes the transform hook.
7. Continue with the child dependency modules of module B, which are modules D and E. Since module B does not hit the cache, its dependencies have to be resolved again, so both D and E execute the resolveId hook.
8. For module D, the cache strategy in 2 yields a cache hit, so it does not execute the transform hook.
9. Continue with the child dependency modules of module D, which is only module F. Since module D hits the cache, it reuses the resolveId results of its dependencies, which means module F does not execute the resolveId hook.
10. For module F, the cache strategy in 2 yields a cache hit, so it does not execute the transform hook. Since it has no child dependency modules, parsing ends here.
11. For module E, the cache strategy in 2 yields a cache hit because its code is unchanged, so it does not execute the transform hook. Since it has no child dependency modules, parsing ends here.

Through the above process, we can see that when module B is modified, the following logic occurs:

Module A executes resolveId and load hooks.
Module B executes the load hook and also executes the transform hook.
Module C executes the load hook.
Module D executes resolveId and load hooks.
Module E executes resolveId and load hooks.
Module F executes the load hook.

Compared with a cold-start build, the transform hook is executed only for the changed module (module B) and not for any other module. According to the cache decision logic, the number of resolveId hook executions is also reduced. What does not change is that every module still executes the load hook.
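The walkthrough can be replayed with a small simulation. This is a deliberately simplified sketch, not rollup's module loader: `simulateBuild` is a hypothetical function, a cache hit is reduced to "the code is unchanged" (custom transform caches and shouldTransformCachedModule are ignored), and the dependency graph is assumed to be a tree so that no module is visited twice.

```typescript
type HookName = 'resolveId' | 'load' | 'transform';

interface ModuleSpec {
  code: string;
  imports: string[];
}

// Returns, per module id, the hooks that would run during one build.
function simulateBuild(
  entry: string,
  files: Map<string, ModuleSpec>,
  previousCode: Map<string, string> // code seen by the previous build; empty on a cold start
): Map<string, HookName[]> {
  const hooks = new Map<string, HookName[]>();
  const record = (id: string, hook: HookName): void => {
    const list = hooks.get(id) ?? [];
    list.push(hook);
    hooks.set(id, list);
  };

  const visit = (id: string): void => {
    const file = files.get(id);
    if (!file) throw new Error(`Unknown module: ${id}`);
    record(id, 'load'); // every module is loaded to obtain its current code
    const hit = previousCode.get(id) === file.code;
    if (!hit) record(id, 'transform'); // only cache misses are transformed
    for (const dep of file.imports) {
      if (!hit) record(dep, 'resolveId'); // a hit reuses the stored resolveId results
      visit(dep);
    }
  };

  record(entry, 'resolveId'); // the entry module is resolved on every build
  visit(entry);
  return hooks;
}

// The example graph: A -> B, C; B -> D, E; D -> F.
const exampleFiles = new Map<string, ModuleSpec>([
  ['A', { code: 'a', imports: ['B', 'C'] }],
  ['B', { code: 'b', imports: ['D', 'E'] }],
  ['C', { code: 'c', imports: [] }],
  ['D', { code: 'd', imports: ['F'] }],
  ['E', { code: 'e', imports: [] }],
  ['F', { code: 'f', imports: [] }]
]);

// Remember what the first build saw, then modify module B and rebuild.
const codeSeenByFirstBuild = new Map<string, string>(
  Array.from(exampleFiles, ([id, file]): [string, string] => [id, file.code])
);
exampleFiles.set('B', { code: 'b (edited)', imports: ['D', 'E'] });
const hooksRun = simulateBuild('A', exampleFiles, codeSeenByFirstBuild);
```

Passing an empty `previousCode` map models a cold start, in which every module runs all three hooks.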

Performance

rollup's incremental update is similar to react's incremental update of the fiber tree, checking from top to bottom. For the entry module, resolveId is called every time, and at the same time, every module will execute the load hook. If a module hits the cache, it will reuse the transform product of the transform hook of that module, and also reuse the resolveId results of all dependent modules contained in that module.

With the caching features above, an incremental build in watch mode executes far fewer resolveId and transform hooks than the first build: the transform hook runs only for changed modules, and resolveId runs only for the entry module and for the dependencies of changed modules, which speeds up rebuilds in watch mode. Unfortunately, every module still has to execute the load hook, which is costly in projects with a large number of modules.

From rollup issue 2182, rollup issue 3728, we can see that rollup currently does not support persistent cache on disk space (Persistent Cache), which means that rollup currently only supports incremental updates in watch mode, not incremental updates during cold start again. webpack supports Persistent Cache, which is also one of the reasons why webpack outperforms rollup in secondary cold start.

Vite Incremental Build

vite has not yet implemented a complete Persistent Cache; it only supports a Persistent Cache for pre-build products. For details, see feat: Persistent cache-Jul 5, 2021. The reason may be that changes to some configuration files invalidate the cache, which needs to be considered more comprehensively. The discussion there also suggests two caching ideas.

The first idea is to implement caching at the plugin level rather than at the entire dependency graph level. This approach allows for finer granularity and easier management of caching.
The second idea is to pre-transform all requests on the server side and implement a function similar to import-analysis. This function uses the hash value of the transform request as a query parameter and leverages the browser's strong caching mechanism. This method needs to recursively invalidate plugin cache when files change (through file watcher) and when the server restarts (as implemented in this PR). This is similar to the ssrTransformation mechanism in vite.

Contributors

XiSenao
