Incremental Build
Currently, rollup automatically uses its cache mechanism to implement incremental builds in watch mode. With the cache enabled, rollup's incremental build is similar to React's incremental update of the fiber tree: only the modules affected by file modifications are recompiled, and unrelated modules are skipped. Let's first look at the decision logic of rollup's cache mechanism.
Cache Hit Conditions
The specific implementation of rollup's cache hit logic is as follows:
const cachedModule = this.graph.cachedModules.get(id);
if (
cachedModule &&
!cachedModule.customTransformCache &&
cachedModule.originalCode === sourceDescription.code &&
!(await this.pluginDriver.hookFirst('shouldTransformCachedModule', [
{
ast: cachedModule.ast,
code: cachedModule.code,
id: cachedModule.id,
meta: cachedModule.meta,
moduleSideEffects: cachedModule.moduleSideEffects,
resolvedSources: cachedModule.resolvedIds,
syntheticNamedExports: cachedModule.syntheticNamedExports
}
]))
) {
if (cachedModule.transformFiles) {
for (const emittedFile of cachedModule.transformFiles)
this.pluginDriver.emitFile(emittedFile);
}
await module.setSource(cachedModule);
}
The following conditions must all be met to reuse the cache:
- A cached module exists. rollup first looks up cachedModule by the id returned from resolveId. Since rollup does not currently support persistent cache, cachedModule does not exist during the initial build, so caching is skipped.
const cachedModule = this.graph.cachedModules.get(id);
- No custom caching is used in the plugin. If the user plugin actively uses the this.cache provided by the plugin context to set cache for specified module resolution, the modules associated with this.cache will skip caching.
- The code has not changed between builds. If it has changed, caching is skipped.
- The shouldTransformCachedModule hook does not request a new transform. Plugins can use this hook to skip caching for specified modules.
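The four conditions above can be sketched as a small standalone function. This is a simplified illustration, not rollup's implementation: the `CachedModuleSketch` shape, the `decideCacheReuse` name, and the synchronous `shouldTransform` callback (standing in for the asynchronous shouldTransformCachedModule hook) are all assumptions made for the example.

```typescript
// Hypothetical, simplified shape of a cached module: only the fields the check needs.
interface CachedModuleSketch {
  id: string;
  originalCode: string;
  customTransformCache: boolean;
}

type CacheDecision =
  | { hit: true }
  | {
      hit: false;
      reason:
        | 'no-cache-entry'
        | 'custom-transform-cache'
        | 'source-changed'
        | 'plugin-requested-transform';
    };

// shouldTransform stands in for the shouldTransformCachedModule hook
// (synchronous here for brevity; the real hook is awaited).
function decideCacheReuse(
  cachedModules: Map<string, CachedModuleSketch>,
  id: string,
  loadedCode: string,
  shouldTransform: (cached: CachedModuleSketch) => boolean = () => false
): CacheDecision {
  const cached = cachedModules.get(id);
  if (!cached) return { hit: false, reason: 'no-cache-entry' };
  if (cached.customTransformCache) return { hit: false, reason: 'custom-transform-cache' };
  if (cached.originalCode !== loadedCode) return { hit: false, reason: 'source-changed' };
  if (shouldTransform(cached)) return { hit: false, reason: 'plugin-requested-transform' };
  return { hit: true };
}

// Modules remembered from a previous build.
const previousBuild = new Map<string, CachedModuleSketch>([
  ['a.js', { id: 'a.js', originalCode: 'export const a = 1;', customTransformCache: false }]
]);

const unchangedDecision = decideCacheReuse(previousBuild, 'a.js', 'export const a = 1;');
const editedDecision = decideCacheReuse(previousBuild, 'a.js', 'export const a = 2;');
const unknownDecision = decideCacheReuse(previousBuild, 'b.js', '');
const vetoedDecision = decideCacheReuse(previousBuild, 'a.js', 'export const a = 1;', () => true);
```

Returning a reason alongside the miss is only a convenience for this sketch; it makes it easy to see which of the four conditions failed.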
Cache Timing
After rollup builds all modules, it triggers the buildEnd event. Then rollup confirms whether caching is needed through the rawInputOptions.cache flag.
export async function rollupInternal(
rawInputOptions: RollupOptions,
watcher: RollupWatcher | null
): Promise<RollupBuild> {
// remove the cache object from the memory after graph creation (cache is not used anymore)
const useCache = rawInputOptions.cache !== false;
await catchUnfinishedHookActions(graph.pluginDriver, async () => {
try {
timeStart('initialize', 2);
await graph.pluginDriver.hookParallel('buildStart', [inputOptions]);
timeEnd('initialize', 2);
await graph.build();
} catch (error_: any) {
const watchFiles = Object.keys(graph.watchFiles);
if (watchFiles.length > 0) {
error_.watchFiles = watchFiles;
}
await graph.pluginDriver.hookParallel('buildEnd', [error_]);
await graph.pluginDriver.hookParallel('closeBundle', []);
throw error_;
}
await graph.pluginDriver.hookParallel('buildEnd', []);
});
const result: RollupBuild = {
cache: useCache ? graph.getCache() : undefined
};
return result;
}
By default, rollup will perform caching operations (i.e., rawInputOptions.cache !== false is true), so it caches the required information through graph.getCache().
class Graph {
getCache(): RollupCache {
// handle plugin cache eviction
for (const name in this.pluginCache) {
const cache = this.pluginCache[name];
let allDeleted = true;
for (const [key, value] of Object.entries(cache)) {
if (value[0] >= this.options.experimentalCacheExpiry)
delete cache[key];
else allDeleted = false;
}
if (allDeleted) delete this.pluginCache[name];
}
return {
modules: this.modules.map(module => module.toJSON()),
plugins: this.pluginCache
};
}
}
From the source code, we can see that rollup attempts to cache information about modules and plugins. Let's look at what specific information is cached.
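As a rough illustration of the eviction loop in getCache(), the sketch below ages plugin cache entries across several builds. It assumes each entry is stored as a `[buildsSinceLastUse, value]` pair, which matches the `value[0]` counter in the excerpt; the `startBuild`, `readEntry` and `endBuild` helpers are hypothetical names, and the rule that reading an entry resets its counter is an assumption not shown in the excerpts.

```typescript
// Each entry is [buildsSinceLastUse, value]; this mirrors the value[0] counter above.
type PluginCacheEntries = Record<string, [number, unknown]>;
type PluginCaches = Record<string, PluginCacheEntries>;

// Start of a build: every entry ages by one.
function startBuild(caches: PluginCaches): void {
  for (const entries of Object.values(caches)) {
    for (const entry of Object.values(entries)) entry[0]++;
  }
}

// Assumption: reading an entry marks it as used and resets its counter.
function readEntry(caches: PluginCaches, plugin: string, key: string): unknown {
  const entry = caches[plugin]?.[key];
  if (!entry) return undefined;
  entry[0] = 0;
  return entry[1];
}

// End of a build: drop entries whose counter reached the expiry threshold,
// and drop a plugin's cache entirely once it is empty.
function endBuild(caches: PluginCaches, expiry: number): void {
  for (const [plugin, entries] of Object.entries(caches)) {
    let allDeleted = true;
    for (const [key, entry] of Object.entries(entries)) {
      if (entry[0] >= expiry) delete entries[key];
      else allDeleted = false;
    }
    if (allDeleted) delete caches[plugin];
  }
}

const pluginCaches: PluginCaches = {
  'my-plugin': { used: [0, 'kept'], stale: [0, 'dropped'] }
};
const expiryThreshold = 2;

// Two consecutive builds in which only the "used" entry is read.
for (let build = 0; build < 2; build++) {
  startBuild(pluginCaches);
  readEntry(pluginCaches, 'my-plugin', 'used');
  endBuild(pluginCaches, expiryThreshold);
}

const remainingKeys = Object.keys(pluginCaches['my-plugin'] ?? {});
```

After the second build the `stale` entry has gone unused for two builds and is removed, while `used` survives because it was read in every build.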
Cache Module
The module's cache content is generated through the module.toJSON() method. Let's look at the implementation of the module.toJSON() method.
export default class Module {
toJSON(): ModuleJSON {
return {
ast: this.info.ast!,
attributes: this.info.attributes,
code: this.info.code!,
customTransformCache: this.customTransformCache,
dependencies: Array.from(this.dependencies, getId),
id: this.id,
meta: this.info.meta,
moduleSideEffects: this.info.moduleSideEffects,
originalCode: this.originalCode,
originalSourcemap: this.originalSourcemap,
resolvedIds: this.resolvedIds,
sourcemapChain: this.sourcemapChain,
syntheticNamedExports: this.info.syntheticNamedExports,
transformDependencies: this.transformDependencies,
transformFiles: this.transformFiles
};
}
}
From the above source code, we can see the cached content. The information cached by rollup for modules mainly includes:
Ast
It's worth noting that the ast here is an estree-compatible ast (a plain javascript object), not a tree of instances of rollup's internal ast node classes. Therefore, even when it is cached, rollup still has to rebuild its ast node instances from it and perform semantic analysis again.
When the cache is hit, the cached data will be used when calling the Module.setSource method.
type ProgramNode = RollupAstNode<estree.Program>;
class Module {
async setSource({ ast }: { ast: ProgramNode }) {
if (ast) {
this.ast = new nodeConstructors[ast.type](
programParent,
this.scope
).parseNode(ast) as Program;
this.info.ast = ast;
} else {
// Measuring asynchronous code does not provide reasonable results
timeEnd('generate ast', 3);
const astBuffer = await parseAsync(
code,
false,
this.options.jsx !== false
);
timeStart('generate ast', 3);
this.ast = convertProgram(astBuffer, programParent, this.scope);
}
}
}
We can see that rollup recursively instantiates the ast node class implemented by rollup through the cached standard estree ast, and finally assigns it to the module.ast variable. The estree ast is assigned to the module.info.ast variable as cache.
this.ast = new nodeConstructors[ast.type](
programParent,
this.scope
).parseNode(ast) as Program;
In rollup, the module instance stores two types of ast structures: one is the estree standard ast, stored in the module.info.ast variable; the other is the ast class instance generated by rollup based on the estree structure, stored in the module.ast variable. Subsequent semantic analysis and tree-shaking operations are performed on the ast class instance implemented by rollup.
If the cache is not hit (first build or cache disabled with rawInputOptions.cache = false), rollup will use swc's capabilities to parse the code into ast.
// Measuring asynchronous code does not provide reasonable results
timeEnd('generate ast', 3);
const astBuffer = await parseAsync(code, false, this.options.jsx !== false);
timeStart('generate ast', 3);
this.ast = convertProgram(astBuffer, programParent, this.scope);
// Make lazy and apply LRU cache to not hog the memory
Object.defineProperty(this.info, 'ast', {
get: () => {
if (this.graph.astLru.has(fileName)) {
return this.graph.astLru.get(fileName)!;
} else {
const parsedAst = this.tryParse();
// If the cache is not disabled, we need to keep the AST in memory
// until the end when the cache is generated
if (this.options.cache !== false) {
Object.defineProperty(this.info, 'ast', {
value: parsedAst
});
return parsedAst;
}
// Otherwise, we keep it in a small LRU cache to not hog too much
// memory but allow the same AST to be requested several times.
this.graph.astLru.set(fileName, parsedAst);
return parsedAst;
}
}
});
For the detailed process of generating the estree ast using swc, please refer to Native Parser.
Note
rollup and swc pass the ast as an ArrayBuffer, and rollup later instantiates its internal ast node classes by parsing that ArrayBuffer in javascript. Therefore, rollup implements module.info.ast lazily, generating the javascript estree ast only when it is needed (for example, for caching or when a plugin reuses the ast).
// Make lazy and apply LRU cache to not hog the memory
Object.defineProperty(this.info, 'ast', {
get: () => {
if (this.graph.astLru.has(fileName)) {
return this.graph.astLru.get(fileName)!;
} else {
const parsedAst = this.tryParse();
// If the cache is not disabled, we need to keep the AST in memory
// until the end when the cache is generated
if (this.options.cache !== false) {
Object.defineProperty(this.info, 'ast', {
value: parsedAst
});
return parsedAst;
}
// Otherwise, we keep it in a small LRU cache to not hog too much
// memory but allow the same AST to be requested several times.
this.graph.astLru.set(fileName, parsedAst);
return parsedAst;
}
}
});
Transformed Code
From the Cache Plugin section, we can see that caching the parsing results of user plugins' transform hooks is an important means to accelerate builds.
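To make the benefit concrete, here is a minimal sketch of reusing a transform result when the original code is unchanged. It is not rollup's implementation: `createCachedTransformer` and `TransformCacheEntry` are hypothetical names, and the synchronous `transform` callback stands in for the whole chain of plugin transform hooks.

```typescript
// Hypothetical cache entry: the input that was transformed and the output it produced.
interface TransformCacheEntry {
  originalCode: string;
  code: string;
}

// Wraps a transform function so that an unchanged input reuses the previous output.
function createCachedTransformer(transform: (code: string, id: string) => string) {
  const entries = new Map<string, TransformCacheEntry>();
  let transformCalls = 0;
  return {
    run(id: string, originalCode: string): string {
      const cached = entries.get(id);
      // Same input as last time: skip the plugin work entirely.
      if (cached && cached.originalCode === originalCode) return cached.code;
      transformCalls++;
      const code = transform(originalCode, id);
      entries.set(id, { originalCode, code });
      return code;
    },
    get transformCalls(): number {
      return transformCalls;
    }
  };
}

// A stand-in for an expensive plugin transform.
const transformer = createCachedTransformer(code => code.replace(/\bconst\b/g, 'var'));

const firstOutput = transformer.run('a.js', 'const a = 1;'); // runs the transform
const secondOutput = transformer.run('a.js', 'const a = 1;'); // reuses the cached output
const thirdOutput = transformer.run('a.js', 'const a = 2;'); // input changed, runs again
```

The sketch only works if the transform is deterministic for a given input, which is the same assumption the Cache Plugin section describes.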
Dependencies
rollup only caches the id of dependent modules, not the ast, code, and other information of dependent modules.
export function getId(m: { id: string | null }): string {
return m.id!;
}
export default class Module {
toJSON(): ModuleJSON {
return {
ast: this.info.ast!,
attributes: this.info.attributes,
code: this.info.code!,
customTransformCache: this.customTransformCache,
dependencies: Array.from(this.dependencies, getId),
id: this.id,
meta: this.info.meta,
moduleSideEffects: this.info.moduleSideEffects,
originalCode: this.originalCode,
originalSourcemap: this.originalSourcemap,
resolvedIds: this.resolvedIds,
sourcemapChain: this.sourcemapChain,
syntheticNamedExports: this.info.syntheticNamedExports,
transformDependencies: this.transformDependencies,
transformFiles: this.transformFiles
};
}
}
Assume module a depends on module b:
// a.js
import { b } from './b.js';
// b.js
export const b = 'module b';
Then when module a hits the cache, it will reuse the resolveId result of the dependent module that has been parsed. For module b, rollup skips the resolveId parsing of module b, saving the call of the resolveId plugin.
In other words, if a module hits the cache, then the resolveId hooks of all its dependent modules will not be executed.
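The effect on the resolveId hook can be sketched as follows. This is an illustration under assumptions, not rollup's code: `ResolvedIdSketch`, `CachedResolutions` and `resolveDependencies` are hypothetical names, and the cached `resolvedIds` map is assumed to be keyed by the import specifier as written in the source.

```typescript
// Hypothetical, reduced form of a resolution result.
interface ResolvedIdSketch {
  id: string;
  external: boolean;
}

// Assumption: resolutions from the previous build, keyed by import specifier.
interface CachedResolutions {
  resolvedIds: Record<string, ResolvedIdSketch>;
}

// On a cache hit the stored resolutions are reused; otherwise resolveId runs per import.
function resolveDependencies(
  sources: string[],
  cached: CachedResolutions | undefined,
  resolveId: (source: string) => ResolvedIdSketch
): ResolvedIdSketch[] {
  return sources.map(source => cached?.resolvedIds[source] ?? resolveId(source));
}

// Count how often the (stand-in) resolveId hook is invoked.
let resolveCalls = 0;
const resolveWithCount = (source: string): ResolvedIdSketch => {
  resolveCalls++;
  return { id: `/project/src/${source.replace('./', '')}`, external: false };
};

const cachedModuleA: CachedResolutions = {
  resolvedIds: { './b.js': { id: '/project/src/b.js', external: false } }
};

// Module a hits the cache: './b.js' is taken from the cached resolutions.
const resolvedOnHit = resolveDependencies(['./b.js'], cachedModuleA, resolveWithCount);
const callsAfterHit = resolveCalls;

// Module a misses the cache: './b.js' has to be resolved again.
const resolvedOnMiss = resolveDependencies(['./b.js'], undefined, resolveWithCount);
const callsAfterMiss = resolveCalls;
```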
Sourcemap
From the Source Map chapter, we can see that rollup internally uses magic string to manage code changes, making it easy to quickly generate sourcemap information. Using magic string to maintain sourcemap information for simple code changes is fine, but using magic string to maintain complex code translation work may be mentally burdensome for some developers.
Therefore, some plugins do not rely on magic string to generate sourcemap information, but generate sourcemap information through tools, which may result in performance loss.
In summary, caching sourcemap information in large projects is also significant in improving build efficiency.
Cache Plugin
The vite plugin system is compatible with the rollup plugin system, so the problems in rollup plugin design also exist in vite. In production environments, the parsing efficiency of plugins has a great impact on the efficiency of bundler building modules. Therefore, one of vite's optimization solutions is Warm Up Frequently Used Files, which alleviates the problem of low plugin execution efficiency by preheating to parse modules in advance.
It can be seen how important it is to cache the parsing results of modules. rollup makes an assumption that plugins have no side effects by default.
Under this assumption, when the input remains unchanged (i.e., original code remains unchanged), the parsing result of the plugin is deterministic (i.e., transformed code remains unchanged), so caching the transformed code skips the execution of the plugin, thereby improving build efficiency.
Of course, user plugins may also have side effects, so rollup provides the following ways to handle user plugins with side effects:
- Return true in the shouldTransformCachedModule hook to skip caching for specified modules.
- rollup provides this.cache in the plugin context of transform. If the user plugin uses this.cache to set custom cache for specified modules, then rollup will not cache that module.
export function getTrackedPluginCache(
  pluginCache: PluginCache,
  onUse: () => void
): PluginCache {
  return {
    delete(id: string) {
      onUse();
      return pluginCache.delete(id);
    },
    get(id: string) {
      onUse();
      return pluginCache.get(id);
    },
    has(id: string) {
      onUse();
      return pluginCache.has(id);
    },
    set(id: string, value: any) {
      onUse();
      return pluginCache.set(id, value);
    }
  };
}
async function transform(
  source: SourceDescription,
  module: Module,
  pluginDriver: PluginDriver,
  log: LogHandler
): Promise<TransformModuleJSON> {
  let customTransformCache = false;
  const useCustomTransformCache = () => (customTransformCache = true);
  code = await pluginDriver.hookReduceArg0(
    'transform',
    [currentSource, id],
    transformReducer,
    (pluginContext, plugin): TransformPluginContext => {
      pluginName = plugin.name;
      return {
        ...pluginContext,
        cache: customTransformCache
          ? pluginContext.cache
          : getTrackedPluginCache(
              pluginContext.cache,
              useCustomTransformCache
            )
      };
    }
  );
}
An interesting point is that this.cache is a hidden feature, and rollup does not mention it in the official documentation.
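From the plugin author's side, using this.cache inside transform might look like the sketch below. The `get`/`has`/`set`/`delete` methods follow the getTrackedPluginCache excerpt; everything else is an assumption for illustration: the `PluginCacheSketch` and `TransformContextSketch` interfaces, the `cacheDemoPlugin` object, the `expensiveTransform` stand-in, and the Map-backed fake context used to run the example without rollup.

```typescript
// Minimal stand-in for the cache object exposed as this.cache (illustrative types).
interface PluginCacheSketch {
  get(id: string): unknown;
  has(id: string): boolean;
  set(id: string, value: unknown): void;
  delete(id: string): boolean;
}

// Minimal stand-in for the transform plugin context.
interface TransformContextSketch {
  cache: PluginCacheSketch;
}

// Stand-in for costly plugin work; counts how often it actually runs.
let expensiveCalls = 0;
function expensiveTransform(code: string): string {
  expensiveCalls++;
  return code.toUpperCase();
}

// A plugin that manages its own cache. According to the text above, touching
// this.cache in transform means rollup will not reuse its own module cache here.
const cacheDemoPlugin = {
  name: 'cache-demo',
  transform(this: TransformContextSketch, code: string, id: string): string {
    const key = `${id}:${code}`;
    if (this.cache.has(key)) return this.cache.get(key) as string;
    const result = expensiveTransform(code);
    this.cache.set(key, result);
    return result;
  }
};

// Fake context so the sketch runs on its own.
function createMapBackedCache(): PluginCacheSketch {
  const store = new Map<string, unknown>();
  return {
    get: id => store.get(id),
    has: id => store.has(id),
    set: (id, value) => {
      store.set(id, value);
    },
    delete: id => store.delete(id)
  };
}

const fakeContext: TransformContextSketch = { cache: createMapBackedCache() };
const firstResult = cacheDemoPlugin.transform.call(fakeContext, 'let x;', 'a.js');
const secondResult = cacheDemoPlugin.transform.call(fakeContext, 'let x;', 'a.js');
```

The second call returns the stored result without running `expensiveTransform` again, which is the kind of plugin-managed caching that rollup detects through the tracked cache wrapper.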
In watch mode, each time a file changes, rollup re-instantiates the Graph class and, at the same time, increments the usage counter of every pluginCache entry.
class Graph {
constructor(
private readonly options: NormalizedInputOptions,
watcher: RollupWatcher | null
) {
if (options.cache !== false) {
if (options.cache?.modules) {
for (const module of options.cache.modules)
this.cachedModules.set(module.id, module);
}
this.pluginCache = options.cache?.plugins || Object.create(null);
// increment access counter
for (const name in this.pluginCache) {
const cache = this.pluginCache[name];
for (const value of Object.values(cache)) value[0]++;
}
}
}
}
Users can configure experimentalCacheExpiry to control when a plugin cache entry stops being valid: once an entry's counter reaches this value, the entry is deleted in getCache().
class Graph {
getCache(): RollupCache {
for (const name in this.pluginCache) {
const cache = this.pluginCache[name];
let allDeleted = true;
for (const [key, value] of Object.entries(cache)) {
if (value[0] >= this.options.experimentalCacheExpiry)
delete cache[key];
else allDeleted = false;
}
if (allDeleted) delete this.pluginCache[name];
}
}
}
Decision Logic
Take the following dependency relationship as an example: the entry module A depends on modules B and C, module B depends on modules D and E, and module D depends on module F.
In watch mode, during the first build, starting from the entry module A, each module will execute the resolveId, load, and transform hooks. Now suppose module B is modified.
Then the decision logic is as follows:
1. Module A, as the entry module, will execute resolveId every time. At the same time, every module will execute the load hook to get the original code content.
2. Cache Decision Logic:
   - Check whether a cached module exists: look up cachedModule by the id returned from resolveId. This is the module cached during the previous build.
     const cachedModule = this.graph.cachedModules.get(id);
   - Check whether there is a custom transform cache (cachedModule.customTransformCache); if it exists, skip caching.
   - Check whether the code has changed; if it has, skip caching.
   - Call the shouldTransformCachedModule hook to determine whether the current module should use the cache; if it returns true, skip caching (by default, the cache is used).
3. According to the cache strategy in 2, the entry module A satisfies the cache strategy, so it will not execute the transform hook.
4. Continue with the cache decision logic for child dependency modules. The child dependency modules of module A are modules B and C. Since module A hits the cache, neither module B nor module C will execute the resolveId hook; they only get the original code content through the load hook.
5. For module C, through the cache strategy in 2, module C also hits the cache, so it will not execute the transform hook.
6. For module B, through the cache strategy in 2, module B has been modified, so it does not hit the cache. Therefore, module B needs to execute the transform hook.
7. Continue with the cache decision logic for the child dependency modules of module B. The child dependency modules of module B are modules D and E. Since module B does not hit the cache, resolveId must be re-executed for its child dependency modules. Therefore, both modules D and E will execute the resolveId hook.
8. For module D, through the cache strategy in 2, module D hits the cache, so it will not execute the transform hook.
9. Continue with the cache decision logic for the child dependency modules of module D. The child dependency module of module D is module F. Since module D hits the cache, it will reuse the resolveId result of the child dependency module, which means the child dependency module F of D will not execute the resolveId hook.
10. For module F, through the cache strategy in 2, module F hits the cache, so it will not execute the transform hook. Since there are no child dependency modules, parsing ends.
11. For module E, through the cache strategy in 2, module E is unchanged and hits the cache, so it will not execute the transform hook. Since there are no child dependency modules, parsing ends.
Through the above process, we can see that when module B is modified, the following logic occurs:
- Module A executes the resolveId and load hooks.
- Module B executes the load hook and also executes the transform hook.
- Module C executes the load hook.
- Module D executes the resolveId and load hooks.
- Module E executes the resolveId and load hooks.
- Module F executes the load hook.
Compared with a cold-start build, the transform hook is executed only for the changed module (module B) and is skipped for all other modules. According to the cache decision logic, the number of resolveId hook executions is also reduced. What remains unchanged is that every module still executes the load hook.
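The walkthrough can be replayed with a small simulation. This is a toy model rather than rollup's module loader: the graph is the example from this section (A imports B and C, B imports D and E, D imports F), `simulateRebuild` is a hypothetical helper, and it encodes only two rules from the text, namely that every module is loaded and that a cache hit on an importer skips both its own transform and the resolveId calls for its direct dependencies.

```typescript
type Hook = 'resolveId' | 'load' | 'transform';

// Dependency graph from the walkthrough: A -> B, C; B -> D, E; D -> F.
const dependencyGraph: Record<string, string[]> = {
  A: ['B', 'C'],
  B: ['D', 'E'],
  C: [],
  D: ['F'],
  E: [],
  F: []
};

// Returns, per module, the hooks that run during one build.
function simulateRebuild(
  entry: string,
  changed: Set<string>,
  cacheAvailable: boolean
): Record<string, Hook[]> {
  const hooks: Record<string, Hook[]> = {};
  const visit = (id: string, needsResolve: boolean): void => {
    if (hooks[id]) return; // each module is processed once
    const cacheHit = cacheAvailable && !changed.has(id);
    const ran: Hook[] = [];
    if (needsResolve) ran.push('resolveId');
    ran.push('load'); // load always runs to obtain the original code
    if (!cacheHit) ran.push('transform');
    hooks[id] = ran;
    // A cache hit reuses the stored resolutions of the direct dependencies.
    for (const dependency of dependencyGraph[id] ?? []) visit(dependency, !cacheHit);
  };
  visit(entry, true); // the entry module is always resolved
  return hooks;
}

// Rebuild after editing B, with the cache from the previous build.
const rebuildHooks = simulateRebuild('A', new Set(['B']), true);
// First build: no cache, so every module runs all three hooks.
const coldStartHooks = simulateRebuild('A', new Set<string>(), false);
```

Under these rules only B runs transform during the rebuild, D and E are re-resolved because their importer B missed the cache, and C and F run nothing but load.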
Performance
rollup's incremental update is similar to react's incremental update of the fiber tree, checking from top to bottom. For the entry module, resolveId is called every time, and at the same time, every module will execute the load hook. If a module hits the cache, it will reuse the transform product of the transform hook of that module, and also reuse the resolveId results of all dependent modules contained in that module.
With the above caching features, compared with the first build in watch mode, rollup avoids many resolveId and transform hook executions for unrelated modules and executes the transform hook only for changed modules, which speeds up rebuilds in watch mode. Unfortunately, every module still needs to execute the load hook, which is costly in scenarios with a large number of modules.
From rollup issue 2182, rollup issue 3728, we can see that rollup currently does not support persistent cache on disk space (Persistent Cache), which means that rollup currently only supports incremental updates in watch mode, not incremental updates during cold start again. webpack supports Persistent Cache, which is also one of the reasons why webpack outperforms rollup in secondary cold start.
Vite Incremental Build
vite has not yet implemented a complete Persistent Cache; it only supports Persistent Cache for pre-build products. For details, see feat: Persistent cache-Jul 5, 2021. The reason may be that changes to some configuration files invalidate the cache, which needs to be considered more comprehensively. That discussion also provides two caching ideas.
The first idea is to implement caching at the plugin level rather than at the entire dependency graph level. This approach allows for finer granularity and easier management of caching.
The second idea is to pre-transform all requests on the server side and implement a function similar to import-analysis. This function uses the hash value of the transform request as a query parameter and leverages the browser's strong caching mechanism. This method needs to recursively invalidate plugin cache when files change (through file watcher) and when the server restarts (as implemented in this PR). This is similar to the ssrTransformation mechanism in vite.
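The hash-as-query-parameter part of the second idea can be illustrated with a short sketch. Nothing here is taken from vite: the `v` parameter name, the `withTransformHash` helper, and the small FNV-1a-style hash over UTF-16 code units (used only to avoid external dependencies; a real implementation would more likely use a cryptographic content hash) are all assumptions.

```typescript
// Tiny FNV-1a-style hash over UTF-16 code units; a stand-in for a real content hash.
function hashText(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ('00000000' + hash.toString(16)).slice(-8);
}

// Appends the hash of the transformed code as a query parameter (hypothetical name "v"),
// so the URL changes exactly when the transform result changes.
function withTransformHash(url: string, transformedCode: string): string {
  const separator = url.includes('?') ? '&' : '?';
  return `${url}${separator}v=${hashText(transformedCode)}`;
}

const urlBeforeEdit = withTransformHash('/src/a.js', 'export const a = 1;');
const urlSameContent = withTransformHash('/src/a.js', 'export const a = 1;');
const urlAfterEdit = withTransformHash('/src/a.js', 'export const a = 2;');
```

Because an unchanged transform result yields the same URL, the browser can serve it from a long-lived cache, while an edited module gets a new URL and is fetched again.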