Incremental Build
Currently, rollup automatically uses the cache mechanism to implement incremental builds in watch mode. In cache mode, rollup's incremental build is similar to react's incremental update of the fiber tree: it only compiles modules related to file modifications and does not compile other unrelated modules. Let's first introduce the decision logic of rollup's cache mechanism.
Cache Hit Conditions
The specific implementation of rollup's cache hit logic is as follows:
const cachedModule = this.graph.cachedModules.get(id);
if (
  cachedModule &&
  !cachedModule.customTransformCache &&
  cachedModule.originalCode === sourceDescription.code &&
  !(await this.pluginDriver.hookFirst('shouldTransformCachedModule', [
    {
      ast: cachedModule.ast,
      code: cachedModule.code,
      id: cachedModule.id,
      meta: cachedModule.meta,
      moduleSideEffects: cachedModule.moduleSideEffects,
      resolvedSources: cachedModule.resolvedIds,
      syntheticNamedExports: cachedModule.syntheticNamedExports
    }
  ]))
) {
  if (cachedModule.transformFiles) {
    for (const emittedFile of cachedModule.transformFiles)
      this.pluginDriver.emitFile(emittedFile);
  }
  await module.setSource(cachedModule);
}
The following conditions must be met to reuse the cache:
1. The cached module exists. rollup first looks up cachedModule by the id obtained from resolveId. Since rollup does not currently support persistent cache, cachedModule does not exist during the initial build, so caching is skipped.
   const cachedModule = this.graph.cachedModules.get(id);
2. The plugin does not use custom caching. If custom caching is used in the plugin (i.e., the user plugin actively uses the this.cache provided by the plugin context to set a cache for the specified module), the modules associated with this.cache will skip caching.
3. The code has not changed. If the original code differs from the cached original code, caching is skipped.
4. The shouldTransformCachedModule hook does not return true. Plugins can use this hook to skip caching for specified modules.
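The four conditions can be condensed into a single predicate. The following is a minimal sketch, not rollup's implementation: the CachedModuleEntry shape, the canReuseCache name and the synchronous shouldTransform callback (standing in for the asynchronous shouldTransformCachedModule hook) are assumptions made for illustration.

```typescript
// Hypothetical, simplified shape of one cached module entry.
interface CachedModuleEntry {
  id: string;
  originalCode: string; // code as returned by the load hook in the previous build
  code: string; // code after all transform hooks in the previous build
  customTransformCache: boolean; // true if a plugin touched this.cache during transform
}

// Hypothetical synchronous stand-in for the shouldTransformCachedModule hook.
type ShouldTransform = (cached: CachedModuleEntry) => boolean;

function canReuseCache(
  cachedModules: Map<string, CachedModuleEntry>,
  id: string,
  loadedCode: string,
  shouldTransform: ShouldTransform
): boolean {
  const cached = cachedModules.get(id);
  // 1. No entry from a previous build (always the case on the first build).
  if (!cached) return false;
  // 2. A plugin used its own cache for this module, so the stored result is not trusted.
  if (cached.customTransformCache) return false;
  // 3. The freshly loaded code differs from the code that produced the cached result.
  if (cached.originalCode !== loadedCode) return false;
  // 4. A plugin may still veto reuse.
  return !shouldTransform(cached);
}
```

Only when this predicate holds is the transform step skipped; in every other case the module goes through the transform hooks again.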
Cache Timing
After rollup builds all modules, it triggers the buildEnd hook. Then rollup confirms whether caching is needed through the rawInputOptions.cache flag.
export async function rollupInternal(
  rawInputOptions: RollupOptions,
  watcher: RollupWatcher | null
): Promise<RollupBuild> {
  // remove the cache object from the memory after graph creation (cache is not used anymore)
  const useCache = rawInputOptions.cache !== false;
  await catchUnfinishedHookActions(graph.pluginDriver, async () => {
    try {
      timeStart('initialize', 2);
      await graph.pluginDriver.hookParallel('buildStart', [inputOptions]);
      timeEnd('initialize', 2);
      await graph.build();
    } catch (error_: any) {
      const watchFiles = Object.keys(graph.watchFiles);
      if (watchFiles.length > 0) {
        error_.watchFiles = watchFiles;
      }
      await graph.pluginDriver.hookParallel('buildEnd', [error_]);
      await graph.pluginDriver.hookParallel('closeBundle', []);
      throw error_;
    }
    await graph.pluginDriver.hookParallel('buildEnd', []);
  });
  const result: RollupBuild = {
    cache: useCache ? graph.getCache() : undefined
  };
  return result;
}
By default, rollup will perform caching operations (i.e., rawInputOptions.cache !== false is true), so it caches the required information through graph.getCache().
class Graph {
  getCache(): RollupCache {
    // handle plugin cache eviction
    for (const name in this.pluginCache) {
      const cache = this.pluginCache[name];
      let allDeleted = true;
      for (const [key, value] of Object.entries(cache)) {
        if (value[0] >= this.options.experimentalCacheExpiry)
          delete cache[key];
        else allDeleted = false;
      }
      if (allDeleted) delete this.pluginCache[name];
    }
    return {
      modules: this.modules.map(module => module.toJSON()),
      plugins: this.pluginCache
    };
  }
}
From the source code, we can see that rollup attempts to cache information about modules and plugins. Let's look at what specific information is cached.
Cache Module
The module's cache content is generated through the module.toJSON() method. Let's look at the implementation of the module.toJSON() method.
export default class Module {
  toJSON(): ModuleJSON {
    return {
      ast: this.info.ast!,
      attributes: this.info.attributes,
      code: this.info.code!,
      customTransformCache: this.customTransformCache,
      dependencies: Array.from(this.dependencies, getId),
      id: this.id,
      meta: this.info.meta,
      moduleSideEffects: this.info.moduleSideEffects,
      originalCode: this.originalCode,
      originalSourcemap: this.originalSourcemap,
      resolvedIds: this.resolvedIds,
      sourcemapChain: this.sourcemapChain,
      syntheticNamedExports: this.info.syntheticNamedExports,
      transformDependencies: this.transformDependencies,
      transformFiles: this.transformFiles
    };
  }
}
From the above source code, we can see the cached content. The information cached by rollup for modules mainly includes:
Ast
It's worth noting that the ast here is an estree-compatible ast (plain data), not the tree of ast node class instances that rollup builds internally. Therefore, even if it's cached, semantic analysis still needs to be performed again.
When the cache is hit, the cached data will be used when calling the Module.setSource method.
type ProgramNode = RollupAstNode<estree.Program>;
class Module {
  async setSource({ ast }: { ast: ProgramNode }) {
    if (ast) {
      this.ast = new nodeConstructors[ast.type](
        programParent,
        this.scope
      ).parseNode(ast) as Program;
      this.info.ast = ast;
    } else {
      // Measuring asynchronous code does not provide reasonable results
      timeEnd('generate ast', 3);
      const astBuffer = await parseAsync(
        code,
        false,
        this.options.jsx !== false
      );
      timeStart('generate ast', 3);
      this.ast = convertProgram(astBuffer, programParent, this.scope);
    }
  }
}
We can see that rollup recursively instantiates its own ast node classes from the cached standard estree ast, and finally assigns the result to the module.ast variable. The estree ast is assigned to the module.info.ast variable as cache.
this.ast = new nodeConstructors[ast.type](
  programParent,
  this.scope
).parseNode(ast) as Program;
In rollup, the module instance stores two types of ast structures: one is the estree standard ast, stored in the module.info.ast variable; the other is the ast class instance generated by rollup based on the estree structure, stored in the module.ast variable. Subsequent semantic analysis and tree-shaking operations are performed on the ast class instance implemented by rollup.
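The difference between the two representations can be illustrated with a toy model. This is a sketch under stated assumptions: PlainNode, NodeBase, the two node classes and instantiate are hypothetical names, and only the nodeConstructors lookup and the parseNode call mirror the excerpt above. The plain data is what can be serialized into a cache; the class instances have to be rebuilt from it on every build.

```typescript
// Plain ESTree-like data: the kind of value that can be stored in a cache.
interface PlainNode {
  type: string;
  [key: string]: unknown;
}

// Hypothetical, minimal counterpart of an internal ast node class.
class NodeBase {
  readonly children: NodeBase[] = [];
  constructor(
    readonly type: string,
    readonly parent: NodeBase | null
  ) {}

  // Recursively turn plain child data into class instances.
  parseNode(plain: PlainNode): this {
    for (const value of Object.values(plain)) {
      const candidates: unknown[] = Array.isArray(value) ? value : [value];
      for (const candidate of candidates) {
        if (isPlainNode(candidate)) {
          this.children.push(instantiate(candidate, this));
        }
      }
    }
    return this;
  }
}

class Identifier extends NodeBase {}
class Program extends NodeBase {}

// Lookup table from the plain node's type to the class that represents it.
const nodeConstructors: Record<string, typeof NodeBase> = { Identifier, Program };

function isPlainNode(value: unknown): value is PlainNode {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as PlainNode).type === 'string'
  );
}

function instantiate(plain: PlainNode, parent: NodeBase | null): NodeBase {
  const Constructor = nodeConstructors[plain.type] ?? NodeBase;
  return new Constructor(plain.type, parent).parseNode(plain);
}
```

Calling instantiate on a cached plain tree yields instances with parent links and methods, which is the part that cannot be cached and therefore has to be redone even on a cache hit.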
If the cache is not hit (first build or cache disabled with rawInputOptions.cache = false), rollup will use swc's capabilities to parse the code into ast.
// Measuring asynchronous code does not provide reasonable results
timeEnd('generate ast', 3);
const astBuffer = await parseAsync(code, false, this.options.jsx !== false);
timeStart('generate ast', 3);
this.ast = convertProgram(astBuffer, programParent, this.scope);

// Make lazy and apply LRU cache to not hog the memory
Object.defineProperty(this.info, 'ast', {
  get: () => {
    if (this.graph.astLru.has(fileName)) {
      return this.graph.astLru.get(fileName)!;
    } else {
      const parsedAst = this.tryParse();
      // If the cache is not disabled, we need to keep the AST in memory
      // until the end when the cache is generated
      if (this.options.cache !== false) {
        Object.defineProperty(this.info, 'ast', {
          value: parsedAst
        });
        return parsedAst;
      }
      // Otherwise, we keep it in a small LRU cache to not hog too much
      // memory but allow the same AST to be requested several times.
      this.graph.astLru.set(fileName, parsedAst);
      return parsedAst;
    }
  }
});
For the detailed process of generating the estree ast using swc, please refer to Native Parser.
Note
rollup and swc pass the ast as an ArrayBuffer, and rollup later instantiates its internally implemented ast node classes by decoding that ArrayBuffer in javascript. Therefore, we can see that rollup implements module.info.ast in a lazy way, generating the javascript estree ast structure only when it is needed (such as for caching and when plugins reuse the ast).
// Make lazy and apply LRU cache to not hog the memory
Object.defineProperty(this.info, 'ast', {
  get: () => {
    if (this.graph.astLru.has(fileName)) {
      return this.graph.astLru.get(fileName)!;
    } else {
      const parsedAst = this.tryParse();
      // If the cache is not disabled, we need to keep the AST in memory
      // until the end when the cache is generated
      if (this.options.cache !== false) {
        Object.defineProperty(this.info, 'ast', {
          value: parsedAst
        });
        return parsedAst;
      }
      // Otherwise, we keep it in a small LRU cache to not hog too much
      // memory but allow the same AST to be requested several times.
      this.graph.astLru.set(fileName, parsedAst);
      return parsedAst;
    }
  }
});
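The self-replacing getter used in the cache-enabled branch can be isolated in a few lines. This is a minimal sketch, assuming a hypothetical LazyModuleInfo shape and a parse callback that stands in for the expensive conversion; the LRU branch is left out.

```typescript
// Hypothetical, reduced module info object.
interface LazyModuleInfo {
  ast: object | null;
}

// parse() stands in for the expensive step that builds the estree structure.
function defineLazyAst(info: LazyModuleInfo, parse: () => object): void {
  Object.defineProperty(info, 'ast', {
    configurable: true, // must stay configurable so the getter can replace itself
    enumerable: true,
    get() {
      const parsed = parse();
      // Swap the accessor for a plain value so later reads do not parse again.
      Object.defineProperty(info, 'ast', { value: parsed });
      return parsed;
    }
  });
}
```

Nothing is parsed until the property is read for the first time, and every later read returns the stored value, which matches the behavior the excerpt needs when the AST has to stay in memory until the cache is generated.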
Transformed Code
From the Cache Plugin section, we can see that caching the parsing results of user plugins' transform hooks is an important means to accelerate builds.
Dependencies
rollup only caches the id of dependent modules, not the ast, code, and other information of dependent modules.
export function getId(m: { id: string | null }): string {
  return m.id!;
}
export default class Module {
  toJSON(): ModuleJSON {
    return {
      ast: this.info.ast!,
      attributes: this.info.attributes,
      code: this.info.code!,
      customTransformCache: this.customTransformCache,
      dependencies: Array.from(this.dependencies, getId),
      id: this.id,
      meta: this.info.meta,
      moduleSideEffects: this.info.moduleSideEffects,
      originalCode: this.originalCode,
      originalSourcemap: this.originalSourcemap,
      resolvedIds: this.resolvedIds,
      sourcemapChain: this.sourcemapChain,
      syntheticNamedExports: this.info.syntheticNamedExports,
      transformDependencies: this.transformDependencies,
      transformFiles: this.transformFiles
    };
  }
}
Assume module a depends on module b:
// a.js
import { b } from './b.js';
// b.js
export const b = 'module b';
Then when module a hits the cache, it reuses the already resolved resolveId results of its dependencies. For module b, rollup therefore skips resolveId resolution, saving the call to the plugins' resolveId hooks.
In other words, if a module hits the cache, then the resolveId hooks of all its dependent modules will not be executed.
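The saving can be made visible with a small counter. The sketch below is an illustration only: CountingResolver, its toy resolution rule and resolveDependencies are hypothetical, and the only idea taken from the text is that resolved ids stored with a cached module are reused instead of calling the hook again.

```typescript
type ResolvedIdMap = Record<string, string>;

// Hypothetical cached data of an importing module.
interface CachedImporter {
  resolvedIds: ResolvedIdMap;
}

// Hypothetical resolver that counts how often the resolveId step runs.
class CountingResolver {
  calls = 0;
  resolve(source: string): string {
    this.calls++;
    // Toy rule, purely for illustration: "./x.js" becomes "/src/x.js".
    return source.replace(/^\.\//, '/src/');
  }
}

function resolveDependencies(
  sources: string[],
  cached: CachedImporter | undefined,
  resolver: CountingResolver
): ResolvedIdMap {
  // Start from the cached map when the importer hit the cache.
  const resolvedIds: ResolvedIdMap = { ...(cached?.resolvedIds ?? {}) };
  for (const source of sources) {
    // Only call the resolver when there is no cached result for this import.
    if (!(source in resolvedIds)) {
      resolvedIds[source] = resolver.resolve(source);
    }
  }
  return resolvedIds;
}
```

On a cold build every import goes through the resolver once; when the importer's cached map is passed in, the call count stays where it was.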
Sourcemap
From the Source Map chapter, we can see that rollup internally uses magic string to manage code changes, making it easy to quickly generate sourcemap information. Using magic string to maintain sourcemap information for simple code changes is fine, but using magic string to maintain complex code transformations can be a heavy mental burden for some developers.
Therefore, some plugins do not rely on magic string to generate sourcemap information, but generate it through other tools, which may result in performance loss.
In summary, caching sourcemap information in large projects is also significant in improving build efficiency.
Cache Plugin
The vite plugin system is compatible with the rollup plugin system, so the problems in rollup plugin design also exist in vite. In production environments, the parsing efficiency of plugins has a great impact on the efficiency of bundler building modules. Therefore, one of vite's optimization solutions is Warm Up Frequently Used Files, which alleviates the problem of low plugin execution efficiency by preheating to parse modules in advance.
It can be seen how important it is to cache the parsing results of modules. rollup makes an assumption that plugins have no side effects by default.
Under this assumption, when the input remains unchanged (i.e., original code remains unchanged), the parsing result of the plugin is deterministic (i.e., transformed code remains unchanged), so caching the transformed code skips the execution of the plugin, thereby improving build efficiency.
Of course, user plugins may also have side effects, so rollup provides the following ways to handle user plugins with side effects:
- Return true in the shouldTransformCachedModule hook to skip caching for specified modules.
- rollup provides this.cache in the plugin context of transform. If the user plugin uses this.cache to set custom cache for specified modules, then rollup will not cache that module.
export function getTrackedPluginCache(
  pluginCache: PluginCache,
  onUse: () => void
): PluginCache {
  return {
    delete(id: string) {
      onUse();
      return pluginCache.delete(id);
    },
    get(id: string) {
      onUse();
      return pluginCache.get(id);
    },
    has(id: string) {
      onUse();
      return pluginCache.has(id);
    },
    set(id: string, value: any) {
      onUse();
      return pluginCache.set(id, value);
    }
  };
}

async function transform(
  source: SourceDescription,
  module: Module,
  pluginDriver: PluginDriver,
  log: LogHandler
): Promise<TransformModuleJSON> {
  let customTransformCache = false;
  const useCustomTransformCache = () => (customTransformCache = true);
  code = await pluginDriver.hookReduceArg0(
    'transform',
    [currentSource, id],
    transformReducer,
    (pluginContext, plugin): TransformPluginContext => {
      pluginName = plugin.name;
      return {
        ...pluginContext,
        cache: customTransformCache
          ? pluginContext.cache
          : getTrackedPluginCache(
              pluginContext.cache,
              useCustomTransformCache
            )
      };
    }
  );
}
An interesting point is that this.cache is a hidden feature, and rollup does not mention it in the official documentation.
In watch mode, each time a file changes, rollup re-instantiates the Graph class and, at the same time, increments the counter stored with every pluginCache entry.
class Graph {
  constructor(
    private readonly options: NormalizedInputOptions,
    watcher: RollupWatcher | null
  ) {
    if (options.cache !== false) {
      if (options.cache?.modules) {
        for (const module of options.cache.modules)
          this.cachedModules.set(module.id, module);
      }
      this.pluginCache = options.cache?.plugins || Object.create(null);
      // increment access counter
      for (const name in this.pluginCache) {
        const cache = this.pluginCache[name];
        for (const value of Object.values(cache)) value[0]++;
      }
    }
  }
}
Users can configure experimentalCacheExpiry to control when plugin cache entries expire: getCache() deletes every entry whose counter has reached this value.
class Graph {
  getCache(): RollupCache {
    for (const name in this.pluginCache) {
      const cache = this.pluginCache[name];
      let allDeleted = true;
      for (const [key, value] of Object.entries(cache)) {
        if (value[0] >= this.options.experimentalCacheExpiry)
          delete cache[key];
        else allDeleted = false;
      }
      if (allDeleted) delete this.pluginCache[name];
    }
  }
}
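Taken together, the constructor and getCache() form a simple aging scheme. The following sketch replays it outside of rollup; PluginCacheStore, ageEntries and evictExpired are hypothetical names, and only the [counter, value] tuple layout and the >= comparison are taken from the excerpts above.

```typescript
// Each entry is a [counter, value] tuple, mirroring the value[0] counter above.
type PluginCacheEntry = [number, unknown];
type PluginCacheStore = Record<string, Record<string, PluginCacheEntry>>;

// Start of a build: every entry ages by one run.
function ageEntries(store: PluginCacheStore): void {
  for (const cache of Object.values(store)) {
    for (const entry of Object.values(cache)) entry[0]++;
  }
}

// End of a build: drop entries whose counter reached the expiry threshold,
// and drop a plugin's cache entirely once it has no entries left.
function evictExpired(store: PluginCacheStore, expiry: number): void {
  for (const [name, cache] of Object.entries(store)) {
    for (const [key, entry] of Object.entries(cache)) {
      if (entry[0] >= expiry) delete cache[key];
    }
    if (Object.keys(cache).length === 0) delete store[name];
  }
}
```

With an expiry of 2, an entry that starts at 0 survives the first build (counter 1) and is removed at the end of the second build (counter 2), after which the now empty plugin cache is removed as well.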
Decision Logic
Take the following dependency relationship as an example: the entry module A depends on modules B and C, module B depends on modules D and E, and module D depends on module F.
In watch mode, during the first build, starting from the entry module A, each module will execute the resolveId, load, and transform hooks. Suppose module B is then modified.
Then the decision logic is as follows:
1. Module A, as the entry module, executes resolveId every time. At the same time, every module executes the load hook to get the original code content.
2. Cache Decision Logic:
   - Check if the cached module exists: rollup first looks up cachedModule by the id obtained from resolveId, which is actually the module cached during the previous build.
     const cachedModule = this.graph.cachedModules.get(id);
   - Check if there is a custom transform cache (cachedModule.customTransformCache); if it exists, skip caching.
   - Check if the code has changed before and after; if changed, skip caching.
   - Call the shouldTransformCachedModule hook to determine whether the current module should apply the cache; if it returns true, skip caching (by default the cache is applied).
3. According to the cache strategy in 2, the entry module A satisfies the cache strategy, so it will not execute the transform hook.
4. Continue with the cache decision logic for child dependency modules. The child dependency modules of module A are modules B and C. Since module A hits the cache, both modules B and C will not execute the resolveId hook and only get the original code content through the load hook.
5. For module C, through the cache strategy in 2, module C also hits the cache, so it will not execute the transform hook.
6. For module B, through the cache strategy in 2, module B has been modified, so it does not hit the cache. Therefore, module B needs to execute the transform hook.
7. Continue with the cache decision logic for the child dependency modules of module B. The child dependency modules of module B are modules D and E. Since module B does not hit the cache, resolveId has to be re-executed for its child dependency modules. Therefore, both modules D and E will execute the resolveId hook.
8. For module D, through the cache strategy in 2, module D hits the cache, so it will not execute the transform hook.
9. Continue with the cache decision logic for the child dependency modules of module D. The child dependency module of module D is module F. Since module D hits the cache, it reuses the resolveId result of its child dependency module, which means the child dependency module F of D will not execute the resolveId hook.
10. For module F, through the cache strategy in 2, module F hits the cache, so it will not execute the transform hook. Since there are no child dependency modules, parsing ends.
11. For module E, through the cache strategy in 2, module E is unchanged and hits the cache, so it will not execute the transform hook. Since there are no child dependency modules, parsing ends.
Through the above process, we can see that when module B is modified, the following logic occurs:
- Module A executes the resolveId and load hooks.
- Module B executes the load hook, and also executes the transform hook. Its resolveId result is reused from the cached module A.
- Module C executes the load hook.
- Module D executes the resolveId and load hooks.
- Module E executes the resolveId and load hooks.
- Module F executes the load hook.
Compared with cold start build, only the transform hook is executed for the changed module (module B), and the transform hook is not executed for other modules. At the same time, according to the cache decision logic, the number of resolveId hook executions will also be reduced. But what remains unchanged is that every module will execute the load hook.
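The walk through the graph can also be replayed as a small simulation. This is a sketch under stated assumptions, not rollup code: it hard-codes the example graph (A imports B and C, B imports D and E, D imports F), assumes every module was cached by a previous build and that no plugin opts out of caching, and applies the rule from the numbered steps that a dependency skips resolveId only when its importer hit the cache. The names dependencyGraph and hooksOnRebuild are hypothetical.

```typescript
type Hook = 'resolveId' | 'load' | 'transform';

// Dependency graph of the example: A -> B, C; B -> D, E; D -> F.
const dependencyGraph: Record<string, string[]> = {
  A: ['B', 'C'],
  B: ['D', 'E'],
  C: [],
  D: ['F'],
  E: [],
  F: []
};

// Returns, per module, the hooks that run on a rebuild for a given set of modified modules.
function hooksOnRebuild(entry: string, modified: Set<string>): Map<string, Hook[]> {
  const result = new Map<string, Hook[]>();

  const visit = (id: string, importerHitCache: boolean): void => {
    if (result.has(id)) return;
    const hooks: Hook[] = [];
    // resolveId is skipped only when the importer hit the cache and its resolved ids are reused.
    if (!importerHitCache) hooks.push('resolveId');
    // Every module is loaded so its original code can be compared with the cache.
    hooks.push('load');
    const hitCache = !modified.has(id);
    if (!hitCache) hooks.push('transform');
    result.set(id, hooks);
    for (const dependency of dependencyGraph[id] ?? []) visit(dependency, hitCache);
  };

  // The entry module has no cached importer, so it is always resolved.
  visit(entry, false);
  return result;
}
```

Calling hooksOnRebuild('A', new Set(['B'])) reports the transform hook for B only, while load appears for every module, which is the asymmetry discussed in the next section.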
Performance
rollup's incremental update is similar to react's incremental update of the fiber tree, checking from top to bottom. For the entry module, resolveId is called every time, and at the same time, every module will execute the load hook. If a module hits the cache, it will reuse the transform product of the transform hook of that module, and also reuse the resolveId results of all dependent modules contained in that module.
With the above caching features, compared with the first build in watch mode, rollup reduces the number of executions of the resolveId and transform hooks for many unrelated modules and only executes the transform hook for changed modules, accelerating the hot update build speed in watch mode. However, it is regrettable that every module still needs to execute the load hook, which is very performance-consuming in scenarios with a large number of modules.
From rollup issue 2182 and rollup issue 3728, we can see that rollup currently does not support persistent cache on disk (Persistent Cache), which means that rollup currently only supports incremental updates in watch mode, not incremental updates across cold starts. webpack supports Persistent Cache, which is also one of the reasons why webpack outperforms rollup in a second cold start.
Vite Incremental Build
vite has not yet implemented complete Persistent Cache, only supporting Persistent Cache for pre-build products. For details, see feat: Persistent cache-Jul 5, 2021. The reason may be related to changes in some configuration files causing cache invalidation, which needs to be considered more comprehensively. The discussion there also provides two caching ideas.
The first idea is to implement caching at the plugin level rather than at the entire dependency graph level. This approach allows for finer granularity and easier management of caching.
The second idea is to pre-transform all requests on the server side and implement a function similar to import-analysis. This function uses the hash value of the transform request as a query parameter and leverages the browser's strong caching mechanism. This method needs to recursively invalidate plugin cache when files change (through file watcher) and when the server restarts (as implemented in this PR). This is similar to the ssrTransformation mechanism in vite.