The Hashing Dilemma
Related Materials
[v3.0] New hashing algorithm that "fixes (nearly) everything" - GitHub
In rollup v3.0, the hashing algorithm was refactored: the new algorithm resolves long-standing hash instability issues, properly handles `renderChunk` plugin transformations, and supports circular dependencies between chunks.
Problem Statement
The execution flow of the old version's hashing algorithm was as follows:

1. Render all modules, except for dynamic imports and `import.meta` chunk references.
2. Calculate content hashes for all modules in the chunk based on this.
3. Extend the hash by considering all known dependencies and potential dynamic content added to the chunk wrapper.
4. Update dynamic imports and `import.meta` chunk references.
5. Render the chunk wrapper containing all static imports and exports.
6. Process the result through the `renderChunk` plugin hook.
In summary, it first calculates the content hash of all dependent modules in the chunk, then calculates the content hash of dynamically referenced chunks and `import.meta`, and finally updates the chunk's dynamic imports and `import.meta` references.
Existing Issues:

- The `renderChunk` plugin hook breaks the content hash: any transformations in `renderChunk` are completely ignored by rollup and do not affect the chunk's hash value, which means different contents can end up with the same hash.
- Complex chunk wrapper scenarios: having rollup track every change in the chunk wrapper to extend hash changes requires considering too many edge cases.
- Unstable hash values: there are cases where a chunk's content hasn't changed, but its hash changes due to unrelated modifications (e.g., changing file names).
Solution
Sorted Rendering (Attempting to Solve Hash Issues)
One method to solve hash issues is to first render chunks that have no dependencies, then iteratively render chunks that only depend on already rendered chunks, until all chunks are rendered. While this approach works in some cases, it has several obvious drawbacks:
Doesn't Support Circular Dependencies Between Chunks
This is a very important feature; in this context, circular dependencies can also mean two chunks that dynamically import each other. Additionally, rollup heavily relies on the following mechanism when handling dynamic imports: rollup moves all dependencies shared between the importing chunk and the dynamically imported chunk into the importing chunk, which results in a static import of the importing chunk inside the dynamically imported chunk.

Mechanism Explanation
Suppose there are three modules: `main` (the entry module), `b`, and `c`, where:

- Module `main` dynamically imports module `b` and statically imports module `c`.
- Module `b` dynamically imports module `main` and statically imports module `c`.
```js
// main.js
import { c } from './c.js';
console.log('a.js');
import('./b.js').then(res => {
  console.log(res, c);
});
```

```js
// b.js
import { c } from './c.js';
console.log('c.js');
import('./main.js').then(res => {
  console.log(res, c);
});
```

```js
// c.js
console.log('c.js');
export const c = '123';
```
In this scenario, rollup will move the shared static dependency (module `c`) of module `main` and module `b` into module `main`. This means there will be a static import of module `main` in module `b`:

```js
// main.js
console.log('c.js');
const c = '123';

console.log('a.js');
import('./Ckpwfego.js').then(res => {
  console.log(res, c);
});

var main = /*#__PURE__*/ Object.freeze({
  __proto__: null
});

export { c, main as m };
```
```js
// Ckpwfego.js
import { c } from './main.js';
console.log('c.js');
import('./main.js')
  .then(function (n) {
    return n.m;
  })
  .then(res => {
    console.log(res, c);
  });
```
This mechanism of rollup ensures that when a chunk is dynamically imported, all dependencies it shares with its importer have already been loaded. It is crucial for handling complex module dependencies, especially in large projects where module relationships can be deeply intertwined; through this approach, rollup can effectively manage and optimize the loading order and dependency relationships of modules.
The sorted rendering approach means that before rendering a chunk, we need to know all of its dependencies, and we must also consider that the `renderChunk` hook might introduce new dependencies.
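The limitation above can be seen in a minimal sketch (all names here are hypothetical, not rollup's actual implementation): rendering chunks in dependency order works for acyclic graphs, but as soon as two chunks depend on each other, no chunk is ever "ready" to be rendered first.

```typescript
// Minimal sketch: render chunks whose dependencies are already rendered.
// Returns the render order, or null when a cycle blocks all progress.
function sortedRenderOrder(deps: Record<string, string[]>): string[] | null {
  const rendered = new Set<string>();
  const order: string[] = [];
  const pending = Object.keys(deps);
  while (pending.length > 0) {
    const readyIndex = pending.findIndex(chunk =>
      deps[chunk].every(dep => rendered.has(dep))
    );
    if (readyIndex === -1) return null; // circular dependency: nothing is ready
    const [chunk] = pending.splice(readyIndex, 1);
    rendered.add(chunk);
    order.push(chunk);
  }
  return order;
}

// Acyclic graph: c first, then b, then main.
console.log(sortedRenderOrder({ main: ['b'], b: ['c'], c: [] }));
// → ["c", "b", "main"]

// main and b import each other, as in the dynamic-import example above:
console.log(sortedRenderOrder({ main: ['b'], b: ['main'] }));
// → null
```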
Hash Placeholders
Therefore, a new solution needs to be introduced. The core idea is to use placeholders for filename references initially, so that the calculated hash is independent of filenames and depends only on the chunk's own content.
Execution flow is as follows:
1. Assign an initial filename to each chunk. If the filename does not contain a hash (no `[hash]` placeholder in `options.chunkFileNames`), this will be the final filename; but if the filename contains a hash, an equal-length placeholder is used instead.

```ts
class Chunk {
  private preliminaryFileName: PreliminaryFileName | null = null;

  getPreliminaryFileName(): PreliminaryFileName {
    if (this.preliminaryFileName) {
      return this.preliminaryFileName;
    }
    let fileName: string;
    let hashPlaceholder: string | null = null;
    const { chunkFileNames, entryFileNames, file, format, preserveModules } =
      this.outputOptions;
    if (file) {
      fileName = basename(file);
    } else if (this.fileName === null) {
      const [pattern, patternName] =
        preserveModules || this.facadeModule?.isUserDefinedEntryPoint
          ? [entryFileNames, 'output.entryFileNames']
          : [chunkFileNames, 'output.chunkFileNames'];
      fileName = renderNamePattern(
        typeof pattern === 'function'
          ? pattern(this.getPreRenderedChunkInfo())
          : pattern,
        patternName,
        {
          format: () => format,
          hash: size =>
            hashPlaceholder ||
            (hashPlaceholder = this.getPlaceholder(
              patternName,
              size || DEFAULT_HASH_SIZE
            )),
          name: () => this.getChunkName()
        }
      );
      if (!hashPlaceholder) {
        fileName = makeUnique(fileName, this.bundle);
      }
    } else {
      fileName = this.fileName;
    }
    if (!hashPlaceholder) {
      this.bundle[fileName] = FILE_PLACEHOLDER;
    }
    // Caching is essential to not conflict with the file name reservation above
    return (this.preliminaryFileName = { fileName, hashPlaceholder });
  }

  getFileName(): string {
    return this.fileName || this.getPreliminaryFileName().fileName;
  }

  getImportPath(importer: string): string {
    return escapeId(
      getImportPath(
        importer,
        this.getFileName(),
        this.outputOptions.format === 'amd' &&
          !this.outputOptions.amd.forceJsExtensionForImports,
        true
      )
    );
  }
}
```
```ts
// Four random characters from the private use area to minimize risk of
// conflicts
const hashPlaceholderLeft = '!~{';
const hashPlaceholderRight = '}~';
const hashPlaceholderOverhead =
  hashPlaceholderLeft.length + hashPlaceholderRight.length;

// This is the size of a 128-bits xxhash with base64url encoding
const MAX_HASH_SIZE = 21;
export const DEFAULT_HASH_SIZE = 8;

export const getHashPlaceholderGenerator = (): HashPlaceholderGenerator => {
  let nextIndex = 0;
  return (optionName, hashSize) => {
    if (hashSize > MAX_HASH_SIZE) {
      return error(
        logFailedValidation(
          `Hashes cannot be longer than ${MAX_HASH_SIZE} characters, received ${hashSize}. Check the "${optionName}" option.`
        )
      );
    }
    const placeholder = `${hashPlaceholderLeft}${toBase64(++nextIndex).padStart(
      hashSize - hashPlaceholderOverhead,
      '0'
    )}${hashPlaceholderRight}`;
    if (placeholder.length > hashSize) {
      return error(
        logFailedValidation(
          `To generate hashes for this number of chunks (currently ${nextIndex}), you need a minimum hash size of ${placeholder.length}, received ${hashSize}. Check the "${optionName}" option.`
        )
      );
    }
    return placeholder;
  };
};
```
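A simplified sketch of the placeholder format (the base64 alphabet below is an assumption for illustration; rollup has its own `toBase64` helper): each placeholder is exactly as long as the hash it stands in for, so swapping one for the other never shifts any character positions.

```typescript
const hashPlaceholderLeft = '!~{';
const hashPlaceholderRight = '}~';
const overhead = hashPlaceholderLeft.length + hashPlaceholderRight.length; // 5

// Assumed base64url-style alphabet for this sketch.
const chars =
  '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ$_';
const toBase64 = (value: number): string => {
  let result = '';
  do {
    result = chars[value & 63] + result;
    value >>= 6;
  } while (value !== 0);
  return result;
};

// Pad the chunk index so the whole placeholder matches the hash size.
const makePlaceholder = (index: number, hashSize = 8): string =>
  `${hashPlaceholderLeft}${toBase64(index).padStart(
    hashSize - overhead,
    '0'
  )}${hashPlaceholderRight}`;

console.log(makePlaceholder(1)); // "!~{001}~" — exactly 8 chars, like an 8-char hash
```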
2. Render all modules in the chunk. Since we already have the initial filename from step 1, we can directly render all dynamic imports and `import.meta` chunk references. The old algorithm calculated the chunk content hash separately from the dynamic-import chunk hash and the `import.meta` chunk hash, then combined them again afterwards. The new algorithm calculates the hash only once, and subsequent modifications to the hash value depend on the chunk's content, not the filename.

3. Render the chunk wrapper, also using the initial filename to handle chunk imports.

Purpose of the chunk wrapper
Essentially, the chunk wrapper operation is key to forming the interop between chunks. Because a single chunk is rendered from one or multiple modules, during rendering rollup replaces `import`/`export` statements between modules with direct references to the modules' contents (e.g., converting an `import` into a direct reference to the exported variable). Between chunks, however, rollup (or users, through `splitChunks`-style plugin configuration) performs further optimization on the chunk graph, potentially creating new chunk dependencies (dynamic or static imports). Therefore, rollup needs the chunk wrapper operation to form the interop between chunks, ensuring the completeness of the dependency chain.

```ts
class Chunk {
  async render(): Promise<ChunkRenderResult> {
    const { intro, outro, banner, footer } = await createAddons(
      outputOptions,
      pluginDriver,
      this.getRenderedChunkInfo()
    );
    finalisers[format](
      renderedSource,
      {
        accessedGlobals,
        dependencies: renderedDependencies,
        exports: renderedExports,
        hasDefaultExport,
        hasExports,
        id: preliminaryFileName.fileName,
        indent,
        intro,
        isEntryFacade:
          preserveModules ||
          (facadeModule !== null && facadeModule.info.isEntry),
        isModuleFacade: facadeModule !== null,
        log: onLog,
        namedExportsMode: exportMode !== 'default',
        outro,
        snippets,
        usesTopLevelAwait
      },
      outputOptions
    );
    if (banner) magicString.prepend(banner);
    if (format === 'es' || format === 'cjs') {
      const shebang =
        facadeModule !== null &&
        facadeModule.info.isEntry &&
        facadeModule.shebang;
      if (shebang) {
        magicString.prepend(`#!${shebang}\n`);
      }
    }
    if (footer) magicString.append(footer);
  }
}
```
```ts
export default function es(
  magicString: MagicStringBundle,
  {
    accessedGlobals,
    indent: t,
    intro,
    outro,
    dependencies,
    exports,
    snippets
  }: FinaliserOptions,
  {
    externalLiveBindings,
    freeze,
    generatedCode: { symbols },
    importAttributesKey
  }: NormalizedOutputOptions
): void {
  const { n } = snippets;

  const importBlock = getImportBlock(
    dependencies,
    importAttributesKey,
    snippets
  );
  if (importBlock.length > 0) intro += importBlock.join(n) + n + n;
  intro += getHelpersBlock(
    null,
    accessedGlobals,
    t,
    snippets,
    externalLiveBindings,
    freeze,
    symbols
  );
  if (intro) magicString.prepend(intro);

  const exportBlock = getExportBlock(exports, snippets);
  if (exportBlock.length > 0)
    magicString.append(n + n + exportBlock.join(n).trim());
  if (outro) magicString.append(outro);
  magicString.trim();
}
```
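As a rough illustration of what the wrapper contributes for the `es` format (a simplified sketch, not rollup's finaliser; `wrapEsChunk` and its inputs are hypothetical): the rendered module code is framed by an import block and an export block, using the preliminary (placeholder) file names.

```typescript
// Sketch: prepend an import block and append an export block around the
// rendered module code, referencing other chunks by their placeholder names.
function wrapEsChunk(
  code: string,
  imports: { binding: string; from: string }[],
  exports: string[]
): string {
  const importBlock = imports
    .map(({ binding, from }) => `import { ${binding} } from '${from}';`)
    .join('\n');
  const exportBlock =
    exports.length > 0 ? `export { ${exports.join(', ')} };` : '';
  return [importBlock, code, exportBlock].filter(Boolean).join('\n\n');
}

console.log(
  wrapEsChunk(
    'console.log(c);',
    [{ binding: 'c', from: './!~{001}~.js' }],
    []
  )
);
```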
4. Process the chunk through the `renderChunk` hook. The new algorithm also allows access to the complete chunk graph in the `renderChunk` plugin hook, although at this point the names are still initial placeholders. However, since rollup makes no assumptions about the output of `renderChunk`, you can now freely inject chunk names in this hook.

```ts
const chunkGraph = getChunkGraph(chunks);

async function transformChunk(
  magicString: MagicStringBundle,
  fileName: string,
  usedModules: Module[],
  chunkGraph: Record<string, RenderedChunk>,
  options: NormalizedOutputOptions,
  outputPluginDriver: PluginDriver,
  log: LogHandler
) {
  const code = await outputPluginDriver.hookReduceArg0(
    'renderChunk',
    [
      magicString.toString(),
      chunkGraph[fileName],
      options,
      { chunks: chunkGraph }
    ],
    (code, result, plugin) => {
      if (result == null) return code;
      if (typeof result === 'string') result = { code: result, map: undefined };
      // strict null check allows 'null' maps to not be pushed to the chain,
      // while 'undefined' gets the missing map warning
      if (result.map !== null) {
        const map = decodedSourcemap(result.map);
        sourcemapChain.push(map || { missing: true, plugin: plugin.name });
      }
      return result.code;
    }
  );
}

function getChunkGraph(chunks: Chunk[]) {
  return Object.fromEntries(
    chunks.map(chunk => {
      const renderedChunkInfo = chunk.getRenderedChunkInfo();
      return [renderedChunkInfo.fileName, renderedChunkInfo];
    })
  );
}
```
5. Calculate the pure content hash of the chunk by replacing all placeholders in the chunk with default placeholders before generating the hash. To ensure that the hash depends only on the chunk's own content and remains consistent across builds, the placeholders are replaced with a fixed, identical value before hashing; this way, the hash is not affected by the specific content of the placeholders, ensuring consistency and reproducibility.
```ts
const REPLACER_REGEX = new RegExp(
  `${hashPlaceholderLeft}[0-9a-zA-Z_$]{1,${
    MAX_HASH_SIZE - hashPlaceholderOverhead
  }}${hashPlaceholderRight}`,
  'g'
);

export const replacePlaceholdersWithDefaultAndGetContainedPlaceholders = (
  code: string,
  placeholders: Set<string>
): { containedPlaceholders: Set<string>; transformedCode: string } => {
  const containedPlaceholders = new Set<string>();
  const transformedCode = code.replace(REPLACER_REGEX, placeholder => {
    if (placeholders.has(placeholder)) {
      containedPlaceholders.add(placeholder);
      return `${hashPlaceholderLeft}${'0'.repeat(
        placeholder.length - hashPlaceholderOverhead
      )}${hashPlaceholderRight}`;
    }
    return placeholder;
  });
  return { containedPlaceholders, transformedCode };
};
```
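For example (a standalone sketch of the normalization step, with a simplified regex; the snippets are hypothetical): two renders that differ only in which placeholder they reference normalize to the same string, and therefore hash identically.

```typescript
// Sketch: normalize every placeholder to a fixed value of equal length, so
// the content hash does not depend on which file a placeholder will name.
const PLACEHOLDER_REGEX = /!~\{[0-9a-zA-Z_$]+\}~/g;

const normalize = (code: string): string =>
  code.replace(
    PLACEHOLDER_REGEX,
    // 5 = '!~{'.length + '}~'.length (the placeholder overhead)
    placeholder => `!~{${'0'.repeat(placeholder.length - 5)}}~`
  );

// Two renders of the same chunk referencing different dependency placeholders:
const renderA = "import('./!~{001}~.js')";
const renderB = "import('./!~{007}~.js')";

console.log(normalize(renderA) === normalize(renderB)); // true
```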
6. Enhance the chunk's content hash through the `augmentChunkHash` hook.

```ts
const { containedPlaceholders, transformedCode } =
  replacePlaceholdersWithDefaultAndGetContainedPlaceholders(code, placeholders);
let contentToHash = transformedCode;
const hashAugmentation = pluginDriver.hookReduceValueSync(
  'augmentChunkHash',
  '',
  [chunk.getRenderedChunkInfo()],
  (augmentation, pluginHash) => {
    if (pluginHash) {
      augmentation += pluginHash;
    }
    return augmentation;
  }
);
if (hashAugmentation) {
  contentToHash += hashAugmentation;
}
```
7. After all chunks have completed their content-hash calculations, calculate the final hash by searching for which placeholders each chunk contains and updating the chunk's hash accordingly: recursively retrieve the content hashes of all chunks the chunk depends on and merge them into the final content hash.

```ts
function generateFinalHashes(
  renderedChunksByPlaceholder: Map<string, RenderedChunkWithPlaceholders>,
  hashDependenciesByPlaceholder: Map<string, HashResult>,
  initialHashesByPlaceholder: Map<string, string>,
  placeholders: Set<string>,
  bundle: OutputBundleWithPlaceholders,
  getHash: GetHash
) {
  const hashesByPlaceholder = new Map<string, string>(
    initialHashesByPlaceholder
  );
  for (const placeholder of placeholders) {
    const { fileName } = renderedChunksByPlaceholder.get(placeholder)!;
    let contentToHash = '';
    const hashDependencyPlaceholders = new Set<string>([placeholder]);
    for (const dependencyPlaceholder of hashDependencyPlaceholders) {
      const { containedPlaceholders, contentHash } =
        hashDependenciesByPlaceholder.get(dependencyPlaceholder)!;
      contentToHash += contentHash;
      for (const containedPlaceholder of containedPlaceholders) {
        // When looping over a map, setting an entry only causes a new iteration if the key is new
        hashDependencyPlaceholders.add(containedPlaceholder);
      }
    }
    let finalFileName: string | undefined;
    let finalHash: string | undefined;
    do {
      // In case of a hash collision, create a hash of the hash
      if (finalHash) {
        contentToHash = finalHash;
      }
      finalHash = getHash(contentToHash).slice(0, placeholder.length);
      finalFileName = replaceSinglePlaceholder(fileName, placeholder, finalHash);
    } while (bundle[lowercaseBundleKeys].has(finalFileName.toLowerCase()));
    bundle[finalFileName] = FILE_PLACEHOLDER;
    hashesByPlaceholder.set(placeholder, finalHash);
  }
  return hashesByPlaceholder;
}
```
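The collision handling in the `do`/`while` loop above can be isolated into a small sketch (`uniqueHash` and `toyHash` are hypothetical stand-ins; rollup actually uses xxhash): when the truncated hash collides with an already-taken file name, the hash itself is hashed again until a free name is found.

```typescript
// Toy deterministic string hash, standing in for rollup's xxhash (assumption).
const toyHash = (s: string): string => {
  let h = 0;
  for (const ch of s) h = (h * 31 + ch.codePointAt(0)!) >>> 0;
  return h.toString(36).padStart(8, '0');
};

// On a (lower-cased) name collision, create a hash of the hash until free.
function uniqueHash(
  contentToHash: string,
  takenLowercase: Set<string>,
  getHash: (s: string) => string,
  length: number
): string {
  let finalHash = getHash(contentToHash).slice(0, length);
  while (takenLowercase.has(finalHash.toLowerCase())) {
    finalHash = getHash(finalHash).slice(0, length);
  }
  return finalHash;
}

const first = toyHash('chunk content').slice(0, 8);
// Pretend another file already claimed that name: we get a hash of the hash.
console.log(uniqueHash('chunk content', new Set([first.toLowerCase()]), toyHash, 8));
```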
8. Use the final hash to replace the placeholders. Since the initial filename from step 1 used placeholders of exactly the same length as the final hashes, there is no need to update the source map.
```ts
import { replacePlaceholders } from './hashPlaceholders';

function addChunksToBundle(
  renderedChunksByPlaceholder: Map<string, RenderedChunkWithPlaceholders>,
  hashesByPlaceholder: Map<string, string>,
  bundle: OutputBundleWithPlaceholders,
  nonHashedChunksWithPlaceholders: RenderedChunkWithPlaceholders[],
  pluginDriver: PluginDriver,
  options: NormalizedOutputOptions
) {
  for (const {
    chunk,
    code,
    fileName,
    sourcemapFileName,
    map
  } of renderedChunksByPlaceholder.values()) {
    let updatedCode = replacePlaceholders(code, hashesByPlaceholder);
    const finalFileName = replacePlaceholders(fileName, hashesByPlaceholder);
  }
}
```
```ts
const REPLACER_REGEX = new RegExp(
  `${hashPlaceholderLeft}[0-9a-zA-Z_$]{1,${
    MAX_HASH_SIZE - hashPlaceholderOverhead
  }}${hashPlaceholderRight}`,
  'g'
);

export const replacePlaceholders = (
  code: string,
  hashesByPlaceholder: Map<string, string>
): string =>
  code.replace(
    REPLACER_REGEX,
    placeholder => hashesByPlaceholder.get(placeholder) || placeholder
  );
```
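A usage sketch (hypothetical values): because each final hash has exactly the placeholder's length, replacement never changes character offsets, which is why source maps need no adjustment.

```typescript
// Sketch: swap every known placeholder for its final hash of equal length.
const replaceAll = (code: string, hashes: Map<string, string>): string =>
  code.replace(/!~\{[0-9a-zA-Z_$]+\}~/g, p => hashes.get(p) ?? p);

const hashes = new Map([['!~{001}~', 'CM53L61n']]);
const out = replaceAll("import('./!~{001}~.js')", hashes);
console.log(out); // "import('./CM53L61n.js')"
console.log(out.length === "import('./!~{001}~.js')".length); // true
```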
To avoid accidentally replacing non-placeholder text, the placeholders originally took advantage of JavaScript's support for Unicode characters: random characters from the private use area were used, such as `\uf7f9\ue4d3` (placeholder start) and `\ue3cc\uf1fe` (placeholder end).
Placeholder Transformation
[v3.0] Use ASCII characters for hash placeholders made improvements to placeholders to address the following issues:

- Prevent escaping issues: Unicode characters can be automatically escaped by certain toolchains, corrupting the placeholders.
- Better debugging experience: compared to incomprehensible Unicode characters, the new format uses visible ASCII characters, making placeholders immediately recognizable and allowing developers to quickly identify which chunk they belong to.
- Reduced risk of false matches: the new pattern `!~{\d+}~` is not valid JavaScript syntax and will only appear in strings and comments. Even if incorrectly replaced, the damage is limited, as replacement only happens on an exact match of the specific number sequence.
New Algorithm Impact
Plugin Hook Execution Flow Diagram
The plugin hook execution flow diagram has changed. Compared to the pre-transformation flow diagram, the following changes have occurred:
Changes in Execution Timing

- The execution timing of the `banner`, `footer`, `intro`, and `outro` plugin hooks has been delayed. Previously, they were executed after the `renderStart` plugin hook; now they are executed before the `renderChunk` plugin hook.
- The execution timing of the `augmentChunkHash` plugin hook has been delayed. Previously, it was executed after the `renderDynamicImport` plugin hook; now it is executed after the `renderChunk` plugin hook.
Changes in Execution Mode

- The `banner`, `footer`, `intro`, and `outro` plugin hooks have been changed from parallel execution to sequential execution.
Available Chunk Information in Hooks
Some hooks can now receive additional information. Before detailing these changes, let's define several key types:
PreRenderedChunk

`PreRenderedChunk` contains basic chunk information before any rendering occurs and before the chunk name is generated. After this update, this simplified chunk information is only passed to the `entryFileNames` and `chunkFileNames` options. As the new flow diagram above shows, it is impossible at this stage to obtain information about already rendered modules. As an alternative, it now includes a `moduleIds` list, allowing developers to roughly understand what the chunk contains.
```ts
interface PreRenderedChunk {
  exports: string[];
  facadeModuleId: string | null;
  isDynamicEntry: boolean;
  isEntry: boolean;
  isImplicitEntry: boolean;
  moduleIds: string[];
  name: string;
  type: 'chunk';
}
```
RenderedChunk
`RenderedChunk` contains the chunk's complete rendering information. The `imports` and the filenames of rendered modules will contain placeholders instead of file hashes. `RenderedChunk` is available in the `renderChunk` and `augmentChunkHash` hooks, and in the `banner`, `footer`, `intro`, and `outro` hooks and options.

Additionally, the signature of `renderChunk` has been extended with a fourth parameter, `meta: { chunks: { [fileName: string]: RenderedChunk } }`, providing access to the entire chunk graph.
Additional Points to Note
When adding or removing `imports` or `exports` in `renderChunk`, rollup will not do additional work to keep the `RenderedChunk` object up to date. Therefore, plugins should now take care to maintain the `RenderedChunk` object themselves, updating it with the latest information. This provides correct information for subsequent plugins and the final bundle, because rollup will later replace the placeholders in `imports`, `importedBindings`, and `dynamicImports` based on the information in the `RenderedChunk` object to generate the final hash values (except for `implicitlyLoadedBefore` and `fileName`).
```ts
interface RenderedChunk {
  dynamicImports: string[];
  exports: string[];
  facadeModuleId: string | null;
  fileName: string;
  implicitlyLoadedBefore: string[];
  importedBindings: {
    [imported: string]: string[];
  };
  imports: string[];
  isDynamicEntry: boolean;
  isEntry: boolean;
  isImplicitEntry: boolean;
  moduleIds: string[];
  modules: {
    [id: string]: RenderedModule;
  };
  name: string;
  referencedFiles: string[];
  type: 'chunk';
}
```
New Features
- `intro`, `outro`, `banner`, and `footer`, when provided as functions, are now called for each chunk. Although they cannot access the rendered modules in the chunk, they receive a list of all `moduleIds` contained in the chunk.
- The hash length can be changed in the filename pattern; for example, `[name]-[hash:12].js` will create a hash with a length of 12 characters.
Breaking Changes
- `entryFileNames` and `chunkFileNames` can no longer access the `modules` object that contains rendered module content. Instead, they can access the list of contained `moduleIds`.
- The order of plugin hooks has changed; compare the diagram above with the diagram in the Rollup documentation.
- The `fileName` and referenced `imports` in the `renderChunk` hook will contain filenames with placeholders instead of hashes. However, these filenames can still be safely used in the hook's return value, as any hash placeholder will eventually be replaced with the actual hash.
Test Cases
```js
// main.js
import('./b.js').then(res => {
  console.log(res);
});
import('./c.js').then(res => {
  console.log(res);
});
```

```js
// b.js
export const qux = 'QUX';
```

```js
// c.js
export const c = 'c';
```

```js
// rollup.config.js
import { defineConfig } from 'rollup';

export default defineConfig({
  input: 'main.js',
  output: {
    dir: 'dist',
    format: 'es',
    chunkFileNames: '[hash].js'
  }
});
```
The bundled output is as follows:

```js
// dist/main.js
import('./CM53L61n.js').then(res => {
  console.log(res);
});
import('./CPjDz2XZ.js').then(res => {
  console.log(res);
});
```

```js
const qux = 'QUX';

export { qux };
```

```js
const c = 'c';

export { c };
```
If we only change the filename of `b.js` to `bNext.js`, keeping everything else the same:

```js
// main.js
import('./bNext.js').then(res => {
  console.log(res);
});
import('./c.js').then(res => {
  console.log(res);
});
```

```js
// bNext.js
export const qux = 'QUX';
```

```js
// c.js
export const c = 'c';
```

```js
// rollup.config.js
import { defineConfig } from 'rollup';

export default defineConfig({
  input: 'main.js',
  output: {
    dir: 'dist',
    format: 'es',
    chunkFileNames: '[hash].js'
  }
});
```
The bundled output is as follows:

```js
// dist/main.js
import('./CM53L61n.js').then(res => {
  console.log(res);
});
import('./CPjDz2XZ.js').then(res => {
  console.log(res);
});
```

```js
const qux = 'QUX';

export { qux };
```

```js
const c = 'c';

export { c };
```
Through the above examples, we can see that with the new algorithm, file name changes do not cause the hash value of a chunk to change.
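The behaviour in these examples can be sketched end to end (a simplified model: sha256 stands in for rollup's xxhash, and the snippets are hypothetical): because placeholders are normalized before hashing, the hash is independent of which file a placeholder will eventually name, so renaming `b.js` cannot change any chunk's hash.

```typescript
import { createHash } from 'node:crypto';

// Sketch: compute the content hash over placeholder-normalized code.
const contentHash = (code: string): string =>
  createHash('sha256')
    .update(
      code.replace(
        /!~\{[0-9a-zA-Z_$]+\}~/g,
        // 5 = placeholder overhead ('!~{' + '}~')
        p => `!~{${'0'.repeat(p.length - 5)}}~`
      )
    )
    .digest('base64url')
    .slice(0, 8);

// The same chunk before and after a rename only differs in which
// placeholder (i.e. which chunk) its dynamic import resolves to:
const beforeRename = "import('./!~{001}~.js').then(res => console.log(res));";
const afterRename = "import('./!~{002}~.js').then(res => console.log(res));";

console.log(contentHash(beforeRename) === contentHash(afterRename)); // true
```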