The Hashing Dilemma

Rollup v3.0 replaced its hashing algorithm. The new algorithm resolves long-standing hash instability issues, takes renderChunk plugin transformations into account, and supports circular dependencies between chunks.

Problem Statement

The execution flow of the old version's hashing algorithm was as follows:

1. Render all modules except for dynamic imports and import.meta chunk references.
2. Calculate a content hash for all modules in the chunk based on this rendering.
3. Extend the hash by considering all known dependencies and potential dynamic content added by the chunk wrapper.
4. Update dynamic imports and import.meta chunk references.
5. Render the chunk wrapper containing all static imports and exports.
6. Process the result through the renderChunk plugin hook.

In summary, the old algorithm hashed the rendered modules of a chunk, extended that hash with known dependencies and wrapper content, and only afterwards updated the chunk's dynamic imports and import.meta references, rendered the wrapper, and ran renderChunk.

Existing Issues:

renderChunk Plugin Hook Breaks content hash
Any transformations in renderChunk are completely ignored by rollup and do not affect the chunk's hash value. This leads to situations where different contents can have the same hash.
Complex chunk wrapper Scenarios
Rollup had to track every possible change in the chunk wrapper and feed it into the hash, which required handling too many edge cases.
Unstable Hash Values
There are cases where a chunk's content hasn't changed, but its hash changes due to unrelated modifications (e.g., changing file names).
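
The first issue can be illustrated with a small sketch. It is not Rollup's code: `fnv1a` is a stand-in hash chosen only to keep the example self-contained, and `keepDebug` and `stripDebug` are hypothetical renderChunk-style transformations. The point is the ordering: hashing before the transformation yields the same hash for different outputs, while hashing afterwards does not.

```typescript
// Stand-in hash for illustration only (an assumption, not the hash Rollup uses).
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let index = 0; index < text.length; index++) {
    hash ^= text.charCodeAt(index);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Hypothetical renderChunk-style transformations.
type Transform = (code: string) => string;
const keepDebug: Transform = code => code;
const stripDebug: Transform = code =>
  code.replace(/console\.debug\([^)]*\);\n?/g, '');

const rendered = "console.debug('start');\nexport const answer = 42;\n";

// Old order: hash first, transform afterwards, so the hash ignores the transformation.
function hashThenTransform(code: string, transform: Transform) {
  return { hash: fnv1a(code), output: transform(code) };
}

// New order: transform first, then hash the final output.
function transformThenHash(code: string, transform: Transform) {
  const output = transform(code);
  return { hash: fnv1a(output), output };
}

const oldA = hashThenTransform(rendered, keepDebug);
const oldB = hashThenTransform(rendered, stripDebug);
// Same hash although the emitted content differs.
console.log(oldA.hash === oldB.hash, oldA.output === oldB.output);

const newA = transformThenHash(rendered, keepDebug);
const newB = transformThenHash(rendered, stripDebug);
// The hashes now reflect the different outputs.
console.log(newA.hash === newB.hash);
```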

Solution

Sorted Rendering (Attempting to Solve Hash Issues)

One method to solve hash issues is to first render chunks that have no dependencies, then iteratively render chunks that only depend on already rendered chunks, until all chunks are rendered. While this approach works in some cases, it has several obvious drawbacks:

Doesn't Support Circular Dependencies Between Chunks
This matters because, in this context, a circular dependency can also be two chunks that dynamically import each other. In addition, Rollup relies heavily on the following mechanism when handling dynamic imports:
Dependencies shared between an importing chunk and the chunk it dynamically imports are moved into the importing chunk, so the dynamically imported chunk ends up with a static import of its importer.
Mechanism Explanation
Suppose there are three modules, module main (entry module), module b, and module c, where:
- Module main dynamically imports module b and statically imports module c.
- Module b dynamically imports module main and statically imports module c.
```js
// main.js
import { c } from './c.js';
console.log('a.js');
import('./b.js').then(res => {
  console.log(res, c);
});
```
```js
// b.js
import { c } from './c.js';
console.log('c.js');
import('./main.js').then(res => {
  console.log(res, c);
});
```
```js
// c.js
console.log('c.js');
export const c = '123';
```
In this scenario, Rollup moves the static dependency shared by module main and module b (module c) into the chunk of module main. As a result, the chunk generated for module b (Ckpwfego.js below) contains a static import of the main chunk.
```js
// main.js
console.log('c.js');
const c = '123';

console.log('a.js');
import('./Ckpwfego.js').then(res => {
  console.log(res, c);
});

var main = /*#__PURE__*/ Object.freeze({
  __proto__: null
});

export { c, main as m };
```
```js
// Ckpwfego.js
import { c } from './main.js';

console.log('c.js');
import('./main.js')
  .then(function (n) {
    return n.m;
  })
  .then(res => {
    console.log(res, c);
  });
```
This mechanism ensures that, by the time a dynamic import runs, all dependencies shared with the dynamically imported chunk have already been loaded. The price is a circular dependency between the two chunks: main.js dynamically imports Ckpwfego.js, and Ckpwfego.js statically imports main.js. Sorted rendering cannot order such chunks.
All Dependencies Must Be Known Before Rendering
Sorted rendering requires knowing every dependency of a chunk before rendering it, so it would also have to account for new dependencies that the renderChunk hook might introduce.
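
A minimal sketch of why sorted rendering breaks down on cycles is shown here. The `ChunkGraph` shape and the `sortedRenderOrder` function are hypothetical and exist only for this illustration; they are not part of Rollup.

```typescript
// Hypothetical chunk graph: chunk name -> names of the chunks it references.
// Static and dynamic imports both count, since both embed a hashed file name.
type ChunkGraph = Record<string, string[]>;

function sortedRenderOrder(graph: ChunkGraph): { order: string[]; stuck: string[] } {
  const rendered = new Set<string>();
  const order: string[] = [];
  let progressed = true;
  while (progressed) {
    progressed = false;
    for (const [name, dependencies] of Object.entries(graph)) {
      if (rendered.has(name)) continue;
      // A chunk can only be rendered once every chunk it references is rendered.
      if (dependencies.every(dependency => rendered.has(dependency))) {
        rendered.add(name);
        order.push(name);
        progressed = true;
      }
    }
  }
  // Whatever is left is part of a cycle or depends on one.
  const stuck = Object.keys(graph).filter(name => !rendered.has(name));
  return { order, stuck };
}

const acyclic = sortedRenderOrder({ main: ['shared'], shared: [] });
console.log(acyclic.order, acyclic.stuck); // [ 'shared', 'main' ] []

// main dynamically imports b, and b statically imports main (as in the example above).
const cyclic = sortedRenderOrder({ main: ['b'], b: ['main'], c: [] });
console.log(cyclic.order, cyclic.stuck); // [ 'c' ] [ 'main', 'b' ]
```

In the cyclic case neither main nor b can ever be rendered, which is exactly the situation the shared-dependency mechanism produces.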

Hash Placeholders

Therefore, a new solution needs to be introduced. The core idea is to set initial placeholders for filename references, so that the calculated hash value is independent of filenames and only focuses on the chunk's own content.

Execution flow is as follows:

1. Assign an initial filename to each chunk. If the filename does not contain a hash (no [hash] placeholder in options.chunkFileNames), this will be the final filename; if the filename contains a hash, use a placeholder of equal length instead.

```typescript
// rollup/src/Chunk.ts
class Chunk {
  private preliminaryFileName: PreliminaryFileName | null = null;

  getPreliminaryFileName(): PreliminaryFileName {
    if (this.preliminaryFileName) {
      return this.preliminaryFileName;
    }
    let fileName: string;
    let hashPlaceholder: string | null = null;
    const {
      chunkFileNames,
      entryFileNames,
      file,
      format,
      preserveModules
    } = this.outputOptions;
    if (file) {
      fileName = basename(file);
    } else if (this.fileName === null) {
      const [pattern, patternName] =
        preserveModules || this.facadeModule?.isUserDefinedEntryPoint
          ? [entryFileNames, 'output.entryFileNames']
          : [chunkFileNames, 'output.chunkFileNames'];
      fileName = renderNamePattern(
        typeof pattern === 'function'
          ? pattern(this.getPreRenderedChunkInfo())
          : pattern,
        patternName,
        {
          format: () => format,
          // One placeholder per chunk, reused if [hash] occurs more than once
          hash: size =>
            hashPlaceholder ||
            (hashPlaceholder = this.getPlaceholder(
              patternName,
              size || DEFAULT_HASH_SIZE
            )),
          name: () => this.getChunkName()
        }
      );
      if (!hashPlaceholder) {
        fileName = makeUnique(fileName, this.bundle);
      }
    } else {
      fileName = this.fileName;
    }
    if (!hashPlaceholder) {
      this.bundle[fileName] = FILE_PLACEHOLDER;
    }
    // Caching is essential to not conflict with the file name reservation above
    return (this.preliminaryFileName = { fileName, hashPlaceholder });
  }

  getFileName(): string {
    return this.fileName || this.getPreliminaryFileName().fileName;
  }

  getImportPath(importer: string): string {
    return escapeId(
      getImportPath(
        importer,
        this.getFileName(),
        this.outputOptions.format === 'amd' && !this.outputOptions.amd.forceJsExtensionForImports,
        true
      )
    );
  }
}
```

```typescript
// rollup/src/utils/hashPlaceholders.ts
// Four random characters from the private use area to minimize risk of
// conflicts
const hashPlaceholderLeft = '!~{';
const hashPlaceholderRight = '}~';
const hashPlaceholderOverhead =
  hashPlaceholderLeft.length + hashPlaceholderRight.length;
// This is the size of a 128-bits xxhash with base64url encoding
const MAX_HASH_SIZE = 21;
export const DEFAULT_HASH_SIZE = 8;
export const getHashPlaceholderGenerator =
  (): HashPlaceholderGenerator => {
    let nextIndex = 0;
    return (optionName, hashSize) => {
      if (hashSize > MAX_HASH_SIZE) {
        return error(
          logFailedValidation(
            `Hashes cannot be longer than ${MAX_HASH_SIZE} characters, received ${hashSize}. Check the "${optionName}" option.`
          )
        );
      }
      const placeholder = `${hashPlaceholderLeft}${toBase64(
        ++nextIndex
      ).padStart(
        hashSize - hashPlaceholderOverhead,
        '0'
      )}${hashPlaceholderRight}`;
      if (placeholder.length > hashSize) {
        return error(
          logFailedValidation(
            `To generate hashes for this number of chunks (currently ${nextIndex}), you need a minimum hash size of ${placeholder.length}, received ${hashSize}. Check the "${optionName}" option.`
          )
        );
      }
      return placeholder;
    };
  };
```
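
To make the placeholder format concrete, the following simplified sketch mirrors the generator above in a runnable form. The `toBase64` helper and its alphabet are assumptions made for this illustration (Rollup has its own internal helper), and the validation is reduced to a plain `Error`.

```typescript
const hashPlaceholderLeft = '!~{';
const hashPlaceholderRight = '}~';
const overhead = hashPlaceholderLeft.length + hashPlaceholderRight.length; // 5

// Hypothetical stand-in for Rollup's internal `toBase64` helper.
const CHARS = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_$';
function toBase64(value: number): string {
  let output = '';
  do {
    output = CHARS[value % 64] + output;
    value = Math.floor(value / 64);
  } while (value !== 0);
  return output;
}

function createPlaceholderGenerator() {
  let nextIndex = 0;
  return (hashSize: number): string => {
    const placeholder =
      hashPlaceholderLeft +
      toBase64(++nextIndex).padStart(hashSize - overhead, '0') +
      hashPlaceholderRight;
    // The placeholder must occupy exactly as many characters as the final hash.
    if (placeholder.length > hashSize) {
      throw new Error(`Hash size ${hashSize} is too small for chunk number ${nextIndex}.`);
    }
    return placeholder;
  };
}

const getPlaceholder = createPlaceholderGenerator();
const first = getPlaceholder(8);
const second = getPlaceholder(8);
console.log(first, second); // !~{001}~ !~{002}~
console.log(`assets/chunk-${first}.js`); // assets/chunk-!~{001}~.js
```

Each chunk receives a distinct placeholder whose length equals the requested hash size, which is what later allows an in-place replacement.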

2. Render all modules in the chunk. Because the initial filenames from step 1 are already available, all dynamic imports and import.meta chunk references can be rendered directly. The old algorithm hashed the chunk content separately from the dynamic import and import.meta chunk references and then had to combine the results. The new algorithm hashes the content only once, so later changes to the hash depend on the chunk's content, not on filenames.

3. Render the chunk wrapper, again using the initial filenames to handle chunk imports.

Purpose of chunk wrapper

Essentially, the chunk wrapper operation is key to forming interop between chunks.

Because a single chunk is rendered from one or multiple modules, during rendering, rollup replaces import/export statements between modules with specific references from the modules (e.g., converting import to direct references to exported variables).

However, between chunks, Rollup (or users through the output.manualChunks option) performs further optimization on the chunk graph, potentially creating new chunk dependencies (dynamic imports or static imports). Therefore, Rollup uses the chunk wrapper to provide interop between chunks and keep the dependency chain complete.

```typescript
// rollup/src/Chunk.ts (excerpt)
class Chunk {
  async render(): Promise<ChunkRenderResult> {
    const { intro, outro, banner, footer } = await createAddons(
      outputOptions,
      pluginDriver,
      this.getRenderedChunkInfo()
    );
    // The format-specific finaliser renders the wrapper around the modules
    finalisers[format](
      renderedSource,
      {
        accessedGlobals,
        dependencies: renderedDependencies,
        exports: renderedExports,
        hasDefaultExport,
        hasExports,
        id: preliminaryFileName.fileName,
        indent,
        intro,
        isEntryFacade:
          preserveModules ||
          (facadeModule !== null && facadeModule.info.isEntry),
        isModuleFacade: facadeModule !== null,
        log: onLog,
        namedExportsMode: exportMode !== 'default',
        outro,
        snippets,
        usesTopLevelAwait
      },
      outputOptions
    );
    if (banner) magicString.prepend(banner);
    if (format === 'es' || format === 'cjs') {
      const shebang =
        facadeModule !== null &&
        facadeModule.info.isEntry &&
        facadeModule.shebang;
      if (shebang) {
        magicString.prepend(`#!${shebang}\n`);
      }
    }
    if (footer) magicString.append(footer);
  }
}
```

```typescript
// rollup/src/finalisers/es.ts
export default function es(
  magicString: MagicStringBundle,
  {
    accessedGlobals,
    indent: t,
    intro,
    outro,
    dependencies,
    exports,
    snippets
  }: FinaliserOptions,
  {
    externalLiveBindings,
    freeze,
    generatedCode: { symbols },
    importAttributesKey
  }: NormalizedOutputOptions
): void {
  const { n } = snippets;

  // Static imports of other chunks, using their preliminary file names
  const importBlock = getImportBlock(
    dependencies,
    importAttributesKey,
    snippets
  );
  if (importBlock.length > 0) intro += importBlock.join(n) + n + n;
  intro += getHelpersBlock(
    null,
    accessedGlobals,
    t,
    snippets,
    externalLiveBindings,
    freeze,
    symbols
  );
  if (intro) magicString.prepend(intro);

  const exportBlock = getExportBlock(exports, snippets);
  if (exportBlock.length > 0)
    magicString.append(n + n + exportBlock.join(n).trim());
  if (outro) magicString.append(outro);

  magicString.trim();
}
```

4. Process the chunk through the renderChunk hook.

The new algorithm also allows access to the complete chunk graph in the renderChunk plugin hook, although at this point the names are initial placeholders. However, since rollup makes no assumptions about the output of renderChunk, you can now freely inject chunk names in this hook.

```typescript
// rollup/src/utils/renderChunk.ts (excerpt)
const chunkGraph = getChunkGraph(chunks);

async function transformChunk(
  magicString: MagicStringBundle,
  fileName: string,
  usedModules: Module[],
  chunkGraph: Record<string, RenderedChunk>,
  options: NormalizedOutputOptions,
  outputPluginDriver: PluginDriver,
  log: LogHandler
) {
  const code = await outputPluginDriver.hookReduceArg0(
    'renderChunk',
    [
      magicString.toString(),
      chunkGraph[fileName],
      options,
      { chunks: chunkGraph }
    ],
    (code, result, plugin) => {
      if (result == null) return code;

      if (typeof result === 'string')
        result = {
          code: result,
          map: undefined
        };

      // strict null check allows 'null' maps to not be pushed to the chain, while 'undefined' gets the missing map warning
      if (result.map !== null) {
        const map = decodedSourcemap(result.map);
        sourcemapChain.push(
          map || { missing: true, plugin: plugin.name }
        );
      }

      return result.code;
    }
  );
}

// Maps each preliminary file name to the rendered chunk info of that chunk
function getChunkGraph(chunks: Chunk[]) {
  return Object.fromEntries(
    chunks.map(chunk => {
      const renderedChunkInfo = chunk.getRenderedChunkInfo();
      return [renderedChunkInfo.fileName, renderedChunkInfo];
    })
  );
}
```
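
From a plugin author's perspective, the chunk graph arrives as the fourth argument of renderChunk. The sketch below is a hypothetical plugin, not taken from Rollup or its ecosystem; the local `RenderedChunkLike` and `ChunkGraphMeta` types are reduced stand-ins so the example is self-contained, and a real plugin would import `Plugin` and `RenderedChunk` from 'rollup' instead.

```typescript
// Reduced stand-ins for Rollup's types (assumptions for this sketch).
interface RenderedChunkLike {
  fileName: string;
  imports: string[];
  dynamicImports: string[];
}

interface ChunkGraphMeta {
  chunks: Record<string, RenderedChunkLike>;
}

// Hypothetical plugin: appends a comment listing the chunks this chunk references.
function listReferencedChunks() {
  return {
    name: 'list-referenced-chunks',
    renderChunk(
      code: string,
      chunk: RenderedChunkLike,
      _options: unknown,
      meta: ChunkGraphMeta
    ) {
      // File names still contain hash placeholders at this point.
      const referenced = [...chunk.imports, ...chunk.dynamicImports].filter(
        fileName => fileName in meta.chunks
      );
      if (referenced.length === 0) return null;
      // Appending at the end does not shift existing positions, hence `map: null`.
      return { code: `${code}\n// references: ${referenced.join(', ')}`, map: null };
    }
  };
}

const plugin = listReferencedChunks();
const mainChunk = { fileName: 'main.js', imports: [], dynamicImports: ['!~{001}~.js'] };
const otherChunk = { fileName: '!~{001}~.js', imports: [], dynamicImports: [] };
const meta = { chunks: { 'main.js': mainChunk, '!~{001}~.js': otherChunk } };
console.log(plugin.renderChunk("import('./!~{001}~.js');", mainChunk, {}, meta)?.code);
```

The injected placeholder is later replaced by the final hash, just like the placeholders Rollup rendered itself.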

5. Calculate the pure content hash of the chunk by replacing all placeholders in the chunk with default placeholders and hashing the result.

To ensure that the hash value is only related to the chunk's content itself and remains consistent across different builds, we need to replace the placeholders in the chunk with a fixed, identical value before calculating the hash. This way, the hash value won't be affected by the specific content of the placeholders, thus ensuring consistency and reproducibility.

```typescript
// rollup/src/utils/hashPlaceholders.ts
const REPLACER_REGEX = new RegExp(
  `${hashPlaceholderLeft}[0-9a-zA-Z_$]{1,${
    MAX_HASH_SIZE - hashPlaceholderOverhead
  }}${hashPlaceholderRight}`,
  'g'
);
export const replacePlaceholdersWithDefaultAndGetContainedPlaceholders =
  (
    code: string,
    placeholders: Set<string>
  ): { containedPlaceholders: Set<string>; transformedCode: string } => {
    const containedPlaceholders = new Set<string>();
    const transformedCode = code.replace(REPLACER_REGEX, placeholder => {
      if (placeholders.has(placeholder)) {
        containedPlaceholders.add(placeholder);
        // Same length as the original placeholder, but with a fixed payload
        return `${hashPlaceholderLeft}${'0'.repeat(
          placeholder.length - hashPlaceholderOverhead
        )}${hashPlaceholderRight}`;
      }
      return placeholder;
    });
    return { containedPlaceholders, transformedCode };
  };
```
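
The effect of this normalization can be seen in a short, self-contained sketch. `normalizePlaceholders` is a simplified re-implementation written for this illustration, and the regular expression hard-codes the placeholder delimiters and the maximum payload length of 16 derived from the constants above.

```typescript
const LEFT = '!~{';
const RIGHT = '}~';
const OVERHEAD = LEFT.length + RIGHT.length;
// Delimiters are escaped explicitly; 16 = MAX_HASH_SIZE (21) minus the overhead (5).
const PLACEHOLDER_REGEX = /!~\{[0-9a-zA-Z_$]{1,16}\}~/g;

function normalizePlaceholders(code: string, known: Set<string>) {
  const contained = new Set<string>();
  const transformed = code.replace(PLACEHOLDER_REGEX, placeholder => {
    // Text that merely looks like a placeholder is left alone.
    if (!known.has(placeholder)) return placeholder;
    contained.add(placeholder);
    return LEFT + '0'.repeat(placeholder.length - OVERHEAD) + RIGHT;
  });
  return { contained, transformed };
}

const known = new Set(['!~{001}~', '!~{002}~']);
// The same source rendered in two builds where the dependency got a different index.
const buildA = normalizePlaceholders("import('./!~{001}~.js');", known);
const buildB = normalizePlaceholders("import('./!~{002}~.js');", known);

console.log(buildA.transformed); // import('./!~{000}~.js');
console.log(buildA.transformed === buildB.transformed); // true
console.log([...buildA.contained]); // [ '!~{001}~' ]
```

Because both builds normalize to the same text, the content hash does not depend on which placeholder index a dependency happened to receive, while `contained` still records which chunks are referenced.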

6. Augment the chunk's content hash through the augmentChunkHash hook.

```typescript
// rollup/src/utils/renderChunk.ts (excerpt)
const { containedPlaceholders, transformedCode } =
  replacePlaceholdersWithDefaultAndGetContainedPlaceholders(code, placeholders);
let contentToHash = transformedCode;
// Concatenate the truthy return values of all augmentChunkHash hooks
const hashAugmentation = pluginDriver.hookReduceValueSync(
  'augmentChunkHash',
  '',
  [chunk.getRenderedChunkInfo()],
  (augmentation, pluginHash) => {
    if (pluginHash) {
      augmentation += pluginHash;
    }
    return augmentation;
  }
);
if (hashAugmentation) {
  contentToHash += hashAugmentation;
}
```
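
As an illustration of this step, the sketch below shows a hypothetical plugin that mixes an external value into the hash, together with a simplified reducer that mimics the concatenation above. `hashBuildTarget`, `collectAugmentation`, and the `HashPlugin` type are names invented for this example.

```typescript
// Reduced plugin shape (an assumption for this sketch, not Rollup's Plugin type).
interface HashPlugin {
  name: string;
  augmentChunkHash?(chunk: { name: string }): string | undefined;
}

// Hypothetical plugin: the output depends on a value that is not part of the
// chunk's code, so it is mixed into the hash explicitly.
function hashBuildTarget(target: string): HashPlugin {
  return {
    name: 'hash-build-target',
    augmentChunkHash(chunk) {
      // A falsy return value leaves the hash of this chunk untouched.
      return chunk.name === 'vendor' ? undefined : `target:${target}`;
    }
  };
}

// Simplified stand-in for the reducer shown above.
function collectAugmentation(plugins: HashPlugin[], chunk: { name: string }): string {
  let augmentation = '';
  for (const plugin of plugins) {
    const pluginHash = plugin.augmentChunkHash?.(chunk);
    if (pluginHash) augmentation += pluginHash;
  }
  return augmentation;
}

const plugins = [hashBuildTarget('es2020'), { name: 'no-hook' }];
console.log(collectAugmentation(plugins, { name: 'main' })); // target:es2020
console.log(collectAugmentation(plugins, { name: 'vendor' }) === ''); // true
```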

7. After all chunks have completed their content hash calculations, calculate the final hashes. For each chunk, look up which placeholders it contains, recursively collect the content hashes of all chunks it depends on, and merge them with the chunk's own content hash to produce its final hash.

```typescript
// rollup/src/utils/renderChunk.ts
function generateFinalHashes(
  renderedChunksByPlaceholder: Map<string, RenderedChunkWithPlaceholders>,
  hashDependenciesByPlaceholder: Map<string, HashResult>,
  initialHashesByPlaceholder: Map<string, string>,
  placeholders: Set<string>,
  bundle: OutputBundleWithPlaceholders,
  getHash: GetHash
) {
  const hashesByPlaceholder = new Map<string, string>(initialHashesByPlaceholder);
  for (const placeholder of placeholders) {
    const { fileName } = renderedChunksByPlaceholder.get(placeholder)!;
    let contentToHash = '';
    const hashDependencyPlaceholders = new Set<string>([placeholder]);
    for (const dependencyPlaceholder of hashDependencyPlaceholders) {
      const { containedPlaceholders, contentHash } =
        hashDependenciesByPlaceholder.get(dependencyPlaceholder)!;
      contentToHash += contentHash;
      for (const containedPlaceholder of containedPlaceholders) {
        // When looping over a map, setting an entry only causes a new iteration if the key is new
        hashDependencyPlaceholders.add(containedPlaceholder);
      }
    }
    let finalFileName: string | undefined;
    let finalHash: string | undefined;
    do {
      // In case of a hash collision, create a hash of the hash
      if (finalHash) {
        contentToHash = finalHash;
      }
      finalHash = getHash(contentToHash).slice(0, placeholder.length);
      finalFileName = replaceSinglePlaceholder(fileName, placeholder, finalHash);
    } while (bundle[lowercaseBundleKeys].has(finalFileName.toLowerCase()));
    bundle[finalFileName] = FILE_PLACEHOLDER;
    hashesByPlaceholder.set(placeholder, finalHash);
  }
  return hashesByPlaceholder;
}
```
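
The traversal at the heart of this step also works for chunks that reference each other. The following sketch isolates it: `finalHash` is a simplified function written for this illustration that omits the truncation and collision handling shown above, and `fnv1a` is a stand-in hash, not the one Rollup uses.

```typescript
// Stand-in hash for illustration only (an assumption, not the hash Rollup uses).
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let index = 0; index < text.length; index++) {
    hash ^= text.charCodeAt(index);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

interface HashResult {
  contentHash: string;
  containedPlaceholders: Set<string>;
}

function finalHash(start: string, results: Map<string, HashResult>): string {
  let contentToHash = '';
  // Entries added to a Set during iteration are visited exactly once,
  // so the loop covers all transitive dependencies and terminates on cycles.
  const visited = new Set<string>([start]);
  for (const placeholder of visited) {
    const { contentHash, containedPlaceholders } = results.get(placeholder)!;
    contentToHash += contentHash;
    for (const contained of containedPlaceholders) visited.add(contained);
  }
  return fnv1a(contentToHash);
}

// Two chunks that reference each other, as in the main/b example earlier.
const results = new Map<string, HashResult>([
  ['!~{001}~', { contentHash: fnv1a('main content'), containedPlaceholders: new Set(['!~{002}~']) }],
  ['!~{002}~', { contentHash: fnv1a('b content'), containedPlaceholders: new Set(['!~{001}~']) }]
]);

const mainHash = finalHash('!~{001}~', results);
const bHash = finalHash('!~{002}~', results);
console.log(mainHash, bHash);
```

Since the hash of main includes the content hash of b, a change in b propagates to main's final hash, while a mere rename of either file does not affect the inputs at all.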

8. Replace the placeholders with the final hashes. Because each placeholder assigned in step 1 has the same length as the hash that replaces it, the source maps do not need to be updated.

```typescript
// rollup/src/utils/renderChunk.ts (excerpt)
import { replacePlaceholders } from './hashPlaceholders';

function addChunksToBundle(
  renderedChunksByPlaceholder: Map<string, RenderedChunkWithPlaceholders>,
  hashesByPlaceholder: Map<string, string>,
  bundle: OutputBundleWithPlaceholders,
  nonHashedChunksWithPlaceholders: RenderedChunkWithPlaceholders[],
  pluginDriver: PluginDriver,
  options: NormalizedOutputOptions
) {
  for (const {
    chunk,
    code,
    fileName,
    sourcemapFileName,
    map
  } of renderedChunksByPlaceholder.values()) {
    // Both the code and the file name may contain placeholders
    let updatedCode = replacePlaceholders(code, hashesByPlaceholder);
    const finalFileName = replacePlaceholders(fileName, hashesByPlaceholder);
  }
}
```

```typescript
// rollup/src/utils/hashPlaceholders.ts
const REPLACER_REGEX = new RegExp(
  `${hashPlaceholderLeft}[0-9a-zA-Z_$]{1,${
    MAX_HASH_SIZE - hashPlaceholderOverhead
  }}${hashPlaceholderRight}`,
  'g'
);
export const replacePlaceholders = (
  code: string,
  hashesByPlaceholder: Map<string, string>
): string =>
  code.replace(
    REPLACER_REGEX,
    placeholder => hashesByPlaceholder.get(placeholder) || placeholder
  );
```
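
A short usage sketch shows why no source map update is needed. The function is a self-contained copy of the replacement logic with the regular expression written out, and the hash value is taken from the test case at the end of this article.

```typescript
// Delimiters are escaped explicitly; 16 is the maximum placeholder payload length.
const PLACEHOLDER_REGEX = /!~\{[0-9a-zA-Z_$]{1,16}\}~/g;

function replacePlaceholders(code: string, hashes: Map<string, string>): string {
  return code.replace(
    PLACEHOLDER_REGEX,
    placeholder => hashes.get(placeholder) || placeholder
  );
}

const code = "import('./!~{001}~.js').then(res => {\n  console.log(res, '!~{009}~');\n});";
const hashes = new Map([['!~{001}~', 'CM53L61n']]);
const finalCode = replacePlaceholders(code, hashes);

console.log(finalCode);
// Placeholder and hash have the same length, so no character position shifts.
console.log(finalCode.length === code.length); // true
```

The string `'!~{009}~'` is not a known placeholder and is therefore left as it is.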

In the initial implementation, placeholders relied on JavaScript's support for Unicode characters to avoid accidentally replacing non-placeholders. Random characters from the private use area were used, such as \uf7f9\ue4d3 (placeholder start) and \ue3cc\uf1fe (placeholder end).

Placeholder Transformation

The pull request "[v3.0] Use ASCII characters for hash placeholders" later changed the placeholder format to address the following issues:

Prevent Escaping Issues
Unicode characters may be escaped automatically by some toolchains, which corrupts the placeholders.
Better Debugging Experience
Compared to unreadable Unicode characters, the new format uses visible ASCII characters, so placeholders are immediately recognizable and developers can quickly associate them with a particular chunk.
Reduce Risk of False Matches
The new pattern !~{\d+}~ is not valid JavaScript syntax and can only appear in strings and comments. Even a false match causes limited damage, because a replacement only happens when the specific number sequence matches exactly.

New Algorithm Impact

Plugin Hook Execution Flow Diagram

The plugin hook execution flow diagram has changed.

[Figure: rewritten plugin hook flow diagram; legend: parallel, sequential, first, async, sync]

Compared to the previous flow diagram, the following changes have occurred:

Changes in Execution Timing
- The execution timing of banner, footer, intro, and outro plugin hooks has been delayed. Previously, they were executed after the renderStart plugin hook. Now they are executed before the renderChunk plugin hook.
- The execution timing of the augmentChunkHash plugin hook has been delayed. Previously, it was executed after the renderDynamicImport plugin hook decision. Now it is executed after the renderChunk plugin hook.
Changes in Execution Mode
- The banner, footer, intro, and outro plugin hooks have been changed from parallel execution to sequential execution.

Available `Chunk` Information in Hooks

Some hooks can now receive additional information. Before detailing these changes, let's define several key types:

`PreRenderedChunk`

PreRenderedChunk contains basic chunk information before any rendering occurs and before the chunk name is generated. After this update, this simplified chunk information is only passed to the entryFileNames and chunkFileNames options. As the new hook order shows, information about rendered modules cannot be available at this stage. As an alternative, PreRenderedChunk now includes a moduleIds list, which gives developers a rough idea of what the chunk contains.

```typescript
interface PreRenderedChunk {
  exports: string[];
  facadeModuleId: string | null;
  isDynamicEntry: boolean;
  isEntry: boolean;
  isImplicitEntry: boolean;
  moduleIds: string[];
  name: string;
  type: 'chunk';
}
```

`RenderedChunk`

RenderedChunk contains complete rendering information for the chunk. The imports and filenames in rendered modules will contain placeholders instead of file hashes. RenderedChunk is available in the renderChunk hook, augmentChunkHash hook, and banner, footer, intro, outro hooks and options.

Additionally, the signature of renderChunk has been extended with a fourth parameter meta: { chunks: { [fileName: string]: RenderedChunk } }, providing access to the entire chunk graph.

Additional Points to Note

When imports or exports are added or removed in renderChunk, Rollup does no additional work to keep the RenderedChunk object up to date. Plugins should therefore update the RenderedChunk object themselves, so that subsequent plugins and the final bundle receive correct information. This matters because Rollup later replaces the placeholders in imports, importedBindings, and dynamicImports based on the information in the RenderedChunk object when generating the final hash value (except for implicitlyLoadedBefore and fileName).

```typescript
interface RenderedChunk {
  dynamicImports: string[];
  exports: string[];
  facadeModuleId: string | null;
  fileName: string;
  implicitlyLoadedBefore: string[];
  importedBindings: {
    [imported: string]: string[];
  };
  imports: string[];
  isDynamicEntry: boolean;
  isEntry: boolean;
  isImplicitEntry: boolean;
  moduleIds: string[];
  modules: {
    [id: string]: RenderedModule;
  };
  name: string;
  referencedFiles: string[];
  type: 'chunk';
}
```
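
A plugin that adds an import in renderChunk might keep the chunk information in sync roughly as follows. This is a hedged sketch: `injectPolyfill` is a hypothetical plugin, `MutableChunkInfo` is a reduced stand-in for RenderedChunk, and a real plugin should return a proper source map for the prepended line instead of `null`.

```typescript
// Reduced stand-in for the parts of RenderedChunk touched here (an assumption).
interface MutableChunkInfo {
  imports: string[];
  importedBindings: Record<string, string[]>;
}

// Hypothetical plugin: prepends a side-effect import and records it on the chunk.
function injectPolyfill(specifier: string) {
  return {
    name: 'inject-polyfill',
    renderChunk(code: string, chunk: MutableChunkInfo) {
      if (!chunk.imports.includes(specifier)) {
        // Rollup does not detect the new import, so the plugin records it itself.
        chunk.imports.push(specifier);
        chunk.importedBindings[specifier] = [];
      }
      // `map: null` keeps the sketch short; it skips source map handling.
      return { code: `import '${specifier}';\n${code}`, map: null };
    }
  };
}

const chunkInfo: MutableChunkInfo = { imports: [], importedBindings: {} };
const rendered = injectPolyfill('core-js/stable').renderChunk('export const a = 1;', chunkInfo);
console.log(rendered.code);
console.log(chunkInfo.imports); // [ 'core-js/stable' ]
```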

New Features

intro, outro, banner, footer as functions are now called for each chunk. Although they cannot access rendered modules in the chunk, they will receive a list of all moduleIds contained in the chunk.
Hash length can be changed in the filename pattern, for example, [name]-[hash:12].js will create a hash with a length of 12 characters.
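
Both features can be combined in the output options. The fragment below is a sketch of such a configuration written as a plain object so that it runs on its own; `ChunkInfoLike` is a reduced stand-in for the chunk information passed to the addon functions, and the banner text is made up for this example.

```typescript
// Reduced shape of the chunk info received by banner/footer/intro/outro functions.
interface ChunkInfoLike {
  name: string;
  moduleIds: string[];
}

const output = {
  dir: 'dist',
  format: 'es',
  // Request 12-character hashes instead of the default length of 8.
  chunkFileNames: '[name]-[hash:12].js',
  // Called once per chunk; rendered modules are unavailable, module ids are.
  banner: (chunk: ChunkInfoLike) =>
    `/* ${chunk.name}: ${chunk.moduleIds.length} module(s) */`
};

console.log(output.banner({ name: 'main', moduleIds: ['/src/main.js', '/src/c.js'] }));
// /* main: 2 module(s) */
```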

Breaking Changes

entryFileNames and chunkFileNames cannot access the modules object that contains rendered module content. Instead, they can access the list of contained moduleIds.
The order of plugin hooks has changed; compare the flow described above with the diagram in the Rollup documentation.
The fileName and referenced imports in the renderChunk hook will get filenames with placeholders instead of hashes. However, these filenames can still be safely used in the hook's return value, as any hash placeholder will eventually be replaced with the actual hash.

Test Cases

```js
// main.js
import('./b.js').then(res => {
  console.log(res);
});

import('./c.js').then(res => {
  console.log(res);
});
```
```js
// b.js
export const qux = 'QUX';
```
```js
// c.js
export const c = 'c';
```
```js
// rollup.config.js
import { defineConfig } from 'rollup';

export default defineConfig({
  input: 'main.js',
  output: {
    dir: 'dist',
    format: 'es',
    chunkFileNames: '[hash].js'
  }
});
```

The bundled output is as follows:

```js
// main.js
import('./CM53L61n.js').then(res => {
  console.log(res);
});

import('./CPjDz2XZ.js').then(res => {
  console.log(res);
});
```
```js
// CM53L61n.js
const qux = 'QUX';

export { qux };
```
```js
// CPjDz2XZ.js
const c = 'c';

export { c };
```

If we only change the filename of b.js to bNext.js, keeping everything else the same:

```js
// main.js
import('./bNext.js').then(res => {
  console.log(res);
});

import('./c.js').then(res => {
  console.log(res);
});
```
```js
// bNext.js
export const qux = 'QUX';
```
```js
// c.js
export const c = 'c';
```
```js
// rollup.config.js
import { defineConfig } from 'rollup';

export default defineConfig({
  input: 'main.js',
  output: {
    dir: 'dist',
    format: 'es',
    chunkFileNames: '[hash].js'
  }
});
```

The bundled output is as follows:

```js
// main.js
import('./CM53L61n.js').then(res => {
  console.log(res);
});

import('./CPjDz2XZ.js').then(res => {
  console.log(res);
});
```
```js
// CM53L61n.js
const qux = 'QUX';

export { qux };
```
```js
// CPjDz2XZ.js
const c = 'c';

export { c };
```

Through the above examples, we can see that in the new algorithm, file name changes do not cause the hash value of the chunk to change.

Contributors

XiSenao
