
pnpm: Exploring Core Linking and Dependency Management

Prerequisite Knowledge

This article is a supplement and extension to the article how pnpm links, originally written by Yang Jian. The original article was published on Zhihu ([2023-02-25]).

This supplementary content aims to provide additional perspectives, latest information, or practical experience based on the original author's work, and does not represent criticism or negation of the original article. All original viewpoints have been fully expressed in the original article, and this article serves only as an extended reading.

If the original author believes this supplementary content is inappropriate, please contact SenaoXi for communication or deletion.

The specifications followed by project Vrite

The vrite project always runs pnpm in its strictest configuration: strong constraints between dependencies, compliant dependency packages, and predictable dependency management behavior.

Background

In our daily work, we often encounter issues related to pnpm and phantom dependencies. Some of these issues are quite complex and involve the underlying implementation principles of pnpm. Therefore, this article will explore pnpm's link mechanism in detail.

We often say that one of pnpm's major advantages is avoiding phantom dependencies by default, prohibiting the hoist behavior of dependency libraries. However, pnpm's hoist actually comes in multiple forms, and pnpm adopts different strategies for prohibiting different hoist behaviors.

Let's analyze the different hoist behaviors in detail by combining pnpm's link strategy.

Before discussing specific hoist behaviors, we need to distinguish between two types of code:

  • One is application code, which is our daily business development code.
  • One is vendor code, which is third-party library code (including directly and indirectly dependent third-party libraries), usually published after packaging.

The different manifestations of hoist are reflected in various interactions between vendor and application.

The original chapter was based on pnpm@7 (there are some differences between pnpm@6 and pnpm@7). At the time of this supplement (2025-05-08), pnpm had reached pnpm@10.10.0, so this chapter is based on pnpm@10.

Hoist Behavior Between Application and Dependencies - public-hoist

public-hoist is the hoist behavior people most often mean: application code can access modules that are not declared in the dependencies / devDependencies / optionalDependencies fields of the application's package.json (that is, modules that are not direct dependencies).

When we configure pnpm's node-linker to hoisted, all dependency libraries will be hoisted to the root directory's node_modules. At this point, the node_modules structure is consistent with the structure created by npm or yarn classic.

Here, although the application only depends on the express library, the application code (src/index.js) can freely access the debug library. This is because the indirect dependency library is hoisted to the root directory's node_modules.

npm/yarn classic node_modules structure
bash
node_modules/
  express
  debug
  cookie
  ... [63 packages]
src
  index.js
package.json
package.json
json
{
  "dependencies": {
    "express": "4.18.1"
  }
}

This seems to bring some convenience, but the harm is enormous. For details, please refer to the article phantom-deps.

For such phantom dependencies, pnpm strictly prohibits them by default. So how does pnpm achieve this? The method is simple - just don't place indirect dependency libraries directly in the node_modules under the project's root directory.

Looking at pnpm's node_modules structure, we can see that there are no indirect dependency libraries (such as debug, cookie, etc.) in node_modules, so the application code (src/index.js) naturally cannot directly access these libraries.

pnpm node_modules structure
bash
node_modules/
  express
  .pnpm
src/index.js

However, due to design flaws in prettier and eslint, their related plugins need to be stored in the project's root node_modules. Therefore, before version v10, pnpm did not prohibit all libraries' hoist behavior by default, but made an exception for eslint and prettier.

Default hoisting behavior has changed in v10

In pnpm v10, nothing is hoisted to the root of node_modules by default, not even eslint, prettier, or packages whose names match them. For details, see Remove the default option eslint and prettier from public-hoist-pattern option in next major version, which proposed removing *eslint* and *prettier* from the default value of the public-hoist-pattern setting.

Core issue of the proposal:

pnpm previously (in pnpm@<10) hoisted eslint and prettier related packages to the top level of node_modules by default. However, with eslint 9 supporting flat config and prettier plugins being able to resolve paths themselves through import() expressions, this default hoisting behavior is no longer necessary and may instead cause inconsistent node_modules folder structure and some unexpected issues.

  • ESLint's Evolution - Flat Config

    ESLint introduced a new configuration system in its v9 version, called flat config (usually the eslint.config.js file). flat config changed how ESLint finds and loads plugins and configurations. ESLint itself can better handle dependency resolution and no longer strongly depends on packages being hoisted to the top level of node_modules. It prefers users or plugins to explicitly specify dependency paths, or to find them through standard node.js require.resolve or import() mechanisms.

  • Prettier's Evolution - Import

    Prettier's documentation states that its plugins can be loaded through import() expressions, meaning Prettier can dynamically import plugins regardless of their location in node_modules (as long as they can be found by node.js's module resolution algorithm). Therefore, Prettier no longer strictly requires its plugins to be hoisted to the top level of node_modules.

public-hoist can be controlled through the publicHoistPattern configuration item. The current default value is [], meaning pnpm will not hoist any indirect dependency packages (including eslint or prettier and related named dependency packages) to the application's node_modules directory, avoiding unexpected phantom dependency issues in the application. Users can use this configuration item to intentionally control which dependency packages need to be hoisted to the application's node_modules directory, making pnpm's public-hoist behavior more controllable and predictable.

The public-hoist problem seems to be solved. But suppose the application directly depends on packages A and B, and both A and B depend on the same version of package C. Where should C go?

The simplest and most brutal approach is to place a copy of C in the node_modules of both A and B:

directory structure
bash
node_modules
  A
    node_modules
      C
  B
    node_modules
      C

This was the default behavior of npm@{1,2}. Copying C into every package that depends on it wastes a lot of disk space. In addition, because each copy of C lives at a different path, a bundler cannot deduplicate them, which inflates the bundle size.

pnpm solves this problem with links. Operating systems generally support two kinds of link: symlink and hardlink.

Let's first look at the difference between the two, using the following example:

bash
echo "111" > a
ln a b
ln -s a c

At this time, the results of a, b, and c are

bash
cat a --> 111
cat b --> 111
cat c --> 111

It can be seen that the results of a, b, and c are synchronized. If you try to delete the a file, you can observe the following:

bash
rm a
cat a --> No such file or directory
cat b --> 111
cat c --> No such file or directory

c can no longer be read (it is now a dangling symlink, because its target path a no longer exists), but the content of b is not affected. Interestingly, if you recreate the a file:

bash
echo "222" > a
cat a --> 222
cat b --> 111
cat c --> 222

At this time, you can observe that the contents of a and b are inconsistent, but the contents of a and c are consistent, which reflects an important difference between hardlink and symlink:

  • Deleting the original file does not affect the content of a hardlink, but it breaks a symlink.
  • If the original file is deleted and then recreated, the hardlink has no relationship with the new file, so later changes to it are not visible through the hardlink. A symlink stores only a path, so it follows the recreated file and reflects its later changes.
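
The same experiment can be reproduced from Node.js. The following is a minimal sketch, assuming a POSIX-like system where creating symlinks needs no special privileges; the function name linkDemo and the temporary directory prefix are illustrative choices, not part of the original article.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Reproduces the a/b/c experiment in a throwaway directory and reports what was observed.
function linkDemo() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'link-demo-'));
  const [a, b, c] = ['a', 'b', 'c'].map((name) => path.join(dir, name));
  try {
    fs.writeFileSync(a, '111');
    fs.linkSync(a, b); // hard link: a second name for the same inode
    fs.symlinkSync('a', c); // symlink: stores the path "a", resolved relative to c's directory

    const sameInode = fs.statSync(a).ino === fs.statSync(b).ino;

    fs.unlinkSync(a); // like `rm a`
    const hardlinkAfterDelete = fs.readFileSync(b, 'utf8');
    const symlinkDangling = !fs.existsSync(c); // existsSync follows the link

    fs.writeFileSync(a, '222'); // a new file (new inode) under the old name
    return {
      sameInode,
      hardlinkAfterDelete,
      symlinkDangling,
      hardlinkAfterRestore: fs.readFileSync(b, 'utf8'),
      symlinkAfterRestore: fs.readFileSync(c, 'utf8'),
    };
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}

console.log(linkDemo());
```

The hardlink keeps the old content because it names the old inode, while the symlink follows whatever currently lives at the path a.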

In other words, a hardlink cannot reliably stay consistent with the original path: once the original file is deleted and recreated (for example, by a tool that replaces files instead of editing them in place), the hardlink keeps pointing at the old content. Changes are then not propagated, file watching becomes ineffective, and HMR cannot work normally during development. Later sections come back to this.

hardlink has further limitations: the ln command cannot hardlink directories, while a symlink can point to a directory. A hardlink also cannot cross file systems, while a symlink can.

The reason is that an inode number is unique only within a single file system, and each file system is essentially an independent namespace and storage pool. If hardlinks across file systems were allowed, the same inode number could refer to different files in different file systems, which would be ambiguous and could lead to data confusion.

Another difference is that node's module resolution treats hardlinks and symlinks very differently:

bash
mkdir -p a b c
echo "console.log('resolve:', module.paths[0]);" >> a/index.js
ln a/index.js b/index.js
# a relative symlink target is resolved from c/, hence ../a
ln -s ../a/index.js c/index.js

Resolution results for the three files (paths abbreviated; node prints absolute paths):

bash
node a/index.js --> a/node_modules
node b/index.js --> b/node_modules
node c/index.js --> a/node_modules

Observe that a hardlink is resolved from its own location, independent of the location of the source file, while a symlink is resolved from the location of the source file it points to. Bundlers follow node's resolve algorithm as well, so hardlinked and symlinked files behave differently both at node runtime and in the bundler's resolve stage (resolveId).
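
The observation can also be checked in-process with require. This is a sketch under the same POSIX assumption; resolveDemo is a hypothetical helper name, and the module simply exports module.paths[0] instead of logging it.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Builds a/, b/ (hardlink) and c/ (symlink) and returns the first lookup directory of each module.
function resolveDemo() {
  // realpathSync keeps the comparison stable when the temp directory itself is a symlink
  const root = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'resolve-demo-')));
  try {
    for (const dir of ['a', 'b', 'c']) fs.mkdirSync(path.join(root, dir));
    const source = path.join(root, 'a', 'index.js');
    fs.writeFileSync(source, 'module.exports = module.paths[0];\n');
    fs.linkSync(source, path.join(root, 'b', 'index.js'));
    fs.symlinkSync(path.join('..', 'a', 'index.js'), path.join(root, 'c', 'index.js'));

    // Report paths relative to the temp root, with forward slashes
    const rel = (p) => path.relative(root, p).split(path.sep).join('/');
    return {
      a: rel(require(path.join(root, 'a', 'index.js'))),
      b: rel(require(path.join(root, 'b', 'index.js'))),
      c: rel(require(path.join(root, 'c', 'index.js'))),
    };
  } finally {
    fs.rmSync(root, { recursive: true, force: true });
  }
}

console.log(resolveDemo());
```

By default node resolves the symlink to its real path first, so c/index.js is the same module as a/index.js, while the hardlink b/index.js is a separate module with its own lookup chain.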

Of course, this is only the default behavior of these tools, and it can be changed through configuration. node provides the --preserve-symlinks flag (node-preserveSymlink), and bundlers and compilers provide similar options (rollup-preserveSymlink, vite-preserveSymlinks, typescript-preserveSymlinks, webpack-symlinks, node-preserveSymlinks) that change how symlink paths are resolved.

When preserve-symlinks is enabled, resolution starts from the path of the symlink itself rather than from the path of the original file.

bash
node --preserve-symlinks-main --preserve-symlinks c/index.js --> c/node_modules

But this option needs to be used with caution: preserve-symlinks may make the target library impossible to find, or make the same library resolve to several different paths. The latter breaks singletons and bundles the library multiple times, which increases the bundle size.
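
To compare both behaviors side by side, the script below runs the symlinked entry file in a child process, once with default flags and once with the two preserve-symlinks flags. It is a sketch that assumes a POSIX-like system and that NODE_OPTIONS does not already change symlink handling; preserveSymlinksDemo is a hypothetical helper name.

```javascript
const { execFileSync } = require('node:child_process');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Runs c/index.js (a symlink to a/index.js) with and without the preserve-symlinks flags.
function preserveSymlinksDemo() {
  const root = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'preserve-demo-')));
  try {
    for (const dir of ['a', 'c']) fs.mkdirSync(path.join(root, dir));
    fs.writeFileSync(path.join(root, 'a', 'index.js'), 'console.log(module.paths[0]);\n');
    fs.symlinkSync(path.join('..', 'a', 'index.js'), path.join(root, 'c', 'index.js'));

    const run = (flags) => {
      const out = execFileSync(process.execPath, [...flags, path.join('c', 'index.js')], {
        cwd: root,
        encoding: 'utf8',
      });
      // Report the printed directory relative to the temp root
      return path.relative(root, out.trim()).split(path.sep).join('/');
    };

    return {
      byDefault: run([]),
      preserved: run(['--preserve-symlinks-main', '--preserve-symlinks']),
    };
  } finally {
    fs.rmSync(root, { recursive: true, force: true });
  }
}

console.log(preserveSymlinksDemo());
```

--preserve-symlinks-main is needed here because c/index.js is the entry file; --preserve-symlinks alone only affects modules loaded through require or import.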

Hoist Behavior Between Dependencies - hoist

The hoist between dependency libraries refers to a third-party library being able to access code from other third-party libraries that are not declared in its dependencies (dependencies, optionalDependencies, peerDependencies). This sounds a bit incredible - if a library depends on another library, it should naturally declare this library in its dependencies, otherwise the library definitely won't work. However, due to historical reasons (such as npm's default behavior), there are still a large number of third-party libraries that don't follow this convention.

Take langium@3.3.1 as an example. Its build output imports vscode-languageserver-types, vscode-jsonrpc, and @chevrotain/regexp-to-ast directly, but these libraries are not listed in langium's dependencies, optionalDependencies, or peerDependencies (pnpm installs peerDependencies automatically by default). When the project itself, or another installed package, happens to depend on these libraries, langium runs normally. But if they are absent, or if one day nothing else in the dependency tree brings them in, langium will fail.

Because of this historical legacy (libraries relying on shared dependencies), a large number of such packages still exist on npm. For backward compatibility with npm's default behavior (dependency hoisting), pnpm enables hoist mode by default: it additionally symlinks all dependency libraries in the application's dependency tree into node_modules/.pnpm/node_modules, so that dependency libraries can share dependencies with each other.

bash
node_modules/
  .pnpm/
    node_modules/
      a [symlink -> ../a@1.0.0/node_modules/a]
      b [symlink -> ../b@1.0.0/node_modules/b]
    a@1.0.0/
      node_modules/
        a
    b@1.0.0/
      node_modules/
        b
          index.js [require('a')]

If you want stricter behavior, set hoist to false to disable hoisting between third-party libraries. pnpm will then not create the node_modules/.pnpm/node_modules directory for sharing dependencies among third-party libraries.
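
The effect of this layout on node's lookup chain can be simulated without pnpm. The sketch below builds a miniature .pnpm layout by hand, assuming a POSIX-like system; the package names demo-undeclared-a and demo-vendor-b and the helper names are hypothetical, and the symlinks use absolute targets for brevity (pnpm itself writes relative ones).

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { createRequire } = require('node:module');

// Hypothetical package names, chosen to avoid clashing with anything installed globally.
const A = 'demo-undeclared-a';
const B = 'demo-vendor-b';

// True when `request` can be resolved by a module located at `fromFile`.
function canResolve(fromFile, request) {
  try {
    createRequire(fromFile).resolve(request);
    return true;
  } catch {
    return false;
  }
}

// Builds a tiny pnpm-style tree; `hoist` decides whether .pnpm/node_modules is populated.
function hoistDemo({ hoist }) {
  const root = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'hoist-demo-')));
  try {
    const virtualStore = path.join(root, 'node_modules', '.pnpm');
    const aDir = path.join(virtualStore, `${A}@1.0.0`, 'node_modules', A);
    const bDir = path.join(virtualStore, `${B}@1.0.0`, 'node_modules', B);
    for (const dir of [aDir, bDir]) {
      fs.mkdirSync(dir, { recursive: true });
      fs.writeFileSync(path.join(dir, 'index.js'), 'module.exports = {};\n');
    }
    // The app declares only B, so only B is linked into the root node_modules.
    fs.symlinkSync(bDir, path.join(root, 'node_modules', B), 'dir');
    if (hoist) {
      fs.mkdirSync(path.join(virtualStore, 'node_modules'));
      fs.symlinkSync(aDir, path.join(virtualStore, 'node_modules', A), 'dir');
    }
    return {
      appSeesA: canResolve(path.join(root, 'src', 'index.js'), A),
      vendorSeesA: canResolve(path.join(bDir, 'index.js'), A),
    };
  } finally {
    fs.rmSync(root, { recursive: true, force: true });
  }
}

console.log('hoist=true ', hoistDemo({ hoist: true }));
console.log('hoist=false', hoistDemo({ hoist: false }));
```

With hoisting, the vendor package reaches A through .pnpm/node_modules even though it never declared it, while the application cannot. Without hoisting, neither can.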

pnpm's Different Levels of Topology

In fact, pnpm supports four levels of node_modules structure, from loose to strict:

  • hoisted mode

    All dependency libraries are flattened in the root directory's node_modules. This means the application can directly access all indirect dependency libraries, and all dependency libraries share dependencies with each other. This is also npm's default behavior.

    bash
    shamefully-hoist=true
  • semi strict mode

    This is pnpm's default mode. The application can only access dependency libraries declared in its dependencies (dependencies, devDependencies, optionalDependencies), but all dependency libraries still share dependencies with each other. Which libraries are shared can be controlled through the hoist-pattern configuration item. The public-hoist-pattern entries in the example below are opt-in: as described earlier, pnpm@10 hoists nothing to the root by default.

    bash
    ; All packages are hoisted to node_modules/.pnpm/node_modules
    hoist-pattern[]=*
    
    ; All types are hoisted to the root in order to make TypeScript happy
    public-hoist-pattern[]=*types*
    
    ; All ESLint-related packages are hoisted to the root as well
    public-hoist-pattern[]=*eslint*
  • strict mode

    In this case, pnpm both prohibits the application from accessing libraries outside its dependency declarations and prohibits shared access between dependency libraries. This mode is also the most recommended mode for business use. However, unfortunately, pnpm is not in strict mode by default due to compatibility considerations with the npm package ecosystem.

    But this is the best practice, and the vrite project currently also adopts strict mode. This can ensure that your business won't suddenly have abnormal problems one day due to unpredictable dependencies.

    bash
    hoist=false
  • pnp mode

    Even in the strictest strict mode, pnpm can only control the topology of node_modules within the current project. node_modules directories outside the project are not affected, so there is still a risk of phantom dependencies (the application accessing libraries in an outer node_modules).

    The root cause is that node's resolve algorithm searches upward recursively through parent directories. As long as node's default resolve algorithm is unchanged, phantom dependencies cannot be completely eliminated. This is why yarn proposed its pnp (Plug'n'Play) mode, which changes node's default resolve behavior to eliminate phantom dependencies. It also brings new problems, which are not covered here.

So the question arises: the npm package ecosystem is complex, and a large number of early npm packages are non-compliant (for example, they do not follow the semver specification or do not declare all of their dependencies). These packages cannot work normally in pnpm strict mode or yarn pnp mode, so measures are needed to manage the dependency issues of non-compliant packages.

Dependency Repair Scheme

If a third-party library has dependency problems, pnpm provides several ways to repair them. Choose the scheme that fits your situation.

overrides

Consider a scenario where the application depends on package A, A depends on package B, B has a problem, and you do not want to upgrade A to fix it. In that case you can pin the version of B through overrides.

package.json
json
{
  "pnpm": {
    "overrides": {
      "B": "15.0.0"
    }
  }
}

The drawback is that pnpm then forces B to 15.0.0 everywhere, including in every workspace subproject, which is not always what you want. To control the version more precisely, scope the override to B as a dependency of A@1:

package.json
json
{
  "pnpm": {
    "overrides": {
      "A@1>B": "15.0.0"
    }
  }
}

packageExtensions

Another common scenario is a dependency package that accesses a package it has not declared. In strict mode, pnpm does not hoist packages to node_modules/.pnpm/node_modules as shared dependencies, so the dependency package fails at runtime and at build time.

For example, the langium package mentioned above does not declare vscode-languageserver-types, vscode-jsonrpc, and @chevrotain/regexp-to-ast, which is non-compliant. You can use pnpm's packageExtensions field to add the missing dependencies.

package.json
json
{
  "pnpm": {
    "packageExtensions": {
      "langium": {
        "dependencies": {
          "vscode-languageserver-types": "*",
          "vscode-jsonrpc": "*",
          "@chevrotain/regexp-to-ast": "*"
        }
      }
    }
  }
}

pnpm does not modify the dependencies field in langium's own package.json. The extension only tells pnpm that, when resolving langium, it should treat vscode-languageserver-types, vscode-jsonrpc, and @chevrotain/regexp-to-ast as dependencies and install them as well.

.pnpmfile.cjs

The two solutions above (overrides and packageExtensions) are sufficient for simple repairs. For more complex scenarios, for example when the fix requires custom logic, pnpm's hooks give you flexible control.

Both of the repairs above can also be implemented with a hook.

.pnpmfile.cjs
js
function readPackage(pkg) {
  /**
   * langium contains the following ghost dependencies,
   * which attempt to access indirect dependencies (not declared in the dependencies of langium),
   * this is not a compliant behavior, under strict pnpm specifications, this will result in errors.
   *
   * therefore, the following declarations for indirect dependencies are provided,
   * and pnpm will independently download the required dependency packages when parsing langium.
   */
  if (pkg.name && pkg.name.startsWith('langium')) {
    pkg.dependencies = pkg.dependencies || {};
    pkg.dependencies['vscode-languageserver-types'] = '*';
    pkg.dependencies['vscode-jsonrpc'] = '*';
    pkg.dependencies['@chevrotain/regexp-to-ast'] = '*';
  }

  if (pkg.name && pkg.name.startsWith('A') && pkg.version.startsWith('1')) {
    pkg.dependencies = pkg.dependencies || {};
    pkg.dependencies['B'] = '15.0.0';
  }

  return pkg;
}

module.exports = {
  hooks: {
    readPackage
  }
};

If the readPackage hook does not seem to be applied everywhere, try running emo i --fix-lockfile (with plain pnpm, pnpm install --fix-lockfile).

npm alias

The repairs above target specific cases: either the problem is already fixed in another version of the dependency (change the version), or the package lacks a dependency declaration (add it through packageExtensions or .pnpmfile.cjs).

There is also the scenario where every version of a dependency is problematic, and maintaining fixes with pnpm patch alone becomes difficult. In that case you may need to fork the library, fix it, and publish the fork. Since you usually do not have permission to publish under the original package name, the fork has to be published under a different name. With an npm alias you can then transparently swap the broken package for the fork.

For example, suppose react-virtualized has a problem and @byted-cg/react-virtualized-fixed-import@9.22.3 is the repaired version. We replace react-virtualized with the repaired package through an npm alias, and no other part of the project needs to change.

package.json
json
{
  "dependencies": {
    "react-virtualized": "npm:@byted-cg/react-virtualized-fixed-import@9.22.3"
  }
}
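
To make the shape of an alias specifier concrete, here is a small sketch that splits one into the real package name and the version range. parseNpmAlias is a hypothetical helper written for illustration; it is not how pnpm or npm parse specifiers internally.

```javascript
// Splits an alias spec such as "npm:@scope/name@1.2.3" into its parts.
// Returns null when the spec is not an alias.
function parseNpmAlias(spec) {
  if (!spec.startsWith('npm:')) return null;
  const target = spec.slice('npm:'.length);
  const at = target.lastIndexOf('@');
  // at <= 0 means there is no version part (index 0 is the "@" of a scope).
  if (at <= 0) return { name: target, range: null };
  return { name: target.slice(0, at), range: target.slice(at + 1) };
}

console.log(parseNpmAlias('npm:@byted-cg/react-virtualized-fixed-import@9.22.3'));
```

The key on the left of the dependency entry (react-virtualized) remains the name that the code imports, while the parsed name is the package that is actually downloaded.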

pnpm Linkage Method

With the traditional npm node_modules topology, it is difficult to control hoist and public-hoist precisely. Let's see how pnpm implements precise control over both behaviors, using the following dependencies as an example.

package.json
json
{
  "dependencies": {
    "express": "4.18.1",
    "koa": "2.13.4"
  }
}

Hide Indirect Dependency Libraries in Root Directory

First, to solve public-hoist, pnpm by default does not place indirect dependency libraries in the root node_modules. Instead it hides them in the root node_modules/.pnpm directory, so the application cannot directly access phantom dependencies:

bash
node_modules
  .pnpm
    express@4.18.1
    koa@2.13.4
    accepts@1.3.8
    array-flatten@1.1.1
  express -> .pnpm/express@4.18.1/node_modules/express
  koa -> .pnpm/koa@2.13.4/node_modules/koa
index.js

To avoid duplicating the same version of a dependency across vendors, every consumer links to one shared location. For example, node_modules/express and node_modules/koa are symlinks to node_modules/.pnpm/express@4.18.1/node_modules/express and node_modules/.pnpm/koa@2.13.4/node_modules/koa respectively. koa and express use the same version of accepts, so the accepts entry of each is a symlink to node_modules/.pnpm/accepts@1.3.8/node_modules/accepts.

bash
node_modules
  .pnpm
    accepts@1.3.8
      node_modules
        accepts
    array-flatten@1.1.1
      node_modules
        array-flatten
    express@4.18.1
      node_modules
        accepts -> ../../accepts@1.3.8/node_modules/accepts
        express
    koa@2.13.4
      node_modules
        accepts -> ../../accepts@1.3.8/node_modules/accepts
        koa
  express -> .pnpm/express@4.18.1/node_modules/express
  koa -> .pnpm/koa@2.13.4/node_modules/koa

This is a deliberate design. koa is not linked directly to .pnpm/koa@2.13.4, but to .pnpm/koa@2.13.4/node_modules/koa, for the following reasons:

  • A package can require itself (self package), because it sits inside a node_modules directory under its own name:

    js
    const koa = require('koa');
  • Circular symlinks are avoided: if a depends on b and b depends on a, nesting each package's dependencies inside its own directory through symlinks would easily produce a circular symlink.

  • Multiple peerDependencies versions can be handled: when several versions of a peer dependency exist, the same package must be able to resolve to different peerDependencies versions at the same time.

Note that pnpm stores packages in a content-addressable store (Content-Addressable Storage); see the pnpm content addressable storage article for details of the strategy. The files in koa@2.13.4/node_modules/koa above are reflinked or hardlinked by pnpm from the global store into the project's koa@2.13.4/node_modules/koa directory. Because these are real files (reflink/hardlink) rather than symlinks, circular symlinks do not arise.
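
The idea of a content-addressable store can be sketched in a few lines. The example below is a deliberately simplified model, not pnpm's actual store layout: the sha512-based path scheme, the helper names, and the project-1/project-2 directories are all assumptions made for illustration, and it assumes a file system that supports hard links.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Adds `content` to the store (if absent) and returns the path of the stored file.
// The path is derived from the content hash, so identical content is stored once.
function addToStore(storeDir, content) {
  const hash = crypto.createHash('sha512').update(content).digest('hex');
  const file = path.join(storeDir, hash.slice(0, 2), hash.slice(2));
  if (!fs.existsSync(file)) {
    fs.mkdirSync(path.dirname(file), { recursive: true });
    fs.writeFileSync(file, content);
  }
  return file;
}

// Materializes a store file inside a project by hard-linking it.
function linkIntoProject(storeFile, dest) {
  fs.mkdirSync(path.dirname(dest), { recursive: true });
  fs.linkSync(storeFile, dest);
}

function casDemo() {
  const root = fs.mkdtempSync(path.join(os.tmpdir(), 'cas-demo-'));
  try {
    const store = path.join(root, 'store');
    const source = "module.exports = 'koa';\n";
    const targets = ['project-1', 'project-2'].map((project) =>
      path.join(root, project, 'node_modules', '.pnpm', 'koa@2.13.4', 'node_modules', 'koa', 'index.js')
    );
    const storeFiles = new Set();
    for (const target of targets) {
      const storeFile = addToStore(store, source);
      storeFiles.add(storeFile);
      linkIntoProject(storeFile, target);
    }
    const inodes = targets.map((target) => fs.statSync(target).ino);
    return { storeEntries: storeFiles.size, sharedInode: inodes[0] === inodes[1] };
  } finally {
    fs.rmSync(root, { recursive: true, force: true });
  }
}

console.log(casDemo());
```

Both projects end up with a regular file at the expected path, yet the data exists once on disk.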

Implement Cross-Project Resource Sharing Method

Another great advantage of pnpm is cross-project content sharing. Most people know that pnpm uses hardlinks for this. However, pnpm actually supports several sharing methods, configured through packageImportMethod.

  • auto (default value): pnpm's preferred strategy. It first tries clone (reflink / copy-on-write). If the file system does not support clone, it tries hardlink. If hard linking also fails (for example, when the link would cross file systems), it falls back to copy (a standard file copy). The default aims to select the best feasible option in the current environment.
  • clone (reflink / copy-on-write): the fastest and safest method. It creates a reference to the original file data. If a file in the project's node_modules is modified later, the file system automatically creates a new copy, so the original file in the CAS is not affected. This saves space (no data is copied initially) and guarantees isolation, but it depends on file system support (such as Btrfs, APFS, and XFS with reflink support).
  • hardlink (hard link): the file entry in the project's node_modules and the file entry in the CAS point to the same data blocks on disk. This is highly space efficient, since almost no extra space is used (only the directory entry). The important consequence is that if you modify such a file directly in the project's node_modules, **it will also modify the original file in CAS**, which can unintentionally corrupt the CAS and affect other projects. Hard links also require the source and the link to be on the same file system. In short: extremely space-saving, but it tightly couples the project to the store.
  • copy: a standard file copy. It is the least efficient in both disk space and installation speed, but it works everywhere, even across file systems. It is a universal fallback that gives up pnpm's main advantages (saving disk space and installation time).
  • clone-or-copy: try clone first and fall back to copy if clone is not supported.
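
The fallback chain described for auto can be sketched with Node's fs API. This is an illustration of the idea only, not pnpm's implementation: importFile and the strategies table are hypothetical names, and fs.constants.COPYFILE_FICLONE_FORCE is used to request a copy-on-write clone that fails instead of silently copying.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Each strategy either succeeds or throws; the first one that succeeds wins.
const strategies = [
  ['clone', (src, dest) => fs.copyFileSync(src, dest, fs.constants.COPYFILE_FICLONE_FORCE)],
  ['hardlink', (src, dest) => fs.linkSync(src, dest)],
  ['copy', (src, dest) => fs.copyFileSync(src, dest)],
];

// Imports `src` to `dest` and returns the name of the strategy that worked.
function importFile(src, dest) {
  for (const [name, run] of strategies) {
    try {
      run(src, dest);
      return name;
    } catch {
      // Remove any partial result before trying the next strategy.
      fs.rmSync(dest, { force: true });
    }
  }
  throw new Error(`could not import ${src}`);
}

function importDemo() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'import-demo-'));
  try {
    const src = path.join(dir, 'store-file.js');
    const dest = path.join(dir, 'project-file.js');
    fs.writeFileSync(src, 'module.exports = 1;\n');
    const method = importFile(src, dest);
    return { method, sameContent: fs.readFileSync(dest, 'utf8') === fs.readFileSync(src, 'utf8') };
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}

console.log(importDemo());
```

Which method is reported depends on the file system the temporary directory lives on, which is exactly the point of the auto setting.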

Control Dependency Package Interaction hoist Behavior

Earlier we saw that pnpm's publicHoistPattern setting hoists indirect dependencies to the root node_modules so that the application can access them. How, then, is hoisting between dependency packages controlled?

The answer is simple: link the shared dependency packages into the .pnpm/node_modules directory.

The .pnpm/node_modules directory lies on the node_modules lookup chain of every .pnpm/<package>@<version>/node_modules/<package>, so every dependency package can access the shared packages under .pnpm/node_modules/<package>, while the application cannot. To share a dependency library with other dependency libraries, link it into .pnpm/node_modules.

For example, node_modules/.pnpm/node_modules/accepts in the listing below can be accessed by every node_modules/.pnpm/<package>@<version>/node_modules/<package> package.

If you do not want a library to be shared, you can keep it out of node_modules/.pnpm/node_modules through hoistPattern.

bash
node_modules
  .modules.yaml
  .pnpm
    accepts@1.3.8
      node_modules
    array-flatten@1.1.1
      node_modules
    node_modules
      .bin
      accepts -> ../accepts@1.3.8/node_modules/accepts
      koa-compose -> ../koa-compose@4.1.0/node_modules/koa-compose
  express -> .pnpm/express@4.18.1/node_modules/express
  koa -> .pnpm/koa@2.13.4/node_modules/koa

Supplement of hoistPattern feature

pnpm's default is hoistPattern: ['*'], which means that all dependency libraries are hoisted to node_modules/.pnpm/node_modules. Note that each entry in .pnpm/node_modules is one specific version of a package: when several versions of the same package exist, only one of them (usually the highest) is hoisted.

This feature exists mainly for compatibility with early npm packages that rely on npm's weak constraints (phantom dependencies) and access indirect dependencies directly from their build output. The best practice for modern applications is to set hoist to false, so that node_modules/.pnpm/node_modules is no longer created and build output cannot reach indirect dependencies.

Handle peerDependencies

So far we have solved public-hoist (between the application and dependency libraries) and hoist (between dependency libraries). A more complex problem remains: handling peerDependencies. peerDependencies has two notable characteristics that significantly affect the resolve flow:

  • If the foo package declares foo-peer in its peerDependencies, then foo-peer is provided by the host, that is, by whichever package depends on foo.

    bash
    app
      dependencies
        bar --> install foo-peer@1.0.0
          devDependencies
            foo@1.0.0
              peerDependencies
                foo-peer@1.0.0
    bash
    app --> install foo-peer@1.0.0
      dependencies
        bar
          dependencies
            foo@1.0.0
              peerDependencies
                foo-peer@1.0.0
  • If app1 and app2 both depend on foo@1.0.0, foo's peer dependency is still provided by each host (app1 and app2). Because app1 and app2 provide different foo-peer versions, foo@1.0.0 resolves to two different locations, even though both apps depend on the same version of foo.

    packages/app1/package.json
    json
    {
      "dependencies": {
        "foo": "1.0.0",
        "foo-peer": "1.0.0"
      }
    }
    packages/app2/package.json
    json
    {
      "dependencies": {
        "foo": "1.0.0",
        "foo-peer": "2.0.0"
      }
    }
    bash
    node_modules
      .pnpm
        foo@1.0.0_foo-peer@1.0.0
          node_modules
            foo
            foo-peer -> ../../foo-peer@1.0.0/node_modules/foo-peer
        foo@1.0.0_foo-peer@2.0.0
          node_modules
            foo
            foo-peer -> ../../foo-peer@2.0.0/node_modules/foo-peer
        foo-peer@1.0.0
          node_modules
            foo-peer
        foo-peer@2.0.0
          node_modules
            foo-peer
    packages
      app1
        node_modules
          foo -> ../../../node_modules/.pnpm/foo@1.0.0_foo-peer@1.0.0/node_modules/foo
          foo-peer -> ../../../node_modules/.pnpm/foo@1.0.0_foo-peer@1.0.0/node_modules/foo-peer
      app2
        node_modules
          foo -> ../../../node_modules/.pnpm/foo@1.0.0_foo-peer@2.0.0/node_modules/foo
          foo-peer -> ../../../node_modules/.pnpm/foo@1.0.0_foo-peer@2.0.0/node_modules/foo-peer

    Both app1 and app2 load the same version of foo, but with different versions of foo-peer. In node_modules, app1 therefore links to foo@1.0.0_foo-peer@1.0.0/node_modules/foo-peer, while app2 links to foo@1.0.0_foo-peer@2.0.0/node_modules/foo-peer.

    Because pnpm populates these directories with hard links, the files of foo under foo@1.0.0_foo-peer@1.0.0 and foo@1.0.0_foo-peer@2.0.0 have the same inode: two different hard links point to the same content in the store, while each directory sits next to its own foo-peer. This is how pnpm neatly solves the multi-version problem of peerDependencies, but it introduces another problem: peerDependencies fragmentation.
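
The directory names in the listing above can be reproduced with a small sketch. It assumes the simplified `<name>@<version>_<peer>@<peerVersion>` scheme shown in the listing; the function name `virtualStoreDir` is hypothetical, and real pnpm versions use different separators and may shorten long suffixes.

```js
// Simplified model of a virtual-store directory name (not pnpm's actual code).
// Assumption: the suffix is the sorted list of resolved peers joined by "_",
// mirroring the listing above.
function virtualStoreDir(name, version, resolvedPeers = {}) {
  const suffix = Object.keys(resolvedPeers)
    .sort()
    .map((peer) => `${peer}@${resolvedPeers[peer]}`)
    .join('_');
  return suffix ? `${name}@${version}_${suffix}` : `${name}@${version}`;
}

// Same foo version, different host-provided peer: two distinct directories.
const forApp1 = virtualStoreDir('foo', '1.0.0', { 'foo-peer': '1.0.0' });
const forApp2 = virtualStoreDir('foo', '1.0.0', { 'foo-peer': '2.0.0' });
console.log(forApp1); // foo@1.0.0_foo-peer@1.0.0
console.log(forApp2); // foo@1.0.0_foo-peer@2.0.0
```

A package without peers keeps the plain `name@version` directory, which is why foo-peer itself appears only once per version.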

peerDependencies Fragmentation

As shown above, because of peerDependencies, pnpm keeps a separate copy of foo for each peerDependencies version it must resolve to, even when the project uses a single version of foo. This is the classic npm duplicate problem, whose consequences are not repeated in detail here; the most common ones are duplicated code in bundles and broken singletons.

For example, if our app depends on both app1 and app2, bundling app will include the same version of foo more than once.

It gets worse: duplication caused by peerDependencies is contagious. Not only is foo duplicated; every parent of foo in the dependency graph must be duplicated as well so that each copy resolves the right foo.
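
The contagion can be illustrated with a toy model. The graph, the package bar, and the function `resolveInstance` are hypothetical, and the sketch assumes that no package in the chain provides foo-peer itself, so the peer always comes from the host at the top. Under that assumption, the peer suffix of a child bubbles up into the instance key of its parent.

```js
// Hypothetical graph: bar depends on foo, and foo declares foo-peer as a peer.
const graph = {
  bar: { version: '1.0.0', deps: ['foo'], peers: [] },
  foo: { version: '1.0.0', deps: [], peers: ['foo-peer'] },
};

// Returns the instance key of a package for a given host.
// hostPeers maps each peer name to the version the host installs.
function resolveInstance(name, hostPeers) {
  const pkg = graph[name];
  // Peers declared by this package are taken from the host.
  const parts = pkg.peers.map((peer) => `${peer}@${hostPeers[peer]}`);
  // Peers needed by children also distinguish this package's instances.
  for (const dep of pkg.deps) {
    parts.push(...resolveInstance(dep, hostPeers).parts);
  }
  const unique = [...new Set(parts)].sort();
  const base = `${name}@${pkg.version}`;
  return { key: unique.length ? `${base}_${unique.join('_')}` : base, parts: unique };
}

console.log(resolveInstance('bar', { 'foo-peer': '1.0.0' }).key); // bar@1.0.0_foo-peer@1.0.0
console.log(resolveInstance('bar', { 'foo-peer': '2.0.0' }).key); // bar@1.0.0_foo-peer@2.0.0
```

bar declares no peers of its own, yet it still ends up with two instances, one per foo-peer version chosen by the host.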

pnpm's handling satisfies the semantics of peerDependencies, but it may not match what the user actually wants. In most scenarios the user does not want several copies of foo in the bundle and is happy for all consumers to share one version of the peer dependency.

The pnpm hook mentioned above can enforce this easily. A better way, however, is to edit package.json by hand so that every package uses the same peerDependencies version (the lowest common one); pnpm then selects a version that satisfies all constraints, which reduces duplicate packages.

.pnpmfile.cjs
js
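function readPackage(pkg, context) {
  // Pin foo-peer to a single version wherever a package depends on both
  // foo and foo-peer, so that all consumers share one foo instance.
  // The check is on dependencies only: app1 and app2 above do not declare
  // peerDependencies themselves.
  const deps = pkg.dependencies;
  if (deps && deps['foo'] && deps['foo-peer']) {
    deps['foo-peer'] = '1.0.0';
  }
  return pkg;
}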

module.exports = {
  hooks: {
    readPackage
  }
};
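
Before aligning versions by hand, it helps to see where they diverge. The following is a minimal sketch, not a pnpm feature: the function `findVersionMismatches` and the in-memory `manifests` object are hypothetical stand-ins for reading each workspace package.json from disk.

```js
// Groups workspace projects by the version range they declare for depName.
// Returns an empty object when all projects agree.
function findVersionMismatches(manifests, depName) {
  const byVersion = new Map();
  for (const [project, manifest] of Object.entries(manifests)) {
    const range =
      (manifest.dependencies && manifest.dependencies[depName]) ||
      (manifest.peerDependencies && manifest.peerDependencies[depName]);
    if (!range) continue; // this project does not use depName
    if (!byVersion.has(range)) byVersion.set(range, []);
    byVersion.get(range).push(project);
  }
  return byVersion.size > 1 ? Object.fromEntries(byVersion) : {};
}

// The two manifests from the example above.
const manifests = {
  'packages/app1': { dependencies: { foo: '1.0.0', 'foo-peer': '1.0.0' } },
  'packages/app2': { dependencies: { foo: '1.0.0', 'foo-peer': '2.0.0' } },
};

console.log(findVersionMismatches(manifests, 'foo-peer'));
// { '1.0.0': [ 'packages/app1' ], '2.0.0': [ 'packages/app2' ] }
```

The comparison is on the declared range strings only; two different ranges that happen to resolve to the same version would still be reported.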

inject workspace

peerDependencies problems are not unique to pnpm, but pnpm has one distinctive property: workspace packages and third-party dependencies are linked differently. Normally, the files of each dependency are hardlinks/reflinks into pnpm's global store.

When several versions of peerDependencies are involved, several hardlink/reflink copies of a dependency exist. For a workspace package it is different: if app1 depends on a workspace package sdk, pnpm creates no hardlink/reflink copy of sdk but symlinks sdk in app1's node_modules directly to the sdk directory. The difference is shown in the figure below.

Because workspace packages are not hardlinked/reflinked, pnpm cannot easily create multiple copies of them, so peerDependencies are handled slightly differently for workspace packages than for third-party dependencies.

Symlinking workspace packages causes a problem when peerDependencies are looked up. Take a typical React component library as an example.

Suppose we have three workspace packages, namely form package, card package, and button package:

  • The form and card packages both depend on the button package.
  • The form package must run with react@17.
  • The card package must run with react@^16.
  • The button package supports both react@16 and react@17 and declares react in peerDependencies.
bash
packages
  form
    node_modules
      react [react@17.0.0]
      button -> link ../../button/node_modules/button
    index.jsx -> [workspace:button]
    package.json
  card
    node_modules
      react [react@^16.0.0]
      button -> link ../../button/node_modules/button
    index.jsx -> [workspace:button]
    package.json
  button
    node_modules
    index.jsx -> peer dependencies [react@*]
    package.json

If you now bundle the form package, the build fails:

bash
[ERROR] Could not resolve "react"

This happens because, by default, bundlers resolve a symlinked module from its real path (preserveSymlinks = false, or symlinks = true in some bundlers), while button's peerDependencies must be provided by its hosts (form and card). In other words, react, as a peer dependency of button, is installed in the node_modules of form and card rather than in the node_modules of button or of the workspace root, so button cannot find react when it is resolved from packages/button.

PS: The problem does not occur once the packages have been published and are installed from the registry, because they are then linked through hardlinks.
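
The failed lookup can be traced with a sketch of a Node-style node_modules search. The function `candidateNodeModules` and the `/repo` paths are hypothetical; real bundlers implement their own resolvers, so treat this only as an approximation of the directories they try.

```js
const path = require('node:path').posix;

// Lists the node_modules directories searched when resolving a bare
// import such as "react" from a file located in fromDir.
function candidateNodeModules(fromDir) {
  const dirs = [];
  let dir = fromDir;
  while (true) {
    // As in Node's algorithm, never search node_modules/node_modules.
    if (path.basename(dir) !== 'node_modules') {
      dirs.push(path.join(dir, 'node_modules'));
    }
    const parent = path.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return dirs;
}

// Real path of button (symlink followed): form's node_modules is never searched.
console.log(candidateNodeModules('/repo/packages/button'));

// Symlink path kept: form's node_modules, where react lives, is searched.
console.log(candidateNodeModules('/repo/packages/form/node_modules/button'));
```

Only the second call includes `/repo/packages/form/node_modules`, which is where react is installed in this example.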

There are two ways to solve this problem:

  1. Set preserveSymlinks = true so that button is resolved from its location inside form's node_modules rather than from its real directory; react is then found in form's node_modules. This approach is usually unreliable and easily leads to confusing dependency relationships.

    bash
    packages
     form
       node_modules
         react [react@17.0.0]
         button -> link ../../button/node_modules/button
       index.jsx -> workspace [button]
       package.json
     button
       node_modules
       index.jsx -> peer dependencies [react@*]
       package.json
  2. Use injected dependencies: card and form get button through hardlinks instead of a symlink, so button resolves react from the node_modules of its host (card or form), exactly like a published npm package, and bundling succeeds. pnpm supports switching from symlink to hardlink with the injected field in dependenciesMeta:

form/package.json
json
{
  "dependenciesMeta": {
    "button": {
      "injected": true
    }
  }
}

But, as mentioned earlier, a broken hardlink makes subsequent watching fail. Here button is linked through hardlinks. If an original file of button is deleted, the hardlink between form and button is broken; when the file is recreated and later modified, the hardlinked copy does not see the change, the watcher detects nothing, and HMR is never triggered.

This is a very common scenario. In a large monorepo, for performance reasons, developers often link to and watch the already built dist directory of inactive subprojects, so their source code need not be compiled during development, which greatly improves efficiency. With the default symlinks this has little impact, but with hardlinks the problem above appears.

So when using hardlinks, do not delete the linked files. In addition, watchers (such as chokidar) may not handle hardlinks well and may lose some watch events; for details, see watch-event-missing.
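
The broken link can be reproduced outside pnpm with plain Node.js file APIs. This is a sketch under stated assumptions: the function `demoHardlinkBreak` and the file names are hypothetical stand-ins for button's build output and its injected copy, and the temporary directory must be on a filesystem that supports hard links.

```js
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

function demoHardlinkBreak() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'hardlink-demo-'));
  const source = path.join(dir, 'button.js'); // stands in for button's output file
  const injected = path.join(dir, 'injected-button.js'); // stands in for the hardlinked copy

  fs.writeFileSync(source, 'v1');
  fs.linkSync(source, injected); // both names now share one inode
  const sameInodeBefore = fs.statSync(source).ino === fs.statSync(injected).ino;

  // A clean-and-rebuild step: delete the original, then write it again.
  fs.unlinkSync(source);
  fs.writeFileSync(source, 'v2');
  const sameInodeAfter = fs.statSync(source).ino === fs.statSync(injected).ino;
  const injectedContent = fs.readFileSync(injected, 'utf8');

  fs.rmSync(dir, { recursive: true, force: true });
  return { sameInodeBefore, sameInodeAfter, injectedContent };
}

console.log(demoHardlinkBreak());
```

After the delete and rewrite, the two paths no longer share an inode and the injected copy still holds the old content, so a watcher on the injected copy has nothing to report.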

Released under the CC BY-SA 4.0 License. (abd9c64)