Package Manager
yarn
PnP
Note
The following content is an extension based on this article
Background
The most direct reason for the Yarn team to develop the PnP feature is that the current dependency management method is too inefficient. It's slow both when referencing dependencies and when installing them.
Let's first discuss Node's logic when handling dependency references. This process has two scenarios: If we pass a core module (such as "fs", "path", etc.) or a local path (such as ./module-a.js or /my-li/module-b.js) to the require() call, then Node will directly use the corresponding file. In any other case, Node will start looking for a directory named node_modules.
During the lookup, Node first looks for node_modules in the current directory. If it's not found, it looks in the parent directory, and so on up to the system root directory. If a node_modules directory exists, Node checks whether the module to be loaded exists in it. If not, it continues searching in the parent directory. If the module is found, Node checks whether the corresponding package.json specifies a main property. If a main property is specified, Node loads the file it points to; otherwise, it defaults to index.js. If there's no index.js file, it looks for index.json, then index.node. If none are found, it throws an error.
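The lookup described above can be sketched in a few lines of JavaScript. This is a simplified illustration, not Node's actual resolver: the function names resolveFromNodeModules and resolveEntry are hypothetical, and the sketch only covers the rules listed in the paragraph above (walking up the directory tree, the main property, and the index.js, index.json, index.node fallbacks).

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: find the entry file of package `name`,
// starting the search in `fromDir` and walking up to the root.
function resolveFromNodeModules(name, fromDir) {
  let dir = path.resolve(fromDir);
  while (true) {
    const pkgDir = path.join(dir, 'node_modules', name);
    if (fs.existsSync(pkgDir)) {
      return resolveEntry(pkgDir);
    }
    const parent = path.dirname(dir);
    if (parent === dir) break; // reached the file system root
    dir = parent;
  }
  throw new Error(`Cannot find module '${name}'`);
}

// Hypothetical helper: pick the entry file inside a package directory.
function resolveEntry(pkgDir) {
  const manifest = path.join(pkgDir, 'package.json');
  if (fs.existsSync(manifest)) {
    const { main } = JSON.parse(fs.readFileSync(manifest, 'utf8'));
    if (main) return path.join(pkgDir, main);
  }
  // No main property: fall back to the default entry files, in order.
  for (const candidate of ['index.js', 'index.json', 'index.node']) {
    const file = path.join(pkgDir, candidate);
    if (fs.existsSync(file)) return file;
  }
  throw new Error(`Cannot find an entry file in '${pkgDir}'`);
}
```

Every loop iteration and every fallback is at least one file system check, which is where the cost discussed in this section comes from.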
Figure: flow chart of the require module search process.
For the specific require execution process, you can refer to this article. The execution chain can be divided into the following stages: require => Module._load => Module.prototype.load => Module._extensions => Module.prototype._compile => return module.exports.
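The first and fourth stages of this chain can be observed from user code. The sketch below temporarily wraps Module._load to record requests and then lists the extensions registered in Module._extensions. Both are underscore-prefixed internals of Node's CommonJS loader, so treat this as an exploration aid that may behave differently across Node versions, not as a stable API.

```javascript
const Module = require('module');

// Record every request that passes through Module._load
// while the wrapper is installed.
const requests = [];
const originalLoad = Module._load;
Module._load = function (request, parent, isMain) {
  requests.push(request);
  return originalLoad.apply(this, arguments);
};

require('path'); // require => Module._load

Module._load = originalLoad; // always restore the original loader

// Extensions that have a loader registered in Module._extensions.
const loaders = Object.keys(Module._extensions);
```

After running this, requests contains the string passed to require(), and loaders contains the extensions that Node knows how to load by default.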
It can be seen that Node needs to perform a lot of processing when resolving dependencies, which is not efficient.
Let's look at what happens during dependency installation. Currently, the yarn install operation performs the following 4 steps:
- Resolve the dependency version range to a specific version number
- Download the corresponding version's tar package to the local offline mirror
- Extract the dependency from the offline mirror to the local cache
- Copy the dependency from the local cache to the node_modules directory in the current directory
The 4th step also involves a lot of file I/O, resulting in inefficient dependency installation (especially in CI environments where all dependencies need to be installed each time).
Facebook's engineers had enough of these issues and decided to find a solution that could completely solve the problems while remaining compatible with the existing ecosystem. This led to the Plug'n'Play feature, abbreviated as PnP. It has been tested internally at Facebook for some time, and now the Yarn team has decided to share it with the community and optimize it together.
Implementation Method
Instead of copying dependencies from the local cache to node_modules, Yarn maintains a static mapping table that contains the following information:
- Which versions of which dependency packages are included in the current dependency tree
- How these dependency packages are related to each other
- The specific locations of these dependency packages in the file system
This mapping table corresponds to the .pnp.js file in the project directory in Yarn's PnP implementation.
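To make the three kinds of information concrete, here is a minimal sketch of what such a table could look like. The shape, the package names, the versions, and the paths are all invented for illustration and are not the actual contents or format of .pnp.js. In this sketch, asking for a dependency that the requesting package does not list is treated as an error.

```javascript
// Hypothetical table: package name -> version -> { location, dependencies }.
const packageMap = new Map([
  ['app', new Map([
    ['1.0.0', {
      location: '/projects/app/',
      dependencies: new Map([['lib-b', '2.1.0']]),
    }],
  ])],
  ['lib-b', new Map([
    ['2.1.0', {
      location: '/cache/lib-b-2.1.0/',
      dependencies: new Map([['lib-c', '0.3.0']]),
    }],
  ])],
  ['lib-c', new Map([
    ['0.3.0', { location: '/cache/lib-c-0.3.0/', dependencies: new Map() }],
  ])],
]);

// Look up one package entry, or fail if it is not in the tree.
function getPackageInfo(name, version) {
  const versions = packageMap.get(name);
  const info = versions && versions.get(version);
  if (!info) throw new Error(`Unknown package ${name}@${version}`);
  return info;
}

// Find where `request` lives on disk, as seen from the issuer package.
function resolveDependency(issuerName, issuerVersion, request) {
  const issuer = getPackageInfo(issuerName, issuerVersion);
  const version = issuer.dependencies.get(request);
  if (version === undefined) {
    throw new Error(`${issuerName}@${issuerVersion} does not list ${request} as a dependency`);
  }
  return getPackageInfo(request, version).location;
}
```

With this table, resolveDependency('app', '1.0.0', 'lib-b') is answered by two in-memory lookups, without touching the file system.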
How is this .pnp.js file generated, and how does Yarn use it?
During dependency installation, after step 3 is completed, Yarn doesn't copy the dependency to the node_modules directory. Instead, it records the specific location of the dependency in the cache in .pnp.js. This avoids a lot of I/O operations while also preventing the generation of a node_modules directory in the project directory.
Additionally, .pnp.js contains a special resolver. Yarn uses this resolver to handle require() requests (it intercepts at the Module level, changing the original Node behavior). The resolver determines the specific location of the dependency in the file system directly from the static mapping table contained in the .pnp.js file, thus avoiding the I/O operations that the current implementation performs when handling dependency references.
Advantages
From the PnP implementation, it can be seen that the same version of the same dependency referenced by different projects on the same system actually points to the same directory in the global cache. This brings several immediate benefits:
- Dependency installation becomes much faster. Multiple CI instances in a CI environment can share the same cache
- Multiple projects on the same system no longer need to store multiple copies of the same dependency on disk
Disadvantages
Script execution is restricted. All dependency references must be handled by the resolver in .pnp.js. Therefore, whether executing a script or directly executing a JS file with node, execution must go through Yarn, that is, through yarn run or yarn node.
Debugging is inconvenient. In PnP projects, there is no node_modules directory. Compared to directly executing scripts with node, PnP rewrites the Module implementation and adds a mapping operation. When debugging source code, execution must also pass through the PnP layer, even though developers rarely care about PnP's internal implementation. Furthermore, since dependencies point to the global cache, we can no longer modify them directly, and developers cannot find the source code at its original node_modules location, which is extremely inconvenient for debugging. To debug, you need to use yarn unplug packageName provided by yarn to copy a specific dependency to the .pnp/unplugged directory in the project. After that, the resolver in .pnp.js will automatically load this unplugged version. After debugging, execute yarn unplug --clear packageName to remove the corresponding dependency from the local .pnp/unplugged directory.
Issues:
- Developers need to set breakpoints at the dependency package entry (.pnp/unplugged/npm-[module name]-[version]-[hash]-integrity/node_modules/[module name]/[entry path]) to debug.
- For example, suppose A depends on B and B depends on C, where B is a dependency module and C is an external dependency module of B. When debugging, you first need to yarn unplug B. If you need to debug the C module used by B, you also need to yarn unplug C. The same applies to the other dependencies of A, which greatly increases debugging costs.
pnpm
pnpm has the following excellent features:
- Fast package installation speed. According to this article, in most scenarios pnpm installs packages significantly faster than npm/yarn, about 2-3 times faster, including when yarn uses the PnP installation mode.
- Efficient disk space utilization