
COW technology makes Node.js "lazy"

2021-08-27 09:35:08 zxg_ God said to have light

COW here is not a cow. It is the abbreviation of Copy-On-Write, a copying technique that does not copy everything.

Generally speaking, copying creates two identical copies that are independent of each other.

However, sometimes a real copy is unnecessary because the existing content can be reused as it is. In that case the "copy" can simply reference the original, and the affected part is copied only when something is written. If the content is only read, nothing is copied; if it needs to be written, only the part being changed is actually copied.

This is called "copy when writing", that is, Copy-On-Write.
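
The idea can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions, not how any operating system implements it: `CowArray` and its methods are hypothetical names invented for this example.

```javascript
// Minimal copy-on-write wrapper (illustrative; CowArray is a hypothetical name).
class CowArray {
  constructor(items, shared = false) {
    this.items = items;   // backing array, possibly shared with other copies
    this.shared = shared; // true while another CowArray may reference this.items
  }

  // "Copying" only hands out another reference to the same backing array.
  copy() {
    this.shared = true;
    return new CowArray(this.items, true);
  }

  // Reads never trigger a copy.
  get(index) {
    return this.items[index];
  }

  // The real copy happens on the first write to a shared backing array.
  // (Conservative: the other side may copy once more than strictly needed.)
  set(index, value) {
    if (this.shared) {
      this.items = this.items.slice();
      this.shared = false;
    }
    this.items[index] = value;
  }
}

const original = new CowArray(['a', 'b', 'c']);
const copy = original.copy();
const sharedBeforeWrite = original.items === copy.items; // one backing array
copy.set(0, 'z');
const sharedAfterWrite = original.items === copy.items;  // now two arrays
console.log(sharedBeforeWrite, sharedAfterWrite, original.get(0), copy.get(0));
```

Until `set` is called, the two objects share one array, so the "copy" costs almost nothing.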

The principle is simple, but it is very common in the memory management and file systems of operating systems, and it is also what makes Node.js "lazy".

In this article, let's explore how Copy-On-Write applies to process creation and file copying in Node.js.

File copying

The most obvious way to copy a file is to write exactly the same content to another location, but this has two problems:

  • The same content is written again. If a file is copied hundreds of times, the same content is created hundreds of times, which wastes disk space.
  • What if the power is cut halfway through the write? How can the overwritten content be restored?

What can be done? This is where operating system designers turned to COW.

COW solves both of these problems:

  • Copying only adds a reference to the existing content. Nothing is really copied as long as nothing is modified; only when content is modified for the first time is the corresponding data block copied. This avoids wasting a lot of disk space.
  • When a file is written, the change is first made in a free disk block, and the target location is updated only after the modification is complete. A power failure therefore cannot leave the file in a state that cannot be rolled back.
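
The two points above can be modelled in memory. The following is a rough sketch under loud assumptions: `BlockStore` and all of its methods are hypothetical, and a real file system tracks blocks on disk rather than in a `Map`.

```javascript
// Illustrative in-memory model of block-level copy-on-write.
// BlockStore and its methods are hypothetical, not a real file system API.
class BlockStore {
  constructor() {
    this.blocks = new Map(); // block id -> { data, refs }
    this.files = new Map();  // file name -> array of block ids
    this.nextId = 0;
  }

  allocate(data) {
    const id = this.nextId++;
    this.blocks.set(id, { data, refs: 1 });
    return id;
  }

  create(name, chunks) {
    this.files.set(name, chunks.map((chunk) => this.allocate(chunk)));
  }

  // Cloning copies only the list of block ids and bumps reference counts.
  clone(source, target) {
    const ids = this.files.get(source);
    ids.forEach((id) => { this.blocks.get(id).refs += 1; });
    this.files.set(target, ids.slice());
  }

  // Writing stores the new data in a fresh block first and only then
  // switches the file's pointer, so the old block stays intact until then.
  write(name, index, data) {
    const ids = this.files.get(name);
    const oldId = ids[index];
    ids[index] = this.allocate(data);
    const old = this.blocks.get(oldId);
    old.refs -= 1;
    if (old.refs === 0) this.blocks.delete(oldId); // nobody uses it any more
  }

  read(name) {
    return this.files.get(name).map((id) => this.blocks.get(id).data).join('');
  }
}

const store = new BlockStore();
store.create('a.txt', ['hello ', 'world']);
store.clone('a.txt', 'b.txt');
const blocksAfterClone = store.blocks.size; // cloning allocated no new blocks
store.write('b.txt', 1, 'node');
const blocksAfterWrite = store.blocks.size; // only the modified block was added
console.log(store.read('a.txt'), '|', store.read('b.txt'));
```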

In Node.js, the fs.copyFile API can use the Copy-On-Write mode.

By default, copyFile writes to the target file and overwrites its existing content:

const fsPromises = require('fs').promises;

(async function() {
  try {
    await fsPromises.copyFile('source.txt', 'destination.txt');
  } catch(e) {
    console.log(e.message);
  }
})();

However, you can specify the copy strategy through the third parameter:

const fs = require('fs');
const fsPromises = fs.promises;
const { COPYFILE_EXCL, COPYFILE_FICLONE, COPYFILE_FICLONE_FORCE} = fs.constants;

(async function() {
  try {
    await fsPromises.copyFile('source.txt', 'destination.txt', COPYFILE_FICLONE);
  } catch(e) {
    console.log(e.message);
  }
})();

Three flags are supported:

  • COPYFILE_EXCL: report an error if the target file already exists (the default is to overwrite it)
  • COPYFILE_FICLONE: copy in copy-on-write mode, and fall back to a real copy if the platform does not support it (the default is a direct copy)
  • COPYFILE_FICLONE_FORCE: copy in copy-on-write mode, and report an error if the platform does not support it
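
A small sketch of the first two flags in action. It uses the synchronous fs.copyFileSync for brevity and works in a temporary directory; the file names and the `cow-demo-` prefix are arbitrary choices made for this example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const { COPYFILE_EXCL, COPYFILE_FICLONE } = fs.constants;

// Work in a throwaway directory so the sketch does not touch real files.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cow-demo-'));
const source = path.join(dir, 'source.txt');
const destination = path.join(dir, 'destination.txt');
fs.writeFileSync(source, 'hello copy-on-write');

// COPYFILE_FICLONE: try a copy-on-write clone, fall back to a normal copy.
fs.copyFileSync(source, destination, COPYFILE_FICLONE);
const copied = fs.readFileSync(destination, 'utf8');

// COPYFILE_EXCL: the destination exists now, so this copy is expected to fail.
let exclErrorCode = null;
try {
  fs.copyFileSync(source, destination, COPYFILE_EXCL);
} catch (e) {
  exclErrorCode = e.code;
}

console.log(copied, exclErrorCode);

// Clean up the temporary directory.
fs.rmSync(dir, { recursive: true, force: true });
```

Whether the first copy is a real copy-on-write clone depends on the file system. With COPYFILE_FICLONE the result is the same file content either way.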

The values of these three constants are 1, 2, and 4, so you can combine them with a bitwise OR and pass the result in:

const flags = COPYFILE_FICLONE | COPYFILE_EXCL;
fsPromises.copyFile('source.txt', 'destination.txt', flags);

Node.js supports the operating system's copy-on-write capability, which can improve performance in some scenarios. COPYFILE_FICLONE is the recommended choice: where copy-on-write is supported it does better than the default, and elsewhere it falls back to a normal copy.
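
Because each flag occupies its own bit, a combined value can be taken apart again with a bitwise AND. The helper below is a hypothetical illustration (`describeCopyFlags` is not part of Node.js):

```javascript
const { COPYFILE_EXCL, COPYFILE_FICLONE, COPYFILE_FICLONE_FORCE } = require('fs').constants;

// describeCopyFlags is a hypothetical helper, not a Node.js API.
// It lists the names of the flags that are set in a combined value.
function describeCopyFlags(flags) {
  const names = [];
  if (flags & COPYFILE_EXCL) names.push('COPYFILE_EXCL');
  if (flags & COPYFILE_FICLONE) names.push('COPYFILE_FICLONE');
  if (flags & COPYFILE_FICLONE_FORCE) names.push('COPYFILE_FICLONE_FORCE');
  return names;
}

// Combine two flags with bitwise OR, then decode the result.
const combined = COPYFILE_FICLONE | COPYFILE_EXCL;
console.log(combined, describeCopyFlags(combined));
```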

Process creation

fork is a common way to create a process, and its implementation is a form of copy-on-write.

As we know, a process's memory is divided into three parts: the code segment, the data segment, and the stack segment.

  • Code segment: stores the code to be executed
  • Data segment: stores global data
  • Stack segment: stores the execution state

If a new process is created from an existing one, these three parts of memory have to be copied. If the copies are identical to the originals, that wastes memory.

So fork does not really copy the memory. It creates a new process that references the memory of the parent process, and a part of that memory is actually copied only when its data is modified.

This is why process creation is called fork: the new process is a branch rather than something completely independent. There are now two processes, but most of their content is the same and only the modified part diverges.

But what if the new process needs to run different code? That is what exec is for: it creates new code, data, and stack segments and executes the new code.

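Node.js also provides APIs named fork and exec. Note that they do not map one-to-one onto the system calls: cluster.fork() starts a new Node.js worker process, and child_process.exec() runs a command in a shell.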

fork:

const cluster = require('cluster');

// cluster.isPrimary replaces the deprecated cluster.isMaster (Node.js 16+).
if (cluster.isPrimary) {
  console.log('I am primary');
  cluster.fork();
  cluster.fork();
} else if (cluster.isWorker) {
  console.log(`I am worker #${cluster.worker.id}`);
}

exec:

const { exec } = require('child_process');
exec('my.bat', (err, stdout, stderr) => {
  if (err) {
    console.error(err);
    return;
  }
  console.log(stdout);
});

fork is the foundation of process creation on Linux, which shows how important copy-on-write is.
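
To check that a child started from Node.js really is a separate process, the sketch below launches a second Node.js process and compares process ids. It assumes only the built-in child_process module; the one-line script passed to `-e` is made up for this example.

```javascript
const { execFileSync } = require('child_process');

// Start a second Node.js process that prints its own process id.
// process.execPath is the path of the Node.js binary running this script.
const output = execFileSync(process.execPath, ['-e', 'console.log(process.pid)'], {
  encoding: 'utf8',
});
const childPid = Number(output.trim());

console.log(`parent pid: ${process.pid}, child pid: ${childPid}`);
```

The two ids differ because the child is its own process with its own code, data, and stack segments.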

Summary

Storing multiple copies of the same content is a waste of space, so operating systems use Copy-On-Write both when copying files and when copying memory during process creation: content is copied only when it is actually modified.

Node.js supports flags for fs.copyFile. You can specify COPYFILE_FICLONE to copy files in Copy-On-Write mode, and this is the recommended way to save disk space and improve the performance of file copying.

Process fork is also an implementation of Copy-On-Write. The code, data, and stack segments of the process are not copied directly for the new process; the new process references the existing ones, and memory is really copied only when it is modified.

Beyond these two cases, Copy-On-Write is also widely used in the implementation of immutable data structures, in distributed read-write separation, and in other fields.

COW makes Node.js "lazy", but it also makes it faster.

copyright notice
author[zxg_ God said to have light],Please bring the original link to reprint, thank you.
https://en.qdmana.com/2021/08/20210827093502545b.html
