A 10,000-word deep dive! The JavaScript execution process explained in detail

2021-08-23 09:57:56 Tong ouba

JavaScript code execution is divided into two main stages: the compile phase and the execution phase. Everything in this article is based on the V8 engine.

1 Preface

The V8 engine

How the V8 engine works:

V8 consists of many submodules, of which four are the most important:

  • Parser: converts JavaScript source code into an Abstract Syntax Tree (AST);
    • If a function is never called, it is not converted into an AST.
  • Ignition: the interpreter. It converts the AST into bytecode and executes the bytecode by interpretation. At the same time it collects the information TurboFan needs for optimizing compilation, such as the types of function arguments, because the real operation can only be carried out once the types are known;
    • If a function is called only once, Ignition simply interprets its bytecode.
    • Besides generating bytecode, the interpreter is also able to execute it by interpretation.

There are usually two types of interpreter: stack-based and register-based. A stack-based interpreter uses a stack to hold function arguments, intermediate results, variables, and so on. A register-based virtual machine supports instructions that operate on registers and uses registers to hold arguments and intermediate results. In practice, stack-based virtual machines also define a small number of registers, and register-based virtual machines also have a stack; the difference lies in the instruction set they provide.

Most interpreters are stack-based, such as the Java virtual machine, the .NET virtual machine, and the early V8 virtual machine. Stack-based virtual machines make it simple to handle function calls, recursion, and context switching. The current V8 virtual machine is register-based and keeps some intermediate data in registers. Register-based interpreter architecture:

  • TurboFan: the compiler. It uses the type information collected by Ignition to convert bytecode into optimized machine code;
    • If a function is called many times, it is marked as a hot function and TurboFan converts it into optimized machine code to improve execution performance;
    • However, machine code can also be reverted to bytecode. If the types change in a later execution of the function (for example, a sum function that was first called with number arguments is later called with string arguments), the previously optimized machine code can no longer handle the operation correctly, so it is converted back to bytecode;
  • Orinoco: the garbage collector, responsible for reclaiming memory that the program no longer needs.
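
The type-change scenario described for TurboFan can be reproduced with a small script. This is only a sketch: the name sum comes from the example above, the loop count is an arbitrary assumption, and whether V8 really optimizes or deoptimizes at these points depends on the engine version and its internal heuristics.

```javascript
function sum(a, b) {
  return a + b;
}

// Called repeatedly with numbers: Ignition records "number + number" as
// type feedback, and the function may be marked hot and handed to TurboFan.
let total = 0;
for (let i = 0; i < 100000; i++) {
  total = sum(total, 1);
}
console.log(total); // 100000

// Called with strings: the number-only assumption no longer holds, so any
// optimized code built on that assumption has to be discarded.
console.log(sum('fin', 'get')); // "finget"
```

Running the script with node --trace-opt --trace-deopt should log the optimization and deoptimization events, although the exact output differs between Node.js versions.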

A few terms

Stack

A stack is LIFO (last in, first out). Items can only be stored one by one at the top and taken out one by one from the top.

Heap

A heap stores key-value pairs in no particular order. Access to the heap does not depend on order and is unrestricted.

Queue

A queue is FIFO (first in, first out). Items are inserted at the tail of the queue and removed from the head.

The difference from a stack: a stack stores and retrieves at the same end (the top), whereas a queue has two ends, one for entry and one for exit.

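The difference between the two access orders can be sketched with plain arrays. This is an illustration only; the variable names are made up for the example.

```javascript
// Stack: push and pop both work on the same end (the top).
const stack = [];
stack.push(1, 2, 3);
const fromStack = [stack.pop(), stack.pop(), stack.pop()];
console.log(fromStack); // [ 3, 2, 1 ]  (last in, first out)

// Queue: push adds at the tail, shift removes from the head.
const queue = [];
queue.push(1, 2, 3);
const fromQueue = [queue.shift(), queue.shift(), queue.shift()];
console.log(fromQueue); // [ 1, 2, 3 ]  (first in, first out)
```
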
2 Compile phase

Lexical analysis (Scanner)

Lexical analysis breaks a string of characters into chunks that are meaningful to the programming language. These chunks are called lexical units (tokens). For example, the statement var name = 'finget'; produces the following tokens:

[
    {
        "type": "Keyword",
        "value": "var"
    },
    {
        "type": "Identifier",
        "value": "name"
    },
    {
        "type": "Punctuator",
        "value": "="
    },
    {
        "type": "String",
        "value": "'finget'"
    },
    {
        "type": "Punctuator",
        "value": ";"
    }
]
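
To make the idea concrete, here is a deliberately tiny tokenizer. It is a hypothetical sketch, not V8's scanner: the rule table covers only a handful of keywords and punctuators, just enough to reproduce the token list above.

```javascript
// Hypothetical minimal tokenizer; the rules are an assumption for illustration.
function tokenize(source) {
  const rules = [
    ['Keyword', /^(?:var|let|const|function|return)\b/],
    ['Identifier', /^[A-Za-z_$][\w$]*/],
    ['String', /^'[^']*'|^"[^"]*"/],
    ['Numeric', /^\d+/],
    ['Punctuator', /^[=;(){},+\-*\/]/],
  ];
  const tokens = [];
  let rest = source;
  while (rest.length > 0) {
    // Skip whitespace between tokens.
    const ws = /^\s+/.exec(rest);
    if (ws) {
      rest = rest.slice(ws[0].length);
      continue;
    }
    // The first rule that matches at the current position wins.
    const rule = rules.find(([, pattern]) => pattern.test(rest));
    if (!rule) throw new SyntaxError(`Unexpected character: ${rest[0]}`);
    const value = rule[1].exec(rest)[0];
    tokens.push({ type: rule[0], value });
    rest = rest.slice(value.length);
  }
  return tokens;
}

console.log(tokenize("var name = 'finget';"));
```

Keyword is listed before Identifier so that var is not classified as an ordinary name, while the \b word boundary keeps identifiers such as variable from being split.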

Syntax analysis (Parser)

Syntax analysis converts the token stream (an array) into a tree of nested elements that represents the syntactic structure of the program. This tree is called the Abstract Syntax Tree (AST).

{
  "type": "Program",
  "body": [
    {
      "type": "VariableDeclaration",
      "declarations": [
        {
          "type": "VariableDeclarator",
          "id": {
            "type": "Identifier",
            "name": "name"
          },
          "init": {
            "type": "Literal",
            "value": "finget",
            "raw": "'finget'"
          }
        }
      ],
      "kind": "var"
    }
  ],
  "sourceType": "script"
}

If the source code does not conform to the grammar, this process stops and a syntax error (SyntaxError) is thrown.
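
The parse-time nature of this error can be observed without running any of the code. The sketch below uses the Function constructor, which parses its argument when the function object is created; tryParse is a hypothetical helper name.

```javascript
// Returns "ok" if the source parses, otherwise the name of the error thrown.
function tryParse(source) {
  try {
    new Function(source); // parses the source; the body is never executed
    return 'ok';
  } catch (err) {
    return err.name;
  }
}

console.log(tryParse("var name = 'finget';")); // "ok"
console.log(tryParse("var = 'finget';"));      // "SyntaxError"
```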

One tool that generates the syntax tree in real time is esprima; it is worth trying.

Bytecode generation

You can inspect the bytecode with node --print-bytecode:

// test.js
function getMyname() {
 var myname = 'finget';
 console.log(myname);
}
getMyname();

node --print-bytecode test.js

...
[generated bytecode for function: getMyname (0x10ca700104e9 <SharedFunctionInfo getMyname>)]
Parameter count 1
Register count 3
Frame size 24
   18 E> 0x10ca70010e86 @    0 : a7                StackCheck 
   37 S> 0x10ca70010e87 @    1 : 12 00             LdaConstant [0]
         0x10ca70010e89 @    3 : 26 fb             Star r0
   48 S> 0x10ca70010e8b @    5 : 13 01 00          LdaGlobal [1], [0]
         0x10ca70010e8e @    8 : 26 f9             Star r2
   56 E> 0x10ca70010e90 @   10 : 28 f9 02 02       LdaNamedProperty r2, [2], [2]
         0x10ca70010e94 @   14 : 26 fa             Star r1
   56 E> 0x10ca70010e96 @   16 : 59 fa f9 fb 04    CallProperty1 r1, r2, r0, [4]
         0x10ca70010e9b @   21 : 0d                LdaUndefined 
   69 S> 0x10ca70010e9c @   22 : ab                Return 
Constant pool (size = 3)
Handler Table (size = 0)
...

This involves a very important concept: JIT (just-in-time) compilation, in which code is interpreted and compiled while the program is running.

How it works (see the first flowchart):

1. A monitor (also called a profiler) is added to the JavaScript engine. The monitor watches the code and records how many times it runs, how it runs, and so on. If the same lines of code run a few times, that code segment is marked “warm”; if it runs many times, it is marked “hot”;

2. (Baseline compiler) When a piece of code becomes “warm”, the JIT sends it to the baseline compiler and stores the compiled result. For example, if the monitor sees that a given line keeps executing with the same variables and the same variable types, the compiled version replaces the execution of that line and is stored;

3. (Optimizing compiler) When a piece of code becomes “hot”, the monitor sends it to the optimizing compiler, which generates a faster, more efficient version of the code and stores it. For example, when a loop adds up a property of an object and the property is assumed to be of type INT, the check for INT is made first;

4. (Deoptimization) In JavaScript, however, nothing is ever certain. The property may be an INT on the first 99 objects and be missing on the 100th. At that point the JIT concludes that it made a wrong assumption and throws away the optimized code; execution falls back to the interpreter or the baseline compiler. This process is called deoptimization.
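
The four steps can be mimicked with a toy monitor. This is a simulation under loud assumptions: the thresholds WARM_AFTER and HOT_AFTER, the tier names, and the monitor helper are all invented for the example, and real engines use far richer feedback than a single argument type.

```javascript
// Hypothetical thresholds; real engines use their own heuristics.
const WARM_AFTER = 3;
const HOT_AFTER = 10;

function monitor(fn) {
  let calls = 0;
  let seenType = null; // the argument type observed so far
  let tier = 'interpreter';

  function wrapped(arg) {
    const type = typeof arg;
    if (seenType !== null && type !== seenType) {
      // Wrong assumption: discard the "compiled" state (deoptimization).
      calls = 0;
      tier = 'interpreter';
    }
    seenType = type;
    calls += 1;
    if (calls >= HOT_AFTER) tier = 'optimized';
    else if (calls >= WARM_AFTER) tier = 'baseline';
    return fn(arg);
  }
  wrapped.tier = () => tier;
  return wrapped;
}

const double = monitor((x) => x + x);
for (let i = 0; i < 10; i++) double(i);
console.log(double.tier()); // "optimized"
double('a');                // same code, different type
console.log(double.tier()); // "interpreter"
```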

Scope

Scope is a set of rules that governs how the engine looks up variables. Before ES6, JavaScript had only global scope and function scope; ES6 introduced block-level scope. Note, however, that a pair of braces {} does not by itself scope every declaration: block-level scope applies to declarations made with the let and const keywords.

var name = 'FinGet';

function fn() {
  var age = 18;
  console.log(name);
}

The scope is determined at parse time:

In short, a scope is a box that defines where variables and functions are accessible and how long they live.
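
The point about braces can be checked directly. In this sketch (blockScopeDemo is a made-up name) both variables are declared inside the same block, but only the let declaration is confined to it.

```javascript
function blockScopeDemo() {
  {
    var a = 1; // function-scoped: visible in the whole function
    let b = 2; // block-scoped: visible only inside these braces
  }
  // typeof is used so that the missing b does not throw.
  return [typeof a, typeof b];
}

console.log(blockScopeDemo()); // [ 'number', 'undefined' ]
```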

Lexical scope

Lexical scope means that scope is determined by where a function is declared in the code. Lexical scope is therefore static scope, and it makes it possible to predict how identifiers will be looked up during execution.

function fn() {
    console.log(myName)
}
function fn1() {
    var myName = "FinGet"
    fn()
}
var myName = "global_finget"
fn1()

The code above prints global_finget. This is because the scope was already fixed in the compile phase: fn is defined in the global scope, so when it cannot find myName in its own scope it looks in the global scope, not in fn1.

3 Execution phase

Execution context

When a function is executed, an execution context is created. An execution context is an abstraction of the environment in which the current JavaScript code is parsed and executed.

There are three types of execution context in JavaScript:

  • Global execution context (only one)
  • Function execution context
  • eval execution context

An execution context goes through two phases: 1. the creation phase, 2. the execution phase.

Creation phase

Before any JavaScript code is executed, its execution context goes through the creation phase. Three things happen during the creation phase:

  • The value of this is determined, also known as This Binding.
  • The LexicalEnvironment (lexical environment) component is created.
  • The VariableEnvironment (variable environment) component is created.

ExecutionContext = {
  ThisBinding = <this value>,     // determine this
  LexicalEnvironment = { ... },   // lexical environment
  VariableEnvironment = { ... },  // variable environment
}

This Binding

In the global execution context, this refers to the global object; in a browser, that is the window object. In a function execution context, the value of this depends on how the function is called. If the function is called through an object reference, this is set to that object; otherwise this is set to the global object, or to undefined in strict mode.
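
A short sketch of these rules follows. The names whoAmI and obj are made up for the example, and the function opts into strict mode so that the plain call yields undefined rather than the global object.

```javascript
function whoAmI() {
  'use strict';
  return this;
}

const obj = { whoAmI };

// Called through an object reference: this is that object.
console.log(obj.whoAmI() === obj); // true

// Called as a plain function in strict mode: this is undefined.
console.log(whoAmI() === undefined); // true

// What matters is the call site, not where the function is stored.
const detached = obj.whoAmI;
console.log(detached() === undefined); // true
```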

Lexical environment (Lexical Environment)

A lexical environment is a structure that holds an identifier-to-variable mapping. (Here an identifier is the name of a variable or function, and a variable is a reference to an actual object [including function objects] or to a primitive value.) A lexical environment has two components: (1) the environment record and (2) a reference to the outer environment.

  • The environment record is where variable and function declarations are actually stored.
  • The reference to the outer environment means that the enclosing lexical environment can be accessed. (This is an essential part of how the scope chain is implemented.)

There are two types of lexical environment:

  • The global environment (in the global execution context) is a lexical environment with no outer environment; its outer environment reference is null. It holds the global object (the window object) with its associated methods and properties (for example, array methods) as well as any user-defined global variables, and the value of this refers to the global object.
  • The function environment stores the variables that the user defines inside the function in its environment record. Its outer environment reference can be the global environment or the environment of an outer function that contains the inner function.

Note: for a function environment, the environment record also contains an arguments object, which holds the mapping between indices and the arguments passed to the function, together with the number of arguments passed (length).
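
The two components can be modeled in a few lines. This is a conceptual sketch only: the Environment class and its lookup method are invented for illustration and are not how V8 represents environments internally.

```javascript
class Environment {
  constructor(record, outer) {
    this.record = record; // environment record: identifier -> value
    this.outer = outer;   // outer environment reference (null for global)
  }

  // Walk outward along the outer references: this is the scope chain.
  lookup(name) {
    for (let env = this; env !== null; env = env.outer) {
      if (Object.prototype.hasOwnProperty.call(env.record, name)) {
        return env.record[name];
      }
    }
    throw new ReferenceError(`${name} is not defined`);
  }
}

const globalEnv = new Environment({ name: 'FinGet' }, null);
const fnEnv = new Environment({ age: 18 }, globalEnv);

console.log(fnEnv.lookup('age'));  // 18, found in the function's own record
console.log(fnEnv.lookup('name')); // "FinGet", found via the outer reference
```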

Variable environment (Variable Environment)

It is also a lexical environment; its EnvironmentRecord holds the bindings created by VariableStatements within this execution context.

As mentioned above, the variable environment is also a lexical environment, so it has all the properties of a lexical environment defined above.

Sample code :

let a = 20;  
const b = 30;  
var c;

function multiply(e, f) {  
 var g = 20;  
 return e * f * g;  
}

c = multiply(20, 30);

Execution context :

GlobalExecutionContext = {

  ThisBinding: <Global Object>,

  LexicalEnvironment: {
    EnvironmentRecord: {
      Type: "Object",
      // Identifier bindings go here
      a: < uninitialized >,
      b: < uninitialized >,
      multiply: < func >
    },
    outer: <null>
  },

  VariableEnvironment: {
    EnvironmentRecord: {
      Type: "Object",
      // Identifier bindings go here
      c: undefined
    },
    outer: <null>
  }
}

FunctionExecutionContext = {

  ThisBinding: <Global Object>,

  LexicalEnvironment: {
    EnvironmentRecord: {
      Type: "Declarative",
      // Identifier bindings go here
      Arguments: {0: 20, 1: 30, length: 2}
    },
    outer: <GlobalLexicalEnvironment>  // refers to the global environment
  },

  VariableEnvironment: {
    EnvironmentRecord: {
      Type: "Declarative",
      // Identifier bindings go here
      g: undefined
    },
    outer: <GlobalLexicalEnvironment>
  }
}

Look closely at the above: a: < uninitialized >, c: undefined. This is why calling console.log(a) before the let a declaration gives Uncaught ReferenceError: Cannot access 'a' before initialization.
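
The difference between an uninitialized binding and an undefined one can be observed side by side. In this sketch, readBeforeDeclaration and the seen object are hypothetical names used only to collect the results.

```javascript
function readBeforeDeclaration() {
  const seen = {};
  seen.c = c; // var c is hoisted and already initialized to undefined
  try {
    seen.a = a; // let a exists but is still uninitialized, so this throws
  } catch (err) {
    seen.aError = err.name;
  }
  var c = 1;
  let a = 20;
  return seen;
}

console.log(readBeforeDeclaration()); // { c: undefined, aError: 'ReferenceError' }
```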

Why there are two lexical environments

The variable environment component (VariableEnvironment) registers var and function declarations. The lexical environment component (LexicalEnvironment) registers let, const, class, and similar declarations.

Before ES6 there was no block-level scope; since ES6, let and const can be used to declare block-scoped variables. The two environments exist so that block-level scope can be implemented without affecting var declarations and function declarations. The process is as follows:

  1. First, in a running execution context, the lexical environment consists of the LexicalEnvironment and the VariableEnvironment, which together register all variable declarations.
  2. When execution reaches a block, the current LexicalEnvironment is saved as oldEnv.
  3. A new LexicalEnvironment is created (its outer points to oldEnv), recorded as newEnv, and newEnv is set as the LexicalEnvironment of the running execution context.
  4. The let and const declarations in the block are registered in newEnv, but var declarations and function declarations are still registered in the original VariableEnvironment.
  5. When the block finishes executing, oldEnv is restored as the LexicalEnvironment of the running execution context.

function foo(){
    var a = 1
    let b = 2
    {
      let b = 3
      var c = 4
      let d = 5
      console.log(a) // 1
      console.log(b) // 3
    }
    console.log(b) // 2
    console.log(c) // 4
    console.log(d) // ReferenceError: d is not defined
}
foo()

As the diagram shows, when execution enters a block inside the function, the variables declared with let in that block are stored in a separate area of the lexical environment. Variables in that area do not affect variables outside the block. For example, a variable b is declared outside the block and another b is declared inside it; while the block executes, the two exist independently.

In fact, the lexical environment maintains a small stack structure. The bottom of the stack holds the outermost variables of the function. On entering a block, the variables declared in that block are pushed onto the top of the stack; when the block finishes executing, its scope information is popped off the top. This is the structure of the lexical environment. Note that the variables discussed here are those declared with let or const.

Next, when console.log(a) executes inside the block, the engine needs to find the value of a in the lexical environment and the variable environment. The lookup works like this: search the lexical environment from the top of the stack downward; if the variable is found in some block of the lexical environment, return it to the JavaScript engine directly; if it is not found, continue searching in the variable environment.
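
The lookup order can be sketched as a small model of foo() at the moment the block is executing. The names lexicalStack, variableEnvironment, and resolve are assumptions made for the illustration; the values mirror the foo() example above.

```javascript
// var declarations live in the variable environment.
const variableEnvironment = { a: 1, c: 4 };

// let/const declarations live on the lexical environment's small stack.
const lexicalStack = [
  { b: 2 },       // bottom: function-level let/const
  { b: 3, d: 5 }, // top: the block currently being executed
];

function resolve(name) {
  // 1. Search the lexical environment from the top of the stack downward.
  for (let i = lexicalStack.length - 1; i >= 0; i--) {
    if (Object.prototype.hasOwnProperty.call(lexicalStack[i], name)) {
      return lexicalStack[i][name];
    }
  }
  // 2. Not found there: continue in the variable environment.
  if (Object.prototype.hasOwnProperty.call(variableEnvironment, name)) {
    return variableEnvironment[name];
  }
  throw new ReferenceError(`${name} is not defined`);
}

console.log(resolve('a')); // 1, from the variable environment
console.log(resolve('b')); // 3, the block's own b shadows the outer one

lexicalStack.pop();        // the block has finished executing
console.log(resolve('b')); // 2, the function-level b is visible again
```

After the pop, resolve('d') throws a ReferenceError, which matches the final console.log(d) in foo().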

Execution stack (Execution Context Stack)

Each function call has its own execution context, and multiple execution contexts are managed as a stack (the call stack).

function a () {
  console.log('In fn a')
  function b () {
    console.log('In fn b')
    function c () {
      console.log('In fn c')
    }
    c()
  }
  b()
}
a()
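
The pushing and popping for the nested calls above can be traced with an explicit array standing in for the call stack. This is a simulation: callStack, trace, and run are hypothetical helpers, and the engine's real stack is not accessible like this.

```javascript
const callStack = ['global']; // the global execution context is at the bottom
const trace = [];

function run(name, body) {
  callStack.push(name);              // entering a function pushes its context
  trace.push(callStack.join(' > ')); // snapshot of the stack at this moment
  body();
  callStack.pop();                   // returning pops it again
}

run('a', () => run('b', () => run('c', () => {})));

console.log(trace);
// [ 'global > a', 'global > a > b', 'global > a > b > c' ]
console.log(callStack); // [ 'global' ]
```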

You can try this with the javascript visualizer tool, which shows contexts being pushed onto and popped off the stack more intuitively.

The diagram also makes the scope chain easy to see. The scope chain is determined in the creation phase of the execution context: only once the execution environment exists can it be determined which environments it forms a scope chain with.

4 V8 garbage collection

Memory allocation

Stack

The stack is temporary storage that mainly holds local variables and function calls. It is small and contiguous, simple to operate on, and is generally allocated and reclaimed automatically by the system. The garbage collection discussed in this article therefore concerns heap memory only.

Primitive values (Number, Boolean, String, Null, Undefined, Symbol, BigInt) are stored in stack memory. Reference-type data is stored in heap memory; a variable of a reference type holds a reference, kept on the stack, to the actual object in heap memory.

Why are primitive types stored on the stack and reference types in the heap?

The JavaScript engine uses the stack to maintain the state of execution contexts while the program runs. If the stack were large and all data were stored on it, context switching would become less efficient, which would in turn reduce the execution efficiency of the whole program.
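
The practical consequence of this split is the difference between copying a value and copying a reference, which the following sketch shows. The variable names are arbitrary.

```javascript
// Primitive: assignment copies the value itself.
let count = 1;
let countCopy = count;
countCopy = 2;
console.log(count); // 1, the original is untouched

// Reference type: assignment copies the reference, not the object.
const user = { name: 'FinGet' };
const sameUser = user;
sameUser.name = 'changed';
console.log(user.name); // "changed", both variables point to one heap object
```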

Heap

The heap stores objects and dynamic data. It is the largest area of memory and is where GC (garbage collection) works. However, not all heap memory is garbage collected; only the new generation and the old generation are managed by GC. The heap can be subdivided as follows:

  • New space (new generation): where newly created data lives; such data is usually short-lived. This space is split into two halves and managed by the Scavenger (Minor GC), described later. Its size can be controlled with V8 flags such as --max_semi_space_size or --min_semi_space_size.
  • Old space (old generation): data that has survived at least two rounds of Minor GC in the new space. This space is managed by the Major GC (Mark-Sweep & Mark-Compact), described later. Its size can be controlled with --initial_old_space_size or --max_old_space_size.
    • Old pointer space: surviving objects that contain pointers to other objects.
    • Old data space: surviving objects that contain only data.
  • Large object space: holds objects larger than the size limits of the other spaces. Large objects are not moved by the garbage collector.
  • Code space: where JIT-compiled code is stored. This is the only space with executable memory, apart from code that is allocated in large object space, which is executable too.
  • Map space: stores Cells and Maps. Each area holds elements of the same size, so the structure is simple.
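
In Node.js, the spaces of the running process can be listed with the built-in v8 module. This is a small sketch; the set of space names it prints depends on the V8 version bundled with Node.js, so it will not necessarily match the list above one to one.

```javascript
const v8 = require('v8');

// One entry per heap space, with sizes converted to kilobytes.
const spaces = v8.getHeapSpaceStatistics().map((space) => ({
  name: space.space_name,
  sizeKB: Math.round(space.space_size / 1024),
  usedKB: Math.round(space.space_used_size / 1024),
}));

console.table(spaces);
```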

The generational hypothesis

The generational hypothesis makes two observations:

  • First, most objects live in memory only briefly. Put simply, many objects become unreachable soon after they are allocated;
  • Second, objects that do not die young tend to live much longer.

V8 divides the heap into two regions, the new generation and the old generation. Short-lived objects are stored in the new generation and long-lived objects in the old generation.

The new generation usually has a capacity of only 1 to 8 MB, while the old generation supports a much larger capacity. V8 uses a different garbage collector for each of the two regions so that garbage collection can be carried out more efficiently.

  • The minor garbage collector is mainly responsible for collecting the new generation.
  • The major garbage collector is mainly responsible for collecting the old generation.

The new generation is handled with the Scavenge algorithm. The Scavenge algorithm divides the new generation space into two halves: one half is the object area and the other half is the free area.

New generation collection

All newly created objects are placed in the object area. When the object area is nearly full, a garbage collection has to be performed.

  1. First the objects that need to be collected are marked, then the live objects in the object area are copied to the free area and arranged in order;
  2. Once copying is complete, the object area and the free area swap roles: the original object area becomes the free area and the original free area becomes the object area.

Because the Scavenge algorithm copies the live objects from the object area to the free area on every collection, and copying costs time, each collection would take too long if the new space were large. For efficiency, the new space is therefore usually kept fairly small.

Precisely because the new space is small, live objects can easily fill it up. To solve this problem, the JavaScript engine uses an object promotion strategy: objects that are still alive after two garbage collections are moved to the old generation.
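
The copy, flip, and promote cycle can be sketched with a toy heap. Everything here is an assumption made for illustration: objects carry a hand-set live flag instead of being traced from roots, and the survived counter and heap shape are invented; V8's real Scavenger works on raw memory pages.

```javascript
// Toy model: each object is { id, live, survived }.
function scavenge(heap) {
  const freeArea = [];
  for (const obj of heap.objectArea) {
    if (!obj.live) continue; // dead objects are simply not copied
    obj.survived += 1;
    if (obj.survived >= 2) {
      heap.oldGeneration.push(obj); // survived two collections: promote
    } else {
      freeArea.push(obj);           // copy the live object to the free area
    }
  }
  heap.objectArea = freeArea; // role flip: the free area becomes the object area
}

const heap = {
  objectArea: [
    { id: 'config', live: true, survived: 0 },
    { id: 'temp', live: false, survived: 0 },
  ],
  oldGeneration: [],
};

scavenge(heap);
console.log(heap.objectArea.map((o) => o.id));    // [ 'config' ]
scavenge(heap);
console.log(heap.oldGeneration.map((o) => o.id)); // [ 'config' ]
```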

Old generation collection

Mark-Sweep

Mark-Sweep works in two phases, a marking phase and a sweeping phase. It looks similar to Scavenge; the difference is that Scavenge copies the live objects, whereas in the old generation most objects are live, so after marking live and dead objects Mark-Sweep simply clears the dead ones.

  • Marking phase: a first scan of the old generation marks the live objects.
  • Sweeping phase: a second scan of the old generation clears the unmarked objects, that is, the dead objects.
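
The two phases can be sketched on a small object graph. This is a simplified model under stated assumptions: each object lists its outgoing references in a refs array, the heap is a plain array, and markSweep is a hypothetical helper rather than V8's implementation.

```javascript
function markSweep(heap, roots) {
  // Marking phase: everything reachable from the roots is live.
  const marked = new Set();
  const worklist = [...roots];
  while (worklist.length > 0) {
    const obj = worklist.pop();
    if (marked.has(obj)) continue;
    marked.add(obj);
    worklist.push(...obj.refs);
  }
  // Sweeping phase: scan the heap again and keep only marked objects.
  return heap.filter((obj) => marked.has(obj));
}

const a = { id: 'a', refs: [] };
const b = { id: 'b', refs: [] };
const c = { id: 'c', refs: [] };
const d = { id: 'd', refs: [] };
a.refs.push(b);
c.refs.push(d);
d.refs.push(c); // c and d reference each other but nothing reaches them

const survivors = markSweep([a, b, c, d], [a]);
console.log(survivors.map((obj) => obj.id)); // [ 'a', 'b' ]
```

Because liveness is decided by reachability from the roots, the c and d cycle is collected even though each of the two objects is still referenced.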

Mark-Compact

After Mark-Sweep finishes, the memory of the old generation contains many fragments. If these fragments are not cleaned up, then when a large object needs to be allocated, none of the fragmented free space may be able to satisfy the allocation. This triggers a garbage collection early, and that collection is unnecessary.

Mark-Compact was proposed to solve this memory fragmentation problem. It evolved from Mark-Sweep: compared with Mark-Sweep, Mark-Compact adds a compaction phase for live objects, which moves all live objects to one end and then frees the memory beyond the boundary.
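
The compaction step can be sketched on an array in which null stands for a freed slot. The memory layout and the compact helper are assumptions for illustration; a real collector also has to update every pointer to a moved object, which this sketch leaves out.

```javascript
// Slide all live entries to the start and return the boundary index.
function compact(memory) {
  let next = 0; // next free slot at the compacted end
  for (let i = 0; i < memory.length; i++) {
    if (memory[i] === null) continue;
    memory[next] = memory[i];
    if (next !== i) memory[i] = null; // the old slot is now free
    next += 1;
  }
  return next; // everything from this index on is one contiguous free block
}

const memory = ['A', null, 'B', null, null, 'C'];
const boundary = compact(memory);
console.log(memory);   // [ 'A', 'B', 'C', null, null, null ]
console.log(boundary); // 3
```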

Stop-The-World

If garbage collection takes a long time, the JavaScript running on the main thread has to be paused until the collection finishes. This behavior is called a full pause (Stop-The-World).

Incremental marking

To reduce the jank caused by old generation garbage collection, V8 splits the marking process into smaller sub-steps and lets garbage collection marking alternate with the JavaScript application logic until the marking phase is complete. This algorithm is called incremental marking (Incremental Marking). As shown in the figure below:
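
Interleaving marking with application work can be sketched with a generator that pauses after a fixed number of objects. This is a toy model: incrementalMark and stepSize are invented for the example, and it omits the write barrier a real collector needs when the application changes references between steps.

```javascript
// Marks up to stepSize objects per step, then yields control back.
function* incrementalMark(roots, stepSize) {
  const marked = new Set();
  const worklist = [...roots];
  let budget = stepSize;
  while (worklist.length > 0) {
    const obj = worklist.pop();
    if (marked.has(obj)) continue;
    marked.add(obj);
    worklist.push(...obj.refs);
    budget -= 1;
    if (budget === 0) {
      budget = stepSize;
      yield marked.size; // pause: the application may run here
    }
  }
  return marked;
}

// A chain of five objects: n0 -> n1 -> n2 -> n3 -> n4.
const nodes = Array.from({ length: 5 }, (_, i) => ({ id: `n${i}`, refs: [] }));
for (let i = 0; i < 4; i++) nodes[i].refs.push(nodes[i + 1]);

const marker = incrementalMark([nodes[0]], 2);
let pauses = 0;
let step = marker.next();
while (!step.done) {
  pauses += 1; // application logic would run between marking steps
  step = marker.next();
}
console.log(pauses);          // 2
console.log(step.value.size); // 5
```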

Lazy sweeping

Incremental marking only marks live and dead objects; lazy sweeping is what actually cleans up and frees the memory. Once incremental marking is complete, if the currently available memory is enough to keep executing code, there is no need to sweep immediately. Sweeping can be delayed a little so that the JavaScript logic runs first, and there is no need to sweep the memory of all dead objects at once: the garbage collector sweeps pages one by one as needed until all pages have been swept.

Concurrent collection

Concurrent GC allows garbage collection to proceed without pausing the main thread, so the two can run at the same time. The main thread only needs to stop briefly at a few points so that the garbage collector can do some special work. However, this approach faces the same problem as incremental collection: because JavaScript code is executing during garbage collection, the reference relationships between objects in the heap can change at any time, so write barriers are also required.

Parallel collection

Parallel GC lets the main thread and helper threads perform the same GC work at the same time. The helper threads share the main thread's GC workload, so the garbage collection time becomes the total time divided by the number of participating threads (plus some synchronization overhead).

5 Standing on the shoulders of giants

My respect goes to those who came before me. I consulted a lot of material while writing this; please forgive any omissions, and if you find a mistake in the text, please point it out. Thank you!

  • Browser working principle and practice
  • Thinking about Reading Li Lao's course JS Execution mechanism -| Super · Aoyi |
  • Browser principles learning notes - Browser js Execution mechanism ( On )
  • js The execution of the engine
  • Preliminary understanding JavaScript Underlying principle
  • JavaScript Language execution at the engine level
  • Front end Foundation | js How much do you know about the execution process ?
  • 【 compile 】 How the code works JavaScript Execution process
  • V8 How to execute JavaScript Code ?
  • The front of the view ( Two )V8 How the engine works
  • Deepen understanding JavaScript Execution process (JS One of series )
  • Explain in simple terms V8 How the engine performs JavaScript Code
  • How browsers work :Chrome V8 Make you understand better JavaScript
  • How to understand js The execution context and stack of
  • js Perform Visualization
  • 【 translate 】 understand Javascript Execution context and execution stack
  • JS A detailed explanation of the scope chain
  • Browser garbage collection details ( With Google browser's V8 For example )
  • Deep understanding of Google is the strongest V8 Garbage collection mechanism
  • 「 translate 」Orinoco: V8 The garbage collector
  • V8 Memory management and garbage collection mechanism
  • JS Memory Leak And V8 Garbage Collection

This article comes from the WeChat official account Front end canteen (webcanteen).

Original publication time : 2021-08-08

copyright notice
author[Tong ouba],Please bring the original link to reprint, thank you.
https://en.qdmana.com/2021/08/20210823095752247l.html
