
Advanced Reverse Engineering: Using AST Techniques to Restore Obfuscated JavaScript Code

2022-04-29 09:36:18 Brother K reptile


What is an AST?

An AST (Abstract Syntax Tree), or syntax tree for short, is a tree representation of the abstract syntactic structure of source code. Each node in the tree represents a construct in the source code. Syntax trees are not unique to any one programming language: JavaScript, Python, Java, Golang, and almost every other programming language has one.

As children, when we got a toy we often liked to take it apart into small parts and then put the parts back together our own way, and a new toy was born. JavaScript is like a finely built machine: through AST analysis we can take it apart just as we did with toys in childhood, gain a deeper understanding of its parts, and then reassemble it however we wish.

ASTs have a wide range of uses. Syntax highlighting in IDEs, code checking, formatting, compression, and transpilation all convert the code into an AST before doing their work. ES5 and ES6 differ in syntax, so in practice syntax conversion is needed for backward compatibility, and that also relies on the AST. The AST was not invented for reverse engineering, but once you have learned it, deobfuscation becomes much easier.

There is an online AST parsing website: https://astexplorer.net/ . At the top you can choose the language, the compiler, whether to turn on the transform, and so on. As shown in the figure below, area ① is the source code, area ② is the corresponding AST, area ③ is the transform code, where you can perform various operations on the syntax tree, and area ④ is the new code generated by the transform. In the figure, the original Unicode escapes become normal characters after the transform.

There is no single format for syntax trees: a different language or a different parser gives a different result. For JavaScript the available tools include Acorn, Espree, Esprima, Recast, Uglify-JS, and others. The most widely used is Babel, and the rest of this article is based on Babel.

[Figure 01]

The AST's place in compilation

In compiler theory, a compiler usually converts code in three steps: lexical analysis (Lexical Analysis), syntax analysis (Syntax Analysis), and code generation (Code Generation). The following figure illustrates this process:

[Figure 02]

Lexical analysis

Lexical analysis is the first stage of compilation. Its task is to read the source program character by character from left to right, recognize words according to the lexical rules, and produce a stream of tokens. For example, isPanda('') is split into four parts: isPanda, (, '', and ). Each part has a different meaning. You can think of the result of lexical analysis as a list or array of tokens of different types.

[Figure 03]
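To make the token stream concrete, here is a minimal sketch of a lexer in plain JavaScript. It is a toy written for this illustration, not Babel's tokenizer: the rule table and the token category names (name, string, number, punctuator) are assumptions chosen for readability.

```javascript
// Toy lexer: splits source text into a flat list of typed tokens.
// The token categories are illustrative, not Babel's own token types.
function tokenize(source) {
  const rules = [
    ["whitespace", /^\s+/],
    ["name", /^[A-Za-z_$][\w$]*/],
    ["string", /^'[^']*'|^"[^"]*"/],
    ["number", /^\d+(\.\d+)?/],
    ["punctuator", /^[()[\]{};,.]/],
  ];
  const tokens = [];
  let rest = source;
  while (rest.length > 0) {
    // Pick the first rule whose pattern matches at the current position.
    const rule = rules.find(([, pattern]) => pattern.test(rest));
    if (!rule) throw new SyntaxError(`Unexpected character: ${rest[0]}`);
    const [type, pattern] = rule;
    const value = pattern.exec(rest)[0];
    if (type !== "whitespace") tokens.push({ type, value });
    rest = rest.slice(value.length);
  }
  return tokens;
}

const tokens = tokenize("isPanda('')");
// Four tokens: isPanda  (  ''  )
console.log(tokens);
```

Each token keeps both its text and its category, which is all the next stage needs in order to build a tree.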

Syntax analysis

Syntax analysis (parsing) is a logical stage of compilation. Building on lexical analysis, its task is to combine the token sequence into grammatical phrases such as "program", "statement", and "expression". In the previous example, isPanda('') is parsed as an expression statement (ExpressionStatement), the call isPanda() as a call expression (CallExpression), and the argument '' as a literal (Literal, which Babel names StringLiteral). The dependencies and nesting between these grammatical units form a tree structure, namely the AST.

[Figure 04]
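The step from tokens to a tree can be sketched with a very small hand-written parser. This is an assumption-laden toy that only understands calls of the form name(arg, ...); the node type names follow Babel's AST, but the parser itself has nothing to do with Babel's implementation.

```javascript
// Toy parser for statements of the form  name(arg, ...)  where each argument
// is a string literal or another name. Node type names follow Babel's AST.
function parseStatement(tokens) {
  let pos = 0;
  const next = () => tokens[pos++];
  const peek = () => tokens[pos];

  function parsePrimary() {
    const token = next();
    if (/^['"]/.test(token)) {
      // Strip the surrounding quotes to get the string's value.
      return { type: "StringLiteral", value: token.slice(1, -1) };
    }
    return { type: "Identifier", name: token };
  }

  function parseCall() {
    const callee = parsePrimary();
    if (peek() !== "(") return callee;
    next(); // consume "("
    const args = [];
    while (peek() !== ")") {
      if (pos >= tokens.length) throw new SyntaxError("Missing closing )");
      args.push(parsePrimary());
      if (peek() === ",") next();
    }
    next(); // consume ")"
    return { type: "CallExpression", callee, arguments: args };
  }

  return { type: "ExpressionStatement", expression: parseCall() };
}

const ast = parseStatement(["isPanda", "(", "''", ")"]);
console.log(JSON.stringify(ast, null, 2));
```

The printed object nests ExpressionStatement, CallExpression, Identifier, and StringLiteral in the same way the text describes.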

Code generation

Code generation is the last step: it converts the AST back into executable code. Before the conversion we can manipulate the syntax tree directly, adding, deleting, modifying, and querying nodes. For example, we can find where a variable is declared, change a variable's value, or delete some nodes. If we change the statement isPanda('') into a boolean Literal, true, the syntax tree changes as follows:

[Figure 05]
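As a rough illustration of editing a tree and then generating code from it, the sketch below uses a hand-written tree and a toy printer. The print function is hypothetical and supports only the handful of node types used here; real generators such as @babel/generator handle the full language.

```javascript
// Hand-written tree for  isPanda('')  using Babel-style node type names.
const tree = {
  type: "ExpressionStatement",
  expression: {
    type: "CallExpression",
    callee: { type: "Identifier", name: "isPanda" },
    arguments: [{ type: "StringLiteral", value: "" }],
  },
};

// Toy code generator: turns the few node types used here back into source text.
function print(node) {
  switch (node.type) {
    case "ExpressionStatement":
      return print(node.expression) + ";";
    case "CallExpression":
      return `${print(node.callee)}(${node.arguments.map(print).join(", ")})`;
    case "Identifier":
      return node.name;
    case "StringLiteral":
      return JSON.stringify(node.value);
    case "BooleanLiteral":
      return String(node.value);
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

const before = print(tree);
// Edit the tree: swap the whole call for a boolean literal.
tree.expression = { type: "BooleanLiteral", value: true };
const after = print(tree);

console.log(before); // isPanda("");
console.log(after);  // true;
```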

A brief introduction to Babel

Babel is a JavaScript compiler; you can also think of it as a parsing library. Babel Chinese site: https://www.babeljs.cn/ , Babel official English site: https://babeljs.io/ . Babel has many built-in methods for analysing JavaScript code. We can use Babel to convert JavaScript code into an AST, then add, delete, modify, and query nodes, and finally convert the AST back into JavaScript code.

Babel contains a great many packages, APIs, and optional parameters for each method, and this article does not list them all. In practice, consult the official documentation often, or refer to the learning materials given at the end of the article. Babel is installed like any other Node package; install whichever packages you need, for example: npm install @babel/core @babel/parser @babel/traverse @babel/generator

Deobfuscation work mainly uses the following Babel packages, and this article covers only these:

  1. @babel/core: the Babel compiler itself, which provides Babel's compilation API;
  2. @babel/parser: parses JavaScript code into an AST;
  3. @babel/traverse: traverses and modifies the nodes of the AST;
  4. @babel/generator: converts an AST back into JavaScript code;
  5. @babel/types: checks and validates node types and builds new AST nodes.

[Figure 06]

@babel/core

The Babel compiler itself. Its work is split across three modules: @babel/parser, @babel/traverse, and @babel/generator. For example, within each pair below, the two imports have the same effect:

// Use one line from each pair, not both (a const cannot be declared twice)
const parse = require("@babel/parser").parse;
// const parse = require("@babel/core").parse;

const traverse = require("@babel/traverse").default
// const traverse = require("@babel/core").traverse

@babel/parser

@babel/parser parses JavaScript code into an AST. It mainly provides two methods:

  • parser.parse(code, [{options}]): parses a piece of JavaScript code;
  • parser.parseExpression(code, [{options}]): parses a single JavaScript expression, with performance in mind.

Some of the optional parameters (options), each followed by its description:

  • allowImportExportEverywhere: by default, import and export declarations can only appear at the top level of the program; set to true to allow them anywhere;
  • allowReturnOutsideFunction: by default, a return statement at the top level raises an error; set to true to allow it;
  • sourceType: the default is script, which raises an error when the code contains keywords such as import or export; in that case set it to module;
  • errorRecovery: by default, Babel throws an error when it finds invalid code; set to true to keep parsing while recording the errors, which are saved in the errors property of the resulting AST. A serious error will still stop the parse.

An example makes this clear:

const parser = require("@babel/parser");

const code = "const a = 1;";
const ast = parser.parse(code, {sourceType: "module"})
console.log(ast)

{sourceType: "module"} demonstrates how to pass optional parameters. The output is the AST, the same as the syntax tree parsed by the online site https://astexplorer.net/ :

[Figure 07]

@babel/generator

@babel/generator converts an AST back into JavaScript code. It provides a generate method: generate(ast, [{options}], code).

Some of the optional parameters (options), each followed by its description:

  • auxiliaryCommentBefore: comment block text to add at the start of the output;
  • auxiliaryCommentAfter: comment block text to add at the end of the output;
  • comments: whether the output contains comments;
  • compact: whether to avoid adding whitespace for formatting;
  • concise: whether to reduce whitespace to make the output more compact;
  • minified: whether to minify the output code;
  • retainLines: try to use the same line numbers in the output code as in the source code.

Continuing the previous example, the original code is const a = 1;. Now let's rename the variable a to b, change the value 1 to 2, and then turn the AST back into new JS code:

const parser = require("@babel/parser");
const generate = require("@babel/generator").default

const code = "const a = 1;";
const ast = parser.parse(code, {sourceType: "module"})
ast.program.body[0].declarations[0].id.name = "b"
ast.program.body[0].declarations[0].init.value = 2
const result = generate(ast, {minified: true})

console.log(result.code)

The final output is const b=2;. The variable name and value were changed successfully. Because the output is minified, the spaces on either side of the equals sign are gone.

In the code, {minified: true} demonstrates how to pass optional parameters, here to minify the output. The result returned by generate is an object, and its code property is the final JS code.

In the code, ast.program.body[0].declarations[0].id.name is the position of a in the AST, and ast.program.body[0].declarations[0].init.value is the position of 1 in the AST, as shown in the figure below:

[Figure 08]

@babel/traverse

When there is more code, we can't locate and modify nodes one at a time as before. For nodes of the same type we can traverse all nodes and modify them directly, and this is where @babel/traverse comes in. It is usually used together with a visitor. A visitor is an object (the variable name is up to you) in which you define methods that select nodes by type. Here is an example:

const parser = require("@babel/parser");
const generate = require("@babel/generator").default
const traverse = require("@babel/traverse").default

const code = `
const a = 1500;
const b = 60;
const c = "hi";
const d = 787;
const e = "1244";
`
const ast = parser.parse(code)

const visitor = {
    NumericLiteral(path){
        path.node.value = (path.node.value + 100) * 2
    },
    StringLiteral(path){
        path.node.value = "I Love JavaScript!"
    }
}

traverse(ast, visitor)
const result = generate(ast)
console.log(result.code)

The original code here defines five variables, a, b, c, d, and e, whose values are numbers and strings. In the AST we can see that the corresponding types are NumericLiteral and StringLiteral:

[Figure 09]

Then we declare a visitor object and define handler methods for the corresponding types. traverse takes two parameters: the first is the AST object and the second is the visitor. As traverse walks all nodes, whenever it meets a node of type NumericLiteral or StringLiteral it calls the corresponding handler in the visitor. The handler receives a path object for the current node, of type NodePath. This object has many properties; here are some of the most commonly used, each followed by its description:

  • toString(): (a method) the source code of the current path;
  • node: the node of the current path;
  • parent: the parent node of the current path;
  • parentPath: the parent path of the current path;
  • type: the type of the current path.

PS: besides its many properties, path also has many methods, for example to replace nodes, delete nodes, insert nodes, find the parent node, get sibling nodes, add comments, and check the node type. Look them up in the documentation or the source code when you need them. The @babel/types section below gives some examples, and later hands-on articles will have more; space is limited, so this article does not go into detail.

So in the code above, path.node.value gets the value of the literal, which we can then modify. After the code runs, every number has 100 added and is then multiplied by 2, and every string is replaced with I Love JavaScript!. The result is as follows:

const a = 3200;
const b = 320;
const c = "I Love JavaScript!";
const d = 1774;
const e = "I Love JavaScript!";
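To show what a traversal with a visitor does under the hood, here is a dependency-free sketch. The walk function is a hypothetical stand-in for traverse: it passes the bare node to the handler, whereas Babel passes a NodePath wrapper, and the input is a hand-built list of literal nodes rather than a parsed program.

```javascript
// Minimal depth-first traversal: calls visitor[node.type](node) for each node.
// Babel passes a NodePath wrapper instead of the bare node; this sketch skips that.
function walk(node, visitor) {
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, visitor));
    return;
  }
  if (node === null || typeof node !== "object") return;
  if (typeof node.type === "string" && visitor[node.type]) {
    visitor[node.type](node);
  }
  Object.values(node).forEach((child) => walk(child, visitor));
}

// Hand-built stand-ins for the initialisers of a, c, and d in the example.
const literals = [
  { type: "NumericLiteral", value: 1500 },
  { type: "StringLiteral", value: "hi" },
  { type: "NumericLiteral", value: 787 },
];

walk(literals, {
  NumericLiteral(node) {
    node.value = (node.value + 100) * 2;
  },
  StringLiteral(node) {
    node.value = "I Love JavaScript!";
  },
});

console.log(literals.map((n) => n.value)); // 3200, "I Love JavaScript!", 1774
```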

If several node types are all handled the same way, you can join the type names with | into a single string and apply the same method to all of them:

const visitor = {
    "NumericLiteral|StringLiteral"(path) {
        path.node.value = "I Love JavaScript!"
    }
}
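One way to picture the | shorthand is as a key that gets expanded into one entry per node type before the traversal starts. The helper below is hypothetical and only sketches that idea; Babel performs its own visitor normalisation internally.

```javascript
// Expand keys such as "A|B" so that each node type gets its own entry.
// Hypothetical helper written for this illustration.
function expandVisitor(visitor) {
  const expanded = {};
  for (const [key, handler] of Object.entries(visitor)) {
    for (const type of key.split("|")) {
      expanded[type] = handler;
    }
  }
  return expanded;
}

const handler = (path) => {
  path.node.value = "I Love JavaScript!";
};
const expanded = expandVisitor({ "NumericLiteral|StringLiteral": handler });

console.log(Object.keys(expanded)); // NumericLiteral and StringLiteral
```

Both keys end up pointing at the same function, which is why a single handler runs for every listed type.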

A visitor object can be written in several ways. The following forms all have the same effect:

// Four alternative forms; use only one of them at a time

// Form 1: method shorthand
const visitor = {
    NumericLiteral(path){
        path.node.value = (path.node.value + 100) * 2
    },
    StringLiteral(path){
        path.node.value = "I Love JavaScript!"
    }
}

// Form 2: function expressions
const visitor = {
    NumericLiteral: function (path){
        path.node.value = (path.node.value + 100) * 2
    },
    StringLiteral: function (path){
        path.node.value = "I Love JavaScript!"
    }
}

// Form 3: one object per type with an explicit enter method
const visitor = {
    NumericLiteral: {
        enter(path) {
            path.node.value = (path.node.value + 100) * 2
        }
    },
    StringLiteral: {
        enter(path) {
            path.node.value = "I Love JavaScript!"
        }
    }
}

// Form 4: a single enter method that checks the node type itself
const visitor = {
    enter(path) {
        if (path.node.type === "NumericLiteral") {
            path.node.value = (path.node.value + 100) * 2
        }
        if (path.node.type === "StringLiteral") {
            path.node.value = "I Love JavaScript!"
        }
    }
}

All of the forms above actually use the enter method. During traversal each node is visited twice, once on entering (enter) and once on exiting (exit). By default traverse handles a node when entering it; if you want to handle it on exit, you must declare an exit method in the visitor explicitly.
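The order of enter and exit calls is easiest to see by logging them. The sketch below is a toy traversal over a hand-built tree for a * 5 + 1; the walk function is an assumption for illustration and is not Babel's traverse.

```javascript
// Minimal traversal with enter and exit hooks, to show the order of the calls.
function walk(node, visitor) {
  if (node === null || typeof node !== "object") return;
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, visitor));
    return;
  }
  const isNode = typeof node.type === "string";
  if (isNode && visitor.enter) visitor.enter(node);
  Object.values(node).forEach((child) => walk(child, visitor));
  if (isNode && visitor.exit) visitor.exit(node);
}

// Hand-built tree for the expression  a * 5 + 1
const expr = {
  type: "BinaryExpression",
  operator: "+",
  left: {
    type: "BinaryExpression",
    operator: "*",
    left: { type: "Identifier", name: "a" },
    right: { type: "NumericLiteral", value: 5 },
  },
  right: { type: "NumericLiteral", value: 1 },
};

const order = [];
walk(expr, {
  enter(node) { order.push(`enter ${node.type}`); },
  exit(node) { order.push(`exit ${node.type}`); },
});
console.log(order.join("\n"));
```

A node's exit call comes only after all of its children have been entered and exited, so an exit handler sees children that have already been processed.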

@babel/types

@babel/types is mainly used to build new AST nodes. The previous example code is const a = 1;. If you want to add content, for example turn it into const a = 1; const b = a * 5 + 1;, you can do that with @babel/types.

First look at the AST. The original statement has only one VariableDeclaration node, and now we add another:

[Figure 10]

So the idea is to traverse the nodes and, on reaching the VariableDeclaration node, add another VariableDeclaration node after it. To create a VariableDeclaration node, use the types.variableDeclaration() method. The method names in types match the node type names seen in the AST, except that the first letter is lowercase, so you don't need to know every method; you can usually infer the name. Knowing the method name is not enough, though: you also need to know which parameters to pass. You can check the documentation, but K recommends looking directly at the source code, which is very clear. In PyCharm, for example, hold down Ctrl and click the method name to jump into the source:

[Figure 11]

function variableDeclaration(kind: "var" | "let" | "const", declarations: Array<BabelNodeVariableDeclarator>)

You can see that it needs two parameters, kind and declarations, where declarations is a list of nodes of type VariableDeclarator. So we can write the following part of the visitor, where path.insertAfter() inserts a new node after the current node:

const visitor = {
    VariableDeclaration(path) {
        // declarator is not defined yet; it is built in the next step
        let declaration = types.variableDeclaration("const", [declarator])
        path.insertAfter(declaration)
    }
}

Next we need to define declarator, a node of type VariableDeclarator. Its source looks like this:

function variableDeclarator(id: BabelNodeLVal, init?: BabelNodeExpression)

Looking at the AST, id is an Identifier object and init is a BinaryExpression object, as shown in the figure below:

[Figure 12]

Let's deal with id first. It can be created with the types.identifier() method, whose source is function identifier(name: string); here name is b. The visitor code can now be written like this:

const visitor = {
    VariableDeclaration(path) {
        // init is not defined yet; it is built in the next step
        let declarator = types.variableDeclarator(types.identifier("b"), init)
        let declaration = types.variableDeclaration("const", [declarator])
        path.insertAfter(declaration)
    }
}

Next, let's see how to define init. As before, start by looking at the AST structure:

[Figure 13]

init is a BinaryExpression object whose left is a BinaryExpression and whose right is a NumericLiteral. It can be created with the types.binaryExpression() method, whose source is as follows:

function binaryExpression(
    operator: "+" | "-" | "/" | "%" | "*" | "**" | "&" | "|" | ">>" | ">>>" | "<<" | "^" | "==" | "===" | "!=" | "!==" | "in" | "instanceof" | ">" | "<" | ">=" | "<=",
    left: BabelNodeExpression | BabelNodePrivateName, 
    right: BabelNodeExpression
)

The visitor code can now be written like this:

const visitor = {
    VariableDeclaration(path) {
        // left and right are not defined yet; they are built in the next step
        let init = types.binaryExpression("+", left, right)
        let declarator = types.variableDeclarator(types.identifier("b"), init)
        let declaration = types.variableDeclaration("const", [declarator])
        path.insertAfter(declaration)
    }
}

Then continue constructing left and right in the same way: look at the AST, look up the parameters the corresponding method expects, and work layer by layer until all nodes are built. The final visitor code looks like this:

const visitor = {
    VariableDeclaration(path) {
        let left = types.binaryExpression("*", types.identifier("a"), types.numericLiteral(5))
        let right = types.numericLiteral(1)
        let init = types.binaryExpression("+", left, right)
        let declarator = types.variableDeclarator(types.identifier("b"), init)
        let declaration = types.variableDeclaration("const", [declarator])
        path.insertAfter(declaration)
        path.stop()
    }
}

Note: a path.stop() call is added after path.insertAfter(). It stops the traversal immediately after the insertion. The newly added node is also a VariableDeclaration, so without the stop the visitor would keep inserting in an infinite loop.

After inserting the new node and converting back to JavaScript code, you can see one more line of code, as shown in the figure below:

[Figure 14]
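To see the node shapes that the builder calls above produce, the sketch below uses plain-object stand-ins named after the @babel/types builders. The object t is an assumption made for this illustration: it only mirrors the shapes, while the real builders also validate their arguments.

```javascript
// Plain-object stand-ins for the @babel/types builders used above.
// They only mirror the node shapes; the real builders also validate arguments.
const t = {
  identifier: (name) => ({ type: "Identifier", name }),
  numericLiteral: (value) => ({ type: "NumericLiteral", value }),
  binaryExpression: (operator, left, right) =>
    ({ type: "BinaryExpression", operator, left, right }),
  variableDeclarator: (id, init) => ({ type: "VariableDeclarator", id, init }),
  variableDeclaration: (kind, declarations) =>
    ({ type: "VariableDeclaration", kind, declarations }),
};

// Build  const b = a * 5 + 1;  from the inside out.
const left = t.binaryExpression("*", t.identifier("a"), t.numericLiteral(5));
const init = t.binaryExpression("+", left, t.numericLiteral(1));
const declaration = t.variableDeclaration("const", [
  t.variableDeclarator(t.identifier("b"), init),
]);

console.log(JSON.stringify(declaration, null, 2));
```

Reading the printed JSON next to the AST in astexplorer makes it easier to work out which builder each level of the tree needs.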

Common obfuscations and how to restore them

Once you understand the AST and Babel, you can restore obfuscated JavaScript code. Here are some examples that give a closer look at various Babel operations.

String restoration

The figure at the beginning of the article gave an example in which normal characters are replaced with Unicode escapes:

console['\u006c\u006f\u0067']('\u0048\u0065\u006c\u006c\u006f\u0020\u0077\u006f\u0072\u006c\u0064\u0021')

Look at the AST structure:

[Figure 15]

We find that the Unicode escapes are held in raw, while rawValue and value are both normal, so we can replace raw with rawValue or value. Mind the quotation marks: without them, what was originally console["log"] would become console[log] after restoring, which is of course wrong. Besides replacing the value, you can also simply delete the extra node, or delete just raw. Any of the following approaches restores the code:

const parser = require("@babel/parser");
const generate = require("@babel/generator").default
const traverse = require("@babel/traverse").default

// String.raw keeps the \uXXXX escapes as written; in a plain template literal
// JavaScript would decode them before Babel ever sees the code
const code = String.raw`console['\u006c\u006f\u0067']('\u0048\u0065\u006c\u006c\u006f\u0020\u0077\u006f\u0072\u006c\u0064\u0021')`
const ast = parser.parse(code)

const visitor = {
    StringLiteral(path) {
        // Any of the following approaches works
        // (in Babel's AST, rawValue lives under extra, and the quotes must be added back)
        // path.node.extra.raw = '"' + path.node.extra.rawValue + '"'
        // path.node.extra.raw = '"' + path.node.value + '"'
        // delete path.node.extra
        delete path.node.extra.raw
    }
}

traverse(ast, visitor)
const result = generate(ast)
console.log(result.code)

Restored result:

console["log"]("Hello world!");
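The relationship between raw and value can be sketched without Babel. The helper below is hypothetical: it decodes only \uXXXX escapes and strips the surrounding quotes, which is enough to show why value is readable while raw still holds the escapes.

```javascript
// String.raw keeps the backslashes, so this is what the obfuscated source
// actually contains for the property name.
const raw = String.raw`'\u006c\u006f\u0067'`;

// Hypothetical helper: decode \uXXXX escapes and strip the surrounding quotes,
// which is roughly the relationship between extra.raw and value.
function decodeStringLiteral(rawText) {
  return rawText
    .slice(1, -1)
    .replace(/\\u([0-9a-fA-F]{4})/g, (_, hex) =>
      String.fromCharCode(parseInt(hex, 16)));
}

const value = decodeStringLiteral(raw);
console.log(raw);   // '\u006c\u006f\u0067'
console.log(value); // log
```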

Expression restoration

K previously wrote about restoring JSFuck obfuscation, which explained that ![] means false, and !![] or !+[] means true. Obfuscated code often uses such operations to complicate simple expressions, and you usually have to execute the statement to get the real result. Sample code:

const a = !![]+!![]+!![];
const b = Math.floor(12.34 * 2.12)
const c = 10 >> 3 << 1
const d = String(21.3 + 14 * 1.32)
const e = parseInt("1.893" + "45.9088")
const f = parseFloat("23.2334" + "21.89112")
const g = 20 < 18 ? '未成年' : '成年'  // "minor" : "adult"

To execute such statements we need the path.evaluate() method. It evaluates the path and returns an object whose confident property says whether the result is reliable and whose value property is the computed result. Use types.valueToNode() to create a node from the value, and path.replaceInline() to replace the original node with the new one. There are several replacement methods:

  • replaceWith: replaces one node with another node;
  • replaceWithMultiple: replaces one node with multiple nodes;
  • replaceWithSourceString: parses the given source string into a node and then replaces with it; performance is poor, so it is not recommended;
  • replaceInline: replaces one node with one or more nodes, which combines the functions of the first two methods.

The corresponding AST processing code is as follows:

const parser = require("@babel/parser");
const generate = require("@babel/generator").default
const traverse = require("@babel/traverse").default
const types = require("@babel/types")

const code = `
const a = !![]+!![]+!![];
const b = Math.floor(12.34 * 2.12)
const c = 10 >> 3 << 1
const d = String(21.3 + 14 * 1.32)
const e = parseInt("1.893" + "45.9088")
const f = parseFloat("23.2334" + "21.89112")
const g = 20 < 18 ? '未成年' : '成年'
`
const ast = parser.parse(code)

const visitor = {
    "BinaryExpression|CallExpression|ConditionalExpression"(path) {
        const {confident, value} = path.evaluate()
        // Replace the node only when the evaluation result is reliable
        if (confident){
            path.replaceInline(types.valueToNode(value))
        }
    }
}

traverse(ast, visitor)
const result = generate(ast)
console.log(result.code)

Final result (the non-ASCII string in g is printed as Unicode escapes):

const a = 3;
const b = 26;
const c = 2;
const d = "39.78";
const e = parseInt("1.89345.9088");
const f = parseFloat("23.233421.89112");
const g = "\u6210\u5E74";
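The idea behind returning both a value and a confidence flag can be sketched with a toy constant folder. This is not Babel's implementation of path.evaluate(): it supports only a few operators, works on hand-built nodes, and treats every call as unknown.

```javascript
// Toy constant folder in the spirit of path.evaluate(): it returns
// { confident, value } and gives up on anything it cannot compute statically.
function evaluate(node) {
  switch (node.type) {
    case "NumericLiteral":
    case "StringLiteral":
    case "BooleanLiteral":
      return { confident: true, value: node.value };
    case "ArrayExpression":
      // Only the empty array is handled, which is enough for !![]
      return node.elements.length === 0
        ? { confident: true, value: [] }
        : { confident: false };
    case "UnaryExpression": {
      const arg = evaluate(node.argument);
      if (!arg.confident) return { confident: false };
      if (node.operator === "!") return { confident: true, value: !arg.value };
      if (node.operator === "+") return { confident: true, value: +arg.value };
      return { confident: false };
    }
    case "BinaryExpression": {
      const left = evaluate(node.left);
      const right = evaluate(node.right);
      if (!left.confident || !right.confident) return { confident: false };
      switch (node.operator) {
        case "+": return { confident: true, value: left.value + right.value };
        case "*": return { confident: true, value: left.value * right.value };
        case ">>": return { confident: true, value: left.value >> right.value };
        case "<<": return { confident: true, value: left.value << right.value };
        default: return { confident: false };
      }
    }
    default:
      // Identifiers, calls, and so on depend on runtime state in this sketch.
      return { confident: false };
  }
}

const not = (argument) => ({ type: "UnaryExpression", operator: "!", argument });
const add = (left, right) => ({ type: "BinaryExpression", operator: "+", left, right });
const truthy = () => not(not({ type: "ArrayExpression", elements: [] }));

// !![] + !![] + !![]
const folded = evaluate(add(add(truthy(), truthy()), truthy()));
// A call such as parseInt(...) is reported as not confident.
const unknown = evaluate({
  type: "CallExpression",
  callee: { type: "Identifier", name: "parseInt" },
  arguments: [],
});

console.log(folded);  // { confident: true, value: 3 }
console.log(unknown); // { confident: false }
```

Checking confident before replacing is what keeps e and f untouched in the result above: only their inner string concatenations could be computed safely.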

Delete unused variables

Code sometimes contains redundant variables that are never used. Removing them makes the code easier to analyse. Sample code:

const a = 1;
const b = a * 2;
const c = 2;
const d = b + 1;
const e = 3;
console.log(d)

To delete redundant variables, first understand scope in NodePath. scope is mainly used to find the scope of an identifier and to get and modify all references to it. Deleting unused variables mainly relies on the scope.getBinding() method; the value passed in is an identifier name that the current node can reference. The key properties of the returned binding are:

  • identifier: the Node object of the identifier;
  • path: the NodePath object of the identifier;
  • constant: whether the identifier is a constant;
  • referenced: whether the identifier is referenced;
  • references: the number of times the identifier is referenced;
  • constantViolations: if the identifier is modified, the Path objects of all nodes that modify it;
  • referencePaths: if the identifier is referenced, the Path objects of all nodes that reference it.

So we can use constantViolations, referenced, references, and referencePaths to decide whether a variable can be deleted. The AST processing code is as follows:

const parser = require("@babel/parser");
const generate = require("@babel/generator").default
const traverse = require("@babel/traverse").default

const code = `
const a = 1;
const b = a * 2;
const c = 2;
const d = b + 1;
const e = 3;
console.log(d)
`
const ast = parser.parse(code)

const visitor = {
    VariableDeclarator(path){
        const binding = path.scope.getBinding(path.node.id.name);

        // If the identifier has been modified, it cannot be deleted
        if (!binding || binding.constantViolations.length > 0) {
            return;
        }

        // Not referenced
        if (!binding.referenced) {
            path.remove();
        }

        // Alternative: the reference count is 0
        // if (binding.references === 0) {
        //     path.remove();
        // }

        // Alternative: the length is 0, so the variable is not referenced
        // if (binding.referencePaths.length === 0) {
        //     path.remove();
        // }
    }
}

traverse(ast, visitor)
const result = generate(ast)
console.log(result.code)

Processed code (the unused variables c and e have been deleted):

const a = 1;
const b = a * 2;
const d = b + 1;
console.log(d);
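The decision rule can be sketched as plain reference counting. The data below is a hand-built summary of the example rather than a parsed AST, and countReferences is a hypothetical helper that only approximates what binding.references reports in Babel.

```javascript
// Hand-built summary of the example: each declaration lists the names its
// initialiser reads. `calls` stands for the identifiers used by console.log(d).
const declarations = [
  { name: "a", reads: [] },
  { name: "b", reads: ["a"] },
  { name: "c", reads: [] },
  { name: "d", reads: ["b"] },
  { name: "e", reads: [] },
];
const calls = ["d"];

// Count how often each declared name is read, similar in spirit to binding.references.
function countReferences(decls, otherReads) {
  const counts = new Map(decls.map((d) => [d.name, 0]));
  const allReads = decls.flatMap((d) => d.reads).concat(otherReads);
  for (const name of allReads) {
    if (counts.has(name)) counts.set(name, counts.get(name) + 1);
  }
  return counts;
}

const counts = countReferences(declarations, calls);
const kept = declarations
  .filter((d) => counts.get(d.name) > 0)
  .map((d) => d.name);

console.log(kept); // a, b, d
```

b survives because d reads it, and a survives because b reads it, which matches the processed code above.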

Delete redundant logic code

To make reverse engineering harder, code is sometimes filled with nested if-else statements and a lot of redundant logic whose condition is always false. The AST can be used to delete these too, leaving only the branches whose condition is true. Sample code:

const example = function () {
    let a;
    if (false) {
        a = 1;
    } else {
        if (1) {
            a = 2;
        }
        else {
            a = 3;
        }
    }
    return a;
};

Looking at the AST, the condition corresponds to the test node, the if branch to the consequent node, and the else branch to the alternate node, as shown in the figure below:

[Figure 16]

AST processing approach and code:

  1. Select the if statements whose test is a BooleanLiteral or NumericLiteral node and take its value, path.node.test.value;
  2. If the value is truthy, replace the node with the content of the consequent node, path.node.consequent.body;
  3. If the value is falsy, replace the node with the content of the alternate node, path.node.alternate.body;
  4. Some if statements have no else and therefore no alternate; in that case, if the value is falsy, remove the node directly with path.remove().

const parser = require("@babel/parser");
const generate = require("@babel/generator").default
const traverse = require("@babel/traverse").default
const types = require('@babel/types');

const code = `
const example = function () {
    let a;
    if (false) {
        a = 1;
    } else {
        if (1) {
            a = 2;
        } else {
            a = 3;
        }
    }
    return a;
};
`
const ast = parser.parse(code)

const visitor = {
    // Restricted to if statements: other nodes that have a test (loops,
    // conditional expressions) have no consequent/alternate blocks to inline
    IfStatement(path) {
        if (types.isBooleanLiteral(path.node.test) || types.isNumericLiteral(path.node.test)) {
            if (path.node.test.value) {
                path.replaceInline(path.node.consequent.body);
            } else {
                if (path.node.alternate) {
                    path.replaceInline(path.node.alternate.body);
                } else {
                    path.remove()
                }
            }
        }
    }
}

traverse(ast, visitor)
const result = generate(ast)
console.log(result.code)

Result:

const example = function () {
  let a;
  a = 2;
  return a;
};
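The same pruning rule can be sketched on plain objects shaped like Babel nodes. The prune function and the code property on the statements are assumptions made for this illustration; the sketch also assumes every branch is a BlockStatement, as in the example.

```javascript
// Recursively replace if statements whose test is a literal with the branch
// that would actually run. Works on plain objects shaped like Babel nodes.
function prune(statements) {
  return statements.flatMap((stmt) => {
    if (stmt.type !== "IfStatement") return [stmt];
    const test = stmt.test;
    const isLiteral =
      test.type === "BooleanLiteral" || test.type === "NumericLiteral";
    if (!isLiteral) return [stmt];
    const branch = test.value ? stmt.consequent : stmt.alternate;
    // No else branch and a falsy test: the statement disappears entirely.
    return branch ? prune(branch.body) : [];
  });
}

// `code` is a hypothetical field used only to make the output readable.
const assign = (value) => ({ type: "ExpressionStatement", code: `a = ${value};` });
const block = (...body) => ({ type: "BlockStatement", body });

// if (false) { a = 1; } else { if (1) { a = 2; } else { a = 3; } }
const body = [
  {
    type: "IfStatement",
    test: { type: "BooleanLiteral", value: false },
    consequent: block(assign(1)),
    alternate: block({
      type: "IfStatement",
      test: { type: "NumericLiteral", value: 1 },
      consequent: block(assign(2)),
      alternate: block(assign(3)),
    }),
  },
];

const pruned = prune(body);
console.log(pruned.map((s) => s.code)); // only "a = 2;" remains
```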

Reversing switch-case control flow flattening

Control flow flattening is one of the most common obfuscations. It breaks the code into steps that are dispatched through if-else or while-switch-case statements. Sample code:

const _0x34e16a = '3,4,0,5,1,2'['split'](',');
let _0x2eff02 = 0x0;
while (!![]) {
    switch (_0x34e16a[_0x2eff02++]) {
        case '0':
            let _0x38cb15 = _0x4588f1 + _0x470e97;
            continue;
        case '1':
            let _0x1e0e5e = _0x37b9f3[_0x50cee0(0x2e0, 0x2e8, 0x2e1, 0x2e4)];
            continue;
        case '2':
            let _0x35d732 = [_0x388d4b(-0x134, -0x134, -0x139, -0x138)](_0x38cb15 >> _0x4588f1);
            continue;
        case '3':
            let _0x4588f1 = 0x1;
            continue;
        case '4':
            let _0x470e97 = 0x2;
            continue;
        case '5':
            let _0x37b9f3 = 0x5 || _0x38cb15;
            continue;
    }
    break;
}

AST restoration approach:

  1. Get the original control flow array by turning a statement such as '3,4,0,5,1,2'['split'](',') into an array such as ['3','4','0','5','1','2']. Once you have the array, you can also delete the node of the split statement, because it is useless in the final code;
  2. Traverse the control flow array obtained in step 1 and take out the case node corresponding to each value in turn;
  3. Define an array to store the contents of each case node's consequent array, and delete the node of the continue statement;
  4. After the traversal, replace the whole while node (the WhileStatement) with the array from step 3.
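The four steps above can be sketched on simplified data before touching the real AST. In this illustration the case bodies are source strings with made-up readable names in place of the obfuscated ones, and unflatten is a hypothetical helper; a real pass would work with the nodes in each SwitchCase's consequent array.

```javascript
// Case bodies keyed by their test value, written as source strings for brevity.
// A real pass would hold AST nodes from SwitchCase.consequent instead.
const cases = {
  "0": ["let sum = x + y;", "continue;"],
  "1": ["let item = table[key];", "continue;"],
  "2": ["let out = fn(sum >> x);", "continue;"],
  "3": ["let x = 0x1;", "continue;"],
  "4": ["let y = 0x2;", "continue;"],
  "5": ["let table = 0x5 || sum;", "continue;"],
};

function unflatten(orderString, separator, caseBodies) {
  // Step 1: recover the dispatch order, e.g. ['3','4','0','5','1','2'].
  const order = orderString.split(separator);
  // Steps 2 and 3: collect each case body in order, dropping the trailing continue.
  return order.flatMap((key) => {
    const body = caseBodies[key].slice();
    if (body[body.length - 1] === "continue;") body.pop();
    return body;
  });
}

// Step 4 would replace the while statement with this list.
const restored = unflatten("3,4,0,5,1,2", ",", cases);
console.log(restored.join("\n"));
```

The output lists the declarations in execution order: x, y, sum, table, item, out.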

There are different approaches and many ways to write this. To get the control flow array, you can proceed in either of the following ways:

  1. Get the while statement node, then use the path.getAllPrevSiblings() method to get all the sibling nodes before it. Traverse each sibling, find the node whose variable name matches the array used in switch(), and take that node's value for further processing;
  2. Take the variable name of the array used in switch() directly, then use the scope.getBinding() method to get the node it is bound to, and take that node's value for further processing.

So there are two ways to write the AST processing code. Method 1 (code.js holds the previous sample code; for convenience, fs is used here to read the code from the file):

const parser = require("@babel/parser");
const generate = require("@babel/generator").default
const traverse = require("@babel/traverse").default
const types = require("@babel/types")
const fs = require("fs");

const code = fs.readFileSync("code.js", {
    encoding: "utf-8"});
const ast = parser.parse(code)

const visitor = {
    
    WhileStatement(path) {
    
        // switch  node 
        let switchNode = path.node.body.body[0];
        // switch  The name of the control flow array in the statement , In this case, it is  _0x34e16a
        let arrayName = switchNode.discriminant.object.name;
        //  Get all  while  Brother node in front , In this example, we get the node that declares two variables , namely  const _0x34e16a  and  let _0x2eff02
        let prevSiblings = path.getAllPrevSiblings();
        //  Define the cache control flow array 
        let array = []
        // forEach  Traverse all nodes and methods 
        prevSiblings.forEach(pervNode => {
    
            let {
    id, init} = pervNode.node.declarations[0];
            //  If node  id.name  And  switch  The control flow array name in the statement is the same 
            if (arrayName === id.name) {
    
                //  Get the parameters of the whole expression of the node 、 Segmentation method 、 Separator 
                let object = init.callee.object.value;
                let property = init.callee.property.value;
                let argument = init.arguments[0].value;
                //  Simulation execution  '3,4,0,5,1,2'['split'](',')  sentence 
                array = object[property](argument)
                //  You can also directly take parameters for segmentation , Method is not universal , For example, replace the separator with  |  No way. 
                // array = init.callee.object.value.split(',');
            }
            //  The previous sibling nodes can be deleted 
            pervNode.remove();
        });

        // Holds the control flow statements in the correct order
        let replace = [];
        // Iterate over the control flow array and take the case bodies in the correct order
        array.forEach(index => {
            let consequent = switchNode.cases[index].consequent;
            // If the last node is a continue statement, remove the ContinueStatement node
            if (types.isContinueStatement(consequent[consequent.length - 1])) {
                consequent.pop();
            }
            // concat joins the arrays, collecting the case bodies in the correct order
            replace = replace.concat(consequent);
        });
        // Replace the whole while node; either method works
        path.replaceWithMultiple(replace);
        // path.replaceInline(replace);
    }
}

traverse(ast, visitor)
const result = generate(ast)
console.log(result.code)
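
The commented-out alternative in the visitor hard-codes the ',' separator. As a minimal, Babel-free sketch of why executing the original call is more general, assume the obfuscator had used '|' instead (the three constants below are hypothetical stand-ins for the values the visitor reads from the AST):

```javascript
// Hypothetical values, as they would be read from the AST of
// '3|4|0|5|1|2'['split']('|')
const object = "3|4|0|5|1|2"; // init.callee.object.value
const property = "split";     // init.callee.property.value
const argument = "|";         // init.arguments[0].value

// Replaying the original call works whatever the separator is
const dynamic = object[property](argument);
// Hard-coding ',' finds no separator and returns the whole string as one element
const hardCoded = object.split(",");

console.log(dynamic.length);   // 6
console.log(hardCoded.length); // 1
```

Replaying the call only makes sense when the method name read from the code is a harmless string method such as split; checking that before executing it is a sensible precaution.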

Method 2:

const parser = require("@babel/parser");
const generate = require("@babel/generator").default
const traverse = require("@babel/traverse").default
const types = require("@babel/types")
const fs = require("fs");

const code = fs.readFileSync("code.js", {encoding: "utf-8"});
const ast = parser.parse(code)

const visitor = {
    WhileStatement(path) {
        // The switch node
        let switchNode = path.node.body.body[0];
        // Name of the control flow array used in the switch statement; in this example it is _0x34e16a
        let arrayName = switchNode.discriminant.object.name;
        // Get the binding of the control flow array
        let bindingArray = path.scope.getBinding(arrayName);
        // Get the string, the split method name, and the separator from the binding's initializer
        let init = bindingArray.path.node.init;
        let object = init.callee.object.value;
        let property = init.callee.property.value;
        let argument = init.arguments[0].value;
        // Simulate executing the statement '3,4,0,5,1,2'['split'](',')
        let array = object[property](argument)
        // You could also split the string directly, but that is not general; for example, it fails if the separator is changed to |
        // let array = init.callee.object.value.split(',');

        // Name of the auto-increment variable used in the switch statement; in this example it is _0x2eff02
        let autoIncrementName = switchNode.discriminant.property.argument.name;
        // Get the binding of the auto-increment variable
        let bindingAutoIncrement = path.scope.getBinding(autoIncrementName);
        // Optional: remove the declarations bound to the control flow array and the auto-increment variable
        bindingArray.path.remove();
        bindingAutoIncrement.path.remove();

        // Holds the control flow statements in the correct order
        let replace = [];
        // Iterate over the control flow array and take the case bodies in the correct order
        array.forEach(index => {
            let consequent = switchNode.cases[index].consequent;
            // If the last node is a continue statement, remove the ContinueStatement node
            if (types.isContinueStatement(consequent[consequent.length - 1])) {
                consequent.pop();
            }
            // concat joins the arrays, collecting the case bodies in the correct order
            replace = replace.concat(consequent);
        });
        // Replace the whole while node; either method works
        path.replaceWithMultiple(replace);
        // path.replaceInline(replace);
    }
}

traverse(ast, visitor)
const result = generate(ast)
console.log(result.code)

After either version of the code above runs, the original switch-case control flow is restored to plain sequential statements, which is much more concise:

let _0x4588f1 = 0x1;
let _0x470e97 = 0x2;
let _0x38cb15 = _0x4588f1 + _0x470e97;
let _0x37b9f3 = 0x5 || _0x38cb15;
let _0x1e0e5e = _0x37b9f3[_0x50cee0(0x2e0, 0x2e8, 0x2e1, 0x2e4)];
let _0x35d732 = [_0x388d4b(-0x134, -0x134, -0x139, -0x138)](_0x38cb15 >> _0x4588f1);
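
The reordering step itself can be tried without Babel. The following is only a sketch: plain strings stand in for AST nodes, and the case bodies are reconstructed from the restored output above together with the order string '3,4,0,5,1,2', so the `cases` array and the `restoreOrder` helper are illustrative assumptions rather than part of the original script.

```javascript
// Stand-in for switchNode.cases: position i holds the body of case 'i'
const cases = [
    ["let _0x38cb15 = _0x4588f1 + _0x470e97;", "continue;"],
    ["let _0x1e0e5e = _0x37b9f3[_0x50cee0(0x2e0, 0x2e8, 0x2e1, 0x2e4)];", "continue;"],
    ["let _0x35d732 = [_0x388d4b(-0x134, -0x134, -0x139, -0x138)](_0x38cb15 >> _0x4588f1);", "continue;"],
    ["let _0x4588f1 = 0x1;", "continue;"],
    ["let _0x470e97 = 0x2;", "continue;"],
    ["let _0x37b9f3 = 0x5 || _0x38cb15;", "continue;"]
];

// Hypothetical helper mirroring the forEach loop in the visitor
function restoreOrder(order, cases) {
    let replace = [];
    order.forEach(index => {
        // Copy the body so the stand-in data is not mutated
        const consequent = cases[index].slice();
        // Drop the trailing continue, as the visitor does for ContinueStatement
        if (consequent[consequent.length - 1] === "continue;") {
            consequent.pop();
        }
        replace = replace.concat(consequent);
    });
    return replace;
}

const restored = restoreOrder("3,4,0,5,1,2".split(","), cases);
console.log(restored.join("\n"));
```

Like the visitor, this looks cases up by array position, which is only correct when the case labels '0' to '5' appear in ascending order; if the cases were shuffled, they would have to be matched by their test value instead.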

References

This article draws on the following references, which are also recommended online learning materials:

  • YouTube video, an introduction to Babel: https://www.youtube.com/watch?v=UeVq_U5obnE (by Nicolò Ribaudo; the slides from the video are available for free by replying "Babel" to the K Brother crawler official account)
  • The official manual, Babel Handbook: https://github.com/jamiebuilds/babel-handbook
  • Unofficial Chinese documentation for the Babel API: https://evilrecluse.top/Babel-traverse-api-doc/

END

There is actually not much Chinese-language material on the Babel compiler. The best approach is to read the source code while comparing it against the visualized AST in an online explorer, and to analyze it patiently, layer by layer. The cases in this article cover only the most basic operations. In practice, some obfuscation has to be handled case by case, for example by adding type checks that restrict which nodes a visitor touches. In follow-up articles, Brother K will use real-world cases to make you more familiar with other deobfuscation operations.
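
As an illustration of such a type check, the sketch below tests whether a while node has the shape that the control flow visitors in this article expect before anything is rewritten. It works on plain objects that use Babel's node type names, so it runs without Babel; the helper name `looksLikeFlattenedLoop` and the two sample nodes are hypothetical.

```javascript
// Hypothetical guard: true only for `while (...) { switch (arr[i++]) {...} ... }`
function looksLikeFlattenedLoop(whileNode) {
    const body = whileNode.body;
    // The loop body must be a non-empty block
    if (!body || body.type !== "BlockStatement" || body.body.length === 0) {
        return false;
    }
    // Its first statement must be the switch
    const first = body.body[0];
    if (first.type !== "SwitchStatement") {
        return false;
    }
    // The discriminant must look like array[index++]
    const d = first.discriminant;
    return d.type === "MemberExpression"
        && d.object.type === "Identifier"
        && d.property.type === "UpdateExpression"
        && d.property.argument.type === "Identifier";
}

// Hand-built sample nodes: one flattened loop, one ordinary loop
const flattened = {
    type: "WhileStatement",
    body: {type: "BlockStatement", body: [{
        type: "SwitchStatement",
        discriminant: {
            type: "MemberExpression",
            object: {type: "Identifier", name: "_0x34e16a"},
            property: {
                type: "UpdateExpression",
                operator: "++",
                argument: {type: "Identifier", name: "_0x2eff02"}
            }
        },
        cases: []
    }, {type: "BreakStatement"}]}
};
const ordinary = {
    type: "WhileStatement",
    body: {type: "BlockStatement", body: [{type: "ExpressionStatement"}]}
};

console.log(looksLikeFlattenedLoop(flattened)); // true
console.log(looksLikeFlattenedLoop(ordinary));  // false
```

Inside a real visitor the same idea would be an early return at the top of WhileStatement, so that ordinary loops in the target code are left untouched.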

copyright notice
author[Brother K reptile],Please bring the original link to reprint, thank you.
https://en.qdmana.com/2022/119/202204290805429608.html
