Qunar Hotel NodeJS coverage collection in practice

2021-08-27 00:15:01 Qunar Technology Salon

Ma Tao joined the Qunar technical team in 2013 and currently works in the destination business unit, responsible for H5 and mini program development. Ma Tao is keenly interested in mobile technology and in front-end and back-end engineering, and likes to explore, practice, and pursue excellence.

Overview

Generally speaking, we write unit tests to verify the code coverage of a program during execution. Coverage can be measured at the level of lines, branches, and functions. The results show how completely the system's functionality has been exercised, and they also reflect how complete the program design is.

However, for a legacy system whose unit tests have not been maintained, it is not easy to test system functionality and get familiar with the system structure by collecting coverage. We put a lot of thought and experimentation into this problem and have now completed a first-phase goal. Below we share our implementation.

Implementation principle

Coverage collection works in much the same way across languages, both in its mechanism and even at the level of syntax conventions. First, specific markers are inserted into the code according to certain rules, a step we call "code instrumentation". Then the execution of these markers is recorded while the cases run. Finally, coverage is calculated and the result is output in a formatted form. The general process is shown in the figure:

image.png
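To make the three steps concrete, here is a minimal hand-written sketch in which the markers are plain counters. The counter layout and the function names are assumptions made for illustration; the output of IstanbulJS is richer than this.

```javascript
// Hand-written illustration of what a module looks like after markers are inserted.
// The layout (statement id -> hit count) is a simplification for this sketch.
const counters = { s: { 0: 0, 1: 0, 2: 0 } };

function abs(n) {
  counters.s[0]++; // marker for the `if` statement
  if (n < 0) {
    counters.s[1]++; // marker for the negative-path return
    return -n;
  }
  counters.s[2]++; // marker for the fall-through return
  return n;
}

// Share of markers hit at least once, as a percentage.
function statementCoverage(hits) {
  const counts = Object.values(hits);
  if (counts.length === 0) return 100;
  const covered = counts.filter((c) => c > 0).length;
  return (covered / counts.length) * 100;
}

abs(5); // first case: only the positive path runs
const afterOneCase = statementCoverage(counters.s);
abs(-5); // second case: exercises the remaining path
const afterTwoCases = statementCoverage(counters.s);
console.log(afterOneCase.toFixed(1), afterTwoCases.toFixed(1));
```

Each additional case can only raise the figure, which is why replaying many real requests gradually fills in the coverage data.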

Source code compilation is optional and depends on the characteristics of the source language. In the JavaScript ecosystem there are fairly mature third-party libraries for the basic operations of code instrumentation and coverage statistics; we chose IstanbulJS. To keep the design extensible, we did not use its command-line tool, nyc, directly. Instead, we designed and developed our own solution on top of the IstanbulJS API. During development we used both the 1.0 and 2.0 versions of the IstanbulJS API. Their usage differs somewhat, but the functionality is basically the same. Please refer to the official website for details; the API differences are not covered here.

With the tooling in place, the next question is how to obtain the cases. For a new project with relatively few features, writing a reasonably complete set of cases by hand is manageable. But what about systems whose functionality is unfamiliar, or legacy systems with complex logic? Since our targets are NodeJS projects running on the server side, we looked at how other server-side projects in the company collect cases, and finally decided to obtain cases through log replay, scheduled tasks, and similar means. This produces some redundancy in the number of cases, but the cost is more controllable than writing the missing unit tests.

Details of the plan

With the implementation principle in mind, let's go through the details of our practice.

Code instrumentation

Code instrumentation is the prerequisite for coverage collection. This step analyzes the existing code at the syntax level and adds predefined markers at specific positions in the code. Let's compare a piece of code before and after processing:

The original file:

image.png

The file after instrumentation:

image.png

You can see some extra logic in the code; it counts executions along different dimensions of the code. We won't analyze it in detail here. Several points need attention in the whole process:

  • The scope of the instrumented files: the scope is determined by traversing the project's physical file directory; file dependencies inside the code are not analyzed;

  • Whether to keep the source file directory: this is an engineering decision that ultimately depends on whether the following steps run on the deployment machine. It is better to let a centralized platform handle the subsequent steps, which improves the efficiency of the deployment process, and removing the source code reduces the package size;

  • The source file path set during instrumentation: this path is used to trace back to the source code when the report is generated. For portability it is best to use relative paths, so that report generation is not tied to an absolute source path. This is easy to specify in IstanbulJS API 2.0;

  • The performance of the instrumentation process: this involves choosing between synchronous and asynchronous I/O. For projects with many files or a large code base, you can try multithreading as appropriate (it depends on the actual situation: some projects have fewer than 10 files, others have thousands).
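The points about scope and paths can be sketched with the Node standard library alone: walk the physical directory and record each file relative to the project root. The function name and the skip list are assumptions for this sketch, and the synchronous calls are chosen only for brevity.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Recursively list the .js files under rootDir as paths relative to rootDir.
// The skip list is an assumption; adjust it to the project layout.
function listSourceFiles(rootDir, skip = ['node_modules', '.git']) {
  const found = [];
  const walk = (dir) => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      if (skip.includes(entry.name)) continue;
      const full = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(full);
      } else if (entry.isFile() && entry.name.endsWith('.js')) {
        // Relative, forward-slash paths keep the later report portable.
        found.push(path.relative(rootDir, full).split(path.sep).join('/'));
      }
    }
  };
  walk(rootDir);
  return found.sort();
}
```

Each returned path would then be handed to the instrumenter together with the file content. For a project with thousands of files, the same walk could use the asynchronous `fs.promises` API instead.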

Collecting data

Our collection of NodeJS coverage data is dynamic: once the service has started, different external requests update the coverage data in real time. Taking the same demo as before, expand the folded part of the code to see how:

image.png

Combined with the instrumented code shown earlier, you can basically understand the coverage collection logic of this file. While the program runs, different request cases execute different code paths and, at the same time, the coverage counting logic. Repeating this completes the coverage statistics.

Incidentally, the nodes used for coverage counting correspond one-to-one with the sets of abstract syntax tree nodes in the different dimensions.

image.png

If you are interested, you can read more about JS syntax analysis.

As described above, each module's data is kept in the module itself and then attached to the global namespace so that all files can share it. So how do we get this data while the program is running? We tried two approaches:

The first is memory sharing. Because our services are generally kept alive by PM2 as the process manager, this was the first approach we considered. Through the message bus mechanism, the coverage data in different processes is passed around as messages. The data interaction is shown in the figure:

image.png

Reading and processing data in memory gives good real-time behavior, but it also brings some problems:

  • Low reliability: once the data in memory is lost, it is hard to recover;

  • Stability needs attention: when a multi-process service has a large data set (coverage data measured in MB is common), PM2's internal message deserialization is expensive, and poorly controlled message frequency can easily put heavy pressure on the hardware;

  • High coupling: the implementation depends heavily on PM2, so it cannot be ported to other application scenarios.

The second is file storage: serialize the in-memory data of each process and write it to a file, naming the file by process ID to avoid conflicts. The changed data interaction is shown in the figure:

image.png

File storage clearly improves on some of the previous problems:

  • Better reliability: even if the service has a problem, we can still recover from the data files. This works like resuming from a checkpoint, and the efficiency gain is obvious;

  • Stability still needs attention: because I/O is involved, reading and writing files must be designed carefully, especially the write frequency, the read timing, and the choice between synchronous and asynchronous operations. One of the most common problems is that frequent operations on a single data file cause an I/O deadlock in the system and consume a lot of resources in an instant;

  • Much lower coupling: file storage removes the dependence on the process manager, so in theory it can be ported to any service.

After a period of project practice, we decided to adopt the second approach.
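The file storage idea can be sketched as follows: each process writes its own snapshot, named by process ID, and a separate step merges the snapshots. The function names, the file naming pattern, and the reduced data shape (statement counters `s` only) are assumptions for this sketch, not the authors' implementation.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Write one process's coverage snapshot to <dir>/coverage-<pid>.json.
// Writing to a temporary name and renaming it means a reader never sees
// a half-written file.
function writeSnapshot(dir, coverage, pid = process.pid) {
  fs.mkdirSync(dir, { recursive: true });
  const target = path.join(dir, `coverage-${pid}.json`);
  const tmp = `${target}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(coverage));
  fs.renameSync(tmp, target);
  return target;
}

// Merge all per-process snapshots by summing statement hit counts per file.
function mergeSnapshots(dir) {
  const merged = {};
  for (const name of fs.readdirSync(dir)) {
    if (!/^coverage-\d+\.json$/.test(name)) continue;
    const snapshot = JSON.parse(fs.readFileSync(path.join(dir, name), 'utf8'));
    for (const [file, data] of Object.entries(snapshot)) {
      merged[file] = merged[file] || { s: {} };
      for (const [id, hits] of Object.entries(data.s)) {
        merged[file].s[id] = (merged[file].s[id] || 0) + hits;
      }
    }
  }
  return merged;
}
```

Because every process owns exactly one file, writers never contend with each other; only the write frequency remains to be tuned.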

In fact, whichever approach is used, data collection has one prerequisite: a specific module must be preloaded when the service starts. To let any project adopt this at zero cost, we can use the built-in environment variable NODE_OPTIONS to introduce the preload module (because this setting has a global effect, it is recommended to remove it after the service has started).
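A preload module along these lines might look like the sketch below. The file name, the `COVERAGE_PRELOAD` switch, and the function names are hypothetical; only `NODE_OPTIONS` and Node's `--require` (`-r`) flag are standard. The assumed start command is `NODE_OPTIONS="--require /path/to/preload-coverage.js" node server.js`.

```javascript
// preload-coverage.js (hypothetical file name)

// Drop our own --require flag from a NODE_OPTIONS string and keep other flags,
// so that child processes spawned later do not load the preload module again.
// Paths containing spaces are not handled in this sketch.
function stripPreload(nodeOptions, preloadPath) {
  const tokens = (nodeOptions || '').split(/\s+/).filter(Boolean);
  const kept = [];
  for (let i = 0; i < tokens.length; i++) {
    const isFlag = tokens[i] === '--require' || tokens[i] === '-r';
    if (isFlag && tokens[i + 1] === preloadPath) {
      i++; // skip the flag together with its argument
      continue;
    }
    if (tokens[i] === `--require=${preloadPath}`) continue;
    kept.push(tokens[i]);
  }
  return kept.join(' ');
}

// Schedule periodic flushing and clean up the environment.
// `flush` is whatever function persists the coverage data.
function install(flush, intervalMs = 60000) {
  const timer = setInterval(flush, intervalMs);
  timer.unref(); // do not keep the process alive just for flushing
  process.once('exit', flush);
  const cleaned = stripPreload(process.env.NODE_OPTIONS, __filename);
  if (cleaned) process.env.NODE_OPTIONS = cleaned;
  else delete process.env.NODE_OPTIONS;
  return () => clearInterval(timer);
}

// Hypothetical switch: only activate when COVERAGE_PRELOAD=1 is set.
if (process.env.COVERAGE_PRELOAD === '1') {
  install(() => { /* persist the shared coverage data here */ });
}
```

Removing only the module's own flag, rather than clearing the whole variable, keeps unrelated options such as memory limits intact.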

Output report

This step takes the data collected earlier and outputs the result as a summary or as a formatted HTML document. One such format is shown in the figure:

image.png

Reports can be output in various formats and are easy to move and store once generated. Scenarios that require changing the report are rare; if needed, secondary development can be based on the per-file, line-level data in the coverage data set. One point to note: files that are not referenced by the service startup script do not appear in the report. This differs from instrumentation, because the report is generated from the files that were actually executed at runtime.
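As an illustration of such secondary development, the sketch below turns a coverage data set into a plain-text summary. It assumes data shaped like IstanbulJS file coverage (`s` for statement hits, `f` for function hits, `b` for per-branch arrays of hits); the function names and the sample data are made up for this example.

```javascript
// Percentage of entries with a non-zero hit count, rounded to two decimals.
// An empty set counts as fully covered.
function pct(counts) {
  if (counts.length === 0) return 100;
  const covered = counts.filter((n) => n > 0).length;
  return Math.round((covered / counts.length) * 10000) / 100;
}

// Build one summary row per file.
function summarize(coverage) {
  return Object.keys(coverage).sort().map((file) => {
    const { s = {}, f = {}, b = {} } = coverage[file];
    return {
      file,
      statements: pct(Object.values(s)),
      functions: pct(Object.values(f)),
      branches: pct(Object.values(b).flat()),
    };
  });
}

// Render the rows as a simple pipe-separated table.
function formatSummary(rows) {
  const header = 'File | % Stmts | % Branch | % Funcs';
  const lines = rows.map(
    (r) => `${r.file} | ${r.statements} | ${r.branches} | ${r.functions}`
  );
  return [header, ...lines].join('\n');
}

// Made-up sample: one file, four statements, two functions, one two-way branch.
const sample = {
  'src/router.js': { s: { 0: 3, 1: 0, 2: 1, 3: 1 }, f: { 0: 1, 1: 0 }, b: { 0: [2, 0] } },
};
console.log(formatSummary(summarize(sample)));
```

Only files present in the data set produce a row, which matches the note above: a file that was never loaded at runtime has no entry to report.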

Summary

I think coverage is an important indicator of project quality. Both development and testing should pay attention to it, especially when a project faces big changes. In a sense, whether the data collected for coverage can also be used for performance monitoring, code optimization, and so on is worth exploring.




copyright notice
author[Qunar Technology Salon], please include the original link when reprinting, thank you.
https://en.qdmana.com/2021/08/20210827001458148M.html
