
Dedicated worker thread

2021-08-27 00:16:16 iwhao_ top

This is day 22 of my participation in the August Update Challenge. For event details, see: August Update Challenge

Basic concepts

A dedicated worker thread can be thought of as a background script. Every aspect of the thread, including life-cycle management, code path, and input/output, is controlled by the script supplied when the thread is initialized. That script can request other scripts, but a thread always starts from a single script source.

Creating a dedicated worker thread

The most common way to create a dedicated worker thread is to load a JavaScript file. Pass the file path to the Worker constructor, which then loads the script asynchronously in the background and instantiates the worker thread. The file path passed to the constructor can take several forms.

The following code demonstrates how to create an empty dedicated worker thread :

emptyWorker.js
// Empty JS worker thread file
main.js
console.log(location.href); // "https://example.com/"
const worker = new Worker(location.href + 'emptyWorker.js');
console.log(worker); // Worker {}

This is a very simple example, but it involves several basic concepts.

  • The emptyWorker.js file is loaded from an absolute path. Depending on how the application is structured, an absolute URL is often redundant.
  • The file is loaded in the background, and the worker thread is initialized completely independently of main.js.
  • The worker thread itself lives in a separate JavaScript environment, so main.js must use the Worker object as a proxy to communicate with it. In the example above, that object is assigned to the worker variable.
  • Although the corresponding worker thread may not exist yet, the Worker object is already available in the original environment.

The previous example can be modified to use a relative path. However, this requires main.js to be on the same path as emptyWorker.js:

const worker = new Worker('./emptyWorker.js');
console.log(worker); // Worker {}

Worker thread security restrictions

The script file for a worker thread can only be loaded from the same origin as the parent page. Attempting to load a worker script from another origin causes an error, as shown below:

// Try to create a worker thread from https://example.com/worker.js
const sameOriginWorker = new Worker('./worker.js');
// Try to create a worker thread from https://untrusted.com/worker.js
const remoteOriginWorker = new Worker('https://untrusted.com/worker.js');
// Error: Uncaught DOMException: Failed to construct 'Worker':
// Script at https://untrusted.com/worker.js cannot be accessed
// from origin https://example.com

Note: Being unable to create a worker thread from a cross-origin script does not stop the worker thread from executing scripts from other origins. Inside the worker thread, importScripts() can load scripts from other origins.
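The same-origin rule for the worker's entry script can be checked before the constructor is called. The following is a minimal sketch and not part of the original example: the helper name isSameOrigin is made up for illustration, and it relies only on the standard URL API to compare origins.

```javascript
// Hypothetical helper: reports whether a worker script URL shares the page's
// origin. Relative URLs are resolved against the page URL first.
function isSameOrigin(scriptUrl, pageUrl) {
  const script = new URL(scriptUrl, pageUrl);
  return script.origin === new URL(pageUrl).origin;
}

const page = 'https://example.com/';
console.log(isSameOrigin('./worker.js', page));                     // true
console.log(isSameOrigin('https://untrusted.com/worker.js', page)); // false
```

A check like this only predicts the constructor's behaviour; the browser still enforces the restriction itself.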

Worker threads created from a loaded script are not subject to the document's content security policy, because the worker thread runs in a different context from the parent document. However, if the script loaded by the worker thread has a globally unique identifier (as it does when loaded from a blob), it is subject to the parent document's content security policy.

Using the Worker object

The Worker object returned by the Worker() constructor is the connection point for communicating with the newly created dedicated worker thread. It can be used to pass information between the worker thread and the parent context and to capture events emitted by the dedicated worker thread.

Note: Manage every Worker object created with Worker() carefully. A worker thread is not garbage collected until it is terminated, and there is no programmatic way to recover a reference to a previously created Worker object.

The Worker object supports the following event handler properties.

  • onerror: The handler assigned to this property is called when an error event of type ErrorEvent occurs in the worker thread.
    • This event occurs when an error is thrown in the worker thread.
    • The event can also be handled with worker.addEventListener('error', handler).
  • onmessage: The handler assigned to this property is called when a message event of type MessageEvent occurs in the worker thread.
    • This event occurs when the worker thread sends a message to the parent context.
    • The event can also be handled with worker.addEventListener('message', handler).
  • onmessageerror: The handler assigned to this property is called when an error event of type MessageEvent occurs in the worker thread.
    • This event occurs when a message is received that cannot be deserialized.
    • The event can also be handled with worker.addEventListener('messageerror', handler).

The Worker object also supports the following methods.

  • postMessage(): Used to send information to the worker thread through asynchronous message events.
  • terminate(): Used to terminate the worker thread immediately. The worker thread gets no opportunity to clean up, and the script stops abruptly.
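The handlers and methods listed above could be wired together as in the sketch below. monitorWorker and its log parameter are hypothetical names introduced here; the function uses only addEventListener(), so it accepts a real Worker or any EventTarget standing in for one.

```javascript
// Hypothetical helper: attaches listeners for the three events a Worker
// object can emit and reports each one through the supplied log callback.
function monitorWorker(worker, log = console.log) {
  worker.addEventListener('message', ({ data }) => log('message', data));
  worker.addEventListener('error', ({ message }) => log('error', message));
  worker.addEventListener('messageerror', () => log('messageerror'));
  return worker;
}

// In a browser (the file name is an assumption):
//   const worker = monitorWorker(new Worker('./worker.js'));
//   worker.postMessage('ping');  // asynchronous message to the worker
//   worker.terminate();          // stops the worker with no cleanup
```

Using addEventListener() rather than the on* properties leaves those properties free for other code to assign.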

DedicatedWorkerGlobalScope

Inside a dedicated worker thread, the global scope is an instance of DedicatedWorkerGlobalScope. Because it inherits from WorkerGlobalScope, it includes all of that interface's properties and methods. A worker thread can access this global scope through self.

globalScopeWorker.js

console.log('inside worker:', self);

main.js

const worker = new Worker('./globalScopeWorker.js');
console.log('created worker:', worker);
// created worker: Worker {}
// inside worker: DedicatedWorkerGlobalScope {}

As this example shows, the console object in both the top-level script and the worker thread writes to the browser console, which is very useful for debugging. Because a worker thread has a non-negligible startup delay, the worker's log is printed after the main thread's log even though the Worker object already exists.

Note: Two separate JavaScript threads are sending messages to a single console object, which then serializes the messages and prints them in the browser console. The browser receives messages from two different JavaScript threads and outputs them in whatever order it sees fit. For this reason, be careful when using logs to determine the order of operations in a multithreaded application.

DedicatedWorkerGlobalScope adds the following properties and methods on top of WorkerGlobalScope.

  • name: An optional string identifier that can be passed to the Worker constructor.
  • postMessage(): The counterpart of worker.postMessage(), used to send messages from inside the worker thread to the parent context.
  • close(): The counterpart of worker.terminate(), used to terminate the worker thread from the inside. The worker thread gets no opportunity to clean up, although, as the life-cycle discussion shows, the script that is currently running is allowed to finish.
  • importScripts(): Used to import any number of scripts into the worker thread.

Dedicated worker threads and implicit MessagePorts

The Worker object of a dedicated worker thread and DedicatedWorkerGlobalScope share some handlers and methods with MessagePort: onmessage, onmessageerror, close(), and postMessage(). This is no accident, because dedicated worker threads implicitly use a MessagePort to communicate between the two contexts.

The Worker object in the parent context and DedicatedWorkerGlobalScope effectively absorb the MessagePort and expose the corresponding handlers and methods through their own interfaces. In other words, messages are still sent through a MessagePort; the port is just not used directly.

There are also inconsistencies, such as the start() and close() conventions. Dedicated worker threads send queued messages automatically, so start() is unnecessary. In addition, close() makes little sense in the context of a dedicated worker thread, because closing the MessagePort would isolate the worker thread. Therefore, calling close() inside the worker thread (or terminate() from outside) not only closes the MessagePort but also terminates the thread.

The life cycle of a dedicated worker thread

Calling the Worker() constructor is the starting point of a dedicated worker thread's life. Once called, it initiates the request for the worker thread script and returns a Worker object to the parent context. Although the Worker object can be used in the parent context immediately, the worker thread associated with it may not have been created yet, because requesting the script involves network latency and initialization latency.

Generally speaking, the life of a dedicated worker thread can be informally divided into three states: initializing, active, and terminated. These states are invisible to other contexts. Although a Worker object may exist in the parent context, it cannot be used to determine whether the worker thread is currently initializing, active, or terminated. In other words, a Worker object associated with an active dedicated worker thread is indistinguishable from one associated with a terminated dedicated worker thread.

During initialization, although the worker thread script has not yet executed, messages to the worker thread can already be queued. These messages wait until the worker thread becomes active and are then added to its message queue. The following code demonstrates this process.

initializingWorker.js
self.addEventListener('message', ({data}) => console.log(data));
main.js
const worker = new Worker('./initializingWorker.js');
// The worker may still be initializing,
// but postMessage() data is handled normally
worker.postMessage('foo');
worker.postMessage('bar');
worker.postMessage('baz');
// foo
// bar
// baz 

Once created, a dedicated worker thread exists for the entire life of the page unless it terminates itself (self.close()) or is terminated externally (worker.terminate()). Even after the thread script has finished running, the thread's environment continues to exist. As long as the worker thread exists, the Worker object associated with it is not garbage collected.

Self-termination and external termination both end up executing the same worker thread termination routine. Consider the following example, in which the worker thread terminates itself between sending two messages:

closeWorker.js
self.postMessage('foo');
self.close();
self.postMessage('bar');
setTimeout(() => self.postMessage('baz'), 0);
main.js
const worker = new Worker('./closeWorker.js');
worker.onmessage = ({data}) => console.log(data);
// foo
// bar

Although close() is called, execution of the worker thread clearly does not stop immediately. Here close() tells the worker thread to discard all tasks in its event loop and prevents new tasks from being added, which is why "baz" is never printed. The worker thread is not required to stop synchronously, so "bar", which is handled in the parent context's event loop, is still printed. Now consider an example of external termination.

terminateWorker.js
self.onmessage = ({data}) => console.log(data);
main.js
const worker = new Worker('./terminateWorker.js');
// Allow 1000 milliseconds for the worker thread to initialize
setTimeout(() => {
 worker.postMessage('foo');
 worker.terminate();
 worker.postMessage('bar');
 setTimeout(() => worker.postMessage('baz'), 0);
}, 1000);
// foo 

Here, the outside context first sends the worker thread a postMessage carrying "foo", which can be processed before the external termination. Once terminate() is called, the worker thread's message queue is cleared and locked, which is why only "foo" is printed.

Note: close() and terminate() are idempotent, so calling them more than once is harmless. Both methods merely mark the Worker for teardown, so repeated calls have no adverse effect.

Throughout its life cycle, a dedicated worker thread is associated with exactly one web page (the Web Workers specification calls it a document). Unless explicitly terminated, a dedicated worker thread exists for as long as its associated document exists. If the browser leaves the page (through navigation or by closing the tab or window), it marks the worker threads associated with that page as terminated, and their execution stops immediately.

Configuring Worker options

The Worker() constructor accepts an optional configuration object as its second argument. The configuration object supports the following properties.

  • name: A string identifier that can be read inside the worker thread through self.name.
  • type: Specifies how the loaded script is run. It can be "classic" or "module". "classic" executes the script as a regular script, and "module" executes it as a module.
  • credentials: When type is "module", specifies how credentials are handled when fetching the worker thread's module script. The value can be "omit", "same-origin", or "include", the same as the credentials option of fetch(). When type is "classic", the default is "omit".

Note: Some modern browsers do not fully support module worker threads or may require a flag to be changed to enable them.
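As an illustration of the options above, the sketch below builds a named module worker. createModuleWorker is a hypothetical helper, and its WorkerCtor parameter exists only so the sketch can run where no Worker global is defined; in a browser it falls back to the built-in constructor.

```javascript
// Hypothetical helper: creates a module worker with a name and an explicit
// credentials mode. WorkerCtor defaults to the global Worker when present.
function createModuleWorker(url, name, WorkerCtor = globalThis.Worker) {
  return new WorkerCtor(url, {
    name,                        // readable inside the worker as self.name
    type: 'module',              // run the script as an ES module
    credentials: 'same-origin',  // same values as the fetch() option
  });
}

// In a browser (the file name is an assumption):
//   const worker = createModuleWorker('./moduleWorker.js', 'foo');
```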

Creating worker threads inline in JavaScript

A worker thread must be created from a script file, but that does not mean the script has to be a remote resource. A dedicated worker thread can also be created from an inline script through a Blob object URL. This makes initialization faster, because there is no network latency. The following example creates a worker thread from an inline script.

// Create the string of JavaScript code to execute
const workerScript = `
 self.onmessage = ({data}) => console.log(data);
`;
// Create a Blob from the script string
const workerScriptBlob = new Blob([workerScript]);
// Create an object URL from the Blob instance
const workerScriptBlobUrl = URL.createObjectURL(workerScriptBlob);
// Create a dedicated worker thread from the object URL
const worker = new Worker(workerScriptBlobUrl);
worker.postMessage('blob worker script');
// blob worker script

In this example, a Blob is created from the script string, an object URL is created from the Blob, and the object URL is finally passed to the Worker() constructor, which creates the dedicated worker thread as usual.
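Because the worker script is just a string, a function defined in the page can be serialized and handed to the worker. The sketch below is an illustration under assumptions: workerScriptFor, inlineWorkerUrl, and fibonacci are hypothetical names, and the object URL is revoked once the worker has replied, a step the article's example does not include.

```javascript
function fibonacci(n) {
  return n < 2 ? n : fibonacci(n - 1) + fibonacci(n - 2);
}

// Hypothetical helper: builds a worker script that calls the serialized
// function with one JSON-serializable argument and posts the result back.
function workerScriptFor(fn, arg) {
  return `self.postMessage((${fn.toString()})(${JSON.stringify(arg)}));`;
}

// Hypothetical helper: turns that script into an object URL for Worker().
function inlineWorkerUrl(fn, arg) {
  const blob = new Blob([workerScriptFor(fn, arg)], { type: 'text/javascript' });
  return URL.createObjectURL(blob);
}

// In a browser:
//   const url = inlineWorkerUrl(fibonacci, 9);
//   const worker = new Worker(url);
//   worker.onmessage = ({ data }) => {
//     console.log(data);          // 34
//     URL.revokeObjectURL(url);   // release the Blob once it is no longer needed
//   };
```

This only works for functions that do not depend on variables from the surrounding scope, since only the function's own source text reaches the worker.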

Executing scripts dynamically in worker threads

The script in a worker thread is not a monolith: the importScripts() method can be used to load and execute any script programmatically. This method is available on the worker's global object. It loads the scripts and executes them synchronously, in load order. For example, the following loads and executes two scripts:

main.js
const worker = new Worker('./worker.js');
// importing scripts
// scriptA executes
// scriptB executes
// scripts imported
scriptA.js
console.log('scriptA executes');

scriptB.js
console.log('scriptB executes');
worker.js
console.log('importing scripts');
importScripts('./scriptA.js');
importScripts('./scriptB.js');
console.log('scripts imported');

The importScripts() method accepts any number of scripts as arguments. The browser may download them in any order, but they are executed strictly in the order of the argument list. Therefore, the following code has the same effect as before:

console.log('importing scripts');
importScripts('./scriptA.js', './scriptB.js');
console.log('scripts imported');

Script loading is subject to the usual CORS restrictions, but a worker thread can request scripts from any origin. The import strategy here is similar to loading scripts dynamically through generated <script> tags. In this case, all imported scripts share the same scope, as the following code demonstrates:

main.js
const worker = new Worker('./worker.js', {name: 'foo'});
// importing scripts in foo with bar
// scriptA executes in foo with bar
// scriptB executes in foo with bar
// scripts imported
scriptA.js
console.log(`scriptA executes in ${self.name} with ${globalToken}`);
scriptB.js
console.log(`scriptB executes in ${self.name} with ${globalToken}`);
worker.js
const globalToken = 'bar';
console.log(`importing scripts in ${self.name} with ${globalToken}`);
importScripts('./scriptA.js', './scriptB.js');
console.log('scripts imported'); 

Delegating tasks to child worker threads

Sometimes it may be necessary to create child worker threads from within a worker thread. On systems with multiple CPU cores, multiple child worker threads can perform computation in parallel. Think carefully before using multiple child worker threads and make sure the investment in parallelism pays off, because running several child threads at once carries a significant computational cost.

Apart from path resolution, creating a child worker thread is the same as creating an ordinary worker thread. The script path of a child worker thread is resolved relative to the parent worker thread rather than relative to the web page. Consider the following example (note the extra js directory):

main.js
const worker = new Worker('./js/worker.js');
// worker
// subworker
js/worker.js
console.log('worker');
const worker = new Worker('./subworker.js');
js/subworker.js
console.log('subworker');

The scripts of both the top-level worker thread and the child worker thread must be loaded from the same origin as the main page.
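The path-resolution rule for child worker threads can be previewed with the standard URL API. The page URL below is an assumption chosen to match the js directory in the example; the sketch only resolves paths and creates no workers.

```javascript
const pageUrl = 'https://example.com/index.html';

// The page resolves the worker script against its own URL.
const workerUrl = new URL('./js/worker.js', pageUrl).href;

// The worker resolves the subworker script against the worker's URL,
// so './subworker.js' lands inside the js directory.
const subworkerUrl = new URL('./subworker.js', workerUrl).href;

console.log(workerUrl);    // https://example.com/js/worker.js
console.log(subworkerUrl); // https://example.com/js/subworker.js
```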

Handling worker thread errors

If the worker thread script throws an error, the worker thread's sandbox prevents it from interrupting execution of the parent thread. In the following example, the enclosing try/catch block does not catch the error:

main.js
try {
 const worker = new Worker('./worker.js');
 console.log('no error');
} catch(e) {
 console.log('caught error');
}
// no error
worker.js
throw Error('foo');

However, the corresponding error event still bubbles up to the worker thread's global context, so it can be accessed by setting an error event listener on the Worker object. Consider this example:

main.js
const worker = new Worker('./worker.js');
worker.onerror = console.log;
// ErrorEvent {message: "Uncaught Error: foo"}
worker.js
throw Error('foo');

Communicating with dedicated worker threads

Communication with worker threads happens through asynchronous messages, but these messages can take several forms.

Using postMessage()

The simplest and most common approach is to pass serialized messages with postMessage(). Consider an example that calculates factorials:

factorialWorker.js
function factorial(n) {
 let result = 1;
 while(n) { result *= n--; }
 return result;
}
self.onmessage = ({data}) => {
 self.postMessage(`${data}! = ${factorial(data)}`);
};
main.js
const factorialWorker = new Worker('./factorialWorker.js');
factorialWorker.onmessage = ({data}) => console.log(data);
factorialWorker.postMessage(5);
factorialWorker.postMessage(7);
factorialWorker.postMessage(10);
// 5! = 120
// 7! = 5040
// 10! = 3628800

For simple messages, using postMessage() between the main thread and a worker thread is very similar to sending messages between two windows. The main difference is that there is no targetOrigin restriction. That restriction applies to Window.prototype.postMessage and has no effect on WorkerGlobalScope.prototype.postMessage or Worker.prototype.postMessage. The reason for this convention is simple: the origin of a worker thread script is restricted to the origin of the main page, so there is no need to filter again.

Using MessageChannel

Whether in the main thread or in a worker thread, communicating through postMessage() involves calling a method on the global object and defining an ad hoc transport protocol. This can be replaced by the Channel Messaging API, which makes it possible to establish an explicit communication channel between two contexts.

A MessageChannel instance has two ports, one for each endpoint of the communication. For the parent page and a worker thread to communicate through a MessageChannel, one of the ports must be passed to the worker thread, as shown below:

worker.js
// Keep the MessagePort in a variable outside the listener
let messagePort = null;
function factorial(n) {
 let result = 1;
 while(n) { result *= n--; }
 return result;
}
// Add a message handler to the global object
self.onmessage = ({ports}) => {
 // Set the port only once
 if (!messagePort) {
     // Store the port for later use
     // and remove the listener on the global object
     messagePort = ports[0];
     self.onmessage = null;
     // Set the message handler on the port
     messagePort.onmessage = ({data}) => {
         // Reply with the result after receiving a message
         messagePort.postMessage(`${data}! = ${factorial(data)}`);
     };
 }
};
main.js
const channel = new MessageChannel();
const factorialWorker = new Worker('./worker.js');
// Transfer a MessagePort object to the worker thread;
// the worker thread is responsible for setting up the channel
factorialWorker.postMessage(null, [channel.port1]);
// Log the responses the worker thread sends through the channel
channel.port2.onmessage = ({data}) => console.log(data);
// Send the actual data through the channel
channel.port2.postMessage(5);
// 5! = 120

Using BroadcastChannel

Same-origin scripts can send messages to and receive messages from one another through a BroadcastChannel. This channel type is simpler to set up and does not require passing ports around as MessageChannel does. It can be used as follows:

main.js
const channel = new BroadcastChannel('worker_channel');
const worker = new Worker('./worker.js');
channel.onmessage = ({data}) => {
 console.log(`heard ${data} on page`);
}
setTimeout(() => channel.postMessage('foo'), 1000);
// heard foo in worker
// heard bar on page
worker.js
const channel = new BroadcastChannel('worker_channel'); 
channel.onmessage = ({data}) => {
 console.log(`heard ${data} in worker`);
 channel.postMessage('bar');
}

Here, the page waits one second before sending a message through the BroadcastChannel. Because this kind of channel has no concept of port ownership, a broadcast message is not handled if no entity is listening on the channel. In this case, without setTimeout(), the delay in initializing the worker thread would cause the message to be sent before the worker thread's message handler is in place.

Worker thread data transfer

When using worker threads, it is often necessary to supply them with some form of data payload. Because worker threads run in separate contexts, transferring data between contexts has a cost. Languages that support the traditional multithreading model can use locks, mutexes, and volatile variables. In JavaScript, there are three ways to move information between contexts: the structured clone algorithm, transferable objects, and shared array buffers.

Structured clone algorithm

The structured clone algorithm can be used to share data between two independent contexts. The algorithm is implemented by the browser behind the scenes and was long not callable directly (newer environments expose it through the global structuredClone() function). When an object is passed through postMessage(), the browser traverses the object and creates a copy of it in the target context. The following types are supported by the structured clone algorithm.

  • All primitive types except Symbol
  • Boolean object
  • String object
  • Date
  • RegExp
  • Blob
  • File
  • FileList
  • ArrayBuffer
  • ArrayBufferView
  • ImageData
  • Array
  • Object
  • Map
  • Set

There are several points to note about the structured clone algorithm.

  • After copying, changes to the object in the source context do not propagate to the object in the target context.
  • The algorithm recognizes circular references within an object and does not traverse the object endlessly.
  • Cloning a Function object or a DOM node throws an error; older implementations also reject Error objects.
  • The algorithm does not always create an exact copy.
    • Object property descriptors, getters, and setters are not cloned; default values are used where necessary.
    • Prototype chains are not cloned.
    • The RegExp.prototype.lastIndex property is not cloned.

The structured clone algorithm is computationally expensive when objects are complex. In practice, therefore, avoid copies that are too large or too numerous.
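Several of these points can be observed directly in environments that provide the global structuredClone() function, which runs the same algorithm without a worker. Its availability is an assumption here (recent browsers and Node.js 17 or later), and the object shapes are made up for illustration.

```javascript
// A circular structure survives cloning, and the copy is independent.
const original = { name: 'task', created: new Date(0), tags: new Set(['a']) };
original.self = original;
const copy = structuredClone(original);
original.name = 'changed';
console.log(copy.name, copy.self === copy); // task true

// Own getters are read once and stored as plain values; prototypes are dropped.
class Point {
  constructor(x) { this.x = x; }
}
const source = new Point(2);
Object.defineProperty(source, 'double', {
  get() { return this.x * 2; },
  enumerable: true,
});
const plain = structuredClone(source);
const descriptor = Object.getOwnPropertyDescriptor(plain, 'double');
console.log(plain instanceof Point, descriptor.value, descriptor.get); // false 4 undefined

// lastIndex is not carried over to the cloned RegExp.
const pattern = /a/g;
pattern.lastIndex = 3;
console.log(structuredClone(pattern).lastIndex); // 0

// Functions cannot be cloned.
let errorName = null;
try {
  structuredClone(() => {});
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // DataCloneError
```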

Transferable objects

Transferable objects make it possible to transfer ownership from one context to another. This is particularly useful when copying large amounts of data between contexts is impractical. Only the following objects are transferable:

  • ArrayBuffer
  • MessagePort
  • ImageBitmap
  • OffscreenCanvas
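A transfer can be observed without a worker by passing a transfer list to structuredClone(), which follows the same ownership rules as postMessage(). This is a sketch only, and structuredClone() is an assumption about the runtime rather than something the text itself uses.

```javascript
const buffer = new ArrayBuffer(32);

// Ownership of the memory moves to the returned buffer; nothing is copied.
const moved = structuredClone(buffer, { transfer: [buffer] });

console.log(moved.byteLength);  // 32
console.log(buffer.byteLength); // 0, the original is detached

// With a worker, the equivalent call would be:
//   worker.postMessage(buffer, [buffer]);
```

After the transfer, the sending context can no longer read or write the buffer, which is what makes the operation cheap compared with cloning.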

SharedArrayBuffer

Note: Because of the Spectre and Meltdown vulnerabilities, all major browsers disabled SharedArrayBuffer in January 2018. Starting in 2019, some browsers gradually re-enabled the feature.

Neither cloned nor transferred, a SharedArrayBuffer can, like an ArrayBuffer, be shared between different browser contexts. When a SharedArrayBuffer is passed to postMessage(), the browser passes only a reference to the original buffer. As a result, two different JavaScript contexts each maintain a reference to the same block of memory. Each context can modify the buffer freely, just as it would an ordinary ArrayBuffer.
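The sharing behaviour can be illustrated inside a single context, with two typed-array views standing in for the two contexts that would each hold a reference. This is only a sketch: Atomics is not covered in the text above and appears here as an assumption about how concurrent updates would be made safely, and in browsers SharedArrayBuffer is typically available only on cross-origin isolated pages.

```javascript
const shared = new SharedArrayBuffer(4);

// Two views over the same memory, standing in for two contexts.
const viewA = new Uint32Array(shared);
const viewB = new Uint32Array(shared);

// A write through one view is visible through the other.
viewA[0] = 1;
console.log(viewB[0]); // 1

// Atomics.add() updates the value atomically and returns the previous one.
const previous = Atomics.add(viewB, 0, 41);
console.log(previous, Atomics.load(viewA, 0)); // 1 42
```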

Thread pool

Because starting a worker thread is expensive, in some cases it makes sense to keep a fixed number of threads alive and assign tasks to them as needed. While a worker thread is performing a computation it is marked as busy, and it becomes ready for a new task only after it tells the thread pool that it is idle. These active threads are called a "thread pool" or "worker thread pool".

There is no authoritative answer for how many threads a pool should contain, but the navigator.hardwareConcurrency property returns the number of cores available to the system and can serve as a guide. Because the multithreading capability of each core is unlikely to be known, it is best to treat this number as the upper limit of the pool size.
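The sizing advice could be applied as in the following sketch. poolSize is a hypothetical helper, and its nav parameter is there only so the function can be exercised where no global navigator exists; falling back to a single thread when the core count is unknown is an assumption, not a rule from the text.

```javascript
// Hypothetical helper: caps a requested pool size at the reported core count.
// The nav parameter defaults to the global navigator when one exists.
function poolSize(requested, nav = globalThis.navigator) {
  const cores = (nav && nav.hardwareConcurrency) || 1;
  return Math.max(1, Math.min(requested, cores));
}

console.log(poolSize(8, { hardwareConcurrency: 4 })); // 4
console.log(poolSize(2, { hardwareConcurrency: 4 })); // 2
```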

One strategy for using a thread pool is to have every thread perform the same task, with the details of the task controlled by a few parameters. With a task-specific thread pool, a fixed number of worker threads can be allocated and given parameters as needed. A worker thread receives the parameters, performs the time-consuming computation, and returns the result to the thread pool. The thread pool can then assign further work to that worker thread. The next example builds a relatively simple thread pool that still covers all the basic requirements of this idea.

The first step is to define a TaskWorker class that extends the Worker class. The TaskWorker class is responsible for two things: tracking whether the thread is busy, and managing the messages and events going into and out of the thread. In addition, each task passed to the worker thread is wrapped in a promise, which is then resolved or rejected as appropriate.
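A minimal sketch of such a pool follows. It is not the original implementation: instead of extending Worker, this TaskWorker wraps any worker-like object (a deliberate substitution so the sketch can run with a stand-in), and WorkerPool, enqueue, dispatch, and the createWorker factory are hypothetical names.

```javascript
// Tracks one worker-like object (anything with postMessage, onmessage and
// onerror) and the promise callbacks of the task it is currently running.
class TaskWorker {
  constructor(worker, onIdle) {
    this.worker = worker;
    this.onIdle = onIdle;
    this.available = true;
    this.pending = null;
    worker.onmessage = ({ data }) => this.settle('resolve', data);
    worker.onerror = (error) => this.settle('reject', error);
  }

  // Hands a queued task to the worker and marks this worker as busy.
  run({ payload, resolve, reject }) {
    this.available = false;
    this.pending = { resolve, reject };
    this.worker.postMessage(payload);
  }

  // Resolves or rejects the task's promise, then reports back as idle.
  settle(kind, value) {
    const pending = this.pending;
    this.pending = null;
    this.available = true;
    if (pending) pending[kind](value);
    this.onIdle();
  }
}

class WorkerPool {
  // createWorker is a factory such as () => new Worker('./taskWorker.js').
  constructor(size, createWorker) {
    this.queue = [];
    this.workers = Array.from({ length: size },
      () => new TaskWorker(createWorker(), () => this.dispatch()));
  }

  // Queues a payload and returns a promise for the worker's reply.
  enqueue(payload) {
    return new Promise((resolve, reject) => {
      this.queue.push({ payload, resolve, reject });
      this.dispatch();
    });
  }

  // Assigns queued tasks to idle workers until one of the two runs out.
  dispatch() {
    while (this.queue.length > 0) {
      const idle = this.workers.find((w) => w.available);
      if (!idle) return;
      idle.run(this.queue.shift());
    }
  }
}
```

In a browser, the pool could be created with a factory that returns real workers, and each call to enqueue() would then settle with whatever the worker script posts back. The sketch assumes every worker answers each task with exactly one message.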

Adopting parallel computation hastily is not necessarily the best approach. The tuning strategy for a thread pool varies with the computing task and with the system hardware.

That is all for this article. Thank you very much for reading this far. If you found it well written or helpful, please like, follow, and share. Any questions are welcome in the comments, and I will do my best to answer them. Thank you again.

copyright notice
Author: iwhao_ top. Please include the original link when reprinting. Thank you.
https://en.qdmana.com/2021/08/20210827001601269d.html
