Processes

Although Node abstracts a lot of things from the operating system, you are still running in an operating system and may want to interact with it more directly. Node allows you to interact with system processes that already exist, as well as create new child processes to do work of various kinds. Although Node itself is generally a “fat” thread with a single event loop, you are free to start other processes to do work outside of the event loop.

process Module

The process module enables you to get information about and change the settings of the current Node process. Unlike most modules, the process module is global and is always available as the variable process.

process events

process is an instance of EventEmitter, so it provides events based on system calls to the Node process. The exit event provides a final hook before the Node process exits (see Example 5-14). Importantly, the event loop will not run after the exit event, so only code without callbacks will be executed.

Example 5-14. Calling code when Node is exiting

process.on('exit', function () {
  setTimeout(function () {
    console.log('This will not run');
  }, 100);
  console.log('Bye.');
});

Because the loop isn’t going to run again, the setTimeout() code will never be evaluated.

An extremely useful event provided by process is uncaughtException (Example 5-15). After you’ve spent any time with Node, you’ll find that exceptions that hit the main event loop will kill your Node process. In many use cases, especially servers that are expected to never be down, this is unacceptable. The uncaughtException event provides an extremely brute-force way of catching these exceptions. It’s really a last line of defense, but it’s extremely useful for that purpose.

Example 5-15. Trapping an exception with the uncaughtException event

process.on('uncaughtException', function (err) {
  console.log('Caught exception: ' + err);
});

setTimeout(function () {
  console.log('This will still run.');
}, 500);

// Intentionally cause an exception, but don't catch it.
nonexistentFunc();
console.log('This will not run.');

Let’s break down what’s happening. First, we create an event listener for uncaughtException. This is not a smart handler; it simply writes the exception to stdout. If this Node script were running as a server, stdout could easily be redirected to a file to capture these errors. Because the handler catches the exception from the nonexistent function, Node will not exit, but the standard flow is still disrupted. We know that all the main JavaScript runs once, and then callbacks are run each time their event listener emits an event. In this scenario, because nonexistentFunc() throws an exception, no code following it will be called. However, anything that has already been scheduled will continue to run, which is why the setTimeout() callback still fires. This is significant when you’re writing servers. Let’s consider some more code in this area, shown in Example 5-16.

Example 5-16. The effect on callbacks of catching exceptions

var http = require('http');
var server = http.createServer(function(req,res) {
  res.writeHead(200, {});
  res.end('response');
  badLoggingCall('sent response');
  console.log('sent response');
});

process.on('uncaughtException', function(e) {
  console.log(e);
});

server.listen(8080);

This code creates a simple HTTP server and then listens for any uncaught exceptions at the process level. In our HTTP server, the callback deliberately calls a bad function after we’ve sent the HTTP response. Example 5-17 shows the console output for this script.

Example 5-17. Output of Example 5-16

Enki:~ $ node ex-test.js 
{ stack: [Getter/Setter],
  arguments: [ 'badLoggingCall' ],
  type: 'not_defined',
  message: [Getter/Setter] }
{ stack: [Getter/Setter],
  arguments: [ 'badLoggingCall' ],
  type: 'not_defined',
  message: [Getter/Setter] }
{ stack: [Getter/Setter],
  arguments: [ 'badLoggingCall' ],
  type: 'not_defined',
  message: [Getter/Setter] }
{ stack: [Getter/Setter],
  arguments: [ 'badLoggingCall' ],
  type: 'not_defined',
  message: [Getter/Setter] }

When we start the example script, the server becomes available, and we have made a number of HTTP requests to it. Notice that the server doesn’t shut down at any point. Instead, the errors are logged by the function attached to the uncaughtException event, and complete HTTP responses are still served. Why? Node stopped the offending callback at the point of the exception, so the console.log() call after badLoggingCall() never ran, but the exception was contained within that one code path. The server, and any code outside that path, carried on unaffected.

It’s important to understand the way that listeners are implemented in Node. Let’s take a look at Example 5-18.

Example 5-18. The abbreviated listener code for EventEmitter

EventEmitter.prototype.emit = function(type) {

...

  var handler = this._events[type];

...

  } else if (isArray(handler)) {
    var args = Array.prototype.slice.call(arguments, 1);

    var listeners = handler.slice();
    for (var i = 0, l = listeners.length; i < l; i++) {
      listeners[i].apply(this, args);
    }
    return true;

...

};

After an event is emitted, one of the checks in the runtime handler is to see whether there is an array of listeners. If there is more than one listener, the runtime calls the listeners by looping through the array in order. This means that the first attached listener will be called first with apply(), then the second, and so on. What’s important to note here is that all listeners on the same event are part of the same code path. So an uncaught exception in one callback will stop execution for all other callbacks on the same event. However, an uncaught exception in one instance of an event won’t affect other events.
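To see this behavior directly, here is a minimal sketch (the beep event name is arbitrary) in which the second of three listeners on the same event throws:

var EventEmitter = require('events').EventEmitter;
var emitter = new EventEmitter();

emitter.on('beep', function () { console.log('listener 1'); });
emitter.on('beep', function () { throw new Error('listener 2 broke'); });
emitter.on('beep', function () { console.log('listener 3'); });

process.on('uncaughtException', function (err) {
  console.log('caught: ' + err.message);
});

emitter.emit('beep');
// Prints 'listener 1', then 'caught: listener 2 broke'.
// 'listener 3' never runs: the throw ended that emit()'s code path.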

We also get access to a number of system events through process. When the process receives a signal, it is exposed to Node via events emitted by process. An operating system can generate a lot of POSIX signals, which can be found in the sigaction(2) manpage. A really common one is SIGINT, the interrupt signal. Typically, a SIGINT is what gets sent when you press Ctrl-C in the terminal on a running process. Unless you handle the signal events via process, Node will just perform the default action; in the case of a SIGINT, the default is to kill the process immediately. You can change the default behavior (except for a couple of signals that can never be caught) through the process.on() method (Example 5-19).

Example 5-19. Catching signals to the Node process

// Start reading from stdin so we don't exit.
process.stdin.resume();

process.on('SIGINT', function () {
  console.log('Got SIGINT.  Press Control-D to exit.');
});

To make sure Node doesn’t exit on its own, we read from stdin (described in Operating system input/output) so the Node process continues to run. If you press Ctrl-C while the program is running, the operating system (OS) will send a SIGINT to Node, which will be caught by the SIGINT event handler. Here, instead of exiting, we log to the console.

Interacting with the current Node process

The process object contains a lot of meta-information about the Node process. This can be very helpful when you need to manage your Node environment from within the process. There are a number of properties that contain immutable (read-only) information about Node, such as:

process.version

Contains the version number of the instance of Node you are running.

process.installPrefix

Contains the install path (/usr/local, ~/local, etc.) used during installation.

process.platform

Lists the platform on which Node is currently running. The output will specify the kernel (linux2, darwin, etc.) rather than “Redhat ES3,” “Windows 7,” “OSX 10.7,” etc.

process.uptime()

Returns the number of seconds the process has been running.

There are also a number of things that you can get and set about the Node process. When the process runs, it does so with a particular user and group. You can get and set these with process.getgid(), process.setgid(), process.getuid(), and process.setuid(). These can be very useful for making sure that Node is running in a secure way. It’s worth noting that the set methods accept either a numeric ID or a group/user name. If you pass a name, however, the method performs a blocking lookup to resolve the name to an ID, which takes a little time.
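For instance, here is a minimal sketch of dropping root privileges after startup (it assumes a POSIX system with an unprivileged user and group named nobody; adjust for your system):

// A sketch, not a hardened recipe: shed root privileges once they
// are no longer needed. 'nobody' is an assumed user/group name.
if (process.getuid() === 0) {
  process.setgid('nobody'); // blocking name-to-ID lookup
  process.setuid('nobody');
  console.log('Now running as uid ' + process.getuid());
}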

The process ID, or PID, of the running Node instance is also available as the process.pid property. You can set the title that Node displays to the system using the process.title property. Whatever is set in this property will be displayed in the ps command. This can be extremely useful when you are running multiple Node processes in a production environment. Instead of having a lot of processes called node, or possibly node app.js, you can set names intelligently for easy reference. When one process is hogging CPU or RAM, it’s great to have a quick idea of which one is doing so.
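As a quick sketch (the title string here is arbitrary):

// Report our PID and give this process a recognizable name in ps output.
console.log('This process has PID ' + process.pid);
process.title = 'myapp-worker'; // hypothetical name, for illustration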

Other available information includes process.execPath, which shows the execution path of the current Node binary (e.g., /usr/local/bin/node). The current working directory (to which all files opened will be relative) is accessible with process.cwd(). The working directory is the directory you were in when Node was started. You can change it using process.chdir() (this will throw an exception if the directory is unreadable or doesn’t exist). You can also get the memory usage of the current Node process using process.memoryUsage(). This returns an object specifying the size of the memory usage in a couple of ways: rss shows how much RAM is being used, and vsize shows the total memory used, including both RAM and swap. You’ll also get some V8 stats: heapTotal and heapUsed show how much memory V8 has allocated and how much it is actively using.
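A quick sketch of reading these values (the exact numbers will vary from run to run):

var usage = process.memoryUsage();
console.log('rss: ' + usage.rss + ' bytes of RAM in use');
console.log('heap: ' + usage.heapUsed + ' of ' + usage.heapTotal + ' bytes used');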

Operating system input/output

There are a number of places where you can interact with the OS (besides making changes to the Node process in which the program is running) from process. One of the main ones is having access to the standard OS I/O streams. stdin is the default input stream to the process, stdout is the process’s output stream, and stderr is its error stream. These are exposed with process.stdin, process.stdout, and process.stderr, respectively. process.stdin is a readable stream, whereas process.stdout and process.stderr are writable streams.

process.stdin

stdin is a really useful device for interprocess communication. It’s used to facilitate things such as piping in the shell. When we type cat file.txt | node program.js, it will be the stdin stream that receives the data from the cat command.

Because process is always available, the process.stdin stream is always initialized in any Node process. But it starts out in a paused state, in which Node buffers incoming data but won’t emit data events. Before attempting to read from stdin, call its resume() method (see Example 5-20). Until then, Node will just fill the read buffer for the stream and then stop until you are ready to deal with the data. This approach avoids data loss.

Example 5-20. Writing stdin to stdout

process.stdin.resume();
process.stdin.setEncoding('utf8');

process.stdin.on('data', function (chunk) {
  process.stdout.write('data: ' + chunk);
});

process.stdin.on('end', function () {
  process.stdout.write('end');
});

We ask process.stdin to resume(), set the encoding to UTF-8, and then attach a listener that writes any data received to process.stdout. When process.stdin emits the end event, we write 'end' to the process.stdout stream. We could also easily do this with the stream pipe() method, as in Example 5-21, because stdin and stdout are both real streams.

Example 5-21. Writing stdin to stdout using pipe

process.stdin.resume();
process.stdin.pipe(process.stdout);

This is the most elegant way of connecting two streams.

process.stderr

stderr is used to output exceptions and problems with program execution. On POSIX systems, because it is a separate stream, output logs and error logs can easily be redirected to different destinations. This can be very desirable, but in Node it comes with a couple of caveats. When you write to stderr, Node guarantees that the write will happen. However, unlike a regular stream, this is done as a blocking call. Typically, calls to Stream.write() return a Boolean value indicating whether Node was able to write to the kernel buffer. With process.stderr this will always be true, but the call might take a while to return, unlike a regular write(). Typically it will be very fast, but the kernel buffer may sometimes be full and hold up your program. This means that it is generally inadvisable to write a lot to stderr in a production system, because doing so may block real work.

One final thing to note is that process.stderr is always a UTF-8 stream. Any data you write to process.stderr will be interpreted as UTF-8 without you having to set an encoding. Moreover, you are not able to change the encoding here.
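A minimal sketch of why the separation is useful (assuming a POSIX shell and a script saved as logs.js, a hypothetical filename):

// Run as: node logs.js > out.log 2> err.log
// out.log receives the stdout line; err.log receives the stderr line.
process.stdout.write('normal log line\n');
process.stderr.write('error log line\n'); // always UTF-8; delivery guaranteed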

Another place where Node programmers often touch the operating system is to retrieve the arguments passed when their program is started. argv is an array containing the command-line arguments, starting with the node command itself (see Examples 5-22 and 5-23).

Example 5-22. A simple script outputting argv

console.log(process.argv);

Example 5-23. Running Example 5-22

Enki:~ $ node argv.js -t 3 -c "abc def" -erf       foo.js
[ 'node',
  '/Users/croucher/argv.js',
  '-t',
  '3',
  '-c',
  'abc def',
  '-erf',
  'foo.js' ]
Enki:~ $

There are a few things to notice here. First, the process.argv array is simply a split of the command line based on whitespace; many characters of whitespace between two arguments count as only a single split (the check for whitespace is written as \s+ in a regular expression). This splitting doesn’t apply to whitespace in quotes, however; quotes can be used to keep tokens together. Also, notice how the script’s filename is expanded: you can pass a relative file argument on the command line, and it will appear as its absolute pathname in argv. This is also true for special characters, such as using ~ to refer to the home directory. Only this first argument is expanded this way.

argv is extremely helpful for writing command-line scripts, but it’s pretty raw. There are a number of community projects that extend its support to help you easily write command-line applications, including support for automatically enabling features, writing inline help systems, and other more advanced features.
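Because argv is so raw, even simple flag handling has to be hand-rolled. A minimal sketch (the convention that every flag takes a following value is an assumption of this sketch, not a standard):

// Hand-rolled flag parsing from process.argv.
var args = process.argv.slice(2); // drop 'node' and the script path
var flags = {};
for (var i = 0; i < args.length; i++) {
  if (args[i].charAt(0) === '-') {
    flags[args[i]] = args[i + 1]; // undefined for a trailing flag
    i++;
  }
}
console.log(flags);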

Event loop and tickers

If you’ve done work with JavaScript in browsers, you’ll be familiar with setTimeout(). In Node, we have a much more direct and extremely useful way to access the event loop and defer work: process.nextTick(). This method schedules a callback to be executed on the next “tick,” or iteration, of the event loop. The callbacks are queued, but they run ahead of other queued events. Let’s explore that a little bit in Example 5-24.

Example 5-24. Using process.nextTick( ) to insert callbacks into the event loop

> var http = require('http');
> var s = http.createServer(function(req, res) {
... res.writeHead(200, {});
... res.end('foo');
... console.log('http response');
... process.nextTick(function(){console.log('tick')});
... });
> s.listen(8000);
>
> http response 
tick
http response
tick

This example creates an HTTP server. The request event listener on the server creates a callback using process.nextTick(). No matter how many requests we make to the HTTP server, the “tick” will always occur on the next pass of the event loop. Unlike multiple listeners on a single event, each nextTick() callback is queued independently, so they are not subject to the callback exception brittleness we saw earlier, as shown in Examples 5-25 and 5-26.

Example 5-25. nextTick( ) continues after other code’s exceptions

process.on('uncaughtException', function(e) {
  console.log(e);
});

process.nextTick(function() {
  console.log('tick');
});
process.nextTick(function() {
  iAmAMistake();
  console.log('tock');
});
process.nextTick(function() {
  console.log('tick tock');
});
console.log('End of 1st loop');

Example 5-26. Results of Example 5-25

Enki:~ $ node process-next-tick.js 
End of 1st loop
tick
{ stack: [Getter/Setter],
  arguments: [ 'iAmAMistake' ],
  type: 'not_defined',
  message: [Getter/Setter] }
tick tock
Enki:~ $

Despite the deliberate error, and unlike multiple event callbacks on a single event, each of the ticks is isolated. Let’s walk through the code. First, we set an exception handler to catch any exceptions. Next, we schedule a number of callbacks with process.nextTick(). Each of these callbacks outputs to the console; however, the second contains a deliberate error. Finally, we log a message to the console. When Node runs the program, it evaluates all the main code, which includes outputting 'End of 1st loop'. Then it calls the nextTick() callbacks in order. First 'tick' is output, and then we hit our deliberate mistake, which throws. The error causes process to emit() an uncaughtException event, which runs our function to output the error to the console. Because we threw an error, 'tock' is never output. However, 'tick tock' still is, because every callback passed to nextTick() runs in isolation. You can think of the execution order like this: emit() calls its listeners inline, in the current pass of the event loop; nextTick() callbacks run at the beginning of the next pass, ahead of other events; and other events follow in order on each pass of the event loop.
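A small sketch of that ordering (the event name x is arbitrary); the numbered comments show the order in which the lines print:

var EventEmitter = require('events').EventEmitter;
var e = new EventEmitter();

e.on('x', function () { console.log('1: emit() listeners run inline'); });

process.nextTick(function () { console.log('3: nextTick, ahead of other events'); });
setTimeout(function () { console.log('4: timer event'); }, 0);

e.emit('x'); // runs the listener synchronously
console.log('2: end of the first pass');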

Child Process

The child_process module allows you to create child processes of your main Node process. Because Node has only one event loop in a single process, sometimes it is helpful to create child processes. For example, you might do this to make use of more cores of your CPU, because a single Node process can use only one of the cores. Or, you could use child_process to launch other programs and let Node interact with them. This is extremely helpful when you’re writing command-line scripts.

There are two main methods in child_process. spawn() creates a child process with its own stdin, stdout, and stderr file descriptors. exec() creates a child process and returns the result as a callback when the process is complete. This is an extremely versatile way to create child processes, one that is still nonblocking but doesn’t require you to write extra code to handle the streaming yourself.

All child processes have some common properties. They each contain properties for stdin, stdout, and stderr, which we discussed in Operating system input/output. There is also a pid property that contains the OS process ID of the child. Children emit the exit event when they exit. Other data events are available via the stream methods of the child’s stdin, stdout, and stderr streams.

child_process.exec( )

Let’s start with exec() as the most straightforward use case. Using exec(), you can create a process that will run some program (possibly another Node program) and then return the results for you in a callback (Example 5-27).

Example 5-27. Calling ls with exec( )

var cp = require('child_process');

cp.exec('ls -l', function(e, stdout, stderr) {
  if(!e) {
    console.log(stdout);
    console.log(stderr);
  }
});

When you call exec(), you can pass a shell command for the new process to run. Note that the entire command is a string. If you need to pass arguments to the shell command, they should be constructed into the string. In the example, we passed ls the -l argument to get the long form of the output. You can also include complicated shell features, such as | to pipe commands. Node will return the results of the final command in the pipeline.
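As a quick sketch of the pipeline behavior (the exact commands are incidental; this assumes a POSIX shell):

var cp = require('child_process');

// Count the entries in the current directory via a shell pipeline;
// only the final command's output (from wc -l) comes back in stdout.
cp.exec('ls | wc -l', function (e, stdout, stderr) {
  if (!e) {
    console.log('entries: ' + stdout.trim());
  }
});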

The callback function receives three arguments: an error object, the result of stdout, and the result of stderr. Notice that just calling ls will run it in the current working directory of Node, which you can retrieve by running process.cwd().

It’s important to understand the difference between the first and third arguments. The error object returned will be null unless an error status code is returned from the child process or there was another exception. When the child process exits, it passes a status up to the parent process. In Unix, for example, this is 0 for success and an 8-bit number greater than 0 for an error. The error object is also used when the command called doesn’t meet the constraints that Node places on it. When an error code is returned from the child process, the error object will contain the error code and stderr. However, when a process is successful, there may still be data on stderr.
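Here is a sketch distinguishing the two cases (assuming a path such as /no/such/dir that doesn’t exist):

var cp = require('child_process');

cp.exec('ls /no/such/dir', function (e, stdout, stderr) {
  if (e) {
    // The child exited with a nonzero status; e.code carries it.
    console.log('exit code: ' + e.code);
  }
  // stderr can carry data whether or not the command failed.
  console.log('stderr: ' + stderr);
});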

exec() takes an optional second argument with an options object. By default, this object contains the properties shown in Example 5-28.

Example 5-28. Default options object for child_process.exec( )

var options = { encoding: 'utf8',
                timeout: 0,
                maxBuffer: 200 * 1024,
                killSignal: 'SIGTERM',
                setsid: false,
                cwd: null,
                env: null };

The properties are:

encoding

The encoding for passing characters on the I/O streams.

timeout

The number of milliseconds the process can run before Node kills it.

killSignal

The signal to use to terminate the process in case of a timeout or buffer-size overrun.

maxBuffer

The maximum number of bytes that stdout or stderr each may grow to (the default of 200 * 1024 is 200KB).

setsid

Whether to create a new session inside Node for the process.

cwd

The initial working directory for the process (where null uses Node’s current working directory).

env

The environment variables for the new process. When this is null, the child inherits the parent’s environment.
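As a quick sketch of the cwd and env options (assuming a POSIX shell; /tmp and GREETING are arbitrary choices for illustration):

var cp = require('child_process');

// Run in /tmp with a custom environment variable set.
cp.exec('pwd && echo $GREETING',
        { cwd: '/tmp', env: { GREETING: 'hello' } },
        function (e, stdout, stderr) {
  if (!e) {
    console.log(stdout); // '/tmp' on one line, 'hello' on the next
  }
});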

Let’s set some of the options to put constraints on a process. First, let’s try restricting the Buffer size of the response, as demonstrated in Example 5-29.

Example 5-29. Restricting the Buffer size on child_process.exec( ) calls

> var child = cp.exec('ls', {maxBuffer:1}, function(e, stdout, stderr) {
... console.log(e);
... }
... );
> { stack: [Getter/Setter],
  arguments: undefined,
  type: undefined,
  message: 'maxBuffer exceeded.' }

In this example, you can see that when we set a tiny maxBuffer (just 1 byte), running ls instantly exhausted the available space and produced an error. It’s important to check for errors so that you can deal with them in a sensible way. You don’t want to cause an actual exception by trying to access resources that are unavailable because you’ve restricted the child process. If the child process returns with an error, its stdin and stdout properties will be unavailable, and attempts to access them will throw an exception.

It’s also possible to stop a child process after a set amount of time, as shown in Example 5-30.

Example 5-30. Setting a timeout on child_process.exec( ) calls

> var child = cp.exec('for i in {1..100000};do echo $i;done',
... {timeout:500, killSignal:'SIGKILL'},
... function(e, stdout, stderr) {
...   console.log(e);
... });
> { stack: [Getter/Setter], arguments: undefined, type: undefined, message: ... }

This example defines a deliberately long-running process (counting from 1 to 100,000 in a shell script), but we also set a short timeout. Notice that we also specified a killSignal. By default, the kill signal is SIGTERM, but we used SIGKILL to show the feature.[14] When we get the error back, notice there is a killed property that tells us that Node killed the process and that it didn’t exit voluntarily. This is also true for the previous example. Because it didn’t exit on its own, there isn’t a code property or some of the other properties of a system error.

child_process.spawn( )

spawn() is very similar to exec(). However, it is a more general-purpose method that requires you to deal with streams and their callbacks yourself. This makes it a lot more powerful and flexible, but it also means that more code is required to do the kind of one-shot system calls we accomplished with exec(). This means that spawn() is most often used in server contexts to create subcomponents of a server and is the most common way people make Node work with multiple cores on a single machine.

Although it performs the same function as exec(), the API for spawn() is slightly different (see Examples 5-31 and 5-32). The first argument is still the command to start the process with, but unlike exec(), it is not a command string; it’s just the executable. The process’s arguments are passed in an array as the (optional) second argument to spawn(). It’s like the inverse of process.argv: instead of a command string being split() on spaces, you provide the arguments already separated into an array.

Finally, spawn() also takes an options object as the final argument. Some of these options are the same as for exec(), but we’ll cover them in more detail shortly.
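As a quick sketch of the argument-array form (mirroring the ls -l call from Example 5-27):

var cp = require('child_process');

// 'ls' is the executable; its arguments travel separately, as an array.
var ls = cp.spawn('ls', ['-l']);

ls.stdout.on('data', function (d) {
  process.stdout.write(d);
});
ls.on('exit', function (code) {
  console.log('ls exited with code ' + code);
});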

Example 5-31. Starting child processes using spawn( )

var cp = require('child_process');

var cat = cp.spawn('cat');

cat.stdout.on('data', function(d) {
  console.log(d.toString());
});
cat.on('exit', function() {
  console.log('kthxbai');
});

cat.stdin.write('meow');
cat.stdin.end();

Example 5-32. Results of previous example

Enki:~ $ node cat.js 
meow
kthxbai
Enki:~ $

In this example, we’re using the Unix program cat, which simply echoes back whatever input it gets. You can see that, unlike with exec(), we don’t pass a callback to spawn() directly. That’s because we are expecting to use the streams provided by the Child class to get and send data. We assigned the child instance to the variable cat, so we can access cat.stdout to set events on the stdout stream of the child process. We set a listener on cat.stdout to watch for any data events, and we set a listener on the child itself in order to watch for the exit event. We can send our new child data via stdin by accessing its child.stdin stream, which is just a regular writable stream. However, as a behavior of the cat program, when we close stdin, the process exits. This might not be true for all programs, but it is true for cat, which exists only to echo back data.

The options that can be passed to spawn() aren’t exactly the same as exec(). This is because you are expected to manage more things by hand with spawn(). The env, setsid, and cwd properties are all options for spawn(), as are uid and gid, which set the user ID and the group ID, respectively. Like process, setting the uid or the gid to a username or a group name will block briefly while the user or group is looked up. There is one more option for spawn() that doesn’t exist for exec(): you can set custom file descriptors that will be given to the new child process. Let’s take some time to cover this topic because it’s a little complex.

A file descriptor in Unix is a way of keeping track of which programs are doing what with which files. Because Unix lets many programs run at the same time, there needs to be a way to make sure that when they interact with the filesystem they don’t accidentally overwrite someone else’s changes. The file descriptor table keeps track of all the files that a process wants to access. The kernel might lock a particular file to stop two programs from writing to the file at the same time, as well as other management functions. A process will look at its file descriptor table to find the file descriptor representing a particular file and pass that to the kernel to access the file. The file descriptor is simply an integer.

The important thing is that the name “file descriptor” is a little deceptive because it doesn’t represent only pure files; network and other sockets are also allocated file descriptors. Unix has interprocess communications (IPC) sockets that let processes talk to each other. We’ve been calling them stdin, stdout, and stderr. This is interesting because spawn() lets us specify file descriptors when starting a new child process. This means that instead of the OS assigning a new file descriptor, we can ask child processes to share an existing file descriptor with the parent process. That file descriptor might be a network socket to the Internet or just the parent’s stdin, but the point is that we have a powerful way of delegating work to child processes.

How does this work in practice? When passing the options object to spawn(), we can specify customFds to hand our own three file descriptors to the child, which uses them as its stdin, stdout, and stderr instead of having new file descriptors created (Examples 5-33 and 5-34).

Example 5-33. Passing stdin, stdout, and stderr to a child process

var cp = require('child_process');

var child = cp.spawn('cat', [], {customFds:[0, 1, 2]});

Example 5-34. Running the previous example and piping in data to stdin

Enki:~ $ echo "foo"
foo
Enki:~ $ echo "foo" | node

readline.js:80
    tty.setRawMode(true);
        ^
Error: ENOTTY, Inappropriate ioctl for device
    at new Interface (readline.js:80:9)
    at Object.createInterface (readline.js:38:10)
    at new REPLServer (repl.js:102:16)
    at Object.start (repl.js:218:10)
    at Function.runRepl (node.js:365:26)
    at startup (node.js:61:13)
    at node.js:443:3
Enki:~ $ echo "foo" | cat
foo
Enki:~ $ echo "foo" | node fds.js 
foo
Enki:~ $

The file descriptors 0, 1, and 2 represent stdin, stdout, and stderr, respectively. In this example, we create a child and pass it stdin, stdout, and stderr from the parent Node process. We can test this wiring using the command line. The echo command outputs a string “foo.” If we pass that directly to node with a pipe (stdout to stdin), we get an error. We can, however, pass it to the cat command, which echoes it back. Also, if we pipe to the Node process running our script, it echoes back. This is because we’ve hooked up the stdin, stdout, and stderr of the Node process directly to the cat command in our child process. When the main Node process gets data on stdin, it gets passed to the cat child process, which echoes it back on the shared stdout. One thing to note is that once you wire up the Node process this way, the child process loses its child.stdin, child.stdout, and child.stderr file descriptor references. This is because once you pass the file descriptors to the process, they are duplicated and the kernel handles the data passing. Consequently, Node isn’t in between the process and the file descriptors (FDs), so you cannot add events to those streams (see Examples 5-35 and 5-36).

Example 5-35. Trying to access file descriptor streams fails when custom FDs are passed

var cp = require('child_process');
var child = cp.spawn('cat', [], {customFds:[0, 1, 2]});
child.stdout.on('data', function(d) {
  console.log('data out');
});

Example 5-36. Results of the test

Enki:~ $ echo "foo" | node fds.js 

node.js:134
        throw e; // process.nextTick error, or 'error' event on first tick
 foo
       ^
TypeError: Cannot call method 'on' of null
    at Object.<anonymous> (/Users/croucher/fds.js:3:14)
    at Module._compile (module.js:404:26)
    at Object..js (module.js:410:10)
    at Module.load (module.js:336:31)
    at Function._load (module.js:297:12)
    at Array.<anonymous> (module.js:423:10)
    at EventEmitter._tickCallback (node.js:126:26)
Enki:~ $

When custom file descriptors are specified, the streams are literally set to null and are completely inaccessible from the parent. It is still preferable in many cases, though, because routing through the kernel is much faster than using something like stream.pipe() with Node to connect the streams together. However, stdin, stdout, and stderr aren’t the only file descriptors worth connecting to child processes. A very common use case is connecting network sockets to a number of children, which allows for multicore utilization.

Say we are creating a website, a game server, or anything that has to deal with a bunch of traffic. We have this great server that has a bunch of processors, each of which has two or four cores. If we simply started a Node process running our code, we’d have just one core being used. Although CPU isn’t always the limiting factor for Node, we want to be able to use as much of the machine’s capacity as we can. We could start a bunch of Node processes on different ports and load-balance them with Nginx or Apache Traffic Server, but that’s inelegant and requires us to run more software. We could instead create a Node process that spawns a bunch of child processes and routes all the requests to them. This is a bit closer to our optimal solution, but it turns that one routing process into a single point of failure bearing all the traffic, which isn’t ideal. This is where passing custom FDs comes into its own. In the same way that we can pass the stdin, stdout, and stderr of a master process, we can create other sockets and pass those in to child processes. Because we are passing file descriptors instead of messages, the kernel deals with the routing. This means that although the master Node process is still required, it isn’t bearing the load for all the traffic.



[14] SIGKILL can be invoked in the shell through kill -9.
