Introduction to Node.js

Introduction

What is the need?

Node.js is an open-source and cross-platform JavaScript runtime environment. It is a popular tool for many different types of projects, in particular for implementing server-based applications such as HTTP servers. Node.js runs the V8 JavaScript engine, the core of Google Chrome, outside of the browser. This allows Node.js to be very performant.

Node.js has a unique advantage because developers that write JavaScript for the browser are now able to write the server-side code in addition to the client-side code without the need to learn a completely different language. Node.js is thus a very good candidate for implementing an OpenAPI specification on the server side.

In addition to the points mentioned above, there are code generation tools that can generate an Express Node.js application from an OpenAPI specification document.

What you’ll build

You will implement your first Node.js application. For a better understanding of the Node.js programming style, you will also build some Node.js applications that involve asynchronous mechanisms. You will in particular implement some basic mechanism using Promises and the async/await mechanism.

What you’ll learn

How to set up, develop and run an Express Node.js application.
The basic JavaScript principles that you need to understand and use for your Node.js application.
The Node.js programming style and asynchronous mechanisms for writing efficient Node.js applications.

This codelab demonstrates how to build the application in consecutive steps. Some code blocks are provided for you to simply copy and paste.

What you’ll need

WebStorm.
The Node.js runtime (also bundled with WebStorm).
Some prior knowledge of JavaScript and Node.js programming, based on exercise sessions.

Node.js setup and basic principles

Since we are using WebStorm as the IDE for the development of our project, Node.js comes pre-installed as a WebStorm bundle. This codelab was written using Node.js 17.1.0.

More information on how to use Node.js with WebStorm can be found on Node.js | WebStorm. In particular, you may find information on how to use another version of Node.js installed on your system.

**Creating your first Node.js application**

You may use instructions under Node.js | WebStorm for creating your first Node.js application.

After creating the new Node.js application, your project should look like this

At this stage, your project does not yet contain any application. This will be created in the next steps. Your project however contains a file named “package.json”. The “package.json” file acts like a manifest for your project and it contains a wide range of information about your project. Among important things, we may mention:

As its name suggests, it contains a JSON object with various fields.
It contains “name”, “version”, “description”, “keywords”, “author” and “license” properties, which are metadata about the project.
The “main” field sets the entry point for the application.
The “scripts” field defines a set of node scripts you can run.

When using other modules, the “package.json” file will also contain:

A “dependencies” property as a list of npm packages installed as dependencies
A “devDependencies” property as a list of npm packages installed as development dependencies.

Both fields tell npm and yarn about the names and versions for all the installed packages. More information will be provided later on npm and related topics.

As already mentioned, at this point you still need to create the “index.js” file (as declared in the “package.json” file). This is the JavaScript file that will be executed upon starting your Node.js application. It is now time to write your first simple Node.js application, as shown below:

index.js

const http = require('http');

const hostname = '127.0.0.1';
const port = 3000;

const server = http.createServer((req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('Hello, World!\n');
});

server.listen(port, hostname, () => {
  console.log(`Server running at http://${hostname}:${port}/`);
});

If you add this file to your project and run the application (by right-clicking on the “index.js” file and choosing “Run ‘index.js’”), you should be able to open “http://localhost:3000” in a new browser tab and the following should appear on the screen:

**Understand the JavaScript code of your first application**

You just created your first Node.js JavaScript application. Congratulations!

But it is even better if you understand what was done! Here are a few explanations:

In JavaScript, variables are dynamically typed. You should understand this important difference compared to languages like Java, which are statically typed.
The JavaScript script above starts by creating 3 variables with the const keyword. In JavaScript, variables are not declared with a type keyword (since the type is established dynamically at runtime), but rather with one of the let, const or var keywords. These keywords differ mainly in the scope of the variable (var is function-scoped, while let and const are block-scoped) and in whether the variable can be reassigned (a const variable cannot). Scope is a very important concept that you must understand.
The first variable uses the CommonJS module system to declare a new object named http, which gives access to the Node.js HTTP server implementation. “http.js” is one of the core modules distributed natively with Node.js and you may find its definition in the folder where core Node.js modules are stored. In the “http.js” file, you may observe that the createServer function is exported and can thus be used by scripts importing the “http.js” module.
As its name suggests, the http.createServer function allows you to create an HTTP server (line 6). The function takes a callback function as an argument. As you can observe, the callback function takes two arguments, one representing the request and the other one the response. It is called upon each request to the HTTP server. This callback function is specified using the arrow function notation. This notation is somewhat similar to lambda expressions in Java.
The createServer() call returns a new instance of http.Server that is stored in the variable named server.
On line 12, the listen() method is called on the server object with port, hostname and callback function arguments. The callback function is called once the server is bound to the given port and host and has started listening for connections. This callback function is again specified using the arrow function notation.
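
The points about dynamic typing and the let, const and var keywords can be illustrated with a minimal sketch. All names below (value, fixedPort, scopeDemo) are made up for this illustration and are not part of the application above.

```javascript
'use strict';

// Dynamic typing: the same variable may hold values of different types.
let value = 42;
console.log(typeof value); // number
value = 'forty-two';
console.log(typeof value); // string

// A const binding cannot be reassigned: the attempt throws a TypeError.
const fixedPort = 3000;
let reassignError = null;
try {
  fixedPort = 4000;
} catch (err) {
  reassignError = err;
}
console.log(reassignError instanceof TypeError); // true

// var is function-scoped, let (and const) are block-scoped.
function scopeDemo() {
  if (true) {
    var functionScoped = 'declared with var';
    let blockScoped = 'declared with let';
  }
  // typeof does not throw for a name that is not in scope.
  return {
    varVisible: typeof functionScoped !== 'undefined',
    letVisible: typeof blockScoped !== 'undefined',
  };
}
console.log(scopeDemo()); // { varVisible: true, letVisible: false }
```

The variable declared with var is still visible after the if block, whereas the one declared with let is not.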

As you have probably noticed, the explanations above contain many links. It is recommended that you follow these links carefully and understand the explanations given in other codelabs or websites. An in-depth understanding of the concepts presented here is essential for the remainder of this codelab and for the following ones.

**Expose functionality from a Node.js file using exports**

As in other programming languages, it is good practice to modularize software development and to develop functionalities in separate files/modules that can be used by other files/modules. Node.js implements such a built-in module system, as briefly described previously.

A Node.js file can import functionalities exposed by other Node.js files. Importing is done using the require statement as in

index.js

const http = require('http');

where the http functionality is imported from the “http.js” file. This means that there exists a Node.js file named “http.js” located in the current directory or in one of the module folders (in this case, the core-modules folder). In the “http.js” file, the functionality exposed outside of the file must be exported. Note that all objects or variables defined in a Node.js file are private by default and thus not visible outside the file. Exporting a functionality is done with the module.exports API offered by the CommonJS module system.

For exporting an object or a function, you need to assign it as a new exports property, as shown below

http.js

...

module.exports = {
  ...
  Server,
  ...
  createServer,
  ...
};

In this file, the Server and createServer are exported and can be used in other files like in our “index.js” file.

Node.js Programming Style

Node.js uses a non-blocking asynchronous coding style. The reason is that JavaScript code runs on a single thread that is driven by an event loop mechanism.

For illustrating what blocking vs. non-blocking means, we may use the bank and coffee shop analogy described on Node Basic Concepts. This analogy explains how customers (functions in your code) can be served synchronously or asynchronously. Node.js uses the coffee shop model, where a single thread of execution (in analogy to a single teller) runs all of your JavaScript code, and you provide callbacks to wait for results (in analogy to your name being called when your order is ready).

It is interesting to note that other software stacks like the Apache web server use the bank model. In this case, more threads (in analogy to more tellers) are created to scale and serve several requests simultaneously. Both blocking and non-blocking models, of course, can achieve horizontal scalability by adding more servers (in analogy to more shops or banks).
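
The difference between the two models can be sketched with the fs core module, which offers both blocking and non-blocking calls. This is a minimal illustration, not part of the codelab application: the temporary file, its content and the events array are assumptions made for the example.

```javascript
'use strict';

const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a small temporary file so that the sketch is self-contained.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'codelab-'));
const file = path.join(dir, 'order.txt');
fs.writeFileSync(file, 'one coffee');

const events = [];

// Blocking call (bank model): the script waits until the data is available.
const content = fs.readFileSync(file, 'utf8');
events.push(`sync read returned: ${content}`);

// Non-blocking call (coffee shop model): the call returns immediately and
// the callback is invoked later, once the data is ready.
fs.readFile(file, 'utf8', (err, data) => {
  if (err) throw err;
  events.push(`async callback received: ${data}`);
  console.log(events);
  fs.rmSync(dir, { recursive: true, force: true }); // clean up
});
events.push('readFile() returned, still waiting for the callback');
```

The printed array shows that the line following readFile() is recorded before the callback, although the callback appears first in the source code.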

The Node.js programming style is illustrated with more examples in the JavaScript lessons.

**The Node.js Event Loop**

As browsers do, Node.js also runs an Event Loop, which is a very important concept in both cases. It is important to understand that the Node.js JavaScript code runs on a single thread. This simplifies the way developers write Node.js applications, without worrying about concurrency issues. The same applies for browsers, where every browser tab runs an isolated process with a single thread.

The event loop enables Node’s non-blocking I/O model, which is the key to Node’s ability to scale as illustrated in the previous section. Writing stable and scalable Node.js applications requires a good understanding of the event loop mechanism. This mechanism is illustrated in the picture below (reproduced from Node.js Event Loop).

It is useful to understand the way any Node.js application will work with the help of this diagram and some examples. For an understanding of the execution of the examples shown below, it is important to point out the following points:

When launching a Node.js application by specifying the script to run, the application is started and is handed to the V8 JavaScript engine. A Node.js thread is created for running your application code.
The script is executed from top to bottom. All blocking calls are executed immediately.
After the execution of all blocking calls in the script, the event loop is started if any non-blocking call was made in the script, otherwise the application terminates.

For an in-depth understanding, it is easier to start with an application that does not involve the event loop mechanism. In this example, there is no non-blocking call, so the Node.js application terminates after the execution of all blocking calls.

taskA_taskB.js

'use strict';

function taskA() {
  console.log('taskA executed at :' + Date.now().toString());
}
function taskB() {
  console.log('taskB executed at :' + Date.now().toString());
}
taskA();
taskB();
console.log("all tasks have executed -> exiting application");

If you run this application (by invoking node taskA_taskB.js), you should observe the following output on the console:

As you can observe, the taskA and taskB functions are executed and the Node.js application is terminated.

If we introduce a single non-blocking call in our application by modifying the taskB() call as illustrated below, we then observe a different behavior:

taskA_settimeoutB.js

'use strict';

function taskA() {
  console.log('taskA executed at :' + Date.now().toString());
}
function taskB() {
  console.log('taskB executed at :' + Date.now().toString());
}
taskA();
setTimeout(taskB, 100);
console.log("all tasks have executed -> exiting application");

With the introduction of the non-blocking setTimeout() call, the event loop was started and the following steps were executed:

Step 1: taskA() is executed
Step 2: setTimeout() is called and a timer is set with a callback function to be called once the timeout has expired. Note that at this stage, taskB() is not called!
Step 3: console.log() is called. It is executed immediately and this is the reason why the message is printed before taskB’s message.
Step 4: the Node.js event loop will start running. It will continue running as long as the timer is active. Once the timeout of the setTimeout() function expires, the callback function, taskB() in our case, is called.
Step 5: since there is no other pending request, the Node.js event loop is exited and the application terminates.
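
The order of the steps above does not depend on the length of the delay. The following sketch, which uses an invented order array to record the sequence, shows that even with a delay of 0 the callback only runs after the script has been executed to its end.

```javascript
'use strict';

// Records the order in which the different parts are executed.
const order = [];

function taskB() {
  order.push('taskB');
  console.log(order.join(' -> '));
}

order.push('taskA');
// Even with a delay of 0, taskB is only called by the event loop,
// that is, after all synchronous statements of the script have run.
setTimeout(taskB, 0);
order.push('end of script');
```

Running this script prints taskA -> end of script -> taskB.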

When creating a Node.js server application, you don’t want the application to exit once it has executed all calls in your main application script file. In the “index.js” file shown in the previous section, this behavior is realized with the call to server.listen() as the last statement of your script file. If you comment out this statement, you may observe that your Node.js application exits immediately after the creation of the Server object. You may also observe that the call to listen() is non-blocking by adding a console.log() statement after the call to listen().

Without going into too much detail, we may explain this behavior by the fact that the event loop always has pending events and will thus never finish its execution. We can illustrate this with the following simple example:

taskA_taskB_noexit.js

'use strict';

function taskA() {
  console.log('taskA executed at :' + Date.now().toString());
}
function taskB() {
  console.log('taskB executed at :' + Date.now().toString());
  setTimeout(() => taskB(), 1000);
}
taskA();
taskB();

console.log('we get here but the application will not exit');

In this example, taskB() reschedules itself upon each execution. In this way, there will always be a timer active and the Node.js event loop will execute forever. The behavior of this application is shown below

Note that the behavior of the application where taskB reschedules itself and the one where taskB recalls itself as shown below is very different. It is important that you understand what the differences are in terms of call stack and event loop.

taskB_callitself.js

function taskA() {
  console.log('taskA executed at :' + Date.now().toString());
}
function taskB() {
  console.log('taskB executed at :' + Date.now().toString());
  taskB();
}
taskA();
taskB();

console.log('we get here but the application will not exit');
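
As a hint for comparing the two variants, the sketch below contrasts a function that calls itself directly with one that reschedules itself. The names callItself, rescheduleItself, depth and runs are invented for this illustration, and the rescheduling is limited to three runs so that the sketch terminates. With direct recursion, every call adds a frame to the call stack and control never returns to the event loop, so the engine eventually throws a RangeError. In the script above this error is not caught, so the last console.log() statement is not expected to be reached.

```javascript
'use strict';

// Variant 1: direct recursion. Each call adds a frame to the call stack.
let depth = 0;
function callItself() {
  depth += 1;
  callItself();
}

let caught = null;
try {
  callItself();
} catch (err) {
  caught = err; // the call stack overflowed
}
console.log(caught instanceof RangeError); // true

// Variant 2: rescheduling. The function returns before the next run, so
// the call stack is empty again each time the event loop calls it.
let runs = 0;
function rescheduleItself() {
  runs += 1;
  if (runs < 3) {
    setTimeout(rescheduleItself, 10);
  } else {
    console.log(`rescheduled runs: ${runs}`);
  }
}
rescheduleItself();
```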

Deeper into the Event Loop mechanism

A very detailed description of the event loop mechanism is given by the Node.js developers here.

An intuitive presentation of the mechanism is given in this video and an interactive tool where you can enter simple code snippets is available here.

**Promises in Node.js applications**

Promises are very useful for Node.js applications as well.

Useful examples regarding asynchronous I/O operations are given in the JavaScript lessons.

A useful example is the fs built-in Node.js API. This API gives access to the file system and is provided in both synchronous and asynchronous forms. The asynchronous API can also be used with Promises, for instance through fs.promises or util.promisify.
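
A minimal sketch of the Promise-based use of fs, combined with async/await, is given here. The countLines function, the temporary file and its content are assumptions made for this illustration only.

```javascript
'use strict';

const fs = require('fs');
const os = require('os');
const path = require('path');

// Reads a text file with the Promise-based fs API and returns a Promise
// that resolves with the number of non-empty lines.
async function countLines(filePath) {
  const text = await fs.promises.readFile(filePath, 'utf8');
  return text.split('\n').filter((line) => line.length > 0).length;
}

// Create a small temporary file so that the sketch is self-contained.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codelab-'));
const tmpFile = path.join(tmpDir, 'sample.txt');
fs.writeFileSync(tmpFile, 'first\nsecond\nthird\n');

// Remove the temporary directory once the Promise has settled.
const result = countLines(tmpFile).finally(() => {
  fs.rmSync(tmpDir, { recursive: true, force: true });
});

result
  .then((count) => console.log(`lines: ${count}`))
  .catch((err) => console.error('read failed:', err.message));
```

The call to countLines() returns immediately with a pending Promise. The result is only available in the then() handler, or after an await inside another async function.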

Intermediate Recap

Before going further in this codelab, it is important that you make sure to have understood the following concepts:

The way a Node.js application is started and how the calls in the script file are executed.
The difference between blocking/synchronous calls and non-blocking/asynchronous calls.
The basic principles of the event loop.

You should thus be able to explain the flow of execution and events of a simple Node.js application with both synchronous and asynchronous calls.

Node.js Streams

As we have seen in the previous sections, Node.js provides ways to perform I/O operations in synchronous and asynchronous modes. One limitation of the solutions demonstrated in these sections is that the data is read entirely from the file into memory before it can be processed. Of course, it is possible to perform read operations in loops and to process the data in chunks. However, Node.js also supports the concept of streams, as other languages like Java do.

Streams are a concept that was introduced in the Unix operating system, in which programs can interact with each other by passing data through the pipe operator ( | ). Using streams, rather than reading a file entirely into memory and then processing it, you read it piece by piece and process it without keeping the entire file content in memory. In Node.js, the stream module provides this functionality. Based on this module, Node.js provides efficient ways of reading and writing files, communicating over networks and handling other information exchanges using streams.

Before proceeding further in this codelab, you should read and understand the related tasks in the JavaScript Asynchronous Programming lesson.

Using the Node.js Stream module and other Node.js core modules, developers can use stream handling capabilities on native Node.js components like stdin, stdout, fs, net, http or zlib. The Node.js Stream module defines four different types of streams:

Readable: a stream you can pipe from, but not pipe into. When you push data into a readable stream, it is buffered until a consumer starts to read the data.
Writable: a stream you can pipe into, but not pipe from. One can push data into a writable stream.
Duplex: a stream you can both pipe into and pipe from.
Transform: a stream you can both pipe into and pipe from, but where the output is a transform of the input.
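
The following sketch connects a Readable, a Transform and a Writable stream using the pipeline function of the stream module. The stream instances (source, upperCase, sink) and the data are invented for this illustration. Object mode is used so that each string is handled as one chunk.

```javascript
'use strict';

const { Readable, Transform, Writable, pipeline } = require('stream');

// Readable: a source that produces three chunks.
const source = Readable.from(['a', 'b', 'c']);

// Transform: the output is a transformation of the input.
const upperCase = new Transform({
  objectMode: true,
  transform(chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  },
});

// Writable: a sink that collects everything it receives.
const received = [];
const sink = new Writable({
  objectMode: true,
  write(chunk, encoding, callback) {
    received.push(chunk);
    callback();
  },
});

// pipeline() pipes the streams together and reports errors in one place.
pipeline(source, upperCase, sink, (err) => {
  if (err) {
    console.error('pipeline failed:', err.message);
    return;
  }
  console.log(received.join(',')); // A,B,C
});
```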

Creating Streams

Developers can use the streams provided by the native Node.js components or Node.js modules, but they can also create their own stream objects. Creating a stream involves steps which may be slightly different depending on the type of streams. There are also different ways of defining your own stream object. One is to use the concept of JavaScript classes and to extend the base classes. In this case, one or more methods must be redefined in the inheriting class, depending on the type of stream. The class below demonstrates the definition of a Transform stream that receives lines of data and that pushes the same lines without the ‘\n’ character to its output. This example is given without further detail and you need to understand it for completing the deliverables of this codelab.

linereader.js

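const { Transform } = require('stream');

// Transform stream that splits the incoming data into lines and pushes
// each line, without its trailing '\n', to its output.
class LineReader extends Transform {
  constructor() {
    super({ objectMode: true });
    // Last, possibly incomplete, line of the previous chunk.
    this._lastLineData = null;
  }

  _transform(chunk, encoding, callback) {
    let data = chunk.toString();
    // Prepend the incomplete line left over from the previous chunk.
    if (this._lastLineData) {
      data = this._lastLineData + data;
    }

    const lines = data.split('\n');
    // The last element may be an incomplete line: keep it for the next chunk.
    this._lastLineData = lines.splice(lines.length - 1, 1)[0];
    lines.forEach((line) => this.push(line));
    callback();
  }

  // Called when the input has ended: push the remaining line, if any.
  _flush(done) {
    if (this._lastLineData) {
      this.push(this._lastLineData);
      this._lastLineData = null;
    }
    done();
  }
}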

Wrap up

As a wrap up of this codelab, you are asked to deliver two simple Node.js applications that perform the following:

The application uses the data file available here data.csv.
The content of this file is made of CSV data, with two columns. The first column contains a date in the YYYY-MM-DDTHH:MM:SS.MMMZ format and the second column contains a floating point number.
Both applications must extract all numbers contained in the second column and filter out all values smaller than 90. This content must be delivered to the HTTP client upon connection. The first application must be implemented using Promises without streams, while the second application must be implemented using streams without Promises.
Your implementation must not use any other Node.js module, such as a CSV parser module.
The values displayed in the window are illustrated in the figure below:

Exercises

You may find some exercises related to Node.js here.