The Buffer class in Node.js is designed to handle raw binary data. Each buffer corresponds to some raw memory allocated outside the V8 engine. It is a mechanism for handling streams of binary data for processing or reading.
The statements above may sound confusing to a Node.js beginner and even to some intermediate JS developers. In my experience as a backend developer, integrating my API with clients written in other languages like Java, Kotlin, C and C++ can be a pain in the butt, especially when implementing two-way encryption algorithms (AES, RSA or Triple DES) or end-to-end binary data communication.
So in this post I will try to break down the concepts of binary data, Buffers and Streams in Node.js in the best way I can, and hopefully it saves you from unnecessary blockers in the near future or even right now.
Binary Data
Binary is simply a collection of 0s and 1s. In the regular number system, it means a number in base two. Binary is how the computer represents data in memory. This means that whether we write our Node.js code in vanilla JavaScript, ES6 or TypeScript, the V8 engine ultimately compiles it to machine code (binary) and executes our program.
Node.js is a server-side JavaScript runtime environment that runs on the V8 engine.
Let's break that statement down: Node.js is a runtime environment for your JS code outside the browser, and it is built on Google's V8 engine, which compiles and executes JavaScript. For example:
let a = 8;
Here our variable "a" (a location in memory) holds an integer value of 8 (1000 in binary). The same concept applies to string values, except that each character in a string is represented in memory by its Unicode value. For example:
console.log("b".charCodeAt(0));
console.log("B".charCodeAt(0));
// output
// 98
// 66
From the code snippet above, "b" and "B" are totally different values in memory because they have different Unicode values. JavaScript itself keeps strings as UTF-16 internally, but when Node.js turns a string into bytes (for example in a Buffer or when writing a file) it uses the UTF-8 encoding by default, so each character ends up as one or more bytes (8 bits each). To picture a single byte, take our integer again:
let a = 8;
// written as a single byte (8 bits), the value 8 is 00001000
UTF-8 (8-bit Unicode Transformation Format) is an encoding that represents each character as a sequence of one to four bytes (8-bit binary values).
These are basic data types in JS, but the same idea applies when handling files, pictures and videos: in the end everything is bytes.
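To make the difference between characters and bytes a bit more concrete, here is a minimal sketch. The sample strings are my own picks for illustration; it only uses Buffer.from and Buffer.byteLength to peek at the UTF-8 bytes behind a string.

```javascript
// Sample strings chosen for illustration only
const ascii = "b";
const accented = "\u00e9"; // the character "é"

// Buffer.from encodes a string into bytes (UTF-8 is the default encoding)
const asciiBytes = Buffer.from(ascii, "utf8");
const accentedBytes = Buffer.from(accented, "utf8");

console.log(asciiBytes); // <Buffer 62>
console.log(accentedBytes); // <Buffer c3 a9>

// One character is not always one byte
const charCount = accented.length;
const byteCount = Buffer.byteLength(accented, "utf8");
console.log(charCount); // 1
console.log(byteCount); // 2

// The integer 8 written out as 8 binary digits
const eightInBinary = (8).toString(2).padStart(8, "0");
console.log(eightInBinary); // 00001000
```

Notice that 0x62 is just 98 in hexadecimal, the same value charCodeAt gave us for "b".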
Streams
Now that we understand how data is stored in memory as binary, let's have a look at Streams in Node.js and then finally Buffers. Streams are simply how a sequence of data moves from one point to another. That sentence is pretty straightforward but loaded with a lot of information. Node.js executes our JS code asynchronously at runtime, meaning it does not have to sit idle waiting for a slow operation (like reading a file) to complete before it moves on to other work. For example, when we write a program that reads an image from a directory and returns it to a client as a response, or one that streams video, we do not need to wait until the whole read operation is done before processing the data. Node.js reads the file as a stream (in chunks) until it is done with the operation. Let me explain with a code snippet for better understanding.
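// import the Node.js built-in file system module
const fs = require("fs");
// import the Node.js built-in stream module
const stream = require("stream");
// imagepath is a variable (a string holding our file's location on disk)
const readStream = fs.createReadStream(imagepath);
// chunks are collected here, then joined into outputImage when reading ends
const chunks = [];
let outputImage = null;
// create a pass-through stream that forwards the data unchanged
const streamPassThrough = new stream.PassThrough();
// each chunk can be processed here as soon as it arrives
streamPassThrough.on("data", (chunk) => chunks.push(chunk));
// once every chunk has been consumed, outputImage holds the whole image
streamPassThrough.on("end", () => {
  outputImage = Buffer.concat(chunks);
});
stream.pipeline(
  readStream,
  streamPassThrough,
  (error) => {
    if (error) {
      console.error("Failed to read the image:", error);
    }
  },
);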
Wow, that's a lot of code, right?! But it's actually very simple code that reads an image as a stream and collects its content into a variable (outputImage). We could just as well return the image to a client or do whatever processing we want on each chunk while the rest of the image is still being read. Sounds cool, right? That's exactly how streams work in Node.js: our large image is broken down into chunks of binary data, read as a stream and processed as the chunks arrive!
Back to our definition of a Buffer, which we described as "a mechanism for handling streams of binary data" in Node.js. Now that we understand binary data and streams, we can dive into Buffers in Node.js.
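If you want to watch the chunks arrive, here is a small sketch you can run as-is. It makes a few assumptions of its own: the sample file name (stream-demo.bin), the 256 KiB file size and the 64 KiB chunk size are all made up for the demo, and readInChunks is a hypothetical helper, not something from the Node.js API.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical sample file: 256 KiB of zero bytes written to the temp directory
const samplePath = path.join(os.tmpdir(), "stream-demo.bin");
fs.writeFileSync(samplePath, Buffer.alloc(256 * 1024));

// Read the file as a stream and report how many chunks and bytes arrived
async function readInChunks(filePath, chunkSize) {
  // highWaterMark caps how many bytes are read from the file per chunk
  const readStream = fs.createReadStream(filePath, { highWaterMark: chunkSize });
  let chunkCount = 0;
  let totalBytes = 0;
  // Readable streams are async iterables, so each loop turn handles one chunk
  for await (const chunk of readStream) {
    chunkCount += 1;
    totalBytes += chunk.length;
  }
  return { chunkCount, totalBytes };
}

readInChunks(samplePath, 64 * 1024).then(({ chunkCount, totalBytes }) => {
  console.log(`Read ${totalBytes} bytes in ${chunkCount} chunks`);
});
```

Each chunk that shows up in the loop is itself a Buffer, which is a nice bridge to the next section.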
Buffers
Notice that when we read our image from disk as a stream, two things could happen:
The chunks of the image could be read faster than we are able to process them.
The processing could be much faster than the time it takes to read each chunk of our large image.
To handle both cases efficiently we need a waiting area in memory where data can sit until it is needed for processing. What I mean is this:
Say we have a 10 MB image broken down into 50 chunks of data, and we process 1 chunk at a time but our stream reads them in twos. Our "waiting area" would then hold the 1 chunk left unprocessed after each read, and hand it on to our "outputImage" variable when it is needed. Make sense?
That "waiting area" is what we call a Buffer in Node.js. Yeah, it's that simple! You could also picture it from the perspective of how YouTube handles videos relative to the speed of your internet: the video loads ahead of where you are watching, and the loaded-but-unwatched part waits in a buffer.
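Here is a rough sketch of that waiting area in action, using the internal buffer of a writable stream. The 4-byte limit and the slowConsumer name are assumptions I picked to keep the numbers tiny; real streams use much larger defaults.

```javascript
const { Writable } = require("stream");

// A deliberately slow consumer with a tiny 4-byte internal buffer (illustrative values)
const slowConsumer = new Writable({
  highWaterMark: 4,
  write(chunk, encoding, callback) {
    // Pretend each chunk takes a moment to process
    setImmediate(callback);
  },
});

// write() returns false once the bytes waiting in the buffer reach highWaterMark
const firstWrite = slowConsumer.write(Buffer.from("ab"));
const secondWrite = slowConsumer.write(Buffer.from("cd"));
// Number of bytes currently sitting in the waiting area
const waitingBytes = slowConsumer.writableLength;

console.log(firstWrite); // true
console.log(secondWrite); // false
console.log(waitingBytes); // 4
```

That false is the stream's way of saying "the waiting area is full, slow down", which is what keeps a fast reader from flooding a slow processor.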
What Can We Do With Buffers in Node.js
Interestingly, we can easily create our own Buffers in Node.js.
let a = Buffer.alloc(0);
// creates an empty Buffer
let b = Buffer.alloc(8);
// creates a byte array of 8 zeros
let c = Buffer.from("Hello world");
// creates a Buffer from a string value
console.log(a); // <Buffer >
console.log(b); // <Buffer 00 00 00 00 00 00 00 00>
console.log(c); // <Buffer 48 65 6c 6c 6f 20 77 6f 72 6c 64>
The output of the program explains it all, hopefully!
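Since a Buffer is really just a sequence of bytes, you can also read and write individual bytes by index, much like an array. A minimal sketch (the variable names are mine):

```javascript
// Allocate 8 zero-filled bytes, then set the first byte to the integer 8
const bytes = Buffer.alloc(8);
bytes[0] = 8;
console.log(bytes); // <Buffer 08 00 00 00 00 00 00 00>

const greeting = Buffer.from("Hello world");
// length counts bytes, not characters
console.log(greeting.length); // 11
// Index access returns the byte as a number (72 is the code for "H")
console.log(greeting[0]); // 72

// subarray returns a view over the same memory, here the first 5 bytes
const firstWord = greeting.subarray(0, 5);
console.log(firstWord.toString()); // Hello
```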
Convert data from UTF-8 to other Formats
Remember that Node.js encodes our strings as UTF-8 by default when it turns them into bytes. Well, guess what: with Buffers we can easily convert that UTF-8 data to other formats like base64, hex and byte arrays.
let data = "Hello World";
let hexstring = Buffer.from(data, "utf8").toString("hex");
console.log(hexstring); // 48656c6c6f20576f726c64
let base64encoded = Buffer.from(data, "utf8").toString("base64");
console.log(base64encoded); // SGVsbG8gV29ybGQ=
let bytearray = Buffer.from(data, "utf8");
console.log(bytearray); // <Buffer 48 65 6c 6c 6f 20 57 6f 72 6c 64>
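The conversion works in the other direction too, which is handy when a client in another language sends you hex or base64. This is a small sketch of the round trip, with variable names of my own choosing:

```javascript
const original = "Hello World";

// Encode: UTF-8 string -> bytes -> hex / base64 text
const asHex = Buffer.from(original, "utf8").toString("hex");
const asBase64 = Buffer.from(original, "utf8").toString("base64");

// Decode: tell Buffer.from which format the input text is in,
// then read the bytes back out as a UTF-8 string
const fromHex = Buffer.from(asHex, "hex").toString("utf8");
const fromBase64 = Buffer.from(asBase64, "base64").toString("utf8");

console.log(fromHex); // Hello World
console.log(fromBase64); // Hello World
```

The second argument to Buffer.from describes the format of the input, while the argument to toString describes the format you want out.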
You can also concatenate Buffers. For example:
let a = "hello";
a = Buffer.from(a, "utf8");
let b = "world";
b = Buffer.from(b, "utf8");
let c = Buffer.concat([a, b]);
console.log(c);
// outputs a byte array
console.log(c.toString());
// outputs "helloworld"
In the code snippet above we concatenated two Buffers and printed the result both as a byte array and as a string. Notice that the string values of a and b did not change when we passed them to the Buffer API; we only changed their format to byte arrays.
It's A Wrap
Hope you had fun learning the concept of Buffers in Node.js and how they work. To find other methods and cool things you can do with Buffers, check out the official Node.js documentation.
P.S. Buffer is a Node.js global, so it only works out of the box in a Node.js environment, not in the browser.
Thanks and Happy Coding!
#BackendStrories