Storacle - a decentralized file storage

Alexander Balasyan
·May 5, 2020

Before we start, I'd like to point to the previous article, which clarifies what exactly we are talking about.

In this article, I want to introduce the layer that is responsible for storing files and show how anyone can use it. Storacle is an independent library. You can use it to organize storage of any files.

In my previous article I was too hard on IPFS, but that was due to the context of my task.

In fact, I think that project is really cool. I just prefer the ability to create different networks for different tasks. This allows you to better organize the structure and reduce the load on each node and the network as a whole. If necessary, you can even split the network into pieces within a single project based on certain criteria, reducing the overall load.

So, storacle uses the spreadable mechanism to organize the network. Main features:

  • Files can be added to the storage through any node.

  • Files are saved as a whole, not in blocks.

  • Each file has its own unique content hash for further work with it.

  • Files can be duplicated for greater reliability.

  • The number of files on a single node is limited only by the file system (there is an exception, which is discussed later).

  • The number of files in the network is limited by how many nodes spreadable allows in the network. Its second version may allow you to work with an unlimited number of nodes (more on this in another article).

A simple example of using it from code:

The server:

const Node = require('storacle').Node;

(async () => {
  try {
    // Start a storage node listening on localhost:4000
    const node = new Node({
      port: 4000,
      hostname: 'localhost'
    });
    await node.init();
  }
  catch(err) {
    console.error(err.stack);
    process.exit(1);
  }
})();

The client:

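const Client = require('storacle').Client;

(async () => {
  try {
    // Connect to the node started by the server example
    const client = new Client({
      address: 'localhost:4000'
    });
    await client.init();
    // Store a file and receive its content hash
    const hash = await client.storeFile('./my-file');
    // Get a direct link to the stored file
    const link = await client.getFileLink(hash);
    // Remove the file from the network
    await client.removeFile(hash);
  }
  catch(err) {
    console.error(err.stack);
    process.exit(1);
  }
})();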

A peek inside

There's nothing supernatural under the hood. Information about the number of files, their total size and other statistics is kept in an in-memory database and updated whenever files are added or deleted, so there is no need to access the file system frequently. The exception is when the garbage collector is enabled, that is, when you want files to rotate rather than to cap their number. In this case, the storage has to be scanned from time to time, and with a large number of files (say, more than one million) this can cause significant load. It is better to store fewer files per node and run more nodes. If the "cleaner" is disabled, there is no such problem.

The file storage consists of 256 folders with 2 levels of nesting. Files are stored in the second-level folders. So, with 1 million files in the storage, each such folder holds about 62500 of them (1000000 / sqrt(256)).

Folder names are derived from the file hash, so a file can be located quickly when needed.

This structure was chosen based on a large number of different storage requirements: support for weak file systems, where it is not desirable to have many files in a single folder, fast crawl of the folders if necessary, and so on.
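
The hash-based folder layout described above can be sketched roughly as follows. This is only an illustration, not storacle's confirmed implementation: the helper names are hypothetical, and using one hash character per folder level (which gives 16 * 16 = 256 second-level folders for a hex hash) is an assumption about how "256 folders and 2 levels" is meant.

```javascript
const path = require('path');

/**
 * Hypothetical helper: derive a nested folder path from a file hash.
 * nesting    - number of folder levels (2 in the article)
 * nameLength - hash characters used per folder name (assumed to be 1)
 */
function folderPathForHash(hash, nesting = 2, nameLength = 1) {
  if (typeof hash !== 'string' || hash.length < nesting * nameLength) {
    throw new Error('Hash is too short for the requested folder layout');
  }
  const parts = [];
  for (let i = 0; i < nesting; i++) {
    parts.push(hash.slice(i * nameLength, (i + 1) * nameLength));
  }
  // posix join keeps the result identical on every platform
  return path.posix.join(...parts);
}

/** Hypothetical helper: full path of a file inside the storage root. */
function filePathForHash(root, hash) {
  return path.posix.join(root, folderPathForHash(hash), hash);
}

console.log(filePathForHash('storage/files', 'a3f9c2'));
// prints "storage/files/a/3/a3f9c2"
```

Because the folder is computed from the hash alone, finding a file needs no directory scan and no lookup table.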

Caching

When files are added or received, links to them are written to the cache. This often means that you don't need to search the entire network for a file, which speeds up getting links and reduces the load on the network. Caching also occurs via HTTP headers.
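
To make the idea of a link cache concrete, here is a minimal sketch of a cache with expiring entries that is consulted before the network. It does not reproduce storacle's internals: the LinkCache class, the getLinkCached function and the findLink callback are all hypothetical names, and the time-based expiry is an assumption.

```javascript
/**
 * Hypothetical in-memory link cache keyed by file hash.
 * ttlMs - how long a cached link stays valid, in milliseconds
 * now   - injectable clock, which makes the behaviour easy to test
 */
class LinkCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  set(hash, link) {
    this.entries.set(hash, { link, expiresAt: this.now() + this.ttlMs });
  }

  get(hash) {
    const entry = this.entries.get(hash);
    if (!entry) return null;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(hash); // stale entry: forget it
      return null;
    }
    return entry.link;
  }
}

// Look in the cache first; search the network only on a miss.
// findLink is a stand-in for whatever performs the network search.
async function getLinkCached(cache, hash, findLink) {
  const cached = cache.get(hash);
  if (cached) return cached;
  const link = await findLink(hash);
  if (link) cache.set(hash, link);
  return link;
}
```

With such a cache, repeated requests for the same hash within the validity period cost a single network search.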

Isomorphism

The client is written in JavaScript and is isomorphic, so it can be used directly from your browser.

You can load the file https://github.com/ortexx/storacle/blob/master/dist/storacle.client.js as a script and get access to window.ClientStoracle, or import it via a build system, etc.

Deferred links

Another interesting feature is the "deferred link". This is a link to a file that can be obtained synchronously, here and now, while the file itself is pulled up once it is found in the storage. This is very convenient, for example, when you need to show some images on a site: just put a deferred link in the src and that's it. You can come up with a lot of other cases.
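
The image case mentioned above might look like the sketch below. The method createRequestedFileLink() comes from the client API, but passing the hash as its only argument is an assumption. The renderGallery and escapeAttr helpers, the stand-in client and the URL shape it returns are made up for illustration.

```javascript
// Escape characters that would break an HTML attribute value.
function escapeAttr(value) {
  return String(value).replace(/&/g, '&amp;').replace(/"/g, '&quot;');
}

// Build <img> tags whose src is a deferred link. No awaiting is needed,
// because the link is created synchronously.
function renderGallery(client, hashes) {
  return hashes
    .map((hash) => `<img src="${escapeAttr(client.createRequestedFileLink(hash))}" alt="">`)
    .join('\n');
}

// Stand-in client so the sketch runs without a network. A real client would
// be created and initialised as in the client example earlier in the article.
const fakeClient = {
  // Made-up URL shape, for illustration only
  createRequestedFileLink: (hash) => `http://localhost:4000/deferred/${hash}`
};

console.log(renderGallery(fakeClient, ['hash1', 'hash2']));
```

The page can be rendered immediately, and each image loads once its file is located.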

Client API

  • async Client.prototype.storeFile() - file storing

  • async Client.prototype.getFileLink() - getting a direct link to a file

  • async Client.prototype.getFileLinks() - getting a list of direct links to a file from all nodes where it exists

  • async Client.prototype.getFileToBuffer() - getting a file as a buffer

  • async Client.prototype.getFileToPath() - getting a file to the file system

  • async Client.prototype.getFileToBlob() - getting a file as a blob (for the browser version)

  • async Client.prototype.removeFile() - file deletion

  • Client.prototype.createRequestedFileLink() - creating a deferred link
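
Several of the methods listed above can be combined into a small round trip, sketched below. Only the method names are taken from the list. The argument shapes for getFileToBuffer() and getFileLinks() (a single hash) are assumptions, and storeAndInspect is a hypothetical helper.

```javascript
// Store a file, read it back as a buffer, and collect its links
// from all nodes that hold it.
async function storeAndInspect(client, filePath) {
  const hash = await client.storeFile(filePath);
  // The two reads are independent, so they can run in parallel
  const [buffer, links] = await Promise.all([
    client.getFileToBuffer(hash), // assumed to take the hash
    client.getFileLinks(hash)     // assumed to take the hash
  ]);
  return { hash, size: buffer.length, links };
}

// Usage with a real client (requires a running node):
//   const client = new Client({ address: 'localhost:4000' });
//   await client.init();
//   console.log(await storeAndInspect(client, './my-file'));
```

Taking the client as a parameter keeps the helper usable both in Node.js and in the browser build.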

Export files to another server

To transfer files to another node, you can:

  • Just copy the entire storage folder along with the settings (this may not work in the future).

  • Copy only the file folder. But in this case, you have to run the node.normalizeFilesInfo() function once to recalculate all the data and put it in the database.

  • Use the node.exportFiles() function, which starts copying files.

The main node settings

When running the storage node, you can specify all the necessary settings. Only the most basic ones are listed below:

  • storage.dataSize - size of the file folder

  • storage.tempSize - size of the temporary folder

  • storage.autoCleanSize - minimum size of storage that you want to keep. If you specify this parameter, the least-used files are deleted as soon as there is not enough space.

  • file.maxSize - maximum file size

  • file.minSize - minimum file size

  • file.preferredDuplicates - preferred number of duplicate files in the network

  • file.mimeWhitelist - acceptable file types

  • file.mimeBlacklist - unacceptable file types

  • file.extWhitelist - acceptable file extensions

  • file.extBlacklist - unacceptable file extensions

  • file.linkCache - link caching settings

Almost all parameters related to sizes can be set in both absolute and relative values.
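
As an illustration, the settings above might be grouped into an options object like the one below. The setting names come from the list. The nesting by dotted prefix, the value formats (percent strings for relative sizes, byte counts for absolute ones, extensions without a dot) and all the numbers are assumptions, so check them against the library's documentation before use.

```javascript
// Hypothetical node options; all values are illustrative.
const nodeOptions = {
  port: 4000,
  hostname: 'localhost',
  storage: {
    dataSize: '45%',      // relative value (assumed format)
    tempSize: '10%',      // relative value (assumed format)
    autoCleanSize: '30%'  // enables removal of the least-used files
  },
  file: {
    maxSize: 50 * 1024 * 1024, // absolute value, assumed to be bytes
    minSize: 1,
    preferredDuplicates: 3,
    mimeWhitelist: ['image/png', 'image/jpeg'],
    extBlacklist: ['exe']      // assumed format: extension without a dot
  }
};

// The object would then be passed to the node constructor:
//   const node = new Node(nodeOptions);
//   await node.init();
console.log(Object.keys(nodeOptions.storage).join(', '));
// prints "dataSize, tempSize, autoCleanSize"
```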

Using the command line

The library can be used via the command line. You need to install it globally: npm i -g storacle. After that, you can run the necessary actions from the project directory where the node is located.
For example, storacle -a storeFile -f ./file.txt -c ./config.js to add a file. All actions can be found in https://github.com/ortexx/storacle/blob/master/bin/actions.js

Why you might want to use it

  • If you want to create a decentralized project where you are going to store and work with files using convenient methods. For example, the music project described in the article linked at the beginning uses storacle.

  • If you are working on any other project where you need to store files in a distributed way. You can easily build your own closed network, flexibly configure nodes and add new ones when you need them.

  • If you just need to store the files of your site somewhere and would otherwise have to write everything yourself. Perhaps this library suits your case better than others.

  • If you have a project in which you work with files, but want to perform all manipulations from the browser. You can avoid writing server-side code.
