Search is an indispensable part of most websites. But there's a pain point: it takes a lot of effort to build a robust search solution that is simple, elegant and yet powerful.
Many developers rely on tools like Elasticsearch or Apache Lucene, but these bring another moving part into your stack. Furthermore, if you are not familiar with them, you may have a hard time optimising your search.
At Hashnode we experimented with MongoDB text search for a few days, but we quickly noticed that the response time, even after heavy indexing, didn't come down to single-digit milliseconds. We didn't want to set up Lucene or Elasticsearch because no one on the team had any prior experience with them.
The solution we chose is Algolia. In their recent AMA on Hashnode, they explained how they optimise search down to the millisecond level and revealed some amazing facts. Once we started using it, we saw a huge improvement in the speed, responsiveness and relevance of search results on Hashnode.
As a result, we have a new search experience on our website.
We decided to write down our experience with Algolia and explain how you can integrate it with a Node.js + MongoDB backend.
Algolia has a free tier. So, make sure to head over to their website and sign up for a free account.
Before proceeding further, let's familiarise ourselves with the basic terms used in Algolia integration.
Indices, Records and Operations
An index is an array of records that you want to search and perform queries against.
Each element of the array is a record, which is a JSON object. Any attribute of a record can be used for searching, filtering and so on.
Operations are the actions that we perform on our index. There are two types of operations:
- Indexing - Indexing is adding, deleting and updating the records in the index.
- Searching - Searching is querying the records in the indices to return relevant results.
For example, if you have a users collection in MongoDB, you may want to create an index called users in Algolia. Then you can store the individual user objects in that index. Further, you can tell Algolia which properties to index in each user object.
In our case we have a users index in Algolia where we store JSON records that look like the following:
{
  username: 'someusername',
  name: 'some name',
  ...
}
And we have indexed two attributes: name and username.
Exporting existing data from MongoDB
If you already have some data in MongoDB and want to transfer it to Algolia, this section will be useful to you. Otherwise feel free to skip to the next section.
Each record in the index must have a property called objectID, which is used when updating and deleting data from your index. Treat this as a unique identifier for your records.
MongoDB automatically adds an _id property to every document you insert into the database, so you can use it as the objectID. Before exporting data, you need to add objectID to every document in each collection of your MongoDB database that you want to export.
For example, if you have an articles collection in your database, you need to loop over all the documents and update each one to add a new objectID property.
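// In the legacy mongo shell, _id.toString() returns 'ObjectId("<24 hex chars>")';
// the two slices keep only the 24 hex characters.
db.articles.find().forEach(function(article){
  db.articles.update({_id: article._id}, {$set: { objectID: article._id.toString().slice(10).slice(0,24)}});
});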
Sometimes you might want to populate some fields before exporting the collection from the database.
For example, if your article document has an authorID field (which is a reference to a document in the authors collection) and you want to add an authorName property to your article document, you need to do something like this:
db.articles.find().forEach(function(article){
  // Look up the referenced author; findOne returns null when there is no match
  var author = db.authors.findOne({ _id: article.authorID });
  var fields = { objectID: article._id.toString().slice(10).slice(0,24) };
  if (author) {
    fields.authorName = author.name;
  }
  db.articles.update({_id: article._id}, {$set: fields});
});
Now that our article document has the objectID and authorName fields, we can export our collection to an array using MongoDB's mongoexport utility.
You should not send the entire document to Algolia. Rather, you should send only those fields that are required for indexing and searching.
mongoexport --db your_database_name --collection articles --out articles.json -f "title,brief,authorName,objectID" --jsonArray
Now you will have an articles.json file which contains the array of article objects.
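If you would rather shape records in Node than rely on mongoexport's field list, the same idea can be sketched as a small pure function. The name toAlgoliaRecord and its fields argument are hypothetical, and the sketch assumes that _id is either a string or an object whose toString() yields the 24-character hex id.

```javascript
// Hypothetical helper: build the object we send to Algolia from a MongoDB document.
// Only the listed fields are copied; objectID is derived from _id.
function toAlgoliaRecord(doc, fields) {
  var record = { objectID: String(doc._id) };
  fields.forEach(function (field) {
    if (doc[field] !== undefined) {
      record[field] = doc[field];
    }
  });
  return record;
}

// Made-up sample document, for illustration only.
var article = {
  _id: '56a1f0c2e4b0a1b2c3d4e5f6',
  title: 'Hello Algolia',
  brief: 'A short intro',
  authorName: 'Jane',
  content: 'Long body that we do not want to index'
};

console.log(toAlgoliaRecord(article, ['title', 'brief', 'authorName']));
```

The content field is left out of the result because it is not in the list, which mirrors what the -f option does in the mongoexport command.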
Then, to upload records to Algolia, we are going to use their website directly. Just head over to the Algolia dashboard (Indices tab) and upload the articles.json file.
In the next part we will see how to use their API and API clients to upload records programmatically, rather than using the website every time.
Once the upload is complete, Algolia will display the list of items and you can start searching right away and test drive the instant search functionality.
Syncing data with Algolia at runtime
We need to update our indices on Algolia whenever our own MongoDB database is updated. We'll use the algoliasearch npm module for performing operations on our indices.
To install it you need to run the following command:
npm install algoliasearch --save
Next, require the module in your backend code:
var algoliasearch = require('algoliasearch');
var client = algoliasearch('applicationID', 'apiKey');
var articlesIndex = client.initIndex('articles');
There are two types of API keys that you will find in your Algolia Dashboard:
Search API key - The search API key is public and is meant for frontend usage.
Write API key - The write API key is private and is used to create, update and delete indices and records. Keep it secret and use it in the backend only; for searching on the front end, use the search key instead.
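One way to keep the two keys apart is to load them from the environment and hand each side of the app only what it needs. This is a minimal sketch, not something from Algolia's documentation: the function name loadAlgoliaConfig and the environment variable names are assumptions made for illustration.

```javascript
// Hypothetical environment variable names; use whatever your deployment provides.
function loadAlgoliaConfig(env) {
  var required = ['ALGOLIA_APP_ID', 'ALGOLIA_WRITE_KEY', 'ALGOLIA_SEARCH_KEY'];
  var missing = required.filter(function (name) { return !env[name]; });
  if (missing.length > 0) {
    throw new Error('Missing Algolia settings: ' + missing.join(', '));
  }
  return {
    // Safe to expose to the browser: application ID and search key.
    frontend: { applicationID: env.ALGOLIA_APP_ID, apiKey: env.ALGOLIA_SEARCH_KEY },
    // Must stay on the server: the write key.
    backend: { applicationID: env.ALGOLIA_APP_ID, apiKey: env.ALGOLIA_WRITE_KEY }
  };
}

// In a real app you would pass process.env here; these values are placeholders.
var config = loadAlgoliaConfig({
  ALGOLIA_APP_ID: 'applicationID',
  ALGOLIA_WRITE_KEY: 'writeKey',
  ALGOLIA_SEARCH_KEY: 'searchKey'
});

console.log(config.frontend.apiKey);
```

Failing fast on a missing setting makes it harder to ship a build that silently falls back to the wrong key.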
Note:
You can also use Algolia's mongodb-connector for syncing a MongoDB database with Algolia's indices. As it's in beta, I'll use the algoliasearch node module, but feel free to check out the mongodb-connector.
Adding a new object to the index
Adding a record to our index is pretty straightforward.
articlesIndex.addObject({
  title: 'Your title',
  brief: 'Brief',
  objectID: 'your_object_id',
  authorName: 'article author name'
}, function(err, content) {
  if (err) {
    console.log(err);
  }
});
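When you have many records to add at once, sending them in batches is usually kinder to the network than one call per record. The sketch below assumes the same version of the algoliasearch client as the snippet above, where the index object also has an addObjects method that returns a promise when called without a callback. The names chunk and uploadInBatches are hypothetical.

```javascript
// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  var batches = [];
  for (var i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Upload batches one after another so a failure stops the chain early.
// Assumption: index.addObjects(batch) returns a promise when no callback is passed.
function uploadInBatches(index, records, size) {
  return chunk(records, size).reduce(function (previous, batch) {
    return previous.then(function () {
      return index.addObjects(batch);
    });
  }, Promise.resolve());
}

console.log(chunk([1, 2, 3, 4, 5], 2)); // [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]
```

With the articlesIndex from earlier, a call could look like uploadInBatches(articlesIndex, records, 1000).catch(console.error), where the batch size of 1000 is an arbitrary choice.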
Deleting an object from the index
If something is deleted on your server, it makes sense to remove it from your Algolia index so that it won't show up in the search anymore.
articlesIndex.deleteObject('myID', function(err) {
  if (err) {
    console.error(err);
  }
});
Partially updating an object in the index
Sometimes you need to update an existing record in your Algolia index. For instance, if your article title is updated, you need to update it on Algolia as well.
articlesIndex.partialUpdateObject({
  title: 'New updated title',
  objectID: 'my_id'
}, function(err, content) {
  if (err) {
    console.error(err);
  }
});
Note that you need to pass the objectID
of the record we want to update, as a part of the object argument of the partialUpdateObject
function.
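The three operations above can be gathered behind one small wrapper, so that the code paths that create, edit and delete articles each have a single function to call. This is only a sketch: createArticleSync and its method names are hypothetical, and the stand-in index below merely records calls so the example runs without network access.

```javascript
// Hypothetical wrapper: one place that mirrors article changes to an index.
// `index` is expected to expose addObject, partialUpdateObject and deleteObject,
// like the articlesIndex object used above.
function createArticleSync(index) {
  return {
    onCreate: function (article) {
      return index.addObject({
        objectID: String(article._id),
        title: article.title,
        brief: article.brief,
        authorName: article.authorName
      });
    },
    onUpdate: function (articleId, changedFields) {
      // objectID is set last so it always matches the article being updated.
      var patch = Object.assign({}, changedFields, { objectID: String(articleId) });
      return index.partialUpdateObject(patch);
    },
    onDelete: function (articleId) {
      return index.deleteObject(String(articleId));
    }
  };
}

// Stand-in index that only records calls.
var calls = [];
var fakeIndex = {
  addObject: function (record) { calls.push(['add', record]); },
  partialUpdateObject: function (patch) { calls.push(['update', patch]); },
  deleteObject: function (id) { calls.push(['delete', id]); }
};

var sync = createArticleSync(fakeIndex);
sync.onUpdate('abc123', { title: 'New updated title' });
sync.onDelete('abc123');
console.log(calls);
```

In a real backend you would pass articlesIndex instead of fakeIndex and call these functions right after the corresponding MongoDB write succeeds.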
Searching
Use the search key (check your Algolia Dashboard) to perform searches on the client side. Here is a snippet that demonstrates how to do it:
function fetchData(val, page) {
  return articlesIndex
    .search(val, {hitsPerPage: 25, page: page})
    .then(function(content) {
      // do what you want with the content
      console.log(content);
    })
    .catch(function(err) {
      console.error(err);
    });
}
Note that until now we were only using the callback API, but the example above uses the Promise API. The algoliasearch module supports both async APIs.
Asynchronicity
You can call the above function from a mySearchBox.addEventListener('input', function() {}) handler, or whenever you want.
But the problem is that responses are not guaranteed to arrive in the order in which the requests were sent, so the promises or callbacks may resolve out of order.
Here's an explanation from the Algolia team on building a JavaScript API client:
Indeed the algoliasearch module is only a raw API client with some smart request algorithm but it does not force the responses order. Because this might not always what you want, in some situations (web) you might want responses in the right order but in a backend script you don't care. And you do not want to sacrifice performance in a backend script.
So if you call search on every input event and you typed "javascript", there is a small chance that the response for the query "javascr" arrives after the response for the query "javascript".
Here's a demonstration of the issue on the https://hn.algolia.com website (Hacker News search), which is powered by Algolia.
To trigger this behavior, we throttled the network to 2G and searched for "hashnode". The promise for the query "ha" resolved last. We notified the Algolia team about the issue, and it has been fixed in this commit.
There's an easy fix for this. Every time a new response arrives, you can check that response.query === currentQuery.
If the two queries differ, the response is no longer relevant to the user's current search state.
function fetchData(val, page) {
  return articlesIndex
    .search(val, {hitsPerPage: 25, page: page})
    .then(function(content) {
      if (content.query !== mySearchBox.value) {
        // not the query I am looking for
        return;
      }
      // do what you want with the content
      console.log(content);
    })
    .catch(function(err) {
      console.error(err);
    });
}
This ensures that the results on screen always match the user's latest query, which keeps the search experience pleasant.
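A different way to get the same effect, shown here only as an alternative sketch, is to number the requests and ignore any response that does not belong to the most recent one. This does not depend on reading the search box, and it also covers the case where the same query is sent twice with different pages. The names createLatestGuard, begin and isCurrent are hypothetical.

```javascript
// Hypothetical helper: hands out tickets and remembers which one is the newest.
function createLatestGuard() {
  var latest = 0;
  return {
    begin: function () {
      latest += 1;
      var ticket = latest;
      return {
        // True only while no newer request has been started.
        isCurrent: function () { return ticket === latest; }
      };
    }
  };
}

var guard = createLatestGuard();

// Simulate two searches started in order: "javascr", then "javascript".
var first = guard.begin();
var second = guard.begin();

console.log(first.isCurrent());  // false
console.log(second.isCurrent()); // true
```

Inside fetchData you would call guard.begin() just before search, keep the returned ticket, and return early from the then handler when ticket.isCurrent() is false.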
DSN
Last but not least, Hashnode users are everywhere, so we needed to be fast for all of them, no matter where they are.
Algolia has a feature named Distributed Search Network (DSN), which replicates our indices in multiple regions of the world.
The major benefit of DSN is that it brings the instant responsiveness of your search engine to all your users around the world. If you are using Algolia or plan to use it, make sure to take advantage of DSN.
Conclusion
This was a quick overview of our story of implementing Algolia along with MongoDB.
As you saw, the process is super simple, the APIs are easy to understand and the community support is great. We are amazed by the responsiveness of Algolia and strongly recommend it to anyone who is looking for a powerful and robust search solution.
By the way, do check out Algolia Community if you haven't already.