# How to Upload 1 Million Images in Node.js

Uploading a large number of images efficiently is a common challenge in web development. In this blog post, we will explore a robust approach to uploading 1 million images using Node.js, leveraging streams, queues, and cloud storage.

## Prerequisites

Before we start, ensure you have the following:

  • Basic knowledge of Node.js

  • Node.js and npm installed on your machine

  • An account with a cloud storage provider (e.g., AWS S3, Google Cloud Storage, Azure Blob Storage)

## Setting Up the Project

First, let's create a new Node.js project and install the necessary dependencies.

```bash
mkdir image-upload
cd image-upload
npm init -y
npm install express multer aws-sdk async
```

## Creating the Upload Server

We will use `Express` to handle HTTP requests and `Multer` to parse the incoming file uploads. Here's the basic setup:

```javascript
// server.js
const express = require('express');
const multer = require('multer');
const AWS = require('aws-sdk');
const async = require('async');
const fs = require('fs');
const path = require('path');

const app = express();

// Multer writes incoming files to disk so they can be streamed to S3.
const upload = multer({ dest: 'uploads/' });

const s3 = new AWS.S3({
  accessKeyId: process.env.AWS_ACCESS_KEY,
  secretAccessKey: process.env.AWS_SECRET_KEY,
  region: process.env.AWS_REGION,
});

app.post('/upload', upload.array('images', 10), (req, res) => {
  const files = req.files;

  // Upload at most 10 files to S3 concurrently.
  async.eachLimit(files, 10, (file, callback) => {
    const fileStream = fs.createReadStream(file.path);
    const uploadParams = {
      Bucket: process.env.AWS_BUCKET_NAME,
      Key: path.basename(file.path),
      Body: fileStream,
    };

    s3.upload(uploadParams, (err, data) => {
      if (err) {
        return callback(err);
      }
      console.log(`File uploaded successfully. ${data.Location}`);

      // Remove the temporary file once it is safely in S3.
      fs.unlink(file.path, (unlinkErr) => {
        if (unlinkErr) console.log(`Error deleting file: ${unlinkErr}`);
      });
      callback();
    });
  }, (err) => {
    if (err) {
      console.log('Error uploading files:', err);
      return res.status(500).send('Error uploading files');
    }
    res.status(200).send('Files uploaded successfully');
  });
});

app.listen(3000, () => {
  console.log('Server started on http://localhost:3000');
});
```

## Environment Variables

Make sure to set the environment variables in a `.env` file or directly in your environment. Here's an example `.env` file:

```
AWS_ACCESS_KEY=your_aws_access_key
AWS_SECRET_KEY=your_aws_secret_key
AWS_REGION=your_aws_region
AWS_BUCKET_NAME=your_bucket_name
```
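Note that Node.js does not read a `.env` file into `process.env` automatically. One common option (an extra dependency, not part of the install step above) is the `dotenv` package:

```javascript
// Load variables from .env into process.env before anything reads them.
// Requires: npm install dotenv
require('dotenv').config();

console.log(process.env.AWS_REGION); // e.g. "us-east-1"
```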

## Handling a Large Volume of Images

To handle 1 million images, we need to ensure our solution is scalable. Here's how we can achieve that:

### 1. Use Streams

Streams let us handle large files without exhausting memory. The code above uses `fs.createReadStream` to read each file as a stream rather than loading it into memory whole.
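As a standalone illustration, here is a minimal sketch of the same idea. The `uploadStream` helper is hypothetical (not part of the server above), and the 64 KB chunk size is just an example:

```javascript
const fs = require('fs');
const AWS = require('aws-sdk');

const s3 = new AWS.S3(); // credentials and region come from the environment

// Hypothetical helper: streams one file to S3 in 64 KB chunks instead of
// reading the whole image into memory first. s3.upload reports both stream
// and network errors through the callback.
function uploadStream(filePath, bucket, key, callback) {
  const body = fs.createReadStream(filePath, { highWaterMark: 64 * 1024 });
  s3.upload({ Bucket: bucket, Key: key, Body: body }, callback);
}
```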

### 2. Implement Queuing

We use `async.eachLimit` to limit the number of concurrent uploads. This prevents overwhelming the server and ensures steady throughput.
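The same pattern works outside the HTTP handler, for example in a batch script that drains a local directory. A minimal sketch, assuming the images sit in a local `images/` folder and reusing the hypothetical `uploadStream` helper from the previous section:

```javascript
const fs = require('fs');
const path = require('path');
const async = require('async');

const dir = path.join(__dirname, 'images'); // assumed source directory
const files = fs.readdirSync(dir); // for very large directories, fs.opendir() iterates lazily instead

// Never more than 20 uploads in flight at any moment.
async.eachLimit(files, 20, (name, done) => {
  uploadStream(path.join(dir, name), process.env.AWS_BUCKET_NAME, name, done);
}, (err) => {
  if (err) return console.error('Batch upload failed:', err);
  console.log(`Uploaded ${files.length} files.`);
});
```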

### 3. Cloud Storage

Storing images in cloud storage like AWS S3 ensures high availability and scalability. Adjust the `uploadParams` based on your cloud provider.
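For S3 in particular, `s3.upload` accepts a second options argument that tunes its built-in multipart behavior. A sketch of a tuned variant of the upload step; the part size, queue size, and hard-coded content type are illustrative assumptions, not recommendations:

```javascript
const fs = require('fs');
const path = require('path');
const AWS = require('aws-sdk');

const s3 = new AWS.S3();

// Hypothetical tuned variant of the upload step from the server above.
function uploadTuned(filePath, callback) {
  const params = {
    Bucket: process.env.AWS_BUCKET_NAME,
    Key: path.basename(filePath),
    Body: fs.createReadStream(filePath),
    ContentType: 'image/jpeg', // derive from the actual file in practice
  };
  // partSize/queueSize configure the SDK's built-in multipart upload.
  s3.upload(params, { partSize: 10 * 1024 * 1024, queueSize: 4 }, callback);
}
```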

### 4. Error Handling and Retries

Implement robust error handling and retry logic to cope with intermittent failures. The code above demonstrates basic error handling, but it does not retry failed uploads.
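The `async` library installed earlier also ships a `retry` helper. A minimal sketch wrapping the hypothetical `uploadTuned` function from the previous section with three attempts and exponential backoff:

```javascript
const async = require('async');

// Try the upload up to 3 times, doubling the wait before each
// successive retry. uploadTuned is the hypothetical helper from above.
function uploadWithRetry(filePath, callback) {
  async.retry(
    { times: 3, interval: (retryCount) => 100 * Math.pow(2, retryCount) },
    (done) => uploadTuned(filePath, done),
    callback
  );
}
```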

## Testing the Solution

To test the solution, use a tool like Postman to send a `POST` request to `http://localhost:3000/upload` with multiple image files attached under the `images` field.
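Alternatively, from the command line, a `curl` request with a couple of test images exercises the same endpoint (the file names here are placeholders):

```bash
curl -F "images=@photo1.jpg" -F "images=@photo2.jpg" http://localhost:3000/upload
```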

## Conclusion

Uploading a large number of images efficiently requires a combination of streaming, queuing, and cloud storage. This blog post provides a scalable approach to handling 1 million image uploads in Node.js. Adjust the configuration and parameters to fit your specific use case and requirements.