Node.js large file upload to MongoDB blocks the event loop and worker pool

Posted on May 12

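I want to upload large CSV files to a MongoDB cloud database using a Node.js server, with Express, Mongoose, and Multer's GridFS storage engine, but when the file upload starts, my database becomes unable to handle any other API requests. For example, if another client requests a user from the database while a file is being uploaded, the server receives the request and tries to fetch the user from the MongoDB cloud, but the request gets stuck because the large file upload eats up all the computational resources. As a result, the client's GET request does not return the user until the in-progress file upload has completed.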

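I understand that if a thread takes a long time to execute a callback (event loop) or a task (worker), it is considered "blocked", and that Node.js runs JavaScript code in the event loop while offering a worker pool to handle expensive tasks such as file I/O. I have read this blog post by NodeJs.org, which says that to keep your Node.js server speedy, the work associated with each client at any given time must be "small", and that my goal should be to minimize the variation in Task times. The reason is that if a Worker's current Task is much more expensive than other Tasks, it will be unavailable to work on other pending Tasks, decreasing the size of the worker pool by one until the Task completes.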
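To make concrete what "blocked" means here, a minimal sketch follows. It is not taken from my server, and `blockFor` and `measureTimerLag` are names made up for illustration. A timer scheduled for 0 ms cannot fire until the synchronous busy loop has finished:

```javascript
// Busy-wait synchronously for roughly `ms` milliseconds.
// Stand-in for any long-running callback on the event loop.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin
  }
}

// Schedule a 0 ms timer, then block the event loop.
// Resolves with the number of milliseconds between scheduling and firing.
function measureTimerLag(blockMs) {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => resolve(Date.now() - scheduledAt), 0);
    blockFor(blockMs);
  });
}

measureTimerLag(200).then((lag) => {
  console.log(`timer fired ${lag} ms after it was scheduled`);
});
```

The reported lag should be at least the blocking time, which is the same kind of delay the other client's request would see if a callback on my server ran that long.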

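In other words, the client performing the large file upload is executing an expensive Task, which decreases the throughput of the worker pool and in turn the throughput of the server. According to the aforementioned blog post, the long Task should be split into sub-Tasks: when each sub-Task completes, it should submit the next sub-Task, and when the final sub-Task completes, it should notify the submitter. This way, between each sub-Task of the long Task (the large file upload), the Worker can work on a sub-Task from a shorter Task, thus solving the blocking problem.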
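As far as I can tell, the generic partitioning pattern looks roughly like the sketch below for work that runs on the event loop. The helper `sumInChunks` and its `chunkSize` parameter are hypothetical, and summing an array only stands in for real work; this shows the pattern itself, not how to apply it to an upload handled by multer.

```javascript
// Hypothetical helper: sum a large array in slices, yielding to the
// event loop between slices so other callbacks are not starved.
function sumInChunks(values, chunkSize = 1000) {
  return new Promise((resolve) => {
    let total = 0;
    let index = 0;

    function runChunk() {
      // Process one sub-task: at most `chunkSize` elements.
      const end = Math.min(index + chunkSize, values.length);
      for (; index < end; index++) {
        total += values[index];
      }
      if (index < values.length) {
        // Submit the next sub-task; timers and I/O callbacks that are
        // already pending get a chance to run before it.
        setImmediate(runChunk);
      } else {
        // Last sub-task finished: notify the submitter.
        resolve(total);
      }
    }

    runChunk();
  });
}

const numbers = Array.from({ length: 10000 }, (_, i) => i + 1);
sumInChunks(numbers, 1000).then((total) => console.log(total)); // prints 50005000
```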

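However, I do not know how to implement this solution in actual code. Is there a specific partitioning function that can solve this problem? Do I have to use a specific upload architecture or a Node package other than multer-gridfs-storage to upload my files? Please help.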

Here is my current file upload implementation using Multer's GridFS storage engine:

   // Adjust how files get stored.
   const storage = new GridFsStorage({
       // The DB connection
       db: globalConnection, 
       // The file's storage configurations.
       file: (req, file) => {
           ...
           // Return the file's data to the file property.
           return fileData;
       }
   });

   // Configure a strategy for uploading files.
   const datasetUpload = multer({ 
       // Set the storage strategy.
       storage: storage,

       // Set the size limits for uploading a file to 300MB.
       limits: { fileSize: 1024 * 1024 * 300 },
    
       // Set the file filter.
       fileFilter: fileFilter,
   });


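   // Upload a dataset file.
   router.post('/add/dataset', (req, res) => {
       // Begin the file upload.
       datasetUpload.single('file')(req, res, function (err) {
           // Report upload failures instead of always replying with 200.
           if (err) {
               // Multer's own errors (e.g. file too large) are client errors.
               const status = err instanceof multer.MulterError ? 400 : 500;
               return res.status(status).send({ error: err.message });
           }
           // Get the parsed file from multer.
           const file = req.file;
           // Upload success.
           return res.status(200).send(file);
       });
   });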
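
For comparison, one alternative architecture would be to pipe the incoming file stream straight into GridFS so that backpressure paces the upload. The sketch below is untested against a real database. `streamToGridFs` is a hypothetical helper, `bucket` is assumed to be a `GridFSBucket` from the `mongodb` driver (or any object with a compatible `openUploadStream` method), and `source` is assumed to be a readable stream of the file's bytes, for example a file part produced by a multipart parser. I do not know whether this would actually fix the stalled requests.

```javascript
const { pipeline } = require('stream/promises');

// Hypothetical helper: stream `source` into GridFS under `filename`.
// `bucket` is assumed to expose openUploadStream(filename), as the
// mongodb driver's GridFSBucket does.
async function streamToGridFs(source, bucket, filename) {
  const uploadStream = bucket.openUploadStream(filename);
  // pipeline() forwards data piece by piece, pauses the source when the
  // destination is busy, and rejects if either stream errors.
  await pipeline(source, uploadStream);
  // The upload stream carries the id of the stored file.
  return uploadStream.id;
}
```

Because the data moves in small pieces and the source is paused while the destination catches up, no single step should hold a large buffer in memory.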
