1. Only check for unread > 0
由于您在这个查询中只得到unreadMessageCount
,第一个小优化是将其添加到您的第一个$match
阶段.因为总数是0
时是不变的,无论用户是否被封锁.
{
$match: {
userId: 32,
unreadMessageCount: { $gt: 0 }
}
}
2. Uncorrelated Subquery with $lookup
2A.
我们还可以try 优化实际发生的$lookup
的数量.通过使用Uncorrelated Subquery with $lookup
—ie,它只运行一次,而不是每个文档.
db.conversations2.aggregate([
{
$match: {
userId: 32,
unreadMessageCount: { $gt: 0 }
}
},
{
// this will only run once
$lookup: {
from: "user_blocked",
pipeline: [
{
$group: {
_id: null,
ids: { $addToSet: "$id" }
}
}
],
as: "blockedUsers"
}
},
{
$set: {
blockedUsers: { $first: "$blockedUsers.ids" }
}
},
{
$match: {
$expr: {
$eq: [
{ $indexOfArray: ["$blockedUsers", "$otherUserId"] },
-1
]
}
}
},
{
$group: {
_id: "$userId",
unreadMessageCount: {
$sum: "$unreadMessageCount"
}
}
}
])
Mongo Playground的正确结果为未读10
但是,如果你的user_blocked
个集合太大,你可能会达到每阶段16MB的限制(Ints为1.3 mil id,Longs为600k),然后是user_blocked
MB的限制与allowDiskUse: true
.在这种情况下,使用以下变体B:
2B.
这种聚合管道将减少所需的查找总数,并且遇到阶段大小限制的机会更小,但它仍然可能.步骤:
- 将所有
otherIDs
个集合到一个数组中
- 对
user_blocked
进行一次性查找
- 只保留剩余的
validOtherIDs
个
- 做一个
conversations2
的self 查找,但只使用validOtherIDs
db.conversations2.aggregate([
{
$match: {
userId: 32,
unreadMessageCount: { $gt: 0 }
}
},
{
// collect all the unique `otherUserId` for this userId
$group: {
_id: null,
otherUserIds: { $addToSet: "$otherUserId" }
}
},
{
// correlated but will only run once since
// the previous stage has only one document
$lookup: {
from: "user_blocked",
let: {
lookupOtherUserIds: "$otherUserIds"
},
pipeline: [
{
$match: {
$expr: {
$ne: [
{ $indexOfArray: ["$$lookupOtherUserIds", "$id"] },
-1
]
}
}
},
{
$group: {
_id: null,
ids: { $addToSet: "$id" }
}
}
],
as: "blockedUsers"
}
},
{
// otherIDs which remain after removing blockedIDs
$project: {
validOtherIds: {
$setDifference: ["$otherUserIds", { $first: "$blockedUsers.ids" }]
}
}
},
{
// do a self-lookup on `conversations2`
$lookup: {
from: "conversations2",
let: {
lookupValidOtherIds: "$validOtherIds"
},
// repeat part of the first stage of this pipeline, yuck!
pipeline: [
{
$match: {
userId: 32,
// unread > 0 check is not needed since
// lookupValidOtherIds already has only > 0's
$expr: {
$ne: [
{ $indexOfArray: ["$$lookupValidOtherIds", "$otherUserId"] },
-1
]
}
}
}
],
as: "validConvos"
}
},
// the `group` below can be directly done in the self-lookup stage
// but I find this cleaner
{ $unwind: "$validConvos" },
{
$group: {
_id: null,
unreadMessageCount: {
$sum: "$validConvos.unreadMessageCount"
}
}
}
])
Mongo Playground
3. ‼ No lookups, Add a field otherUserBlocked
这种优化将需要数据/ struct 更改,但most scalable and most performant:
添加一个像otherUserBlocked: true/false
这样的字段并索引它.最初默认为false
(没有用户被阻止),然后使用类似于你已有的管道将其设置为true
.
如果跳过初始默认值,则需要在下面的查询中添加子句{ otherUserBlocked: { $exists: true } }
.
每次用户被阻止时,您已经将其添加到user_blocked
个集合中.添加另一个步骤,也将conversations2
更新为{ $match: { otherUserId: blocked_user_id } }
和设置otherUserBlocked: true
.比如:
db.conversations2.updateMany({
otherUserId: 46
},
{
$set: {
otherUserBlocked: true
}
})
如果他们被解锁,设置为false
.
然后,您的聚合管道可以在第一个$match
阶段中使用它,完全消除了对$lookup
的需求,以及第二个$match
阶段.管道变成:
db.conversations2.aggregate([
{
$match: {
userId: 32,
otherUserBlocked: false,
unreadMessageCount: { $gt: 0 }
}
},
{
$group: {
_id: "$userId",
unreadMessageCount: { $sum: "$unreadMessageCount" }
}
}
])
Mongo Playground with the new field
当您希望实际显示未读消息而不仅仅是计数时,这两个更改也很有用.
大约MongoDB Schema Design best practices.