I have a data structure in Qdrant whose payload looks like this:
    {
        "attributes": [
            {
                "attribute_value_id": 22003,
                "id": 1252,
                "key": "Environment",
                "value": "Casual/Daily",
            },
            {
                "attribute_value_id": 98763,
                "id": 1254,
                "key": "Color",
                "value": "Multicolored",
            },
            {
                "attribute_value_id": 22040,
                "id": 1255,
                "key": "Material",
                "value": "Polyester",
            },
        ],
        "brand": {
            "id": 114326,
            "logo": None,
            "slug": "happiness-istanbul-114326",
            "title": "Happiness Istanbul",
        },
    }
Following the Qdrant documentation, I implemented the brand filter like this:
    from qdrant_client import models

    filters_list = []
    if param_filters:
        brands = param_filters.get("brand_params")
        if brands:
            # Match points whose brand.id is any of the requested brand ids
            brand_filter = models.FieldCondition(
                key="brand.id",
                match=models.MatchAny(any=[int(brand) for brand in brands]),
            )
            filters_list.append(brand_filter)
    search_results = qd_client.search(
        query_filter=models.Filter(must=filters_list),
        collection_name=f"lang{lang}_products",
        query_vector=query_vector,
        search_params=models.SearchParams(hnsw_ef=128, exact=False),
        limit=limit,
    )
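This works so far. But things get complicated when I try to filter on the "attributes" field. As you can see, it is a list of dictionaries, each of which looks like this: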
    {
        "attribute_value_id": 22040,
        "id": 1255,
        "key": "Material",
        "value": "Polyester",
    }
The attrs filter sent from the front end has this structure:
    attrs structure: {"attr_id": [attr_value_ids], "attr_id": [attr_value_ids]}
    >>> example: {'1237': ['21727', '21759'], '1254': ['52776']}
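How can I filter so that each attr_id given in the query filter parameters (here 1237 or 1254) must exist in the attributes field and must have one of the attr_value_ids provided in its list (for example ['21727', '21759'] here)?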
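To make the intended rule concrete, here is a minimal plain-Python sketch of the matching I am after, independent of Qdrant. The helper name point_matches is hypothetical, and it assumes that every attr_id in the request must match (AND across attributes, OR across each attribute's value ids), which is what appending each condition to a must list would imply.

```python
from typing import Dict, List


def point_matches(attributes: List[dict], attrs: Dict[str, List[str]]) -> bool:
    """Return True if every requested attr_id is present with one of its allowed value ids."""
    for attr_id, attr_value_ids in attrs.items():
        wanted_id = int(attr_id)
        allowed = {int(v) for v in attr_value_ids}
        # Both conditions must hold inside the SAME element of the list.
        if not any(
            a["id"] == wanted_id and a["attribute_value_id"] in allowed
            for a in attributes
        ):
            return False
    return True


payload_attributes = [
    {"attribute_value_id": 22003, "id": 1252, "key": "Environment", "value": "Casual/Daily"},
    {"attribute_value_id": 98763, "id": 1254, "key": "Color", "value": "Multicolored"},
    {"attribute_value_id": 22040, "id": 1255, "key": "Material", "value": "Polyester"},
]

print(point_matches(payload_attributes, {"1254": ["98763", "52776"]}))
print(point_matches(payload_attributes, {"1237": ["21727", "21759"], "1254": ["52776"]}))
```

The first call matches because attribute 1254 carries value id 98763; the second does not, because attribute 1237 is absent from the payload.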
This is what I have tried so far:
    if attrs:
        # attrs structure: {"attr_id": [attr_value_ids], "attr_id": [attr_value_ids]}
        print("attrs from search function:", attrs)
        for attr_id, attr_value_ids in attrs.items():
            # Convert attribute value IDs to integers
            attr_value_ids = [
                int(attr_value_id) for attr_value_id in attr_value_ids
            ]
            # Add a filter for each attribute ID and its values
            attr_filter = models.FieldCondition(
                key=f"attributes.{attr_id}.attr_value_id",
                match=models.MatchAny(any=attr_value_ids),
            )
            filters_list.append(attr_filter)
The problem is that key=f"attributes.{attr_id}.attr_value_id" is wrong, and I don't know how to express this correctly.
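One direction that may be worth checking is Qdrant's nested object filter, which, as far as I understand the filtering documentation, applies several conditions to the same element of an array. The sketch below only builds the filter as a plain dict in the JSON shape I believe the REST API accepts; build_attr_conditions is a hypothetical helper, and none of this has been verified against a running Qdrant instance. Note that the payload field is attribute_value_id, not attr_value_id as in my attempt.

```python
from typing import Dict, List


def build_attr_conditions(attrs: Dict[str, List[str]]) -> List[dict]:
    """Build one nested condition per attr_id (assumed JSON shape, not verified)."""
    conditions = []
    for attr_id, attr_value_ids in attrs.items():
        conditions.append(
            {
                "nested": {
                    "key": "attributes",
                    "filter": {
                        "must": [
                            # Both conditions are meant to hold for the same array element.
                            {"key": "id", "match": {"value": int(attr_id)}},
                            {
                                "key": "attribute_value_id",
                                "match": {"any": [int(v) for v in attr_value_ids]},
                            },
                        ]
                    },
                }
            }
        )
    return conditions


attrs = {"1237": ["21727", "21759"], "1254": ["52776"]}
query_filter = {"must": build_attr_conditions(attrs)}
print(query_filter)
```

If the Python client is preferred, the equivalent model classes would need to be looked up in the qdrant-client documentation rather than taken from this sketch.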
Update: maybe one step closer:
I decided to flatten the data in the database to make this easier. First, I created a new field called flattened_attributes, as follows:
    [
        {
            "1237": 21720
        },
        {
            "1254": 52791
        },
        {
            "1255": 22044
        },
    ]
In addition, before filtering, I apply the same flattening to the attr filter sent from the front end:
    if attrs:
        # attrs structure: {"attr_id": [attr_value_ids], "attr_id": [attr_value_ids]}
        # we need to flatten attrs to filter on payloads
        flattened_attr = []
        for attr_id, attr_value_ids in attrs.items():
            for attr_value_id in attr_value_ids:
                flattened_attr.append({attr_id: int(attr_value_id)})
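Now I have two similar lists of dicts, and I want to keep only the points whose stored list contains at least one of the dicts received from the front end (flattened_attr).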
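In plain Python, the check I want on the flattened data would look roughly like the sketch below. The function name has_any_flattened is hypothetical, and this only illustrates the desired semantics; it is not a Qdrant query.

```python
from typing import Dict, List


def has_any_flattened(
    flattened_attributes: List[Dict[str, int]],
    flattened_attr: List[Dict[str, int]],
) -> bool:
    """Return True if at least one requested dict is present in the stored list."""
    return any(d in flattened_attributes for d in flattened_attr)


stored = [{"1237": 21720}, {"1254": 52791}, {"1255": 22044}]

attrs = {"1237": ["21727", "21759"], "1254": ["52776"]}
# Same flattening as applied to the front-end filter
flattened_attr = [
    {attr_id: int(v)} for attr_id, values in attrs.items() for v in values
]

print(has_any_flattened(stored, flattened_attr))
print(has_any_flattened(stored, [{"1254": 52791}]))
```

With the example request none of the flattened dicts is stored, so the first call is False; the second call is True because {"1254": 52791} is in the stored list.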
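There is a type of filter that matches when a key's value is in a given list of values, as mentioned here in the docs. But I don't know how to check whether a dict exists in the flattened_attributes field in the DB.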