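I want to label every token that was not matched by an earlier pattern as "Unknown". Unfortunately, the EntityRuler does not seem to care about the order in which the patterns are provided: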
import spacy

# Blank English pipeline with a single EntityRuler component
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")

# Specific pattern first, catch-all pattern second
patterns = [
    {'label': 'Country', 'pattern': [{'lower': 'ger'}]},
    {'label': 'Unknown', 'pattern': [{'OP': '?'}]}
]
ruler.add_patterns(patterns)

doc = nlp('ger is a country')
print([(ent.text, ent.label_) for ent in doc.ents])
Expected:
[('ger', 'Country'), ('is', 'Unknown'), ('a', 'Unknown'), ('country', 'Unknown')]
Actual:
[('ger', 'Unknown'), ('is', 'Unknown'), ('a', 'Unknown'), ('country', 'Unknown')]
How can I make sure the patterns are matched in order?
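One possible workaround, sketched here rather than offered as the definitive fix, is to split the patterns across two EntityRuler components so that pipeline order decides priority. This assumes spaCy 3.x. The component name "unknown_ruler" is made up for illustration, and the catch-all uses the wildcard token pattern [{}] in place of the original {'OP': '?'} pattern.

```python
import spacy

nlp = spacy.blank("en")

# First ruler: the specific patterns that should take priority.
country_ruler = nlp.add_pipe("entity_ruler")
country_ruler.add_patterns([
    {"label": "Country", "pattern": [{"LOWER": "ger"}]},
])

# Second ruler (hypothetical name "unknown_ruler"): a catch-all that
# matches any single token. It keeps the default overwrite_ents=False,
# so tokens already labelled by the first ruler are left alone.
unknown_ruler = nlp.add_pipe("entity_ruler", name="unknown_ruler")
unknown_ruler.add_patterns([
    {"label": "Unknown", "pattern": [{}]},
])

doc = nlp("ger is a country")
result = [(ent.text, ent.label_) for ent in doc.ents]
print(result)
```

Because the second ruler runs after the first and does not overwrite existing entities, 'ger' should keep its Country label while the remaining tokens fall through to Unknown.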