I have the following sentence:
text="The weather is extremely severe in England"
I want to perform custom Named Entity Recognition (NER) on it.
First, the standard NER pipeline outputs England with the GPE label:
!pip install spacy
!python -m spacy download en_core_web_lg
import spacy
nlp = spacy.load('en_core_web_lg')
doc = nlp(text)
for ent in doc.ents:
    print(ent.text+' - '+ent.label_+' - '+str(spacy.explain(ent.label_)))
Result: England - GPE - Countries, cities, states
However, I want the whole sentence to be tagged High-Severity.
So I am doing the following:
from spacy.strings import StringStore
new_hash = StringStore([u'High_Severity']) # <-- match id
nlp.vocab.strings.add('High_Severity')
from spacy.tokens import Span
# Get the hash value of the High_Severity entity label
High_Severity = doc.vocab.strings[u'High_Severity']
# Create a Span for the new entity
new_ent = Span(doc, 0, 7, label=High_Severity)
# Add the entity to the existing Doc object
doc.ents = list(doc.ents) + [new_ent]
I get the following error:
ValueError: [E1010] Unable to set entity information for token 6 which is included in more than one span in entities, blocked, missing or outside.
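As far as I understand, this happens because NER has already recognized England as GPE, and a second label cannot be added on top of a token that already belongs to an entity.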
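To illustrate that reading of the error, here is a minimal sketch. It assumes spaCy v3 and uses a blank English pipeline with a hand-set GPE entity to stand in for the output of en_core_web_lg, so no model download is needed. The span group key "severity" is a name made up for this example. It shows the overlap being rejected, and then two possible workarounds: dropping the overlapping entity before assigning doc.ents, or storing the sentence-level span in doc.spans, which permits overlapping spans.

```python
import spacy
from spacy.tokens import Span

# Blank pipeline: tokenizer only (assumption: stands in for en_core_web_lg)
nlp = spacy.blank("en")
doc = nlp("The weather is extremely severe in England")

# Simulate the pretrained NER result: token 6 ("England") tagged as GPE
doc.ents = [Span(doc, 6, 7, label="GPE")]

# Sentence-level span covering every token
new_ent = Span(doc, 0, len(doc), label="High_Severity")

# doc.ents does not allow a token to sit in two entities, so this fails
overlap_rejected = False
try:
    doc.ents = list(doc.ents) + [new_ent]
except ValueError:
    overlap_rejected = True

# Workaround 1: keep only existing entities that do not overlap the new span
kept = [e for e in doc.ents if e.end <= new_ent.start or e.start >= new_ent.end]
doc.ents = kept + [new_ent]

# Workaround 2: span groups may overlap, so both annotations can coexist
# ("severity" is a hypothetical key chosen for this sketch)
doc.spans["severity"] = [new_ent]

for ent in doc.ents:
    print(ent.text + ' - ' + ent.label_)
```

With the first workaround the GPE label on England is lost from doc.ents, which may or may not be acceptable. The span group keeps the sentence label separate from the entity annotations instead.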
I also tried running only the custom NER code (that is, without running the standard NER code first), but that did not solve my problem.
Any ideas on how to solve this?