How to use cjber/reddit-ner-place_names with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("token-classification", model="cjber/reddit-ner-place_names")

# Load model directly
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("cjber/reddit-ner-place_names")
model = AutoModelForTokenClassification.from_pretrained("cjber/reddit-ner-place_names")

Fine-tuned bert-base-uncased for named entity recognition, trained on wnut_17 plus 498 additional comments from Reddit. This model is intended solely for extracting place names from social media text; all other entity types have therefore been removed.
This model was created with two key goals:
For the model code, please see the Model GitHub Repository.
In theory, this model should be able to detect and ignore metonyms, that is, place names used to stand for something other than the place itself. For example, in the sentence:
Manchester played Liverpool last night in Liverpool.
Both Manchester and the first mention of Liverpool refer to football teams, so the model outputs only the final Liverpool:
[
    {
        "entity_group": "location",
        "score": 0.9975672,
        "word": "liverpool",
        "start": 42,
        "end": 51,
    }
]
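To see which mention the model kept, the character offsets in the output can be compared against every occurrence of the name in the sentence. The following is a minimal sketch in plain Python: `mention_status` is a hypothetical helper (not part of transformers), and the `predictions` list is simply the output shown above copied by hand, so no model is loaded.

```python
import re

sentence = "Manchester played Liverpool last night in Liverpool."

# Output shown above, copied by hand rather than produced by running the model.
predictions = [
    {
        "entity_group": "location",
        "score": 0.9975672,
        "word": "liverpool",
        "start": 42,
        "end": 51,
    }
]


def mention_status(text, name, preds):
    """Hypothetical helper: list each occurrence of `name` in `text` with a
    flag saying whether a predicted span covers exactly that occurrence."""
    tagged = {(p["start"], p["end"]) for p in preds}
    return [
        (m.start(), m.end(), (m.start(), m.end()) in tagged)
        for m in re.finditer(re.escape(name), text)
    ]


statuses = mention_status(sentence, "Liverpool", predictions)
for start, end, is_place in statuses:
    label = "place name" if is_place else "not tagged"
    print(f"{sentence[start:end]} at {start}-{end}: {label}")
```

With the output above, only the second occurrence (characters 42 to 51) is flagged as a place name; the first, which refers to the football team, is not.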
transformers
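from transformers import pipeline

# Build an NER pipeline. aggregation_strategy="first" merges sub-word tokens
# into whole words, labelling each word with the tag of its first token.
generator = pipeline(
    task="ner",
    model="cjber/reddit-ner-place_names",
    tokenizer="cjber/reddit-ner-place_names",
    aggregation_strategy="first",
)

out = generator("I like reading books. I live in Reading.")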
out gives:
[
    {
        "entity_group": "location",
        "score": 0.94123614,
        "word": "reading",
        "start": 32,
        "end": 39,
    }
]
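The `word` field in the output above is lowercased, which is consistent with the uncased base model. Since `start` and `end` are character offsets into the input string, the original casing can be recovered by slicing. A minimal sketch, where `surface_forms` and its `min_score` parameter are hypothetical names and the `out` list is the output above copied by hand:

```python
text = "I like reading books. I live in Reading."

# Output shown above, copied by hand rather than produced by running the model.
out = [
    {
        "entity_group": "location",
        "score": 0.94123614,
        "word": "reading",
        "start": 32,
        "end": 39,
    }
]


def surface_forms(source, preds, min_score=0.0):
    """Hypothetical helper: slice the source text with each span's offsets,
    keeping only predictions that score at least `min_score`."""
    return [
        source[p["start"]:p["end"]]
        for p in preds
        if p["score"] >= min_score
    ]


places = surface_forms(text, out)
print(places)
```

Raising `min_score` is one simple way to trade recall for precision on noisy social media text; the right threshold is an assumption that would need checking against labelled data.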