Reddit NER for place names

A bert-base-uncased model fine-tuned for named entity recognition, trained on wnut_17 plus 498 additional comments from Reddit. The model is intended solely for extracting place names from social media text, so all other entity types have been removed.

This model was created with two key goals:

Improve NER results on social media text
Target only place names

Model code

The model code is available in the model's GitHub repository.

Metonymy

In theory, this model should be able to detect and ignore metonyms, that is, place names used to refer to something other than the place itself. For example, in the sentence:

Manchester played Liverpool last night in Liverpool.

Both Manchester and the first mention of Liverpool refer to football teams, so the model outputs only the final Liverpool, which refers to the city:

[
    {
        "entity_group": "location",
        "score": 0.9975672,
        "word": "liverpool",
        "start": 42,
        "end": 51,
    }
]
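Note that the `word` field is lowercased (the base model is uncased), while `start` and `end` are character offsets into the original sentence. The short sketch below recovers the original-cased place name by slicing the input with those offsets. The helper name `surface_forms` is hypothetical and not part of the model or of `transformers`; the entity list is copied from the output above.

```python
text = "Manchester played Liverpool last night in Liverpool."

# Output copied from the example above (not produced by running the model here).
entities = [
    {
        "entity_group": "location",
        "score": 0.9975672,
        "word": "liverpool",
        "start": 42,
        "end": 51,
    }
]


def surface_forms(text, entities):
    """Return the original-cased span of text for each predicted entity.

    Hypothetical helper: slices the input using the start/end offsets.
    """
    return [text[ent["start"]:ent["end"]] for ent in entities]


places = surface_forms(text, entities)
print(places)  # ['Liverpool']
```

Slicing by offset also makes it unambiguous which mention was tagged: characters 42 to 51 cover the second Liverpool, not the first.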

Use in `transformers`

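from transformers import pipeline

# Load the fine-tuned token classification model and its tokenizer.
ner = pipeline(
    task="ner",
    model="cjber/reddit-ner-place_names",
    tokenizer="cjber/reddit-ner-place_names",
    # Merge sub-word tokens into whole words, labelling each word by its first token.
    aggregation_strategy="first",
)

# "reading" (the verb) should be ignored; "Reading" (the town) should be tagged.
out = ner("I like reading books. I live in Reading.")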

out gives:

[
    {
        "entity_group": "location",
        "score": 0.94123614,
        "word": "reading",
        "start": 32,
        "end": 39,
    }
]
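When applying the model to many comments, it may be useful to keep only confident predictions and collect the place names per comment. The following is a minimal sketch, not part of the model code: `extract_places`, the `min_score` default of 0.9, and the `stub_ner` stand-in are all hypothetical. `stub_ner` simply replays the output shown above so the example runs without downloading the model; in practice the pipeline object would be passed in its place.

```python
from typing import Callable, List


def extract_places(
    ner: Callable[[str], List[dict]],
    comments: List[str],
    min_score: float = 0.9,  # hypothetical threshold, tune for your data
) -> List[List[str]]:
    """Return, for each comment, the place names found with score >= min_score."""
    results = []
    for comment in comments:
        places = [
            # Slice the original text so the original casing is preserved.
            comment[ent["start"]:ent["end"]]
            for ent in ner(comment)
            if ent["entity_group"] == "location" and ent["score"] >= min_score
        ]
        results.append(places)
    return results


def stub_ner(text: str) -> List[dict]:
    """Stand-in for the real pipeline: replays the output shown above."""
    canned = {
        "I like reading books. I live in Reading.": [
            {
                "entity_group": "location",
                "score": 0.94123614,
                "word": "reading",
                "start": 32,
                "end": 39,
            }
        ]
    }
    return canned.get(text, [])


comments = ["I like reading books. I live in Reading.", "No places here."]
places = extract_places(stub_ner, comments)
print(places)  # [['Reading'], []]
```

Raising `min_score` trades recall for precision: with a threshold of 0.95, the Reading prediction above (score 0.94) would be dropped.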

Model size: 0.1B params, stored as Safetensors (tensor types I64 and F32).
