I am trying to train a spaCy model with a small dataset in spaCy 2.2. It is overfitting, so I want to customize the architecture of the TextCategorizer. I referred to this post on GitHub:
However, I am unable
from spacy.pipeline import TextCategorizer
from thinc.api import layerize
from spacy.language import Language


class StupidTextCategorizer(TextCategorizer):
    name = 'stupid_textcat'

    @classmethod
    def Model(cls, nr_class, **cfg):
        return create_dummy_model(nr_class, cfg.get('preferred_class', 0))


def create_dummy_model(nr_class, preferred_class):
    """Create a Thinc model that always predicts the same class."""
    def dummy_model(docs, drop=0.):
        scores = model.ops.allocate((len(docs), nr_class))
        scores[:, preferred_class] = 1.0
        return scores

    model = layerize(dummy_model)
    return model
However, when I try to pass it to my training script, it throws the following error, which I can't seem to understand.
"[E002] Can't find factory for 'stupid_textcat'. This usually happens when spaCy calls `nlp.create_pipe` with a component name that's not built in - for example, when constructing the pipeline from a model's meta.json. If you're using a custom component, you can write to `Language.factories['stupid_textcat']` or remove it from the model meta and add it via `nlp.add_pipe` instead."
PS: I am still learning spaCy, and I can't find any helpful documentation or tutorial for the above.
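The error message itself points at one possible fix: writing an entry to `Language.factories['stupid_textcat']`. Below is a minimal sketch of that registration, assuming the spaCy 2.x convention that `Language.factories` is a dict mapping a component name to a callable invoked as `factory(nlp, **cfg)`, and that the component is built from `nlp.vocab`. The helper name `register_factory` is hypothetical and not part of spaCy; the spaCy calls are left as comments so the sketch runs without spaCy installed.

```python
def register_factory(language_cls, name, component_cls):
    """Register `component_cls` under `name` in `language_cls.factories`.

    Hypothetical helper, not part of spaCy. Assumes the spaCy 2.x
    convention that a factory is called as factory(nlp, **cfg) and that
    the component is constructed from nlp.vocab.
    """
    def factory(nlp, **cfg):
        # Build the component the same way the built-in factories are
        # assumed to: shared vocab first, then any config overrides.
        return component_cls(nlp.vocab, **cfg)

    language_cls.factories[name] = factory
    return factory


# Assumed usage with spaCy 2.2 and the class from the question
# (kept as comments so this block does not require spaCy to run):
#
#   from spacy.language import Language
#   register_factory(Language, StupidTextCategorizer.name, StupidTextCategorizer)
#   textcat = nlp.create_pipe("stupid_textcat")
#   nlp.add_pipe(textcat, last=True)
```

The registration has to run before anything calls `nlp.create_pipe("stupid_textcat")`, including loading a saved model whose meta.json lists the component; otherwise the name lookup fails with the same E002 error.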