Models
Word2VecHelper
A wrapper around Gensim Word2Vec
- class datawords.models.Word2VecHelper(parser_conf: ParserConf, phrases_model=None, size: int = 100, window: int = 5, min_count: int = 1, workers: int = 1, epoch: int = 5, model: Word2Vec | None = None, using_kv=False, loaded_from=None, stopw: StopWords | None = None)
- __init__(parser_conf: ParserConf, phrases_model=None, size: int = 100, window: int = 5, min_count: int = 1, workers: int = 1, epoch: int = 5, model: Word2Vec | None = None, using_kv=False, loaded_from=None, stopw: StopWords | None = None)
A wrapper around the original Word2Vec implementation from the Gensim library. It adds the option to store and track the model's training parameters, including the parser configuration used during training.
- property vector_size: int
- property wv: Word2Vec | KeyedVectors
- fit(X: Iterable)
Train the model on an iterable of plain texts.
- Parameters:
X (Iterable) – An iterable which returns plain texts.
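As an illustration of what X might look like, the sketch below defines a small re-iterable corpus that yields one plain text per non-empty line of a file. The class name LineCorpus and the one-text-per-line layout are assumptions made for this example and are not part of datawords. A restartable iterable is used rather than a one-shot generator because Gensim's Word2Vec makes several passes over its training data; whether fit itself iterates more than once is not stated here.

```python
from pathlib import Path
from typing import Iterator, Union


class LineCorpus:
    """Hypothetical helper: yields one plain text per non-empty line of a file."""

    def __init__(self, path: Union[str, Path]) -> None:
        self.path = Path(path)

    def __iter__(self) -> Iterator[str]:
        # Reopening the file on every call makes the corpus restartable,
        # so it can be consumed once per training pass.
        with self.path.open(encoding="utf-8") as fh:
            for line in fh:
                text = line.strip()
                if text:
                    yield text
```

An instance could then be passed as X, for example `model.fit(LineCorpus("corpus.txt"))`, where `model` is a Word2VecHelper and `corpus.txt` is a placeholder path.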
- parse(sentence: str) → List[str]
Parse a single text.
- Parameters:
sentence (str) – The plain text to parse.
- Returns:
A list of words.
- Return type:
List[str]
- encode(sentence: str) → ndarray
Take a sentence in plain text and encode it as a vector.
- vectorize(sentence: List[str]) → ndarray
Get a vector from a list of words. If the sentence contains words that are not in the Word2Vec model, zeros are used in their place.
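The zero-filling behaviour can be pictured with a small pure-Python sketch. Everything in it is illustrative: the toy_vectors table, the vectorize_sketch function, and the choice to average the per-word vectors are assumptions, since the description above does not say how the word vectors are combined into one.

```python
from typing import Dict, List

# Toy 3-dimensional embeddings standing in for a trained model (made-up values).
toy_vectors: Dict[str, List[float]] = {
    "good": [1.0, 0.0, 2.0],
    "movie": [3.0, 2.0, 0.0],
}


def vectorize_sketch(
    words: List[str], vectors: Dict[str, List[float]], size: int
) -> List[float]:
    """Average the word vectors, using a zero vector for unknown words."""
    if not words:
        return [0.0] * size
    total = [0.0] * size
    for word in words:
        # Out-of-vocabulary words contribute zeros (assumed behaviour).
        vec = vectors.get(word, [0.0] * size)
        total = [t + v for t, v in zip(total, vec)]
    return [t / len(words) for t in total]


print(vectorize_sketch(["good", "movie"], toy_vectors, size=3))    # [2.0, 1.0, 1.0]
print(vectorize_sketch(["good", "unknown"], toy_vectors, size=3))  # [0.5, 0.0, 1.0]
```

Under this averaging assumption, an unknown word pulls the result toward the origin instead of raising an error.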
- save(fp: str | PathLike)
- classmethod load(fp: str | PathLike, keyed_vectors=False) → Word2VecHelper
- class datawords.models.W2VecMeta(name: str, lang: str, parser_conf: ParserConf, phrases_model_path: str | None = None, epoch: int = 5, size: int = 100, window: int = 5, min_count: int = 1, version: str = '0.7.3', path: str | None = None)
- name: str
- lang: str
- parser_conf: ParserConf
- phrases_model_path: str | None
- epoch: int
- size: int
- window: int
- min_count: int
- version: str
- path: str | None
- __init__(name: str, lang: str, parser_conf: ParserConf, phrases_model_path: str | None = None, epoch: int = 5, size: int = 100, window: int = 5, min_count: int = 1, version: str = '0.7.3', path: str | None = None) → None
Method generated by attrs for class W2VecMeta.
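To show the shape of the metadata record, the sketch below mirrors the W2VecMeta fields and defaults listed above and round-trips them through JSON. It is a stand-in, not the real class: it uses the standard library's dataclasses instead of attrs, replaces ParserConf with a plain dict (the `lowercase` key is hypothetical), and JSON is an assumed storage format, since how datawords persists this record is not described here.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class MetaSketch:
    """Stand-in mirroring the W2VecMeta fields; not the real class."""

    name: str
    lang: str
    parser_conf: Dict[str, Any]  # placeholder for ParserConf
    phrases_model_path: Optional[str] = None
    epoch: int = 5
    size: int = 100
    window: int = 5
    min_count: int = 1
    version: str = "0.7.3"
    path: Optional[str] = None


# Record the training parameters next to a (hypothetical) model name.
meta = MetaSketch(name="demo", lang="en", parser_conf={"lowercase": True})

# Serialize to JSON and rebuild the record from the decoded dict.
encoded = json.dumps(asdict(meta), sort_keys=True)
restored = MetaSketch(**json.loads(encoded))
```

Keeping the parser configuration in the same record as the hyperparameters means a reloaded model can be paired with the same preprocessing it was trained with.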