model = Doc2Vec(dm=0, vector_size=100, negative=5, hs=0, min_count=2, epochs=5, workers=cores)
Let's quickly understand the parameters we have used in the preceding code:
- dm ({1,0}, optional): This defines the training algorithm; if dm=1, distributed memory (PV-DM) is used; otherwise, a distributed bag of words (PV-DBOW) is employed
- vector_size (int, optional): This is the dimensionality of the feature vectors
- window (int, optional): This represents the maximum distance between the current and predicted word within a sentence
- negative (int, optional): If > 0, negative sampling will be used (the int specifies how many noise words should be drawn, usually between 5 and 20); if set to 0, no negative sampling is used