Most prediction models condition only on the words that precede the target: they learn from past words and predict the next one. CBOW (continuous bag of words), in contrast, uses the N words before and the N words after the word in question to predict it. It represents this context as a continuous bag of words, so the order of the context words is of no significance. In short, CBOW takes a window of context words as input and predicts the word at the center of that window.
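To make the windowing idea concrete, here is a minimal sketch in plain Python. The function names `cbow_pairs` and `average_vector`, the example sentence, and the two-dimensional toy embeddings are all assumptions made for illustration; they are not taken from any library. The sketch only builds (context, target) pairs and averages context vectors to show that order does not matter. It does not train a model.

```python
def cbow_pairs(tokens, n):
    """Return (context, target) pairs using up to n words on each side."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - n):i]      # up to n words before the target
        right = tokens[i + 1:i + 1 + n]     # up to n words after the target
        pairs.append((left + right, target))
    return pairs


def average_vector(context, embeddings):
    """Average the context word vectors; the result ignores word order."""
    if not context:
        raise ValueError("context must contain at least one word")
    dim = len(embeddings[context[0]])
    total = [0.0] * dim
    for word in context:
        for k, value in enumerate(embeddings[word]):
            total[k] += value
    return [value / len(context) for value in total]


# Hypothetical example sentence and toy embeddings (illustration only).
tokens = "the quick brown fox jumps".split()
pairs = cbow_pairs(tokens, n=2)
context, target = pairs[2]
print(context, "->", target)  # ['the', 'quick', 'fox', 'jumps'] -> brown

toy_embeddings = {
    "the": [1.0, 0.0],
    "quick": [0.0, 1.0],
    "fox": [2.0, 2.0],
    "jumps": [4.0, 0.0],
}
print(average_vector(context, toy_embeddings))  # [1.75, 0.75]
```

Reversing or shuffling `context` gives the same averaged vector, which is the sense in which the context is treated as a bag of words. Words near the start or end of the sentence simply get a shorter context in this sketch; real implementations may pad or handle the edges differently.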
The following figure represents how CBOW works:
Based on the previous diagram, CBOW can be formalized as:
The previous formula is based on a window of n words around ...
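For reference, one common way to write a CBOW objective over a window of n words on each side of the target is given below. This is a sketch of the standard form of the objective and may differ in notation from the formula referred to above. The symbols T (number of words in the corpus), w_t (the word at position t), v and v' (input and output word vectors), h (the averaged context vector), and V (the vocabulary) are assumptions introduced here.

$$
J = \frac{1}{T} \sum_{t=1}^{T} \log p\left(w_t \mid w_{t-n}, \dots, w_{t-1}, w_{t+1}, \dots, w_{t+n}\right)
$$

$$
p\left(w_t \mid \text{context}\right) = \frac{\exp\left({v'_{w_t}}^{\top} h\right)}{\sum_{w \in V} \exp\left({v'_{w}}^{\top} h\right)},
\qquad
h = \frac{1}{2n} \sum_{\substack{-n \le j \le n \\ j \ne 0}} v_{w_{t+j}}
$$

Because h is an average over the context vectors, permuting the words inside the window leaves the predicted distribution unchanged.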