Increasing ngram_range did work for us, but switching the class prior from learned (the default) to uniform, by setting fit_prior to False, did not help at all, as follows:
mnb_clf = Pipeline([
    ('vect', CountVectorizer(stop_words='english', ngram_range=(1, 3))),
    ('tfidf', TfidfTransformer()),
    ('clf', MNB(fit_prior=False)),
])
mnb_clf.fit(X=X_train, y=y_train)
mnb_acc, mnb_predictions = imdb_acc(mnb_clf)
mnb_acc  # 0.8572
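To make the effect of fit_prior concrete, here is a minimal sketch on a tiny, deliberately imbalanced toy dataset. The arrays X_toy and y_toy are made up for illustration and are not part of the IMDb data. With fit_prior=True (scikit-learn's default) the class priors are estimated from the label frequencies; with fit_prior=False they are uniform. On a training set whose classes are roughly balanced, the two settings give almost the same priors, which is one plausible reason the change made no difference here.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy count features: three samples of class 0, one of class 1.
X_toy = np.array([[2, 0], [3, 1], [1, 0], [0, 2]])
y_toy = np.array([0, 0, 0, 1])

# fit_prior=True (default): priors are learned from class frequencies.
learned = MultinomialNB(fit_prior=True).fit(X_toy, y_toy)
# fit_prior=False: every class gets the same prior, 1 / n_classes.
uniform = MultinomialNB(fit_prior=False).fit(X_toy, y_toy)

# class_log_prior_ stores log-probabilities, so exponentiate to read them.
learned_priors = np.exp(learned.class_log_prior_)
uniform_priors = np.exp(uniform.class_log_prior_)

print("learned priors:", learned_priors)
print("uniform priors:", uniform_priors)
```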
We have now tried each combination we could think of that might improve our performance. Note that this manual approach is tedious, and also error-prone because it relies heavily on human intuition.
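One way to reduce the reliance on intuition is to let the library enumerate the combinations. The following is only a sketch under stated assumptions: it uses scikit-learn's GridSearchCV on a hypothetical eight-sentence toy corpus (toy_texts, toy_labels) rather than the real training data, and the grid values are illustrative choices, not tuned recommendations.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus standing in for the real reviews.
toy_texts = [
    "great movie loved it", "wonderful acting great plot",
    "loved the wonderful story", "great fun wonderful cast",
    "terrible movie hated it", "awful acting boring plot",
    "hated the boring story", "terrible awful cast",
]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

pipeline = Pipeline([
    ("vect", CountVectorizer(stop_words="english")),
    ("tfidf", TfidfTransformer()),
    ("clf", MultinomialNB()),
])

# Keys follow the "<step name>__<parameter>" convention for pipelines.
param_grid = {
    "vect__ngram_range": [(1, 1), (1, 2), (1, 3)],
    "clf__fit_prior": [True, False],
}

# Every combination (3 x 2 = 6) is scored with 2-fold cross-validation.
search = GridSearchCV(pipeline, param_grid, cv=2, scoring="accuracy")
search.fit(toy_texts, toy_labels)

print(search.best_params_, search.best_score_)
```

On real data, the same pattern would take X_train and y_train in place of the toy lists, with a larger cv value.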