Predict book review scores given free-form text reviews
This experiment shows how the Preprocess Text and Extract N-Gram Features from Text modules can be used to clean and featurize free-form text from book reviews and then predict review scores. Scores are grouped into low (1, 2) and high (4, 5) to reduce the problem to binary classification. Note how the Extract N-Gram Features from Text module is used in the training branch to create the n-gram vocabulary, combined with feature selection that keeps the most important n-gram features. In the scoring branch, that vocabulary is used in read-only mode, so the scoring data is featurized with the n-grams selected during training and no new n-grams are added.
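The split between a training branch that creates the vocabulary and a scoring branch that only reads it can be illustrated outside the experiment with a small, standard-library Python sketch. This is not the implementation of the modules named above: the function names (`binarize_score`, `preprocess`, `ngrams`, `build_vocabulary`, `featurize`), the tokenization rule, the bigram limit, the sample reviews, and the simple class-frequency-difference score used for feature selection are all assumptions made for illustration. Dropping 3-star reviews is likewise an assumption.

```python
import re
from collections import Counter


def binarize_score(score):
    """Map scores 1-2 to 0 (low) and 4-5 to 1 (high).
    Returning None for a score of 3 is an assumption of this sketch."""
    if score <= 2:
        return 0
    if score >= 4:
        return 1
    return None


def preprocess(text):
    """Hypothetical cleaning step: lowercase and keep alphabetic tokens only."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, max_n=2):
    """Return all n-grams of length 1..max_n as space-joined strings."""
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def build_vocabulary(reviews, labels, top_k=2):
    """Training branch: create the n-gram vocabulary and keep the top_k n-grams
    whose document frequency differs most between the two classes."""
    high, low = Counter(), Counter()
    for text, label in zip(reviews, labels):
        grams = set(ngrams(preprocess(text)))  # count each n-gram once per review
        (high if label == 1 else low).update(grams)
    scores = {g: abs(high[g] - low[g]) for g in set(high) | set(low)}
    ranked = sorted(scores, key=lambda g: (-scores[g], g))  # ties broken alphabetically
    return ranked[:top_k]


def featurize(text, vocabulary):
    """Scoring branch: the vocabulary is read-only, so n-grams that are
    not in it are ignored rather than added."""
    counts = Counter(ngrams(preprocess(text)))
    return [counts[g] for g in vocabulary]


# Made-up sample data, for illustration only.
raw = [
    ("Great book, loved it", 5),
    ("Loved the story", 4),
    ("Boring book, hated it", 1),
    ("Hated the ending", 2),
    ("It was fine", 3),
]
labeled = [(text, binarize_score(s)) for text, s in raw]
train = [(text, label) for text, label in labeled if label is not None]

vocabulary = build_vocabulary([t for t, _ in train], [l for _, l in train])
features = featurize("I loved loved this wonderful book", vocabulary)
print(vocabulary)  # ['hated', 'loved']
print(features)    # [0, 2]
```

In the scoring call, "wonderful" never appeared in the training reviews, so it contributes no feature. The feature vector has exactly one entry per n-gram selected during training, which is the effect that using the vocabulary in read-only mode is meant to achieve.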