Answering open-ended questions is difficult and often requires manual effort from subject matter experts (SMEs). To reduce the demands on internal SMEs, companies often publish lists of Frequently Asked Questions (FAQs) to assist users. This example demonstrates, using Azure Machine Learning Workbench, several effective machine learning methods for matching open-ended queries to pre-existing Question & Answer (Q&A) pairs, such as those found in a FAQ list or on websites like Stack Overflow.

There are many approaches to matching a question to its correct answer, such as finding the answer that is most similar to the question. In this example, however, open-ended questions are matched to previously asked questions, on the assumption that each answer in the FAQ can answer multiple semantically equivalent questions.

The key steps required to deliver this solution are as follows:

1. Clean and process the text data.
2. Learn informative phrases: multi-word sequences that provide more information when viewed in sequence than when their words are treated independently.
3. Extract features from the text data.
4. Train text classification models and evaluate model performance.
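The steps above can be sketched as a minimal pipeline. This is a simplified illustration, not the example's actual code: the question texts and FAQ IDs are invented, scikit-learn stands in for the example's tooling, and word bigram features stand in for the learned phrases of step 2.

```python
# Hypothetical FAQ-matching sketch: clean text, extract n-gram features,
# and train a classifier that maps new questions to FAQ answer IDs.
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def clean(text):
    """Step 1: lowercase and strip non-alphanumeric characters."""
    return re.sub(r"[^a-z0-9 ]+", " ", text.lower()).strip()


# Toy training data: several paraphrased questions per FAQ answer ID,
# reflecting the assumption that one answer covers many equivalent questions.
questions = [
    "How do I reset my password?",
    "I forgot my password, what should I do?",
    "Where can I change my billing address?",
    "How to update the address on my invoice?",
]
answer_ids = ["faq_password", "faq_password", "faq_billing", "faq_billing"]

# Steps 3-4: featurize with unigrams/bigrams and train a text classifier.
model = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit([clean(q) for q in questions], answer_ids)

# Predict the best-matching FAQ entry for a new open-ended query.
print(model.predict([clean("I can't remember my password!")])[0])
```

A new query is routed to whichever FAQ entry's paraphrased questions it most resembles, which is the classification framing described above.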
- The detailed documentation for this Q&A matching example includes a step-by-step walk-through: [https://docs.microsoft.com/azure/machine-learning/preview/scenario-qna-matching](https://docs.microsoft.com/azure/machine-learning/preview/scenario-qna-matching).
- For code samples, click the __View Project__ icon on the right and visit the project GitHub repository.
- Key components needed to run this example:
  1. An [Azure account](https://azure.microsoft.com/free/) (free trials are available).
  2. An installed copy of Azure Machine Learning Workbench with a workspace created.
  3. This example can run on any compute context; however, a multi-core machine with at least 16 GB of memory and 5 GB of disk space is recommended.