Scenario1:
The model is trained on data related to people’s medical history and is tested for data on their work efficiency
Here, we see that the training data is not representative of the testing data i.e they (in general) have a poor correlation. This corresponds to the case of too much generalization out of the training data. It is natural to obtain incorrect predictions from the model in such cases.
Scenario2:
The model is trained on data related to a people’s medical history and is tested for data on medical prescriptions
Here, we see that the training and testing datasets have too high a correlation. This corresponds to the case of memorization. It is expected to have a high degree of accuracy in such cases. However, such implementations do not capitalize on the actual power of Machine Learning as the probability of the need to make informed guesses in such cases is pretty low.
Scenario3:
The model is trained on data related to people’s medical history and is tested for data on possible future health risks
Here, the datasets have the perfect degree of correlation to implement Machine Learning. The desired output incorporates a probabilistic factor ruling out memorization and at the same time, the training data can offer some great insights for generalization. Such implementations strike the perfect balance between generalization and memorization and effectively utilize machine learning capabilities.
Key Takeaway:
Machine Learning works best and is most useful when:
i) No Deterministic Algorithm exists to the problemBased on the above criteria, some suitable applications of Machine Learning are:
i) Movie recommendations based on user’s past views or likings