MIT Technology Review: The way we train AI is fundamentally flawed
This article aptly sums up the gap between real-world expectations and theory. It discusses "underspecification", a known issue in statistics where observed effects can have many possible causes. The training process can produce many different models that all pass the test, but — and this is the crucial part — these models will differ in small, arbitrary ways, depending on things like the random values given to the nodes in a neural network before training starts, the way training data is selected or represented, the number of training runs, and so on. These small, often random, differences are typically overlooked if they don't affect how a model does on the test. But it turns out they can lead to huge variation in performance in the real world.
One option is to add an extra stage to the training and testing process, in which many models are produced at once instead of just one. These competing models can then be stress-tested on specific real-world tasks to select the best one for the job.
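The selection stage described above can be sketched in a few lines. This is a minimal illustration, not the article's actual method: the data, the noise-based "stress test" standing in for real-world shift, and the model choices are all assumptions made for the example.

```python
# Sketch: train several models that differ only in random initialization,
# then pick the one that holds up best on a stress test.
# (Hypothetical data and setup, for illustration only.)
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# A standard benchmark-style split: models are judged against this test set.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# A separate "stress test" meant to mimic deployment conditions; here we
# simulate distribution shift by adding noise to the test inputs.
rng = np.random.default_rng(0)
X_stress = X_test + rng.normal(scale=0.5, size=X_test.shape)

# Produce many candidate models at once, identical except for the random
# seed that sets their initial weights.
candidates = [
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=seed)
    .fit(X_train, y_train)
    for seed in range(5)
]

# All candidates may look similar on the standard test...
test_scores = [m.score(X_test, y_test) for m in candidates]
# ...but the stress test can separate them; select the best performer there.
stress_scores = [m.score(X_stress, y_test) for m in candidates]
best = candidates[int(np.argmax(stress_scores))]
```

The key point is that `test_scores` alone would not distinguish the candidates, while the stress scores expose the arbitrary differences the article describes.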
Full article is here: https://www.technologyreview.com/2020/11/18/1012234/training-machine-learning-broken-real-world-heath-nlp-computer-vision/