Artificial Intelligence vs Machine Learning vs Deep Learning: How Are They Different?
After publishing an article for non-technical readers about the difference between Artificial Intelligence, Machine Learning and Deep Learning, find below a more technical version written by our R&D team. This one is even more interesting…
With the recent successes of Deep Learning, people started hearing more and more about the wonders of this technology, be it in self-driving cars, automatic text translation between languages, or beating grand masters at board games. However, as the venues publishing these stories were increasingly aimed at the general population, the terms Artificial Intelligence, Machine Learning and Deep Learning started to be used interchangeably. This is not inherently wrong, as each is a narrower, more specific subset of the previous one. However, it may lead one to believe that they are in fact the same thing, which is not true. In this post we will explain the difference between the three.
Artificial Intelligence (AI)
The most general of the three, Artificial Intelligence (AI), is concerned with all things intelligent that a machine could ever do. The ultimate aim of AI is to produce machines capable of performing cognitive tasks as well as, or better than, humans. That is known in the field as strong AI, and we are not there yet. Not even close. In the meantime, people are working on weak (sometimes called narrow) AI: a system built to focus on a single, very specialized skill, at which it has to be very good. For example, playing chess.
The roots of AI can be traced back to ancient Greece, to its myths and legends about animated automatons, but also to the invention of syllogistic logic by Aristotle, the first formal deductive reasoning system. Throughout the centuries, a parallel stream of developments in philosophy, mathematics, psychology, and neuroscience on one side, and technical and mechanical progress on the other, have shaped the concept of intelligence and produced many attempts to replicate it by artificial means.
Artificial Intelligence coalesced as a distinct field of study in the famous Dartmouth workshop, organized by John McCarthy in 1956, where several scientists and mathematicians attempted to clarify and develop the many ideas and innovations related to “thinking machines”, such as cybernetics, neural networks, and computing machines, that were floating around at the time.
Machine Learning (ML)
Machine Learning (ML) is a sub-field of AI focused on computer systems that do not need to be programmed explicitly by a human, but can be trained from data or learn from experience. Machine Learning draws heavily from statistics, and in many instances it can be seen as some sort of probabilistic model describing “The World” that gets updated and refined as more data is presented to it. Then, this model can be used to detect patterns, and to make intelligent decisions based on new data observations.
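To make the idea of a model that “gets updated and refined as more data is presented to it” concrete, here is a minimal sketch, not from the article itself: a Beta-Bernoulli model that estimates the bias of a coin, updating its belief one observation at a time. The data and prior are illustrative.

```python
# A minimal sketch of the "model refined by data" idea: estimating the
# bias of a coin with a Beta-Bernoulli model, one observation at a time.
# (Illustrative toy example; the numbers are made up.)

def update(alpha, beta, flip):
    """Bayesian update of a Beta(alpha, beta) belief after one flip (1 = heads)."""
    return (alpha + flip, beta + (1 - flip))

alpha, beta = 1, 1  # uniform prior over the coin's bias
for flip in [1, 1, 0, 1, 1, 1, 0, 1]:  # observed data, presented sequentially
    alpha, beta = update(alpha, beta, flip)

estimate = alpha / (alpha + beta)  # posterior mean of the bias
print(round(estimate, 3))  # 0.7 after seeing 6 heads and 2 tails
```

Each new observation nudges the model's belief, which is exactly the pattern-detection-from-data loop the paragraph describes.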
Probably the first Machine Learning system was Arthur Samuel’s checkers-playing program (1952), which could learn to play better from its opponents. Since then, ML has greatly evolved and is one of the most successful sub-fields of AI, with myriad applications in use across society.
One useful way to divide ML is by whether or not supervision is available. In supervised learning tasks we are given not only data, but also labels generated by an oracle or teacher. This information is then used to adjust the parameters of our model, so we can use it to predict the labels of new data. Typical supervised learning tasks are classification and regression; reinforcement learning, where the feedback comes as rewards rather than explicit labels, is often treated as a third paradigm of its own. On the other hand, if supervision is not available, we have to resort to unsupervised learning methods, which focus more on modelling the structure of the data. Typical unsupervised learning tasks are clustering, dimensionality reduction, and representation learning. Unsupervised learning is often much harder than supervised learning, and the latter represents the majority of the success stories in industry.
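The contrast above can be sketched in a few lines. This is a deliberately tiny illustration with made-up data: a 1-nearest-neighbour classifier stands in for supervised learning, and a crude threshold split over unlabeled points stands in for clustering.

```python
# Toy contrast between the two settings (hypothetical data, not from the article).

# --- Supervised: data comes with labels; predict labels for new points ---
train = [(1.0, "small"), (1.2, "small"), (8.0, "large"), (8.5, "large")]

def classify(x):
    # 1-nearest-neighbour: predict the label of the closest training point.
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

print(classify(1.5))   # "small"
print(classify(9.0))   # "large"

# --- Unsupervised: no labels; only model the structure of the data ------
points = [1.0, 1.2, 8.0, 8.5]
threshold = sum(points) / len(points)          # crude 1-D "clustering"
clusters = [0 if p < threshold else 1 for p in points]
print(clusters)        # [0, 0, 1, 1]
```

Note how the unsupervised half never sees a label: it can only say which points belong together, not what the groups mean, which is one reason those tasks tend to be harder.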
Deep Learning (DL)
A big part of the recent success of ML is thanks to our ever-increasing computational power and to the availability of large amounts of labeled training data collected through the Internet. The more data and computational power you have, the larger the models you can afford to build. Enter Deep Learning (DL).
Classical ML pipelines featured a complex hand-crafted process to go from raw data (e.g. image pixels) to a representation suitable for learning. That is, the machine learning tools were applied only at the last stage of the process. Deep Learning advocates claimed that a better approach would be to let learning take care of all the steps, end-to-end from raw data to outputs, a task for which neural networks were a natural framework.
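A minimal sketch of what “end-to-end from raw data to outputs” means, assuming a toy problem of our own invention: a single learnable weight is fitted by gradient descent directly on the raw inputs, with no hand-crafted feature step in between.

```python
# Toy end-to-end learning: one parameter trained by gradient descent
# straight from raw inputs to targets. (Illustrative example; the data
# and the y = 2x mapping are made up, not from the article.)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # raw input -> target
w = 0.0                                      # model parameter, learned from data

for _ in range(200):                         # gradient descent on squared error
    for x, y in data:
        grad = 2 * (w * x - y) * x           # d/dw of (w*x - y)^2
        w -= 0.01 * grad                     # small step against the gradient

print(round(w, 2))  # ≈ 2.0: the mapping was learned end-to-end
```

A deep network does the same thing at scale: many such parameters, stacked in layers, all adjusted jointly by gradients flowing from the output back to the raw pixels.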
With the new data and computational power, they were proven right at the 2012 ImageNet Large-Scale Visual Recognition Challenge, where competing vision systems had to classify a million images into a thousand categories. The only deep learning entrant completely smashed the performance of the others. Since then, deep neural networks have been continuously pushing the limits of what AI can do, and it seems they won’t stop!