Deep learning

Johanna Pingel

When the machine decides

In deep learning, a computer model learns to perform classification tasks directly from images, texts or sounds - without human intervention.

Deep learning is one of the key technologies of the Industrial IoT - and it is by no means a dream of the future, but is already being used today. Important areas of application include computer vision and speech processing. Phones and hands-free devices that are controlled via natural language use deep learning algorithms. Deep learning is also at work when an ATM detects counterfeit banknotes or when a smartphone app displays the translation of a foreign-language street sign in real time (Fig. 1). Autonomous driving - even if it is not yet ready for everyday use - would hardly be conceivable without deep learning.

Figure 1: Examples of the application of deep learning: autonomous driving, apps for simultaneous translation or the detection of counterfeit money in an ATM. © (Image: MathWorks)

Many of the techniques used today in the field of deep learning have been around for decades. Deep learning models now achieve a very high level of accuracy and sometimes even surpass human recognition performance. In recent years, new tools and processes have improved deep learning algorithms to such an extent that they have already managed to classify images better than humans, win against the world's best Go players or enable a voice-controlled smart home assistant such as Google Home or Alexa to find and play a specific song on command.

Three main factors have made this progress possible:

  • Deep learning requires large amounts of pre-classified data. For example, the development of a self-driving vehicle requires millions of images and thousands of hours of video. Data sets of this size have only recently become widely available.
  • Deep learning requires high computing power. High-performance graphics processing units (GPUs) have a parallel architecture that can be used efficiently for deep learning. In combination with clusters or cloud computing, this enables development teams to reduce the training time for a deep learning network from weeks to hours or less.
  • Training neural networks requires a lot of time and specialized expert knowledge. Pre-trained neural networks can help. For example, AlexNet has already been trained to recognize 1,000 object categories using 1.3 million high-resolution images. Such networks can be adapted to individual requirements with the help of transfer learning, which saves valuable time during development.
Figure 2: Diagram of a deep neural network with input, output and hidden layers. The network is "deep" because it can have up to 150 hidden layers. © (Image: MathWorks)

How deep learning networks are trained
Most deep learning methods use neural network architectures, which is why deep learning models are often referred to as "deep neural networks". They combine several non-linear processing layers built from simple elements that work in parallel, inspired by biological nervous systems. They consist of an input layer, several hidden layers and an output layer. The layers are connected to each other via nodes, or neurons, and each hidden layer uses the output of the previous layer as its input (Figure 2).

The term "deep" refers to the number of hidden layers in the neural network. Traditional neural networks contain only two to three hidden layers, while deep networks can have up to 150.
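The layered structure described above can be illustrated with a minimal sketch in plain Python. It assumes fully connected layers, a tanh activation and random, untrained weights; the layer sizes and the function names (make_layer, build_network, forward) are chosen purely for illustration and are not taken from any particular toolbox.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Create one fully connected layer as a (weights, biases) pair."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def build_network(sizes, seed=0):
    """Build one layer for every pair of consecutive sizes."""
    rng = random.Random(seed)
    return [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

def forward(layers, x):
    """Pass x through the network; each layer consumes the previous layer's output."""
    activation = x
    for weights, biases in layers:
        activation = [
            math.tanh(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation

# Assumed sizes: 4 inputs, three hidden layers of 8 neurons each, 3 outputs.
network = build_network([4, 8, 8, 8, 3])
output = forward(network, [0.5, -0.2, 0.1, 0.9])
print(len(network) - 1, "hidden layers,", len(output), "outputs")
```

Making the network "deeper" in this sketch simply means adding more entries to the list of sizes. The weights here are random, so the output carries no meaning until the network has been trained.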

One of the most popular types of deep neural network is the convolutional neural network (CNN or ConvNet). A CNN convolves learned features with the input data and uses 2D convolutional layers, which makes this architecture well suited to processing 2D data such as images.
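To make the convolution step more concrete, here is a small sketch in plain Python. It assumes no padding and a stride of 1, and it uses a hand-set 2x2 kernel that responds to vertical edges; in a real CNN the kernel values would be learned during training. Like most deep learning frameworks, the sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Toy 4x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-set kernel (an assumption for illustration): responds to dark-to-bright steps.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The resulting feature map is non-zero only in the middle column, exactly where the image changes from dark to bright.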

Take images as an example: a fully trained deep learning model can automatically identify objects in images it has never seen before. For example, certain websites are able to identify people in photos that have just been uploaded. The process behind this works as follows:

It starts with a set of images, each of which shows an object from one of four categories. The aim is for the deep learning network to automatically recognize which object is shown. To generate training data, each image is labeled.

Using this training data, the network begins to understand which features make up which object and to link them to the corresponding category. Each layer of the network receives its data from the previous layer, processes it and forwards it to the next layer. The complexity and level of detail increase from layer to layer (Fig. 3). The important thing here is that the network learns directly from the data. Humans have no influence over which features are learned.
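The idea of learning from labeled examples can be sketched in a few lines of plain Python. This is a strong simplification and not the training procedure of a deep network: it assumes a single softmax layer trained with gradient descent, and the "images" are invented four-pixel vectors in which the brightest pixel indicates the category. The learning rate and number of epochs are arbitrary choices.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(weights, x):
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])

def predict(weights, x):
    probs = predict_proba(weights, x)
    return probs.index(max(probs))

def train(samples, labels, n_classes, epochs=200, lr=0.5):
    """Gradient descent on the cross-entropy loss, starting from zero weights."""
    n_features = len(samples[0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    for _ in range(epochs):
        grads = [[0.0] * n_features for _ in range(n_classes)]
        for x, y in zip(samples, labels):
            probs = predict_proba(weights, x)
            for c in range(n_classes):
                error = probs[c] - (1.0 if c == y else 0.0)
                for j, v in enumerate(x):
                    grads[c][j] += error * v
        for c in range(n_classes):
            for j in range(n_features):
                weights[c][j] -= lr * grads[c][j] / len(samples)
    return weights

# Invented labeled data: four categories, two four-pixel "images" per category.
samples, labels = [], []
for k in range(4):
    clean = [0.0] * 4
    clean[k] = 1.0
    noisy = [0.0] * 4
    noisy[k] = 0.8
    noisy[(k + 1) % 4] = 0.2
    samples += [clean, noisy]
    labels += [k, k]

weights = train(samples, labels, n_classes=4)
accuracy = sum(predict(weights, x) == y for x, y in zip(samples, labels)) / len(samples)
print(accuracy, predict(weights, [0.1, 0.0, 0.7, 0.2]))
```

Nobody tells the model which pixel matters for which category; the weights are derived from the labeled examples alone. A deep network applies the same principle across many layers.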

Machine learning versus deep learning
Deep learning and machine learning offer different ways of training models and classifying data. Comparing the two approaches shows which scenarios call for which technology.

With a standard approach in the field of machine learning, the relevant features of an image, such as edges or corners, would have to be selected manually in order to train the machine learning model. The model then refers to these features when analyzing and classifying new objects.
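A minimal sketch of this classic workflow might look as follows. The two edge features, the nearest-centroid classifier and the striped toy images are all assumptions made for illustration; the point is only that a person decides in advance which features the model gets to see.

```python
def edge_features(image):
    """Hand-picked features: total intensity change along rows and along columns."""
    horizontal = sum(abs(row[j + 1] - row[j])
                     for row in image for j in range(len(row) - 1))
    vertical = sum(abs(image[i + 1][j] - image[i][j])
                   for i in range(len(image) - 1) for j in range(len(image[0])))
    return [horizontal, vertical]

def fit_centroids(feature_rows, labels):
    """Average the feature vectors of each class."""
    sums, counts = {}, {}
    for feats, label in zip(feature_rows, labels):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for j, v in enumerate(feats):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def classify(centroids, feats):
    """Return the label of the closest class centroid (squared Euclidean distance)."""
    def dist(center):
        return sum((a - b) ** 2 for a, b in zip(center, feats))
    return min(centroids, key=lambda label: dist(centroids[label]))

# Invented training images: vertical stripes versus horizontal stripes.
train_images = [
    [[0, 1, 0, 1], [0, 1, 0, 1], [0, 1, 0, 1]],
    [[1, 0, 1, 0], [1, 0, 1, 0], [1, 0, 1, 0]],
    [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]],
    [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]],
]
train_labels = ["vertical", "vertical", "horizontal", "horizontal"]

centroids = fit_centroids([edge_features(img) for img in train_images], train_labels)

new_image = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]]
print(classify(centroids, edge_features(new_image)))  # vertical
```

If the chosen features do not capture what distinguishes the classes, the classifier cannot compensate for it, which is why feature selection takes up so much of the effort in this approach.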

Figure 3: Training a neural network using classified images. © (Image: MathWorks)

With a deep learning workflow, relevant features are automatically extracted from images. In addition, deep learning performs "end-to-end learning", where a network is given raw data and a task, such as classification, and learns to perform this task automatically.

Another key difference is that deep learning algorithms scale with data, while shallow learning converges. Shallow learning refers to machine learning methods whose prediction accuracy stops improving beyond a certain point, even if further examples and training data are added.

When choosing between machine learning and deep learning, you should ask yourself whether you have a powerful graphics processor and a lot of pre-classified data. If this is not the case, you should use machine learning instead of deep learning. This is because deep learning is usually more complex, so you need at least a few thousand images to obtain reliable results. You also need a powerful graphics processor so that the model spends less time analyzing all these images.

If you choose machine learning, you have the option of training your model with many different classifiers. You also need to know which features to extract in order to achieve the best results. Machine learning also gives you the flexibility to choose a combination of approaches. It is recommended to try different classifiers with different features to determine which combination is most suitable for the selected data (Figure 4). In general, deep learning is more computationally intensive, while machine learning techniques are often easier to use.

Figure 4: Comparison of a machine learning approach for categorizing vehicles (left) with deep learning (right). © (Image: MathWorks)

Deep learning in the real world
Deep learning applications are used in many industries, from autonomous driving to medical devices.

  • Autonomous driving: Vehicle developers are using deep learning to automatically recognize objects such as stop signs and traffic lights. Deep learning is also used to recognize pedestrians, which can help reduce accidents.
  • Aerospace and defense: Deep learning is used to identify objects in satellite imagery, to locate areas of interest and to identify safe or unsafe zones for troops.
  • Medical research: Cancer researchers are using deep learning to automatically detect cancer cells. Teams at UCLA have developed a microscope that provides a high-dimensional data set that is used to train a deep learning application to accurately identify cancer cells.
  • Industrial automation: Deep learning is helping to improve the safety of workers around heavy machinery by automatically recognizing when people or objects are at an unsafe distance from such machines.
  • Electronics: Deep learning is used in automated speech recognition and translation. For example, devices in the home that respond to voices and know users' preferences are based on deep learning.

Deep learning often seems inaccessible to non-experts, but with common workflows, engineers and scientists can quickly and easily apply deep learning to their applications. Today, a variety of tools promote the adoption of deep learning by simplifying the configuration and training of models, visualizing their structure, using pre-trained models for transfer learning and taking advantage of GPU acceleration.

As deep learning becomes ubiquitous, we will see further innovation and development in applications that were previously considered impossible, for example in areas such as computer vision, natural language processing and robotics.

Johanna Pingel, Product Marketing Manager at MathWorks / ag
