Speech technology

Andreas Mühlbauer

Intelligent voice assistants "made in Germany"

Voice assistance systems also create added value in industry by making processes more efficient and reducing the workload of employees.

Voice assistants are increasingly creating added value in industry. © Fraunhofer IAIS

In the search for high-quality solutions, companies are also finding what they are looking for in Germany. Experts from the Fraunhofer-Gesellschaft are developing voice assistants for business and industry that place the highest priority on data protection as well as performance and security.

When Gerrit Holzbach critically examines the surface of a hood, he discovers a defect and points to the spot. "There's a matting," he notes. "Okay, matting," comes the confirmation, not from a human colleague but from the multimodal dialog assistant, "MuDA" for short. The system marks the defect by projecting a dot of light onto it and starts the repair as soon as Gerrit Holzbach says "Repair this fault". A robot arm approaches the marked spot with millimeter precision and treats it with its polishing attachment. "Repair of the matting defect has been completed," reports MuDA, already waiting for the next job.

Dialog systems such as the multimodal dialog assistant demonstrate that new forms of voice interaction with technical devices have great potential to relieve employees, speed up processes and increase the quality of work. AI-based voice solutions are already being used in various industries. As one of Europe's leading scientific institutes in the field of artificial intelligence, the Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS develops such voice technologies for different requirements and application areas.

"We work together with various companies, determine their respective needs and then develop customized solutions," says Oliver Walter, an expert in speech technology at Fraunhofer IAIS. For example, speech and speaker recognition software from Fraunhofer IAIS has been in use in the media industry for several years. With an audio mining system developed specifically for ARD, employees can search for terms or people within videos and quickly find relevant content such as articles or files in archives.
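Searching for terms within videos can be pictured as a lookup in an index built from time-stamped transcripts. The following is a minimal sketch of that idea only, not the ARD system: the segment data, the `build_index` and `search` helpers and the video IDs are all hypothetical, and a real audio mining system would add speech recognition, speaker recognition and far more robust text processing.

```python
from collections import defaultdict

# Hypothetical recognizer output: (video_id, start_seconds, recognized_text).
SEGMENTS = [
    ("news_001", 0.0, "Good evening and welcome to the news"),
    ("news_001", 12.5, "The state parliament met today"),
    ("talk_007", 3.0, "Today we talk about the parliament and voice assistants"),
]


def build_index(segments):
    """Map each lower-cased word to the places where it was spoken."""
    index = defaultdict(list)
    for video_id, start, text in segments:
        # set() avoids duplicate hits when a word repeats within one segment.
        for word in set(text.lower().split()):
            index[word].append((video_id, start))
    return index


def search(index, term):
    """Return sorted (video_id, start_seconds) hits for a single term."""
    return sorted(index.get(term.lower(), []))


index = build_index(SEGMENTS)
print(search(index, "Parliament"))
```

With the sample data, the search for "Parliament" returns one hit in each video, each with the time offset at which the word was spoken.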

Other voice assistance or dialog systems, such as the multimodal assistant MuDA, combine several speech technologies. The first component is speech recognition, which converts spoken language into text. This is followed by speech understanding, also known as Natural Language Understanding (NLU). "As the name suggests, NLU systems make it possible to understand and interpret texts or spoken language with the help of AI. This saves companies time and makes work processes more efficient," explains Oliver Walter. "However, AI systems must also be able to interpret the queries correctly and respond with specific specialist knowledge, especially in industry." This is where knowledge graphs come in: they encode the relationships between words and their meanings, so that complex knowledge can be linked in a network and interpreted by machines. Finally, speech synthesis converts text into spoken language. "Many people are already familiar with this component from smart home assistants, which not only receive our commands and questions, but also respond to them," says the Fraunhofer scientist. This "question-answering" technology also has enormous potential in working life, for example when it comes to retrieving specific knowledge from large databases in a short space of time.
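The interplay of speech understanding and a knowledge graph can be illustrated with a toy example. This is a sketch under strong assumptions and not the Fraunhofer implementation: speech recognition and speech synthesis are left out (text goes in, text comes out), the NLU step is plain keyword spotting rather than a trained model, and the triples, intent names and function names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Toy knowledge graph as (subject, relation, object) triples.
# All entries are illustrative assumptions.
TRIPLES = [
    ("matting", "is_a", "surface defect"),
    ("scratch", "is_a", "surface defect"),
    ("matting", "repaired_by", "polishing"),
    ("scratch", "repaired_by", "repainting"),
]


@dataclass
class Interpretation:
    intent: str
    entity: Optional[str]


def understand(transcript: str) -> Interpretation:
    """Stand-in for NLU: find a known entity and guess the intent by keyword."""
    words = transcript.lower().replace("?", "").replace(".", "").split()
    known = {subject for subject, _, _ in TRIPLES}
    entity = next((w for w in words if w in known), None)
    intent = "ask_repair" if "repair" in words else "report_defect"
    return Interpretation(intent, entity)


def lookup(entity: str, relation: str) -> Optional[str]:
    """Follow one relation from an entity in the knowledge graph."""
    for subject, rel, obj in TRIPLES:
        if subject == entity and rel == relation:
            return obj
    return None


def respond(transcript: str) -> str:
    """Produce the text that a speech synthesis component would speak."""
    result = understand(transcript)
    if result.entity is None:
        return "Sorry, I did not recognize a known defect type."
    if result.intent == "ask_repair":
        method = lookup(result.entity, "repaired_by")
        return f"{result.entity.capitalize()} is repaired by {method}."
    kind = lookup(result.entity, "is_a")
    return f"Okay, {result.entity} recorded as {kind}."


print(respond("How do I repair a matting?"))
print(respond("There's a matting."))
```

The point of the split is that the specialist knowledge lives in the graph, so new defect types or repair methods can be added as data without touching the dialog logic.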

Voice and gesture control for natural interaction

The MuDA assistant goes one step further and combines speech recognition with gesture control. Developed at the Fraunhofer CCIT Machine Learning Research Center, it can be used in industrial quality assurance, especially for identifying and marking defects on surfaces. "Quality assurance is an important part of production processes, but at the same time it is often complex, time-consuming and cost-intensive," says Walter, who heads the MuDA project at Fraunhofer IAIS. To meet these requirements, a robotic system from the Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB was developed further into a multimodal dialog assistant in cooperation with Fraunhofer IAIS. It enables automated defect documentation and handling, and supports humans from marking to repair.

Defect marking by pointing gesture and voice input. © Fraunhofer CCIT

Human-machine interaction via voice input can simplify many processes. One advantage is that employees have their hands free for other tasks at the same time, says Walter. Tedious tasks such as mandatory documentation also become easier. "For example, we enable our partner companies to document machine maintenance in order to record defects easily and accurately for other employees." With MuDA, users can interact with the machine through gestures as well as voice. Defects on a component are marked intuitively by pointing with a finger and then captured by a camera. Metadata, such as the type of defect, can be entered via a voice dialog system. In the rework phase, the marked and documented defect can then be found again quickly.
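The workflow described here, marking by gesture, adding metadata by voice and finding the defect again during rework, amounts to maintaining a small defect log. The sketch below shows one possible shape for such a log; the class names, fields and coordinate units are assumptions for illustration and are not taken from MuDA.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DefectRecord:
    x_mm: float                        # position from the camera (assumed unit)
    y_mm: float
    defect_type: Optional[str] = None  # metadata entered by voice
    repaired: bool = False


class DefectLog:
    """Hypothetical in-memory log linking gesture marks and voice metadata."""

    def __init__(self) -> None:
        self._records: List[DefectRecord] = []

    def mark(self, x_mm: float, y_mm: float) -> int:
        """Store a position captured from a pointing gesture; return its ID."""
        self._records.append(DefectRecord(x_mm, y_mm))
        return len(self._records) - 1

    def annotate(self, defect_id: int, defect_type: str) -> None:
        """Attach the defect type given in the voice dialog."""
        self._records[defect_id].defect_type = defect_type

    def complete_repair(self, defect_id: int) -> None:
        self._records[defect_id].repaired = True

    def open_defects(self) -> List[DefectRecord]:
        """Defects still waiting for rework."""
        return [r for r in self._records if not r.repaired]


log = DefectLog()
defect_id = log.mark(120.5, 48.0)   # user points at the spot
log.annotate(defect_id, "matting")  # user says "There's a matting"
print(log.open_defects())
log.complete_repair(defect_id)      # robot has polished the spot
print(log.open_defects())
```

Keeping position and metadata in one record is what allows the rework step to go straight to the documented spot.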

While many applications are already successful, challenges remain in the global research and development of speech technologies. Understanding spoken language, for example in the case of complex questions or individual speech characteristics such as pronunciation and dialect, still overwhelms many systems. However, the experts at Fraunhofer IAIS have already developed solutions here too. For example, their automatic speech recognition is used by the Saxon State Parliament for the live subtitling of plenary sessions in which many participants speak the Saxon dialect. In the development of speech synthesis, experts also face the complex task of giving the assistant as natural a voice as possible. "The Fraunhofer Institute for Integrated Circuits IIS, for example, has developed a speech synthesis component that sounds so natural that it can hardly be distinguished from a human voice. This is already being used in the joint 'Speaker' project, where it forms the basis of the voice output for all voice assistants developed there," says Oliver Walter.

Data sovereignty for companies

"Another requirement that we take very seriously is the issue of data protection and data sovereignty," explains the scientist. In contrast to many large companies from America and Asia, the Fraunhofer researchers are focusing on solutions that combine process optimization with the ethical and GDPR-compliant handling of customer data and AI training data. This is also one of the central goals of the Speaker project, which is funded by the Federal Ministry for Economic Affairs and Energy. Here, experts from the Fraunhofer Institutes IIS and IAIS are working together with partner companies from business and industry to develop a voice assistance platform for dialog systems in the B2B sector, "made in Germany". Numerous companies of different sizes and from different sectors have so far joined the Speaker Network as partners. Some of them are already working with the Fraunhofer team to develop dialog systems for their specific needs, from the automotive industry to mechanical engineering and services.

"As the participation in the Speaker project shows, interest in voice assistance systems is currently growing enormously," says Walter. "Especially in industry and the B2B sector." Off-the-shelf solutions often do not deliver satisfactory results in the working environment, but companies should not be deterred from having systems built to their own requirements. "Thanks to big data and intelligent algorithms, customized AI speech systems can now be implemented much more easily and quickly than many people think - hand in hand with humans."

Daria Tomala and Eléna Zay, Press and Public Relations Fraunhofer IAIS
