Language research and AI
AI discovers language rules without help
An experiment at FAU shows that neural networks can derive grammatical rules without being explicitly trained to do so. This provides new insights into how language works in both humans and machines.
Artificial intelligence can independently derive the grammatical rules of human language without being given any information about syntax or parts of speech. Researchers at Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) have demonstrated this in an experiment that supports cognitive linguistics, the theory that language comprehension arises through use and is not innate.
How humans come to understand language is disputed. "There are two main currents that provide completely contradictory answers," says Dr. Patrick Krauss, a cognitive scientist and neuroscientist at the Chair of Pattern Recognition at FAU. Noam Chomsky's theory of universal grammar assumes innate grammatical principles, whereas cognitive models explain language as a learned system.
AI as a test subject
Together with Dr. Achim Schilling, visiting scientist at the University Hospital Erlangen and group leader at the University of Heidelberg, Krauss tested whether language structures can be derived from usage alone. To do this, they trained a recurrent neural network on the novel "Gut gegen Nordwind" by Daniel Glattauer. The task: given a sequence of nine words, predict the word that follows.
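The prediction task can be illustrated with a short Python sketch that turns a text into training examples of nine context words plus the word that follows. This is only an illustration under stated assumptions: the function name make_training_pairs, the whitespace tokenization and the English sample sentence are hypothetical and are not taken from the study.

```python
from typing import List, Tuple

CONTEXT_SIZE = 9  # nine input words, as described in the article


def make_training_pairs(
    text: str, context_size: int = CONTEXT_SIZE
) -> List[Tuple[List[str], str]]:
    """Slide a window over the text and pair each run of
    `context_size` words with the single word that follows it."""
    # Assumption: naive lowercase whitespace tokenization.
    words = text.lower().split()
    pairs = []
    for i in range(len(words) - context_size):
        context = words[i:i + context_size]
        target = words[i + context_size]
        pairs.append((context, target))
    return pairs


# Hypothetical sample sentence (14 words), not from either novel.
sample = "the quick brown fox jumps over the lazy dog and then runs far away"
pairs = make_training_pairs(sample)

for context, target in pairs:
    print(" ".join(context), "->", target)
```

Each pair would then serve as one training example: the network sees the nine context words and is scored on whether it predicts the target word. No grammatical labels appear anywhere in this data.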
"Predicting the next word, event or image is a basic principle of how the human brain works," explains Krauss. "Recurrent language models work in a similar way: they use previous input to improve the output." The network received no information about grammar or parts of speech.
Grammar arises spontaneously
The result: "The AI was correct in a remarkable proportion of cases," says Krauss. A model trained on "The Hitchhiker's Guide to the Galaxy" by Douglas Adams delivered similar results. To rule out mere memorization, the researchers tested the networks on text passages they had not seen during training.
An analysis of the network's intermediate processing steps showed that the system increasingly grouped the word sequences according to the word class of the word that followed. In the end, it was able to predict with high reliability whether a verb, noun or adjective would come next, even though it had never been given these categories.
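One simple way to check whether internal representations group by word class is sketched below in Python. The article does not say which analysis method the researchers used, so this nearest-centroid check is an assumption chosen for illustration; the function names and the two-dimensional toy vectors are invented and are not data from the study.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List, Sequence


def class_centroids(
    vectors: Sequence[Sequence[float]], labels: Sequence[str]
) -> Dict[str, List[float]]:
    """Average all vectors that share a word-class label."""
    grouped = defaultdict(list)
    for vec, label in zip(vectors, labels):
        grouped[label].append(vec)
    return {
        label: [sum(column) / len(vecs) for column in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def nearest_centroid_accuracy(
    vectors: Sequence[Sequence[float]], labels: Sequence[str]
) -> float:
    """Fraction of vectors that lie closest to the centroid of their own class."""
    centroids = class_centroids(vectors, labels)
    hits = 0
    for vec, label in zip(vectors, labels):
        predicted = min(centroids, key=lambda c: dist(vec, centroids[c]))
        hits += predicted == label
    return hits / len(vectors)


# Invented toy "activations": one vector per input sequence, labeled
# with the word class of the word that actually followed.
activations = [
    (0.9, 0.1), (1.0, 0.2),      # next word was a noun
    (0.1, 0.9), (0.2, 1.0),      # next word was a verb
    (-0.9, -0.8), (-1.0, -0.9),  # next word was an adjective
]
word_classes = ["noun", "noun", "verb", "verb", "adjective", "adjective"]

score = nearest_centroid_accuracy(activations, word_classes)
print(f"nearest-centroid accuracy: {score:.2f}")
```

A score close to 1 would mean the representations fall into clearly separated groups per word class, while a score near chance would mean no such grouping. The word-class labels are used only for this after-the-fact check, never for training.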
"Our results show that abstract linguistic categories such as parts of speech or grammar rules can arise spontaneously from the processing of linguistic input," says Krauss. "This calls into question the assumption that the ability to categorize grammatically is innate." Rather, the results suggest that language is a complex, adaptive system shaped by both biological and environmental factors.
The results of the study were published in the anthology "Recent Advances in Deep Learning Applications". The findings could also contribute to the further development of language models, translation software and AI systems.









