Large models face four key defects. Can the "knowledge equation" lead to strong artificial intelligence?
The emergence of large artificial intelligence (AI) models has made 2023 an important year in human history: the first year of general artificial intelligence. It signals that the fourth industrial revolution, represented by the intelligence revolution, has arrived, and that humanity stands on the threshold of the age of intelligence.
Each of the three industrial revolutions humanity has experienced propelled civilization to a new level and profoundly reshaped the world order. In the long run, artificial intelligence may surpass human beings in many respects, but at present, general artificial intelligence must still cross many thresholds before strong artificial intelligence is truly realized.
Recently, the Knowledge Computing Laboratory of the University of Science and Technology of China proposed a new knowledge model, the "knowledge equation", and on this basis built a new expert system. By combining it with deep learning, the laboratory aims to break through the technical bottlenecks of existing general artificial intelligence.
- Editor

(Source: Visual China)
Artificial intelligence (AI) is little more than 60 years old, yet its development has already been through several ups and downs. In recent years, deep learning has brought a new revolution to AI, producing systems we are all familiar with, such as AlphaGo and ChatGPT.
AI technology has now achieved remarkable results on many tasks, including face recognition, speech recognition, and character recognition, and has delivered reasonably satisfying answers in machine translation, question answering, and medical diagnosis. It is fair to say that AI has entered a stage of large-scale application.
However, when we try to push artificial intelligence further, we find that overcoming its existing defects still requires innovation and breakthroughs in the underlying technical model.
Current large models face four key defects
ChatGPT, launched by OpenAI, is an AI chatbot and a tool for AI-generated content (AIGC). As a dialogue system it is remarkably versatile: whether chatting about a wide variety of topics, solving math problems, suggesting gifts, or drawing up itineraries, it copes with ease. In this sense, ChatGPT's broad application potential and flexibility make it, arguably, a general artificial intelligence (AGI) program.
Evaluations in some areas (such as logic and semantic understanding) show that ChatGPT is not stronger than the best existing models in every field; but those best models are typically built for a single task, whereas ChatGPT is a general-purpose model.
In fact, people recognized the great potential of large models years ago, yet their actual pace of development has far exceeded expectations. As soon as the ChatGPT large model appeared, it drew intense attention at the application level; within half a year, more than 100 large models had emerged in China.
At present, applications of large models fall mainly into three categories: generative AI (AIGC), large-model assistant tools, and personal intelligent interaction. The last deserves particular attention: any technology or product that genuinely promotes interaction can generate enormous value. Such interaction includes not only "human-to-human" (mediated by machines) but also "human-to-machine" and even "machine-to-machine". Artificial intelligence, large models included, is expected to make disruptive breakthroughs here.

(Source: Visual China)
At present, however, the application of large models has not gone as smoothly as expected. The fundamental reason is that, powerful as they are, large models still suffer from several key technical defects.
First, large models sometimes make factual errors; these are reliability problems, commonly known as "hallucinations". A model may, for example, misattribute the author of a poem. In principle, a large model chooses its answers probabilistically, so 100% accuracy cannot be guaranteed. This problem affects large models in many fields and is among the most important challenges they face.
Second, the mathematical and logical reasoning ability of large models still needs strengthening. Although GPT-4 performs well on some examinations, when faced with carefully designed logical reasoning problems its answers are little better than random guesses. The reason is that in deep reasoning, even if the model predicts each step with 95% accuracy, after 20 steps the overall accuracy is 0.95 to the 20th power, i.e., less than 36%, an unsatisfactory result.
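This compounding can be checked in a few lines of code; the sketch below is illustrative only (not from the article's authors) and assumes each reasoning step succeeds or fails independently:

```python
# Illustrative only: per-step accuracy compounds multiplicatively over an
# independent chain of reasoning steps.

def chain_accuracy(per_step: float, steps: int) -> float:
    """Probability that every step in the chain is correct."""
    return per_step ** steps

print(round(chain_accuracy(0.95, 20), 4))  # 0.3585
```

Even a seemingly high 95% per-step accuracy collapses below 36% over a 20-step chain, which is why long-chain reasoning is so unforgiving.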
Third, the formal semantic understanding of large models needs improvement. Although they achieve semantic understanding to a degree, there is still considerable room for growth before they truly grasp the meaning behind language, in both sense and form.
Finally, as black-box models, large models have some general weaknesses, such as weak interpretability and poor debuggability.
Reaching strong artificial intelligence may require another path
Large models have opened a window onto general artificial intelligence. But, as noted above, their key technical defects mean there is still a long way between them and general strong artificial intelligence. To shorten that distance, at least two different paths are worth exploring.
The first path is to continue along the existing development route of large models. AI is only some 60 years old, and GPT has been in training for only about five years. If large models are allowed to develop for another 5, 50, or 500 years, what progress might they make? The question is worth pondering.
Along the existing technical route, however, the development of large models will meet challenges at two key points.
First, more parameters. As the parameter count grows, a large model's capability improves. But Moore's law says computing power doubles every 18 to 24 months, while large-model parameter counts have been doubling every three to four months; computing power will therefore soon fail to keep up with the models' needs. Moreover, although the parameter count grows exponentially, the resulting performance improves only linearly.
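A back-of-envelope calculation makes the mismatch concrete. The doubling periods below (21 months for compute, 3.5 months for parameters) are illustrative midpoints of the ranges cited above, not measured figures:

```python
# Back-of-envelope only: assumed doubling periods are illustrative midpoints
# of the ranges in the text, not measured data.

def growth_factor(months: float, doubling_period: float) -> float:
    """How many times a quantity multiplies over a horizon of `months`."""
    return 2 ** (months / doubling_period)

horizon = 24  # months
print(f"compute: x{growth_factor(horizon, 21):.1f}")      # x2.2
print(f"parameters: x{growth_factor(horizon, 3.5):.0f}")  # x116
```

Under these assumptions, over two years parameter counts grow roughly fifty times faster than compute, which is the gap the text describes.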
Second, more data. More high-quality training data also improves a large model's ability. However, GPT-4 has already consumed most of the high-quality text data currently obtainable, so the data available for training large models is approaching a bottleneck.
Therefore, to solve these problems within the large-model paradigm, disruptive new techniques must be developed to break through the bottlenecks in structured information, declarative facts, long-chain reasoning, and deep semantic understanding.

(Source: Pixabay)
The other path to general strong artificial intelligence is very different.
AI is currently undergoing a paradigm shift from perceptual intelligence to cognitive intelligence. As is well known, humans have two reasoning systems: a fast system of intuitive thinking and a slow system of rational thinking. The fast system is a low-level, rapid, subconscious mode of reasoning that reaches conclusions instantly, without deliberation, much as one can find the bathroom at home with one's eyes closed. To find a bathroom in an unfamiliar place, however, we must rely on the slow system's careful thinking, which is slower and consumes more energy but is more accurate.
Today's large models correspond mainly to fast-system reasoning; their slow-system reasoning is weak. It is natural, then, to ask whether the two systems can be combined.
In fact, the previous wave of AI was driven by expert systems. An expert system reasons in a way resembling the human slow system: expert knowledge is entered into the machine symbolically, and through automated reasoning the machine can then answer questions like an expert.
Expert systems and large models each have their strengths. The former perform better in accuracy, interpretability, logical reasoning, and semantic understanding; the latter have the edge in generality, generalization, handling of uncertain knowledge, and learning ability. Combining the two organically, so that each complements the other, is a better path toward general strong artificial intelligence.
Integrating the two reasoning systems to explore future intelligence
Chinese scientists have begun to explore combining expert systems with large models. The Knowledge Computing Laboratory of the University of Science and Technology of China has proposed a new knowledge model, the "knowledge equation", built a new expert system on this basis, and integrated it with deep learning.
In short, the knowledge equation operates at two levels: modeling and knowledge. At the modeling level, it abstracts domain objects into three grammatical elements, individuals, concepts, and operators, which can be transformed into and combined with one another. At the knowledge level, it unifies all knowledge into equations of the form "a = b". On this basis, the laboratory has proposed a paradigm of intelligent information systems, driven by both data and knowledge, that combines large models with a reasoning engine.
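As a rough, purely hypothetical illustration (the types and names below are invented for this sketch, not the laboratory's actual formalism), knowledge of the form "a = b" over individuals, concepts, and operators can be modeled as rewrite rules on terms:

```python
# Hypothetical sketch (invented names, not the laboratory's formalism):
# terms are built from individuals, concepts, and operators; knowledge of the
# form "a = b" is stored as left-to-right rewrite rules on those terms.
from dataclasses import dataclass

@dataclass(frozen=True)
class Term:
    head: str         # name of an individual, concept, or operator
    args: tuple = ()  # operator arguments; empty for individuals and concepts

def rewrite(term: Term, equations: dict) -> Term:
    """Rewrite subterms bottom-up, then apply matching equations to the result."""
    term = Term(term.head, tuple(rewrite(a, equations) for a in term.args))
    while term in equations:  # follow chains a = b, b = c, ...
        term = equations[term]
    return term

# One piece of knowledge: capital_of(France) = Paris
equations = {Term("capital_of", (Term("France"),)): Term("Paris")}
query = Term("capital_of", (Term("France"),))
print(rewrite(query, equations).head)  # Paris
```

The point of the sketch is only that a single uniform equation shape can carry facts, definitions, and transformations alike, which a reasoning engine can then apply mechanically.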

With the rise of large language models such as ChatGPT, and building on the original database-centered information systems, large models can mine useful information from dark data stores (text, images, videos, etc.) and, to some extent, perform reasoning and (auxiliary) decision-making.
This is, in fact, a paradigm revolution in information systems. Dark data makes up the vast majority of all data. A traditional information system must use some means (manual work, natural language processing, computer vision, etc.) to convert "dark" data into "bright" data in a database, and for reasons of engineering and cost such conversion reaches only a tiny fraction of dark data. A large model, by contrast, can produce output directly from dark data, giving it strong dark-data processing ability.
An information system based on a large model resembles the fast system humans use for intuitive thinking: to some extent it can reason and make decisions directly over big data. But because of the current technical defects of large models, it cannot directly meet the requirements of many application scenarios. The large-model enhancement technology proposed by the Knowledge Computing Laboratory of the University of Science and Technology of China builds domain ontologies and knowledge bases and, on that foundation, integrates the large model with a knowledge reasoning engine, yielding an intelligent information system framework driven by both knowledge and data that couples the fast thinking system with the slow one. Compared with a bare large model, the framework offers correctness, reliability, interpretability, and debuggability, and can significantly raise the application value of large models across industries.
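Purely as a hypothetical sketch of this fast/slow coupling (all names and the stubbed model call below are invented for illustration, not the laboratory's implementation): the large model proposes an answer quickly, and a symbolic knowledge base verifies or corrects it:

```python
# Hypothetical sketch (names and the stubbed model call are invented):
# the "fast" large model proposes an answer; the "slow" symbolic knowledge
# base verifies it and corrects it when the model hallucinates.
KNOWLEDGE_BASE = {("author_of", "Quiet Night Thought"): "Li Bai"}

def llm_propose(relation: str, entity: str) -> str:
    """Stand-in for a large-model call: fast, but may hallucinate."""
    return "Du Fu"  # a plausible-sounding but wrong attribution

def answer(relation: str, entity: str):
    proposal = llm_propose(relation, entity)
    fact = KNOWLEDGE_BASE.get((relation, entity))
    if fact is not None and fact != proposal:
        return fact, "corrected by knowledge base"
    return proposal, "from large model"

print(answer("author_of", "Quiet Night Thought"))
# → ('Li Bai', 'corrected by knowledge base')
```

The design choice mirrors the article's argument: where the knowledge base has an authoritative fact, the slow system overrides the fast one; everywhere else, the large model's generality is retained.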

Besides the database and the dark data store, such a system can also make effective use of a knowledge base. The framework is therefore expected to lead another paradigm revolution in information systems, following the one brought by large models, and to become a new form of intelligent information system.
From an application standpoint, general strong artificial intelligence would exceed existing large-model technology in both breadth and depth. In the long run, AI may surpass human beings in many respects: not only in basic capacities such as calculation, memory, and storage, but also in higher-level ones such as decision-making, prediction, and innovation. As computation-based large models and knowledge reasoning engines continue to develop, AI will draw ever closer to, and perhaps surpass, human ability, greatly boosting productivity.
(The author is a professor at the University of Science and Technology of China and director of its Knowledge Computing Laboratory)
Author: Zhou Yi
Images: provided by the author unless otherwise indicated.
Editor: Xu Qimin
Editor in charge: Ren Quan
* Exclusive manuscript of Wenhui; please credit the source when reprinting.