A System 2 deep learning model, however, will be able to explain that it sees a hotdog in the image because it sees a filling between two pieces of bread known as a hotdog bun, and the filling is a sausage. The AI recognizes that the visual input it receives fits the definition of a sandwich, but understands that it’s a specific type of sandwich known as a hotdog.
Current System 1 deep learning can power applications across a wide range of industries, but it must be constrained to narrow tasks. It is extremely good at finding patterns that human brains are incapable of perceiving, but it cannot perform fundamental cognitive functions that humans can, such as reasoning about time and causality, focusing its attention, acquiring skills from a small dataset, and learning how to learn.
Therefore, developing a System 2 that complements the existing System 1 is essential for AI to attain human-like intelligence. Virtual beings that interact with you through audio-visual dialogue provide fertile ground for developing common-sense temporal reasoning and human-like intelligence, and for pursuing Bengio’s roadmap towards general AI. Stay tuned: we might soon write a piece that digs deeper into Kahneman’s and Bengio’s ideas.