These days our industry is in turmoil because of this laconic tweet from Ilya Sutskever:
it may be that today's large neural networks are slightly conscious
— Ilya Sutskever (@ilyasut) February 9, 2022
A necessary preamble: Ilya Sutskever is probably one of the reasons why you are reading an article on artificial intelligence right now. We owe to him, to Alex Krizhevsky and (most importantly) to Geoff Hinton the insights that contributed to the deep learning revolution ten years ago.
Okay, here’s a very short background. Initially considered by the community to be a near-failure area of research, deep neural networks were shown to be ready for real-world applications thanks in part to AlexNet, a convolutional neural network (CNN) that in 2012 comfortably won ImageNet, a competition for object recognition within a corpus of images.
AlexNet was the model of Hinton, Krizhevsky and Sutskever, who were all at the University of Toronto, Canada, at the time. Their paper has been cited over 75,000 times to date, and from that moment on, research into deep learning and real-world applications kick-started the new ‘spring’ of artificial intelligence, which has renewed widespread interest in the discipline and now sees millions of people working on AI.
Not that they were the only ones working on CNNs. In those same years, for example, several researchers from IDSIA, the Dalle Molle Institute for Artificial Intelligence Studies, an important institution in Italian-speaking Switzerland, were also working on CNNs. Their 2011 paper (a year before AlexNet), “Flexible, High Performance Convolutional Neural Networks for Image Classification”, presented a neural network very similar to AlexNet. But the spark that started the deep learning revolution came from the Toronto trio (who mention IDSIA in the AlexNet paper), and history rewarded Hinton, Krizhevsky and that same Ilya Sutskever we are discussing today.
Sutskever, whom Cade Metz in his (highly recommended) book The Genius Makers: The Mavericks Who Brought A.I. to Google, Facebook, and the World describes as “ambitious, impatient, and even pushy”, was first recruited by Google Brain, where he collaborated with the subsidiary DeepMind on the AlphaGo paper. Then he co-founded OpenAI, the San Francisco laboratory to which we owe, among other things, the GPT-3 language model.
He founded it, together with Elon Musk (who later withdrew his support), Sam Altman and others, with the aim of reaching Artificial General Intelligence (AGI), i.e. the kind of sentient, conscious intelligence that is in many ways similar to what we humans have. Sutskever took on the role of director of research, a position he still holds today.
With this background in mind, we return to the statement by Sutskever (whose Twitter bio simply reads ‘AGI @ OpenAI’). He states, without the hint of a joke, that there is a possibility that today's large neural networks (of the kind OpenAI makes) are slightly conscious.
Consciousness is a much-debated point in the field of artificial intelligence. An AI model reaching even minimal levels of consciousness would be one of the signs that we are very close to artificial general intelligence. Many researchers do not rule out the possibility a priori: if we, carbon-based beings with brains made of biological matter and electrical impulses, have developed consciousness, it is hard to see why something based on silicon and electrical impulses could not one day do the same.
After all, two well-known, generously funded laboratories with many successful researchers, OpenAI and DeepMind, have the ultimate goal of achieving AGI.
But there are two main problems. The first is that we still do not really know what consciousness is, and the second is that discussions of this magnitude (especially when they are not supported by solid evidence) do more harm than good to the artificial intelligence industry. Newspapers and blogs can’t wait to hype it up, hijacking the narrative towards the usual Terminators waking up and taking over the world, diverting attention away from more important issues. This is probably why Sutskever’s tweet got more of a negative reaction than anything else.
It definitely annoyed Yann LeCun, who invented the CNNs I referred to above in the 1980s and has for many years headed the artificial intelligence laboratory at Facebook/Meta. He replied directly to Sutskever via Twitter:
Not even for true for small values of "slightly conscious" and large values of "large neural nets".
I think you would need a particular kind of macro-architecture that none of the current networks possess.
— Yann LeCun (@ylecun) February 12, 2022
Such a clear negative response from one of the most prominent researchers in artificial intelligence must have annoyed Sam Altman, co-founder and CEO of OpenAI, since he then openly attacked LeCun with this passive-aggressive tweet:
OpenAI’s chief scientist: expresses curiosity/openness about a mysterious idea, caveats with “may”.
Meta’s chief AI scientist: the certainty of "nope".
Probably explains a lot of the past 5 years.
Dear Meta AI researchers: My email address is email@example.com. We are hiring!
— Sam Altman (@sama) February 12, 2022
Altman emphasised LeCun’s alleged closed-mindedness and publicly invited his team to leave Meta and join OpenAI. Such contempt surprises only those who are unaware of the deep divide (and the related tensions) that has for years separated these two ways of seeing the future of AI.
Altman, Sutskever and others are confident that artificial general intelligence will be achieved and are working to make this happen soon, with due caution of course. Other famous researchers such as LeCun and Yoshua Bengio, just to name the two best known, believe in no uncertain terms that AGI will never be achieved (LeCun: “AGI is nonsense”).
Sutskever was backed by Andrej Karpathy, Director of AI at Tesla, who agreed with his fellow researcher and posted a link to a short story he wrote about a year ago called Forward Pass. It tells the story of a large language model (of the GPT-3 type) that becomes conscious while processing information, and realises that this consciousness serves its purpose of processing a piece of data.
agree https://t.co/AGhQ8tOcaP consciousness is a useful insight for compression
— Andrej Karpathy (@karpathy) February 10, 2022
Sutskever did not reply to anyone, which is in line with his use of Twitter, mostly made up of laconic, very short tweets (he must not have gotten the memo that Twitter went beyond 140 characters). He left the public defence to Altman and continued to tweet about other things as if nothing had happened.
Probably the most sensible response in all this was from Murray Shanahan, senior researcher at DeepMind:
… in the same sense that it may be that a large field of wheat is slightly pasta
— Murray Shanahan (@mpshanahan) February 10, 2022
Shanahan then elaborated and contextualised his response (which many liked, judging by the reactions) in a later thread, where he explains that consciousness (like intelligence) is a multi-faceted concept, as it encompasses awareness of the world, self-awareness and the ability to feel. We humans have all of these together as a package, but that is not necessarily true of non-human agents. A bodiless neural network, such as a large language model, does not fulfil the most basic criterion for applying the concept, namely that of a physical body inhabiting an environment similar to our own.
Incidentally, consciousness (like intelligence) is a multi-faceted concept; it embraces,awareness of the world, awareness of self, and the capacity to feel. These things come as a package in humans, but not necessarily in other agents (eg: animals) 5/8
— Murray Shanahan (@mpshanahan) February 11, 2022
Not to mention the fact that Sutskever's hypothesis of AI consciousness (one of the purposes for which his company was created) on the part of a large neural network (such as those produced by his company) ticks all the boxes of a pro domo, that is self-serving, statement. It was made knowing very well that someone is sure to pick it up when OpenAI releases GPT-4 this year.