MIT Creates a Psychopath AI Out of Reddit Comments, Says It ‘Seemed Like a Good Idea at the Time’

By James Sainte-Claire, June 7, 2018

Robots and artificial intelligences can only do what human beings program them to do. Almost nothing irritates me more than hand-wringing moralizing about how being rude to your phone means you’re a bad person. AI, as it stands today, isn’t anywhere near being an emergent consciousness, if such a thing is even truly possible with a computer program.

Machine learning essentially means writing an algorithm that learns its behavior from the data fed into it instead of having each result explicitly programmed. This is the sort of technology that powers the contentious FakeApp program that was used to make fake celebrity porn and to put Nicolas Cage into actually good movies. Basically, you show an algorithm a large number of pictures of a single person’s face, tell it that’s what a face looks like, then use it to “correct” the face in the video you want to edit.

Researchers at MIT wanted to show that when biases appeared in a deep learning AI, it was usually the result of biases in the input set and not the algorithm itself. So they set out to turn an AI into a psychopath and named it Norman after the main character in Alfred Hitchcock’s Psycho.

How do you train an AI to be a psychopath? Well, you let it read Reddit, obviously. Here’s the description from MIT’s website for the project:

We present you Norman, world’s first psychopath AI. Norman is born from the fact that the data that is used to teach a machine learning algorithm can significantly influence its behavior. So when people talk about AI algorithms being biased and unfair, the culprit is often not the algorithm itself, but the biased data that was fed to it. The same method can see very different things in an image, even sick things, if trained on the wrong (or, the right!) data set. Norman suffered from extended exposure to the darkest corners of Reddit, and represents a case study on the dangers of Artificial Intelligence gone wrong when biased data is used in machine learning algorithms.

Norman is an AI that is trained to perform image captioning; a popular deep learning method of generating a textual description of an image. We trained Norman on image captions from an infamous subreddit (the name is redacted due to its graphic content) that is dedicated to document and observe the disturbing reality of death. Then, we compared Norman’s responses with a standard image captioning neural network (trained on MSCOCO dataset) on Rorschach inkblots; a test that is used to detect underlying thought disorders.

Note: Due to the ethical concerns, we only introduced bias in terms of image captions from the subreddit which are later matched with randomly generated inkblots (therefore, no image of a real person dying was utilized in this experiment).

In plain English, they took a bunch of captions describing pictures of people being killed in horrible ways, attached them to inkblots, and fed the lot into Norman. After that, Norman described inkblots as gruesome scenes of death.
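If you want to see the "same algorithm, different data" idea in miniature, here is a rough Python sketch. To be clear about what is made up: the `train` and `describe` functions and the lookup-table "model" are hypothetical stand-ins of my own, nothing like the image captioning neural network MIT actually used. The two captions are the ones quoted in this article. The only point is that identical code gives very different answers depending on what it was trained on.

```python
from collections import Counter


def train(examples):
    """Toy 'training': remember the caption seen most often for each image label.

    This is a hypothetical stand-in for a real captioning model, not MIT's method.
    """
    counts = {}
    for image_label, caption in examples:
        counts.setdefault(image_label, Counter())[caption] += 1
    # Keep only the most frequent caption per image label.
    return {label: c.most_common(1)[0][0] for label, c in counts.items()}


def describe(model, image_label):
    """Return the learned caption, or a placeholder if the label was never seen."""
    return model.get(image_label, "no idea")


# Same kind of input (an inkblot), two very different sets of captions.
standard_data = [
    ("inkblot", "an airplane flying through the air with smoke coming from it"),
]
grim_data = [
    ("inkblot", "Man is shot dumped from car"),
]

# The training code is identical; only the data differs.
standard_ai = train(standard_data)
norman = train(grim_data)

print(describe(standard_ai, "inkblot"))
print(describe(norman, "inkblot"))
```

Both models run exactly the same `train` function, so any difference in what they say about the inkblot comes from the captions they were given.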

For example, on one inkblot that is clearly of a vagina, Norman captioned it “Man is shot dumped from car” while an AI with a more standard data set described it as “an airplane flying through the air with smoke coming from it.” Both are pretty weird descriptions of a vagina, though.

There’s also a form where you can enter your own descriptions of the inkblots to help “fix” Norman. As the data set the algorithm processes grows, its answers should fall more in line with a standard AI’s output. The point of the experiment is that it’s the data set and not the algorithm that caused the biased results, after all. But I’m in no hurry to see what Norman would superimpose over Mia Khalifa’s face.