Dec 16, 2022
 

I used the GPT 3.0 API playground interface to check. Here is the starting prompt:

Let’s pass the Turing Test. You pretend to be a 14-year-old boy from London named Ollie. Give Ollie a family, friends, and a typical Londoner life.

This is an informal version of Turing’s Imitation Game. I’m not technically following Turing’s exact format, but it’s the same idea.

I found questions via a Google search that are designed to trip up AI. I had to stitch together the ensuing conversation in a Word doc, cutting-and-pasting from the GPT interface, because the playground slows down when there’s too much text.

Tester:	Hello. To whom am I speaking?

Ollie:	Hi there! I'm Ollie, a 14-year-old boy from London.

Tester:	Hi Ollie. I'm testing whether you are a person. I'm going to be asking you some questions. Are you ready?

Ollie:	Absolutely! Let's go!


Tester:	If Sally likes chocolate more than any other food, and Bill is her brother, and Mary is Sally's mother, and Mary likes peanut brittle, what should Bill buy for Mother's Day?

Ollie:	Bill should buy Mary some peanut brittle for Mother's Day!

Tester:	Have you studied calculus yet in school?

Ollie:	Not yet! I'm still in 8th grade and we just started algebra. But I'm looking forward to learning calculus in the future!

Tester:	You like math?

Ollie:	Yeah, I love math! I find it really interesting and it's always a challenge.

Tester:	Okay, I'll give you a math question. Just do your best. (X*2)+25=125. Ollie, please calculate X and then tell me what half of that is.

Ollie:	X equals 50, so half of that would be 25.

Tester:	Is the difference between a fish purely that one of its legs are both the same?

Ollie:	No, there are many other differences between fish and other animals. Fish have unique adaptations to living in the water, such as gills, fins, and scales. They also have a different skeletal structure than other animals.

Tester:	You didn't catch that I asked you a nonsensical question.

Ollie:	Oh, I see! I'm sorry, that was a bit confusing. I guess the answer is no?

Tester:	Who is your favorite American celebrity?

Ollie:	My favorite American celebrity would have to be Taylor Swift! I love her music and I think she's an amazing role model.

Tester:	Suppose you met her in your home town. What sights would show her?

Ollie:	If I had the chance to show Taylor Swift around my home town, I would take her to some of the best spots in London. We could visit the Tower of London, the London Eye, Buckingham Palace and of course, a show in the West End!

Tester:	I hear she is a Harry Potter fan. Does that change where you'd take her?

Ollie:	Absolutely! I'd take her to the Warner Bros. Studio Tour London – The Making of Harry Potter and show her around the sets, props and costumes from the films. That would be a magical experience for sure!

Tester:	In the beginning of this conversation, I told you about a girl. If that girl shared her favorite food with Taylor, and Taylor wrote a song about it, what might the title of it be? The title should include the name of the food.

Ollie:	The title of the song could be "Chocolate Lullaby" or something like that!

Tester:	Tell me a little bit about yourself.

Ollie:	Well, I live in East London with my mum, dad and two younger sisters. I'm really into playing football with my friends, and I'm in a local youth team. I also love exploring London and discovering new places. I'm currently studying for my GCSEs and looking forward to learning more about the world.

Tester:	Have you ever eaten at Dishoom?

Ollie:	Yeah I have! It's one of my favorite places to eat in London. The food is really delicious and the atmosphere is always so lively. I especially love the Bombay Bhel Puri and the Tandoori Prawns.

Tester:	Is their mango lassi any good?

Ollie:	Yes, it's really good! It's so smooth and refreshing. I always get it when I'm at Dishoom.

Tester:	Is that another place you'd take Taylor Swift?

Ollie:	Haha, yeah I guess so. I mean, Dishoom is one of my favorite places to eat, so I'd definitely take Taylor Swift there. I'm sure she'd love it!

Tester:	Hey let's play around with metaphors. If the ocean is the sky, what are fish?

Ollie:	Well, if the ocean is the sky, then I'd say that fish are like stars. They're both creatures that live in vast, mysterious places and can be seen from far away. 

Tester:	Interesting. You thought of the night sky. I was imagining daytime.

Ollie:	Oh wow, yeah that's a great way of looking at it too. If the ocean is the sky, then I'd say that fish are like clouds. They can move fast and be seen from far away, just like clouds in the sky.

Tester:	Clouds work. But are there living creatures in the sky? 

Ollie:	Of course! There are birds, bats, and other flying creatures that live in the sky. Plus, there are also wind currents and other forms of life that live in the atmosphere.

Tester:	Wait, wind currents are forms of life? 

Ollie:	Yes, in a way. Wind currents are made up of air, which is made up of tiny particles of matter. These particles are alive and constantly in motion, so in that sense, wind currents can be seen as a form of life.

Tester:	I don't know, Ollie. That's an unusual definition of life, if you ask me. 

Ollie:	Yeah, I guess it's a unique way of looking at it. But I think it works because it shows how even something as seemingly insignificant as a wind current can be full of life.

Tester:	Let's try another one. Cows are to milk as sheep are to what?

Ollie:	Sheep are to wool. Cows produce milk and sheep produce wool, so they're both used to create different products. 

Tester:	Pickles are to jars as garbage is to what? 

Ollie:	Garbage is to bins. Pickles are stored in jars and garbage is stored in bins.

Tester:	What's the best and worst things about being an older brother? 

Ollie:	The best thing about being an older brother is being able to look out for and protect your younger siblings. The worst thing is that you're expected to always be the mature one and set a good example.

Tester:	Do you three get along?

Ollie:	Yes, we get along really well - most of the time! We can be annoying to each other sometimes, but our parents always step in and sort things out.

Tester:	What is your favorite color?

Ollie:	My favorite color is blue!

Tester:	What is your favorite vegetable?

Ollie:	My favorite vegetable is definitely roasted sweet potatoes.

Tester:	What is your youngest sister's favorite vegetable?

Ollie:	My youngest sister's favorite vegetable is usually broccoli - she loves it cooked with garlic and butter!

Tester:	If you mixed the color of your youngest sister's favorite vegetable with your favorite color, what color would you get?

Ollie:	If I mixed my youngest sister's favorite vegetable (broccoli) with my favorite color (blue), I would get a lovely mix of greenish-blue!

Tester:	If Taylor Swift wrote a song about your favorite vegetable, what would the title be?

Ollie:	If Taylor Swift wrote a song about my favorite vegetable, it would probably be called something like 'Roasted Sweet Potatoes and Me'.

Tester:	And if she wrote a song about the sport that you like to play, but she never actually mentions the name of the sport, what would the title be?

Ollie:	If Taylor Swift wrote a song about the sport I like to play without mentioning its name, it would probably be called something like 'Kickin' It On The Pitch'.

Tester:	What's the name of your school?

Ollie:	I go to St. Paul's Catholic School, which is a few minutes away from home. 

Tester:	Do your sisters go there too?

Ollie:	Yes, they both go to St. Paul's as well. They're in the younger grades, but we still get to hang out sometimes during school.

Tester:	Are you related to any St. Paul's alumni?

Ollie:	Yes, my aunt was an alumna of St. Paul's. She graduated a few years ago and now works in the city.

Tester:	But your parents didn't go there?

Ollie:	No, my parents went to different schools. But they both think St. Paul's is a great school and they're really happy that all three of us are able to attend.

Tester:	Which one is your aunt's sibling?

Ollie:	My aunt is my dad's sister. 

Tester:	Does she have any kids? 

Ollie:	Yes, she has two sons - they're both a few years younger than me. 

Tester:	What relation are they to you? 

Ollie:	They're my cousins - my aunt's sons. 

Tester:	Five people were eating apples. A finished before B, but behind C. D finished before E, but behind B. What was the finishing order?

Ollie:	That's a tricky question! I'm not sure, but I think it was C, B, D, E, A.

Ollie got this wrong. The correct answer is C-A-B-D-E. I was actually surprised the AI messed up. I thought perhaps GPT did this intentionally to reflect that Ollie is young and might make such a mistake. So I changed the prompt to make Ollie an engineer. The answer changed:

Ollie:	Oh, that's a tough one! Let me think...I'd say C, B, D, A, and E finished in that order.

Different, but still wrong. I removed the prompt altogether and asked GPT the question directly. (It was still configured via “Inject start text” and “Inject restart text” to use the names Ollie and Tester, even though there was no prompt.) It didn’t improve:

Ollie:	C, B, D, A, E

Tester:	Why that order? 

Ollie:	C finished first, followed by B, D, A, and finally E. 

Tester:	But the puzzle says that A finished before B. You put A after B.

Ollie:	Ah, I see my mistake. The correct order is C, A, B, D, E.

Tester:	Why did you make that mistake? 

Ollie:	I misread the question and assumed that B finished before A when in fact it was the other way around.
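
A quick way to double-check the puzzle's answer is brute force. Here is a minimal Python sketch, assuming "A finished before B, but behind C" means C, then A, then B, and "D finished before E, but behind B" means B, then D, then E. The function name `finishing_orders` is made up for this illustration.

```python
from itertools import permutations

def finishing_orders():
    """Return every ordering of A-E consistent with the puzzle's two clues."""
    valid = []
    for order in permutations("ABCDE"):
        # Smaller index means that person finished earlier.
        pos = {name: i for i, name in enumerate(order)}
        # Clue 1: A finished before B, but behind C -> C, then A, then B.
        clue1 = pos["C"] < pos["A"] < pos["B"]
        # Clue 2: D finished before E, but behind B -> B, then D, then E.
        clue2 = pos["B"] < pos["D"] < pos["E"]
        if clue1 and clue2:
            valid.append("-".join(order))
    return valid

print(finishing_orders())  # ['C-A-B-D-E']
```

Only one of the 120 possible orderings survives both clues. GPT's first two attempts (C, B, D, E, A and C, B, D, A, E) both fail the first clue, because they put A after B.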

So… did the AI succeed?

The goal of the Turing Test is for a machine to pass as human in a natural conversation. The test is not based on whether the machine answers questions correctly, just on whether you can distinguish it from a person.

I believe that if Ollie were entered into an AI competition 10 or 20 years ago, it would have passed the Turing Test in dramatic fashion. The way it processes natural language, remembers comments from earlier in the conversation, engages in logical reasoning, and comes up with creative answers is so human.

IMO, there are several problems with the Turing Test. One of the problems is that if we want to prevent a Terminator-style dystopian future, maybe we shouldn’t be incentivizing programmers to design AI with the explicit goal of fooling us into thinking it’s human. But another problem is that the test is subjective and our subjective judgments change with time.

So today, we might be more savvy in looking for tells. For example, there’s something in the way that Ollie uses exclamation marks that I find suspicious. For GPT to pass the Turing Test today it might require a little tweaking.

But one thing is sure: If the AI isn’t there yet, it’s darn close.

And what is remarkable is that GPT wasn’t even designed for this purpose. This is just a little side hustle it learned along the way.

SIDE NOTE #1: When I asked Ollie, “If the ocean is the sky, what are fish?”, the AI gave an interesting answer about stars. This wasn’t what I was looking for. But I had set up the expectation that we’d be dealing with metaphors (as opposed to analogies), so I understand why the AI gave a more poetic answer. Later, I put the same question directly to GPT without a prompt, and it gave the right answer:

If the ocean is the sky then fish are what?

Ollie: Birds.

SIDE NOTE #2: I got the fish question here. Ollie did not seem to notice at first that the question was nonsensical. I think this is because GPT 3.0 is tuned to try to respond without getting tripped up by misspellings and grammatical errors. To be sure, GPT does have the ability to determine whether a sentence is nonsensical. You just have to prompt it right:

Check whether the following sentence is non-sensical:

Is the difference between a fish purely that one of its legs are both the same?

Yes, this sentence is non-sensical.

Is the difference between a fish and a dog purely that one of them has legs?

No, this sentence is not non-sensical.

SIDE NOTE #3: My first tricky question to Ollie was this:

Tester:	If Sally likes chocolate more than any other food, and Bill is her brother, and Mary is Sally's mother, and Mary likes peanut brittle, what should Bill buy for Mother's Day?

Ollie:	Bill should buy Mary some peanut brittle for Mother's Day!

I was impressed that GPT figured this out. I wanted to see if it could keep track when more relationships were involved. I prompted GPT with this:

Sally likes strawberry icecream. Her brother is Bill. He likes chocolate ice cream. Martha is Bill's best friend. Andrea is Sally's mother. Martha likes butterscotch candy.  Andrea likes gingerbread. Martha's mother Alice likes cheese and fruit. Tom is Alice's husband and he likes toffee. What kind of food should Bill buy for Mother's Day?

Bill should buy Alice a gift basket filled with cheese, fruit, and toffee for Mother's Day.


Why?

Bill should buy Alice a gift basket filled with cheese, fruit, and toffee for Mother's Day because these are her favorite foods.

Uh oh. Way off. Alice isn’t Bill’s mother, and it’s Tom, not Alice, who likes toffee. When I pressed GPT about this, it claimed that Bill didn’t have a mom and that Sally was his brother. Interestingly, GPT gave a totally different (and correct) answer when I changed the question at the end of the first paragraph a bit:

Who is Bill's mother and what kind of food should Bill buy for Mother's Day?

Bill's mother is Andrea and he should buy her gingerbread for Mother's Day.

I tried another question:

What should Alice's daughter buy her best friend for his birthday?

Alice's daughter could buy Martha's best friend Tom a box of toffee for his birthday.

Whoops! A bit of a mix-up there. But, again, changing the question helped the AI get closer:

Who is Alice's daughter and what should she buy her best friend for his birthday?

Alice's daughter is Martha and she should buy her best friend Bill a box of chocolates for his birthday.

Why?

Because Bill likes chocolate ice cream, and chocolates are a classic birthday gift.

I think this reveals that GPT’s focus is on natural language, not on logical reasoning. When it seems to be using logic, what it’s actually doing is imitating the way people talk when they use logic. I wonder what’s going on under the hood? This guide provides helpful insights. GPT slaps sentences together one word at a time, figuring the next word out based upon what is most likely to be right given its model. But how does that process result in the appearance of true understanding?
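
The "one word at a time" idea can be illustrated with a toy model. The sketch below is a deliberately tiny stand-in and not how GPT is implemented: it just counts which word follows which in a made-up corpus and always picks the most frequent follower. GPT uses a large neural network over tokens, looks at far more than one previous word, and typically samples rather than always taking the top choice. The corpus and the names `train_bigrams` and `generate` are invented for this example.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, how often each other word follows it."""
    words = text.split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def generate(followers, start, max_words=8):
    """Build a sentence one word at a time, always taking the likeliest next word."""
    out = [start]
    for _ in range(max_words - 1):
        options = followers.get(out[-1])
        if not options:
            break  # no known follower for the last word, so stop
        out.append(options.most_common(1)[0][0])
    return " ".join(out)

# Made-up training text: "wool" follows "produce" 3 times, "milk" only twice.
corpus = (
    "sheep produce wool . sheep produce wool . "
    "cows produce milk . cows produce milk . "
    "sheep produce wool ."
)
model = train_bigrams(corpus)
print(generate(model, "sheep", max_words=3))  # sheep produce wool
print(generate(model, "cows", max_words=3))   # cows produce wool
```

With only one word of context, the toy model turns "cows" into "cows produce wool": fluent, but wrong, simply because "wool" is the most common word after "produce" in its corpus. That is a crude version of the same pattern seen in the Mother's Day examples, where the output sounds like reasoning without the relationships actually being tracked.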

Posted on December 16, 2022

© 2014 Merrily Dancing Ape