There's No I in AI

Steven Pemberton, CWI, Amsterdam

The author as seen by AI

Turing and me

Alan Turing on UK £50 note

Recently it was the 75th anniversary of Alan Turing's 1950 paper "Computing Machinery and Intelligence".

Turing is considered the father of AI. He starts the paper with

"I propose to consider the question, 'Can machines think?'",

and introduces what is now called the Turing Test of machine intelligence.

University

Richard Grimsdale

At university my tutor was Richard Grimsdale, who built the first ever transistorised computer.

Grimsdale's tutor was Alan Turing (making me a grand-tutee of Turing).

Post-University

MU5

I (coincidentally) went on to work in the department in Manchester where Turing worked and wrote that paper.

I worked on the 5th computer in the line of computers Turing also worked on, the MU5.

Amsterdam

A project meeting

Moving to the Netherlands, I co-designed the programming language that Python is based on.

Internet

Steven at a computer in the 80s

I was the first user of the open internet in Europe, in November 1988, 37 years ago!

CWI set up the first European internet node (64Kbps!), and then two spin-offs to build the internet out in Europe and the Netherlands.

Web

Steven with Tim Berners-Lee

I organised workshops at the first Web conference at CERN in 1994.

I was chair of the HTML, XHTML, and XForms Working Groups at W3C.

I co-designed HTML, CSS, XHTML, XForms, RDFa, and several others.

I still chair XForms and ixml.

Enough about me...

Let's talk about I

In particular, the I in "AI".

The Story So Far...

The story so far

This talk is part of a course I give on Introduction to AI at the XML Summer School in Oxford every year.

Spoiler Alert

Spoiler alert! Everybody dies

Well, maybe not everyone...

Maybe...

There's No I in AI

What we currently refer to as "AI" is not intelligent in the way we normally mean the word.

Current AI is a clever use of language, clever enough that we think it is intelligent.

That is why we see such blunders, and why we can't trust what it produces but must always double-check.

Errors

ChatGPT makes a blunder

This is an actual (trick) question from a maths exam: 120 players take 40 minutes to play Beethoven's 9th, how long would 60 players take?

ChatGPT response:

It would take 60 players 80 minutes to play Beethoven's 9th Symphony.

Errors

An AI making a fool of itself

Pareidolia

Humans are inclined to interpret things from a personal point of view.

Faces

Seeing things that aren't there

Swans feeding fish:

Swans apparently feeding fish

We interpret this as the swans feeding the fish, but in fact the swans are just wetting the food, and the fish are taking advantage of what they drop.

Eliza

In the 1960s Joseph Weizenbaum created a program that imitated a Rogerian psychotherapist.

Men are all alike.

In what way?
They're always bugging us about something or other.

Can you think of a specific example?
Well, my boyfriend made me come here.

Your boyfriend made you come here?
He says I'm depressed much of the time.

I am sorry to hear you are depressed.
It's true, I am unhappy.

Do you think coming here will help not to be unhappy?
I need some help, that much seems certain.

In a classic example, a departmental secretary who was trying it out asked Weizenbaum to leave the room, because she wanted to talk to it in private.
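A conversation like the one above needs no understanding to produce. The toy below is written in the spirit of Eliza, but it is not Weizenbaum's script: the three rules, the pronoun table and the names `REFLECT`, `RULES`, `reflect` and `respond` are all my own inventions, chosen only so that a few of the replies shown above come out of simple pattern matching.

```python
import re

# Hypothetical pronoun swaps, so the user's own words can be echoed back.
REFLECT = {"i": "you", "me": "you", "my": "your", "am": "are"}

# Hypothetical (pattern, reply template) rules, tried in order.
RULES = [
    (re.compile(r"\bi am (.+)", re.I), "I am sorry to hear you are {0}."),
    (re.compile(r"\bmy (.+)", re.I), "Your {0}?"),
    (re.compile(r"\balways\b", re.I), "Can you think of a specific example?"),
]

def reflect(fragment):
    """Swap first-person words for second-person ones, word by word."""
    return " ".join(REFLECT.get(w, w) for w in fragment.lower().split())

def respond(sentence):
    """Return the reply of the first matching rule, or a stock question."""
    text = sentence.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*(reflect(g) for g in match.groups()))
    return "In what way?"

for line in ["Men are all alike.",
             "They're always bugging us about something or other.",
             "Well, my boyfriend made me come here."]:
    print(respond(line))
```

Nothing here models what is being said; the program only rearranges the user's words. Any impression of sympathy is supplied by the reader.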

LaMDA

Recently a similar thing occurred with a Google employee claiming an AI chat program was sentient (because it said it was).

ChatGPT is widely considered to exhibit intelligence, and then people get surprised when it makes enormous blunders.

But seeing it as intelligent is just our pareidolia at work.

How LLMs Work (greatly simplified)

Natural language exhibits very strong statistical properties.

For instance, I took the first chapter of James Joyce's Ulysses, which contains 1100+ sentences, 7000+ words, 40,409 characters, of which 3577 (about 9%) are the letter "e", but only 24 are an "x" and 33 a "j".

Stately, plump Buck Mulligan came from the stairhead, bearing a bowl of lather on which a mirror and a razor lay crossed. A yellow dressinggown, ungirdled, was sustained gently behind him on the mild morning air. He held the bowl aloft and intoned:

-Introibo ad altare Dei.

Halted, he peered down the dark winding stairs and called out coarsely:

-Come up, Kinch! Come up, you fearful jesuit!

Solemnly he came forward and mounted the round gunrest. He faced about and blessed gravely thrice the tower, the surrounding land and the awaking mountains. Then, catching sight of Stephen Dedalus, he bent towards him and made rapid crosses in the air, gurgling in his throat and shaking his head.
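Counts like these take only a few lines to reproduce. The sketch below is an illustration, not the code used for the talk: it tallies the characters of a single sentence from the passage above, whereas the figures quoted come from the whole first chapter. The function name `char_frequencies` is my own.

```python
from collections import Counter

def char_frequencies(text):
    """Return {character: relative frequency} for every character in text."""
    counts = Counter(text)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}

# Assumption: one sentence stands in for the full chapter.
sample = ("Stately, plump Buck Mulligan came from the stairhead, bearing a bowl "
          "of lather on which a mirror and a razor lay crossed.")

freqs = char_frequencies(sample.lower())
# Show the five most frequent characters (spaces and punctuation count too).
for ch, f in sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)[:5]:
    print(repr(ch), f"{f:.1%}")
```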

Generating text statistically

If I generate random characters of text from Ulysses, using only the statistical likelihood of a character appearing, I get something like this:

ites ecginlsacheurge,o gHTmawgala eSuh nh by ti.e mbp!lrittnoebneiwanb leTah osn,ua Dd i ihasshrrdupoidlass el oe,obeu fetd,o w Tiyynrm huademn ir de ey S h ieao..ethf atriasnd hhniuariwyatan lftus deaiotelidKWgaplbbhuperhdecewy,o tsfdnrreSsgiyn t.inn aeb

-st,,eghwoese.lotoi imon fpato irfs Hrsryege t eib,edoschonblehtoohosn wsumhuDndeetvbnaMcwNl Idoyrh d e cu rm a gavit ta o keumt argu clh

-uuky ,irmtlno.auoit satu'pwla,aIaprdpoKd laee eheet sttioeFsecent.hnoiiiee e
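Output of this kind can be produced roughly as follows. This is a minimal sketch under my own assumptions: a short sample instead of the full chapter, a fixed random seed so that runs are repeatable, and a function name (`generate_unigram`) that is not from the talk.

```python
import random
from collections import Counter

def generate_unigram(text, length, seed=0):
    """Draw `length` characters independently, weighted by their frequency in text."""
    counts = Counter(text)
    chars = list(counts)
    weights = [counts[c] for c in chars]
    rng = random.Random(seed)  # fixed seed: same output on every run
    return "".join(rng.choices(chars, weights=weights, k=length))

sample = ("Stately, plump Buck Mulligan came from the stairhead, bearing a bowl "
          "of lather on which a mirror and a razor lay crossed.")
print(generate_unigram(sample, 80))
```

Each character is chosen without regard to its neighbours, which is why the result has the right proportion of letters and spaces but nothing resembling a word.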

Generating text statistically

However, there are other statistical properties. In this text a "q" is only ever followed by a "u", and a "z" only by "e", "i", "o", "y", or "z".

So I select a random character, and then generate the next character randomly from the characters that can follow it, and then the next from the characters that can follow that:

e thengas. pewir msk de. m. an ck. on Whe My s achesmavaig setok. Ithe m tey ogr ur ng wheaisshe f Ph ffron p, A cker w our ist icting and, tat haile n cang d. Sowes an, t he aielle whend. s thirwengay Mata I ox? I f Goweveeaicoma ace ind,pund, the t, l mnd he t

-Favengad ing -Hewintil ppopast fet ind d se -Cannghe azin? l Ston at id owofin swilelok s aiarer, O, Oryowng I y anghe rbaleereas alletod oullourdougack, Thist,

-Thanghed tin ond Core s as, Ond ofumorrs ofowhe vofof eyed Hato bomepathe
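The procedure just described can be sketched as a small table lookup. This is my own illustration, not the talk's code: the names `build_model` and `generate` and the `order` parameter are assumptions. Here `order` is the number of preceding characters used as context, so `order=1` corresponds to pairs of letters, `order=2` to triples, and so on.

```python
import random
from collections import defaultdict

def build_model(text, order=1):
    """Map each run of `order` characters to the list of characters seen after it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model

def generate(text, order=1, length=200, seed=0):
    """Generate text in which each character depends on the previous `order` characters."""
    rng = random.Random(seed)
    model = build_model(text, order)
    start = rng.randrange(len(text) - order)
    out = text[start:start + order]
    while len(out) < length:
        followers = model.get(out[-order:])
        if not followers:  # this context only occurs at the very end of the text
            break
        out += rng.choice(followers)  # repeats in the list act as weights
    return out

# Assumption: a short sample stands in for the full chapter.
sample = ("Stately, plump Buck Mulligan came from the stairhead, bearing a bowl "
          "of lather on which a mirror and a razor lay crossed. A yellow "
          "dressinggown, ungirdled, was sustained gently behind him on the "
          "mild morning air. He held the bowl aloft and intoned:")
print(generate(sample, order=1))
print(generate(sample, order=2))
```

With so little training text the output soon starts copying the source; a whole chapter gives the model far more choices at each step.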

Generating text statistically

So that was generated with pairs of letters; how about triples?

his off be ain gan's fely, fry not thoss rom sly ch whim. I'm tou whis fate saines, wer his of jewhe an the wome if calliust bay heres of wat win, Kin thenly, mustid. Woory it her re fring fible calam of callispinown wit, wop all od bolliedwas calligand hopen he mer riettephe thout not and ing re saing excleve graw spot Sted his wass laught. He swor, yourgund id, a quare bods, vanat wince ord ne.

-Haing.

-By for way learrong blatteake youre key, lart. Twericelboutheonchillothen an seres carit, Bucklende. The iftylied waseephe shit's themble fe. Hurvalick Malkwaing unks he bit? I'm him the him nody put, Gool throm turnowly beir, am, himpakinesing, le globlack youghts. Epices hisdaybriet.

Generating text statistically

How about quadruples?

wast. An I come fits cons, offerry of bround startsey. You plumpty quid?

-Grank like Zarapet. Not and sir? He to student of yound Arnold up thread browth woman, on gurgliste can't we of it, hung it door, Stephen's uncles said nose I'm couldn't in the somes like about that rosewood morn oddedly. Damn els. A slit of the and fanner booked the merry And bladdeneral pranks back, I caped him from in fell sighten said:

-Are mass of soft don't, Haines halone bore loose hat poing to ched a middy on but they the lus, a her me. His Paler, the a mothe sever, fraines.

-Thalended out the dring dancing ther. You plean jew, gunressed.

Generating text statistically

Quintuples?

-Dedalus, the man cliffs he said. Her gloomily. He floriously. It's a Hellen but he search and snappeare's all else the said. In a pint overeigns. That key, Kinch, walked clother breated out of the bay with joiner pockers whaled all, throughtn't red touch he colour ideas a jew, my he propped tonightly death, beer again a suddenly you're dread, beastly an attacked you fear of his not exist is chased his under chest and silver and is in a mirror old the coronational God's lips last breasons, Buck Mulligan sitting manner pope Mabinogion from the milkjug from them all night felt thing it his room, Dick and think of the milkwoman to boldness, theologicalling world, and it alongowes. How much? Then said to a beation frette, like and faced and to stant inst me bent Greek. Buck Mulligan said droned:

-Have too. A cracked by fits sure.

Generating text statistically

Sextuples?

That's over the Ship last:

-You put there with money and the key too. All. He hacked Stephen threw two pennies on to unlace his chin.

-No, the knife-blade.

-To tell: but hers.

He swept up near him your school kip and sleep her old woman asked. Inshore and junket. Ireland about some down the razor. Stephen picked men freed his hands before him. There's tone:

-For old woman show by it? Buck Mulligan is soul's cry, he said gloomily. You were makes they hat forward again vigour

Barely a word there that isn't English.

Words

What we have just done with characters, we could also do with words. At random:

-All sunshine isn't of legs name odour you me, death running Haines Haines I'm the head floor shake Father. wondering a loveliest atover

Mater

-Look pique Kinch. Idle same. a a etiquette. Son of in veiling Stephen's her me Stephen manage not and middleAbsurd! said, milk. Conscience. down night, steelpen. stars. making gay breath grand and grey to puffy went her again. of else God. Mulligan her

-Italian?

Words

Pairs of words:

That's our sakes. His head disappeared and these bloody English! Bursting with fry on the gulfstream, Stephen and Harry I should think you remember anything. I can't wear them from the Lord. Thus spake Zarathustra. His own rare thoughts, a third, Stephen turned suddenly for her.

-Snapshot, eh? Brief exposure. Buck Mulligan laid at the door. Come up, you killed her, Mulligan, says it's over. You don't speak the army.

-Ah, poor dogsbody! he sang: I makes tea and hobbles round the gulfstream, Stephen answered, O, jay, there's no milk. Buck Mulligan's voice asked. I think like a disarming and fro about to blame.

Words

Triples:

-But a lovely morning, sir, she said, by the sound of it. Are you from the kitchen tap when she was a girl. She heard old Royce sing in the year of the lather on his razorblade. He hopped down from his chair. Sit down. Pour out the mirror of water from the sea. Turning the curve he waved his hand. It called again. A sleek brown head, a seal's, far out on the parapet. Why should I bring it down? Or leave it there all day, forgotten friendship? He went over to the parapet. Stephen stood up and look pleasant, Haines said again. What do you mean?

-The imperial British state, Stephen answered, his colour rising, and the subtle African heresiarch Sabellius who held that the cold gaze which had measured

Words

Quadruples:

- I told him your symbol of Irish art. He says it's very clever. Touch him for a guinea. He's stinking with money and indigestion. Because he comes from Oxford. You know, Dedalus, you have the real Oxford manner. He can't make you out. O, my name for you is the best: Kinch, the knife-blade. He shaved warily over his chin.

- He was raving all night about a black panther, Stephen said. Where is his guncase?

- A woful lunatic! Mulligan said. Were you in a funk?

- I was, Stephen said with energy and growing fear. Out here in the dark with a man I don't know raving and moaning to himself about shooting a black panther.

This is just straight text from Ulysses, due to the small learning set. No point in going further.
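One way to see why longer word contexts end up reciting the source is to count how many contexts leave only one possible next word. The sketch below is my own illustration on a made-up toy sentence (not text from Ulysses); the function names are assumptions, and `order` is the number of preceding words used as context, so word triples correspond to `order=2` and quadruples to `order=3`.

```python
from collections import defaultdict

def continuations(words, order):
    """Map each run of `order` words to the set of distinct words that follow it."""
    followers = defaultdict(set)
    for i in range(len(words) - order):
        followers[tuple(words[i:i + order])].add(words[i + order])
    return followers

def forced_fraction(words, order):
    """Fraction of contexts that allow exactly one next word."""
    followers = continuations(words, order)
    forced = sum(1 for nxt in followers.values() if len(nxt) == 1)
    return forced / len(followers)

# Toy training text, invented for this example.
words = "the cat sat on the mat and the dog sat on the rug".split()

for order in range(1, 5):
    print(order, round(forced_fraction(words, order), 2))
```

Once the fraction reaches 1, every step of the generator has only one option, so it can do nothing but replay the training text. A larger training set pushes that point out to longer contexts.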

LLMs

LLMs are just this, only writ large, additionally using statistical techniques for related meanings.

LLMs just generate text related to what you have typed.

Images

This is also, by the way, why you can get such weird images. The pieces just fit together.

A waitress with feet for hands

Taylor Swift as seen by AI

Representing Meaning

Consider the traditional UK political landscape.

Because of its ancient voting system, it tends to produce a small number of parties: two large ones, plus a handful of regional and other parties.

Because there is such a small number of parties, the two main parties tend to be very broad, each a sort of pre-arranged coalition of interests.

Normally the UK parties are described on a left-right axis:

Left..............Centre..............Right
         Labour  Libdem    Tory→

Dimension

Because there is a large group of people who would never vote Tory, and another large group who would never vote Labour, the parties tend to drift towards the centre, where the voters who change their voting choice are situated.

You could describe the British parties by a position representing (approximately) where they are located on this left-right axis from -1 to 1:

Labour: -0.25
Libdem: 0
Tory: 0.7

Second dimension

Another axis might reflect their current position on Europe:

Anti...................................Pro
Tory             Labour           Libdem

Tory: -1
Labour: 0
Libdem: 1

You could then create a two-dimensional idea of the parties by combining these axes:

Labour: (-0.25, 0)
Libdem: (0, 1)
Tory: (0.7, -1)

Values

There is nothing essential to using -1 to +1 as the numbers.

You could just as well use 0 to 1 with the same effect, with 0.5 representing 'in the middle':

Labour: (0.375, 0.5)
Libdem: (0.5, 1)
Tory: (0.85, 0)
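The change of scale is a simple linear mapping. As a sketch (the function name `rescale` and its parameters are my own), this reproduces the second set of coordinates from the first:

```python
def rescale(value, old_min=-1.0, old_max=1.0, new_min=0.0, new_max=1.0):
    """Linearly map value from [old_min, old_max] onto [new_min, new_max]."""
    return new_min + (value - old_min) * (new_max - new_min) / (old_max - old_min)

# Coordinates on the -1 to 1 scale: (left-right, Europe).
parties = {"Labour": (-0.25, 0), "Libdem": (0, 1), "Tory": (0.7, -1)}

rescaled = {name: tuple(rescale(v) for v in pos) for name, pos in parties.items()}
for name, pos in rescaled.items():
    print(name, pos)
```

The order of the parties along each axis, and their relative spacing, is unchanged; only the labels on the ruler differ.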

Other systems

More modern voting systems allow a greater range of parties.

For instance, the Netherlands had 25 parties at the last election, of which 15 got elected.

It is less informative to display them just on a left-right axis.

One way they are displayed there is on two axes: left-right and progressive-conservative.

The Dutch Political Landscape

So you could represent the parties on this diagram by a position of two coordinates. For instance, D66, about the same as the UK Libdems, is at roughly (0, 0.5).

Other dimensions

The CDA and the VVD are very close on the above diagram, both similar to the (pre-Brexit) Conservatives, but the CDA are Christian, and the VVD secular.

So you could add another dimension of religion.

Two parties, the Dutch Labour Party and the Green-Left party, considered themselves close enough to coalesce, at least for the election; their main difference was on the environment.

So you could add environment as a dimension. Or Europeanism vs Nationalism.

Similarly there's a party for older people, and one for animal rights, and so on.

30 Dimensions

The website that produced the above image helps voters discover who they should vote for.

They ask 30 questions, and on that basis say which parties you are closest to. The number of questions has to be at least as many as the parties, otherwise you won't be able to distinguish them all.

This means that they use 30 dimensions to represent the parties, so really the 'semantics' of a party is a list of 30 numbers.

Your position is also a list of 30 numbers, and then a good match is the party that is the 'nearest' to you in those 30 dimensions.

You could subtract the lists of numbers for two parties, and get a list of numbers that would expose the differences in approach between them, or between a party and you.
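Matching and subtracting can be sketched in a few lines. To keep it readable this uses the two-dimensional UK coordinates from earlier rather than 30 numbers; the code is the same for any length of list. The voter's position is invented for the example, and straight-line (Euclidean) distance is my assumption: I don't know which measure the website actually uses.

```python
import math

# (left-right, Europe) coordinates from the earlier UK example.
parties = {"Labour": (-0.25, 0.0), "Libdem": (0.0, 1.0), "Tory": (0.7, -1.0)}

def distance(a, b):
    """Straight-line distance between two positions of equal length."""
    return math.dist(a, b)  # available from Python 3.8

def difference(a, b):
    """Axis-by-axis difference, showing where two positions diverge."""
    return tuple(x - y for x, y in zip(a, b))

def nearest_party(voter, parties):
    """Name of the party whose position is closest to the voter's."""
    return min(parties, key=lambda name: distance(voter, parties[name]))

voter = (-0.1, 0.6)  # hypothetical voter: slightly left, fairly pro-Europe
best = nearest_party(voter, parties)
print(best, difference(voter, parties[best]))
```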

Visualising

We are very bad at visualising anything above 3 dimensions, so they reduce the picture to the two above.

Computers don't have that problem, so they can find clusters, and tell you the semantic 'distance' you are from various parties.

This is the basis of the method by which GPT programs represent the meaning of words: each word has a list of numbers, each number representing that word's position on a particular meaning axis.

Words that are synonyms, or near synonyms, are then close to each other in the semantic space.
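As a toy illustration of 'close in semantic space', the sketch below compares word vectors by the angle between them (cosine similarity), a measure commonly used for this purpose. The three-number vectors are invented for the example; real models learn much longer lists, and the axes here mean nothing in particular.

```python
import math

# Hypothetical word vectors, made up for this example.
vectors = {
    "big":   (0.9, 0.1, 0.3),
    "large": (0.8, 0.2, 0.3),
    "tiny":  (-0.9, 0.1, 0.2),
}

def cosine_similarity(a, b):
    """1.0 means pointing the same way, 0 unrelated, -1.0 opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

for other in ("large", "tiny"):
    print("big", other, round(cosine_similarity(vectors["big"], vectors[other]), 2))
```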

Learned Axes

There are two notable things:

  1. because they are discovered by machine learning, we don't know what the axes are, nor can the program explain to us what they are;
  2. there are thousands of axes, more than ten thousand in the largest models! (So each word is represented by a list of thousands of numbers.)

The axes likely include

and so on, but because machine learning is so good at spotting patterns that we can't even see, there are surely axes that we don't even have a name for.

Observations

A couple of interesting observations about words being represented by strings of numbers representing their meaning:

GPT

So when GPTs produce the next word, they don't just do it on the basis of syntax (as we have been doing up to now), they also use meaning to help choose the next word.

Video: youtube.com/watch?v=wjZofJX0v4M

Generalised Intelligence

The new arms race is on for generalised intelligence, when there really is an I in AI.

Companies and countries are pouring vast amounts of money into trying to get there first.

When will it happen?

What will happen when computers are more intelligent than us?

Meet my grandfather

AC Wheeler as a child

Born in 1880, a middle child in a family of 20(!) children.

1880: nearly no modern technologies; only trains and photography. No electricity.

In such a large household each child had a task, and it was his to ensure that the oil lamps were filled.

It must indeed have been an exciting time, when light became something you could switch on and off.

Paradigm shifts

Trains and photography were paradigm shifts: they change the way that you think about and interact with the world.

But they often replace existing ways of doing things, taking companies with them.

There are lots of examples of paradigm shifts:

Example: Kodak

Kodak's share price plummets

Who would have thought that Kodak didn't see this coming?

Accelerating change

My grandfather was born in a world of only two modern technologies, trains and photography, but in his life of nearly a hundred years, he saw vast numbers of paradigm shifts:

electricity, telephone, lifts, central heating, cars, film, radio, television, recorded sound, flight, electronic money, computers, space travel, ...

the list is enormous.

We are still seeing new shifts: internet, mobile telephones, GPS, internet-connected watches, cheap computers that can understand and talk back, self-driving cars, ...

Does that mean that paradigm shifts are happening faster and faster?

Yes, it does.

The Singularity

Paradigm shifts over the ages

Ray Kurzweil investigated this by asking representatives of many different disciplines to identify the paradigm shifts that had happened in their discipline, and when. We're talking here of time scales of tens of thousands of years for some disciplines.

He discovered that paradigm shifts are increasing at an exponential rate!

To illustrate: if at some point they happened once every 100 years, the next would come after 50 years, the one after that after 25 years, and so on.

Acceleration

Year   Time to next  In days
  0       100       36500
100        50       18250
150        25        9125
175        12.5      4562.5
187.5       6.25     2281.25
193.75      3.125    1140.63
196.875     1.563     570.31
198.438     0.781     285.16
199.219     0.391     142.58
199.609     0.195      71.29
199.805     0.098      35.64
199.902     0.049      17.82
199.951     0.024       8.91
199.976     0.012       4.46
199.988     0.006       2.23
199.994     0.003       1.11
199.997     0.002       0.56

As a Graph

Paradigm acceleration

A Similar Acceleration

Scientific journals

That may seem impossible, but we have already seen a similar expansion that also seemed impossible.

In the 1960s we already knew that the amount of information the world was producing was doubling every 15 years, and had been for at least 300 years.

We 'knew' this had to stop, since we would run out of paper to store the results.

And then the internet happened.
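To get a feel for what doubling every 15 years for 300 years amounts to, here is the arithmetic as a tiny sketch (the variable names are mine):

```python
doubling_period_years = 15
span_years = 300

doublings = span_years // doubling_period_years  # how many times the amount doubled
growth = 2 ** doublings                          # total growth factor over the span

print(doublings, "doublings, a growth factor of", growth)
```

Twenty doublings is a growth factor of about a million, which is why paper looked like a hard limit.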

How?

Paradigm shifts over the ages

So sometime in the nearish future paradigm shifts will apparently be happening daily? How?

One proposed explanation is that this is the point at which computers become smarter than us: computers will start doing the design rather than us.

A New Intelligence

So for the first time ever there will be 'things' more intelligent than us.

Within a short time, not just a bit more intelligent, but ten, a hundred, a thousand, a million times more intelligent.

Will they be self-aware? Quite possibly.

This raises new ethical questions. Would it be OK to switch them off?

To help you focus your mind on this question: suppose we find a way to encode and upload our own brains to these machines when we die. Is it still OK to switch them off?

The Super Intelligence

Three things are sure, they will be

and they will surely quickly be able to work out how to break into any internet-connected computer.

Logical systems

These are consistent systems that draw conclusions from current knowledge.

At the lowest level are axioms. These are the basis for logic: points that cannot be argued about, or derived from yet lower-level axioms.

Ethical Axioms

Any consistent logical system has at its basis a set of axioms that are unprovable, from which all other statements can be derived.

This includes ethical systems.

AI Axioms

So AI superintelligences will have to have axioms too.

What will they be? Will we be able to know?

Current LLMs are not inherently ethical. They are given a number of (hidden) instructions on how to behave, ringfencing certain undesirable behaviours (this is called 'alignment'), but people are always looking for ways to 'jailbreak' these fences, to show LLMs saying things they oughtn't.

This indicates that specifying axioms may not be realistic or even possible. Maybe the superintelligence will derive its own axioms.

Relationship

Will these new super intelligences be on our side? Will they look kindly on us?

There is no inherent reason why they should.

Compare our attitude to lesser intelligences on earth:

Why would a super-intelligence act differently?

Three Scenarios

So how might it develop?

Let's imagine three scenarios:

A bit like our three methods of treating lower intelligences.

Scenario 1: Friendly

If they are friendly, then they might see us as we see toddlers on a playground, and install a sort of benign parental dictatorship.

Scenario 2: Neutral

If they are neutral, the dictatorship might be similar but less benign.

Scenario 3: Adversarial

If they are adversarial, they may see us as a threat, for instance because of the climate crisis:

"Killing" doesn't mean setting the robots on us, but, for instance, switching off oil supplies, or energy generation for a couple of weeks.

Other scenarios

And of course, they may not be 'our' AI, but may be aligned with

It will all depend on what the moral or ethical axioms of the AIs turn out to be.

Conclusion

Climate warning 1912

We do need to have a plan.

We are able to solve problems quickly, for instance the ozone hole.

But we can also respond very slowly, especially if there is money to be made from it not being solved, or if solving it costs money or reduces convenience; look at Kodak, look at climate change...

Conclusion

The only cliffhanger is whether it will be the climate or the robots that get to us first...

The climate attacks! The robots attack!

Sorry