Internet 2023: Identity

Steven Pemberton, CWI, Amsterdam

(Image: A book looking like a manuscript)

As the internet becomes more and more integrated with society, we need to address issues of how to represent the things we do now in a digital way.

As I have pointed out in earlier talks, when printed books were first introduced, they looked exactly like manuscripts, as if they had been handwritten, because that was all people knew.

The first cars looked exactly like horseless wagons for similar reasons.

It took 50 years for books to start looking like what we know as books, and my theory is that the first generation of users needs to die out, so that the people who are left have never known the old ways, and can start asking "why do we do this in such a weird way?".

Weird ways

An example of a weird way of using the internet is how we deal with contracts: we just pass around PDFs of text.

There is nothing digital there, except that the paper is gone. We are just imitating the old ways.

Even worse, I recently had to:

  1. Fill in a form.
  2. Print it out.
  3. Sign it.
  4. Scan it.
  5. Upload the scan.

This talk will weave in and out between the real world and the digital world, addressing issues that are affected by digitisation.

Identity

How do you prove who you are?

Your real identity is just you: my local grocer's shop was fine with it if I had forgotten my money and said I'd come back later to pay.

Your phone may recognise your face, or your fingerprints.

Google Assistant can recognise you by your voice.

In real life

But in other cases, you have to prove your identity in some way: passport, driving licence, ID card.

And how do you prove who you are when getting your first passport?

You use your birth certificate.

This is called transitivity, where a property (in this case your identity) passes down the line.

(Dutch post offices used to only allow you to identify yourself for Post business with a Post Identity Card. You could use your passport to get one at the same post office, though.)

Problems with real-life identity

What if you have no birth certificate (such as my grandmother, who was a foundling)?

When my mother died I had to show (somehow) that she had only had two children, me and my brother.

(Image: Kim Walmsley 'man')

Here is a woman who later in life, after giving birth to 5 children, discovered her birth certificate says she's a man, and couldn't get her passport renewed. The church also annulled her marriage.

Problems

(Image: Michael Myers, homeless man)

There is the case of an American homeless man, who was unable to get a home or job or healthcare, because he couldn't prove who he was. To get a job or house he needed ID. To get ID he needed three pieces of official documentation: a birth certificate and two separate official documents proving residency. Catch-22.

Problems

What can you do if you are declared dead? How do you prove you are not?

This happens so often in India that there is even an Association of the Living Dead to help people through the arduous process of being recognised as living.

Identity on the Internet

How do we prove our identity on the internet?

Mostly by passwords... which I have complained about often enough.

You log in to your computer, by whatever means. It therefore knows it is you. But then you have to log in to a website to identify yourself, and then again, and again.

Online Identity Origin

What do we use to get a new username/password?

Our email address, the digital equivalent of our birth certificate.

An email address is unique, so can be used as a proxy for your identity.

(It doesn't matter if you have several email addresses: they all refer back to the same person.)

Passwords

We are encouraged to use difficult-to-remember passwords so that they are hard to crack.

We are encouraged to use different passwords for each site, so that if one gets broken into, the other sites are still safe.

Which means people end up using password management software, like LastPass.

Which got hacked last year...

All the passwords have been centralised.

Your identity should really be passed transitively down the line. You shouldn't need to repeatedly log in.

Transitivity

I have been promoting an idea for some years, one that has been around since before the Web, for allowing your identity to be transitive.

It works by everyone having two keys, a private one and a public one. A message locked with one can only be unlocked with the other.

If I lock a message with my private key, everyone can read it, but they know it is really from me.

If I lock a message with your public key, only you can read it.
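The two-key idea can be tried out with a toy example. The following is a minimal sketch using textbook RSA with deliberately tiny primes; the numbers and the function name lock are illustrative assumptions, and real systems use far larger keys plus padding, so this is not secure.

```python
# Toy key pair built from tiny primes; illustrative only, never use for real security.
p, q = 61, 53
n = p * q                  # the modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                     # public key exponent
d = pow(e, -1, phi)        # private key exponent (modular inverse, Python 3.8+)

def lock(message: int, key: int) -> int:
    """Lock or unlock a small number with one of the two keys (same operation)."""
    return pow(message, key, n)

message = 42

# Locked with my private key: anyone with my public key can read it,
# and the fact that it unlocks shows it came from the private key holder.
signed = lock(message, d)
print(lock(signed, e) == message)

# Locked with the public key: only the private key holder can read it.
sealed = lock(message, e)
print(lock(sealed, d) == message)
```

Both checks print True: whatever is locked with one key is unlocked with the other.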

Logging in without passwords

When I register with a site, I give it my public key.

Then when I log in, my browser sends the site a message locked with my private key: the site knows it is from me.

The message can also be locked with the site's public key, so that I know that it is going to the right site.

The good news: Passkeys are now being implemented in browsers. With any luck passwords on the web will be a thing we will laugh with our grandchildren about.
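To make the login idea concrete, here is a rough sketch of a challenge-response login in that spirit. It assumes the site sends a fresh random challenge that the browser signs; the toy key values, the user name and the function names are all hypothetical, and this is not the actual passkey protocol.

```python
import hashlib
import secrets

# Toy key pair (tiny textbook RSA numbers, illustrative only and not secure).
N, PUBLIC_E, PRIVATE_D = 3233, 17, 2753

def digest(data: bytes) -> int:
    """Hash the data and reduce it so that it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(challenge: bytes, private_key: int) -> int:
    """Browser side: lock the challenge digest with the private key."""
    return pow(digest(challenge), private_key, N)

def verify(challenge: bytes, signature: int, public_key: int) -> bool:
    """Site side: unlock with the registered public key and compare."""
    return pow(signature, public_key, N) == digest(challenge)

# Registration: the site stores only the public key, never a password.
registered_keys = {"some-user": PUBLIC_E}

# Login: the site sends a fresh challenge, the browser signs it.
challenge = secrets.token_bytes(16)
signature = sign(challenge, PRIVATE_D)
print(verify(challenge, signature, registered_keys["some-user"]))
```

Because the challenge is different every time, a recorded login cannot simply be replayed, and the site never holds a secret that can be stolen.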

No identity: Anonymity

The internet was originally built with no mechanisms for trust and identity built in.

It was after all a network for computer scientists and computer departments to communicate with each other.

Anonymity was originally seen as a good thing...

It turns out that it is good for a small number of things (whistleblowing, avoiding abusers), but is bad for a lot more.

Traceable anonymity

Some years back, researchers at the CWI in Amsterdam devised a form of digital money that was anonymous, unless you misused it, such as trying to spend it twice.

So you could spend the money anonymously, but if you misused it, they could trace who you were.

Ideally we need a form of anonymity that allows for whistleblowers, or people hiding from abusers, but lets you track down money launderers or abusers.
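The details of that scheme are not given here, but the general trick can be sketched. The following assumes a secret-splitting approach: the owner's identity is hidden in pairs of random-looking shares, and each spend reveals only one share per pair. All names are hypothetical, and a real scheme also needs signatures and commitments that are left out here.

```python
import secrets

def make_coin(identity: int, pairs: int = 8, bits: int = 64):
    """Split the identity into pairs of shares; left XOR right == identity."""
    coin = []
    for _ in range(pairs):
        left = secrets.randbits(bits)
        coin.append((left, left ^ identity))
    return coin

def spend(coin, challenge):
    """Reveal one share per pair, chosen by the receiver's challenge bits."""
    return [pair[bit] for pair, bit in zip(coin, challenge)]

def trace(challenge_a, reveal_a, challenge_b, reveal_b):
    """Two spends that differ in a challenge bit expose both shares of a pair."""
    for bit_a, share_a, bit_b, share_b in zip(challenge_a, reveal_a,
                                              challenge_b, reveal_b):
        if bit_a != bit_b:
            return share_a ^ share_b
    return None

coin = make_coin(identity=123456789)
first = [0, 1, 1, 0, 1, 0, 0, 1]    # challenge from the first shop
second = [1, 1, 0, 0, 1, 0, 1, 1]   # challenge from the second shop

# Spending the same coin twice reveals who the owner is.
print(trace(first, spend(coin, first), second, spend(coin, second)))
```

A single spend shows only one random-looking share from each pair, which says nothing about the owner; only the second spend gives the game away.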

What is a fact?

When I was at school, a friend brought a Victorian copy of the first volume of the Encyclopedia Britannica to school. Under Atom it said an atom was something that could not be cut in two. How we laughed.

Was that a fact then? Has the meaning of atom changed? Is a fact rooted in the time or environment it exists in?

Similar questions

"What is that in today's money?"

When Was Newton Born?

(Image: A British £1 note featuring Newton and giving his dates as 1642-1727)

Newton was born and died in a period when there were two calendars in use in Europe, the old Julian Calendar, and the new Gregorian calendar that we use now.

According to the old calendar he was born on 25 December 1642, and died on 20 March 1726.

According to the new calendar he was born 4 January 1643, and died 31 March 1727.

(In the old calendar the year started on 25 March, in the new, on 1 January, so according to the new calendar he died in a later year.)

And yet:

(Image: Newton's dates)
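The two sets of dates for Newton can be checked with a few lines of code. This is a small sketch that assumes the standard Julian Day Number formula for the Julian calendar, and the English convention that the old-style year number changed on 25 March; the function names are my own.

```python
from datetime import date

def julian_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert a Julian calendar date (year starting 1 January) to Gregorian."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    # Julian Day Number of the Julian calendar date
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    # Python's date ordinals count days in the (proleptic) Gregorian calendar
    return date.fromordinal(jdn - 1721425)

def old_style_to_gregorian(year: int, month: int, day: int) -> date:
    """Old Style: Julian calendar, with the year number changing on 25 March."""
    if (month, day) < (3, 25):
        year += 1   # still the 'old' year in Old Style reckoning
    return julian_to_gregorian(year, month, day)

print(old_style_to_gregorian(1642, 12, 25))   # born
print(old_style_to_gregorian(1726, 3, 20))    # died
```

This prints 1643-01-04 and 1727-03-31, matching the new-calendar dates given above.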

How many cows are there in Texas?

I once went to a talk that addressed the question "How many cows are there in Texas?"

The point was that you couldn't get a definitive answer to the question, only an answer within certain error bounds.

Facts

So a fact is an assertion, backed up by data, that has to be interpreted in its context, and within certain error bounds.

(Image: The geographical centre of the USA)

To give an example of 'facts' without error bounds, there is a service that will tell you where a computer with a certain IP address is located. What it doesn't tell you is how accurate that location is. If the service only knows "somewhere in the USA", then it gives the location as the geographic centre of the USA, rounded to the nearest degree latitude and longitude.

Unfortunately, there is a single house at that location:

(Image: House at centre of USA)

(Image: The house in Potwin)

Joyce Taylor, 82, grew up on a 360-acre farm north of Potwin. Beginning in 2011, Taylor began to have a steady stream of odd phone calls and visitors to her property: IRS agents asking for people she'd never heard of, police officers tracking down runaway children, crime victims demanding their stolen goods or accusing her of online scams. All of this Internet-era harassment confused Taylor, who had an old computer but only used it to write Sunday school lessons. One angry visitor even left a broken toilet in her driveway.
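What would a location 'fact' with error bounds look like? Here is a minimal sketch, assuming the service returned an accuracy radius along with the coordinates; the class, the function and all the values are illustrative assumptions, not the real service's API or data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Location:
    latitude: float
    longitude: float
    accuracy_km: float   # radius within which the true position is believed to lie

def usable_for(location: Location, required_km: float) -> bool:
    """A location is only usable if its error bound is tight enough for the task."""
    return location.accuracy_km <= required_km

# Illustrative values only: a country-level guess and a street-level fix.
country_level = Location(38.0, -97.0, accuracy_km=2500.0)
street_level = Location(52.0, 5.0, accuracy_km=0.05)

# Someone looking for a particular house needs accuracy of well under 1 km.
print(usable_for(country_level, required_km=1.0))   # False
print(usable_for(street_level, required_km=1.0))    # True
```

With the error bound attached, "somewhere in the USA" can no longer be mistaken for a particular house.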

Web 1

The web is 32 years old this year.

On 6 August 1991, Tim Berners-Lee posted a short summary of the World Wide Web project to an internet newsgroup inviting collaborators; the first web servers had been made publicly available a few months earlier.

(I know some people use other dates to mark the beginning of the Web, but what's a fact?)

The Architecture of the web

The original web was designed as a distributed system.

Computers connected to each other, with no central authority.

A read/write web, where you could both read and publish.

Unfortunately, Mosaic, the first really successful browser, left out the writing part.

Which meant that another way had to be found to publish on the web, which resulted in a lot of centralisation.

Web 2.0: Centralisation

And so emerged a generation of websites that allowed you to post stuff to them, and that got their value from users adding stuff.

Sites like Facebook or Twitter, photo websites, family tree websites, GitHub, Bandcamp, dating sites, or any of a thousand more.

This got called Web 2.0, although at the same time another change was happening, "Dynamic HTML", or "AJAX", so that the term Web 2.0 also got applied to that.

But where we had hoped for a distributed system, we ended up with lots of centralisation.

Some problems of centralisation

You have to decide which one you are going to use, or if you are going to use several.

You put a lot of work in.

You lose control of your data.

The sites don't interoperate.

You get locked in.

The site gets to spy on you.

If the site dies, or changes in a way you don't like, all the work you have put in is lost.

Distributed content

One of the things not often talked about with the original Web is that there are various ways of getting content.

HTTP is the one most used; it uses a URL (starting http://) to identify the single place the content must come from.

Another method uses magnet links. These identify a document with a hash.

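A hash is, simply put, something like the sum of all the characters in a document. This gives a number that can be used to identify the document you want. (Real hashes are cleverer than a plain sum, so that two different documents almost never give the same number.)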

Copies of that document are searched for, and if several are found, then they can all deliver parts of the document, for speed.

The link can also contain an HTTP source for the document, in case all else fails.

In this way, not only the computers are distributed, but also the content.

Which means that content is harder to block by state players.
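Identifying a document by its hash can be sketched in a few lines. This is a simplified illustration using SHA-256 from the Python standard library; it is not the actual magnet link or BitTorrent format, and the peer names and function names are hypothetical.

```python
import hashlib

def content_id(document: bytes) -> str:
    """Identify a document by a cryptographic hash of its content."""
    return hashlib.sha256(document).hexdigest()

def naive_sum(document: bytes) -> int:
    """The 'sum of all characters' simplification: easy to explain, easy to fool."""
    return sum(document)

def fetch(wanted_id: str, peers: dict):
    """Ask each peer for the document, and accept only a copy whose hash matches."""
    for store in peers.values():
        copy = store.get(wanted_id)
        if copy is not None and content_id(copy) == wanted_id:
            return copy
    return None

original = b"The web is a read/write medium."
wanted = content_id(original)

peers = {
    "peer-1": {wanted: b"Tampered copy"},   # claims to have it, but the hash won't match
    "peer-2": {wanted: original},
}
print(fetch(wanted, peers) == original)

# A plain sum cannot tell these apart; a real hash can.
print(naive_sum(b"ab") == naive_sum(b"ba"), content_id(b"ab") == content_id(b"ba"))
```

Because the identifier is derived from the content itself, it doesn't matter who delivers the copy: a wrong or tampered copy is simply rejected.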

How do you prove you own a thing

In real life you might be able to prove you own something by showing a receipt.

I think I could prove I own my computer. I'm not sure I could prove I own any of my bikes.

For larger and/or more expensive objects we have papers, or documents, or the ownership is registered centrally.

To buy and sell such things we use third parties who are usually legally registered to do such things.

Often we end up with some sort of certificate that declares your ownership, with a reference back to the third party, so that the claim can be verified if needed.

The legal system is geared to deciding claims.

Identifying a thing online

As more and more things become digitised, new ways are needed to identify things.

As we have seen, we can identify things digitally in different ways. A person via an email address, or some other unique identifier. A document via a URL, or a hash.

This was one of the central issues at the heart of the original Web 3.0 (a different thing from what is now being called Web3): how do we identify things, and how do we specify what properties they have, so that conclusions can be drawn automatically without anyone having to read or interpret text.

This would allow you to describe all human knowledge in a machine-readable way.

But it would also allow you to create, for instance, digital contracts.
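As a rough illustration of drawing conclusions automatically, here is a sketch that stores statements as (thing, property, value) triples and applies one rule. The "ex:" identifiers and the rule are assumptions made up for the example, not any particular standard vocabulary.

```python
# Each statement is a triple: (thing, property, value).
facts = {
    ("ex:CWI", "ex:locatedIn", "ex:Amsterdam"),
    ("ex:Amsterdam", "ex:locatedIn", "ex:Netherlands"),
}

def infer_located_in(triples):
    """Rule: if A is located in B, and B is located in C, then A is located in C."""
    known = set(triples)
    changed = True
    while changed:                      # repeat until no new facts appear
        changed = False
        for (a, p1, b) in list(known):
            for (c, p2, d) in list(known):
                new = (a, "ex:locatedIn", d)
                if p1 == p2 == "ex:locatedIn" and b == c and new not in known:
                    known.add(new)
                    changed = True
    return known

conclusions = infer_located_in(facts)
print(("ex:CWI", "ex:locatedIn", "ex:Netherlands") in conclusions)
```

Nobody ever stated that CWI is in the Netherlands; the program concluded it from the two facts it was given, without reading or interpreting any text.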

Blockchain

Blockchain is 'just' a distributed database.

Each block can contain anything really, and has a hash, and contains the hash of the previous block, thus chaining them together. So it works a bit like magnet links in that respect.
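The chaining can be sketched as follows. This is a bare-bones illustration of blocks linked by hashes, with hypothetical function names; it leaves out everything a real blockchain adds on top, such as distribution between computers and the consensus mechanism.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash of a block's full contents, including the previous block's hash."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def add_block(chain: list, data) -> None:
    """Append a block that records the hash of the block before it."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"data": data, "prev_hash": prev_hash})

def is_valid(chain: list) -> bool:
    """Every block must contain the true hash of its predecessor."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

chain = []
add_block(chain, "A gave 5 to B")
add_block(chain, "B gave 2 to C")
add_block(chain, "C gave 1 to A")
print(is_valid(chain))

chain[0]["data"] = "A gave 500 to B"   # try to rewrite history
print(is_valid(chain))
```

This prints True and then False: changing an old block changes its hash, so the next block no longer matches, and the tampering shows.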

Some of the problems of blockchain are:

Latency: it works much slower than regular databases. On the Bitcoin blockchain, a new block can only be added about every ten minutes.

Size: the more popular it gets, and the longer it is used, the bigger it gets. The Bitcoin blockchain is now about 450 GB, 16% larger than this time last year.

Energy: because of the methods it uses to ensure consistency, it uses, globally seen, huge amounts of energy:

"As of August 2022, published estimates of the total global electricity usage for crypto-assets are between 120 and 240 billion kilowatt-hours per year, a range that exceeds the total annual electricity usage of many individual countries, such as Argentina or Australia. This is equivalent to 0.4% to 0.9% of annual global electricity usage, and is comparable to the annual electricity usage of all conventional data centers in the world." (Source: White House)

Using up to nearly 1% of the world's electricity to maintain a database that would fit on half of a typical SSD is inefficient, to say the least.

Bitcoin

(Image: World shocked that man running business based on imaginary money might be fraud)

The originator of the blockchain, and its biggest user, is Bitcoin.

Since the success of Bitcoin, several other digital currencies have appeared.

Breaking news: All money is 'imaginary'

There are two different but related things: value and money.

The phrase "That's good value for money", demonstrates this.

In the beginning there was only value: people grew their own food, built their own houses, and if they needed a service, or some good from someone else, they would swap something for it, either another service, like giving their time to help with something, or another good: if you help me with this, I'll give you a dozen eggs.

Value was largely based on how much time and effort it took you to do something.

Compact, valuable things became a way of easily exchanging value: salt, for instance, which used to be very valuable (the word "salary" is traditionally said to come from Roman soldiers being paid in salt), and metals.

The state took on the task of stamping a mark on disks of valuable metal, to guarantee that the metal was real, and of a certain agreed weight. But these coins still had inherent value. (The last coins of 'real' value disappeared about 100 years ago. Now coins are just tokens.)

Banks emerged for storing money. "Bank notes" were just a promise that you could collect your money from the bank.

Now even physical money is disappearing. All money is imaginary.

Anonymous money

One of the advantages (or disadvantages, depending on who you are) of cash money is that it is anonymous.

You may remember that we used to have another form of debit card (pinpas): the Chipknip. That was an anonymous system, where you loaded 'digital' cash, and spent it anonymously.

That died about 10 years ago, replaced by the non-anonymous PIN system.

Forms of money

All money just represents the value that is in the system.

This is why printing money causes inflation: there is no extra value in the system, just more money, so the money gets spread thinner.

Money can come in many forms: even stocks and shares are a form of money.

If someone says "I've put my money in stocks and shares", what they really mean is that they have converted their State money into another form, tokens backed up by value in the companies involved.

This is why stock market collapses are so problematic.

So cryptocurrencies are no more imaginary than other forms of money (though less secure than State money).

NFTs

Most of the things stored on blockchains are contracts.

Even Bitcoin is really just a series of contracts of the form: Identity A gave amount M of money to Identity B.

NFTs are similarly a contract: Identity A has sold object C to Identity B.

There is no inherent reason to put NFTs on the Blockchain, but that is where they most often are.

The future of NFTs will stand or fall on how well the system identifies A, B, and C, and how these contracts will eventually be handled in courts (so far untested).

Some NFTs use URLs to identify objects. And then the website disappears.
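To illustrate why identifying the object only by a URL is fragile, here is a sketch of a sale record that also carries a hash of the object itself. The record layout, the names and the example URL are hypothetical, and this is not how any particular NFT platform stores its data.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class SaleRecord:
    seller: str        # Identity A
    buyer: str         # Identity B
    object_url: str    # where object C can be found, for as long as the site exists
    object_hash: str   # what object C is, independent of where it is stored

def record_sale(seller: str, buyer: str, url: str, content: bytes) -> SaleRecord:
    """Create a record that identifies the object by its content as well as its URL."""
    return SaleRecord(seller, buyer, url, hashlib.sha256(content).hexdigest())

def is_the_object(record: SaleRecord, content: bytes) -> bool:
    """Check whether some copy of the content is the object named in the record."""
    return hashlib.sha256(content).hexdigest() == record.object_hash

artwork = b"...the bytes of the artwork..."
sale = record_sale("identity-a", "identity-b", "https://example.org/art/1", artwork)

# Even if the website disappears, any surviving copy can be checked against the record.
print(is_the_object(sale, artwork))
print(is_the_object(sale, b"some other file"))
```

This prints True and then False. With only the URL in the record, nothing would be left to check once the website had gone.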

It also mystifies me why some artists sell using an NFT and then destroy the artwork.

See also Moxie Marlinspike, who argues that because the blockchain is becoming so big, access to it is becoming centralised.

Web3

Not the same as the original Web 3.0

Decentralised, based on blockchain, i.e. decentralised data

Current design is somewhat vague, and is therefore difficult to comment on.

Some Conclusions

Without a doubt, decentralisation is the only safe option for any future version of the web.

We are still in the "looks like a manuscript" phase of the internet. We are still imitating the old ways.

We will need to resolve the conflict between identity and anonymity.

Machine-readable semantics will be at the root of future improvements.

The legal system will have to adapt to new ways of specifying contracts.