The Open Web

The Author Steven Pemberton, W3C/CWI, Amsterdam, The Netherlands

About me

Researcher at CWI in Amsterdam (first non-military internet site in Europe - 1988, whole of Europe connected to USA with 64kb link!). This makes me one of the first 25 civilians to use the internet in Europe.

Co-designed the programming language ABC, which was later used as the basis for Python

Wrote some of the GNU C Compiler, gcc, in the 1980s

Organised 2 workshops at the first Web conference in 1994. Chaired the first style and internationalization workshops at W3C.

Co-designer of HTML4, CSS, XHTML, XML Events, XForms, RDFa, etc

Degrees of openness

Data

APIs

Markup

Semantics/microformats

Open standards

Meta standards

Extensible standards

Open data

Accessibility

Device independence

Easy to author

Searchable

Not Walled gardens

The long view

Most of my work has been done with a ten-year time frame: what will conditions be in 10 years' time, and what will we need then?

However, this still gives problems with acceptance of the work, since many people fail to see the relevance.

Moore's Law

In 1965 Gordon Moore predicted that integrated circuits would double in power each year at constant price.

In 1975 he adjusted that to a doubling every 18 months.

That's an order of magnitude increase every 5 years.
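The arithmetic behind that figure can be checked in a few lines. This is a minimal sketch, assuming one doubling every 18 months as quoted above; the function name `growth_factor` is invented for illustration.

```python
def growth_factor(years, doubling_period_years=1.5):
    """Return how many times more powerful hardware becomes over `years`,
    assuming one doubling every `doubling_period_years` (18 months here)."""
    return 2 ** (years / doubling_period_years)

# Five years is 3.33 doublings, which comes out at roughly a factor of ten.
print(f"5 years:  about {growth_factor(5):.1f}x")
# Ten years is 6.67 doublings, roughly a factor of a hundred.
print(f"10 years: about {growth_factor(10):.0f}x")
```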

"An order of magnitude quantitative change is a qualitative change"

We live in an exponential world

To demonstrate Moore's Law: take a piece of paper, divide it in two, and write this year's date in one half:

Paper

2009

Now divide the other half in two vertically, and write the date 18 months ago in one half:

Paper

2009
2008

Now divide the remaining space in half, and write the date 18 months earlier (or in other words 3 years ago) in one half:

Paper

2009
2008
2006

Repeat until your pen is thicker than the space you have to divide in two:

Paper

2009
2008
2006
2005
2003
2002
2000
1999
97
96
95
93
92
90

This demonstrates that your current computer is more powerful than all the other computers you have ever had put together. (The original Macintosh of 1984, for instance, had only a tiny amount of computing power available.)

Python

I was part of the team that ended up producing the programming language Python. In the shadow of Moore's Law we were looking down the road and thinking about how we might solve today's problems in the future.

Project meeting

Interpreted languages

We believed there was a future for interpreted languages.

It was only the beginning of the personal computer age, and PCs only had floppy disks. My current mobile phone is 1000 times more powerful than the mainframe computer we were using then to develop on (which we shared with 30 others).

Interpreted languages

Our plan wasn't to get rid of C, or anything like that (in fact, in that period I wrote part of the GNU C compiler, gcc).

But we got a lot of pushback from people for the work we were doing, because it didn't solve the problems they had at that moment, and what we were doing ran really slowly.

It may have run really slowly then, but we knew that in 10 years' time it would run 100 times faster, and then people wouldn't complain.

Interpreted languages

And another interesting feature was that empirically programmers wrote programs 10 times faster than normal.

This was a worthwhile tradeoff, even if the programs ran really slowly: if you could write a program in an afternoon instead of a week, you were willing to try things you might not have otherwise tried.

("An order of magnitude quantitative change is a qualitative change")

Interpreted languages

In fact our project got shut down by management, and it is only because Guido van Rossum was willing to flee elsewhere and carry on the good work that we have Python today.

Well, that and a bunch of people in Amsterdam in the 1980s who thought it was an interesting thing to investigate.

Views

In the late 1980s we decided to build what we called an application environment.

Designed with a 10 year range.

The system had an extensible markup language, vector graphics, style sheets, a DOM, client-side scripting...

Today you would call it a browser (it didn't use TCP/IP though).

It ran on many machines, including an Atari ST.

Clocks

Four clocks in the Views system

Programming Clocks

The shortest code I could find for an analogue clock was something over 1000 lines of C (the longest was over 4000 lines):

1000 lines of C code

Clock

Here is the essence of the code used for the Views clock example.

type clock = (h, m, s)
displayed as 
   circled(combined(hhand; mhand; shand; decor))
   shand = line(slength) rotated (s × 6)
   mhand = line(mlength) rotated (m × 6)
   hhand = line(hlength) rotated (h × 30 + m ÷ 2)
   decor = ...
   slength = ...
   ...
clock c
c.s = system:seconds mod 60
c.m = (system:seconds div 60) mod 60
c.h = (system:seconds div 3600) mod 24

This is declarative programming: you say what you want to achieve, but not how to achieve it.
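For readers unfamiliar with the notation, the hand rotations in that definition work out as follows. This Python sketch covers only the arithmetic, not the declarative display; it assumes `system:seconds` counts seconds since local midnight, and the function name `hand_angles` and the final `% 360` normalisation are additions for illustration.

```python
def hand_angles(seconds_since_midnight):
    """Return (hour, minute, second) hand rotations in degrees, clockwise from 12."""
    s = seconds_since_midnight % 60
    m = (seconds_since_midnight // 60) % 60
    h = (seconds_since_midnight // 3600) % 24
    second_angle = s * 6                  # 360 degrees / 60 seconds
    minute_angle = m * 6                  # 360 degrees / 60 minutes
    hour_angle = (h * 30 + m / 2) % 360   # 30 degrees per hour, half a degree per minute
    return hour_angle, minute_angle, second_angle

# 14:30:00 is 52200 seconds after midnight.
print(hand_angles(14 * 3600 + 30 * 60))   # prints (75.0, 180, 0)
```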

Map reduce (a histogram)

We were doing map reduce then as well:

type histogram = list(number)
displayed as
   join/(box(?, width) * self)
   width = ...

box(h, w) is a function that returns a graphical box of height h and width w.

box(?, w) returns a function of one parameter, where w has been already filled in (i.e. currying)

f * list maps the (single parameter) function f onto the list, returning a new list, so box(?, width) * self will return a list of boxes all of the same width, and height depending on the original values in the list.

join/ is a reduce, that sticks a bunch of graphical objects together horizontally to create a single graphic.
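The same shape can be sketched in Python with `functools.partial` (for the pre-filled width), `map` and `reduce`. This is an illustration under stated assumptions, not the Views implementation: a "graphic" here is a hypothetical stand-in, a pair of a width and a list of text rows, and heights are assumed to be non-negative integers.

```python
from functools import partial, reduce

def box(height, width):
    """Stand-in graphic: (width, rows) with `height` rows of '#' characters."""
    return (width, ["#" * width] * height)

def join(left, right):
    """Stick two graphics together horizontally, aligned at the bottom."""
    lw, lrows = left
    rw, rrows = right
    rows = max(len(lrows), len(rrows))
    lpad = [" " * lw] * (rows - len(lrows)) + lrows   # pad the shorter one on top
    rpad = [" " * rw] * (rows - len(rrows)) + rrows
    return (lw + rw, [l + r for l, r in zip(lpad, rpad)])

def histogram(values, width=2):
    boxes = map(partial(box, width=width), values)    # box(?, width) * self
    return reduce(join, boxes, (0, []))               # join/ starting from an empty graphic

_, rows = histogram([1, 3, 2])
print("\n".join(rows))
```

Passing an empty graphic as the initial value of `reduce` keeps the sketch well-defined for an empty list.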

About the length of code

The US DoD discovered that 90% of the cost of software production is debugging

Fred Brooks of IBM discovered that the number of bugs in a program doesn't grow linearly with the size S of the program, but faster, as a power of its size: S^1.5

In other words a program that is 10 times longer takes more than 30 times the effort/cost

Or put another way: a program that is one tenth the size costs 3%
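Both figures follow directly from the exponent 1.5 quoted above. A minimal check, with the function name `relative_cost` invented for illustration:

```python
def relative_cost(size_ratio, exponent=1.5):
    """Cost of a program relative to a baseline, given its relative size,
    assuming cost grows as size to the power `exponent`."""
    return size_ratio ** exponent

print(f"10x the size:  {relative_cost(10):.1f}x the cost")      # a little over 30
print(f"1/10 the size: {relative_cost(0.1) * 100:.1f}% of the cost")  # about 3%
```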

CSS

When HTML was first introduced, many people mistook it for a presentation language.

Unfortunately, so did the browser makers, and they introduced tags like <font> and <blink>, not understanding that <h1> didn't mean "big and bold" but meant "this is the top-level heading".

CSS

CSS was an emergency attempt to get HTML back to how it was intended, a structure description language.

But still it was a lot of work to get the message across, and even Netscape opposed CSS for a long time, saying you could use script to achieve the same results.

CSS

But still, it took the web community a long time to get it, and understand why separating content and presentation is a better solution than presentation-oriented markup.

Examples:

HTML: a great success, and a great failure

Zillions of pages of HTML are zipping across the wires as we speak.

And yet the vast majority of those are not authored in HTML.

That's why we have XML, CMSs, PHP, ASP, JSP, etc.

That's why we have YUI, Scriptaculous, Dojo, etc.

HTML solves only part of the problem. HTML is the assembly language of the Web.

XHTML2 vs HTML5

Pitting HTML5 against XHTML2 is the wrong way to look at things.

The suggestion isn't that HTML should go away. If the browser manufacturers can get together and make producing interoperable Web pages easier, that can only be a good thing for all of us.

But XHTML2 tries to step back and take a longer, broader view. What are the problems that we are trying to solve, and how can we work towards easier ways of solving them?

What do we need?

For instance:

Modularization

The WG that has been producing XHTML2 works by producing modules that then get adopted, implemented, tried out. XHTML2 is then the final packaging of all those modules. Such as:

Let's look at just one of these, XForms

HTML Forms

People in general are quite concrete, and it takes a while to understand new abstractions. Look at CSS.

HTML Forms are an example of this too: they are very presentation-oriented, and mix up presentation, function, and data values, all in one markup. Think how hard it is to work out what someone else's form actually returns.

XForms

XForms has been designed based on an analysis of HTML Forms, what they can do, and what they can't, and what we actually need.

It is based on an MVC design (Model - View - Controller).

This is not a new idea: MVC dates from the late 1970s. But it adds value in the same way as CSS, by separating concerns. It is about the separation of data and content.

XForms

Although XForms comes from designing a replacement for HTML Forms, it is really an application language: it has input, output and computations.

The data is abstracted away into several 'instances' which can be loaded and saved asynchronously over the net.

The 'controls' are really abstract: each one only says what the control is supposed to do, not how it looks.

The computational model is constraint-based, i.e. like spreadsheets.

Using techniques similar to stylesheets (e.g. XBL) you can define the presentation separately, and even have different presentations for different circumstances (e.g. for different devices).
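The spreadsheet-like computational model mentioned above can be illustrated outside XForms. The Python sketch below is only an analogy: XForms expresses such relationships declaratively in markup, and the `Sheet` class and cell names here are hypothetical.

```python
class Sheet:
    """Tiny spreadsheet-like model: a cell holds a plain value or a formula."""

    def __init__(self):
        self._cells = {}

    def set(self, name, value_or_formula):
        self._cells[name] = value_or_formula

    def get(self, name):
        cell = self._cells[name]
        # A formula is a callable taking the sheet; it is re-evaluated on every
        # read, so dependent values always reflect the current inputs.
        return cell(self) if callable(cell) else cell

order = Sheet()
order.set("price", 12.0)
order.set("quantity", 3)
order.set("total", lambda s: s.get("price") * s.get("quantity"))

print(order.get("total"))    # prints 36.0
order.set("quantity", 5)     # change an input; nothing else is updated by hand
print(order.get("total"))    # prints 60.0
```

The author of the model states the relationship once; no code is written to say when the total must be recomputed.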

Styling with SVG

world clock in XForms

The page in this case has text like

San Francisco: 14:30:00
New York: 16:30:00
 ...

The styling is done with SVG and XBL.

Datapoint: Google Maps

As a pilot, someone implemented (a version of) Google Maps using XForms.

Google Maps done with XForms

Result: 25k bytes of code (compared with 200+k for Google Maps)

(Actually had satellite option before Google Maps did)

Datapoint: Machine interface

A company that builds huge walk-in machines with complicated user interfaces needed 30 people working for 5 years to build the user interface.

With XForms 10 people needed a year.

Do the maths: 10 person years instead of 150. How much does a person year cost? Let's be conservative and say $100,000. So it cost them 1 million dollars instead of 15 million dollars. That alone covered the cost of their W3C membership for the next thousand years or so.

Datapoint: Applications

A company replacing JavaScript with XForms

"About 25% of the size" [= about an order of magnitude less work]

"the [programmers] are really happy to not have to use javascript: they like that if things don't work its not their fault"

Conclusions

In 40 years computers have become some 8 orders of magnitude faster (one order of magnitude every 5 years).

That's: 100,000,000 times faster.

Programmers have managed 2 or maybe 3 orders of magnitude in that time.

In the 1960s if you bought a computer from IBM, you got free programmers in the deal.

Nowadays it is the hardware that is free (comparatively).

Conclusion

The advantages of the XForms approach are:

More Information

Steven Pemberton: www.cwi.nl/~steven

These slides: from my homepage.