Steven Pemberton, CWI, Amsterdam
The ixml language [ixml] was originally designed with the principal aim of allowing un-marked-up textual documents to be treated as if they were XML documents with markup.
This can be seen as part of a progression of abstractions being made on documents: originally we had individual documents, with markup to detail the structure, and with embedded presentation details for the styling. Style sheets allowed us to abstract out the presentation into a separate file, and consequently use the style sheet for a whole class of similar documents. In the same way, ixml allows us to abstract the markup out of the documents into a separate file, similarly to be used for a whole class of related documents.
Although ixml was not initially designed to convert textual documents to particular XML document types, but just to get a textual document into an initial XML form that could later be refined as necessary using existing XML tools, it is possible to work in the other direction: if you have an XML document type, you can use ixml to design a textual representation for it. People often seem to prefer authoring flat textual documents because they can see and understand the structure unaided, and find the need to add markup to make it readable for computers a distraction. An example is Markdown [md], or indeed almost any programming language. However ixml allows both approaches to be supported.
Markdown is an example of the approach, where the target is an HTML document produced from a textual file, and indeed there is an example of ixml being used to process Markdown [adv] in exactly this way.
In an earlier ixml paper on Modularisation [m12n], there was a hint of a similar approach for XForms [xf], which exists as an XML language with no equivalent textual form, but in that paper it served only to demonstrate the use of modularisation on a larger example. In this paper we take this further, and examine the processes you have to go through to design a flat textual notation, and the options you have, using XForms as an example target language.
The most important and distinguishing factor of designing a notation for an existing XML document type is that the structure has already been specified: there are no decisions to be made on that front. As pointed out in the earlier example of defining ixml for Markdown, the top-level ixml rules for Markdown have to be html, head, and body, since they have to match the final target structure.
Similarly in the case of XForms, the overall structure of the rules has already been decided for us, which we can determine directly from the XForms schema. At the top level we have:
model: (instance; bind; action; submission)*.
-Content: Controls.
-Controls: Core-Controls; group; switch; repeat.
-Core-Controls: input; secret; textarea; output; upload; range; trigger; submit; select; select1.
(in all the rules, as in the XForms specification, names with an initial lower-case letter are used for actual elements that will occur in the output, and names with an initial capital for other rules).
Of course, XForms wasn't designed to be a standalone language, but one embedded in other languages, so we have to devise a top-level structure in a host language, in this case XHTML:
html: head, body, s?.
where head contains the models, and body contains
the content. For instance:
head: title, Style*, model+.
body: Content.
There are two approaches to recognising input: either by position, or by adding extra characters to identify what sort of input we are dealing with.
For instance, the title is the first item in our input, so we can just require that the first line be the title of our XForm:
title: ~[#a]+, nl.
The rule for nl requires a newline, and allows extra optional
trailing space:
-nl: -#a, s?.
The rule for s is to allow trailing space:
-s: -[" "; #9; #a]+.
but we will also use it where spacing is required.
For styling, although it would also be possible to allow embedded CSS, to
keep it simple we will just use html link elements:
-Style: -"style:", s?, link, nl.
link: href, Css-type, Style-rel.
@href: url.
@Css-type>type: +"text/css".
@Style-rel>rel: +"stylesheet".
url: [L;"0"-"9"; ":/@."]+. {A simple version for now}
This requires a URL, and adds two other attributes to the output. Note how ixml renaming has been used; although this is not yet officially part of the language, it is in the future specification [ixml2] and in all implementations. So if a flat XForms begins
XForms Example
style:app.css
we will get an output that starts
<html>
<head>
<title>XForms Example</title>
<link href='app.css' type='text/css' rel='stylesheet'/>
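As a rough illustration of the two recognition strategies, the following Python sketch imitates by hand what the title and Style rules do: the first line is taken as the title purely by position, and later lines are recognised by their leading "style:" keyword. It is not ixml and not part of the grammar in this paper; the names parse_head and STYLE_LINE are hypothetical, the URL pattern only mirrors the simple url rule above, and blank lines between items are not handled.

```python
import re
from xml.etree import ElementTree as ET

# Hypothetical stand-in for the Style rule: keyword, optional space, simple URL.
STYLE_LINE = re.compile(r"style:[ \t]*([A-Za-z0-9:/@.]+)[ \t]*$")


def parse_head(text):
    """Build a <head> element from the leading lines of a flat XForms text."""
    lines = text.split("\n")
    head = ET.Element("head")
    # Recognition by position: the first line is always the title.
    ET.SubElement(head, "title").text = lines[0]
    # Recognition by keyword: each following "style:" line becomes a link.
    for line in lines[1:]:
        match = STYLE_LINE.match(line)
        if match is None:
            break  # the first non-style line ends the run of styles
        ET.SubElement(head, "link", {
            "href": match.group(1),
            "type": "text/css",    # inserted text, like +"text/css"
            "rel": "stylesheet",   # inserted text, like +"stylesheet"
        })
    return head
```

Calling parse_head on the two-line example above yields a head element with one title child and one link child carrying the three attributes.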
This brings us to the sticky question of namespaces; sticky, because at the time of writing the issue is not yet resolved in the working group.
The XML design group did a clever thing when designing a notation for namespaces: they designed the namespace declarations to look like attributes, so that XML documents would be syntactically compatible with earlier software. Thus although namespace declarations look like attributes, they have a different semantic interpretation because they begin with the characters xmlns.
It is this author's opinion that ixml can use the same approach, by
specifying that things that look like attributes should be interpreted as
namespace declarations if the serialisation of the node starts with the letters
xmlns.
Accepting this, we can redefine the html rule to include a
namespace in this way:
html: xhtml-ns, head, body, s?.
@xhtml-ns>xmlns: +"http://www.w3.org/1999/xhtml".
which will give
<html xmlns='http://www.w3.org/1999/xhtml'>
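The difference in interpretation can be observed with any namespace-aware XML parser. The short Python sketch below uses the standard library's ElementTree to show that, on reading, the xmlns declaration is absorbed into the element's name rather than being reported as an attribute, while an ordinary attribute on the same element is kept. The helper name describe and the extra lang attribute are only for illustration.

```python
from xml.etree import ElementTree as ET


def describe(xml_text):
    """Return the root element's expanded name and its ordinary attributes."""
    root = ET.fromstring(xml_text)
    return root.tag, dict(root.attrib)


# The declaration looks like an attribute in the serialisation...
tag, attributes = describe(
    "<html xmlns='http://www.w3.org/1999/xhtml' lang='en'/>"
)
# ...but the parser treats it as part of the element name, so only
# the lang attribute remains in the attribute set.
```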
We can use a similar technique to enclose the XForms controls in an element that declares the namespace:
body: Content.
Content>group: xf-ns, Controls.
@xf-ns>xmlns: +"http://www.w3.org/2002/xforms".
which will give
<body>
<group xmlns='http://www.w3.org/2002/xforms'>
Most controls have a number of required parameters, and a number of optional
ones. For instance, consider input:
<input ref="person/@age">
<label>Age</label>
</input>
We can define this using positioning after a leading keyword:
input person/@age "Age"
like this:
input: -"input", s, ref, s, label.
@ref: XPath.
label: -'"', ~['"'; #a]*, -'"', s?.
XPath: ... more ... . {assume this to be defined}
There's one other useful attribute for many controls, incremental="true", which specifies that the control activates for every character typed. Since incremental="false" is the default, the attribute only has to be given when incremental behaviour is wanted, so you can write:
input person/@age "Age" incremental
by changing the rule for input to:
input: -"input", s, ref, s, label, incremental?.
@incremental: -"incremental", +"true".
so that we get
<input ref='person/@age' incremental='true'>
<label>Age</label>
</input>
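To make the mapping from the flat line to the XML concrete, here is a hand-written Python approximation of the input rule. It is only a sketch under stated assumptions: the XPath is simplified to any run of non-space characters, and INPUT_LINE and parse_input are hypothetical names rather than anything defined by ixml or XForms.

```python
import re
from xml.etree import ElementTree as ET

# Keyword, then positional ref and quoted label, then an optional flag.
INPUT_LINE = re.compile(
    r'input[ \t]+(?P<ref>\S+)[ \t]+"(?P<label>[^"\n]*)"'
    r'(?:[ \t]+(?P<incremental>incremental))?[ \t]*$'
)


def parse_input(line):
    """Turn one flat 'input' line into an XForms-like input element."""
    match = INPUT_LINE.match(line)
    if match is None:
        raise ValueError(f"not an input control: {line!r}")
    control = ET.Element("input", {"ref": match.group("ref")})
    if match.group("incremental"):
        # The keyword itself is dropped and the value "true" is supplied,
        # as with -"incremental", +"true" in the grammar.
        control.set("incremental", "true")
    ET.SubElement(control, "label").text = match.group("label")
    return control
```

When the keyword is absent the attribute is simply not produced, which matches leaving the default to the XForms processor.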
Nearly all elements in XForms can have certain common attributes, in
particular class for presentation purposes, and id
for identification.
<output class="error" id="out1" ref="message">
<label>Error</label>
</output>
One option would be to give these a keyword to identify them:
output class:error id:out1 message "Error"
but another would be to use the same notation as used in CSS:
output.error #out1 message "Error"
like this:
output: -"output", class?, s, id?, ref, s, label.
@class: -".", name.
@id: -"#", name, s.
-name: [L], [L; "0"-"9"]*.
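The CSS-style notation can be sketched in Python as well. The pattern below is an assumption-laden approximation rather than a translation of the grammar: names are restricted to ASCII letters and digits (the ixml class L covers all Unicode letters), the ref is any run of non-space characters not starting with "#" or a quote, and CONTROL and parse_control are hypothetical names.

```python
import re

NAME = r"[A-Za-z][A-Za-z0-9]*"

# element[.class] [#id] ref "label"
CONTROL = re.compile(
    rf'(?P<element>[a-z0-9]+)'
    rf'(?:\.(?P<class>{NAME}))?'
    rf'(?:[ \t]+#(?P<id>{NAME}))?'
    rf'[ \t]+(?P<ref>[^\s#"]\S*)[ \t]+"(?P<label>[^"\n]*)"[ \t]*$'
)


def parse_control(line):
    """Split a flat control line into its named parts, omitting absent ones."""
    match = CONTROL.match(line)
    if match is None:
        raise ValueError(f"not a control: {line!r}")
    return {key: value
            for key, value in match.groupdict().items()
            if value is not None}
```

Both the class and the id are optional, so a plain line such as output message "Error" is accepted by the same pattern.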
Going back to the definition of the head:
head: title, Style*, model+.
we have to define the model, for instance:
model: -"model", s, id?, Model-content.
-Model-content: (instance; bind; Action; submission)*.
instance: -"data", s, id?, src.
bind: -"bind", s, ref, s, Property+.
-Property: type; constraint; relevant; required; readonly.
@type: -"type:", s?, name.
@constraint: -"constraint:", s?, expression.
(we'll come back to Action and submission later),
looking like this:
model
data people.xml
bind person/@age type:integer constraint: .>0
As you can see, we are not obliged to use the same keywords in the input as the elements in the output, so in this case we have replaced the somewhat technical "instance" with the more general "data".
To distinguish the various types of property in a bind, we have to use keywords like this. However, another approach would be to give them each a separate definition:
-Model-content: (instance; Bind; Action; submission)*.
-Bind: Type; Constraint; Relevant; Required; Readonly.
Type>bind: -"type", s, ref, s, name.
Constraint>bind: -"constraint", s, ref, s, expr.
giving
model
data people.xml
type person/@age integer
constraint person/@age .>0
It is worth noting that nearly all XForms only have a single model, so an alternative approach to defining models is this:
head: title, Style*, Models.
-Models: Single-model; model+.
Single-model>model: Model-content.
model: -"model", s, id?, Model-content.
so that in the usual case, you don't have to declare a model at all, only when there is more than one:
XForms Example
style:app.css
data people.xml
type person/@age integer
constraint person/@age .>0
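The effect of the implicit single model can be sketched as a grouping step over the lines that follow the title and styles. This Python fragment is only an illustration of the idea, not of how an ixml processor works: split_models is a hypothetical name, ids are assumed to be written with a leading "#", and unlike the grammar it does not reject a mixture of implicit and explicit models.

```python
def split_models(lines):
    """Group model lines into (id or None, content lines) pairs."""
    models = []
    for line in lines:
        words = line.split()
        if words and words[0] == "model":
            # An explicit model keyword, optionally followed by #id.
            model_id = words[1].lstrip("#") if len(words) > 1 else None
            models.append((model_id, []))
        elif models:
            models[-1][1].append(line)
        else:
            # Content before any keyword: open the implicit single model.
            models.append((None, [line]))
    return models
```

In the common case the author never writes the keyword and still gets exactly one model; the keyword is only needed to separate several models.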
switch
case closed
/case
case open
/case
/switch

switch(
case #closed(
)case
case #open(
)case
)switch

switch:
case: #closed
:case
case: #open
:case
:switch
But some elements are sometimes enclosing
input a "Age" (
hint "An integer"
)input

input: person/@age "Age"
hint "An integer"
:input
id: always with #
submission #send
extend XPath to allow for instances
output @admin/a => instance('admin')/a
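One possible reading of this shorthand, sketched in Python: a ref that starts with @name followed by a further path step is rewritten to a call of instance('name'). The restriction to refs with a following "/" is an assumption made here so that ordinary attribute references such as @age are left alone; SHORTHAND and expand_instance are hypothetical names.

```python
import re

# "@name" followed by at least one further step, e.g. "@admin/a".
SHORTHAND = re.compile(r"@([A-Za-z][A-Za-z0-9]*)(/.*)")


def expand_instance(ref):
    """Rewrite a leading @name/... to instance('name')/..., else return ref."""
    match = SHORTHAND.fullmatch(ref)
    if match is None:
        return ref
    return f"instance('{match.group(1)}'){match.group(2)}"
```

Since an attribute node has no children, a leading @name/ step selects nothing in plain XPath, which is what makes the form available for reuse in this sketch.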
Actions
trigger "OK"
toggle @closed DOMActivate
/trigger
input age "Age" incremental
can use:
@incremental: +"true".
to give
<input ref="age" incremental="true">
<label>Age</label>
</input>
Since non-incremental is the default, we don't need to treat it specially: you just leave it out.
Problem with embedded XML
Common
Evaluation
Single Item Binding
value (expression)
nonrelevant ("keep" | "remove" | "empty")
relevant (xs:boolean) [deprecated]
validate (xs:boolean)
resource (xs:anyURI)
action (xs:anyURI) [deprecated]
mode ("asynchronous"|"synchronous")
method ("post" | "get" | "put" | "delete" | "multipart-post" | "form-data-post" | "urlencoded-post" | Any other NCName | PrefixedName)
serialization (xs:string)
mediatype (xs:string)
encoding (xs:string)
separator (";" | "&")
version (xs:NMTOKEN)
indent (xs:boolean)
omit-xml-declaration (xs:boolean)
standalone (xs:boolean)
cdata-section-elements (QNameList)
includenamespaceprefixes (xs:NMTOKENS)
response-mediatype (xs:string)
replace ("all" | "instance" | "text" | "none" | PrefixedName)
instance (xs:IDREF)
targetref ( Single Item Binding)
target ("_self" | "_blank" | xs:string)