Kawa: Working with XML and HTML

Kawa has a number of features for working with XML, HTML, and generated web pages.

In Kawa you don't write XML or HTML directly. Instead you write expressions that evaluate to “node objects” corresponding to elements, attributes, and text. You then write these node objects using either an XML or HTML format.

Many web-page-generating tools require you to work directly with raw HTML, as for example:

(display "<p>Don't use the <code>&lt;blink&gt;</code> tag.</p>")

In Kawa you would instead do:

(display (html:p "Don't use the " (html:code "<blink>") " tag."))

The conversion from node objects to XML or HTML is handled by the formatter (or serializer). Some advantages of doing it this way are:

You don't have to worry about quoting special characters. Missing or incorrect quoting is a common source of bugs and security problems on systems that work directly with text, such as PHP.
Some errors such as mismatched element tags are automatically avoided.
The generated XML can be validated as it is generated, or even using compile-time type-checking. (Kawa doesn't yet do either.)
In an application that also reads XML, you can treat XML that is read in and XML that is generated using the same functions.
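
The first advantage can be illustrated outside Kawa. The following is a minimal Python sketch, not Kawa code: Element and serialize are hypothetical names standing in for node objects and the formatter. It shows that when text is held in a node tree, escaping happens once, at output time, so the author never quotes anything by hand.

```python
from html import escape

class Element:
    """A tiny stand-in for a node object: a tag plus child nodes or strings."""
    def __init__(self, tag, *children):
        self.tag = tag
        self.children = children

def serialize(node):
    """Write a node tree as markup, escaping text content on the way out."""
    if isinstance(node, str):
        # quote=False escapes only &, < and >, which is enough for text content.
        return escape(node, quote=False)
    inner = "".join(serialize(child) for child in node.children)
    return f"<{node.tag}>{inner}</{node.tag}>"

page = Element("p", "Don't use the ", Element("code", "<blink>"), " tag.")
print(serialize(page))
# prints <p>Don't use the <code>&lt;blink&gt;</code> tag.</p>
```

The string "<blink>" is written naturally in the program and only becomes &lt;blink&gt; when the tree is serialized, which mirrors the html:p example above.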

Formatting XML

The easiest way to generate HTML or XML output is to run Kawa with the appropriate --output-format option.

The intention is that these output modes should be compatible with XSLT 2.0 and XQuery 1.0 Serialization. (However, that specifies many options, most of which have not yet been implemented.)

xml: Values are printed in XML format. "Groups" or "elements" are written using XML element syntax. Plain characters (such as ‘<’) are escaped (such as ‘&lt;’).
xhtml: Same as xml, but follows the XHTML compatibility guidelines.
html: Values are printed in HTML format. Mostly the same as the xml format, but certain elements without a body are written without a closing tag. For example <img> is written without </img>, which would be illegal for HTML, but required for XML. Plain characters (such as ‘<’) are not escaped inside <script> or <style> elements.

To illustrate:

$ kawa --output-format html
#|kawa:1|# (html:img src:"img.jpg")
<img src="img.jpg">

$ kawa --output-format xhtml
#|kawa:1|# (html:img src:"img.jpg")
<img xmlns="http://www.w3.org/1999/xhtml" src="img.jpg" />

$ kawa --output-format xml
#|kawa:1|# (html:img src:"img.jpg")
<img xmlns="http://www.w3.org/1999/xhtml" src="img.jpg"></img>

And here is the default Scheme formatting:

$ kawa
#|kawa:1|# (html:img src:"img.jpg")
({http://www.w3.org/1999/xhtml}img src: img.jpg )

as-xml value

Return a value (or multiple values) that when printed will print value in XML syntax.
(require 'xml)
(as-xml (make-element 'p "Some " (make-element 'em "text") "."))
prints <p>Some <em>text</em>.</p>.

unescaped-data data

Creates a special value which causes data to be printed as is, without normal escaping. For example, when the output format is XML, printing "<?xml?>" prints as ‘&lt;?xml?&gt;’, but (unescaped-data "<?xml?>") prints as ‘<?xml?>’.

Creating HTML nodes

The html prefix names a special namespace (see the section called “Namespaces and compound symbols”) of functions to create HTML element nodes. For example, html:em is a constructor that when called creates an element node whose tag is em. If this element node is formatted as HTML, the result has an <em> tag.

html:tag attributes ... content ...

Creates an element node whose tag is tag. The parameters are first zero or more attributes, followed by zero or more child values. An attribute is either an attribute value (possibly created using make-attribute), or a pair of arguments: a keyword followed by the attribute value. Child values are usually either strings (text content), or nested element nodes, but can also be comment or processing-instruction nodes.
(html:a href: "http://gnu.org/" "the "(html:i "GNU")" homepage")

The compound identifier html:tag is actually a type: When you call it as a function you're using Kawa's standard coercion of a type to its constructor function. This means you can do type tests:

(define some-node ...)
(if (instance? some-node html:blink)
  (error "blinking not allowed!"))

Object identity is currently not fully specified. Specifically, it is undefined if a nested (child) element node is copied “by value” or “by reference”. This is related to whether nodes have a parent reference. In the XPath/XQuery data model nodes do have a parent reference, and child nodes are conceptually copied. (In the actual implementation copying is commonly avoided.) Kawa/Scheme currently follows the XQuery copying semantics, which may not be the most appropriate for Scheme.

Creating XML nodes

The XML data model is similar to HTML, with one important addition: XML tags may be qualified names, which are similar to compound symbols.

You must do this to use the following types and functions:

(require 'xml)

make-element tag [attribute ...] child ...

Create a representation of an XML element, corresponding to
<tag attribute...>child...</tag>
The result is a TreeList, though if the result context is a consumer the result is instead "written" to the consumer. Thus nested calls to make-element only result in a single TreeList. More generally, whether an attribute or child is included by copying or by reference is (for now) undefined. The tag should currently be a symbol, though in the future it should be a qualified name. An attribute is typically a call to make-attribute, but it can be any attribute-valued expression.
(make-element 'p
	      "The time is now: "
	      (make-element 'code (make <java.util.Date>)))

element-name element

Returns the name (tag) of the element node, as a symbol (QName).

make-attribute name value...

Create an "attribute", which is a name-value pair. For now, name should be a symbol.

attribute-name element

Returns the name of the attribute node, as a symbol (QName).

comment

Instances of this type represent comment values, specifically including comments in XML files. Comment nodes are currently ignored when printing using Scheme formatting, though that may change.

comment comment-text

Create a comment object with the specified comment-text.

processing-instruction

Instances of this type represent “processing instructions”, such as may appear in XML files. Processing-instruction nodes are currently ignored when printing using Scheme formatting, though that may change.

processing-instruction target contents

Create a processing-instruction object with the specified target (a simple symbol) and contents (a string).

XML literals

You can write XML literals directly in Scheme code, following a #. Notice that the outermost element needs to be prefixed by #, but nested elements do not (and must not).

#<p>The result is <b>final</b>!</p>

Actually, these are not really literals since they can contain enclosed expressions:

#<em>The result is &{result}.</em>

The value of result is substituted into the output, in a similar way to quasi-quotation. (If you try to quote one of these “XML literals”, what you get is unspecified and is subject to change.)

An xml-literal is usually an element constructor, but there are some rarely used forms (processing instructions, comments, and CDATA sections) that are covered later.

xml-literal ::= #xml-constructor
xml-constructor ::= xml-element-constructor
  | xml-PI-constructor
  | xml-comment-constructor
  | xml-CDATA-constructor

Element constructors

xml-element-constructor ::=
    <QName xml-attribute*>xml-element-datum...</QName >
  | <xml-name-form xml-attribute*>xml-element-datum...</>
  | <xml-name-form xml-attribute*/>
xml-name-form ::= QName
  | xml-enclosed-expression
xml-enclosed-expression ::=
    {expression}
  | (expression...)

The first xml-element-constructor variant uses a literal QName, and looks like a standard non-empty XML element, where the starting QName and the ending QName must match exactly:

#<a href="next.html">Next</a>

As a convenience, you can leave out the ending tag(s):

#<para>This is a paragraph in <emphasis>DocBook</> syntax.</>

You can use an expression to compute the element tag at runtime - in that case you must leave out the ending tag:

#<p>This is <(if be-bold 'strong 'em)>important</>!</p>

You can use an arbitrary expression inside curly braces, as long as it evaluates to a symbol. You can leave out the curly braces if the expression is a simple parenthesised compound expression. The previous example is equivalent to:

#<p>This is <{(if be-bold 'strong 'em)}>important</>!</p>

The third xml-element-constructor variant above is an XML “empty element”; it is equivalent to the second variant when there are no xml-element-datum items.

(Note that every well-formed XML element, as defined in the XML specifications, is a valid xml-element-constructor, but not vice versa.)

Element contents (children)

The “contents” (children) of an element are a sequence of character (text) data, and nested nodes. The characters &, <, and > are special, and need to be escaped.

xml-element-datum ::=
    any character except &, or <.
  | xml-constructor
  | xml-escaped
xml-escaped ::=
    &xml-enclosed-expression
  | &xml-entity-name;
  | xml-character-reference
xml-character-reference ::=
    &#digit+;
  | &#xhex-digit+;

Here is an example that shows both hex and decimal character references:

#<p>A&#66;C&#x44;E</p>  ⇒  <p>ABCDE</p>

xml-entity-name ::= identifier

Currently, the only supported values for xml-entity-name are the builtin XML names lt, gt, amp, quot, and apos, which stand for the characters <, >, &, ", and ', respectively. The following two expressions are equivalent:

#<p>&lt; &gt; &amp; &quot; &apos;</p>
#<p>&{"< > & \" '"}</p>
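
The decoding rules for character references and the five builtin entity names are the standard XML ones, so they can be checked with any XML-aware tool. As a rough illustration only (Python's standard html module, not Kawa; decode_references is a hypothetical helper name), the two examples above decode as follows:

```python
from html import unescape

def decode_references(text):
    """Replace character references and builtin entity names with their characters."""
    return unescape(text)

print(decode_references("A&#66;C&#x44;E"))
# prints ABCDE
print(decode_references("&lt; &gt; &amp; &quot; &apos;"))
# prints < > & " '
```

Note that html.unescape also accepts the much larger set of HTML entity names, whereas Kawa currently supports only lt, gt, amp, quot, and apos.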

Attributes

xml-attribute ::=
    xml-name-form=xml-attribute-value
xml-attribute-value ::=
    "quot-attribute-datum*"
  | 'apos-attribute-datum*'
quot-attribute-datum ::=
    any character except ", &, or <.
  | xml-escaped
apos-attribute-datum ::=
    any character except ', &, or <.
  | xml-escaped

If the xml-name-form is either xmlns or a compound named with the prefix xmlns, then technically we have a namespace declaration, rather than an attribute.

QNames and namespaces

The names of elements and attributes are qualified names (QNames), which are represented using compound symbols (see the section called “Namespaces and compound symbols”). The lexical syntax for a QName is either a simple identifier, or a (prefix,local-name) pair:

QName ::= xml-local-part
| xml-prefix:xml-local-part
xml-local-part ::= identifier
xml-prefix ::= identifier

An xml-prefix is an alias for a namespace-uri, and the mapping between them is defined by a namespace-declaration. You can either use a define-namespace form, or you can use a namespace declaration attribute:

xml-namespace-declaration-attribute ::=
xmlns:xml-prefix=xml-attribute-value
| xmlns=xml-attribute-value

The former declares xml-prefix as a namespace alias for the namespace-uri specified by xml-attribute-value (which must be a compile-time constant). The second declares that xml-attribute-value is the default namespace for simple (unprefixed) element tags. (A default namespace declaration is ignored for attribute names.)

(let ((qn (element-name #<gnu:b xmlns:gnu="http://gnu.org/"/>)))
  (list (symbol-local-name qn)
        (symbol-prefix qn)
        (symbol-namespace-uri qn)))
⇒ ("b" "gnu" "http://gnu.org/")
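
The prefix lookup itself is simple. The sketch below models it in Python under stated assumptions: resolve_qname is a hypothetical function, the namespace declarations in scope are passed in as a plain dictionary, and the empty-string key stands for the default namespace. It applies to element tags only, since a default namespace declaration is ignored for attribute names.

```python
def resolve_qname(qname, namespaces):
    """Return (local-name, prefix, namespace-uri) for an element tag.

    `namespaces` maps each declared prefix to its namespace URI; the key ""
    holds the default namespace, if one was declared.
    """
    prefix, sep, local = qname.rpartition(":")
    if not sep:
        # No prefix: an element tag falls back to the default namespace.
        return (qname, "", namespaces.get("", ""))
    return (local, prefix, namespaces[prefix])

print(resolve_qname("gnu:b", {"gnu": "http://gnu.org/"}))
# prints ('b', 'gnu', 'http://gnu.org/')
```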

Other XML types

Processing instructions

An xml-PI-constructor can be used to create an XML processing instruction, which can be used to pass instructions or annotations to an XML processor (or tool). (Alternatively, you can use the processing-instruction type constructor.)

xml-PI-constructor ::= <?xml-PI-target xml-PI-content?>
xml-PI-target ::= NCname (i.e. a simple (non-compound) identifier)
xml-PI-content ::= any characters, not containing ?>.

For example, the DocBook XSLT stylesheets can use the dbhtml instructions to specify that a specific chapter should be written to a named HTML file:

#<chapter><?dbhtml filename="intro.html" ?>
<title>Introduction</title>
...
</chapter>

XML comments

You can cause XML comments to be emitted in the XML output document. Such comments can be useful for humans reading the XML document, but are usually ignored by programs. (Alternatively, you can use the comment type constructor.)

xml-comment-constructor ::= <!--xml-comment-content-->
xml-comment-content ::= any characters, not containing --.

CDATA sections

A CDATA section can be used to avoid excessive use of xml-entity-ref such as &amp; in element content.

xml-CDATA-constructor ::= <![CDATA[xml-CDATA-content]]>
xml-CDATA-content ::= any characters, not containing ]]>.

The following are equivalent:

#<p>Special characters <![CDATA[< > & ' "]]> here.</p>
#<p>Special characters &lt; &gt; &amp; &quot; &apos; here.</p>

Kawa remembers that you used a CDATA section in the xml-element-constructor and will write it out using a CDATA constructor.

Web page scripts

A Kawa web page script is a Kawa program that is invoked by a web server because the server received an HTTP request. The result of evaluating the top-level expressions becomes the HTTP response that the servlet sends back to the client, usually a browser.

A web page script may be as simple as:

(format "The time is <~s>." (java.util.Date))

This returns a response consisting of a formatted string giving the current time. The string would be interpreted as text/plain content: The angle brackets are regular characters, and not HTML tag markers.

The script can alternatively evaluate to XML/HTML node values, for example those created by the section called “XML literals”:

#<p>Hello, <b>&(request-remote-host)</b>!</p>

In this case the response would be text/html or similar content: The angle brackets should be interpreted by the browser as HTML tag markers. The function request-remote-host is available (automatically) to web page scripts; it returns the host that made the HTTP request, which is then interpolated into the response.

Following sections will go into more details about how to write web page scripts. You can do so in any supported Kawa language, including Scheme, BRL, KRL, or XQuery.

A web server will use a URL mapping to map a request URL to a specific web page script. This can be done in a number of different ways:

The easiest to manage is to use Kawa's mechanism for self-configuring scripts (see the section called “Self-configuring web page scripts”). This is especially easy if you use the web server built in to JDK 6, since no configuration files are needed. You can also use a “servlet engine” like Tomcat or Glassfish.
You can explicitly compile the web page script to a servlet, in the same way Java servlets are compiled. This can then be installed ("deployed") in a servlet-supporting web server, such as Tomcat or Glassfish. See the section called “Installing web page scripts as Servlets”.
You can run the servlet as a CGI script.

For details on how to extract information from the request see the section called “Functions for accessing HTTP requests”. For details on how the response is created see Generating HTTP responses. If the response is HTML or XML, you may want to read the section called “Creating HTML nodes”, or the section called “Creating XML nodes”, or the section called “XML literals”.

Here are some examples, starting with a simple hello.scm:

(response-content-type 'text/html) ; Optional
(html:p
  "The request URL was: " (request-url))
(make-element 'p
  (let ((query (request-query-string)))
    (if query
      (values-append "The query string was: " query)
      "There was no query string.")))

This returns two <p> (paragraph) elements: One using make-element and one using the html:p constructor. Or you may prefer to use the section called “XML literals”.

The same program using KRL:

<p>The request URL was: [(request-url)]</p>,
<p>[(let ((query (request-query-string)))
    (if query
      (begin ]The query string was: [query)

      ]There was no query string.[))]</p>

You can also use XQuery:

<p>The request URL was: {request-url()}</p>
<p>{let $query := request-query-string() return
    if ($query)
    then ("The query string was: ",$query)
    else "There was no query string."}</p>

Self-configuring web page scripts

Kawa makes it easy to set up a web site without configuration files. Instead, the mapping from request URL to web page script matches the layout of files in the application directory.

Many web servers make it easy to execute a script using a script processor which is selected depending on the extension of the requested URL. That is why you see lots of URLs that end in .cgi, .php, or .jsp. This is bad, because it exposes the server-side implementation to the user: Not only are such URLs ugly, but they make it difficult to change the server without breaking people's bookmarks and search engines. A server will usually provide a mechanism to use prettier URLs, but doing so requires extra effort, so many web-masters don't.

If you want a script to be executed in response to a URL http://host/app/foo/bar you give the script the name app/foo/bar, in the appropriate server “application” directory (as explained below). You get to pick the name bar. Or you can use the name bar.html, even though the file named bar.html isn't actually an html file - rather it produces html when evaluated. Or better: just use a name without an extension at all. Kawa figures out what kind of script it is based on the content of the file, rather than the file name. Once Kawa has found a script, it looks at the first line to see if it can recognize the kind (language) of the script. Normally this would be a comment that contains the name of a programming language that Kawa knows about. For example:

;; Hello world page script written in -*- scheme -*-
#<p>Hello, <b>&(request-remote-host)</b>!</p>

(Using the funny-looking string -*- scheme -*- has the bonus that it is recognized by the Emacs text editor.)

A script named +default+ is run if there isn't a matching script. For example, assume the following is a file named +default+.

;; This is -*- scheme -*-
(make-element 'p "servlet-path: " (request-servlet-path))

This becomes the default script for HTTP requests that aren't handled by a more specific script. The request-servlet-path function returns the "servlet path", which is the part of the requested URL that is relative to the current web application. Thus a request for http://host:port/app/this/is/a/test will return:

servlet-path: /this/is/a/test

Using the OpenJDK built-in web server

The easiest way to run a Kawa web server is to use the web server built in to JDK 6 or later.

kawa --http-auto-handler context-path appdir --http-start port

This starts a web server that listens on the given port, using the files in directory appdir to handle requests that start with the given context-path. The context-path must start with a "/" (one is added if needed), and it is recommended that it also end with a "/" (otherwise you might get some surprising behavior).

You can specify multiple --http-auto-handler options.

For example, to use the files in the current directory to handle all requests on the standard port 80, do:

kawa --http-auto-handler / . --http-start 80

There are some examples in the testsuite/webtest directory of the Kawa source distribution. You can start the server thus:

bin/kawa --http-auto-handler / testsuite/webtest/ --http-start 8888

and then for example browse to http://localhost:8888/adder.scm.

For lots of information about the HTTP request, browse to http://localhost:8888/info/anything.

Using a servlet container

You can also use a “servlet container” such as Tomcat or Glassfish with self-configuring scripts. See the section called “Installing web page scripts as Servlets” for information on how to install these servers, and the concept of web applications. Once you have one of these servers installed, you create a web application with the following in the appdir/WEB-INF/web.xml configuration file:

<web-app>
  <display-name>Kawa auto-servlet</display-name>
  <servlet>
    <servlet-name>KawaPageServlet</servlet-name>
    <servlet-class>gnu.kawa.servlet.KawaPageServlet</servlet-class>
  </servlet>
  <servlet-mapping>
    <servlet-name>KawaPageServlet</servlet-name>
    <url-pattern>/*</url-pattern>
  </servlet-mapping>
</web-app>

This creates a web application where all URLs are handled by the gnu.kawa.servlet.KawaPageServlet servlet class, which is included in the Kawa jar file. The KawaPageServlet class handles the searching and compiling described in this page.

Finding a matching script

When Kawa receives a request for:

http://host:port/appname/a/b/anything

it will look for a file:

appdir/a/b/anything

If such a file exists, the script will be executed, as described below. If not, it will look for a file named +default+ in the same directory. If that doesn't exist either, it will look for +default+ in the parent directory, then the grand-parent directory, and so on until it gets to the appname web application root directory. So the last default script tried is appdir/+default+.

If that doesn't exist then Kawa returns a 404 "page not found" error.
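
The search order can be summarized as a short routine. This is a Python model of the rule as described here, not Kawa's implementation: find_script is a hypothetical name, and the set of existing files is passed in explicitly (as paths relative to appdir) so the logic can be read without touching a file system.

```python
from typing import Optional, Set

def find_script(files: Set[str], request_path: str) -> Optional[str]:
    """Return the script path (relative to appdir) for a request, or None for a 404.

    `files` is the set of file paths that exist under appdir,
    for example {"a/b/anything", "a/+default+"}.
    """
    path = request_path.strip("/")
    if path in files:
        return path
    # Walk from the request's directory up to the application root.
    directory = path.rpartition("/")[0]
    while True:
        candidate = f"{directory}/+default+" if directory else "+default+"
        if candidate in files:
            return candidate
        if not directory:
            return None  # reached the root without a match: 404
        directory = directory.rpartition("/")[0]

files = {"a/b/anything", "a/+default+", "+default+"}
print(find_script(files, "/a/b/anything"))  # prints a/b/anything
print(find_script(files, "/a/b/missing"))   # prints a/+default+
print(find_script(files, "/x/y"))           # prints +default+
```

The request path here is the part of the URL after the application's context path, so http://host:port/appname/a/b/anything corresponds to "/a/b/anything".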

Determining script language

Once Kawa has found a script file corresponding to a request URL, it needs to determine if this is a data file or a web page script, and in the latter case, what language it is written in.

Kawa recognizes the following "magic strings" in the first line of a script:

kawa:scheme: The Scheme language.
kawa:xquery: The XQuery language.
kawa:language: Some other language known to Kawa.

Kawa also recognizes Emacs-style "mode specifiers":

-*- scheme -*-: The Scheme language.
-*- xquery -*-: The XQuery language (though Emacs doesn't know about XQuery).
-*- emacs-lisp -*-
-*- elisp -*-: The Emacs Lisp extension language.
-*- common-lisp -*-
-*- lisp -*-: The Common Lisp language.

It also recognizes comments in the first two columns of the line:

;;: A Scheme or Lisp comment - assumed to be in the Scheme language.
(:: Start of an XQuery comment, so assumed to be in the XQuery language.

If Kawa doesn't recognize the language of a script (and it isn't named +default+) then it assumes the file is a data file. It asks the servlet engine to figure out the content type (using the getMimeType method of ServletContext), and just copies the file into the response.
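
The recognition rules can be modeled as a function of the first line. The Python sketch below is an assumption-laden illustration: detect_language and the MODES table are hypothetical, the order in which the three kinds of marker are tried is a guess (the text above does not specify precedence), and the special case for files named +default+ is left out.

```python
import re
from typing import Optional

# Emacs mode names mapped to the language they select.
MODES = {
    "scheme": "scheme",
    "xquery": "xquery",
    "emacs-lisp": "elisp",
    "elisp": "elisp",
    "common-lisp": "common-lisp",
    "lisp": "common-lisp",
}

def detect_language(first_line: str) -> Optional[str]:
    """Guess a script's language from its first line; None means a data file."""
    # 1. An explicit magic string such as kawa:scheme or kawa:xquery.
    magic = re.search(r"kawa:([\w-]+)", first_line)
    if magic:
        return magic.group(1)
    # 2. An Emacs-style mode specifier such as -*- scheme -*-.
    mode = re.search(r"-\*-\s*([\w-]+)\s*-\*-", first_line)
    if mode and mode.group(1) in MODES:
        return MODES[mode.group(1)]
    # 3. A comment starting in the first two columns.
    if first_line.startswith(";;"):
        return "scheme"
    if first_line.startswith("(:"):
        return "xquery"
    return None

print(detect_language(";; Hello world page script written in -*- scheme -*-"))
# prints scheme
```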

Compilation and caching

Kawa automatically compiles a script into a class. The class is internal to the server, and is not written out to disk. (There is an unsupported option to write the compiled file to a class file, but there is no support to use previously-compiled classes.) The server then creates a module instance to handle the actual request, and runs the body (the run method) of the script class. On subsequent requests for the same script, the same class and instance are reused; only the run is re-executed.

If the script is changed, then it is re-compiled and a new module instance created. This makes it very easy to develop and modify a script. (Kawa for performance reasons doesn't check more than once a second whether a script has been modified.)
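
The caching policy, including the once-a-second limit on modification checks, follows a common pattern. Here is a minimal Python sketch of that pattern, assuming nothing about Kawa's actual classes: ScriptCache, compile_fn, and mtime_fn are all hypothetical names, and the clock is injectable so the behaviour can be exercised without waiting.

```python
import time

class ScriptCache:
    """Cache compiled scripts; re-check the source at most once per interval."""

    def __init__(self, compile_fn, mtime_fn, clock=time.monotonic, interval=1.0):
        self.compile_fn = compile_fn  # path -> compiled object
        self.mtime_fn = mtime_fn      # path -> last-modified timestamp
        self.clock = clock
        self.interval = interval
        self.entries = {}             # path -> (compiled, mtime, last_checked)

    def get(self, path):
        now = self.clock()
        entry = self.entries.get(path)
        if entry is not None:
            compiled, mtime, last_checked = entry
            if now - last_checked < self.interval:
                return compiled       # checked recently: trust the cache
            if self.mtime_fn(path) == mtime:
                # Unchanged: keep the compiled object, note the check time.
                self.entries[path] = (compiled, mtime, now)
                return compiled
        # First request, or the source changed: compile again.
        mtime = self.mtime_fn(path)
        compiled = self.compile_fn(path)
        self.entries[path] = (compiled, mtime, now)
        return compiled
```

A consequence of the throttled check is that an edit made less than one interval after the previous check is not picked up until a later request.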

Installing web page scripts as Servlets

You can compile a Kawa program to a Servlet, and run it in a servlet engine (a Servlet-aware web server). One or more servlets are installed together as a web application. This section includes specific information for the Tomcat and Glassfish web servers.

Creating a web application

A web application is a group of data, servlets, and configuration files to handle a related set of URLs. The servlet specification specifies the directory structure of a web application.

Assume the web application is called myapp, and lives in a directory with the same name. The application normally handles requests for URLs that start with http://example.com/myapp. Most files in the application directory are used to handle requests with corresponding URL. For example, a file myapp/list/help.html would be the response to the request http://example.com/myapp/list/help.html.

The directory WEB-INF is special. It contains configuration files, library code, and other server data.

So to create the myapp application, start with:

mkdir myapp
cd myapp
mkdir WEB-INF WEB-INF/lib WEB-INF/classes

Copy the Kawa jar from the lib directory. (You can also use a “hard” link, but symbolic links may not work, for security reasons.)

cp kawa-home/kawa-1.14.1.jar WEB-INF/lib/kawa.jar

You should also create the file WEB-INF/web.xml. For now, this is just a place-holder:

<web-app>
  <display-name>My Application</display-name>
</web-app>

Compiling a web page script to a servlet

Assume for simplicity that the source files are in the WEB-INF/classes directory, and make that the current directory:

cd .../myapp/WEB-INF/classes

Depending on the source language, you compile your script using the --servlet switch:

kawa --servlet -C hello.scm

or:

kawa --servlet --krl -C hello.krl

or:

kawa --servlet --xquery -C hello.xql

This lets the web-application find the compiled servlets. Finally, you just need to add the new servlet to the WEB-INF/web.xml file:

<web-app>
  <display-name>My Application</display-name>

  <servlet>
    <servlet-name>MyHello</servlet-name>
    <servlet-class>hello</servlet-class>
  </servlet>

  <servlet-mapping>
    <servlet-name>MyHello</servlet-name>
    <url-pattern>/hello</url-pattern>
  </servlet-mapping>
</web-app>

The <servlet> clause says that the servlet named MyHello is implemented by the Java class hello. The <servlet-mapping> clause says that a request URL /hello should be handled by the servlet named MyHello. The URL is relative to the application context path, so the actual URL would be http://example.com/myapp/hello.

Installing a servlet under Tomcat

Apache's Tomcat is an open-source implementation of the servlet specifications. After you download it, uncompress it in some convenient location, which is commonly referred to as $CATALINA_HOME.

To install your web application, copy/move its directory to be in the $CATALINA_HOME/webapps directory. Thus for the example above you would have a $CATALINA_HOME/webapps/myapp directory.

To start or stop Tomcat use the scripts in $CATALINA_HOME/bin. For example to start Tomcat on a GNU/Linux system run $CATALINA_HOME/bin/startup.sh. This will start a web server that listens on the default port of 8080, so you can browse the above example at http://localhost:8080/myapp/hello.

If you're running Fedora GNU/Linux, you can use the tomcat6 package:

# yum install tomcat6
# export CATALINA_HOME=/usr/share/tomcat6

You can then manage Tomcat like other system services. You can install webapps under $CATALINA_HOME/webapps.

Installing a servlet under Glassfish

Glassfish from Oracle/Sun is an open-source “application server” that implements Java EE 6, including the 3.0 servlet specification. After you download it, uncompress it in some convenient location. This location is called as-install-parent in the Quick Start Guide. The commands you will use are mostly in as-install/bin, where as-install is as-install-parent/glassfish.

To start the server, do:

as-install/bin/startserv

or under Windows:

as-install\bin\startserv.bat

The default port to listen to is 8080; you can change the port (and lots of other properties) using the administration console at port 4848.

A web application does not need to be in any particular location; instead you just install it with this command:

as-install/bin/asadmin deploy appdir

where appdir is the application directory - myapp in the example. (Use asadmin.bat under Windows.)

Servlet-specific script functions

The following functions only work within a servlet container. To use these functions, first do:

(require 'servlets)

You can conditionalize your code to check for servlets, like this:

(cond-expand
 (in-servlet
   (require 'servlets)
   (format "[servlet-context: ~s]" (current-servlet-context)))
 (else
   "[Not in a servlet]"))

current-servlet

When called from a Kawa servlet handler, returns the actual javax.servlet.http.HttpServlet instance.

current-servlet-context

Returns the context of the currently executing servlet, as an instance of javax.servlet.ServletContext.

current-servlet-config

Returns the ServletConfig of the currently executing servlet.

get-request

Return the current servlet request, as an instance of javax.servlet.http.HttpServletRequest.

get-response

Return the current servlet response, as an instance of javax.servlet.http.HttpServletResponse.

request-servlet-path

Get the servlet path of the current request. Similar to request-script-path, but not always the same, depending on configuration, and does not end with a "/".

request-path-info

Get the path info of the current request. Corresponds to the CGI variable PATH_INFO.

servlet-context-realpath [path]

Returns the file path of the current servlet's "Web application".

Installing Kawa programs as CGI scripts

The recommended way to have a web-server run a Kawa program as a CGI script is to compile the Kawa program to a servlet (as explained in the section called “Web page scripts”), and then use Kawa's supplied CGI-to-servlet bridge.

First, compile your program to one or more class files as explained in the section called “Web page scripts”. For example:

kawa --servlet --xquery -C hello.xql

Then copy the resulting .class files to your server's CGI directory. On Red Hat GNU/Linux, you can do the following (as root):

cp hello*.class /var/www/cgi-bin/

Next find the cgi-servlet program that Kawa builds and installs. If you installed Kawa in the default place, it will be in /usr/local/bin/cgi-servlet. (You'll have this if you installed Kawa from source, but not if you're just using a Kawa .jar file.) Copy this program into the same CGI directory:

cp /usr/local/bin/cgi-servlet /var/www/cgi-bin/

You can link instead of copying:

ln -s /usr/local/bin/cgi-servlet /var/www/cgi-bin/

Because of security issues a symbolic link may not work, so it is safer to copy the file. However, if you already have a copy of cgi-servlet in the CGI directory, it is safe to make a hard link instead of making an extra copy.

Make sure the files have the correct permissions:

chmod a+r /var/www/cgi-bin/hello*.class /var/www/cgi-bin/hello
chmod a+x /var/www/cgi-bin/hello

Now you should be able to run the Kawa program, using the URL http://localhost/cgi-bin/hello. It may take a few seconds to get the reply, mainly because of the start-up time of the Java VM. That is why servlets are preferred. Using the CGI interface can still be useful for testing or when you can't run servlets.

Functions for accessing HTTP requests

The following functions are useful for accessing properties of an HTTP request, in a Kawa program that is run either as a servlet or a CGI script. These functions can be used from plain Scheme, from KRL (whether in BRL-compatible mode or not), and from XQuery.

The examples below assume the request http://example.com:8080/myapp/foo/bar?val1=xyz&val2=abc, where myapp is the application context. We also assume that this is handled by a script foo/+default+.

The file testsuite/webtest/info/+default+ in the Kawa source distribution calls most of these functions. You can try it as described in the section called “Self-configuring web page scripts”.

Request URL components

request-URI

Returns the URI of the request, as a value of type URI. This excludes the server specification, but includes the query string. (It is the combination of CGI variables SCRIPT_NAME, PATH_INFO, and QUERY_STRING. Using servlets terminology, it is the combination of Context Path, Servlet Path, PathInfo, and Query String.)
(request-URI) ⇒ "/myapp/foo/bar?val1=xyz&val2=abc"

request-path

Returns the URI of the request, as a value of type URI. This excludes the server specification and the query string. Equivalent to (path-file (request-URI)). (It is the combination of CGI variables SCRIPT_NAME, and PATH_INFO. Same as the concatenation of (request-context-path), (request-script-path), and (request-local-path). Using servlets terminology, it is the combination of Context Path, Servlet Path, and PathInfo.)
(request-path) ⇒ "/myapp/foo/bar"

request-uri

This function is deprecated, because of possible confusion with request-URI. Use request-path instead.

request-url

Returns the complete URL of the request, except the query string. The result is a java.lang.StringBuffer.
(request-url) ⇒ "http://example.com:8080/myapp/foo/bar"

request-context-path

Returns the context path, relative to the server root. This is an initial substring of the (request-path). Similar to the Context Path of a servlet request, except that it ends with a "/".
(request-context-path) ⇒ "/myapp/"

request-script-path

Returns the path of the script, relative to the context. This is either an empty string, or a string that ends with "/", but does not start with one. (The reason for this is to produce URIs that work better with operations like resolve-uri.) This is conceptually similar to request-servlet-path, though not always the same, and the "/" conventions differ.
(request-script-path) ⇒ "foo/"

request-local-path

Returns the remainder of the request-path, relative to the request-script-path.
(request-local-path) ⇒ "bar"

request-query-string

Returns the query string from an HTTP request. The query string is the part of the request URL after a question mark. Returns false if there was no query string. Corresponds to the CGI variable QUERY_STRING.
(request-query-string) ⇒ "val1=xyz&val2=abc"
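
As a rough illustration of how these components relate, the following sketch of a web page script echoes each one as plain text. It assumes the script is installed as foo/+default+ as in the running example; the labels and layout are arbitrary choices, not part of Kawa.

```scheme
;; Sketch of a web page script that echoes the request URL components.
;; Assumes it is run as a servlet or CGI script, so a request is available.
(response-content-type "text/plain")
(format #f "path:    ~a~%" (request-path))
(format #f "context: ~a~%" (request-context-path))
(format #f "script:  ~a~%" (request-script-path))
(format #f "local:   ~a~%" (request-local-path))
;; request-query-string returns false when there is no query string.
(format #f "query:   ~a~%" (or (request-query-string) "(none)"))
```

The context, script, and local lines, read in that order, should concatenate to the value shown on the path line.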

Request parameters

Request parameters are used for data returned from forms, and for other uses. They may be encoded in the query string or in the request body.

request-parameter name [default]

If there is a parameter with the given name (a string), return the (first) corresponding value, as a string. Otherwise, return the default value, or #!null if there is no default.
(request-parameter "val1") ⇒ "xyz"
(request-parameter "val9" "(missing)") ⇒ "(missing)"

request-parameters name

If there are one or more parameters with the given name (a string), return them all (as multiple values). Otherwise, return no values (i.e. (values)).
(request-parameters "val1") ⇒ "xyz"
(request-parameters "val9") ⇒ #!void

request-parameter-map

Returns a map of all the parameters. This is a map from strings to sequences of strings. (Specifically, a java.util.Map<String,java.util.List<String>>.)
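
A minimal sketch of using a request parameter in a page script follows. The parameter name "name" and the fallback "stranger" are hypothetical, chosen only for illustration.

```scheme
;; Sketch: greet the visitor using a (hypothetical) "name" parameter,
;; e.g. for a request ending in ?name=Alice
(define name (request-parameter "name" "stranger"))
;; The value is inserted as text, so the formatter escapes any
;; special characters such as < when writing the response.
(html:p "Hello, " name "!")
```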

Request headers

The request headers are a set of (keyword, string)-pairs transmitted as part of the HTTP request, before the request body.

request-header name

If there is a header with the given name (a string), return the corresponding value string. Otherwise, return #!null.
(request-header "accept-language") ⇒ "en-us,en;q=0.5"

request-header-map

Returns a map of all the headers. This is a map from strings to sequences of strings. (Specifically, a java.util.Map<String,java.util.List<String>>.)
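
Since request-header returns #!null for a missing header, a script will usually want to check for that case. A small sketch, where the fallback text is an arbitrary choice:

```scheme
;; Sketch: report the client's preferred languages, if the header was sent.
(define lang (request-header "accept-language"))
(html:p "Preferred languages: "
        (if (eq? lang #!null)
            "(not specified)"   ; header absent
            lang))
```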

Request body

request-input-port

Return a textual input port for reading the request body, as a sequence of characters.

request-input-stream

Return a binary input stream for reading the request body, as a sequence of bytes.

request-body-string

Return the entire request body as a string.
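
The following sketch echoes the body of a POST request back to the client as plain text. It uses request-method, described under "Miscellaneous request properties"; the message for other methods is arbitrary.

```scheme
;; Sketch: echo the body of a POST request back as plain text.
(response-content-type "text/plain")
(if (string=? (request-method) "POST")
    (request-body-string)
    "Send a POST request to see its body echoed here.")
```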

Request IP addresses and ports

Information about the interface and port on which the request was received.

request-local-socket-address

The local address on which the request was received. This is the combination of (request-local-host) and (request-local-port), as an instance of java.net.InetSocketAddress.

request-local-host

Get the IP address of the interface on which the request was received, as a java.net.InetAddress.

request-local-IP-address

Get the IP address of the interface on which the request was received, as a string in numeric form:
(request-local-IP-address) ⇒ "127.0.0.1"

request-local-port

Get the port this request was received on.
(request-local-port) ⇒ 8080

Information about the interface and port of the remote client that invoked the request.

request-remote-socket-address

The address of the remote client (usually a web browser) which invoked the request. This is the combination of (request-remote-host) and (request-remote-port), as an instance of java.net.InetSocketAddress.

request-remote-host

Get the IP address of the remote client which invoked the request, as a java.net.InetAddress.

request-remote-IP-address

Get the IP address of the remote client which invoked the request, as a string in numeric form:
(request-remote-IP-address) ⇒ "123.45.6.7"

request-remote-port

The port used by the remote client.
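
As an illustrative sketch, a page script could report where a request came from like this. The wording of the message is arbitrary.

```scheme
;; Sketch: show the remote client's address and port in the page.
(html:p "Request received from "
        (format #f "~a, port ~a"
                (request-remote-IP-address)
                (request-remote-port))
        ".")
```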

Miscellaneous request properties

request-path-translated

Map the request-path to a file name (a string) in the server application directory. Corresponds to the CGI variable PATH_TRANSLATED.

request-method

Returns the method of the HTTP request, usually "GET" or "POST". Corresponds to the CGI variable REQUEST_METHOD.

request-scheme

Returns the scheme (protocol) of the request. Usually "http", or "https".

Generating HTTP responses

The result of evaluating the top-level expressions of a web page script becomes the HTTP response that the servlet sends back to the browser. The result is typically an HTML/XML element node object; Kawa will automatically format the result as appropriate for its type. Before the main part of the response there may be special "response header values", as created by the response-header function. Kawa uses the response header values to set various required and optional fields of the HTTP response. Note that response-header does not actually do anything until its value is "printed" to the standard output. Note also that a "Content-Type" response header value is special, since it controls the formatting of the non-response-header values that follow it.

response-header key value

Create the response header ‘key: value’ in the HTTP response. The result is a "response header value" (of some unspecified type). It does not directly set or print a response header, but only does so when you actually "print" its value to the response output stream.

response-content-type type

Specifies the content-type of the result, for example "text/plain". Convenience function for (response-header "Content-Type" type).

error-response code [message]

Creates a response-header with an error code of code and a response message of message. (For now this is the same as response-status.)

Note this also returns a response-header value, which does not actually do anything unless it is returned as the result of executing a servlet body.

response-status code [message]

Creates a response-header with a status code of code and a response message of message. (For now this is the same as error-response.)
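
Because these functions return values rather than acting immediately, a script can choose between an error and a normal page simply by choosing which value to return. The sketch below assumes a hypothetical required parameter "id"; the status code and messages are illustrative.

```scheme
;; Sketch: return an error response when a (hypothetical) required
;; parameter is missing, and a normal page otherwise.
(define id (request-parameter "id"))
(if (eq? id #!null)
    ;; The header value takes effect only because it becomes
    ;; the result of the script.
    (error-response 400 "Missing required parameter: id")
    (html:p "Looking up item " id "."))
```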

Using non-Scheme languages for XML/HTML

XQuery language

Bundled with Kawa is a fairly complete implementation of W3C's new XML Query language. If you start Kawa with the --xquery option, it selects the "XQuery" source language; this also prints output using XML syntax. See the Qexo (Kawa-XQuery) home page for examples and more information.

XSL transformations

There is an experimental implementation of the XSLT (XML Stylesheet Language Transformations) language. Selecting --xslt at the Kawa command line will parse a source file according to the syntax of an XSLT stylesheet. See the Kawa-XSLT page for more information.

KRL - The Kawa Report Language for generating XML/HTML

KRL (the "Kawa Report Language") is a powerful Kawa dialect for embedding Scheme code in text files such as HTML or XML templates. You select the KRL language by specifying --krl on the Kawa command line.

KRL is based on BRL, Bruce Lewis's "Beautiful Report Language", and uses some of BRL's code, but there are some experimental differences, and the implementation core is different. You can run KRL in BRL-compatibility mode by specifying --brl instead of --krl.

Differences between KRL and BRL

This section summarizes the known differences between KRL and BRL. Unless otherwise specified, KRL in BRL-compatibility mode will act as BRL.

In BRL a normal Scheme string "mystring" is the same as the inverted quote string ]mystring[, and both are instances of the type <string>. In KRL "mystring" is a normal Scheme string of type <string>, but ]mystring[ is a special type that suppresses output escaping. (It is equivalent to (unescaped-data "mystring").)
When BRL writes out a string, it does not do any processing to escape special characters like <. However, KRL in its default mode does normally escape characters and strings. Thus "<a>" is written as &lt;a&gt;. You can stop it from doing this by overriding the output format, for example by specifying --output-format scheme on the Kawa command line, or by using the unescaped-data function.
Various Scheme syntax forms, including lambda, take a body, which is a list of one or more declarations and expressions. In normal Scheme and in BRL the value of a body is the value of the last expression. In KRL the value of a body is the concatenation of all the values of the expressions, as if using values-append.
In BRL a word starting with a colon is a keyword. In KRL a word starting with a colon is an identifier, which by default is bound to the make-element function specialized to take the rest of the word as the tag name (first argument).
BRL has an extensive utility library. Most of this has not yet been ported to KRL, even in BRL-compatibility mode.
