Kawa: Input, output, and file handling

Kawa has a number of useful tools for controlling input and output:

A programmable reader.

A powerful pretty-printer.

Named output formats

The --output-format (or --format) command-line switch can be used to override the default format for how values are printed on the standard output. This format is used for values printed by the read-eval-print interactive interface. It is also used to control how values are printed when Kawa evaluates a file named on the command line (using the -f flag or just a script name). (It also affects applications compiled with the --main flag.) It currently affects how values are printed by a load, though that may change.

The default format depends on the current programming language. For Scheme, the default is scheme for read-eval-print interaction, and ignore for files that are loaded.

The formats currently supported include the following:

scheme: Values are printed in a format matching the Scheme programming language, as if using display. "Groups" or "elements" are written as lists.
readable-scheme: Like scheme, as if using write: Values are generally printed in a way that they can be read back by a Scheme reader. For example, strings have quotation marks, and character values are written like ‘#\A’.
elisp: Values are printed in a format matching the Emacs Lisp programming language. Mostly the same as scheme.
readable-elisp: Like elisp, but values are generally printed in a way that they can be read back by an Emacs Lisp reader. For example, strings have quotation marks, and character values are written like ‘?A’.
clisp
commonlisp: Values are printed in a format matching the Common Lisp programming language, as if written by princ. Mostly the same as scheme.
readable-clisp
readable-commonlisp: Like clisp, but as if written by prin1: values are generally printed in a way that they can be read back by a Common Lisp reader. For example, strings have quotation marks, and character values are written like ‘#\A’.
xml
xhtml
html: Values are printed in XML, XHTML, or HTML format. This is discussed in more detail in the section called “Formatting XML”.
cgi: The output should follow the CGI standards. I.e. assume that this script is invoked by a web server as a CGI script/program, and that the output should start with some response header, followed by the actual response data. To generate the response headers, use the response-header function. If the Content-type response header has not been specified, and it is required by the CGI standard, Kawa will attempt to infer an appropriate Content-type depending on the following value.
ignore: Top-level values are ignored, instead of printed.

Paths - file names, URLs, and URIs

A Path is the name of a file or some other resource. The path mechanism provides a layer of abstraction, so you can use the same functions on either a filename or a URL/URI. Functions that in standard Scheme take a filename have been generalized to take a path or a path string, as if using the path function below. For example:

(open-input-file "http://www.gnu.org/index.html")
(open-input-file (URI "ftp://ftp.gnu.org/README"))

path

A general path, which can be either a filename or a URI. Represented using the abstract Java class gnu.kawa.io.Path.

Coercing a value to a Path is equivalent to calling the path constructor documented below.

path arg

Coerces the arg to a path. If arg is already a path, it is returned unchanged. If arg is a java.net.URI, or a java.net.URL then a URI value is returned. If arg is a java.io.File, a filepath value is returned. Otherwise, arg can be a string. A URI value is returned if the string starts with a URI scheme (such as "http:"), and a filepath value is returned otherwise.

path? arg

True if arg is a path - i.e. an instance of a gnu.kawa.io.Path.

filepath

The name of a local file. Represented using the Java class gnu.kawa.io.FilePath, which is a wrapper around java.io.File.

filepath? arg

True if arg is a filepath - i.e. an instance of a gnu.kawa.io.FilePath.

URI

A Uniform Resource Identifier, which is a generalization of the more familiar URL. The general format is specified by RFC 2396: Uniform Resource Identifiers (URI): Generic Syntax. Represented using the Java class gnu.kawa.io.URIPath, which is a wrapper around java.net.URI. A URI can be a URL, or it can be a relative URI.

URI? arg

True if arg is a URI - i.e. an instance of a gnu.kawa.io.URIPath.

URL

A Uniform Resource Locator - a subtype of URI. Represented using the Java class gnu.kawa.io.URLPath, which is a wrapper around a java.net.URL, in addition to extending gnu.kawa.io.URIPath.
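
As a rough illustration of the coercion rules above, the following sketch classifies two values with the predicates documented in this section. The file name and URL are placeholders chosen for this example, and the results shown in the comments follow from the descriptions above rather than from a particular Kawa release.

```scheme
;; A plain string without a URI scheme is coerced to a filepath.
(define local (path "notes/todo.txt"))
(path? local)        ; ⇒ #t
(filepath? local)    ; ⇒ #t

;; A string that starts with a URI scheme is coerced to a URI value.
(define remote (path "http://www.gnu.org/index.html"))
(path? remote)       ; ⇒ #t
(URI? remote)        ; ⇒ #t
(filepath? remote)   ; ⇒ #f

;; A value that is already a path is returned unchanged.
(eq? local (path local))   ; ⇒ #t
```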

Extracting Path components

path-scheme arg

Returns the “URI scheme” of arg (coerced to a path) if it is defined, or #f otherwise. The URI scheme of a filepath is "file" if the filepath is absolute, and #f otherwise.
(path-scheme "http://gnu.org/") ⇒ "http"

path-authority arg

Returns the authority part of arg (coerced to a path) if it is defined, or #f otherwise. The “authority” is usually the hostname, but may also include user-info or a port-number.
(path-authority "http://me@localhost:8000/home") ⇒ "me@localhost:8000"

path-host arg

Returns the host name part of arg (coerced to a path) if it is defined, or #f otherwise.
(path-host "http://me@localhost:8000/home") ⇒ "localhost"

path-user-info arg

Returns the “user info” of arg (coerced to a path) if it is specified, or #f otherwise.
(path-user-info "http://me@localhost:8000/home") ⇒ "me"

path-port arg

Returns the port number of arg (coerced to a path) if it is specified, or -1 otherwise. Even if there is a default port associated with a URI scheme (such as 80 for http), the value -1 is returned unless the port number is explicitly specified.
(path-port "http://me@localhost:8000/home") ⇒ 8000
(path-port "http://me@localhost/home") ⇒ -1

path-file arg

Returns the “path component” of the arg (coerced to a path). (The name path-path might be more logical, but it is obviously a bit awkward.) The path component of a file name is the file name itself. For a URI, it is the main hierarchical part of the URI, without scheme, authority, query, or fragment.
(path-file "http://gnu.org/home/me.html?add-bug#body") ⇒ "/home/me.html"

path-directory arg

If arg (coerced to a path) is a directory, return arg; otherwise return the “parent” path, without the final component.
(path-directory "http://gnu.org/home/me/index.html#body")
  ⇒ (path "http://gnu.org/home/me/")
(path-directory "http://gnu.org/home/me/")
  ⇒ (path "http://gnu.org/home/me/")
(path-directory "./dir") ⇒ (path "./dir") if dir is a directory, and (path ".") otherwise.

path-parent arg

Returns the “parent directory” of arg (coerced to a path). If arg is not a directory, same as path-directory arg.
(path-parent "a/b/c") ⇒ (path "a/b")
(path-parent "file:/a/b/c") ⇒ (path "file:/a/b/c")
(path-parent "file:/a/b/c/") ⇒ (path "file:/a/b/")

path-last arg

The last component of the path component of arg (coerced to a path). Returns a substring of (path-file arg). If that string ends with ‘/’ or the path separator, that last character is ignored. Returns the tail of the path-string, following the last (non-final) ‘/’ or path separator.
(path-last "http:/a/b/c") ⇒ "c"
(path-last "http:/a/b/c/") ⇒ "c"
(path-last "a/b/c") ⇒ "c"

path-extension arg

Returns the “extension” of the arg (coerced to a path).
(path-extension "http://gnu.org/home/me.html?add-bug#body") ⇒ "html"
(path-extension "/home/.init") ⇒ #f

path-query arg

Returns the query part of arg (coerced to a path) if it is defined, or #f otherwise. The query part of a URI is the part after ‘?’.
(path-query "http://gnu.org/home?add-bug") ⇒ "add-bug"

path-fragment arg

Returns the fragment part of arg (coerced to a path) if it is defined, or #f otherwise. The fragment of a URI is the part after ‘#’.
(path-fragment "http://gnu.org/home#top") ⇒ "top"

resolve-uri uri base

Returns uri unchanged if it is an absolute URI. Otherwise resolves it against the base URI base, which is normally (though not always) absolute.
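
A minimal sketch of both cases follows. The base location and the relative reference are made up for this example, and the URI constructor is used explicitly so that the argument types do not depend on implicit coercion. No output is shown because the printed form of the result may vary.

```scheme
;; Hypothetical base location and relative reference, for illustration only.
(define base (URI "http://gnu.org/home/"))
(define relative (URI "me.html"))

;; A relative URI is resolved against the base.
(define resolved (resolve-uri relative base))

;; An absolute URI is returned as it is.
(define absolute (URI "ftp://ftp.gnu.org/README"))
(define same (resolve-uri absolute base))

(display resolved) (newline)
(display same) (newline)
```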

Resources

A resource is a file or other fixed data that an application may access. Resources are part of the application and are shipped with it, but are stored in external files. Examples are images, sounds, and translation (localization) of messages. In the Java world a resource is commonly bundled in the same jar file as the application itself.

resource-url resource-name

Returns a URLPath that you can use as a URL, or that you can pass to open-input-file to read the resource data. The resource-name is a string which is passed to the ClassLoader of the containing module. If the module class is in a jar file, things will magically work if the resource is in the same jar file, and resource-name is a filename relative to the module class in the jar. If the module is immediately evaluated, the resource-name is resolved against the location of the module source file.

module-uri

Evaluates to a special URI that can be used to access resources relative to the class of the containing module. The URI has the form "class-resource://CurrentClass/" in compiled code, to allow moving the classes/jars. The current ClassLoader is associated with the URI, so accessing resources using the URI will use that ClassLoader. Therefore you should not create a "class-resource:" URI except by using this function or resolve-uri, since that might try to use the wrong ClassLoader.

The macro resource-url works by using module-uri and resolving that to a normal URL.

File System Interface

file-exists? filename

Returns true iff the file named filename actually exists. This function is defined on arbitrary path values: for URI values we open a URLConnection and invoke getLastModified().

file-directory? filename

Returns true iff the file named filename actually exists and is a directory. This function is defined on arbitrary path values; the default implementation for non-file objects is to return #t iff the path string ends with the character ‘/’.

file-readable? filename

Returns true iff the file named filename actually exists and can be read from.

file-writable? filename

Returns true iff the file named filename actually exists and can be written to. (Undefined if the filename does not exist, but the file can be created in the directory.)

delete-file filename

Delete the file named filename. On failure, throws an exception.

rename-file oldname newname

Renames the file named oldname to newname.

copy-file oldname newname

Copy the file named oldname to newname. The return value is unspecified.

create-directory dirname

Create a new directory named dirname. Unspecified what happens on error (such as an existing file with the same name). (Currently returns #f on error, but may change to be more compatible with scsh.)

system-tmpdir

Return the name of the default directory for temporary files.

make-temporary-file [format]

Return a file with a name that does not match any existing file. Use format (which defaults to "kawa~d.tmp") to generate a unique filename in (system-tmpdir). The current implementation is not safe from race conditions; this will be fixed in a future release (using Java2 features).
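
The procedures in this section can be combined as in the sketch below, which creates a scratch file, inspects it, and removes it again. It assumes that the value returned by make-temporary-file can be passed directly wherever a path is expected; the variable names are chosen for this example only.

```scheme
;; Create a scratch file in (system-tmpdir) using the default name format.
(define tmp (make-temporary-file))

;; Write something to it, then query the file system about it.
(call-with-output-file tmp
  (lambda (out) (write-string "scratch data" out)))

(define existed (file-exists? tmp))       ; expected to be true after the write
(define is-dir (file-directory? tmp))     ; a regular file, so expected false

;; Clean up; delete-file throws an exception on failure.
(delete-file tmp)
(define exists-after (file-exists? tmp))  ; expected to be false now
```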

Ports

Ports represent input and output devices. An input port is a Scheme object that can deliver data upon command, while an output port is a Scheme object that can accept data.

Different port types operate on different data:

A textual port supports reading or writing of individual characters from or to a backing store containing characters using read-char and write-char below, and it supports operations defined in terms of characters, such as read and write.
A binary port supports reading or writing of individual bytes from or to a backing store containing bytes using read-u8 and write-u8 below, as well as operations defined in terms of bytes (integers in the range 0 to 255).

All Kawa binary ports created by procedures documented here are also textual ports. Thus you can either read/write bytes as described above, or read/write characters whose scalar value is in the range 0 to 255 (i.e. the Latin-1 character set), using read-char and write-char.

A native binary port is a java.io.InputStream or java.io.OutputStream instance. These are not textual ports. You can use methods read-u8 and write-u8, but not read-char and write-char on native binary ports. (The functions input-port?, output-port?, binary-port?, and port? all currently return false on native binary ports, but that may change.)

call-with-port port proc

The call-with-port procedure calls proc with port as an argument. If proc returns, then the port is closed automatically and the values yielded by the proc are returned.

If proc does not return, then the port must not be closed automatically unless it is possible to prove that the port will never again be used for a read or write operation.

As a Kawa extension, port may be any object that implements java.io.Closeable. It is an error if proc does not accept one argument.
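
For instance, the following sketch reads one datum from a string port and lets call-with-port close the port afterwards. The input text is arbitrary.

```scheme
;; Read the first datum from a string port; the port is closed
;; automatically once the procedure returns.
(define in (open-input-string "(1 2 3) ignored"))
(define first-datum
  (call-with-port in (lambda (port) (read port))))

first-datum              ; ⇒ (1 2 3)
(input-port-open? in)    ; expected to be #f, since the port was closed
```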

call-with-input-file path proc

call-with-output-file path proc

These procedures obtain a textual port by opening the named file for input or output as if by open-input-file or open-output-file. The port and proc are then passed to a procedure equivalent to call-with-port.

It is an error if proc does not accept one argument.

input-port? obj

output-port? obj

textual-port? obj

binary-port? obj

port? obj

These procedures return #t if obj is an input port, output port, textual port, binary port, or any kind of port, respectively. Otherwise they return #f.

These procedures currently return #f on native Java streams (java.io.InputStream or java.io.OutputStream), a native reader (a java.io.Reader that is not a gnu.mapping.Inport), or a native writer (a java.io.Writer that is not a gnu.mapping.Outport). This may change if conversions between native ports and Scheme ports become more seamless.

input-port-open? port

output-port-open? port

Returns #t if port is still open and capable of performing input or output, respectively, and #f otherwise. (Not supported for native binary ports - i.e. java.io.InputStream or java.io.OutputStream.)

current-input-port

current-output-port

current-error-port

Returns the current default input port, output port, or error port (an output port), respectively. (The error port is the port to which errors and warnings should be sent - the standard error in Unix and C terminology.) These procedures are parameter objects, which can be overridden with parameterize. The initial bindings for these are implementation-defined textual ports.

with-input-from-file path thunk

with-output-to-file path thunk

The file is opened for input or output as if by open-input-file or open-output-file, and the new port is made to be the value returned by current-input-port or current-output-port (as used by (read), (write obj), and so forth). The thunk is then called with no arguments. When the thunk returns, the port is closed and the previous default is restored. It is an error if thunk does not accept zero arguments. Both procedures return the values yielded by thunk. If an escape procedure is used to escape from the continuation of these procedures, they behave exactly as if the current input or output port had been bound dynamically with parameterize.

open-input-file path

open-binary-input-file path

Takes a path naming an existing file and returns a textual input port or binary input port that is capable of delivering data from the file.

The procedure open-input-file checks the fluid variable port-char-encoding to determine how bytes are decoded into characters. The procedure open-binary-input-file is equivalent to calling open-input-file with port-char-encoding set to #f.

open-output-file path

open-binary-output-file path

Takes a path naming an output file to be created and returns respectively a textual output port or binary output port that is capable of writing data to a new file by that name. If a file with the given name already exists, the effect is unspecified.

The procedure open-output-file checks the fluid variable port-char-encoding to determine how characters are encoded as bytes. The procedure open-binary-output-file is equivalent to calling open-output-file with port-char-encoding set to #f.

close-port port

close-input-port port

close-output-port port

Closes the resource associated with port, rendering the port incapable of delivering or accepting data. It is an error to apply the last two procedures to a port which is not an input or output port, respectively. (Specifically, close-input-port requires a java.io.Reader, while close-output-port requires a java.io.Writer. In contrast close-port accepts any object whose class implements java.io.Closeable.)

These routines have no effect if the port has already been closed.

String and bytevector ports

open-input-string string

Takes a string and returns a text input port that delivers characters from the string. The port can be closed by close-input-port, though its storage will be reclaimed by the garbage collector if it becomes inaccessible.
(define p
  (open-input-string "(a . (b c . ())) 34"))

(input-port? p)                 ⇒  #t
(read p)                        ⇒  (a b c)
(read p)                        ⇒  34
(eof-object? (peek-char p))     ⇒  #t

open-output-string

Returns a textual output port that will accumulate characters for retrieval by get-output-string. The port can be closed by the procedure close-output-port, though its storage will be reclaimed by the garbage collector if it becomes inaccessible.
(let ((q (open-output-string))
      (x '(a b c)))
  (write (car x) q)
  (write (cdr x) q)
  (get-output-string q))          ⇒  "a(b c)"

get-output-string output-port

Given an output port created by open-output-string, returns a string consisting of the characters that have been output to the port so far in the order they were output. If the result string is modified, the effect is unspecified.
(parameterize
    ((current-output-port (open-output-string)))
    (display "piece")
    (display " by piece ")
    (display "by piece.")
    (newline)
    (get-output-string (current-output-port)))
        ⇒ "piece by piece by piece.\n"

call-with-input-string string proc

Create an input port that gets its data from string, call proc with that port as its one argument, and return the result from the call of proc.

call-with-output-string proc

Create an output port that writes its data to a string, and call proc with that port as its one argument. Return a string consisting of the data written to the port.
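
A short sketch of both procedures follows. The data being written and parsed is arbitrary.

```scheme
;; Collect everything written by the procedure into a string.
(define text
  (call-with-output-string
    (lambda (out)
      (write 'point out)
      (write-char #\space out)
      (write '(1 2) out))))
text    ; ⇒ "point (1 2)"

;; Parse data back out of a string: sum every number it contains.
(define total
  (call-with-input-string "10 20 30"
    (lambda (in)
      (let loop ((sum 0))
        (let ((datum (read in)))
          (if (eof-object? datum)
              sum
              (loop (+ sum datum))))))))
total   ; ⇒ 60
```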

open-input-bytevector bytevector

Takes a bytevector and returns a binary input port that delivers bytes from the bytevector.

open-output-bytevector

Returns a binary output port that will accumulate bytes for retrieval by get-output-bytevector.

get-output-bytevector port

Returns a bytevector consisting of the bytes that have been output to the port so far in the order they were output. It is an error if port was not created with open-output-bytevector.
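
The bytevector ports can be exercised entirely in memory, as in this sketch. It also uses read-u8, peek-u8, read-bytevector, write-u8, and write-bytevector, which are described in the Input and Output sections; the byte values are arbitrary.

```scheme
;; Accumulate bytes in memory and retrieve them as a bytevector.
(define out (open-output-bytevector))
(write-u8 1 out)
(write-bytevector (bytevector 2 3) out)
(get-output-bytevector out)        ; ⇒ #u8(1 2 3)

;; Read bytes back from a bytevector.
(define in (open-input-bytevector (bytevector 10 20 30)))
(read-u8 in)                       ; ⇒ 10
(peek-u8 in)                       ; ⇒ 20 (not consumed)
(read-bytevector 5 in)             ; ⇒ #u8(20 30), fewer than 5 were left
(eof-object? (read-u8 in))         ; ⇒ #t
```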

Input

If port is omitted from any input procedure, it defaults to the value returned by (current-input-port). It is an error to attempt an input operation on a closed port.

read [port]

The read procedure converts external representations of Scheme objects into the objects themselves. That is, it is a parser for the non-terminal datum. It returns the next object parsable from the given textual input port, updating port to point to the first character past the end of the external representation of the object.

If an end of file is encountered in the input before any characters are found that can begin an object, then an end-of-file object is returned. The port remains open, and further attempts to read will also return an end-of-file object. If an end of file is encountered after the beginning of an object’s external representation, but the external representation is incomplete and therefore not parsable, an error is signaled.

read-char [port]

Returns the next character available from the textual input port, updating the port to point to the following character. If no more characters are available, an end-of-file value is returned.

The result type is character-or-eof.

peek-char [port]

Returns the next character available from the textual input port, but without updating the port to point to the following character. If no more characters are available, an end-of-file value is returned.

The result type is character-or-eof.

Note: The value returned by a call to peek-char is the same as the value that would have been returned by a call to read-char with the same port. The only difference is that the very next call to read-char or peek-char on that port will return the value returned by the preceding call to peek-char. In particular, a call to peek-char on an interactive port will hang waiting for input whenever a call to read-char would have hung.
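
The relationship between peek-char and read-char can be seen on a small string port. This is only a sketch with an arbitrary two-character input.

```scheme
(define in (open-input-string "ab"))

(peek-char in)                 ; ⇒ #\a (the port does not advance)
(read-char in)                 ; ⇒ #\a (same character, now consumed)
(read-char in)                 ; ⇒ #\b
(eof-object? (peek-char in))   ; ⇒ #t
(eof-object? (read-char in))   ; ⇒ #t
```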

read-line [port [handle-newline]]

Reads a line of input from the textual input port. The handle-newline parameter determines what is done with the terminating end-of-line delimiter. The default, 'trim, ignores the delimiter; 'peek leaves the delimiter in the input stream; 'concat appends the delimiter to the returned value; and 'split returns the delimiter as a second value. You can use the last three options to tell if the string was terminated by end-of-line or by end-of-file. If an end of file is encountered before any end of line is read, but some characters have been read, a string containing those characters is returned. (In this case, 'trim, 'peek, and 'concat have the same result and effect. The 'split case returns two values: the characters read, and an empty string as the delimiter.) If an end of file is encountered before any characters are read, an end-of-file object is returned. For the purpose of this procedure, an end of line consists of either a linefeed character, a carriage return character, or a sequence of a carriage return character followed by a linefeed character.
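
The handle-newline modes can be compared on a string port, as in the following sketch. The helper first-line is hypothetical and exists only for this example; the results in the comments follow from the description above.

```scheme
;; Hypothetical helper: read the first line of str using the given mode.
(define (first-line str mode)
  (read-line (open-input-string str) mode))

(first-line "one\ntwo" 'trim)      ; ⇒ "one"
(first-line "one\ntwo" 'concat)    ; ⇒ "one\n"

;; 'split returns the line and its delimiter as two values.
(call-with-values
  (lambda () (first-line "one\ntwo" 'split))
  list)                            ; ⇒ ("one" "\n")

;; No delimiter before end of file: 'concat has nothing to append.
(first-line "tail" 'concat)        ; ⇒ "tail"

;; Nothing to read at all: an end-of-file object is returned.
(eof-object? (first-line "" 'trim))   ; ⇒ #t
```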

eof-object? obj

Returns #t if obj is an end-of-file object, otherwise returns #f.

Performance note: If obj has type character-or-eof, this is compiled as an int comparison with -1.

eof-object

Returns an end-of-file object.

char-ready? [port]

Returns #t if a character is ready on the textual input port and returns #f otherwise. If char-ready? returns #t then the next read-char operation on the given port is guaranteed not to hang. If the port is at end of file then char-ready? returns #t.

Rationale: The char-ready? procedure exists to make it possible for a program to accept characters from interactive ports without getting stuck waiting for input. Any input editors associated with such ports must ensure that characters whose existence has been asserted by char-ready? cannot be removed from the input. If char-ready? were to return #f at end of file, a port at end-of-file would be indistinguishable from an interactive port that has no ready characters.

read-string k [port]

Reads the next k characters, or as many as are available before the end of file, from the textual input port into a newly allocated string in left-to-right order and returns the string. If no characters are available before the end of file, an end-of-file object is returned.

read-u8 [port]

Returns the next byte available from the binary input port, updating the port to point to the following byte. If no more bytes are available, an end-of-file object is returned.

peek-u8 [port]

Returns the next byte available from the binary input port, but without updating the port to point to the following byte. If no more bytes are available, an end-of-file object is returned.

u8-ready? [port]

Returns #t if a byte is ready on the binary input port and returns #f otherwise. If u8-ready? returns #t then the next read-u8 operation on the given port is guaranteed not to hang. If the port is at end of file then u8-ready? returns #t.

read-bytevector k [port]

Reads the next k bytes, or as many as are available before the end of file, from the binary input port into a newly allocated bytevector in left-to-right order and returns the bytevector. If no bytes are available before the end of file, an end-of-file object is returned.

read-bytevector! bytevector [port [start [end]]]

Reads the next end − start bytes, or as many as are available before the end of file, from the binary input port into bytevector in left-to-right order beginning at the start position. If end is not supplied, reads until the end of bytevector has been reached. If start is not supplied, reads beginning at position 0. Returns the number of bytes read. If no bytes are available, an end-of-file object is returned.

Output

If port is omitted from any output procedure, it defaults to the value returned by (current-output-port). It is an error to attempt an output operation on a closed port.

The return type of these methods is void.

write obj [port]

Writes a representation of obj to the given textual output port. Strings that appear in the written representation are enclosed in quotation marks, and within those strings backslash and quotation mark characters are escaped by backslashes. Symbols that contain non-ASCII characters are escaped with vertical lines. Character objects are written using the #\ notation.

If obj contains cycles which would cause an infinite loop using the normal written representation, then at least the objects that form part of the cycle must be represented using datum labels. Datum labels must not be used if there are no cycles.

write-shared obj [port]

The write-shared procedure is the same as write, except that shared structure must be represented using datum labels for all pairs and vectors that appear more than once in the output.

write-simple obj [port]

The write-simple procedure is the same as write, except that shared structure is never represented using datum labels. This can cause write-simple not to terminate if obj contains circular structure.

display obj [port]

Writes a representation of obj to the given textual output port. Strings that appear in the written representation are output as if by write-string instead of by write. Symbols are not escaped. Character objects appear in the representation as if written by write-char instead of by write. The display representation of other objects is unspecified.
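
The difference between write and display is easiest to see side by side. In this sketch, written and displayed are hypothetical helpers that capture the output of each procedure in a string.

```scheme
;; Hypothetical helpers that capture what write and display produce.
(define (written obj)
  (call-with-output-string (lambda (out) (write obj out))))
(define (displayed obj)
  (call-with-output-string (lambda (out) (display obj out))))

(written "hi")            ; ⇒ "\"hi\""  (quotation marks are kept)
(displayed "hi")          ; ⇒ "hi"
(written #\A)             ; ⇒ "#\\A"    (the #\ notation)
(displayed #\A)           ; ⇒ "A"
(written '("a" #\b))      ; ⇒ "(\"a\" #\\b)"
(displayed '("a" #\b))    ; ⇒ "(a b)"
```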

newline [port]

Writes an end of line to the textual output port. This is done using the println method of the Java class java.io.PrintWriter.

write-char char [port]

Writes the character char (not an external representation of the character) to the given textual output port.

write-string string [port [start [end]]]

Writes the characters of string from start to end in left-to-right order to the textual output port.

write-u8 byte [port]

Writes the byte to the given binary output port.

write-bytevector bytevector [port [start [end]]]

Writes the bytes of bytevector from start to end in left-to-right order to the binary output port.
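
The optional start and end arguments select a sub-range. The sketch below assumes the usual R7RS convention that start is inclusive and end is exclusive; the strings and byte values are arbitrary.

```scheme
;; Write only the characters at indexes 6 through 10.
(call-with-output-string
  (lambda (out) (write-string "hello world" out 6 11)))   ; ⇒ "world"

;; Write only the bytes at indexes 1 and 2, then one more byte.
(define bytes (open-output-bytevector))
(write-bytevector (bytevector 0 1 2 3 4) bytes 1 3)
(write-u8 255 bytes)
(get-output-bytevector bytes)                             ; ⇒ #u8(1 2 255)
```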

flush-output-port [port]

force-output [port]

Forces any pending output on port to be delivered to the output file or device and returns an unspecified value. If the port argument is omitted it defaults to the value returned by (current-output-port). (The name force-output is older, while R6RS added flush-output-port. They have the same effect.)

Line numbers and other input port properties

An interactive input port has a prompt procedure associated with it. The prompt procedure is called before a new line is read. It is passed the port as an argument, and returns a string, which gets printed as a prompt.

input-port-prompter port

Get the prompt procedure associated with port.

set-input-port-prompter! port prompter

Set the prompt procedure associated with port to prompter, which must be a one-argument procedure taking an input port, and returning a string.

default-prompter port

The default prompt procedure. It returns "#|kawa:L|# ", where L is the current line number of port. When reading a continuation line, the result is "#|C---:L|# ", where C is the character returned by (input-port-read-state port). The prompt has the form of a comment to make it easier to cut-and-paste.

port-column input-port

port-line input-port

Return the current column number or line number of input-port, using the current input port if none is specified. If the number is unknown, the result is #f. Otherwise, the result is a 0-origin integer - i.e. the first character of the first line is line 0, column 0. (However, when you display a file position, for example in an error message, we recommend you add 1 to get 1-origin integers. This is because lines and column numbers traditionally start with 1, and that is what non-programmers will find most natural.)
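
Following the recommendation above, a position can be converted to 1-origin form before it is shown to a user. The helper position-string below is hypothetical and not part of Kawa; it falls back to "unknown" when either number is #f.

```scheme
;; Hypothetical helper: format a port position for human readers,
;; converting the 0-origin numbers to the conventional 1-origin form.
(define (position-string port)
  (let ((line (port-line port))
        (column (port-column port)))
    (if (and line column)
        (string-append (number->string (+ line 1))
                       ":"
                       (number->string (+ column 1)))
        "unknown")))

(define in (open-input-string "first\nsecond"))
(read-line in)                     ; consume the first line
(display (position-string in))
(newline)
```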

set-port-line! port line

Set the (0-origin) line number of the current line of port to line.

input-port-line-number port

Get the line number of the current line of port, which must be a (non-binary) input port. The initial line is line 1. Deprecated; replaced by (+ 1 (port-line port)).

set-input-port-line-number! port num

Set line number of the current line of port to num. Deprecated; replaced by (set-port-line! port (- num 1)).

input-port-column-number port

Get the column number of the current line of port, which must be a (non-binary) input port. The initial column is column 1. Deprecated; replaced by (+ 1 (port-column port)).

input-port-read-state port

Returns a character indicating the current read state of the port. Returns #\Return if not currently doing a read; #\" if reading a string; #\| if reading a comment; #\( if inside a list; and #\Space when otherwise in a read. The result is intended for use by prompt procedures, and is not necessarily correct except when reading a new-line.

symbol-read-case

A symbol that controls how read handles letters when reading a symbol. If the first letter is ‘U’, then letters in symbols are upper-cased. If the first letter is ‘D’ or ‘L’, then letters in symbols are down-cased. If the first letter is ‘I’, then the case of letters in symbols is inverted. Otherwise (the default), the letter is not changed. (Letters following a ‘\’ are always unchanged.) The value of symbol-read-case is only checked when a reader is created, not each time a symbol is read.

Miscellaneous

port-char-encoding

Controls how bytes in external files are converted to/from internal Unicode characters. Can be either a symbol or a boolean. If port-char-encoding is #f, the file is assumed to be a binary file and no conversion is done. Otherwise, the file is a text file. The default is #t, which uses a locale-dependent conversion. If port-char-encoding is a symbol, it must be the name of a character encoding known to Java. For all text files (that is if port-char-encoding is not #f), on input a #\Return character or a #\Return followed by #\Newline are converted into plain #\Newline.

This variable is checked when the file is opened; not when actually reading or writing. Here is an example of how you can safely change the encoding temporarily:
(define (open-binary-input-file name)
  (fluid-let ((port-char-encoding #f)) (open-input-file name)))

*print-base*

The number base (radix) to use by default when printing rational numbers. Must be an integer between 2 and 36, and the default is of course 10. For example setting *print-base* to 16 produces hexadecimal output.

*print-radix*

If true, prints an indicator of the radix used when printing rational numbers. If *print-base* is respectively 2, 8, or 16, then #b, #o or #x is written before the number; otherwise #Nr is written, where N is the base. An exception is when *print-base* is 10, in which case a period is written after the number, to match Common Lisp; this may be inappropriate for Scheme, so is likely to change.

*print-right-margin*

The right margin (or line width) to use when pretty-printing.

*print-miser-width*

If this is an integer, and the available width is less than or equal to this value, then the pretty-printer switches to the more compact miser style.

*print-xml-indent*

When writing to XML, controls pretty-printing and indentation. If the value is 'always or 'yes, force each element to start on a new suitably-indented line. If the value is 'pretty, only force new lines for elements that won't fit completely on a line. If the value is 'no or unset, don't add extra whitespace.

Formatted Output (Common-Lisp-style)

format destination fmt . arguments

An almost complete implementation of the Common Lisp format description, following the CL reference book Common LISP by Guy L. Steele (Digital Press). Backward compatible with most of the available Scheme format implementations.

Returns #t, #f or a string; has the side effect of printing according to fmt. If destination is #t, the output is to the current output port and #!void is returned. If destination is #f, a formatted string is returned as the result of the call. If destination is a string, destination is regarded as the format string; fmt is then the first argument and the output is returned as a string. If destination is a number, the output is to the current error port, if the implementation provides one. Otherwise destination must be an output port and #!void is returned.

fmt must be a string or an instance of gnu.text.MessageFormat or java.text.MessageFormat. If fmt is a string, it is parsed as if by parse-format.
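
The destination rules above can be sketched as follows. The variable names greeting and shortcut are only illustrative.

```scheme
;; #f as destination: the formatted text is returned as a string.
(define greeting (format #f "Hello, ~a!" "world"))

;; A string as destination is treated as the format string itself,
;; and the result is again returned as a string.
(define shortcut (format "~a + ~a" 1 2))

;; #t as destination: output goes to the current output port.
(format #t "~a~%" greeting)

;; An explicit output port as destination.
(format (current-error-port) "note: ~a~%" shortcut)
```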

parse-format format-string

Parses format-string, which is a string of the form of a Common LISP format description. Returns an instance of gnu.text.ReportFormat, which can be passed to the format function.
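
A small sketch of parsing a format string once and reusing it, assuming the parsed object is accepted wherever a format string is, as described above. The name item-report is hypothetical.

```scheme
;; Parse the format string once.
(define item-report (parse-format "~a has ~d item~:p"))

;; Reuse the parsed format with different arguments.
(define one  (format #f item-report "cart" 1))
(define many (format #f item-report "cart" 3))

(display one) (newline)
(display many) (newline)
```

This can be worthwhile when the same format is applied many times, since the directive string is not re-parsed on each call.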

A format string passed to format or parse-format consists of format directives (that start with ‘~’), and regular characters (that are written directly to the destination). Most of the Common Lisp (and Slib) format directives are implemented. Neither justification nor pretty-printing is supported yet.

Plus of course, we need documentation for format!

Implemented CL Format Control Directives

Documentation syntax: Uppercase characters represent the corresponding control directive characters. Lowercase characters represent control directive parameter descriptions.

~A

Any (print as display does).

~@A: left pad.
~mincol,colinc,minpad,padcharA: full padding.

~S

S-expression (print as write does).

~@S: left pad.
~mincol,colinc,minpad,padcharS: full padding.

~C

Character.

~@C: prints a character as the reader can understand it (i.e. #\ prefixing).
~:C: prints a character as Emacs does (e.g. ^C for ASCII 03).
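
A brief illustration of ~A, ~S and ~C, with results following the Common Lisp conventions described above. The show helper is a hypothetical convenience for printing each result on its own line.

```scheme
;; Hypothetical helper: print a value followed by a newline.
(define (show s) (display s) (newline))

(show (format #f "~a" "hi"))      ; hi      (as display)
(show (format #f "~s" "hi"))      ; "hi"    (as write)
(show (format #f "~5a|" "hi"))    ; hi   |  (padded on the right to 5 columns)
(show (format #f "~5@a|" "hi"))   ;    hi|  (padded on the left)
(show (format #f "~c" #\a))       ; a
(show (format #f "~@c" #\a))      ; #\a     (reader syntax)
```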

Formatting Integers

~D

Decimal.

~@D: print number sign always.
~:D: print comma separated.
~mincol,padchar,commachar,commawidthD: padding.

~X

Hexadecimal.

~@X: print number sign always.
~:X: print comma separated.
~mincol,padchar,commachar,commawidthX: padding.

~O

Octal.

~@O: print number sign always.
~:O: print comma separated.
~mincol,padchar,commachar,commawidthO: padding.

~B

Binary.

~@B: print number sign always.
~:B: print comma separated.
~mincol,padchar,commachar,commawidthB: padding.

~nR

Radix n.

~n,mincol,padchar,commachar,commawidthR: padding.
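
The integer directives above can be sketched as follows. The show helper is hypothetical; the hexadecimal line is left without an expected result because the letter case of the digits is not specified here.

```scheme
;; Hypothetical helper: print a value followed by a newline.
(define (show s) (display s) (newline))

(show (format #f "~d" 42))          ; 42
(show (format #f "~@d" 42))         ; +42        (sign always printed)
(show (format #f "~:d" 1234567))    ; 1,234,567  (comma separated)
(show (format #f "~8,'0d" 42))      ; 00000042   (mincol 8, padchar 0)
(show (format #f "~b" 5))           ; 101
(show (format #f "~o" 8))           ; 10
(show (format #f "~x" 255))         ; hexadecimal digits
(show (format #f "~3r" 5))          ; 12         (radix 3)
```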

~@R

Print a number as a Roman numeral.

~:@R

Print a number as an “old fashioned” Roman numeral.

~:R

Print a number as an ordinal English number.

~R

Print a number as a cardinal English number.
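
A short sketch of the four ~R variants without a radix parameter, with results as given by the Common Lisp conventions. The show helper is hypothetical.

```scheme
;; Hypothetical helper: print a value followed by a newline.
(define (show s) (display s) (newline))

(show (format #f "~@r" 4))    ; IV    (Roman numeral)
(show (format #f "~:@r" 4))   ; IIII  (old-fashioned Roman numeral)
(show (format #f "~r" 9))     ; nine  (cardinal)
(show (format #f "~:r" 9))    ; ninth (ordinal)
```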

~P

Plural.

~@P: prints y and ies.
~:P: as ~P but jumps 1 argument backward.
~:@P: as ~@P but jumps 1 argument backward.
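
The colon forms are convenient because they reuse the number that was just printed. A minimal sketch, with hypothetical helper names files and families:

```scheme
;; ~:P backs up one argument, so the count printed by ~d also
;; decides whether an "s" is appended.
(define (files n) (format #f "~d file~:p" n))

;; ~:@P does the same but chooses between "y" and "ies".
(define (families n) (format #f "~d famil~:@p" n))

(display (files 1)) (newline)      ; 1 file
(display (files 3)) (newline)      ; 3 files
(display (families 1)) (newline)   ; 1 family
(display (families 2)) (newline)   ; 2 families
```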

commawidth is the number of characters between two comma characters.

Formatting real numbers

~F

Fixed-format floating-point (prints a flonum like mmm.nnn).

~width,digits,scale,overflowchar,padcharF
~@F: If the number is positive a plus sign is printed.

~E

Exponential floating-point (prints a flonum like mmm.nnnEee).

~width,digits,exponentdigits,scale,overflowchar,padchar,exponentcharE
~@E: If the number is positive a plus sign is printed.

~G

General floating-point (prints a flonum either fixed or exponential).

~width,digits,exponentdigits,scale,overflowchar,padchar,exponentcharG
~@G: If the number is positive a plus sign is printed.

A slight difference from Common Lisp: if the number is printed in fixed form and the fraction is zero, then a zero digit is printed for the fraction, provided the width allows it and digits is unspecified.

~$

Dollars floating-point (prints a flonum in fixed with signs separated).

~digits,scale,width,padchar$
~@$: If the number is positive a plus sign is printed.
~:@$: A sign is always printed and appears before the padding.
~:$: The sign appears before the padding.
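
A few illustrative calls for the real-number directives. Expected results are shown only for the simple fixed-format cases; the other lines are left uncommented because their exact layout depends on the defaults for the omitted parameters. The show helper is hypothetical.

```scheme
;; Hypothetical helper: print a value followed by a newline.
(define (show s) (display s) (newline))

(show (format #f "~,2f" 3.14159))    ; 3.14    (2 digits after the point)
(show (format #f "~6,2f" 3.14159))   ;   3.14  (width 6, padded on the left)
(show (format #f "~@f" 2.5))         ; explicit plus sign for positive numbers
(show (format #f "~e" 12345.678))    ; exponential notation
(show (format #f "~$" 12.5))         ; dollars format
```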

Miscellaneous formatting operators

~%

Newline.

~n%: print n newlines.

~&

Print a newline if not at the beginning of the output line.

~n&: prints ~& and then n-1 newlines.

~|

Page Separator.

~n|: print n page separators.

~~

Tilde.

~n~: print n tildes.

~<newline>

Continuation Line.

~:<newline>: newline is ignored, white space left.
~@<newline>: newline is left, white space ignored.

~T

Tabulation.

~@T: relative tabulation.
~colnum,colincT: full tabulation.

~?

Indirection (expects indirect arguments as a list).

~@?: extracts indirect arguments from format arguments.
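
Indirection lets a format string be supplied as an argument. A minimal sketch of both forms, with illustrative variable names:

```scheme
;; ~? takes a format string and then a list of arguments for it.
(define with-list (format #f "~? done" "~a-~a" '(1 2)))

;; ~@? takes a format string and then consumes arguments directly
;; from the remaining format arguments.
(define inline (format #f "~@? ~a" "~a-~a" 1 2 3))

(display with-list) (newline)   ; 1-2 done
(display inline) (newline)      ; 1-2 3
```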

~(str~)

Case conversion (converts by string-downcase).

~:(str~): converts by string-capitalize.
~@(str~): converts by string-capitalize-first.
~:@(str~): converts by string-upcase.
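
The four case-conversion forms can be compared side by side. The show helper is hypothetical.

```scheme
;; Hypothetical helper: print a value followed by a newline.
(define (show s) (display s) (newline))

(show (format #f "~(~a~)" "Hello WORLD"))     ; hello world
(show (format #f "~:(~a~)" "hello world"))    ; Hello World
(show (format #f "~@(~a~)" "hello world"))    ; Hello world
(show (format #f "~:@(~a~)" "Hello world"))   ; HELLO WORLD
```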

~*

Argument Jumping (jumps 1 argument forward).

~n*: jumps n arguments forward.
~:*: jumps 1 argument backward.
~n:*: jumps n arguments backward.
~@*: jumps to the 0th argument.
~n@*: jumps to the nth argument (beginning from 0).
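
Argument jumping is mostly used to print the same argument twice or to skip one. A brief sketch, with illustrative variable names:

```scheme
;; ~:* backs up one argument, so "hi" is printed by ~a and again by ~s.
(define twice (format #f "~a ~:*~s" "hi"))

;; ~* skips one argument: 2 is never printed.
(define skipped (format #f "~a ~*~a" 1 2 3))

(display twice) (newline)     ; hi "hi"
(display skipped) (newline)   ; 1 3
```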

~[str0~;str1~;...~;strn~]

Conditional Expression (numerical clause conditional).

~n[: take argument from n.
~@[: true test conditional.
~:[: if-else-then conditional.
~;: clause separator.
~:;: default clause follows.
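
A sketch of the three conditional forms. The helper names count-word, yes-no and maybe are hypothetical. In Scheme, #f is the false value tested by ~:[ and ~@[.

```scheme
;; Numerical clause selection, with a default clause after ~:;
(define (count-word n) (format #f "~[zero~;one~:;many~]" n))

;; ~:[ picks the first clause for #f and the second otherwise.
(define (yes-no flag) (format #f "~:[no~;yes~]" flag))

;; ~@[ processes its clause only if the argument is true, and
;; leaves that argument available to the clause.
(define (maybe x) (format #f "~@[found ~a~]" x))

(display (count-word 1)) (newline)   ; one
(display (count-word 5)) (newline)   ; many
(display (yes-no #t)) (newline)      ; yes
(display (maybe 42)) (newline)       ; found 42
```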

~{str~}

Iteration (args come from the next argument (a list)).

~n{: at most n iterations.
~:{: args from next arg (a list of lists).
~@{: args from the rest of arguments.
~:@{: args from the rest args (lists).
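
The iteration forms differ only in where the arguments come from. A minimal sketch, with illustrative variable names:

```scheme
;; ~{ iterates over one list argument.
(define from-list (format #f "~{<~a>~}" '(1 2 3)))

;; ~:{ iterates over a list of lists, one sublist per iteration.
(define from-sublists (format #f "~:{~a=~a ~}" '((a 1) (b 2))))

;; ~@{ iterates over the remaining format arguments.
(define from-rest (format #f "~@{<~a>~}" 1 2 3))

(display from-list) (newline)       ; <1><2><3>
(display from-sublists) (newline)   ; a=1 b=2 (with a trailing space)
(display from-rest) (newline)       ; <1><2><3>
```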

~^

Up and out.

~n^: aborts if n = 0.
~n,m^: aborts if n = m.
~n,m,k^: aborts if n <= m <= k.
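
The most common use of ~^ is inside an iteration, where it stops the current pass once no arguments remain, so that a separator is not printed after the last element. A sketch, with the hypothetical helper name comma-list:

```scheme
;; Join the elements of a list with ", " and no trailing separator.
(define (comma-list items)
  (format #f "~{~a~^, ~}" items))

(display (comma-list '(1 2 3))) (newline)   ; 1, 2, 3
(display (comma-list '(only))) (newline)    ; only
```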

Unimplemented CL Format Control Directives

~:A: print #f as an empty list (see below).
~:S: print #f as an empty list (see below).
~<~>: Justification.
~:^

Extended, Replaced and Additional Control Directives

These are not necessarily implemented in Kawa!

~I

Print an R4RS complex number as ~F~@Fi with passed parameters for ~F.

~Y

Pretty print formatting of an argument for scheme code lists.

~K

Same as ~?.

~!

Flushes the output if format destination is a port.

~_

Print a #\space character.

~n_: print n #\space characters.

~nC

Takes n as an integer representation for a character. No arguments are consumed. n is converted to a character by integer->char. n must be a positive decimal number.

~:S

Print out readproof. Prints out internal objects represented as #<...> as strings "#<...>" so that the format output can always be processed by read.

~:A