CML Reference Guide

Chapter 4.16:  Text Filters


$t2hbr(stuff)
Turns plain text stuff (which may contain newlines) into HTML.  Each newline becomes a <BR>.  Each of the special characters <, ", and > becomes its HTML special code (unless escaped by a "\").  Example:
  " $t2hbr( $shell(cat mytext) )
displays the text of an ordinary file mytext as HTML.
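
The conversion rules above can be illustrated with a rough Python emulation.  This is only a sketch of the described behavior, not the Caucus implementation.  The name t2hbr_sketch is hypothetical, and the handling of the backslash (dropped, with the escaped character passed through literally) is an assumption.

```python
def t2hbr_sketch(text: str) -> str:
    """Rough emulation of the rules described for $t2hbr()."""
    codes = {"<": "&lt;", ">": "&gt;", '"': "&quot;"}
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == "\\" and nxt in codes:
            # Assumption: the backslash is dropped and the escaped
            # special character is passed through unchanged.
            out.append(nxt)
            i += 2
            continue
        if ch == "\n":
            out.append("<BR>")              # each newline becomes a <BR>
        else:
            out.append(codes.get(ch, ch))   # special characters become HTML codes
        i += 1
    return "".join(out)


print(t2hbr_sketch('if a<b\nsay "hi"'))
```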

$cleanhtml(prohibit text)
"Clean HTML" filter.  Filters the HTML fragment in text according to the rules in the string named prohibit.  Provides a way to filter out HTML tags that you may not want displayed in a response, such as applets, JavaScript, or even annoying tags such as <BLINK>.

Here are the sample contents of a prohibit string:

This means that everything between <APPLET> and </APPLET> is ignored, that the <SCRIPT> tag is allowed, and that the <BLINK> tag (but only the tag, not the text that follows it) is ignored.  Normally something that is allowed does not need to be in the list, but advanced uses of this feature can support lists of tags that can be individually allowed or prohibited at run time.

$cleanhtml() includes all of the safety features of $safehtml(), such as automatic tag closing and mismatched quote correction.
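
As an illustration of the mismatched quote correction mentioned above, here is a minimal Python sketch.  It is not the Caucus code.  The function name and the "odd number of double quotes inside a tag" rule are assumptions chosen to mimic the described behavior, and automatic tag closing is not attempted.

```python
import re

# Anything that looks like a single tag: "<", then no angle brackets, then ">".
ANY_TAG = re.compile(r"<[^<>]*>")


def fix_mismatched_quotes(fragment: str) -> str:
    """Add a closing double quote to any tag holding an odd number of them."""
    def repair(match):
        tag = match.group(0)
        if tag.count('"') % 2 == 1:
            return tag[:-1] + '">'   # insert the missing quote before ">"
        return tag
    return ANY_TAG.sub(repair, fragment)


print(fix_mismatched_quotes('<A HREF="junk>link</A>'))
```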

$safehtml(prop stuff)
"Safe HTML" filter.  Obsolete form of $cleanhtml().  Filters the HTML fragment in stuff, making it "safe" to include in an existing HTML page.  Specifically, it removes the tags <HTML>, </HTML>, <HEAD>, </HEAD>, <BODY>, and </BODY>.  It "closes" any open tags (such as <B>) that don't have a matching closing tag (such as </B>).  It looks for mismatched quotes inside a tag, and adds an extra quote if necessary.  (For example, <A HREF="junk> becomes <A HREF="junk">.)

Prop is a number that controls certain properties of $safehtml().  It is the sum of a set of bitmasks (powers of 2); each bit controls a particular property.  The properties are:

$rhtml(stuff)
Obsolete form of $safehtml(), without the Prop argument.  $rhtml(stuff) is equivalent to $safehtml(0 stuff).

$t2html(stuff)
Attempts an "intelligent" filtering of plain text stuff into HTML.  Blank lines become <P>'s.  Parses URLs and translates them into anchored links with the same names.  (See $t2url().)

$t2url(stuff)
Translates URLs in stuff into anchored links (that pop up a new window) with the same names.  Both this function and $t2html() translate URLs that begin with any of the schemes http:/, gopher:/, telnet:/, ftp:/, or mailto:.
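
The URL translation described above might look roughly like the following Python sketch.  It is an approximation under stated assumptions, not the Caucus implementation: the function name is hypothetical, TARGET="_blank" is assumed as the way a new window is opened, and a URL is assumed to end at whitespace, a quote, or an angle bracket.  The special "caucus" URLs are not handled.

```python
import re

# Schemes listed in the text; a URL is assumed to run until whitespace,
# a double quote, or an angle bracket.
URL = re.compile(r'(?:(?:http|gopher|telnet|ftp):/|mailto:)[^\s<>"]+')


def t2url_sketch(stuff: str) -> str:
    """Wrap each recognized URL in an anchor that shows the URL itself."""
    def link(match):
        url = match.group(0)
        # Assumption: TARGET="_blank" stands in for "pops up a new window".
        return f'<A HREF="{url}" TARGET="_blank">{url}</A>'
    return URL.sub(link, stuff)


print(t2url_sketch("see http://example.com for details"))
```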

A more intelligent filtering of plain text into HTML than $t2html().  Acts as much as possible like a typical word processor.  Each single "hard" RETURN in the original text translates into a <BR>; multiple RETURNs become sequences of "&nbsp;<P>".  Groups of N spaces become N-1 "&nbsp;"s plus a regular space.  A tab is treated as a group of 5 spaces.  Parses URLs and translates them into anchored links.
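
The whitespace rules of this word-processor-style filter can be sketched in Python as follows.  This is a hedged illustration only: the function name is hypothetical, URL handling is left out, and the reading that k consecutive RETURNs produce k-1 copies of "&nbsp;<P>" is an assumption, since the text does not give the exact count.

```python
import re


def wordprocessor_sketch(text: str) -> str:
    """Apply the tab, space, and RETURN rules described above."""
    # A tab is treated as a group of 5 spaces.
    text = text.replace("\t", " " * 5)

    # A group of N spaces becomes N-1 "&nbsp;"s plus one regular space.
    text = re.sub(r" {2,}",
                  lambda m: "&nbsp;" * (len(m.group(0)) - 1) + " ",
                  text)

    def returns(match):
        count = len(match.group(0))
        if count == 1:
            return "<BR>"                    # single hard RETURN
        # Assumption: k RETURNs yield k-1 copies of "&nbsp;<P>".
        return "&nbsp;<P>" * (count - 1)

    return re.sub(r"\n+", returns, text)


print(wordprocessor_sketch("one\ntwo\n\nthree   four"))
```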

Special note: All 3 functions also recognize and translate special "caucus" URLs of the form "http:/caucus...", into a reference to a particular Caucus CML page on the current host (and with the current swebs subserver).  For example, "http:/caucus" becomes a reference to the Caucus Center page, i.e. center.cml, and "http:/caucus/conf_name" becomes a reference to confhome.cml for conference conf_name.  This is one of the very few instances in which the CML interpreter assumes knowledge of the names and arguments of the actual CML files.  (Normally this would be a bad idea, but in this case the feature is so powerful and useful as to allow the exception.)

Translates all "&"s in stuff into "&amp;".  Useful to "pre-escape" HTML code that is going to be "unescaped" when displayed by a browser.  (This pre-escaping is essential when using Caucus to edit a response containing HTML code.  Without it, any escaped HTML special sequences like "&gt;" would lose their meaning after one edit.)
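
Why the pre-escaping matters can be shown with a short Python sketch.  Python's standard html.unescape() stands in for the browser's "unescaping" of an edit field, and escape_ampersands is a hypothetical helper, not a Caucus function.

```python
import html


def escape_ampersands(stuff: str) -> str:
    """Turn every "&" into "&amp;" (the pre-escaping step)."""
    return stuff.replace("&", "&amp;")


stored = "Use &gt; for greater-than"

# Without pre-escaping, one trip through the edit field destroys the sequence.
unprotected = html.unescape(stored)

# With pre-escaping, the browser's unescaping restores the original text.
protected = html.unescape(escape_ampersands(stored))

print(unprotected)
print(protected)
```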

Translates all instances of "&", "<", and ">" in stuff into their HTML code equivalents (&amp;, &lt;, and &gt;).  Useful to "pre-escape" HTML code that is going to be "unescaped" when displayed by a browser.
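
For comparison, Python's standard library performs the same three substitutions with html.escape() when quote=False.  The sketch below is only an analogy to the behavior described above, not the Caucus implementation, and the function name is hypothetical.

```python
import html


def escape_markup(stuff: str) -> str:
    """Escape "&", "<", and ">" but leave double quotes alone."""
    return html.escape(stuff, quote=False)


print(escape_markup('<B>AT&T</B> said "hello"'))
```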

Translates all double-quotes in text to the HTML special sequence "&quot;".  This is primarily useful for placing text (that contains double-quotes) inside a double-quote-delimited field inside an HTML <INPUT> tag.
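
The <INPUT> use case can be sketched in Python as follows.  Both function names and the TYPE and NAME attributes are hypothetical and serve only to show why the quotes must be translated before the text is placed inside a double-quote-delimited field.

```python
def escape_quotes(text: str) -> str:
    """Translate every double quote into the sequence &quot;."""
    return text.replace('"', "&quot;")


def input_tag(name: str, value: str) -> str:
    """Build an <INPUT> tag whose VALUE is safely quoted."""
    return f'<INPUT TYPE="text" NAME="{name}" VALUE="{escape_quotes(value)}">'


print(input_tag("title", 'The "best" plan'))
```

Without the translation, the first double quote inside the value would end the VALUE field early.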

$t2mail(address)
Attempts to translate address into a "mailto:" URL.  (For example, if address is "[email protected]", $t2mail() generates "<a href="mailto:[email protected]">[email protected]</A>".)  If address does not appear to be an e-mail address, it is passed through unchanged.

$wraptext(width text)
Word-wraps text to width (single-width-character) columns by inserting newlines in the appropriate places.
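
A comparable operation exists in Python's standard textwrap module, shown here as a rough analogy.  The function name is hypothetical, and textwrap.fill() also collapses existing newlines and runs of whitespace, which may differ from what $wraptext() does.

```python
import textwrap


def wraptext_sketch(width: int, text: str) -> str:
    """Word-wrap text so that no line exceeds width columns."""
    return textwrap.fill(text, width=width)


print(wraptext_sketch(10, "the quick brown fox jumps"))
```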

$mac_define(name text)
Defines a CML macro name that expands to text.  See the CML macros chapter for more information.  If name is already defined, the original definition is erased and replaced by the new one. 

$mac_define() is a "protected" function, i.e. it is a no-op when called from within $protect().

Expands any macro invocations in text.  Evaluates to text, with the macro invocations replaced by the expansion of the macros.  See the CML macros chapter for more information.