Embedding Quotes and other Delimiters in Forth Strings

Revision  2016-09-25

Background

Like most computing languages, Forth supports quote-delimited string literals e.g.

    : HELLO-WORLD  ." Hello world!" ;

    S" This is a string literal."

    ABORT" file not found"

Unlike many languages, there is no way to insert the delimiter character itself into the above strings. This seemingly strange
omission may be explained as follows:

To illustrate the last point, let's suppose we wish to display the following text.

    This string contains "quote" marks.

The standard functions ." and S" cannot be used, as the resulting string would consist only of the characters up to the first
double-quote. The solution is to PARSE the string using a delimiter character that does not occur in the text e.g.

    CR CHAR | PARSE This string contains "quote" marks.| TYPE CR

This produces the desired result. If many such strings need to be entered, the process can be simplified by defining an
immediate parsing word.

    : S|  [CHAR] | PARSE  POSTPONE SLITERAL ; IMMEDIATE
    : TEST  CR S| This string contains "quote" marks.| TYPE CR ;

A more complex variation is S\" . This function uses the backslash escape sequences popularized by the C language e.g.

    : TEST  S\" \nThis string contains \"quote\" marks.\n" TYPE ;

The Real Problem

The previous solutions may be termed "work-arounds" - that is, they offer a way around a defect or deficiency that does not
involve change to the function itself. As a temporary solution, work-arounds pose little problem. When they become part
of the language, the risk is duplication and/or displacement of the function originally intended to be augmented. S\" is
particularly problematic as it introduces a foreign and competing syntax into the Forth language.

To avoid these issues, an alternative approach to embedding quotes and other delimiters is presented - one that addresses
"the real problem". It takes the standard functions ." S" ABORT" .( etc. and extends them in a manner compatible
with standard behaviour.

Control Characters

Embedding of control characters in string literals is not supported. Control characters are inherently non-portable and Forth
practice is to output them separately e.g. CR and WRITE-LINE. Applications requiring strings with embedded control
characters or C-style escapes (e.g. Windows) are properly supported through library functions rather than within the Forth
language.
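
As a small illustration of emitting control characters separately from the string text, the following sketch uses only the
standard words CR, EMIT and ." . The word name REPORT is hypothetical, and how a terminal renders the tab character is
system dependent.

    : REPORT ( -- )
      CR ." Name:"  9 EMIT  ." Forth"   \ 9 = ASCII horizontal tab
      CR ." Done" ;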

Embedding Delimiters

The scheme presented here is the same as that used in Fortran, Pascal and most assembly languages. When it is desired to
embed a delimiter character in a string, simply enter the character twice e.g.

    S" This string contains ""quote"" marks."
    .( It works for any delimiter e.g. '))' )
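
To make the effect concrete, here is a brief sketch that assumes the extended S" and ." from the Implementation section
have been loaded. The word name GREET is hypothetical. Each doubled delimiter should collapse to a single character in
the resulting string, including one that falls at the very start or end of the text.

    S" ""quoted""" TYPE                    \ expected display: "quoted"
    : GREET  CR ." She said ""hello""." ;
    GREET                                  \ expected display: She said "hello".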

Implementation

A sample implementation is provided. A full implementation would also include ABORT" and C", which require
system-level support to implement.

The following code is public domain with acknowledgement to Wil Baden on whose code it was based.

\ Embedded delimiters for Forth strings
255 CONSTANT bufmax  \ may be greater than 255 characters

CREATE buf  bufmax CHARS ALLOT

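: +buf ( addr1 len1 len2 -- len3 )
  \ append addr1/len1 to buf at offset len2, clipping to bufmax
  >R bufmax R@ - MIN ( clip) R>
  2DUP + >R CHARS buf + SWAP CMOVE R> ;

: /PARSE ( char "ccc<char>" -- addr len )
  \ parse to char, treating a doubled char as one literal char
  0 BEGIN
    >R  DUP PARSE  2DUP R> +buf >R      \ copy parsed segment to buf
    1+ CHARS +  DUP SOURCE CHARS + U<   \ more input after delimiter?
  WHILE
    2DUP C@ =                           \ is next char the delimiter?
  WHILE
    1  DUP >IN +!  R> +buf              \ yes: skip it, append one copy
  REPEAT THEN  2DROP buf R> ;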

: S" ( "ccc<">" -- | addr len )
  [CHAR] " /PARSE STATE @ IF POSTPONE SLITERAL THEN ; IMMEDIATE

: ." ( -- )  POSTPONE S" POSTPONE TYPE ; IMMEDIATE

: .( ( "ccc<paren>" -- )  [CHAR] ) /PARSE TYPE ; IMMEDIATE

\ counted string support
\ PLACE ( addr u dest -- ) is a common but non-standard word

: STRING, ( addr u -- )
  255 MIN HERE OVER 1+ CHARS ALLOT PLACE ;

: ," ( "ccc<">" -- )  [CHAR] " /PARSE STRING, ;

\ : ."  POSTPONE (.") ," ; IMMEDIATE
\ : C"  POSTPONE (C") ," ; IMMEDIATE
\ : ABORT"  POSTPONE (ABORT") ," ; IMMEDIATE

CR .( Testing ... ) CR
CR S" This string includes ""quote"" marks" TYPE
: test1  CR S" This string includes ""quote"" marks" TYPE ; test1
: test2  CR ." This string includes ""quote"" marks" ; test2
HERE ," This string includes ""quote"" marks"  COUNT CR TYPE
CR .( It works for any delimiter e.g. '))' )
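
Because /PARSE takes the delimiter as an argument, the same pattern should extend to other delimiters. A minimal sketch,
assuming the listing above has been loaded; the words S' and test3 are hypothetical and not part of the original code.

    \ hypothetical single-quote delimited string literal
    : S' ( "ccc<'>" -- | addr len )
      [CHAR] ' /PARSE  STATE @ IF POSTPONE SLITERAL THEN ; IMMEDIATE

    : test3  CR S' It''s a "quoted" word' TYPE ; test3
    \ expected display: It's a "quoted" word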

History
2013-01-30 Generalized for delimiters other than quotes. Minor text
           changes.
2011-05-03 First release


