[tex-k] A041: \uppercase and \lowercase not respecting the invariant of space tokens having character code 32

tttex at mailbox.org tttex at mailbox.org
Thu Oct 12 20:37:31 CEST 2023


Chapter 8 of "The TeXbook" states on page 47 that "the character code in a space token is always 32". However, the definition of \uppercase and \lowercase given in chapter 7 on page 41 of "The TeXbook" does not respect this invariant: these commands change the character code of any character token whose \uccode or \lccode is nonzero, and space tokens are no exception. The implementation of \lowercase and \uppercase in modules 1288 and 1289 of "TeX: The Program" follows the definition given in "The TeXbook".
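
As a minimal sketch of the chapter 7 rule, assuming plain TeX (mapping ! to ? is an arbitrary choice made only for illustration), the following shows that \uppercase replaces the character code of any character token with a nonzero \uccode while leaving its category code alone:

  % Assumes plain TeX. The mapping of ! to ? is purely illustrative.
  \uccode`\!=`\?
  % Every character token with a nonzero \uccode is replaced;
  % the category code of each token is kept.
  \uppercase{\message{Really!}}% should print REALLY?
  \bye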

In "TeX: The Program", several places rely on space tokens having character code 32. The routine 'get_r_token', defined in module 1215, discards leading space tokens by comparing each token to 'space_token', which is defined as a space token with character code 32. Similarly, when macro parameters are matched in module 393, a token is recognized as a space token only if it equals 'space_token'. Neither routine works as intended for a space token whose character code differs from 32.
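
The effect on parameter matching can be sketched as follows, assuming plain TeX. The macro names \grab, \normalspace and \weirdspace are hypothetical and serve only this illustration:

  % Assumes plain TeX; all macro names here are made up for the example.
  \def\grab#1{\message{[#1]}}
  \def\normalspace{ }
  \uccode`\ =`A
  \uppercase{\def\weirdspace{ }}
  \uccode`\ =0 % restore the default
  % A normal space is skipped before an undelimited parameter,
  % so #1 is x and this should print [x].
  \expandafter\grab\normalspace x
  % The space token with character code 65 is not equal to
  % 'space_token', so it becomes #1 and this should print [A].
  % The x is left over and is typeset.
  \expandafter\grab\weirdspace x
  \bye

By the same reasoning, a token like the one in \weirdspace placed before the control sequence that 'get_r_token' is looking for would presumably not be discarded and would lead to a "Missing control sequence inserted" error instead.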

This seems to me to be an inconsistency in both "The TeXbook" and "TeX: The Program".

The following code demonstrates that space tokens with a character code other than 32 can be created:

  % Create a token corresponding to the letter A.
  \let\A=A
  % Create a normal space token.
  \toks0={ }
  % Create a space token with the same character code as A (i.e. 65).
  \uccode`\ =`A
  \uppercase{\toks2={ }}
  % \if compares character codes.
  \if\A\the\toks2
    \message{The weird space token now has the same character code as A.}
  \fi
  % \ifcat compares category codes.
  \ifcat\the\toks0 \the\toks2
    \message{It still has the same category code as a space token though.}
  \fi
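
A complementary check, as a minimal sketch assuming plain TeX (the macro names \normalspace and \weirdspace are hypothetical and not part of the report): \ifx compares the token lists of two macros, and \message prints the token by its character code.

  \def\normalspace{ }
  \uccode`\ =`A
  \uppercase{\def\weirdspace{ }}
  \uccode`\ =0 % restore the default
  % The two macros differ only in the character code of their single token.
  \ifx\normalspace\weirdspace
    \message{identical}
  \else
    \message{different}% this branch should be taken
  \fi
  % Should print [A], because the token is shown by its character code.
  \message{[\weirdspace]}
  \bye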


