NWG/RFC #373                                       14 July 1972
NIC 11058                                          SU-AI


                        ARBITRARY CHARACTER SETS

                            by John McCarthy

It would be nice to be able to have documents stored in computers that
could include arbitrary characters and to be able to display them on
any CRT screen, edit them using any keyboard, and print them on any
printer.  The object of this memorandum is to suggest how to get there
from here with special reference to the ARPA network.

Where are we now?

   (1) At present, there is 96-character ASCII, and everyone agrees that
   it should be included in any larger set.

   (2) Many installations are dependent on 64-character sets which do not
   even include the lower case Latin alphabet.

   (3) At the Stanford Artificial Intelligence Laboratory, we have a
   114-character set that includes 96-character ASCII and is
   implemented in our keyboards, displays, and line printer.

   (4) Printers are becoming available that get their character designs
   out of memory, for example, the Xerox XGP printer, one of which we are
   getting.

   (5) The IMLAC type display has the character designs in main memory so
   that changing the displayed set is just a matter of reloading the
   memory.

   (6) Many display systems share the character generator among many
   display units.  In some of these, e.g. the Datadisc, arbitrary sets
   are probably feasible (using kludgery to be described later), but in
   other systems, e.g. our III's, arbitrary sets are not feasible.

One possible approach to communication in expanded character sets is
to produce an expanded standard set of characters, perhaps using 8 or
9 bits, and to expect new equipment to implement this set.  This approach
has the disadvantage that it will be very hard to get agreement on
what the next step should be, and even if formal agreement is
realized, many groups will find it in their interest to ignore the
standard.
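
To make the sizes involved concrete, here is a minimal sketch in
Python, assuming fixed-width codes.  The function name bits_needed is
hypothetical and chosen only for this illustration.  It computes the
smallest code width that can number each of the character sets
mentioned above, and the number of codes an 8 or 9 bit set would
provide.

```python
def bits_needed(num_chars: int) -> int:
    """Smallest fixed code width, in bits, that can number num_chars
    distinct characters (assumption: one fixed-width code per character)."""
    if num_chars < 1:
        raise ValueError("a character set needs at least one character")
    # Codes run from 0 to num_chars - 1, so the width is the bit length
    # of the largest code; a one-character set still occupies one bit.
    return max(1, (num_chars - 1).bit_length())


# Character set sizes taken from the list above.
for size in (64, 96, 114):
    width = bits_needed(size)
    spare = 2 ** width - size
    print(f"{size} characters: {width} bits, {spare} codes unused")

# Capacity of the expanded widths suggested in the text.
for width in (8, 9):
    print(f"{width} bits: {2 ** width} codes")
```

Under this assumption the 114 character Stanford set still fits in 7
bits with 14 codes to spare, while an 8 bit set would allow 256
characters and a 9 bit set 512.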