Old vs New, EBCDIC and ASCII

ASCII and EBCDIC are two character encoding schemes which have played a historical role in mainframe environments. Character encoding is simply a method of representing characters as data; Morse code, for example, is an early type of character encoding. This article takes a brief look at character encoding across modern and legacy systems.

ASCII and EBCDIC (pronounced ebb-si-dick) are two incompatible character encoding schemes which emerged in the early 1960s. The acronym ASCII stands for American Standard Code for Information Interchange, and the standard was developed by a committee which included government and the major commercial vendors of the day. One vendor involved in the development of this standard was IBM, notable as one of the biggest manufacturers of business computer systems at the time.

At the same time, IBM had independently developed its own proprietary encoding scheme called EBCDIC (Extended Binary Coded Decimal Interchange Code). We've noted before that it was not uncommon in that era for computer manufacturers to secure a customer's 'loyalty' through vendor lock-in. That is, by ensuring that one vendor's technology was incompatible with others, the customer had little choice but to stick with the original vendor's products or undertake a very expensive operation to re-purchase their entire infrastructure and change to another's. In this case, however, history indicates that it was design decisions made partly due to production timelines that favored EBCDIC in IBM's System/360 range of mainframes. That range of systems and its descendants went on to become extremely popular, which has in turn led EBCDIC to be an enduring character encoding scheme.

As a developer of host connectivity software, we count the 3270 terminal emulator among our most popular terminal emulations. The 3270 family of terminals is the terminal of choice for connecting to the System/360 and its descendants. By that measure, EBCDIC plays a very important role in terminal emulation.

ASCII was originally a 7-bit encoding system, with a maximum of 128 characters (in fact, 95 printable characters plus 33 control characters). EBCDIC, on the other hand, is an 8-bit character encoding system which has its historical roots in the punch cards used to feed data into older computer systems; the order of characters in the encoding scheme is actually based on the layout of those punch cards. This means that the order of equivalent characters in ASCII and EBCDIC does not match, and there are in fact glyphs in each set which do not have equivalents in the other.
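
The difference in ordering is easy to see in practice. The following is a minimal sketch in Python, assuming the built-in `cp037` codec (EBCDIC US/Canada) as a stand-in for "EBCDIC"; other EBCDIC code pages differ in some positions. The helper `byte_value` and the sample words are illustrative only.

```python
# Sketch: compare byte values for the same characters in ASCII and EBCDIC.
# Assumption: "cp037" (EBCDIC US/Canada) stands in for "EBCDIC" here.
EBCDIC_CODEC = "cp037"


def byte_value(ch: str, codec: str) -> int:
    """Return the single byte that encodes ch under the given codec."""
    return ch.encode(codec)[0]


# Print the two byte values side by side for a few sample characters.
for ch in "aIJZ09":
    print(f"{ch!r}: ASCII 0x{byte_value(ch, 'ascii'):02X}  "
          f"EBCDIC 0x{byte_value(ch, EBCDIC_CODEC):02X}")

# 'I' and 'J' are adjacent in ASCII but separated by a gap in EBCDIC.
ascii_gap = byte_value("J", "ascii") - byte_value("I", "ascii")
ebcdic_gap = byte_value("J", EBCDIC_CODEC) - byte_value("I", EBCDIC_CODEC)
print("I-to-J gap:", ascii_gap, "in ASCII,", ebcdic_gap, "in EBCDIC")

# Sorting by raw byte value therefore gives different results.
words = ["apple", "Zebra", "42"]
ascii_order = sorted(words, key=lambda w: w.encode("ascii"))
ebcdic_order = sorted(words, key=lambda w: w.encode(EBCDIC_CODEC))
print("ASCII order: ", ascii_order)
print("EBCDIC order:", ebcdic_order)
```

In ASCII the digits sort before upper-case letters, which sort before lower-case letters; in this EBCDIC code page the order is reversed, and the alphabet is split into non-contiguous runs. Any software that moves data between the two must translate characters rather than copy bytes.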

A code page defines the set of characters available at any one time, and the byte value assigned to each. It can reflect language and localization requirements, alphanumeric and punctuation characters, or drawing and graphical sets. As there are code pages unique to particular terminals and host systems, a quality terminal emulator needs to support a variety of code pages to ensure the correct characters are displayed on screen.
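
To illustrate why the choice of code page matters, here is a small Python sketch that decodes the same bytes under two EBCDIC code pages. It assumes Python's built-in `cp037` (US/Canada) and `cp500` (International) codecs; the sample bytes are made up for the example and do not come from any particular host.

```python
# Sketch: the same EBCDIC bytes decoded under two different code pages.
# Assumption: Python's built-in "cp037" and "cp500" codecs are used.
raw = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6, 0x5A])  # "HELLO" plus byte 0x5A

decoded = {}
for code_page in ("cp037", "cp500"):
    decoded[code_page] = raw.decode(code_page)
    print(f"{code_page}: {decoded[code_page]}")
```

The letters decode identically, but the final byte is an exclamation mark under one code page and a closing square bracket under the other. An emulator configured with the wrong code page would show the host's data with the wrong punctuation.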

EBCDIC is found on several IBM mainframe and midrange systems. The most popular terminals utilizing EBCDIC are the IBM 3270 family and the IBM 5250 family of terminals, although there are other, non-IBM terminals, such as the Unisys T27, which also make use of EBCDIC. Some later terminals have taken on many of the conceptual themes of the 3270, including the Tandem (now HP NonStop) 653x and IBM's terminals for its AIX hosts, such as the IBM 3151 and IBM 3101. However, it's worth noting that whilst these terminals are similar in many ways to the 3270, they utilize ASCII rather than EBCDIC.

Today, despite ongoing use, ASCII and EBCDIC are legacy encoding systems. ASCII has been superseded by Unicode, a character set whose encodings (such as UTF-8 and UTF-16) use one or more bytes per character and can encompass the glyphs required for virtually all written languages. EBCDIC remains due to the descendants of legacy systems and software which natively use EBCDIC, in particular those IBM host systems still playing a prominent role in the enterprise.
