Electronic encoding of transliteration

Transliteration of Egyptian text consists of alphabetic symbols, punctuation symbols, brackets, and layout symbols to separate words. Punctuation symbols are mostly used within words to delimit morphological boundaries, but some types of transliteration also use e.g. commas and periods to separate phrases and sentences much as they would in, say, English texts. Various types of brackets are used to indicate uncertain readings, reconstructed readings, damaged text, etc. There are many different conventions of using punctuation symbols and brackets in transliteration.

The only problem for electronic encoding comes from the alphabetic symbols. An extensive overview of transliteration alphabets can be found in:

R. Hannig. Grosses Handwörterbuch Ägyptisch-Deutsch: die Sprache der Pharaonen (2800-950 v.Chr.). Verlag Philipp von Zabern, 1995.

Most alphabetic symbols are plain Latin letters, some are Latin letters combined with diacritical marks, and a few are not from the Latin alphabet. We concentrate on the latter two groups below.

We will consider encodings in ASCII and in Unicode. The former is adequate for encoding texts according to one fixed transliteration alphabet, but leads to difficulties when we want to accurately and unambiguously describe mixed samples of transliteration from different periods and schools of Egyptology. The latter has two disadvantages: first, one important alphabetic symbol is not included, and second, much software has not yet been adapted to handle character codes wider than one byte.

ASCII

At the site of the Georg-August University you can find some conventions for electronic encoding of transliteration, and some transliteration fonts. Here we restrict ourselves to common conventions for ASCII encoding of modern styles of transliteration, where each ASCII character either represents itself, or a letter not in the Latin alphabet, or some letter with a diacritical mark that is absent from the ASCII code, according to the following table:

Gardiner sign   computer (ASCII)   printed form in books
G1              A                  aleph
M17             i (or j)           i, sometimes with dot replaced by ''crescent moon'' (or simply: j)
D36             a                  ayin
V28             H                  ''dotted h''
Aa1             x                  ''third h''
F32             X                  ''fourth h'' (underlined h)
N37             S                  shin
N29             q                  q or ''dotted k''
N29             K                  ''dotted k'' (not yet in general use)
V13             T                  ''second t'' (underlined t)
I10             D                  ''second d'' (underlined d)

In older literature, q is printed as ''dotted k'' whereas in most modern publications it is printed as itself. I strongly support a recent proposal to use K instead of q for representing ''dotted k'', thereby removing the ambiguity.

Apart from the capital letters in the above table (which all represent lower-case letters from the printed transliteration alphabet), no other capital letters should be used in transliterations. A letter is marked as a real capital letter by writing ''^'' in front of it, as for example in ^ra, for the god ''Re'', as opposed to ra, for ''sun''. I reject a recent proposal to specify capital letters by a special XML tag as unduly cumbersome.
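The rule on capital letters can be checked mechanically. The following is only a minimal sketch, assuming the input is a plain ASCII string; the names CODED_CAPITALS and find_problems are hypothetical, and K is included on the assumption that the proposal for ''dotted k'' is followed.

```python
import string

# Capital letters that the ASCII table assigns a meaning to.
# K is included on the assumption that the ''dotted k'' proposal is adopted.
CODED_CAPITALS = set("AHXSKTD")


def find_problems(text):
    """Return (index, message) pairs for characters that break the convention."""
    problems = []
    for i, ch in enumerate(text):
        if ch in string.ascii_uppercase and ch not in CODED_CAPITALS:
            # Any other capital letter should have been written with '^'.
            problems.append((i, f"capital {ch!r} has no meaning in the table"))
        elif ch == "^":
            # The capital marker only makes sense directly before a letter.
            has_letter = i + 1 < len(text) and text[i + 1] in string.ascii_letters
            if not has_letter:
                problems.append((i, "'^' must be followed by a letter"))
    return problems
```

Under these assumptions, find_problems("^ra") reports nothing, whereas "Ra" is flagged at index 0 because R is not one of the coded capitals.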

Unicode

The Unicode standard is described in:
The Unicode Standard Version 3.0. The Unicode Consortium. Addison-Wesley, 2000.
and on the Unicode home page.

Most symbols from transliteration alphabets can be expressed in Unicode:

Gardiner sign   encoding
M17             LATIN SMALL LETTER I + COMBINING COMMA ABOVE = 0069 + 0313
                LATIN CAPITAL LETTER I + COMBINING COMMA ABOVE = 0049 + 0313
                LATIN SMALL LETTER I + COMBINING INVERTED BREVE BELOW = 0069 + 032F
                LATIN SMALL LETTER A WITH DOT ABOVE = 0227
                LATIN CAPITAL LETTER A WITH DOT ABOVE = 0226
M17*M17         LATIN SMALL LETTER I WITH DIAERESIS = 00EF
D36             MODIFIER LETTER LEFT HALF RING = 02BF
                LATIN SMALL LETTER A WITH MACRON = 0101
                LATIN CAPITAL LETTER A WITH MACRON = 0100
G43             LATIN SMALL LETTER U + COMBINING INVERTED BREVE BELOW = 0075 + 032F
V28             LATIN SMALL LETTER H WITH DOT BELOW = 1E25
                LATIN CAPITAL LETTER H WITH DOT BELOW = 1E24
Aa1             LATIN SMALL LETTER H WITH BREVE BELOW = 1E2B
                LATIN CAPITAL LETTER H WITH BREVE BELOW = 1E2A
                GREEK SMALL LETTER CHI = 03C7
F32             LATIN SMALL LETTER H WITH LINE BELOW = 1E96
                LATIN CAPITAL LETTER H + COMBINING MACRON BELOW = 0048 + 0331
S29             LATIN SMALL LETTER S WITH ACUTE = 015B
                LATIN CAPITAL LETTER S WITH ACUTE = 015A
N37             LATIN SMALL LETTER S WITH CARON = 0161
                LATIN CAPITAL LETTER S WITH CARON = 0160
N29             LATIN SMALL LETTER K WITH DOT BELOW = 1E33
                LATIN CAPITAL LETTER K WITH DOT BELOW = 1E32
V13             LATIN SMALL LETTER T WITH LINE BELOW = 1E6F
                LATIN CAPITAL LETTER T WITH LINE BELOW = 1E6E
                LATIN SMALL LETTER C WITH CARON = 010D
                LATIN CAPITAL LETTER C WITH CARON = 010C
                GREEK SMALL LETTER THETA = 03B8
                LATIN SMALL LETTER T + COMBINING GRAVE ACCENT = 0074 + 0300
D46             LATIN SMALL LETTER T WITH DOT BELOW = 1E6D
                LATIN CAPITAL LETTER T WITH DOT BELOW = 1E6C
I10             LATIN SMALL LETTER D WITH LINE BELOW = 1E0F
                LATIN CAPITAL LETTER D WITH LINE BELOW = 1E0E
                LATIN SMALL LETTER C WITH CARON + COMBINING DOT BELOW = 010D + 0323
                LATIN CAPITAL LETTER C WITH CARON + COMBINING DOT BELOW = 010C + 0323
                LATIN SMALL LETTER T + COMBINING ACUTE ACCENT = 0074 + 0301

Note that each letter with a diacritical mark can also be written as a combination of Unicode characters. For example,

LATIN SMALL LETTER A WITH DOT ABOVE = 0227
can also be written as
LATIN SMALL LETTER A + COMBINING DOT ABOVE = 0061 + 0307
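
Because the same printed letter can have two encodings, a plain string comparison may treat identical transliterations as different. A minimal sketch of one way to deal with this, using Unicode normalization from Python's standard unicodedata module (the helper name same_text is hypothetical):

```python
import unicodedata

precomposed = "\u0227"   # LATIN SMALL LETTER A WITH DOT ABOVE
combined = "a\u0307"     # LATIN SMALL LETTER A + COMBINING DOT ABOVE


def same_text(left, right):
    """Compare two strings after normalizing both to NFC (composed form)."""
    nfc_left = unicodedata.normalize("NFC", left)
    nfc_right = unicodedata.normalize("NFC", right)
    return nfc_left == nfc_right
```

Here precomposed == combined is false, while same_text(precomposed, combined) is true. Letters without a precomposed form, such as capital H with COMBINING MACRON BELOW (0048 + 0331), remain two code points after normalization.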

An original proposal to add four Egyptological characters to Unicode was N2241 (from 2000-08-27 by Michael Everson). What was eventually accepted, on 2005-11-04, is summarized below:

Gardiner sign   encoding
G1              LATIN CAPITAL LETTER EGYPTOLOGICAL ALEF = A722
                LATIN SMALL LETTER EGYPTOLOGICAL ALEF = A723
D36             LATIN CAPITAL LETTER EGYPTOLOGICAL AIN = A724
                LATIN SMALL LETTER EGYPTOLOGICAL AIN = A725

On 2006-04-27, this group of characters reached ISO stage 4.
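
With these code points available, the ASCII conventions described earlier can be mapped onto Unicode. The following is a sketch only, not an established tool: the names UNICODE_FORMS and ascii_to_unicode are hypothetical, and the choice of one Unicode form per letter (for example h with breve below rather than Greek chi for Aa1) is an assumption made for illustration.

```python
# (lower-case form, capital form) for each ASCII code.
# Choosing exactly one Unicode form per letter is an assumption of this sketch.
UNICODE_FORMS = {
    "A": ("\ua723", "\ua722"),   # Egyptological alef
    "a": ("\ua725", "\ua724"),   # Egyptological ain
    "H": ("\u1e25", "\u1e24"),   # h with dot below
    "x": ("\u1e2b", "\u1e2a"),   # h with breve below
    "X": ("\u1e96", "H\u0331"),  # h with line below; capital needs a combining mark
    "S": ("\u0161", "\u0160"),   # s with caron
    "K": ("\u1e33", "\u1e32"),   # k with dot below
    "T": ("\u1e6f", "\u1e6e"),   # t with line below
    "D": ("\u1e0f", "\u1e0e"),   # d with line below
}


def ascii_to_unicode(text):
    """Convert ASCII-encoded transliteration to Unicode.

    A '^' marks the next letter as a real capital letter.
    Characters without an entry in UNICODE_FORMS stand for themselves.
    """
    out = []
    capital = False
    for ch in text:
        if ch == "^":
            capital = True
            continue
        lower, upper = UNICODE_FORMS.get(ch, (ch, ch.upper()))
        out.append(upper if capital else lower)
        capital = False
    return "".join(out)
```

For example, ascii_to_unicode("^ra") yields a capital R followed by the Egyptological ain (A725). The letters i, j and q are left unchanged here, since the text above notes that their printed form varies between publications.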

Proposal N3382R (from 2008-08-27 by Michael Everson and Bob Richmond) concerns (among other things) the Egyptological Yod. The future of this proposal is still uncertain at this point.

Some special characters:

MODIFIER LETTER RIGHT HALF RING = 02BE
''nicht genau bestimmbar, wird nicht ausgesprochen'' (''not precisely determinable, not pronounced'')

which occur e.g. in:

E. Graefe. Mittelägyptische Grammatik für Anfänger. Harrassowitz Verlag, Wiesbaden, 1994.

In the transcription of Egyptian proper names one also encounters the following characters:

LATIN SMALL LETTER A WITH MACRON = 0101
LATIN SMALL LETTER O WITH MACRON = 014D
LATIN SMALL LETTER U WITH MACRON = 016B
LATIN SMALL LETTER E WITH MACRON = 0113
LATIN SMALL LETTER E WITH BREVE = 0115

which occur e.g. in:

A. Gardiner. Egyptian Grammar. Griffith Institute, Ashmolean Museum, Oxford, 1957.