All Greek to me! An automatic Greeklish to Greek transliteration system

Presented at the Fifth International Conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy, 24–28 May 2006

Aimilios Chalamandaris, Athanassios Protopapas, Pirros Tsiakoulis, & Spyros Raptis
Institute for Language & Speech Processing / Athena

This paper presents research on “Greeklish,” that is, a transliteration of Greek using the Latin alphabet, which is used frequently in Greek e-mail communication. Greeklish is not standardized, and a number of competing conventions coexist, based on personal preferences regarding similarities between Greek and Latin letters in shape, sound, or keyboard position. Our research has led to the development of “All Greek to me!,” the first automatic transliteration system that can cope with any type of Greeklish. In this paper we first present previous research on Greeklish, describing other approaches that have attempted to deal with the same problems. We then provide a brief description of our approach, illustrating the functional flowchart of our system and the main ideas that underlie it. We present measures of system performance, based on about a year’s worth of usage as a public web service, and preliminary research, based on the same corpus, on the use of Greeklish and the trends in preferred Latin-Greek letter mapping. We evaluate the consistency of different transliteration patterns among users as well as the within-user consistency based on coherent principles. Finally, we outline planned future research to further understand the use of Greeklish and to improve “All Greek to me!” so that it functions reliably when embedded in integrated communication platforms bridging e-mail to mobile telephony and ubiquitous connectivity.