An Application of Software Engineering for Reading Linear-B Script

Linear-B script has been studied for sixty years since its decipherment. The laborious efforts of the scholars have revealed many linguistic aspects of the oldest known form of Greek (i


Introduction
The Linear-B script conveys Mycenaean/Danaic Greek, which is the oldest known form of written Greek (Babiniotis, 2002, p. 83). It was devised from the previous Aegean pre-alphabetic scripts (Cretan Hieroglyphics and Linear-A), although not by simple evolution (Davis, 2010), since it can be regarded as a relative of the previous Linear-A and not as a direct successor (Hooker, 2011, p. 55).
The linguistic importance of Linear-B is significant for Comparative and Historical Linguistics, because it allows the study of the dynamics of an Indo-European language (Greek) over thirty-five centuries (since the 15th century BC). The texts also reveal many social, cultural and commercial practices/features of the prehistoric Aegean civilizations (Duhoux & Morpurgo Davies, 2008). The corpora are available from Sacconi (1974), Chadwick et al. (1986-1999), Bennett & Olivier (1973-76) and Melena & Olivier (1991).

Linguistic Features
The texts of Linear-B are almost entirely of an administrative nature, since the script was seemingly devised out of economic necessity (Hooker, 2011, p. 35). The main structure of the texts includes toponyms, anthroponyms, words and pictograms for registered items and signs for numbers (quantities), all of them arranged in catalogues. Because of the structure of these catalogues, the syntactic information concerning Mycenaean/Danaic Greek is very limited, although evidently Greek (Hooker, 2011, p. 132). The morphological features include nominal and verbal stems of Greek etymology, adverbial prepositions, prefixes, inflectional and derivational suffixes. A few well-known Greek inflectional classes are recognized, while the derivational suffixes are the morphemes that indicate the Greek language wherever the stem cannot be construed as such, for example in the word ko-re-te (= kore-τηρ) (Hooker, 2011, pp. 120-121). The lexical features also include the presence of names (anthroponyms and toponyms) that constitute at least 65% of the total number of words (Ventris & Chadwick, 1973, p. 92). Very few of them can be undoubtedly construed as Greek.
Published by IDEAS SPREAD
The Linear-B script is a syllabary, augmented with pictograms for items and signs for numbers. The syllabic signs (syllabograms) have the phonetic values of the patterns V (a single vowel) and CV (a consonant-vowel pair), each representing an open syllable. These patterns are poorly suited to Greek phonotactics, where clusters of two to four consonants are frequent, as are closed syllables with final consonants. This particular feature increases the difficulty of deciphering and interpreting the encountered words, despite the set of spelling rules that had been devised by the scribes, introducing various kinds of misconceptions. A few examples are presented in the next section.

Literature Review
The scale of difficulty in studying Linear-B texts is evident from the suggested spelling errors of the scribes, which have been on the tablets since prehistoric times. In Hooker (2011), at least eleven occasions of spelling errors or of wrong etymology are observed (Hooker, 2011, pp. 101-102, 106, 109, 111, 113, 119, 124, 125, 132, 135, 251), made by the professional scribes of the Achaean courts. The interpretation is additionally obstructed by the script itself (namely the spelling rules), which does not differentiate the phonetic values of most dental consonants (Babiniotis, 2002, p. 86). For instance, what is written as "ra" could also be a [la]. In this way, the number of syllabograms is reduced to ninety, considering two of them (No. 34 and No. 35) as mirror images of the syllable "ni" (Kenanidis & Papakitsos, 2018, p. 26). A consequence of this phonetic ambiguity, though, is the existence of alternative interpretations, or the absence of any, in various cases of words and phrases. Three examples are quoted below.
Thus, the suggested word is reconstructed according to the spelling rules of Linear-B, describing a sponge-strainer as a specific bath-item.
The given interpretations have various discrepancies:
• The first word (ti-ri-po-de) stands for the tripod.
• The second one (ai-ke-u) probably denotes the name of the owner, the keeper or the manufacturer.
• The last phrase (ke-re-si-jo we-ke) is repeated several times in the collection of the tablets relevant to tripods.
Interpretation (i) has to be excluded: it is presented as a single word, while on the tablet it is clear that there are two words, separated by the standard delimiter (a small vertical line). The rest of the interpretations refer to "Cretan" in a form ("κρήσιος") never found anywhere else; the attempted justification is the prestige of Cretan products (Hooker, 2011, p. 212). Yet, it looks like wishful thinking on the part of the interpreter(s) to discover the name of Crete in Linear-B tablets. The name "Crete" is first encountered in Homer (e.g., see: Odyssey 19, 172-180). Some previous Greek names of the island were: "Αερία", "Δολιχή", "Μάκαρις", "Τελχινία" (Byzantii, 1839, p. 274). Other tripods that are described in this set of tablets include those with burned legs ("αποκεκαυμένος σκέλεα") and useless ones ("νωφελής"). So we have those tripods that were useless, with or without burned legs, and those made in Crete that are presumably imperishable! The interpretation suggested herein is:
• [...] and [e] is omitted since the accompanying vowel of the next consonant is the same one: re), so instead of "κρήσιος" there is "χρήσεως" = "for usage" (in Attic Greek dialect);
• we-ke > "Fει-κής" > "εικής" = "proper" (in Attic Greek dialect).
Thus, ke-re-si-jo we-ke = "proper for usage, usable" (unlike the useless ones also recorded on the same tablet).
These alternative interpretations motivated the present work: a software tool that can facilitate the interpretation of Linear-B tablets (texts) and that will also be useful in learning Linear-B, as educational software for Greek-speaking undergraduate and postgraduate students of Classics Departments worldwide.

Method
The idea was the creation of a software tool that would allow its user to choose the syllabograms of a word (written in Linear-B) from a virtual keyboard (on screen). After clicking an "interpretation button", the program would return to the user one or more interpretations of the given word, retrieved from a lexicon (database) of Ancient Greek. The Ancient Greek of Homer, in particular, is the closest form to the Mycenaean/Danaic Greek of the Linear-B tablets. The original purpose of this tool is to facilitate the reading of Linear-B tablets and not to write something in Linear-B script. Consequently, the reverse search function of choosing an Ancient Greek word and returning the possible syllabogram sequences is not supported.
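The lookup described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (which, according to the paper, involved Visual C# and a spreadsheet-based lexicon); the names LEXICON, transliterate and interpret are hypothetical, and the two entries are simply the examples quoted elsewhere in this paper.

```python
from typing import Dict, List

# Hypothetical miniature lexicon: Latin transliteration -> Ancient Greek
# interpretation(s). The real lexicon is far larger; these two entries are
# examples quoted in the paper.
LEXICON: Dict[str, List[str]] = {
    "ti-ri-po-de": ["τρίποδες"],
    "i-je-re-u": ["ιερεύς"],
}


def transliterate(syllabograms: List[str]) -> str:
    """Join the phonetic values of the selected signs with hyphens."""
    return "-".join(syllabograms)


def interpret(syllabograms: List[str]) -> List[str]:
    """Return every interpretation of the word, or an empty list if none is found."""
    return LEXICON.get(transliterate(syllabograms), [])


# The user "clicks" four signs and then asks for an interpretation.
print(interpret(["ti", "ri", "po", "de"]))
```

Because the dictionary is keyed only by the transliteration, the sketch mirrors the one-directional design: reading is supported, while the reverse search from Ancient Greek to syllabograms is not.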

Project Analysis
The three main modules of the required tool are: the database (lexicon), the search-engine and the interface (on screen). The software system was developed in two stages. At the first stage, the database was created as a spreadsheet file, using a digital dictionary of Ancient Greek (Symeonidis et al., 2010). An intermediate software tool transliterated the words of Ancient Greek into their equivalent Mycenaean/Danaic (Linear-B syllabic) form (Papamichail, 2012). At the second stage, the search-engine and the interface were developed (Kontogianni, 2014).
Concerning the database, several online lexica of Linear-B are freely available, notably:
• The Palaeolexicon (2008) is a digital database for many ancient languages. The Linear-B section includes the syllabic form, the Latin transliteration and an English translation. It does not include a Greek interpretation (neither Ancient nor Modern), making the tracing of etymology difficult and time-consuming for a Greek speaker.
• The Linear B lexicon of C. Tselentis (Tselentis, 2012) contains for every lemma: the syllabic form, the Latin transliteration, both the Greek and the English pronunciation and the English translation. Since it is distributed as a "pdf" file, it cannot be directly used as a database.
• The "Minoan Linear A and Mycenaean Linear B" lexicon of Kim Raymoure (2013) contains a search-engine that returns the place where the tablet of the selected word was found, along with a (mainly) English interpretation. Based on the work of Ventris & Chadwick (1973), it requires renewal and enrichment. A Greek interpretation is rarely given.
• DĀMOS is a database of Mycenaean at Oslo (Aurora et al., 2013), aiming at the creation of a digital tagged corpus of all the published texts in Linear-B. It is useful for finding the tablets on which the searched word is located. The searching ability is impressive but the visual representation is rather poor, mostly lacking the syllabic form, without interpretation, while the direct transliteration is frequently obscure.
• LiBER (Linear B Electronic Resources) is probably the most comprehensive database of Linear-B (MNAMON, 2016). It provides rich metadata, transcriptions, critical apparatus, photographs, interactive maps about find-spots, chronologies, scribes, inventory numbers and places of preservation. The optimum usage of this database requires a significant amount of time from the user to become familiar with the capabilities of the searching tools.
The common characteristic of these databases is that using them for learning and reading Linear-B texts can be more cumbersome than using conventional tools, like an encyclopaedia, at least in the perception of the authors herein. The features of the previous lexica made a new design necessary, tailored to the requirements presented next. The desired features are good coverage of the material, a translation into Greek with etymology or comments, and an interface that is easy to learn and use, suitable for undergraduate or postgraduate students of Classics.

Results
Consequently, the implementation of the design concerns the two main modules of the software system, which are visible to the users: the lexicon and the interface, the latter being the final result of the development process.

The Lexicon
An intermediate tool, written in the Visual C# 2008 programming language (e.g., see Foxall, 2008), automatically transliterated a digital dictionary of Ancient Greek (Symeonidis et al., 2010) into a spreadsheet file (Papamichail, 2012), containing every word in Ancient Greek along with the equivalent standard transliteration of Linear-B in Latin (e.g., "τρίποδες" > ti-ri-po-de = tripods). The lemmas are arranged firstly by size, according to the number of their Latin-form syllables (e.g., tiripode = 4), and then alphabetically. This particular arrangement facilitates a faster searching of the database by the search-engine. The computational transliteration was achieved through the reverse application of Linear-B spelling rules. This unsupervised process presented some inevitable shortcomings. For example, some phonemes of Mycenaean/Danaic Greek (e.g., [kw], [gw], [kwh]) were transformed to others in the post-Mycenaean/Danaic Greek (i.e., {p; t; k}, {b; d; g} or {ph; th; kh}, respectively), according to the context. The reverse transformation could not be encoded. The obsolescence of the Ancient Greek phoneme "F" (= [w]) could not be predicted, hence it was not re-inserted. On the contrary, the insertion of the phoneme "j" between vowels (e.g., "ιερεύς": iereus > i-je-re-u = "priest") was accomplished successfully. Notably, this particular insertion is encountered even nowadays, not only in the Cretan Greek dialect (e.g., "κριός": krios > kri-j-os = "ram") but also elsewhere (e.g., "αέρας": aeras > a-j-eras = "air, wind").
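To make the idea of applying spelling rules in reverse more concrete, the following Python sketch transliterates a Latin-spelled Greek word into Linear-B syllables. It is only an illustration under strong assumptions and not the tool of Papamichail (2012): the function name to_linear_b is hypothetical, the input is assumed to be already written in Latin letters, and only four simplified rules are encoded (l written as r, unwritten final n/r/s, a borrowed vowel inside consonant clusters, and the glide "j" after "i"). Like the original process, it makes no attempt to restore [kw], [gw], [kwh] or "F".

```python
VOWELS = "aeiou"
DROPPED_FINALS = "nrs"  # word-final consonants left unwritten (after l -> r)


def to_linear_b(word: str) -> str:
    """Transliterate a Latin-spelled Greek word into hyphenated Linear-B syllables."""
    w = word.lower().replace("l", "r")      # the script does not separate l from r
    while w and w[-1] in DROPPED_FINALS:    # omit unwritten final consonants
        w = w[:-1]

    syllables = []
    i = 0
    while i < len(w):
        ch = w[i]
        if ch in VOWELS:
            # A vowel right after "i" receives the glide "j" (iereus -> i-je-re-u).
            glide = "j" if i > 0 and w[i - 1] == "i" else ""
            syllables.append(glide + ch)
            i += 1
            continue
        # Consonant: locate the next vowel in the word.
        j = i + 1
        while j < len(w) and w[j] not in VOWELS:
            j += 1
        if j == len(w):
            break                           # no vowel follows: left unwritten in this sketch
        syllables.append(ch + w[j])
        # A consonant directly before its vowel consumes it (true CV syllable);
        # inside a cluster it only borrows the vowel (tri -> ti-ri).
        i = j + 1 if j == i + 1 else i + 1
    return "-".join(syllables)


print(to_linear_b("tripodes"))  # ti-ri-po-de
print(to_linear_b("iereus"))    # i-je-re-u
```

The two printed forms match the examples given in the text; any other input should be treated with caution, since the real spelling conventions are considerably richer than these four rules.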
Obviously, the above shortcomings had to be corrected by hand. The original outcome (i.e., the database file) was corrected, modified and enriched in various ways by Kontogianni (2014), using the sources listed below:
• the Linear-B Lexicon of Tselentis (2012);
• the Palaeolexicon (2008);
• the online dictionary of Greek, from Enacademic (2000-2017);
• the online dictionary of Ancient Greek by Liddell & Scott (1940);
• the dictionary of ancient mythological, historical and geographical proper names by Lorentis (1837);
• the word-list of Hooker (2011).
Eventually, from the initial 2,755 lemmas of the lexicon:
• 410 were corrected;
• 1,749 words in Linear-B were inserted;
• 170 entries were interpreted, having at least 350 interpretations;
• 315 words containing the phoneme "F" were inserted;
• 50 words, beginning with [q] and having an equivalent form in Ancient Greek, were inserted;
• 1,045 entries were tagged/commented.
The present version contains 4,504 lemmas (Figure 1), each one consisting of the Linear-B syllabic transliteration in Latin, the Ancient Greek interpretation(s) and, wherever necessary, a commentary that includes: explanatory comments; a tagging as toponym, anthroponym, name of divinity, month or nation; the reference of the original source of the comments from the previous list. The information contained in the lexicon can be utilized through the interface of the software system, in a user-friendly manner.
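The structure of a lemma and the ordering of the lexicon (first by number of syllables, then alphabetically) can be sketched in Python as follows. The class Entry, its field names and the function sort_key are hypothetical, since the actual lexicon is a spreadsheet file; the three sample entries reuse words discussed in this paper. The final assertion only checks that the reported counts are mutually consistent (2,755 initial lemmas plus 1,749 inserted words give 4,504).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Entry:
    translit: str                                    # Linear-B transliteration in Latin
    interpretations: List[str]                       # Ancient Greek interpretation(s)
    tags: List[str] = field(default_factory=list)    # e.g. "toponym", "anthroponym"
    comment: str = ""                                # explanatory comment, if any
    source: str = ""                                 # reference for the comment


def sort_key(entry: Entry) -> Tuple[int, str]:
    """Order by number of syllables first, then alphabetically."""
    return (entry.translit.count("-") + 1, entry.translit)


entries = [
    Entry("ti-ri-po-de", ["τρίποδες"]),
    Entry("we-ke", ["εικής"], comment="interpretation suggested in this paper"),
    Entry("i-je-re-u", ["ιερεύς"]),
]
ordered = sorted(entries, key=sort_key)
print([e.translit for e in ordered])  # ['we-ke', 'i-je-re-u', 'ti-ri-po-de']

# Consistency check of the counts reported in the text.
INITIAL_LEMMAS, INSERTED_WORDS = 2_755, 1_749
assert INITIAL_LEMMAS + INSERTED_WORDS == 4_504
```

Grouping the entries by syllable count means that a search for a four-sign word only needs to consider the block of four-syllable lemmas, which is one plausible reading of why this arrangement speeds up the search.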

The Interface
The interface of the application is very simple to use (Appendix A). The entire user-guide is accessible on screen by clicking the "Help" button (Figure 2), on the right side of the window-screen. The Linear-B syllabograms of regular phonetic values are placed on a virtual keyboard in a grid on the left side of the window-screen ("Syllable Signs"). The icons are arranged according to the preceding consonants of the syllables, in rows, and according to the following vowel, in columns. The relevant consonants are arranged vertically on the left of the equivalent row, while the relevant vowels are arranged horizontally on top of the equivalent column. The first row of the grid contains the stand-alone vowels. At the bottom of the window-screen, the virtual keyboard is completed with a row of syllabograms of irregular phonetic values ("Special Signs"), appearing under each icon. Although fourteen of the syllabograms are considered to be of unknown or dubious phonetic value (Babiniotis, 2002, p. 85), the present arrangement assigns a suggested phonetic value to all of them, according to the recent classification by Kenanidis (2013).
The users may choose the syllabograms of a word by clicking on the relevant icon of the virtual keyboard. The sequence of the selected syllabograms appears on screen under the label "Linear B Word". Simultaneously, the Latin transliteration of the phonetic values appears under the label "Transliteration". If an icon is selected by mistake, it can be erased by clicking on the button "Correct", on the right side of the window-screen. The search-engine is activated by clicking on the button "Search", under the "Transliteration" field. If the word is found in the Lexicon, the Ancient Greek interpretation and the related commentary appear in the space under the "Search" button; otherwise a failure-message is presented there. Then, the users may click on the button "Clear", on the right side of the window-screen, in order to repeat the search with a new word.
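The behaviour of the buttons described above can be modelled, without any graphical toolkit, by a small Python class. This is a sketch under assumptions and not the code of the application: the class name Session, the method names and the wording of the failure message are hypothetical, and "Correct" is assumed to erase the most recently selected sign.

```python
from typing import Dict, List


class Session:
    """Models the state behind the "Correct", "Search" and "Clear" buttons."""

    NOT_FOUND = "No interpretation found"   # hypothetical failure message

    def __init__(self, lexicon: Dict[str, str]) -> None:
        self.lexicon = lexicon
        self.signs: List[str] = []          # phonetic values of the clicked icons

    def select(self, value: str) -> None:
        """Clicking an icon of the virtual keyboard appends its value."""
        self.signs.append(value)

    def correct(self) -> None:
        """Erase the last selected sign (assumed meaning of "Correct")."""
        if self.signs:
            self.signs.pop()

    def clear(self) -> None:
        """Start over with a new word."""
        self.signs.clear()

    @property
    def transliteration(self) -> str:
        return "-".join(self.signs)

    def search(self) -> str:
        return self.lexicon.get(self.transliteration, self.NOT_FOUND)


session = Session({"ti-ri-po-de": "τρίποδες"})
for value in ["ti", "ri", "po", "da"]:      # the last sign is a deliberate mistake
    session.select(value)
session.correct()                           # remove "da"
session.select("de")
print(session.transliteration, "->", session.search())
```

Keeping the state in one object separate from the window code is a design choice of this sketch; it makes the select/correct/search cycle easy to test independently of the interface.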
Various messages informing the user about invalid actions may appear in pop-up windows. By clicking on the last button "Exit", in the right column of the window-screen, the user may close the application and the session, as an alternative to the standard "x" button in the upper right corner of the window-screen. The source of the Linear-B tablet's icon is referenced by clicking on the icon. Finally and most importantly, the interpretation session is recorded in a text file that can be printed afterwards. In this way, the entire content of a tablet can be printed in an interpreted form, to facilitate learning or decipherment.
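The recording of a session in a printable text file could look like the following Python sketch. The function name write_session_log and the tab-separated line format are assumptions made for illustration; the paper does not specify the layout of the recorded file.

```python
import io
from typing import Iterable, TextIO, Tuple


def write_session_log(results: Iterable[Tuple[str, str]], out: TextIO) -> None:
    """Write one line per searched word: transliteration, tab, interpretation."""
    for translit, interpretation in results:
        out.write(f"{translit}\t{interpretation}\n")


# An in-memory buffer stands in for the text file; with a real file one would
# pass open("session.txt", "w", encoding="utf-8") instead (hypothetical name).
buffer = io.StringIO()
write_session_log([("ti-ri-po-de", "τρίποδες"), ("i-je-re-u", "ιερεύς")], buffer)
print(buffer.getvalue(), end="")
```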

Discussion
It is probably worth reporting the various challenges that were encountered during the development of this particular software system, and that similar systems are likely to face under similar conditions. Most of the challenges were expected from the outset. Consequently, the development process was designed in a systemic manner, following the relevant model of OMAS-III (Papakitsos, 2013). This particular model combines concepts of software engineering and communication theory, forming a unified working framework for both software development and project management, with obvious organizational advantages. The factors studied were: the resources and conditions, the required time and mode of implementation, as well as the persons/experts required for a successful implementation.
The first major challenge was the complete lack of research funds (resources), mainly but not exclusively due to the economic crisis that has afflicted Greece since 2010. Thus, the development project had to be carefully planned, in order to be implemented in a sequence of successive postgraduate dissertations (Papamichail, 2012; Kontogianni, 2014), under the same supervisor. The stages had to be of limited interaction, while the whole process spanned a time longer (more than double) than technically necessary. Similar projects cannot be realized without an interdisciplinary approach (persons/experts). Four experts were involved: a linguist, specialized in ancient scripts, and three linguistic computing engineers. The contribution of the linguist, as presented in subsection 1.3 Interpretation Examples, was crucial for defining the scope of the project. Yet, some comments about the notion of interdisciplinarity could be useful. Because of the prolonged development time and the relative isolation of the team members, since only two out of four were working simultaneously at any given time, interdisciplinarity was better applied by the same person. Namely, one of the experts was a software engineer, specialized in linguistic engineering/computing, and the other was a linguist specialized in computational linguistics. Needless to say, the supervisor had to be experienced in both disciplines, in order to connect the different parts together, having the overall picture of the project.
A final note regarding the results of linguistic software applications: as exhibited by the examples in subsection 3.1 The Lexicon, the more complicated a linguistic application is, the less unsupervised the software can be. Systems that transliterate text from one language (or form of language) to another (herein from Ancient to Mycenaean/Danaic Greek and vice-versa) are better viewed as computer-assisted transliteration systems than as stand-alone/automated applications. The involvement of humans is still indispensable, in various parts and tasks of the project, because language rules or phenomena cannot always be captured in an encodable manner. Another relevant example is presented in the next subsection, regarding the usage of this software tool as a decipherment model.

Decipherment Models
Some comments about the computational efforts for deciphering ancient texts are appropriate here. There is a single work published by Snyder et al. (2010), created for a similar purpose. The opinion of the authors herein is that the work of Snyder et al. (2010) is technically not similar to this project in many (if not all) aspects. Yet, a presentation and comparison is useful, because these two models can be more complementary than competitive.
In a few words, Snyder et al. (2010) have developed a statistical model (henceforth S-Model) for the automatic decipherment of lost languages, tested on Ugaritic, a lost but known Western Semitic language of the 14th century BC, written in a cuneiform consonantal alphabet (Pardee, 2008). The unsupervised software system that is implemented according to S-Model is based on the linguistic similarities of Ugaritic to Hebrew. This system uses an encoded dictionary of Hebrew. The S-Model performs matches of Hebrew words to Ugaritic ones, in order to discover the latter. The results have a remarkable recall accuracy of 90.53%, denoting though that no matter how accurate a system is, it still requires some amount of human intervention for the rest of the text.
Let us proceed to the comparison. The S-Model is designed for unsupervised decipherment of lost languages, while our lexical model (henceforth L-Model) is designed for active transliteration and teaching of a lost language. S-Model requires a prior digitalization of the text, while L-Model is readily available. S-Model bases the successful decipherment on the existence and selection of another language (i.e., Hebrew) related to the lost one (i.e., Ugaritic). If such a related language is not correctly selected (or does not even exist), then the decipherment will fail. L-Model just interprets the text of the lost language (i.e., Mycenaean/Danaic Greek) in a more recent and also readable version (i.e., Ancient Greek). That is why the latter is also capable of being a teaching tool, which was one of the two initial goals of its development. The present software system introduces a model (L-Model) for interpreting other existing corpora and for teaching the conveyed languages, as well. For example, there are hundreds of thousands of cuneiform tablets in existence (in Sumerian, Akkadian, etc.), but only about 10% of them have been interpreted so far, because there are very few experts globally (Watkins & Snyder, 2003). Their interpretation could change History, as we know it. The development of similar software systems for this double purpose could significantly assist the laborious efforts of the involved scholars.

Conclusions
Although the Linear-B tablets have been available for more than half a century, there are just a few academic books written in Greek about Mycenaean/Danaic Greek (Pantelidis, 2012; Hooker, 2011; British Museum, 1987; Chadwick, 1962), a few databases and, up to now, not a single relevant interactive software system for Greek speakers. The present one aims at facilitating the learning of Linear-B by Greek university students (or by other students of Classics Departments worldwide, who can read Ancient Greek), as an auxiliary teaching tool. Since it is supported by a rich, recent and explanatory database of Linear-B in Greek, it can also be used for the study and interpretation not only of the existing Linear-B inscriptions but also of whatever might be discovered in the future. Due to various limitations of the available infrastructure, this software tool is not web-based (for the time being), while because of its novelty there are no empirical results regarding the potential research and/or educational benefits of its usage.
The two most important future updates of the system include: the installation of languages other than Greek and the support of the Mycenaean/Danaic Greek inflectional morphology. The installation of other languages is technically easy, by adding to the Lexicon two new columns (for each new language), containing the Ancient Greek word and the accompanying comments translated, respectively. In addition, a new button must be added to the window-screen of the interface, for selecting the relevant language. The support of inflectional morphology is far more difficult. At the moment, the different inflected forms of a Greek word are registered as separate entries.
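The proposed extension with two new columns per language can be sketched in Python, treating each row of the Lexicon as a dictionary. The function add_language and the column names (translit, word_<lang>, comment_<lang>) are hypothetical and serve only to illustrate the idea; the English gloss "tripods" is the one quoted earlier in this paper.

```python
from typing import Dict, List

Row = Dict[str, str]


def add_language(rows: List[Row], lang: str,
                 words: Dict[str, str], comments: Dict[str, str]) -> None:
    """Add two columns for a new language: the translated word and the translated comment."""
    for row in rows:
        key = row["translit"]
        row[f"word_{lang}"] = words.get(key, "")        # empty until translated
        row[f"comment_{lang}"] = comments.get(key, "")


rows: List[Row] = [
    {"translit": "ti-ri-po-de", "word_grc": "τρίποδες", "comment_grc": ""},
]
add_language(rows, "en", words={"ti-ri-po-de": "tripods"}, comments={})
print(rows[0]["word_en"])  # tripods
```

Since inflected forms are currently stored as separate entries, each of them would need its own translated pair of cells under this scheme.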

Figure 2. The user-guide window