Project B11:
Thanks to Fabian Kliebhan our corpora are online again! BZ 15.03.2011

Visible corpora

As one can see from our two annotation examples, one for Old Tibetan and one for Ladakhi, the mere XML structure is hardly informative and of little use to a person not acquainted with computational linguistics. Even when the tags and their content can be differentiated (e.g. by colours), the annotated text gets completely lost in the tag forest. As the available linguistic representation tools could not handle our complex text structures, we experimented with some alternative representations. One of the main problems we faced is that most solutions, especially those based on Java tools, are far too slow even for the small amount of text that we have produced so far.

The main representation, developed by Frank Müller-Witte with the help of Fabian Kliebhan, runs as a Cocoon application on an auxiliary server without exposing the XML data to the viewer. It shows a field for the Tibetan text (unstructured or with brackets for four consecutive embedded structures), a field for the translation, additional information on clause structures, or a representation of the tree structure. The translation becomes visible when clicking on the red verb number. All verbs in the translation are linked to their Tibetan counterparts in the original text, which makes it easy to navigate. (In the case of higher numbers, you may need to click twice to get back to a verb in the original text.) Green brackets indicate ntNodes (argument NPs or adverbial phrases, AvPs); clauses are indicated by blue brackets. Information concerning the clause structure becomes visible when clicking on the blue opening bracket. The clause type is indicated directly after the closing blue bracket, and clicking on this category will open the tree display for this clause. The page can be searched in the normal search mode; an XPath search is, for the time being, not possible.
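The bracket display described above can be illustrated with a minimal sketch. It is not the project's Cocoon implementation: the element names (`sentence`, `clause`, `ntNode`, `token`), the `type` attribute, and the placeholder tokens are assumptions made for illustration only. Square brackets stand in for the blue clause brackets and round brackets for the green ntNode brackets, with the clause type printed directly after the closing clause bracket.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation fragment. Element names, the "type" attribute and
# the placeholder tokens (N1, V1, ...) are assumptions, not the real schema.
SAMPLE = """
<sentence>
  <clause type="main">
    <ntNode><token>N1</token></ntNode>
    <clause type="embedded">
      <ntNode><token>N2</token></ntNode>
      <token>V1</token>
    </clause>
    <token>V2</token>
  </clause>
</sentence>
"""


def render(node):
    """Return the tokens under `node`, bracketing clauses and ntNodes."""
    if node.tag == "token":
        return (node.text or "").strip()
    # Render the child elements recursively and join the non-empty results.
    parts = [render(child) for child in node]
    inner = " ".join(part for part in parts if part)
    if node.tag == "clause":
        # The clause type follows the closing bracket.
        return f"[{inner}]{node.get('type', '')}"
    if node.tag == "ntNode":
        return f"({inner})"
    return inner  # e.g. the enclosing sentence element adds no brackets


def bracketed(xml_text):
    """Parse an XML string and return its bracketed plain-text view."""
    return render(ET.fromstring(xml_text))


print(bracketed(SAMPLE))
```

Because the function recurses over the element tree, embedded clauses nest naturally inside their parent clause, which is the property that makes a bracketed view more readable than the raw tags.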
For the Tibetan text you may need the following special signs: ·, ŋ, ñ, ž, š, ḥ; in names: Ŋ, Ñ, Ž, Š, Ḥ. For words of Indian origin, you may additionally need the following signs: ṭ, ḍ, ṇ, ś, ṣ, Ś.

Two additional minor fields on the right side should ideally provide lexical information from the accompanying text-specific dictionaries when one clicks on a word in the Tibetan text. Notes on the annotation or the context are indicated by an asterisk, which can be clicked to access the information in a separate small window (see also the third screen shot below). Time constraints did not allow us to optimise the representation technically and aesthetically, and we thus apologise for any instance where these fields do not function properly, due to a minor fault either in the programming (FMW) or in the annotation (BZ). For details of the annotation you may contact Bettina Zeisler (email: zeis[ ]uni-tuebingen.de). Note that the representation has been adapted only to Firefox and may not function properly in other browsers.

Four small corpora are presently available:

Old Tibetan:
OTC: The Old Tibetan Chronicle Chapter I
RAMA: Fragments of the Tibetan Rāmāyaṇa (preliminary annotation, only graphical trees)
Classical Tibetan:
TVP: Die tibetische Version des Papageienbuchs
Contemporary Ladakhi:
LLV: A Lower Ladakhi version of the Kesar epic

See also the metadata below.

Note: This page is going to be archived and can no longer be updated. The three corpora, however, might have to migrate to another host in the future, and the above links might become dysfunctional. In this case, please visit the website of Bettina Zeisler at Indologie Tübingen, where the new links will be provided. BZ 08.01.2013
A note on the translations

All corpora are supplied with translations for those readers who are not well acquainted with Tibetan. It was not our aim to provide new translations, nor did we have enough time for this task. For this reason we provide the original translations, which, in two cases, happen to be in German. While the translation of the LLV by Anna Theodora Francke needed only small changes to fit into the annotation scheme, the translation of the TVP by Silke Herrmann turned out to be quite problematic, and we had to intervene more often than we wished. We have nevertheless kept as much of her translation as possible, respecting the freedom of the author, without, however, endorsing all her solutions. Our changes are marked by square brackets.

Similarly, we had planned to use Bacot et al.'s French translation of the OTC. Nathan W. Hill, however, whose main task was the annotation of the OTC, was eager to provide a new translation, and since the OTC constitutes a particularly difficult text, this was accepted on the condition that the translation reflect the annotation (or vice versa), so that the translation could be a useful tool in the process of annotating. Unfortunately his translation (published 2006 in the Revue d'Etudes Tibétaines 10: 89-101) does not reflect the annotation. According to the intentions of the author (cf. p. 89, note 2), it also does not, with only one exception, reflect any of the discussions in the project. Since the earlier, pioneering translations were, at crucial points, also not much better, we eventually decided to provide yet another translation, one which does not strive for literary elegance and originality but is as faithful to the structure of the original as possible. Thus no attempt was made to smooth out the long chains of intertwining non-finite clauses.
We think, however, that this representation has at least the benefit of immediately showing the different strategies of representation, such as the mere enumeration of (possibly historical) facts in short simple sentences in § 6, which stands in sharp contrast to the more condensed and complex mythological narration in § 5, which consists of only a few sentences but a lot of embedded structures. As in literary German, complex sentences may be helpful for representing complex situations, but they may also be used to veil facts and reasons (or the absence of these). And they may be prone to linguistic accidents. Despite, or perhaps because of, sticking slavishly to the text and the grammatical rules, we arrived in several cases at interpretations quite different from those of the previous translators. Since our linguistic as well as historical insights might be of some interest to students of Tibetan history, we shall provide here in advance a version of the annotated translation as a PDF.
Metadata

(If diacritics are not properly displayed, select the UTF-8 charset in your browser.)
References
Tree views

We include some tree graphics (png) generated with the help of the CLaRK tool, which Bettina Zeisler (BZ) used for the annotation. Here again, we faced the problem that CLaRK is not able to open larger trees; thus we had to divide the text according to the division structure, and sometimes even into smaller parts. Links that go beyond these sections cannot be represented. Furthermore, the graphics are no longer searchable or dynamic; the dynamic representations in CLaRK itself (very useful for smaller structures) are usually minimised to absolute illegibility. On the other hand, CLaRK allows one to redefine the trees for special purposes and to highlight individual properties. BZ has thus designed two colourful sets of trees, one showing the basic information (full structure of sentence, clause with clause categories, ntNode with ntNode categories, case, token, text plus interlinear version, part of speech), the other showing a somewhat reduced structure (sentence, clause, ntNode, token, text and interlinear version) plus the argument structure and the reference relation of empty or ordinary anaphoric elements to their antecedent.

The following colours are currently used for the reference links:

red: empty (obligatory) arguments
orange: omitted (non-obligatory) arguments
blue: demonstrative pronouns
dark green: personal pronouns
purple: emphatic pronouns
yellowish green: pronominal use of adjectives
dark purple: empty argument referring to an implied antecedent
dark green: omitted argument referring to an implied antecedent
black: invalid reference empty argument (the antecedent cannot be decided upon)
grey: NP-internal reference
CLaRK tree representations (overview)
Layout: Christoph Singer. Responsible for the content: B. Zeisler. Last modified: 15.03.2011