| 
 Abstract
 
    The types-tokens ratio (TTR), which is calculated by dividing the number of different word forms (types) in a text by the total number of words
    (tokens), roughly characterizes the lexical variety of the text. This makes it intriguing to compare this parameter in original texts and in their translations from the theoretical and practical points of view. After analyzing our own empirical material, four Spanish-Ukrainian translations and
    four Ukrainian-Spanish translations compared with their respective originals, along with the results of other researchers in different language
    combinations, we found that TTR modifications show common tendencies that depend on the typological characteristics of the source language and target
    language, and on the direction of translation, rather than on the lexical variety of the text.
 
    1. Introduction
 
    There have been a number of attempts to describe the quantitative characteristics of vocabulary, both of a language or sublanguage system and of the
    language of a particular author, from the point of view of the number and frequency of types, tokens, and lexemes. We will not attempt to
    offer an exhaustive list of these. Gradually, with the development of corpus linguistics, theorists of translation studies have picked up
    several quantitative ideas from linguistics and, seeking objective criteria for evaluating the differences between original and target texts and the
    lexical-stylistic adequacy of a translation, started to calculate those parameters, which was easily accomplished using computing techniques.
    Within the last decade there has been a boom in the amount of research into quantitative parameters in translation studies due to the use of electronic
    corpora. Although attempts to create corpora started earlier, corpus-based research did not emerge until the late 1990s (Kruger, 70), and was theoretically
    generalized in translation studies by M. Baker (Baker, 175-186) and other theoreticians. The availability of corpora, their relatively easy compilation,
    and their compatibility with personal computers meant that investigations carried out by individual researchers, even with manually compiled corpora, became
    possible and popular.
 
      With the use of electronic text analysis tools, it became possible to count the words and the word forms (‘types,’ also called
    ‘orthographic words’) of a text automatically, literally with one click of a mouse, which was previously possible only by the method of total
    continuous extraction of samples or the like. Logically, the number of types compared to the total number of words in a text gives a coefficient
    that indirectly indicates the lexical richness of the text. “The higher the ratio, the more varied the vocabulary, i.e. the implication is that
    there is little repetition” (A.Kruger, 74). This coefficient, obtained by dividing the number of types by the number of tokens (also called
    ‘running words’), was first named TTR (types/tokens ratio), presumably by M.Templin in 1957 in the area of language didactics (cit. after Rhea,
    2007, 476). It opens a wide field for investigations in particular and general translation studies with the purpose of unveiling one more universal
    parameter of translation.
 
    For example, the sentence “I have to buy some bread, because I have no bread” is stylistically awkward, and its TTR is low (three word forms
    are repeated, TTR = 8/11 = 0,73), whereas “I’ve run out of bread, so I need to buy some” is much better stylistically and richer lexically,
    and its TTR is higher (only one word form is repeated, TTR = 10/11 = 0,91).
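 
    The calculation for the first sentence can be reproduced with a few lines of Python. This is only a minimal sketch: the function names and the tokenization rule (lowercased runs of letters, with an internal apostrophe kept inside the word) are assumptions made for illustration, not the procedure of any particular text analysis tool. The figure for the second sentence depends on how a contraction such as “I’ve” is tokenized, so it is not asserted here.

```python
import re

def tokenize(text):
    """Return the lowercased orthographic words (tokens) of a text.

    Assumption: a token is a run of letters, optionally containing
    apostrophes between letters (so "I've" stays one token).
    """
    return re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text.lower())

def ttr(text):
    """Types-tokens ratio: distinct word forms divided by running words."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("cannot compute TTR of a text with no tokens")
    return len(set(tokens)) / len(tokens)

sentence = "I have to buy some bread, because I have no bread"
tokens = tokenize(sentence)
print(len(set(tokens)), len(tokens), round(ttr(sentence), 2))
```

    Lowercasing before counting is a design choice: without it, a sentence-initial “The” and a later “the” would count as two types.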
 
    However, this rule applies only with the reservations expressed by words such as ‘likely’ and ‘indirectly.’ We should also add another implication to that of A.Kruger: that there are few repetitions of different types of the same lexeme. As the number of types may be quite large
    due to the many grammatical forms of a lexeme in inflexional or incorporating languages, TTR should be very sensitive to the variety of grammatical
    forms in the text. For instance, in the present indicative an English verbal lexeme has two inflected forms; a Ukrainian or Spanish one has
    six. That is why a high TTR may indirectly indicate not only lexical richness, but also grammatical (morphological) richness. A natural
    question is: which of the factors, lexical or grammatical richness, is more significant in a TTR? In spite of this doubt, it would be hard to deny that
    within the same language a text characterized by a higher TTR is richer from the lexical point of view. However, the same statement is
    questionable when comparing texts in different languages, as usually happens in translation.
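 
    The contrast between two and six present-tense forms can be made concrete with a small sketch. The verbs chosen (to speak, hablar, говорити) and the helper name are illustrative assumptions; each list gives the six person/number slots of the present indicative, so every list has six tokens of one and the same lexeme.

```python
# Present indicative forms in the order 1sg, 2sg, 3sg, 1pl, 2pl, 3pl.
# The verbs are illustrative examples, not taken from the studied corpus.
PARADIGMS = {
    "English (to speak)": ["speak", "speak", "speaks", "speak", "speak", "speak"],
    "Spanish (hablar)": ["hablo", "hablas", "habla", "hablamos", "habláis", "hablan"],
    "Ukrainian (говорити)": ["говорю", "говориш", "говорить",
                             "говоримо", "говорите", "говорять"],
}

def count_types(forms):
    """Number of distinct word forms among the given tokens."""
    return len(set(forms))

type_counts = {language: count_types(forms) for language, forms in PARADIGMS.items()}

for language, n_types in type_counts.items():
    print(f"{language}: {n_types} types for 1 lexeme in 6 slots")
```

    A single lexeme thus contributes two types in English and six in Spanish or Ukrainian, which is the grammatical component of TTR discussed above.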
 
    
        2. Related works and discussion
 
    It has been stated that translated texts in a language differ from their original by a lower TTR (V.Pápai, 157), which can suggest that they are less
    rich lexically.
 
    For instance, V.Pápai, having researched explicitation strategies in translation using four English-Hungarian fiction translations in her work
    “Explicitation: a universal of translated text?”, argues that TTR is lower in translated texts than in non-translated texts in Hungarian
    (V.Pápai, 159). But this does not necessarily mean that TTR should be lower in a translated text compared with the original. For instance, A.Kutuzov,
    from Tyumen State University, shows that in English-Russian translation the TTR becomes higher (A.Kutuzov, 10). Meanwhile, A.Kruger demonstrates that in
    English-German translations the TTR is lower than in the original (the empirical base was four Shakespeare texts) (A.Kruger, 74). So, a preliminary
    theoretical analysis suggests that TTR changes show a noticeable dependence on the language combination. As these changes may also depend on the
    translation direction, in the present research we attempt to examine this hypothesis using both Spanish-Ukrainian and Ukrainian-Spanish
    translations, and try to reveal the regularities of these dependences.
 
    Before introducing our results, we should first consider the strengths and weaknesses of the TTR comparison method as a description of the lexical
    richness of a text. It must be accepted that this method is simple and approximate. It is undeniable that this ratio is sensitive to text or corpus
    length. The longer a text, the more likely it is that words will be repeated, which lowers the ratio; in short texts, therefore, this ratio is not
    representative. Nevertheless, the ratio is widely used, since it can be easily calculated by any text analysis tool and the functioning of these tools does not depend
    on the language system.
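 
    The sensitivity to length can be illustrated by computing the TTR of ever longer prefixes of one text. The sketch below is a toy illustration under stated assumptions: the sample sentence is invented, tokens are obtained by splitting on whitespace, and the function name is hypothetical.

```python
def ttr_by_prefix(tokens, step):
    """Return (prefix_length, TTR) pairs for prefixes growing by `step` tokens."""
    curve = []
    for end in range(step, len(tokens) + 1, step):
        prefix = tokens[:end]
        curve.append((end, len(set(prefix)) / end))
    return curve

# Invented sample text; whitespace tokenization is enough for this toy case.
sample = ("the cat sat on the mat and the dog sat on the rug "
          "while the cat and the dog looked at the mat").split()

curve = ttr_by_prefix(sample, 6)
for length, ratio in curve:
    print(f"first {length:2d} tokens: TTR = {ratio:.2f}")
```

    As the prefix grows, already seen word forms recur and the ratio falls, which is why ratios taken from texts of very different lengths should be compared with caution.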
 
    However, a high number of types does not necessarily mean a high number of lexemes. To be more exact, if we want to calculate the lexical variety of a
    text, we should divide the number of lexemes (i.e. their respective lemmas used in a text) by the number of tokens. Since it is quite time-consuming to count
    the lexemes, their number is usually not taken into account. Incidentally, the number of lemmas can also be calculated by specific
    software known as a lemmatizer, which is designed for every language separately and usually requires the time-consuming processing of a great number of
    morphological rules, exceptions, and vocabulary. It is usually not freeware. Thus, the TTR seems to show easily, indirectly, and roughly the variety of
    words, rather than the lexical richness of a text; it is “a simple indication of the superficial lexical complexity of a text” (Munday 1998:4),
    along with its grammatical complexity, we might add. In spite of the above, we do not by any means deny its theoretical usefulness. A. Kutuzov, for
    instance, after researching the variation of TTR from the original to the translated text, concludes that their graphs are
    extremely similar from chapter to chapter (A.Kutuzov, 8-9). A. Kutuzov’s method by itself can be another useful tool to ‘measure’ the adequacy of translation.
    Unfortunately, we cannot afford to concentrate here on other important and interesting uses of TTR, although they do exist.
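 
    The difference between the types/tokens ratio and the lemmas/tokens ratio described above can be sketched as follows. A real lemmatizer is language-specific software; here it is replaced by a tiny hand-made lookup table, which is purely a hypothetical stand-in, as are the function names and the sample tokens.

```python
# Hypothetical stand-in for a lemmatizer: a hand-made form-to-lemma table.
TOY_LEMMAS = {
    "hablo": "hablar",
    "hablas": "hablar",
    "habla": "hablar",
    "hablamos": "hablar",
}

def types_tokens_ratio(tokens):
    """Distinct word forms divided by running words."""
    return len(set(tokens)) / len(tokens)

def lemmas_tokens_ratio(tokens, lemma_of):
    """Distinct lemmas divided by running words.

    Forms missing from the table are treated as their own lemma.
    """
    lemmas = {lemma_of.get(token, token) for token in tokens}
    return len(lemmas) / len(tokens)

tokens = ["hablo", "hablas", "habla", "y", "hablamos"]
print(types_tokens_ratio(tokens))               # five types over five tokens
print(lemmas_tokens_ratio(tokens, TOY_LEMMAS))  # two lemmas over five tokens
```

    In this toy case every token is a different word form, yet only two lexemes are used, so the two ratios diverge sharply.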
 
    
        3. Hypothesis
 
    As shown above, the number of types in a text may depend on two basic factors: the number of lexemes and the number of different grammatical forms. Hypothetically, in
    inflexional and incorporating languages the TTR should be higher, as the same lexeme will present a large number of types, while in ‘more
    analytic’ languages the TTR should be lower and tend to approach the ‘lemmas/tokens’ ratio, since most lemmas would present only one type
    (number of types ≈ number of lemmas). This hypothesis (hypothesis #1), both plausible and logical, will not, we suppose, present serious contradictions,
    although it is still to be proven in the area of contrastive linguistics. It needs to be tested by comparing original (untranslated) texts in different
    languages with the same or similar content, such as international agreements, constitutions, laws, similar literary genres etc. Nevertheless, as indicated
    above, we have seen a clear dependence of the changes in TTR on the language combination and translation direction. Ch.Ho-Jeong has
    observed in English-Korean and Korean-English translations that several changes, such as contraction/expansion of the text, depend on the
    direction of translation (Ch.Ho-Jeong, 362). E. Kelih, investigating the translation of a Russian novel into 11 Slavic languages (E.Kelih,
    179), implicitly shows that the TTR changes depend on the source and target languages. Incidentally, we deduced this by attentively reading
    his article, because the researcher miscalculated the TTR by confusing the divisor and the dividend.
 
    Our actual hypothesis (hypothesis #2) refers to translation studies, not to contrastive linguistics: when the degree of synthetism of the language increases from the original to the translation, the TTR rises, and, vice versa, when the degree of synthetism decreases from the original to the translation, the TTR decreases. If hypothesis #2 is correct, it may also
    indirectly confirm hypothesis #1.
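 
    The two hypotheses can be written compactly. The symbols below are illustrative notation introduced only for this sketch: V is the number of types, N the number of tokens, L the number of lemmas, and S(·) an informal, not formally defined, degree of synthetism of the source language (SL) or target language (TL).

```latex
% Definitions (notation assumed for this sketch)
\[
\mathrm{TTR} = \frac{V}{N}, \qquad \mathrm{LTR} = \frac{L}{N}
\]
% Hypothesis #1: in more analytic languages most lemmas present one type
\[
V \approx L \;\Longrightarrow\; \mathrm{TTR} \approx \mathrm{LTR}
\]
% Hypothesis #2: the TTR follows the change in the degree of synthetism
\[
S(\mathrm{TL}) > S(\mathrm{SL}) \;\Longrightarrow\;
\mathrm{TTR}_{\mathrm{translation}} > \mathrm{TTR}_{\mathrm{original}},
\qquad
S(\mathrm{TL}) < S(\mathrm{SL}) \;\Longrightarrow\;
\mathrm{TTR}_{\mathrm{translation}} < \mathrm{TTR}_{\mathrm{original}}
\]
```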
 
    
        4. Empirical test
 
    If hypotheses #1 and #2 are correct, the TTR should rise when translating from a more analytic language into an inflexional one and, vice versa,
    decrease when translating from an inflexional or incorporating language into a more analytic one (naturally, there should be
    room for exceptions due to the influence of extralinguistic factors). Of the two languages dealt with here, Spanish is the more analytic
    compared to Ukrainian. After analyzing four Spanish-Ukrainian and four Ukrainian-Spanish fiction translations, we obtained the following results:
 
    Table 1. TTR changes in Spanish-Ukrainian and Ukrainian-Spanish translation.
 
    
        
    | Work | Total types in the original | Total tokens in the original | Types/tokens ratio in the original | Total types in the translation | Total tokens in the translation | Types/tokens ratio in the translation | TTR change | Translator | Confirmation of the hypothesis |
    |---|---|---|---|---|---|---|---|---|---|
    | Spanish-Ukrainian translation | | | | | | | | | |
    | G. García Márquez “El amor en los tiempos del cólera” | 15 352 | 145 108 | 0,1058 | 28 357 | 126 394 | 0,2244 | 0,47 (rises) | V. Shovkun | + |
    | B. Pérez Galdós “Doña Perfecta” | 11 117 | 65 177 | 0,1705 | 15 827 | 54 474 | 0,2905 | 0,59 (rises) | Zh. Konyeva | + |
    | P.A. de Alarcón “El sombrero de tres picos” | 5 572 | 25 768 | 0,2162 | 7 303 | 20 622 | 0,3541 | 0,61 (rises) | Zh. Konyeva | + |
    | P.A. de Alarcón “El sombrero de tres picos” | 5 572 | 25 768 | 0,2162 | 6 881 | 20 117 | 0,3420 | 0,63 (rises) | L. Dobryans’ka, L. Kolesnyk | + |
    | Ukrainian-Spanish translation | | | | | | | | | |
    | І. Франко “Захар Беркут” | 13 352 | 50 372 | 0,2651 | 10 049 | 61 472 | 0,1635 | -0,62 (decreases) | S. Ryzvaniuk | + |
    | М. Коцюбинський “Тіні забутих предків” | 6 197 | 15 766 | 0,3811 | 5 639 | 26 027 | 0,2167 | -0,57 (decreases) | J. Borysyuk | + |
    | О. Довженко “Зачарована Десна” | 6 081 | 15 956 | 0,3811 | 5 523 | 19 828 | 0,2785 | -0,73 (decreases) | R. Hupalo | + |
    | Ю. Яновський “Вершники” | 10 122 | 27 123 | 0,3732 | 8 647 | 38 325 | 0,2263 | -0,6 (decreases) | S. Ryzvaniuk | + |
 
    As we see from Table 1, the TTR rises in all instances of the Spanish-Ukrainian translation direction and decreases in all instances of the
    Ukrainian-Spanish translations of our corpus. This tendency does not seem to depend on the translator.
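 
    The ratios in Table 1 can be recomputed from the type and token counts, as in the sketch below for two of its rows. One point is an assumption: the text does not define the ‘TTR change’ column, and the figures appear consistent with the ratio of the lower TTR to the higher one, with a minus sign marking a decrease. The function `ttr_change` encodes that inferred reading and should not be taken as the author's stated formula.

```python
def ttr(types, tokens):
    """Types-tokens ratio from raw counts."""
    return types / tokens

def ttr_change(ttr_original, ttr_translation):
    """Signed ratio of the lower TTR to the higher one.

    Assumed reading of the 'TTR change' column: positive when the TTR
    rises in translation, negative when it decreases.
    """
    if ttr_translation >= ttr_original:
        return ttr_original / ttr_translation
    return -(ttr_translation / ttr_original)

# (types original, tokens original, types translation, tokens translation)
ROWS = {
    "El amor en los tiempos del cólera (Spanish-Ukrainian)": (15352, 145108, 28357, 126394),
    "Захар Беркут (Ukrainian-Spanish)": (13352, 50372, 10049, 61472),
}

for work, (ty_o, to_o, ty_t, to_t) in ROWS.items():
    original, translation = ttr(ty_o, to_o), ttr(ty_t, to_t)
    change = ttr_change(original, translation)
    print(f"{work}: {original:.4f} -> {translation:.4f}, change {change:+.2f}")
```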
 
    
        5. Data interpretation and generalization
 
    As we can see from Table 1, the results for the eight randomly chosen texts and their respective translations show that TTR rises in Spanish-Ukrainian
    translation and decreases in the opposite direction. This seems to be, if not a universally valid rule, at least quite a clear tendency for this pair of
    languages. As this conclusion is valid solely for Spanish-Ukrainian and Ukrainian-Spanish translations, in order to extrapolate from these
    particular results to the general case, we propose a table that indicates the general tendency. We have gathered several
    researchers’ results in Table 2.
 
    Table 2. TTR changes in translation within different language combinations.
 
    
        
    | No. | Direction of translation | Degree of synthetism of the target language (relative to the source) | TTR | Researcher | Confirmation of the hypothesis |
    |---|---|---|---|---|---|
    | 1 | English-Russian | rises | rises | A. Kutuzov (Kutuzov, 10) | + |
    | 2 | English-German | rises | decreases | A. Kruger (Kruger, 74) | - |
    | 3 | Spanish-English | decreases | decreases | J. Munday (Munday 1998:4) | + |
    | 4 | English-Chinese | decreases | decreases | Y. Tsai (Tsai, 75) | + |
    | 5 | English-Polish | decreases | decreases | R. Uzar (Uzar, 259) | + |
    | 6 | Russian-Macedonian | decreases | decreases | E. Kelih (Kelih, 179) | + |
    | 7 | Russian-Serbian | decreases | decreases | E. Kelih (Kelih, 179) | + |
    | 8 | Russian-Bulgarian | decreases | decreases | E. Kelih (Kelih, 179) | + |
    | 9 | Russian-Slovene | decreases | decreases | E. Kelih (Kelih, 179) | + |
    | 10 | Russian-Croatian | decreases | decreases | E. Kelih (Kelih, 179) | + |
    | 11 | Spanish-Ukrainian | rises | rises | S. Fokin (the present study) | + |
    | 12 | Ukrainian-Spanish | decreases | decreases | S. Fokin (the present study) | + |
    | 13 | Finnish-Russian | decreases | decreases | M. Kopotev (Копотев, 379) | + |
 
    Therefore, the general picture mostly confirms the hypothesis. The exception, row 2 (English-German translation, A. Kruger’s data), may be
    explained by extralinguistic factors. However, we consider that this will remain a tendency and not a general rule, because translation
    can be a conscious process, so that translators could sometimes consciously influence the TTR index for their own reasons, for example, trying
    to show the richness of their vocabulary or that of their native language, while it is quite absurd to imagine that a translator would try to artificially increase the number of grammatical forms in the translated text.
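 
    The tally behind “mostly confirms” can be checked mechanically. In the sketch below each row of Table 2 is transcribed as the direction of translation, the reported change in the degree of synthetism, and the reported change in TTR; a row counts as confirming hypothesis #2 when the two changes point the same way. The variable names are illustrative.

```python
# (direction of translation, change in synthetism, change in TTR), as in Table 2.
TABLE_2 = [
    ("English-Russian", "rises", "rises"),
    ("English-German", "rises", "decreases"),
    ("Spanish-English", "decreases", "decreases"),
    ("English-Chinese", "decreases", "decreases"),
    ("English-Polish", "decreases", "decreases"),
    ("Russian-Macedonian", "decreases", "decreases"),
    ("Russian-Serbian", "decreases", "decreases"),
    ("Russian-Bulgarian", "decreases", "decreases"),
    ("Russian-Slovene", "decreases", "decreases"),
    ("Russian-Croatian", "decreases", "decreases"),
    ("Spanish-Ukrainian", "rises", "rises"),
    ("Ukrainian-Spanish", "decreases", "decreases"),
    ("Finnish-Russian", "decreases", "decreases"),
]

# A row confirms hypothesis #2 when synthetism and TTR change in the same direction.
confirmed = [pair for pair, synthetism, ttr in TABLE_2 if synthetism == ttr]
exceptions = [pair for pair, synthetism, ttr in TABLE_2 if synthetism != ttr]

print(f"{len(confirmed)} of {len(TABLE_2)} language combinations confirm the hypothesis")
print("exceptions:", exceptions)
```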
 
    We cannot deny that translated texts show a lower TTR than non-translated texts in the same language, but ‘the third code,’ evidently, is not the only
    factor that influences the changes in TTR in translation; the typological differences between the source language and the target language turn out to be
    a much more powerful factor.
 
    
        6. Conclusion
 
    TTR can be a useful parameter for comparing translations with the respective originals from the practical and theoretical points of view. Changes in the TTR in
    translation can indirectly indicate modifications in the lexical variety; thus, it can be important for roughly evaluating this aspect of the adequacy of
    translation, as well as the translator’s and the author’s idiostyle. More important, from our point of view, is its theoretical significance. Apart from
    the reported universal that a translated text is characterized by a lower TTR than non-translated texts in the same language (and is consequently less varied lexically), the change in the TTR in
    translation follows a common tendency. When translating from an analytic language into a more synthetic one, the TTR rises; when translating in the opposite
    direction, it decreases. While this is a strong tendency, it is not a universal law because of the strong influence of extralinguistic factors. In order
    to make this kind of research more precise, the ratio of the number of lemmas used to the number of tokens (lemmas/tokens ratio) should be applied when
    evaluating the lexical richness of a text in the original and the translation, although this method is further complicated by the lack of lemmatizers for
    several languages and their high cost.
 
    
        References
 
    Baker, M. (1996). Corpus-based translation studies: The challenges that lie ahead. In Terminology, LSP, and translation: studies in language
    engineering in honour of Juan C. Sager. – Amsterdam: John Benjamins. – P. 175-186.
 
    Dovzhenko, A. (1972). El Desná encantado / Traducido por R. Hupalo. – Kiev: Dnipro. – 86 p.
 
    Franko, I. (1983). Zakhar Bérkut / Traducido por S. Ryzvaniuk. – Kiev: Dnipro. – 199 p.
 
    García Márquez, G. (1986). El amor en los tiempos del cólera. – La Habana: Arte y literatura sólo para Cuba. – 460 p.
 
    Ho-Jeong, Ch. (2006). Target Text Contraction in English-into-Korean Translations: A Contradiction of Presumed Translation Universals? In Meta:
    journal des traducteurs, vol. 51 – n° 2. – P. 343-367.
 
    Janovskyj, J. (1982). Los jinetes. / Traducido por S. Ryzvaniuk. – Kiev: Dnipro. – 127 p.
 
    Kelih, E. (2009). Preliminary Analysis of a Slavic Parallel Corpus. In NLP, Corpus Linguistics, Corpus Based Grammar Research. Fifth
    International Conference Smolenice, Slovakia, 25-27 November 2009. – Bratislava: Tribun. – P. 175-183. (Accessed online on 21 November
    2012 at http://www.uni-graz.at/emmerich.kelih/Publikationen/2009_slovko_slavic_parallel_corpora_kak_zakaljalas_stal_kelih.pdf)
 
    Kotsiubinskiy, M. (1972). La sombra de los antepasados olvidados y otros relatos / Traducido del por J. Borysiuk. – K.: Dnipro. – 330 p.
 
    Kruger, A. (2002). Corpus-based translation research: its development and implications for general, literary and Bible translation. In Acta
    Theologica, Supplementum 2. – P. 70-106.
 
    Kutuzov, A. (2010). Change of word types to word tokens ratio in the course of translation (based on Russian translations of K. Vonnegut's novels).
    In International Computational Linguistic Conference “Dialog-21”. (Accessed online on 21 November 2012 at http://arxiv.org/ftp/arxiv/papers/1003/1003.0337.pdf)
 
    Munday, J. (1998). A computer-assisted approach to the analysis of translation shifts. In Meta: journal des traducteurs, vol. 43, n° 4. –
    P. 542-556.
 
    Pápai, V. (2004). Explicitation: a universal of translated text?
    In Translation universals: Do they exist? / Edited by Anna Mauranen, Pekka Kujamäki. – Amsterdam: John Benjamins. – P.
    145-164.
 
    Pérez Galdós, B. (1964). Doña Perfecta. – Москва: Радуга. – 276 с.
 
    Rhea, P. (2007). Language disorders from infancy through adolescence: assessment & intervention, 3rd edn. – St. Louis:
    Mosby/Elsevier. – 784 p. 
 
    Tsai, Y. (2010). Text Analysis of Patent Abstracts. In The Journal of Specialised Translation. – Issue 13. – National Taiwan University.
    – P. 61-80. (Accessed online on 21 November 2012 at http://www.jostrans.org/issue13/art_tsai.pdf)
 
    Uzar, R. (2002). A Corpus Methodology for Analysing Translation. In Cadernos de Tradução. – Universidade Federal de Santa Catarina.
    – P. 237-265.
 
    Аларкон, П.-А. (1958). Трикутний капелюх / пер. Л. Добрянської і Л. Колесник. – К.: Державне видавництво художньої літератури. – 80 с.
 
    Аларкон, П.-А. (1983). Трикутний капелюх / пер. Ж. Конєвої. – К.: Дніпро. – 176 с.
 
    Ґарсія Маркес, Ґ. (1999). Кохання в час холери. – Львів: Класика. – 346 с.
 
    Довженко, О. (1957). Зачарована Десна. Кіноповісті. – Київ: Радянський письменник. – С. 459-507.
 
    Копотев, М. (2010). Я никогда не буду так говорить. Языковая компетенция и языковая рефлексия американской финки из СССР.
    In Slavica Helsingiensia 40 Instrumentarium of Linguistics. Sociolinguistic Approach to Non-Standard Russian. – Helsinki. (Accessed online on 21
    November 2012 at http://www.helsinki.fi/slavicahelsingiensia/preview/sh40/pdf/26-sh40.pdf)
 
    Коцюбинський, М. (1989). Тіні забутих предків. In Подарунок на іменини. Оповідання, новели, повісті. – К.
 
    Перес Гальдос, Б. (1978). Донья Перфекта; Сарагоса / пер. Ж. Конєвої. – Київ: Дніпро. – 350 с.
 
    Франко, І. (1994). Захар Беркут: Роман / Микола Костомаров. Чернигівка: Повість. – Київ: Укр. Центр духовної культури. – 312 с.
 
    Яновський, Ю. (1984). Оповідання, романи, п'єси. – Київ: Наукова думка. – 578 с.
 |