Corpus and pilot grammar of the Slovene Sign Language

COORDINATOR: dr. Š. Vintar
DURATION: 2011 - 2014

The Slovene Sign Language Corpus and Grammar project will build a balanced and representative corpus of annotated video recordings of Slovene Sign Language (SSL) and then use this corpus to produce, first, a standardised inventory of SSL signs and their glosses and, second, a pilot grammar of SSL. Data collection will take existing materials as a starting point and then add new recordings from a large, balanced sample of native signers in Slovenia. The corpus will include both spontaneous and prepared signing, from signers representing all age, gender and regional groups. The data will be annotated by hand with glosses using the ELAN toolbox, and the resulting corpus of glosses will then be semi-automatically part-of-speech tagged. In the second step we shall construct a frequency list of signs and determine a core standardised inventory of SSL signs. Furthermore, the data will be linguistically analysed in order to establish common syntactic patterns and to explore selected areas of SSL semantics and pragmatics. These findings will then be used to create a pilot grammar of SSL. The project will collaborate with the main Slovenian actors in this field: the Research Institute of the Slovenian Academy of Sciences (ZRC SAZU) as project partner, and the Association of the Deaf and Hard of Hearing of Slovenia, the Association of Interpreters of SSL of Slovenia and the Slovenian Institute for the Deaf and Hard of Hearing as users.
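The step from annotated glosses to a frequency list and a core inventory can be illustrated with a minimal sketch. It assumes the glosses have already been exported from the annotation tool as a plain list of strings. The function names, the sample glosses and the frequency threshold are hypothetical and are not taken from the project, which does not state its selection criterion.

```python
from collections import Counter


def gloss_frequency_list(annotations):
    """Return (gloss, count) pairs, most frequent first, ties in alphabetical order.

    Glosses are normalised to upper case and empty annotations are skipped.
    """
    counts = Counter(g.strip().upper() for g in annotations if g.strip())
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def core_inventory(freq_list, min_count):
    """Keep glosses that occur at least min_count times (hypothetical criterion)."""
    return [gloss for gloss, count in freq_list if count >= min_count]


# Invented sample glosses, for illustration only.
sample = ["HOUSE", "GO", "house", "I", "GO", "GO", " "]
freq = gloss_frequency_list(sample)
core = core_inventory(freq, min_count=2)
print(freq)
print(core)
```

A raw frequency cut-off is only one possible criterion. A real inventory would presumably also take into account how signs are distributed across signers and regions.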

Health Care Interpreting in Slovenia

COORDINATOR: dr. V. Gorjanc
DURATION: 2010 - 2013

In view of the ever greater migration of different ethnic and linguistic groups, not only within the enlarged EU but also from communities outside European borders, establishing communication in social services is becoming one of the most pertinent problems of contemporary societies. The challenge of establishing successful communication in medical settings is seen as the most pressing issue in the majority of EU member states, especially after the last enlargement. Since Slovenia became a member of the EU, it has turned into a country of increasing immigration (economic and political). Migrants now come from linguistic environments that are unfamiliar to the Slovene general public. Many of them come into contact with Slovene health service providers but cannot establish successful communication, which leads to longer, sometimes even inappropriate treatment and to higher costs. The virtual absence of healthcare interpreting in Slovenia reflects an insufficient legal basis that would organize the field in an integrated way and thus enable its further development. Currently, communication in healthcare settings with speakers of languages that the medical personnel do not traditionally understand is managed only through improvisation and the goodwill of all parties involved. The need for interpreting in health care is consequently known only to those who experience the lack of it. The proposed research aims to respond to this growing need. On the one hand, it will analyze the state of the art of public service interpreting (PSI) in Slovenia and attempt to raise awareness of the need for PSI among Slovene healthcare stakeholders; on the other hand, it will take a proactive approach by fulfilling all the necessary conditions for implementing a training programme for healthcare interpreters that corresponds to the specific needs of Slovenia.
Thus, it shall respond to a clear demand, since more and more Slovene healthcare providers have to address the needs of patients who speak neither Slovene nor any other language in which healthcare providers can communicate. The main research objectives of the proposed project therefore are: 1. a review and analysis of the state of the art of public service interpreting in Slovenia; 2. compilation of a list of literature dealing with healthcare interpreting and related issues in Slovenia, and a critical discussion of the legislation dealing with public service interpreting; 3. exploration of the possible use of IT tools facilitating interpreting in Slovene healthcare settings; 4. awareness-raising activities to inform healthcare stakeholders, healthcare providers and users of healthcare services of the need for healthcare interpreting, and to establish the ground for that activity in Slovenia; 5. design of a curriculum for healthcare interpreting for Slovenia and preparation of all the documentation; 6. implementation and evaluation of a curriculum for healthcare interpreters, preparation of teaching material, and selection and training of trainers; 7. design of a proposal on how to organize a network for the provision of healthcare interpreting in Slovenia; 8. dissemination of the results of the project. The proposed project group shall include researchers employed by the Department of Translation Studies, University of Ljubljana, who were also actively involved in the European LLL Project MedInt - Development of a curriculum for medical interpreters (134007-LLP-2007-AT-GRUNDTVIG-GMP), and by the Department of Slavonic Studies, since the most numerous potential users of interpreting in the healthcare system are speakers of the languages of South-Eastern Europe. The partners of the consortium also include researchers from the most important medical institution in Slovenia, the University Medical Centre in Ljubljana.


Slovene Translation Studies: Sources and Research

DURATION: 2009 - 2012
The project has a significant impact on the future development of translation studies and linguistics in Slovenia and elsewhere. Slovene translation scholars have gained their first translational corpus, which is an invaluable resource for exploring the properties of translations as compared to originals. Our corpus is a particularly interesting resource because it contains contemporary literary works that are available neither in other similar collections nor on the web. The initial research results yielded by the corpus are presented in the collected volume Slovenski prevodi skozi korpusno prizmo (Slovene translations from the corpus-based perspective). They demonstrate some surprising and novel properties of translations, and in particular they show a semantic variety and linguistic creativity that is in no way inferior to that of original texts. Extensive research studies and comparisons will continue in the future and may show differences in translation strategies between different author/translator generations, genres and publisher policies.
The project results are directly relevant for academic institutions involved in training future translators. Translation strategies are often subconscious, and when we explore them with the help of quantitative data derived from representative text collections, such subconscious beliefs about translation equivalence often need to be revised. Studies performed so far show a large discrepancy between the translation equivalents listed in bilingual dictionaries and those found in the corpus. In translator training, a corpus can help develop a selective and critical attitude towards classical reference works, as well as self-reliance and originality in seeking translation solutions for cases where equivalence cannot be ensured on the same linguistic level and compensation strategies are called for. The development of translation studies in Slovenia will reach a new level and will become more directly comparable with that for some other languages, for which similar projects and studies have already been performed. In the long run we expect the project to have an impact on related disciplines, because the SPOOK corpus is an interesting resource for contrastive linguistics, literary studies, cultural studies and language technologies.


Slovene Terminology Portal

COORDINATOR: dr. V. Gorjanc
DURATION: 2007 - 2009
Web page: STP
Particularly in the European environment, the information society has become a major challenge, which has prompted a lasting interest in formulating principles and methods for transferring language-related information. The recognition that free communication can be realized only in one's own mother tongue led to the widely accepted principle of assuring the creative freedom of each individual in his/her own language, together with the possibility of exchanging information with other languages. This is true for all domains of human activity and creativity; in a knowledge-based society, however, scientific and professional communication, where one is confronted with so-called specialized knowledge, is particularly emphasized. Terminology management is very important for this type of communication, with a specialized dictionary for each specific professional or scientific field lying at its core.


Language-independent methods for automatic construction of semantic lexicons from comparable corpora

DURATION: 2010 - 2012

As the amount and importance of electronic documents increase, handling them efficiently without computer support is becoming unfeasible. That is why numerous computer applications have been developed that classify documents into groups according to their content, retrieve information from large text collections, generate abstracts of long texts, translate documents from one language to another, etc. However, such technological solutions require a certain degree of language understanding. This can be achieved with collections in which human knowledge is organized in a way that enables computers to access the meaning of words and phrases and understand the relations between them. One of the most popular concept-based lexicons that organize concepts into a network with lexical and semantic relations is wordnet (Fellbaum 1998). It was first developed for English, after which wordnets for more than 50 other languages were constructed. Among them is a wordnet for Slovene, which I developed during my PhD. With the proposed post-doctoral research I wish to continue working on the Slovene semantic lexicon; the aim of the research is to develop a methodology for constructing wordnets from comparable corpora. Comparable corpora are becoming an increasingly popular resource in computational, corpus and contrastive linguistics, as well as in translation studies. While parallel corpora are a preferable resource, only a few are available, and these are typically very limited in size, language pairs and domains (McEnery and Xiao 2006). In the post-doctoral project I propose to focus on Wikipedia as a comparable corpus.
Under the premise that articles which describe the same concept use very similar words to do so, it is possible to take advantage of the standardized article structure, keywordness, cognates and other statistical measures in order to estimate translation equivalence between words in different languages and extract a multilingual lexicon from the comparable corpus (Sharoff 2008). Such an approach can handle monosemous as well as polysemous vocabulary and is also suitable for multi-word terms and named entities, which change too rapidly to be suitably represented by traditional dictionaries and glossaries. The proposed post-doctoral research project consists of several phases. In the first phase, I will transform Wikipedia into a multilingual comparable corpus. The second phase of the project is the extraction of the multilingual lexicon from the corpus from the Slovene part of Wikipedia. The extracted lexicon entries will then be assigned a wordnet id, and Slovene synsets will be generated. In the third phase, the method and the constructed resource will be evaluated in an application for automatic word-sense disambiguation. The results of the evaluation will give insight into the suitability of the constructed Slovene semantic lexicon for practical tasks. To my knowledge, there has been no research into comparable corpora in the field of Slovene lexical semantics. This is why the proposed project represents an important milestone in Slovene corpus linguistics as well as in human language technologies. Not only will the result of the project be an established, tested and language-independent methodology for the extraction of translation equivalents from comparable corpora; the project will also bring a tangible result in the form of a Slovene semantic lexicon that is aligned to wordnets in many other languages and therefore useful for mono- as well as multilingual applications.
In this way, the developed wordnet will bridge the gap in the field of Slovene language resources and provide the foundations for a broader, semantically-enriched exploitation of Slovene corpus resources.
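As an illustration of how translation equivalence might be estimated from a comparable corpus, the sketch below combines two of the signals mentioned above: co-occurrence of words in articles that describe the same concept, and cognate similarity. It is a toy example under stated assumptions and not the project's actual method. The article alignment by shared ids, the Dice coefficient, the string-similarity measure from Python's difflib, the weighting and all function names are assumptions made for the example.

```python
from difflib import SequenceMatcher


def index_words(corpus):
    """Map each word to the set of article ids in which it occurs."""
    index = {}
    for article_id, words in corpus.items():
        for word in set(words):
            index.setdefault(word, set()).add(article_id)
    return index


def dice(articles_a, articles_b):
    """Dice coefficient between two sets of aligned article ids."""
    if not articles_a or not articles_b:
        return 0.0
    return 2 * len(articles_a & articles_b) / (len(articles_a) + len(articles_b))


def cognate_similarity(word_a, word_b):
    """Surface string similarity in [0, 1], used as a rough cognate signal."""
    return SequenceMatcher(None, word_a.lower(), word_b.lower()).ratio()


def best_equivalent(word, src_index, tgt_index, cognate_weight=0.3):
    """Return the target word with the highest combined score, or None."""
    src_articles = src_index.get(word, set())
    scored = []
    for candidate, tgt_articles in tgt_index.items():
        score = ((1 - cognate_weight) * dice(src_articles, tgt_articles)
                 + cognate_weight * cognate_similarity(word, candidate))
        scored.append((score, candidate))
    return max(scored)[1] if scored else None


# Invented toy corpora: articles with the same id describe the same concept.
sl_articles = {1: ["korpus", "besedilo"], 2: ["korpus", "jezik"], 3: ["jezik", "slovnica"]}
en_articles = {1: ["corpus", "text"], 2: ["corpus", "language"], 3: ["language", "grammar"]}

sl_index = index_words(sl_articles)
en_index = index_words(en_articles)
print(best_equivalent("korpus", sl_index, en_index))
print(best_equivalent("jezik", sl_index, en_index))
```

In this toy setting the cognate signal reinforces the pair korpus/corpus, while jezik/language is found from article co-occurrence alone. Real data would need frequency filtering and a treatment of polysemy, neither of which this sketch attempts.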

The task of the translator as the common hermeneutical task

DURATION: 2001 - 2004

The research project will attempt to show that every translation is the result of a particular understanding of the meaning of the original text. Translation will thus denote any comprehension or appropriation of the text in which readers or translators translate the contents of the text and its textual world from the realm of otherness into the domestic realm. The task of the translator, as understood in this project, is thus not only the remodelling of the text while transferring it from one linguistic code to another but also the transformation of otherness. As such, the task of the translator does not differ from any hermeneutical act. On the one hand, the research will focus on the rewordings or intralinguistic translations of English medieval texts, which, like translations from foreign languages, reveal the translator's specific reading of the text. On the other hand, the research will investigate various interpretations by translators working from one language to another, and attempt to show that the individual abilities of a particular translator are much more important and relevant for the quality of the translation than socio-cultural factors such as the translator's nationality or mother tongue. Since every critical approach to translations, and the awareness that every translation is an interpretation, demands theoretical discourse, one of the main aims of the project is also to create a work that presents to the Slovene readership a selection of the principal theoretical writings on translation. Researcher: Nike Kocijančič Pokorn

Contrastive analysis of discourse strategies in Slovene and French

COORDINATOR: dr. M. Š. Brezar
DURATION: 2001 - 2004

The analysis of spoken and written discourse, that is, of text in the context in which it is uttered or written, yields new knowledge about how communication is conceived. This approach, widespread in Anglo-Saxon and French linguistics, has had only a small impact on teaching and translating in Slovenia.
The present research deals with the analysis of spoken and written discourse. It will proceed in three modules: a phonological, a morphosyntactic and a discourse module. The research is based on the hypothesis that discourse itself contains mechanisms that give rise to strategies used by speakers to achieve linguistic and extralinguistic goals in communication. Strategies are a part of all modules and are realised mostly interclausally and intertextually. That is why the pragmatic approach will be used.
The analysis of texts in context will be carried out on comparable, typical recorded and transcribed texts in French and Slovene. This will lead to a general sociocultural comparative study. The results of the analysis will be compared with results from the existing corpora of written texts Trésor de la langue française and Fida (corpus of Slovene).
The phonological module will include comparative research in segmental phonetics and prosody. This research enables a formal description of separate levels in different language systems. The most remarkable difference between Slovene and French is seen at the prosodic level, which includes accentuation and intonation. Besides intonation, accent in French plays an important role in the formation of discourse strategies.
The morphosyntactic module includes the analysis of syntactic and morphological facts whose realisation depends on the context, namely definiteness, word order, and the use of connectives, particles and adverbs in Slovene and French. The results of the contrastive analysis will help to correct the mistakes Slovenes make while learning or translating. A study of the mistakes Slovenes make in learning French will also be carried out.
In the discourse module, the emphasis will be on argumentative strategies, which are a part of a speaker's linguistic consciousness. The importance of modality, which is expressed in French mostly with adverbs, sometimes combined with connectives, and in Slovene with modal particles, will be shown. This module also provides for the application of the results in teaching at the university level.