Open Access | BASE | 2007

From the corpus to the dictionary. Automatic production of a multilingual information management tool ; Du corpus au dictionnaire. Réalisation automatique d'un outil de gestion de l'information multilingue

Abstract

In this article, we propose an automatic method for constructing multilingual lexico-semantic resources that make it possible to navigate by meaning through the information contained in text collections in different languages. The method is based on a mathematical model of meaning representation called Atlas sémantiques, which exploits linguistic relationships between lexical units to construct graphs. These graphs are projected into a semantic space, yielding a map that denotes the sense trends of a given word. Starting from a morpho-syntactic analysis of a corpus, and using the syntactic relationships between the items in the corpus, it is possible to build a lexico-semantic resource that describes all the meanings attested in the corpus for all the words represented there, by means of the syntactic contexts typical of the entries described. It is also possible to maintain a systematic link between the sense trends depicted and the statements used to construct them, and thus to connect all the instances of a word in a given sense and navigate between them. Finally, using corpora in different languages, it is possible to build matching resources across languages and to navigate between texts through the translation, even partial, of syntactic contexts.

In this article, we propose an automatic process to build multilingual lexico-semantic resources. The goal of these resources is to browse semantically the textual information contained in texts of different languages. This method uses a mathematical model called Atlas sémantiques in order to represent the different senses of each word. It uses the linguistic relations between words to create graphs that are projected into a semantic space. These projections constitute semantic maps that denote the sense trends of each given word. This model is fed with syntactic relations between words extracted from a corpus. Therefore, the lexico-semantic resource produced describes all the words and all their meanings observed in the corpus. The sense trends are expressed by ...
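
As a rough illustration of the pipeline the abstract outlines (syntactic relations in, sense trends linked back to their source statements out), here is a minimal Python sketch. It is not the authors' implementation. The toy triples, the function names `neighbours` and `sense_trends`, and the grouping rule are all assumptions made for the example. In particular, connected components of a small context graph stand in for the projection into a semantic space, which this sketch does not attempt.

```python
from collections import defaultdict

# Hypothetical toy input: (head, relation, dependent, sentence_id) triples,
# of the kind a dependency parser could produce. Not data from the article.
TRIPLES = [
    ("play", "obj", "piano", 0),
    ("play", "obj", "guitar", 1),
    ("play", "obj", "match", 2),
    ("win", "obj", "match", 3),
    ("tune", "obj", "piano", 4),
    ("tune", "obj", "guitar", 5),
]


def neighbours(triples):
    """Undirected adjacency: word -> words it is syntactically related to."""
    adj = defaultdict(set)
    for head, _rel, dep, _sid in triples:
        adj[head].add(dep)
        adj[dep].add(head)
    return adj


def sense_trends(word, triples):
    """Group the syntactic contexts of `word` and keep their sentence ids.

    Assumed rule: two context words belong to the same trend when they
    share a syntactic neighbour other than `word` itself.
    """
    adj = neighbours(triples)
    contexts = sorted(adj[word])

    # Build the graph over context words.
    links = {c: {c} for c in contexts}
    for i, a in enumerate(contexts):
        for b in contexts[i + 1:]:
            if (adj[a] & adj[b]) - {word}:
                links[a].add(b)
                links[b].add(a)

    # Each connected component is treated as one sense trend.
    seen, trends = set(), []
    for start in contexts:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(links[node] - group)
        seen |= group
        # Keep the link back to the statements that produced this trend.
        sentences = sorted(
            sid
            for head, _rel, dep, sid in triples
            if (head == word and dep in group) or (dep == word and head in group)
        )
        trends.append((sorted(group), sentences))
    return trends


if __name__ == "__main__":
    for context_words, sentence_ids in sense_trends("play", TRIPLES):
        print(context_words, sentence_ids)
```

On the toy data, "play" splits into a trend for musical instruments and a trend for matches, each carrying the ids of the sentences it came from, which is what would allow navigation between all instances of a word in a given sense.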
