Summary: | A novel approach to Machine Translation (MT), called <i>Shake-and-Bake</i>, is presented. It exploits recent advances in Computational Linguistics, in particular the rise of lexicalist unification-based grammar theories. It is argued that the approach overcomes many deficiencies of current methods, such as those based on transfer rules, interlingual representations, and isomorphic grammars. The key advantage is greater modularity of the monolingual components, which can be written largely independently of each other, using purely monolingual considerations. The same components can be used for both parsing and generation, and can be reused in multi-lingual translation systems. The two monolingual components involved in a translation are put into correspondence by means of a bilingual lexicon, which contains information similar to what one might expect to find in an ordinary bilingual dictionary. The approach is demonstrated with very different Unification Categorial Grammars for small fragments of English and Spanish, linked by such a bilingual lexicon. Although their coverage is small, the fragments have been chosen to contain linguistically interesting phenomena known to be difficult in MT, such as word order variation and clitic placement. The <i>Shake-and-Bake</i> approach to MT consists of three steps: parsing the Source Language sentence in any usual way, looking up the words in the bilingual lexicon, and generating from the set of translations of these words. Generation leaves the relative word order to be instantiated by the Target Language grammar, taking advantage of the fact that the parse produces lexical and phrasal signs which are highly constrained (specifically in their semantics). The main generation algorithm presented is a variation on the well-known CKY algorithm used for parsing.
|