Publications |
| Language Identification |
Applying Monte Carlo Techniques to
Language Identification |
| A paper presented at CLIN 2001. In this paper,
we introduce a new language identification technique that is
based on Monte Carlo sampling. We show that, by determining
the language of a large enough number of random features, we
can determine the document language to be the language that
results most often from these features. Whether the number of
samples is sufficiently large can be determined by calculating
the standard error of the samples. Finally, we discuss some
pilot experiments in which we compare this new technique with
others. |
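The sampling idea in the abstract above can be sketched roughly as follows. This is a minimal illustration and not the implementation from the paper: the choice of character trigrams as features, the relative-frequency lookup in `feature_language`, the use of the binomial standard error of the leading language's vote share, and all names and thresholds (`build_profile`, `identify`, `max_se`, `min_samples`) are assumptions made for this example.

```python
import math
import random
from collections import Counter


def build_profile(text):
    """Map each character trigram of a training text to its relative frequency."""
    counts = Counter(text[i:i + 3] for i in range(len(text) - 2))
    total = sum(counts.values())
    return {tri: c / total for tri, c in counts.items()}


def feature_language(feature, profiles):
    """Return the language whose profile scores the feature highest, or None if unseen."""
    best_lang, best_score = None, 0.0
    for lang, profile in profiles.items():
        score = profile.get(feature, 0.0)
        if score > best_score:
            best_lang, best_score = lang, score
    return best_lang


def identify(document, profiles, max_se=0.05, min_samples=30,
             max_draws=5000, seed=0):
    """Sample random trigrams until the leading language's share is stable.

    Stops once the standard error of the leader's vote share drops to
    max_se (assumed stopping rule). Returns (language, samples_used).
    """
    if len(document) < 3:
        return None, 0
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(max_draws):
        start = rng.randrange(len(document) - 2)
        lang = feature_language(document[start:start + 3], profiles)
        if lang is None:
            continue  # feature unknown to every profile: no vote
        votes[lang] += 1
        n = sum(votes.values())
        if n >= min_samples:
            winner, count = votes.most_common(1)[0]
            p = count / n
            if math.sqrt(p * (1 - p) / n) <= max_se:
                return winner, n
    if not votes:
        return None, 0
    return votes.most_common(1)[0][0], sum(votes.values())


if __name__ == "__main__":
    # Toy training texts; real profiles would be built from large corpora.
    profiles = {
        "en": build_profile("the quick brown fox jumps over the lazy dog "
                            "and then the cat sat on the mat"),
        "nl": build_profile("de snelle bruine vos springt over de luie hond "
                            "en dan zit de kat op de mat"),
    }
    print(identify("the lazy dog and then the cat sat on the mat", profiles))
```

In this sketch the number of samples is not fixed in advance: sampling continues until the estimate is stable enough, which is one way to read the role of the standard error described in the abstract.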
| Data-Oriented Translation |
Data-Oriented Translation |
| A paper presented at COLING 2000. |
| Data-Oriented Translation : Using the Data-Oriented Parsing framework for Machine Translation |
| The final version of my master's thesis. |
| Data-Oriented Translation |
|
A first paper that studies the Data-Oriented Translation method.
This paper was the topic of a presentation held at CLIN 1998 in
Leuven.
|
| DOT Implementation notes |
|
The implementation written for the paper mentioned above was programmed in C++.
It was documented with the fine utility doc++.
|
| JavaDOT Implementation notes |
|
Recently, a new implementation was made in Java. Although Java is not quite as fast as C++, it has numerous other advantages, and Javadoc is one of them.
|
| Examining the Cognitive Aspects of Human and Machine Translation |
|
In this paper, we will examine human translation: what type of
knowledge is required, and what stages of competence in translation can be
found. Then, we will give a short introduction to MT, and we will
examine if and how MT systems correspond to the cognitive aspects of human
translation. Finally, we take a specific MT system (the Data-Oriented
Translation system), and explain that this system does adhere to some of the
human cognitive aspects of translation. |