Logofag-Multiterm

Project report

Customer: pwn.pl - PWN AI - LSP
Objective: Automatically compiled dictionaries
Year deployed: Logofag - 2009, Multiterm - 2012

Logofag and Multiterm are electronic dictionaries that cover a much wider range of language than traditional paper dictionaries. They include colloquial vocabulary and borrowings from other languages, which appear in great numbers in contemporary texts. Many words popularly used in online texts and dialogues are nowhere to be found in “standard” Polish dictionaries.

The problem

At the beginning of the 21st century, paper dictionaries began to fall out of favour, both among professional translators and writers of texts in foreign languages, and among ordinary readers. There were several reasons for this: computer dictionaries are generally cheaper (and often free), they are updated more frequently, and they are more convenient to use, as they offer features such as copying and pasting text.

The greatest drawback of free dictionaries is their dubious accuracy: the Internet, unlike reputable sources, takes no responsibility for the quality of translations. The dictionary user is thus faced with a dilemma: convenience and low cost, or reliability?

The solution

Our goal was to create dictionaries that would combine wide coverage of the contemporary language with high standards of accuracy. We developed mechanisms that automatically searched multilingual text corpora for words and phrases and their translations. To ensure accuracy, every translation we found had to be confirmed in at least two independent lexicographic sources. We observed that some dictionaries had been compiled by copying and pasting text from other publications; we did not count these as independent sources.
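The confirmation rule described above can be illustrated with a minimal Python sketch. It is not the original implementation: the function name, the data layout, and the sample entries are all hypothetical, and it assumes that each lexicographic source has already been assigned to a "family" of publications that copy from one another. A translation pair is kept only if it is attested in at least two different families.

```python
def confirmed_pairs(candidates, attestations, source_family, min_families=2):
    """Keep candidate pairs attested by enough independent sources.

    candidates    -- iterable of (source_term, target_term) pairs found in corpora
    attestations  -- dict mapping a pair to the set of sources that list it
    source_family -- dict mapping a source to its family label; sources that
                     copy from one another share a label (hypothetical input)
    """
    confirmed = []
    for pair in candidates:
        # Collapse sources that copy from each other into a single family.
        # A source missing from source_family is treated as its own family.
        families = {source_family.get(s, s) for s in attestations.get(pair, ())}
        if len(families) >= min_families:
            confirmed.append(pair)
    return confirmed


# Hypothetical sample data, for illustration only.
source_family = {"dict_a": "family_1", "dict_b": "family_1", "dict_c": "family_2"}
candidates = [
    ("komórka", "mobile phone"),
    ("komórka", "cell"),
    ("ogarnąć", "get a grip"),
]
attestations = {
    ("komórka", "mobile phone"): {"dict_a", "dict_b"},  # same family: one vote
    ("komórka", "cell"): {"dict_a", "dict_c"},          # two families
    ("ogarnąć", "get a grip"): {"dict_c"},              # a single source
}

if __name__ == "__main__":
    for pair in confirmed_pairs(candidates, attestations, source_family):
        print(pair)
```

In this sketch only the pair attested by two different families survives. Deciding which sources belong to the same family is the hard part in practice, and it is outside the scope of this example.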

Using this procedure, we generated the following dictionaries under the common name Logofag:

  • Polish–English: general
  • Polish–English: science and technology
  • Polish–English: IT
  • Polish–English: legal and business
  • Polish–German: general
  • Polish–German: science and technology
  • Polish–French: general

These dictionaries were several times larger than traditional publications; for example, the general Polish–English dictionary contained around 540,000 translation pairs, compared with only about 200,000 for its traditional counterpart (74,000 words and about 125,000 phrases).

The Logofag dictionaries were published on CDs by PWN Publishers (pwn.pl) and sold in multimedia stores. They also served as a lexical database for the Translatica automatic translation system (translatica.pl), which was also created by our company (under its former name Poleng).

In 2012, in conjunction with SDL PLC, the distributor of SDL Trados software, we converted our dictionaries to Multiterm format. Integrated with SDL Trados, they became an essential aid for translators and translation agencies, while at the same time supporting the largest automated translation platforms using neural network-based machine learning (including Google Translate).

Benefits

  • individual user access to reliable, easy-to-use electronic dictionaries with very large numbers of headwords
  • expansion of translation agency software to include general and specialist dictionaries
  • improved quality of translation by automatic systems making use of the dictionaries
  • updating of dictionaries with contemporary terms and their meanings

We were the pioneers of a new era of bilingual dictionaries in the Polish market, offering reliable electronic lexicons of unprecedented size. Perhaps we can also help you to build a pioneering solution? Let’s talk about it.

Technology used

  • HunAlign
  • Arena