Our Blog

How does AI learn to classify texts?

Krzysztof Jassem December 15, 2020

In automatic text classification, a computer system has the job of assigning texts to defined categories. In some applications the system itself decides how to define the categories (classes) of texts. For example, if someone wants to classify the latest news reports, but is not certain what topics they will relate to, they may tell the artificial intelligence (AI) system to divide the set of reports into a specified number of classes (10, say). The system will then group the reports in such a way that each category contains texts that use similar vocabulary.
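As a rough illustration of grouping texts by shared vocabulary, here is a minimal sketch. It is not the method any particular AI system uses: it assumes a simple word-overlap measure (Jaccard similarity) and repeatedly merges the two most similar groups until the requested number of classes remains. The function names and the sample reports are made up for this example.

```python
import re
from itertools import combinations

def vocabulary(text):
    """Return the set of lowercase words used in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Vocabulary overlap: shared words divided by all words (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def group_texts(texts, num_classes):
    """Merge the two most similar groups until num_classes groups remain."""
    # Each group is a pair: (indices of its texts, combined vocabulary).
    groups = [([i], vocabulary(t)) for i, t in enumerate(texts)]
    while len(groups) > num_classes:
        # Pick the pair of groups whose vocabularies overlap the most.
        a, b = max(
            combinations(range(len(groups)), 2),
            key=lambda pair: jaccard(groups[pair[0]][1], groups[pair[1]][1]),
        )
        merged = (groups[a][0] + groups[b][0], groups[a][1] | groups[b][1])
        groups = [g for k, g in enumerate(groups) if k not in (a, b)] + [merged]
    return [sorted(indices) for indices, _ in groups]

# Hypothetical sample data: two sports reports and two finance reports.
reports = [
    "The team won the football match",
    "The striker scored in the football match",
    "The bank raised interest rates",
    "Interest rates rose at the central bank",
]
clusters = group_texts(reports, num_classes=2)
print(clusters)
```

With these four sample reports and two classes, the sports reports end up in one group and the finance reports in the other, even though nobody told the program what the topics were.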

Continue Reading...

How to tell whether artificial intelligence is effective

Krzysztof Jassem December 14, 2020

Artificial intelligence (AI) is a type of computer system whose task is to imitate actions performed by a human.

How can we evaluate whether an AI system performs its task effectively? I will try to answer that question in this post.

Continue Reading...

How do computers analyse words?

Krzysztof Jassem November 30, 2020

Morphological analysis is used to determine how a word is built up. The result of such analysis might be the statement that, for example, the word "classes" is made up of the stem "class" and the ending "-es", which is used in English to make the plural form of certain nouns and the third person singular form of certain verbs. From this information we may deduce that "classes" is likely the plural of a noun (or the third person singular of a verb) whose base form, or lemma, is "class".
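The "classes" example can be sketched in a few lines of Python. This is only a toy, assuming a hand-written table of endings and a tiny lexicon of known stems (both hypothetical); real morphological analysers rely on much larger dictionaries and rule sets.

```python
# Hypothetical rule table: each ending and the readings it may signal.
ENDINGS = [
    ("es", ["plural noun", "third person singular verb"]),
    ("s", ["plural noun", "third person singular verb"]),
]

# Hypothetical mini-lexicon of known base forms (lemmas).
LEXICON = {"class", "cat", "walk"}

def analyse(word):
    """Return every (stem, ending, readings) split whose stem is a known lemma."""
    analyses = []
    for ending, readings in ENDINGS:
        if word.endswith(ending):
            stem = word[: -len(ending)]
            # Keep the split only if the stem is in the lexicon;
            # this rejects "classe" + "-s" for the word "classes".
            if stem in LEXICON:
                analyses.append((stem, "-" + ending, readings))
    return analyses

for stem, ending, readings in analyse("classes"):
    print(stem, ending, readings)
```

The analysis stays ambiguous between the noun and verb readings, just as in the description above; choosing between them usually requires looking at the surrounding words.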

Continue Reading...

How computers read text

Krzysztof Jassem November 20, 2020

How does a child normally learn to read in its native language? First it gets to know all the graphical symbols (letters, punctuation marks, accent marks, and so on) that are used to write that language. Next it learns the relationships between symbols and sounds, and learns to connect the sounds into words and phrases while reading. After some time the child is able to interpret even whole sentences at a single glance.

Continue Reading...

Can a word be a number?

Krzysztof Jassem November 12, 2020

The idea of using numbers to represent words, or texts made up of words, is “as old as time”. Texts are converted into number sequences in a process of encryption; then in the process of decryption the reverse operation is performed, in a manner known only to the intended recipient. In this way, even if the encrypted message falls into the wrong hands, it cannot be read. Encryption tools are reported to have been used as far back as ancient Greece: “A narrow strip of parchment or leather was wound onto a cane, and text was written along it on the touching edges. The addressee, having a cane of the same thickness, could quickly read the text of the message. When unfurled, to show meaningless scattered letters, it would be of no use to a third party; it was understandable only to the intended recipient, who would match it to his template” (https://pl.wikipedia.org/wiki/Skytale).
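The cane-and-strip device quoted above can be imitated in code. The sketch below is a simplified model, not a historical reconstruction: it assumes the message is written in rows along the cane, that `rows` (a name chosen for this example) is the number of letters that fit around it, and that gaps are padded with "_". Note that this device rearranges letters rather than turning them into numbers.

```python
import math

def encrypt(message, rows):
    """Write the message row by row along the cane, then unwind the strip."""
    cols = math.ceil(len(message) / rows)
    # Pad so that every row is complete (assumes "_" is not in the message).
    padded = message.ljust(rows * cols, "_")
    # The unwound strip shows the letters column by column.
    return "".join(padded[r * cols + c] for c in range(cols) for r in range(rows))

def decrypt(strip, rows):
    """Wind the strip onto a cane of the same thickness and read the rows."""
    cols = len(strip) // rows
    text = "".join(strip[c * rows + r] for r in range(rows) for c in range(cols))
    return text.rstrip("_")

secret = encrypt("HELPMEIAMUNDERATTACK", rows=4)
print(secret)                   # HENTEIDTLAEAPMRCMUAK
print(decrypt(secret, rows=4))  # HELPMEIAMUNDERATTACK
```

Here the "thickness of the cane" plays the role of the key: decrypting with a different value of `rows` produces scrambled text.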

Continue Reading...

Regular expressions, or how to find many needles in a haystack

Krzysztof Jassem October 19, 2020

One of the main advantages of storing documents in digital form is the ease of searching them for particular words and phrases. If you had the paper version of the book A Game of Thrones, and you wanted to find the first time the character name “Daenerys” appeared, it would be like looking for the proverbial needle in a haystack. But in the digital version? Simply use the Find function. Replacing text is just as easy: using a global Replace command, you could, for example, change all instances of “Daenerys” to the alternative spelling “Denerys”.
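The same Find and Replace operations can be sketched with Python's standard `re` module. The sample sentence is invented for this example, and the pattern `Da?enerys` (where `a?` means "an optional a") is just one way to match both spellings.

```python
import re

# Hypothetical sample text standing in for the digital book.
text = "The khal greeted Daenerys. Daenerys smiled, and Denerys is a variant spelling."

# Find: locate the first occurrence of the name.
first = re.search(r"Daenerys", text)
print(first.start())  # character offset of the first match

# Replace: change every "Daenerys" to the alternative spelling.
replaced = re.sub(r"Daenerys", "Denerys", text)
print(replaced)

# A regular expression can match several spellings at once.
both_spellings = re.findall(r"Da?enerys", text)
print(both_spellings)
```

The last step is where regular expressions go beyond a plain Find command: one pattern describes a whole family of strings, so many "needles" can be found in a single pass.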

Continue Reading...

How to teach a computer to talk to a human

Krzysztof Jassem October 13, 2020

For a computer to be able to conduct a conversation like a human, it needs to master four elements of dialogue:

  • speech recognition;
  • language comprehension;
  • text generation;
  • speech synthesis.

Continue Reading...

NLP in Linux

Krzysztof Jassem October 6, 2020

It is commonly believed that the processing of text documents requires knowledge of a high-level programming language. A decade or so ago it was considered proper to have good knowledge of the Perl language, while today a specialist in the field “absolutely must” have mastery of Python. But is this knowledge really indispensable?

In this post I will show that even sophisticated tasks in text processing can be completed quickly and easily without knowledge of a single command in any programming language.

Continue Reading...

Using regression in NLP

Krzysztof Jassem September 10, 2020

In a previous post on this blog I discussed the task of classification, which involves determining automatically which class a given object belongs to. A classification system assigns to each object a class label, where the number of possible labels is defined in advance.

Continue Reading...

Language models

Krzysztof Jassem August 26, 2020

A model is a representation of some entity, serving to make it easier to work with. A model is often a miniature version of an object. When playing with a miniature model of an aeroplane, a child can add or subtract various components – mount wings, for example, or remove an engine. In an atlas or on a globe, which represent the Earth, we can cover hundreds or thousands of miles with just one sweep of a finger. On the other hand, if we want to observe the motion of elementary particles, the model of the atom must be a great deal larger than the original. A model as a representation of something makes it easier for us to understand and get to know that thing, from the general concept to detailed properties.

Continue Reading...

NLP – what is it and what can it be used for?

Krzysztof Jassem August 5, 2020

A natural language is a language used by humans to communicate with each other, like English, Polish, etc. It contrasts, for example, with a programming language (Java, Python, etc.), which is used by humans to give instructions to a computer.

Continue Reading...

The history of machine translation in a nutshell

Krzysztof Jassem July 27, 2020

Correct translation from one language into another requires the knowledge and competence of an educated person, which is why automatic translation is treated as a task that falls within the field of artificial intelligence.

Continue Reading...

How to evaluate the quality of automatic translation

Krzysztof Jassem July 14, 2020

How can we determine whether an automatic translation system is doing its job – that is, translating texts correctly and preserving the original meaning? How should we compare the quality of two translation systems so as to choose the one that best meets our needs? I will try to answer these questions in this post.

Continue Reading...

Online education solutions – the key elements

Maciej Olejniczak June 17, 2020

Web-based training (WBT), which consists of remote learning via the Internet or an intranet, is constantly growing in popularity. In the coronavirus era, it has become indispensable.

Continue Reading...
