Loading...

How to teach a computer to talk to a human

Krzysztof Jassem October 13, 2020

Four elements of computer conversation

For a computer to be able to conduct a conversation like a human, it needs to master four elements of dialogue:

  • speech recognition;
  • language comprehension;
  • text generation;
  • speech synthesis.

Speech recognition

Speech recognition is a process in which the computer listens to a human utterance and converts it to written form. The process can be seen on certain news channels, for example, where the words being spoken by a journalist appear on the screen as subtitles almost immediately.

In recent years, systems of this type have made huge progress in recognising speech signals in practically any of the world’s natural languages. Not long ago it was close to a miracle if a car navigation system understood a spoken destination address correctly; today we are surprised if even the smallest error occurs in understanding that kind of spoken text.

Speech synthesis

Just a few years ago, a computer-generated voice would jar with its unvarying rhythm and unnatural intonation. Today, you need an especially sensitive ear to distinguish a synthetic voice from a human one.

Text generation

Text generation is a process of creating a text in a natural language based on knowledge encoded in the computer. For example, based on information of the type: “<departure station: Kings Cross; operator: LNER; destination station: Newcastle; delay: 30>” the computer system can generate the text: “The LNER train from Kings Cross to Newcastle is delayed by thirty minutes”.

Text generation can be done using a human-designed set of rules or templates; it is a much easier task than language comprehension, which we discuss next.

Language comprehension

We can say that a computer system understands a natural language if, in response to a human utterance, it takes action in accordance with the speaker’s intention. For example, in response to a question, the system should provide a relevant piece of information; in response to a command, it should perform the requested action; and in response to an expression of opinion, it should continue the dialogue in the way a human would.

Understanding natural language is the hardest problem in computerised conversation, and is still far from being solved. This is because it is very hard to develop a finite set of rules or principles that will take care of the infinite number of possible human utterances.

However, if we accept that the system is to operate exclusively within a certain simplified reality, in which the user’s utterances are limited in vocabulary and structure, it then becomes possible – and not even especially complicated – to write a computer program that will understand natural language.

In this blog post I will describe, step by step, how such a system can be created. The example given will be written in Python, but even without knowledge of that or any other programming language, the reader will be able to learn how to build effective systems of this type.

A virtual trader

The system in this example will be a virtual trader, doing business in accordance with the mission statement: “I buy in order to sell.” It will be an uncomplicated system, responding to short sentences with simple syntax. To broaden its potential market, our trader will communicate in English, and its partner in conversation will give it commands of the type:

Buy 5 large blue metal boxes.
Sell 3 tiny red plastic rings.

The trader will do everything to make a suitable deal in every case. It will accept any purchase instruction, recording the relevant information in its inventory. It will carry out a sale instruction only if it has previously bought the required quantity of goods.

We will accept that the system understands what is said to it if its inventory is always in agreement with the purchases made and the sale transactions concluded.

How does the virtual trader work?

Our program is expected to work in the following way:

The user gives a command in a natural language, e.g.:

Buy 5 large blue metal boxes.

  1. The system identifies the good in question, namely in this case:
    large blue metal boxes
  2. Every good that the trader sells has its own code, or ‘index’, in the inventory. The system determines the index of the item in question in a defined manner (this is an element of language comprehension); in our system each index will be a four-digit number, e.g. 1215.
  3. The system analyses the first word of the command to see whether it concerns a purchase or a sale (from its point of view, of course).
    3.1. If it is a purchase, the system unconditionally accepts the instruction, and in its inventory it increases the stocked quantity of the good with the given index.
    3.2. If it is a sale, the system first checks in the inventory whether it has the required quantity of the good in stock. If it does not, it refuses to perform the instruction; but if it does, it carries out the sale transaction, and records the decrease in the inventory.

Writing the program

Dictionary

Types of tokens – word meanings

We will have to create a dictionary containing all of the words that the system is expected to understand. These will be called tokens. First, however, we need to define the permissible meanings of words, called token types. The corresponding Python source code takes the following form:

    tokens = (
        'OPERATE',
        'NUMBER',
        'SIZE',
        'COLOR',
        'MATERIAL'
        'KIND',
    )

We have defined six token types, corresponding to different meanings of tokens. In a moment we will determine, for example, that the dictionary can contain a token “Buy” belonging to the OPERATE type.

Exercise for the reader

Guess to which of the above types the tokens in this sentence belong:

“Buy 5 large blue metal boxes.”

Answers in a moment…

The dictionary content

At the next step, we define for each type what tokens belong to it and what the system has to do when a specified token appears in the user’s command:

Operations
    def t_OPERATE(t):
    r'Buy | Sell'
    return t

The above code assigns to the OPERATE type the two words Buy and Sell (second line), and instructs the system to remember the given operation when such a token appears in the user’s command (third line).

Numbers
    def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

The above code assigns to the NUMBER type any text consisting purely of digits (second line), transforms that text into the corresponding numerical value (third line), and instructs the system to remember that value whenever such a token appears in the user’s command (fourth line).

Attributes
    def t_MATERIAL(t):
    r'metal | plastic'
    if t.value =='metal':
    t.value = 1
    elif t.value == 'plastic':
    t.value = 2
    return t

The above code assigns to the MATERIAL type the words metal and plastic. Next (and this is something new!) it instructs the system to store either 1 (if the material is metal) or 2 (if the material is plastic).

We can construct the other token types analogously to MATERIAL:

    def t_COLOR(t):
    r'(black | white | red | green | blue)'
    if t.value =='black':
    t.value = 1
    elif t.value == 'white':
    t.value = 2
    elif t.value == 'red':
    t.value = 3
    elif t.value == 'green':
    t.value = 4
    elif t.value == 'blue':
    t.value = 5
    return t

    def t_SIZE(t):
    r'tiny | small | big | large'
    if t.value =='tiny':
    t.value = 1
    elif t.value =='small':
    t.value = 2
    elif t.value =='big':
    t.value = 3
    elif t.value =='large':
    t.value = 4
    return t
Kinds of goods
    def t_KIND(t):
    r'box(es)? | ring(s)?'
    if t.value[0] =='b':
    t.value = 1
    else:
    t.value = 2
    return t

In the definition of the KIND type we ensure that the dictionary contains the words box, boxes, ring and rings. When a token referring to boxes appears, the value 1 is stored; and when the token refers to rings, the value 2 is stored.

The goods index

How does our computer trader identify the index of the goods mentioned in a command? This is made clear by the following table, showing the values defined for particular tokens.

Value KIND SIZE MATERIAL COLOR
1 box(es) tiny metal black
2 ring(s) small plastic white
3 big red
4 large green
5 blue

When the system has to determine the index of an item being bought or sold, it does so in the way indicated by the above table – where the ordering of the columns is significant.

For example, to determine the index of the good referred to as:

large blue metal boxes

the system must first arrange the given attributes in an order corresponding to the columns of the table: KIND, SIZE, MATERIAL, COLOR. This gives in this instance:

boxes large metal blue

It then assigns digits to the tokens as given in the table:

1415

(boxes → 1; large → 4; metal → 1; blue → 5)

The index of the good large blue metal boxes is therefore 1415. Notice that even if the user had stated the attributes in a different order, the resulting index would still be the same.

Grammar rules

The virtual trader will only understand commands given in accordance with the rules of grammar defined by the programmer. Every rule indicates how smaller parts of a sentence can be put together to make a larger whole. For example, the following rule says that a command must contain, in order, a token denoting an operation, a token denoting a number, and a component denoting the article:

command: OPERATE NUMBER article

A grammar rule can also contain instructions which the program is to perform when it finds that the analysed text conforms to that rule.

Rules for attributes

Let us begin with the simplest rules, concerning the attributes of goods:

    def p_attribute_color(p):
    'attribute : COLOR'
    p[0] = p[1]

The rule denoted p_attribute_color (the name of a rule must begin with “p_”, but the rest of the name is freely chosen by the programmer) says that a part of the sentence with the name attribute may consist exclusively of a token of COLOR type (second line). The third line says that in this situation the value of the attribute component (denoted p[0]) is to be identical to the value of the COLOR token (p[1] denotes the first, and in this case only, component).

Thus, if the attribute black appears in the command, it will have the value 1, the attribute white will have the value 2, and so on.

    def p_attribute_material(p):
    'attribute : MATERIAL'
    p[0] = 10 * p[1]

The rule with the name p_attribute_material says that a part of the sentence with the name attribute may alternatively consist exclusively of a token of MATERIAL type (second line). The third line says that in this situation the value of the attribute component is to be the value of the MATERIAL token multiplied by 10.

Thus the attribute metal has the value 10, and the attribute plastic has the value 20.

Analogously:

    def p_attribute_size(p):
    'attribute : SIZE'
    p[0] = 100 * p[1]

Rules for articles

The description of an article may consist only of the kind of good:

    def p_article_kind(p):
    'article : KIND'
    p[0] = 1000 * p[1]

The value will be 1000 for boxes and 2000 for rings.

The description of an article may be preceded by an attribute of any type:

    def p_article_attribute(p):
    'article : attribute article'
    p[0] = p[1] + p[2]

Adding an attribute to the description of an article increases its value. For example, the description “boxes” has the value 1000, “metal boxes” has the value 1010, “blue metal boxes” has the value 1015, and “large blue metal boxes” has the value 1415.

Rules for commands

The most important rule concerns the entire command:

    # Main parser rule (command)
    def p_command(p):
    'command : OPERATE NUMBER article'
    index = p[3]

    #Buy article
    if p[1] == 'Buy':
    tab[index] += p[2]
    print('OK. I am buying ' + str(p[2])+ ' new articles indexed as ' + str(index) +'.')
    print('No of articles in shop: '+ str(tab[index]))

    #Sell article
    elif p[1] == 'Sell':
    if p[2] > tab[index]:
    print('I do not have as many articles as you want.')
    else:
    tab[index] -= p[2]
    print('OK. I am selling ' + str(p[2])+ ' articles indexed as ' + str(index) + '.')
    print('No of articles in shop: ' +  str(tab[index]))

The first line states that the analysed command must consist of, in order, the operation (an OPERATE token), the number of articles (a NUMBER token), and a description of those articles (article).

The index of the good referred to in the command is taken to be the value of the article (p[3]). In the case of a purchase, the inventory record for the good is increased by the value of the NUMBER token, and an appropriate message is returned. In the case of a sale, the system additionally checks whether the required quantity of the good is available.

An example conversation session with the virtual trader

What can I do for you?
Buy 10 large blue metal boxes
OK. I am buying 10 new articles indexed as 1415.
No of articles in shop: 10

What can I do for you?
Buy 5 tiny white plastic rings
OK. I am buying 5 new articles indexed as 2122.
No of articles in shop: 5

What can I do for you?
Sell 5 large blue metal boxes
OK. I am selling 5 articles indexed as 1415.
No of articles in shop: 5

What can I do for you?
Sell 3 tiny white plastic rings
OK. I am selling 3 articles indexed as 2122.
No of articles in shop: 2

What can I do for you?
Sell 3 tiny white plastic rings
I do not have as many articles as you want.

What can I do for you?
Bye
>>>

Postscript

The full code of the program is given below (with essential “decorations”).

    # Import lexer and parser from ply module
    import ply.lex as lex
    import ply.yacc as yacc

    # List of token types.
    tokens = (
    'NUMBER',
    'OPERATE',
    'SIZE',
    'KIND',
    'COLOR',
    'MATERIAL'
    )

    # Token types may be defined as regular expressions, e.g. r'Buy | Sell'
    def t_OPERATE(t):
    r'Buy | Sell'
    return t

    def t_MATERIAL(t):
    r'metal | plastic'
    if t.value =='metal':
    t.value = 1
    elif t.value == 'plastic':
    t.value = 2
    return t

    def t_COLOR(t):
    r'(black | white | red | green | blue)'
    if t.value =='black':
    t.value = 1
    elif t.value == 'white':
    t.value = 2
    elif t.value == 'red':
    t.value = 3
    elif t.value == 'green':
    t.value = 4
    elif t.value == 'blue':
    t.value = 5
    return t

    def t_SIZE(t):
    r'tiny | small | big | large'
    if t.value =='tiny':
    t.value = 1
    elif t.value =='small':
    t.value = 2
    elif t.value =='big':
    t.value = 3
    elif t.value =='large':
    t.value = 4
    return t

    def t_KIND(t):
    r'box(es)? | ring(s)?'
    if t.value[0] =='b':
    t.value = 1
    else:
    t.value = 2
    return t

    def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

    # Lexer error handling rule (Handle words out of vocabulary)
    def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

    # Ignore white spaces
    t_ignore  = ' \t'

    # Main parser rule (command)
    def p_command(p):
    'command : OPERATE NUMBER article'
    index = p[3]
    #Buy article
    if p[1] == 'Buy':
    tab[index] += p[2]
    print('OK. I am buying ' + str(p[2])+ ' new articles indexed as ' + str(index) +'.')
    print('No of articles in shop: '+ str(tab[index]))
    #Sell article
    elif p[1] == 'Sell':
    if p[2] > tab[index]:
    print('I do not have as many articles as you want.')
    else:
    tab[index] -= p[2]
    print('OK. I am selling ' + str(p[2])+ ' articles indexed as ' + str(index) + '.')
    print('No of articles in shop: ' +  str(tab[index]))

    # Parser rule (attribute)
    def p_attribute_color(p):
    'attribute : COLOR'
    p[0] = p[1]

    # Parser rule (attribute)
    def p_attribute_material(p):
    'attribute : MATERIAL'
    p[0] = 10 * p[1]

    # Parser rule (attribute)
    def p_attribute_size(p):
    'attribute : SIZE'
    p[0] = 100 * p[1]

    # Parser rule (article - stop)
    def p_article_kind(p):
    'article : KIND'
    p[0] = 1000 * p[1]

    # Parser rule (article - recursion)
    def p_article_attribute(p):
    'article : attribute article'
    p[0] = p[1] + p[2]

    # Syntax error handling rule
    def p_error(p):
    print("Syntax error in input!")

    #######################################
    #Main program

    #Initialize table of articles (zero articles at the beginning)
    tab = []
    for index in range(3000):
    tab.append(0)

    #Build the lexer
    lexer = lex.lex()

    #Tokenize (short version)
    # for tok in lexer:
    #    print(tok)

    #Build the parser
    parser = yacc.yacc()

    #Main loop
    while True:
    s = input('What can I do for you? \n')
    if s == 'Bye':
    break
    parser.parse(s)
How to teach a computer to talk to a human

Contents

Neural Machine Translation System Our Translator