A "natural language" is a human language such as English, Spanish,
French, and so on. A very important task in artificial intelligence
is getting intelligent systems to be more comfortable with human language.
Natural language processing is both one of the oldest areas of artificial
intelligence and one of the most active areas of AI today. Indeed, natural
language work really predates the development of computers. As far
back as the 1800s, plans for an Analytical Engine were drawn up by
Charles Babbage and described in detail by Ada Lovelace.
The Engine was never built (an earlier design, the Difference
Engine, capable only of mathematical computation, was partially built),
but it was envisioned that natural language processing would be part
of its capabilities. In 1950, Alan Turing proposed what came to be
called the "Turing Test" to determine whether a computer has attained
intelligence. At the time, a popular game involved asking typed questions
of a man and a woman, both hidden from view, who would type their
responses. The objective for the man was to try to pretend, through
his answers, that he was really a woman; the objective for the woman
was to try, through her answers, to expose the man as a fraud as quickly
as possible; and the goal for the questioners was to try to guess
who was who as quickly as possible. The sensibilities of the time
generally prevented overly specific questions about, say, the responder's
anatomy that a questioner, especially a female questioner, could use
to discern the gender of the responder.
Turing, a mathematician, had long been thinking about what capabilities a
machine would need in order to think. In 1936 he had described an
abstract machine, now called the "Turing machine", which could perform basic
computation. Perhaps in part because he was gay and had thought
more about gender issues than many people of his time, he revised the
game of a man pretending to be a woman so that instead a computer
would pretend to be a human. This became known as the "Turing test": if
a machine, through conversation in human language over some type of
teletype device, could successfully pass itself off as human, it would be
deemed to have attained intelligence. Some versions of the "Turing test" combined
the two tests, so that a computer would be deemed to be intelligent
if it could imitate a woman better than a man could. Other versions
placed restrictions on the questions similar in spirit to restricting
overtly personal questions that a woman would be able to answer better
than a man. And, ironically, the Internet has created many new opportunities
to play the original parlor game. If someone shows up on an Internet
bulletin board calling themselves "Lisa", "Angela", or "Nadia", how can you
really be sure they are a woman?
We still do not have a computer capable of passing the Turing Test,
although an annual competition known as the Loebner
Prize is devoted to finding programs which are able to pass the
Turing Test in some very restricted area of human conversation. An
annual prize is awarded to the most "human" computer program, although
no program has succeeded in fooling a human. A significantly larger
prize would be awarded for a machine which actually passed the Turing
Test. So far, the programs' performance does not really come close
to matching that of a human being. However, we have made considerable
progress in the years since Turing proposed his test. Grammar checkers,
while not capable of determining the meaning underlying natural language,
routinely correct grammatical errors in online documents. Voice recognition
software is sufficiently reliable to be useful.
Natural language is difficult largely because the meaning of words
is often not precise. Whereas in computer languages everything always
has a very precise meaning, in natural language the meaning of a word
depends heavily upon the context in which it is used. Consider, for example, the last
sentence in the previous paragraph: "voice recognition software is
sufficiently reliable to be useful." What does the word "useful" mean
in this sentence? The meaning of this word isn't really defined. There
is an assumption being made here that you have had the experience
of using a voice recognition software package, or speaking on the telephone
to an automated customer service line, and have therefore had some
experience in verbal communication with a computer. This gives you
a clear basis to agree, or disagree, with my claim that this software
is "useful". Without that background, it would be very difficult to
even have a clue what the word "useful" means in this context, much
less agree or disagree with me.
However, despite its difficulty, we are far better poised today than
in the past to make serious inroads into natural language understanding.
I recall late in 1993 writing a proposal for Rama about designing
a natural language system based upon large amounts of human-computer
interactions. The idea was that with copious human-computer conversations,
together with advanced machine learning using neural
networks, one could train a computer to successfully interact
with humans using human language. The proposal, which I submitted
inwardly to Rama but never sent in printed form, suffered from the
weakness that it would be very costly to pay for humans to "talk"
with computers enough to make it work. Nowadays, however, the Internet
provides so many opportunities for computers to interact with humans
and with human language, essentially free of charge, that this difficulty
no longer exists.
In addition to capabilities such as grammar checkers and voice recognition
software, another area in which natural language processing is being
used is in automated email response software. This software, which
is being produced by companies such as Brightware
and Kana, is capable of providing
automated responses to email queries so that companies do not take
a long time to reply. The automated response generally takes
the form of categorizing the message, sending a suitably customized
response to the sender, and notifying the appropriate person at the
recipient company. While it is not capable of general-purpose interpretation
of natural language, it is capable of doing some intelligent work
in parsing human language.
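To make the three steps above concrete (categorize the message, send a customized response, notify the appropriate person), here is a minimal sketch in Python. It is not how Brightware or Kana actually work; it assumes the simplest possible approach, counting keyword matches. The category names, keywords, reply texts, and email addresses are all hypothetical and invented for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical rules: every category, keyword, reply, and address below
# is made up for illustration and does not come from any real product.
RULES = {
    "billing": {
        "keywords": {"invoice", "refund", "charge", "payment"},
        "reply": "Thanks for writing about your bill. A billing specialist will follow up.",
        "owner": "billing@example.com",
    },
    "support": {
        "keywords": {"error", "crash", "broken", "install"},
        "reply": "Sorry you ran into trouble. A support engineer will follow up.",
        "owner": "support@example.com",
    },
}

# Fallback used when no keyword matches at all.
DEFAULT = {
    "reply": "Thanks for your message. Someone will get back to you soon.",
    "owner": "info@example.com",
}


@dataclass
class Response:
    category: str  # step 1: which category the message was put in
    reply: str     # step 2: the customized reply sent to the sender
    notify: str    # step 3: the person at the company to notify


def tokenize(text: str) -> set:
    """Lowercase the text and return its distinct words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def respond(message: str) -> Response:
    """Pick the category whose keywords overlap most with the message."""
    words = tokenize(message)
    best, best_score = "general", 0
    for category, rule in RULES.items():
        score = len(words & rule["keywords"])
        if score > best_score:
            best, best_score = category, score
    rule = RULES.get(best, DEFAULT)
    return Response(best, rule["reply"], rule["owner"])


result = respond("I need a refund for the second payment on my invoice")
print(result.category, "->", result.notify)
```

Note how little "understanding" is involved: the sketch matches exact words only, so "charged" would not count as a match for "charge". That limitation is exactly the gap between parsing human language and interpreting it.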
In the next couple of editions, we will explore in more detail some
of these natural language technologies. In the meantime, for further
information you might check out the following books: "Speech
and Language Processing: An Introduction to Natural Language Processing,
Computational Linguistics and Speech Recognition" and "Natural
Language Understanding".
Next edition: Natural Language Part 2 - Automated Email Response