5 Jul 2010

Computers deciphering languages


On June 30, there was a news release about how a computer program has managed to mostly decipher Ugaritic "on its own" (that is to say, re-decipher for the pursuit of computer science advancement). This was already discussed on Memiyawanzi and Abnormal Interests but I feel I should add my two cents considering that I involve myself in both linguistics and computer programming on this site.

When I was still in French immersion grade school, I learned quickly that English words don't always map neatly onto French words in phrases. One clever teacher encouraged me to think of an English joke and try to translate it into French. Having anticipated my reaction, she delighted in watching the personal discovery she had abetted unfold all over my face. Naturally, the puns on which my joke relied had magically lost their meaning, and I realized very early on that translation is quite complex. Unfortunately, many adults have never taken the time to think about just how complex translation really is. Monolingual people especially have little experience to draw on to appreciate how difficult the task is, and how effortlessly the human brain solves many such problems.

And after programming things like the online Etruscan dictionary, I can say with some degree of experience that trying to machine-translate something using simple one-to-one mapping algorithms between languages produces ludicrous Babelfish-like results (e.g. 'What is on television?' becomes *'Qu'est à la télévision?' on that site for some reason). Even when AI algorithms are added to improve results, the translations are still nowhere near as competent as those produced by human beings, because the programs are unequipped to overcome gaps in knowledge or ambiguities in speech the way plastic brains can. Historical translation becomes all the more complex because there are even more unknowns involved.
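To make the one-to-one mapping problem concrete, here is a minimal Python sketch of a word-for-word translator. It is only an illustration under stated assumptions: the tiny `LEXICON` and the `word_for_word` function are hypothetical, invented for this example, and are not how Babelfish or any real translation service works.

```python
import re

# Hypothetical toy lexicon: exactly one French word per English word,
# chosen with no regard for context. This is an assumption for illustration.
LEXICON = {
    "what": "quoi",
    "is": "est",
    "on": "sur",
    "television": "télévision",
}

def word_for_word(sentence, lexicon=LEXICON):
    """Replace each word independently; unknown words pass through unchanged."""
    # Split into word tokens and single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    words = [lexicon.get(tok.lower(), tok) for tok in tokens]
    text = " ".join(words)
    # Re-attach punctuation to the preceding word.
    return re.sub(r"\s+([?.!,;:])", r"\1", text)

print(word_for_word("What is on television?"))
# prints: quoi est sur télévision?
```

The output is a string of French words, not a French sentence. A fluent speaker would say something like "Qu'est-ce qu'il y a à la télévision ?", which requires knowing that "what is" becomes a fixed interrogative construction and that "on television" takes "à la" rather than "sur". None of that information lives in a per-word lookup table.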

The article is interesting but I feel it's a lot of pompous hype from an institution that is no doubt pressured to come up with innovation after innovation, or at least to look like it is. The realistic limitations of this program can easily be misconstrued. What I think will be the truly noteworthy innovation to this complex goal of machine translation is a very general pattern-recognition algorithm, one that doesn't require the guidance of researchers or the programming-in of added assumptions to find and discover new patterns. Yet, if programmers could accomplish that, we'd not only have a fancy translator algorithm but a fully fledged digital human complete with the beginnings of artificial intuition. Until then, these programs are in no way a replacement for human beings and any talk of that sort is more relevant to sci-fi than current reality.

5 comments:

  1. I'm glad to see this article combined with a sample of the Harappan/Indus Valley Script. Deciphering that was the first application I thought of when I first heard about this!

  2. "What I think will be the truly noteworthy innovation to this complex goal of machine translation is a very general pattern-recognition algorithm, one that doesn't require the guidance of researchers or the programming-in of added assumptions to find and discover new patterns. Yet, if programmers could accomplish that, we'd not only have a fancy translator algorithm but a fully fledged digital human complete with the beginnings of artificial intuition."

    Someone (I forget who) once said that humans are pattern-seeking animals. We spend a lot of time obsessing over language and intelligence while so many animals display all sorts of intelligent behavior and problem solving with no language that these CompSci geeks would recognize.

    (For the record: I have an advanced degree in CompSci)

  3. Sppt on.

    Which Google translates into various languages as phrases usually equivalent to "location in/on/at" or in Welsh "look on". I did find the Italian equivalent "loco" amusing though, even if it wasn't what I originally meant.

  4. I don't think pattern recognition applies to inscriptions that are not in some way standard.

    I worry about the interpretations of the variations (some subtle, some not) of some Linear A inscription 'symbols', as an example; if the human eye can't decide if they are new symbols or variants of some already known symbol...how in the world will some smarmy software know?

  5. kt_06226: "I don't think pattern recognition applies to inscriptions that are not in some way standard."

    Self-contradictory. One needs the capacity to recognize patterns to ascertain with any certainty that something is "non-standard". This involves recognizing the pattern of history, the pattern of inscriptions, the pattern of civilization, the pattern of culture, the pattern of language, the pattern of human psychology, etc., etc., etc. It's an insanely complex multi-dimensional riddle replete with patterns of various kinds, physical and abstract.

    The capacity to solve problems, any problem, is squarely about recognizing patterns (which is probably why IQ tests are replete with pattern recognition questions).

    "[...] if the human eye can't decide if they are new symbols or variants of some already known symbol...how in the world will some smarmy software know?"

    Yes, and this is exactly my point concerning current technology. Without matching the human ability to generally recognize patterns and intuitively extrapolate further information, computers will remain inadequate substitutes for the amazing human mind... for now. ;o)

    I have a personal feeling that "intuition" and "consciousness" are merely emergent traits of a more fundamental pattern recognition program (perhaps even a ridiculously simple one) encoded within our own brains, duplicated several million times like a recursive fractal to create the awe-inspiring higher-level day-to-day functionality that we so often take for granted.
