In part 1, I presented a high-level description of the Semantic Web project. In this part, I delve a little deeper into one of the core challenges: how do you develop a common vocabulary?
The Semantic Web project is all about allowing machines to easily converse with each other, so ideally they should share a common vocabulary. For example, it's a lot easier for two machines to exchange information about employee salaries if they use the same words for "employee" and "salary". Without a common vocabulary, some form of translation is needed, which is both time-consuming and expensive. In addition, as the number of machines grows, the number of possible translations grows quadratically, because every pair of machines needs its own translation, and that quickly becomes very hard to support.
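To put rough numbers on that, here is a minimal sketch. It assumes that, without a shared vocabulary, every ordered pair of machines needs its own translator, and that with a shared vocabulary each machine needs just one adapter to the common vocabulary and one from it. The function names are mine and purely illustrative.

```python
def pairwise_translators(n: int) -> int:
    """Translators needed if each machine has its own vocabulary.

    Assumes one translator per ordered pair (A -> B and B -> A are separate).
    """
    return n * (n - 1)


def shared_vocabulary_adapters(n: int) -> int:
    """Adapters needed if all machines map to and from one shared vocabulary.

    Assumes each machine needs one adapter in each direction.
    """
    return 2 * n


if __name__ == "__main__":
    # Compare the two approaches for a few network sizes.
    for n in (2, 5, 10, 100):
        print(n, pairwise_translators(n), shared_vocabulary_adapters(n))
```

Under these assumptions, 10 machines need 90 pairwise translators but only 20 adapters, and the gap keeps widening as more machines join.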
The tough question is: how do you develop a shared vocabulary? Before discussing some approaches that use modern technologies, it's instructive to look at how the English language evolved. Maybe we can learn something from history.
Because of the Norman invasion in 1066, legal, noble, and diplomatic communications inside England were actually conducted in French. Only the lower classes tended to use the English language, which at that time was fairly primitive and spread via word of mouth. But as England became stronger and more unified, the use of the English language grew, and by 1349, it was being taught at schools again. It was generally adopted for writing around 1425, and by 1489, it replaced French as the official language for use in Parliament. But incredibly enough, there still was no formal English dictionary!
In 1582, Richard Mulcaster decided to bring order to the chaos and wrote Elementarie, often described as the first English dictionary, which listed around 8000 common words. During the creation of the book, he also simplified the spelling of some words to introduce more consistency to the language. Strictly speaking, though, Elementarie was not a true dictionary, because it only listed the spellings of words and did not include their meanings.
In 1604, Robert Cawdrey released A Table Alphabeticall of Hard Words, which listed around 2,500 less common English words together with a short description of their meaning. Although it covered fewer words than Elementarie, it improved the overall structure and content of what a dictionary should be.
In 1721, Nathaniel Bailey introduced the Universal Etymological English Dictionary, which was the first attempt to list all the words in the English language. It was a fairly formal book, and focused on providing precise literal meanings for words rather than practical examples of usage. Although this was a good stride forward, England was still behind the French and the Italians, who already had more advanced offerings for their own languages.
In 1746, Samuel Johnson signed a contract to create the first comprehensive, easy-to-use Dictionary of the English Language, and delivered the finished product in 1755. Initially he thought it would take just 3 years; oops! One funny thing about the Johnson dictionary is that he occasionally provided word definitions that were colored by his own personal opinions. For example, "excise" was defined as "a hateful tax levied upon commodities, and adjudged not by the common judges of property, but wretches hired by those to whom excise is paid". I bet the IRS would not be too thrilled by that definition!
In summary, a primitive form of the English language was spread informally by word of mouth until around 1582, when the first dictionary was introduced. Several English dictionaries followed, each gradually including more words together with their meanings.
In the next part, I'll discuss how a vocabulary for machines might be developed.
I find the development of languages truly fascinating; in fact, I have always loved to observe how babies learn to speak a language. It appears that the brain of a baby is 'hard wired' to absorb languages. I wonder if during the development of this semantic language concept one could apply some of the baby brain functions. http://www.fcs.uga.edu/pubs/current/FACS03-6.html
It seems that the 'brain' has some sort of pathways that facilitate language. The pathways dissolve after the age of 12. From personal experience I know that I learned to speak all the languages I know by 12. In any case, these pathways are not necessarily for the memorization of 'words'; rather, through these pathways one can develop contextual understanding. For example, the first time I heard the idiom 'bling, bling' I really didn't know what the actual meaning was, but because P. Diddy was doing gestures with his hands, and was showing all his jewels, it was evident that 'bling bling' meant gems and such. :-)
Posted by: SBG"P" | Mar 14, 2005 at 06:06 PM