RE: [ontac-forum] Problems of ontology -- and terminology

To: "Obrst, Leo J." <lobrst@xxxxxxxxx>, "ONTAC-WG General Discussion" <ontac-forum@xxxxxxxxxxxxxx>
From: "Obrst, Leo J." <lobrst@xxxxxxxxx>
Date: Sat, 20 May 2006 19:54:45 -0400
Message-id: <9F771CF826DE9A42B548A08D90EDEA8001021751@xxxxxxxxxxxxxxxxx>
Repost.     (01)

John,    (02)

I think that many if not most of us who have been in this ontology
business long enough know the difference between ontologies and
terminologies.    (03)

We explicitly choose to represent ontologies because we want a formal
representation, a logical theory, about the things in the world (or
even possible or fictional or impossible things). Why? Because logic is
our best tool for this. And a machine that can interpret content
represented in that logic can infer, in well-defined ways, conclusions
that a human would make. These formal ontologies constitute a
computer-usable AND interpretable semantic model.      (04)

I agree that natural languages are extremely useful, that their
ambiguity and context-adjustment are absolutely necessary for the
communicative and other speech tasks that we do, possibly including
internal conversations within ourselves, but I would state that in
principle the specific salient interpretations of a given natural
language utterance can be arbitrarily approximated, given a rendition
in a logic or logics. That is the purview of formal semantics,
including formal lexical semantics, both of which are used by
computational semantics.     (05)

Ontology engineering does not focus on terminology, nor on the formal
semantics of natural language, though of course those assist in the
effort. It does not focus on epistemology, though that too is
important. It focuses on ontology.     (06)

WordNet is NOT an ontology, but primarily a thesaurus organized by
psycholinguistic principles (at least according to George Miller and
Christiane Fellbaum). People link their ontologies to WordNet in order
to have a term-to-ontology "concept" mapping, and to get the benefit of
synonymy and weak term semantics (hypernymy and hyponymy). This is good.
In any complicated application, you will need a lexicon, a thesaurus,
and an ontology (or a number of each). Why? Because typically you have
to go from text (unstructured or structured, e.g., schema labels,
metadata terms, etc.) to meaning. Lexicons (especially linked with a
morphologizer that can generate or map to canonical lexical roots)
enable you to go from terms to word senses, which can then be linked to
thesauri, and the latter mapped to ontologies.     (07)
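
As a rough illustration of that chain (term to canonical root to word
senses to thesaurus concept to ontology class), here is a minimal Python
sketch. Everything in it is invented for the example: the three toy
dictionaries, the sense labels, the "onto:" class names, and the crude
plural-stripping stand-in for a morphologizer. None of it is taken from
WordNet or any real resource.

```python
# Hypothetical toy resources; all entries are made up for illustration.
LEXICON = {  # canonical root -> word senses
    "file": ["file#computing", "file#tool"],
    "bank": ["bank#finance", "bank#river"],
}
THESAURUS = {  # word sense -> thesaurus concept
    "file#computing": "DataFile",
    "file#tool": "Rasp",
    "bank#finance": "FinancialInstitution",
    "bank#river": "Riverbank",
}
ONTOLOGY = {  # thesaurus concept -> ontology class identifier
    "DataFile": "onto:InformationObject",
    "Rasp": "onto:HandTool",
    "FinancialInstitution": "onto:Organization",
    "Riverbank": "onto:GeographicFeature",
}


def canonical_root(term: str) -> str:
    """Crude stand-in for a morphologizer: lowercase, strip a plural 's'."""
    word = term.lower()
    if word.endswith("s") and word[:-1] in LEXICON:
        return word[:-1]
    return word


def term_to_classes(term: str) -> list[str]:
    """Follow term -> root -> senses -> thesaurus concepts -> ontology classes."""
    classes = []
    for sense in LEXICON.get(canonical_root(term), []):
        concept = THESAURUS.get(sense)
        if concept in ONTOLOGY:
            classes.append(ONTOLOGY[concept])
    return classes


print(term_to_classes("Files"))
# prints ['onto:InformationObject', 'onto:HandTool']
```

Note that one surface term still reaches two ontology classes; choosing
between them needs context that this sketch does not model.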

Typically, in introducing audiences to ontologies, I try to show folks
(define, give examples of) a series of increasingly expressive
semantic models ranging from flat terminologies to taxonomies to
thesauri to conceptual models to logical theories. Each is important,
and where you are or need to be on that spectrum depends on what you
need to accomplish, i.e., your use cases and application requirements.
If you need to just agree as a standards group on terms and their
definitions, you need a terminology/data dictionary; this is a human
head-nodding activity: yes, we agree that this is what we mean. If you
need to put your documents into topic buckets organized by the weak
semantics of broader_than/narrower_than terms and synonyms in thesauri,
for gross topic search and navigation purposes, that is fine. If you
need precise semantics for your applications or services, then you need
logical theories, i.e., high-end or strong ontologies.    (08)
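
To make the "topic bucket" end of that spectrum concrete, here is a
small Python sketch of thesaurus-style search. The NARROWER and SYNONYMS
tables and the function names are hypothetical, chosen only for this
example. The point is that narrower-term and synonym links are enough
for gross topic matching, even though they say nothing formal about
what the terms mean.

```python
# Hypothetical thesaurus fragments, invented for illustration.
NARROWER = {  # term -> narrower terms
    "vehicle": ["car", "truck"],
    "car": ["sedan"],
}
SYNONYMS = {  # term -> synonyms
    "car": ["automobile"],
}


def expand(term: str) -> set[str]:
    """Collect the term, its synonyms, and all narrower terms (transitively)."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in found:
            continue
        found.add(current)
        stack.extend(SYNONYMS.get(current, []))
        stack.extend(NARROWER.get(current, []))
    return found


def in_bucket(document: str, topic: str) -> bool:
    """True if any word of the document falls under the topic term."""
    words = set(document.lower().split())
    return bool(words & expand(topic))


print(in_bucket("Used automobile for sale", "vehicle"))  # prints True
print(in_bucket("Used automobile for sale", "truck"))    # prints False
```

Nothing here lets a machine conclude anything about cars beyond "filed
under vehicle"; for that kind of inference you need the logical-theory
end of the spectrum.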

More about the issue of vagueness later.    (09)

Leo    (010)

Dr. Leo Obrst       The MITRE Corporation, Information Semantics 
lobrst@xxxxxxxxx    Center for Innovative Computing & Informatics 
Voice: 703-983-6770 7515 Colshire Drive, M/S H305 
Fax: 703-983-1379   McLean, VA 22102-7508, USA     (011)

-----Original Message-----
From: ontac-forum-bounces@xxxxxxxxxxxxxx
[mailto:ontac-forum-bounces@xxxxxxxxxxxxxx] On Behalf Of John F. Sowa
Sent: Saturday, May 20, 2006 1:25 PM
To: ONTAC-WG General Discussion
Subject: Re: [ontac-forum] Problems of ontology -- and terminology    (012)

Pat, Matthew, Azamat, Chris, Eric, et al.,    (013)

What I've been trying to say in many different ways is just
one fundamental point:    (014)


People may say they agree on this point, but then they keep
talking about WordNet as if it were an ontology, but it's not.
WordNet is a very useful resource, but it is first and foremost
exactly what its name implies:  a network of English words.    (016)

All the efforts that people put into aligning their so-called
ontologies with WordNet have one very unfortunate result:
they cause the ontologies to become just as vague as WordNet.
The next implication that follows from this observation is    (017)


This sounds like heresy because people have been publishing
research article after research article showing how they have
been doing such alignments.  But the end result is nothing more
nor less than an alignment of *words* not of *categories*.    (019)

We certainly have to relate the words of natural languages to
the categories of various ontologies, but it is essential to
recognize that the mappings *cannot* be one-to-one because of
the following facts:    (020)

  1. The words of any natural language have an open-ended number
     of senses with vague boundaries, but the logical relations
     that express those categories have sharp boundaries.    (021)

  2. Words are flexible, and they can be adapted dynamically to
     the changing conditions of a situation or discourse, but
     the relations in logic are fixed and frozen by the axioms
     by which they are defined.    (022)

  3. Words in natural language are robust and adaptable, but
     logic is fragile.  Logic does not adapt to changing
     circumstances -- it breaks.    (023)

  4. Computer programs have the same properties as logic.
     They're not flexible, they don't adapt, and they break.    (024)

  5. Any change to a program or a theory expressed in logic
     causes relations to be redefined.  They thereby become
     totally different relations that just happen to have
     the same names.  Natural languages can adapt to such
     changes, but logic and computer programs *break*.    (025)

These points have many implications for how we do our work.
The first is that we should recognize that many, if not
most, of the things that people have been calling ontologies
should really be called *terminologies*.  They are not
sufficiently precise and formal to be used for deduction.    (026)

Words must be related to ontologies, but that mapping is
a complex many-to-many relationship between the words of
any natural language and the categories of an ontology.
Consider the word "file" as used in operating systems:    (027)

  - IBM mainframe:  A bit string separated into records
    by the operating system.    (028)

  - Unix:  A character string separated by new-line characters.    (029)

  - Macintosh:  A character string separated by carriage-return
    characters.    (030)

  - Windows:  A character string separated by new-line plus
    carriage-return.    (031)

For each of these four senses of the word "file", there are
multiple "microsenses" that distinguish the various kinds of
attributes a file may have in different versions (e.g., the
FAT16, FAT32, and NTFS file systems of Windows or the many
different file systems for the versions of Unix -- any or
all of which change with each release or patch to the OS).    (032)

For many purposes, we can say in English "Please send me
the file."  But many things can happen to that file in
transit so that the result when shipped from A to B and
back to A is not identical to the original.  For some
applications, the difference may not be important, but
for others it may be vitally important.    (033)
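
A small Python sketch of how such a round trip can change a file. The
two conversion functions are hypothetical stand-ins for whatever a
transfer tool might do to line endings, not a description of any
particular tool. The original has mixed line endings; after going from
a Windows-style convention to a Unix-style one and back, the bytes no
longer match.

```python
def windows_to_unix(data: bytes) -> bytes:
    """Naive outbound step: turn every CR+LF pair into a bare LF."""
    return data.replace(b"\r\n", b"\n")


def unix_to_windows(data: bytes) -> bytes:
    """Naive return step: turn every LF into a CR+LF pair."""
    return data.replace(b"\n", b"\r\n")


# Invented sample content with mixed line endings on the second line.
original = b"line one\r\nline two\nline three\r\n"

round_trip = unix_to_windows(windows_to_unix(original))

print(round_trip == original)  # prints False
print(round_trip)
# prints b'line one\r\nline two\r\nline three\r\n'
```

For a text editor the difference is invisible; for a checksum or a
record-oriented parser it is a different file.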

These issues, which can be illustrated with just the word
"file", are multiplied over and over again with every term
in the terminology (forget the word "ontology" because most
of them are terminologies).    (034)

With this preamble, I'll comment on some of the comments:    (035)

PC> ... let us find out how close we can get by trying to build
 > the common UO and seeing where the residual problems are...    (036)

Matthew has already answered that point:    (037)

MW> I challenge you to pick one and use it. If you are not
 > prepared to do so, why do you think anyone else would be
 > prepared to use one that you approve of?    (038)

Furthermore, the Cyc project has already invested 750 person-
years in determining "where the residual problems are".
I agree with Lenat:  the upper level is much less important
than the mid and lower levels.  Don't waste more time and
money on things that don't matter.    (039)

JS>> Categories that are not shared need not be aligned.    (040)

MW> It depends on what you are trying to do. Full integration
 > would probably allow you to do things that could not be
 > done in any of them individually. Partial integration allows
 > them to interoperate which allows you to do just what each
 > of them can do.    (041)

The term "full integration" is meaningless.  Every computer
system of any size is constantly being updated and revised.
It is impossible to have "full integration" of version 1.2.3
with version 1.2.4, much less with any version of any system
that was independently developed.    (042)

MW> ... the more ontologies you have to map to each other
 > the cheaper it becomes to do one mapping from each to an
 > integrating ontology (which might of course be one of them).    (043)

If you really mean what you are saying, I believe that you are
using the word "ontology" for what I am calling a "terminology."    (044)

A lot of people have aligned their terminologies, but nobody
has ever aligned any large ontology with any other in the
sense that all inferences with one are *identical* to all
the inferences with the other.    (045)
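
A toy Python sketch of the distinction. The two axiom sets and the class
names are invented, and subclass-only reasoning is a drastic
simplification of what a real ontology supports. Both sets use exactly
the same three terms, so a term-by-term alignment succeeds, yet the
facts they entail are not identical.

```python
def vocabulary(axioms: set[tuple[str, str]]) -> set[str]:
    """All class names mentioned in a set of (subclass, superclass) axioms."""
    return {name for pair in axioms for name in pair}


def entailed(axioms: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Transitive closure: every subclass fact that follows from the axioms."""
    closure = set(axioms)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Two hypothetical axiom sets over the same three terms.
A = {("File", "Document"), ("Document", "InformationObject")}
B = {("File", "InformationObject"), ("Document", "InformationObject")}

print(vocabulary(A) == vocabulary(B))  # prints True: the terms align
print(entailed(A) == entailed(B))      # prints False: the inferences differ
print(entailed(A) - entailed(B))       # prints {('File', 'Document')}
```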

AA> if only as 'vagueness' you don't mean 'ambiguity' (a plurality
 > of meanings for the same word), but rather 'equivocation'...    (046)

No.  By 'vagueness', I don't mean 'ambiguity' or 'equivocation'.
By vagueness, I mean being vague -- having no precise boundaries,
having an open-ended number of possible definitions, or allowing
a very wide variety of interpretations in different circumstances.    (047)

Nearly all the words of every natural language are vague, and the
most general words, such as Science, Technology, Art, Business,
Education, Law, Professional, Amateur, Truth, Beauty, Justice,
Love, Hate, Good, Bad, Evil, Happy, Sad, etc., etc., are so vague
that it is impossible to give any kind of formal definition.    (048)

Just look at what happened when Socrates tried to define the word
"justice".  The result was an entire book, Plato's _Republic_,
which laid out a specification for revising all of society in
a way that I can't imagine anyone ever wanting to live.    (049)

CP> My experience with integration (particularly legacy system
 > integration) within an enterprise (a common major problem for
 > most large enterprises) is that a top ontology is not only useful,
 > but almost essential.    (050)

I strongly suspect that you are talking about a terminology, and I
would certainly agree.  Having a complete glossary of all the words
that people use in an enterprise is *extremely* important.  I would
advocate much more attention to the very practical (and solvable)
problem of gathering and defining terminology in ordinary language
than to the hopeless dream of a universal formal ontology in logic.    (051)

CP> I also believe that your use of the Peircean position on
 > precision is not borne out by (the facts of) the history of
 > engineering....    (052)

On the contrary, your story supports what I (and Peirce) were
trying to say:  precision is possible for a very narrow subject.
Whitworth's micrometer was *not* designed for all measurements.    (053)

And Peirce certainly understood precision.  He was the first person
to recommend the use of a wavelength of light as a standard for
measurement -- and he also designed and built the instruments
for using a wavelength of light to measure the pendulums he used
in measuring gravity.  But he didn't use a wavelength of light
for measuring his house.    (054)

EP> I'm not sure that I would use the term "vague" however.  That
 > sort of implies that human language in glosses and doc strings
 > cannot be written so as to be precise.  Does not a good gloss
 > entail the necessary and sufficient conditions of the concept
 > that it describes?    (055)

A good gloss (or a good use of language in any circumstance) is
one that is as precise as necessary for the purpose.  I also make
the point that any natural language can be used as precisely as
any formal language.  However, that level of precision *always*
depends on the context.  The words that are used in language
are inherently vague when taken out of context, and the level
of granularity depends on the purpose in the given circumstances.    (056)

For further discussion of vagueness in natural language and the
issues in mapping language to logic, I suggest the slides I
used for a talk last month:    (057)

    http://www.jfsowa.com/talks/cmapping.pdf    (058)

The most important problems for us to address are the mappings
from language to logic (and other computable languages).  For
that purpose, an important first step is to get our terminologies
in order -- and recognize that they are *not* ontologies.    (059)

John Sowa    (060)

Message Archives: http://colab.cim3.net/forum/ontac-forum/
To Post: mailto:ontac-forum@xxxxxxxxxxxxxx
Shared Files: http://colab.cim3.net/file/work/SICoP/ontac/
Community Wiki: 
http://colab.cim3.net/cgi-bin/wiki.pl?SICoP/OntologyTaxonomyCoordinatingWG    (062)