
Re: [cuo-wg] An Evolved Approach to CDSI?

To: "common upper ontology working group" <cuo-wg@xxxxxxxxxxxxxx>
From: "Haugh, Brian" <bhaugh@xxxxxxx>
Date: Wed, 22 Nov 2006 18:38:34 -0500
Message-id: <DC92506CBE953D44A0EF52095965E835010DAF20@xxxxxxxxxxxxxxx>
Perhaps some replies to issues raised by Brad Cox (in quotes) may help
clarify some of the real problems faced by CDSI, which his proposal for
an "evolved approach" does not adequately address.     (01)

But, those who already feel that they understand the "N^2 problem" and
its ramifications for Brad's proposal and CDSI may want to skip this. 
__________
"I don't feel I understand what people mean by the term "N^2 problem....
for N machines in a linear pipeline, the number of interfaces is N-1."    (02)

"Pipelines," also known as "stovepipes," are not known to promote
effective interoperability across systems, much less across domains.
Translation pipelines, however, are even worse than traditional
stovepipe systems, which may share some common internal representations.
A translation "pipeline" would face the results you get from the
children's game of "telephone", only worse. Each translation is likely
to lose or distort some aspect of information obtained from a source
with different semantic foundations, so by the end the results may be
unrecognizable.     (03)

While we may all agree that you won't have the worst N^2 case in every
context, you really don't want a lot of linear pipelines translating
between systems/domains. In addition, capability needs are driving
requirements for direct interoperability between more and more
systems/domains. Hence, the N^2 upper bound is a real concern.
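
To make the counting concrete, here is a minimal sketch in Python (my
own illustration, not from this thread) of how many translation
interfaces each topology requires: a linear pipeline, a full pairwise
mesh, and a hub-and-spoke arrangement through one common model (the
common-upper-ontology alternative):

    # Illustrative only: translation-interface counts for N systems.
    def pipeline(n):
        return n - 1            # linear chain: each system feeds the next

    def full_mesh(n):
        return n * (n - 1)      # one directed translator per ordered pair

    def hub(n):
        return 2 * n            # one mapping into, one out of a common model

    for n in (10, 50, 100):
        print(n, pipeline(n), full_mesh(n), hub(n))
    # n=100: pipeline 99, full mesh 9900, hub 200

The full-mesh count is what drives the N^2 concern; the hub count is
what a common upper ontology is meant to buy you.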
____________
"The approach doesn't much depend on what standard (language) is used."    (04)

While the definition of the ("designed") approach is independent of the
language used, I expect all would agree that any success achieved by an
application of this approach will be strongly dependent on the
expressive power of the language, as well as its semantic foundations,
breadth of adoption, adequacy of support tools, etc.
______________
[In] "the evolved approach ...groups...address the problem in much the
same way we solve inteoperability with natural languages; by using
dictionaries and related tools, using interpreters, etc."    (05)

While the proposal is unclear, it sounds like the "evolved approach"
involves point-to-point translations between systems/domains and a
competitive marketplace for achieving the best translations. If so, it
seems to me that any faith in such an approach underestimates:    (06)

1) the number of point-to-point translation components required    (07)

2) the difficulty of doing translation between systems/domains with no
common semantic foundation (you still need the "domain experts" as in
the "designed approach")    (08)

3) the exorbitant costs of supporting a competitive marketplace in
point-to-point translation components. The market for customized
components to translate between individual systems or even domains does
not seem adequate to support an effective free market. Providing
incentives to government employees (or contractors) to produce multiple
competing translations doesn't sound very cost-effective either. Someone
has to pay the labor for all those translations (order K*N^2, where K is
the number of competing solutions for each pairwise translation; see the
arithmetic sketched below). General-purpose tools for dictionaries and
translation cannot do the translations themselves, even if there were a
viable market for them.
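
To put rough numbers on that cost (illustrative figures of my own, not
from the thread): with N = 100 systems needing pairwise interoperability
and K = 3 competing products per directed translation, that is
K*N*(N-1) = 3 * 100 * 99 = 29,700 translation components to build and
maintain, versus roughly 2*N = 200 mappings (one into and one out of a
shared model per system) under a common-ontology approach.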
__________________
"But mainly because people just don't solve ontology differences that
way in
 the real (non-IT) world. They just buy a dictionary, or hire a
translator.  Problem solved."    (09)

This analogy with natural language translation by humans doesn't hold up
too well, for a number of obvious reasons. We don't want humans to do
all real-time translations manually, due to the costs in labor, time,
and money. Even existing translation of (and information extraction
from) human-readable natural language text is moving toward automation
(e.g., TIDES, ACE, AFE) due to volume and cost issues.     (010)

So, I think we all recognize the need for machines to do the
cross-domain translations and to use the results for tasks like
discovery and analysis. But, as acknowledged, machines are not as smart
as people, so we can't just hand them a dictionary :-). To effectively
translate and analyze information from multiple disparate sources,
computers need software translation components that are grounded in
formal semantics supporting automated inference. But an "evolved
approach" dependent on funding a competitive market in machine-readable
translation components ("dictionaries") for all pairwise unrelated
languages/systems/domains (which need to interoperate) doesn't sound
very cost-effective (order K*N^2 is a real cost issue).
__________________    (011)

While the "evolved approach" that's been described has long been
considered a non-starter by many, other concepts of "evolution" and
"bottom-up" development in semantic interoperability might well have
something to offer to the CDSI problem. There has been work on
automating the extraction of ontology elements from text, as well as the
evolution of ontologies over time to reflect changing usage. The work by
John Sowa in this area, already cited in this forum, is one example of
interest. Such approaches have evolved well beyond the simplistic "hire
a translator" proposal. Still, they seem to need some more evolution
before they are ready for prime time applications.     (012)

And, while the "designed approach" has been dismissed, it has a track
record of success in promoting data interoperability amongst disparate
systems. The C2IEDM/GH5/JC3IEDM, which others have mentioned, is one
good example wherein a multi-national group of domain experts
collaborated to produce a common information model that serves as the
"lingua franca" for operational interoperability in Command and Control
across very different systems maintained by different governments. While
such efforts have been unfairly disparaged as too "top-down", when
successful they actually rely heavily on bottom-up inputs from Subject
Matter Experts working closely with IT folks to get the information
models to support user needs.     (013)

That is not to say that the specific approach taken by the C2IEDM/GH5 is
adequate for CDSI, as it falls short in its semantic content (being
inadequate for machine "understanding") and it covers only one domain
(Command and Control). Still, some variant of the generic "designed
approach" might be adapted for CDSI by the use of a common upper
ontology and/or a small set of upper/middle ontologies.     (014)


Brian
__________________________________________
Brian A. Haugh, Ph.D.
Institute for Defense Analyses 
Information Technology and Systems Division  Phone: (703) 845-6678
4850 Mark Center Drive                       Fax: (703) 845-6848
Alexandria, VA 22311-1882                    Email: bhaugh@xxxxxxx    (015)


> -----Original Message-----
> From: cuo-wg-bounces@xxxxxxxxxxxxxx [mailto:cuo-wg-bounces@xxxxxxxxxxxxxx]
> On Behalf Of Brad Cox, Ph.D.
> Sent: Monday, November 20, 2006 3:38 PM
> To: rick@xxxxxxxxxxxxxx; common upper ontology working group
> Subject: Re: [cuo-wg] White Paper
> 
> Thanks for the encouraging note, Richard. I'd backed off, convinced I
> wasn't being heard. But buoyed by your note, I'll take one more shot
> at explaining what I've been trying to get across.
> 
> One of the things that's confusing me is I don't feel I understand
> what people mean by the term "N^2 problem". I'm guessing that's
> shorthand for cost increasing as limitOf(N*(N-1)) as N -> infinity =
> N^2. Fair enough; it's shorter.
> 
> But that applies if all N machines are to be connected to all N-1
> others. Actually cost increases as the number of *interfaces*. N^2 is
> just an upper bound on that. But why concentrate on the upper bound
> when interfaces could be counted as easily, without the concern over
> whether upper bounds are realistic? For example, for N machines in a
> linear pipeline, the number of interfaces is N-1, hardly N^2 or even
> N*(N-1).
> 
> So let me rephrase the problem as one of semantic interoperability
> between M interfaces, where M is larger than we might like but still
> far less than N*(N-1). I've been trying to point out that there are
> two ways of approaching that problem. I've called them the designed
> approach and the evolved approach.
> 
> In the designed approach, a (small) community of experts uses
> high-technology ontology tools to build a generalized solution (upper
> ontology) that can generate the mappings needed to make any given
> interface interoperable. The approach doesn't much depend on what
> standard (language) is used. I used OWL as my example because that's
> what I'm most familiar with. Structured English, structured French, or
> plain ol' Java/Cobol/Haskell would do about as well, albeit with
> varying readability. What's important here is that the approach is
> centrally planned, largely confined to an expert community, although
> hopefully with at least some support by domain experts with
> conflicting demands on their time.
> 
> The evolved approach is entirely different and more bottom-up. M
> interfaces imply there are M groups of individuals that care about
> making each specific interface (call it M(i)) interoperate. Those M
> groups are empowered (governance?) to address the problem in much the
> same way we solve interoperability with natural languages: by using
> dictionaries and related tools, using interpreters, etc. Dictionaries
> and interpreters are evolved systems. Externally these are commercial
> products that compete with each other in a competitive system (free
> markets). But I could well imagine that domain experts within govt
> might produce translation dictionaries that might compete in a similar
> way, if govt could find a way to incentivize them to focus on the
> problem over other pressing uses of their time.
> 
> Point is, I could well see how the second (evolved) approach could
> "solve" the interoperability problem as I've stated it. I've much less
> confidence (approaching zero) that the designed approach (as I defined
> it) ever could. This is partially because AI technology just isn't
> very smart, and partially because you still need domain experts and
> don't have a way to incentivize them to contribute, since you've
> counted too heavily on high technology as the sole solution.
> 
> But mainly because people just don't solve ontology differences that
> way in the real (non-IT) world. They just buy a dictionary, or hire a
> translator. Problem solved.
> 
> --
> Work: Brad Cox, Ph.D; Binary Group; Mail bcox@xxxxxxxxxxxxxxx
> Home: 703 361 4751; Chat brdjcx@aim; Web http://virtualschool.edu
> 
>     (016)

 _________________________________________________________________
Message Archives: http://colab.cim3.net/forum/cuo-wg/
Subscribe/Unsubscribe/Config: http://colab.cim3.net/mailman/listinfo/cuo-wg/
To Post: mailto:cuo-wg@xxxxxxxxxxxxxx
Community Portal: http://colab.cim3.net/
Shared Files: http://colab.cim3.net/file/work/SICoP/cuo-wg/
Community Wiki: 
http://colab.cim3.net/cgi-bin/wiki.pl?SICoP/CommonUpperOntologyWG    (017)