cuo-wg

Re: [cuo-wg] An Evolved Approach to CDSI?

To: bcox@xxxxxxxxxxxxxxx, common upper ontology working group <cuo-wg@xxxxxxxxxxxxxx>, "Haugh, Brian" <bhaugh@xxxxxxx>
From: "Brad Cox, Ph.D." <bcox@xxxxxxxxxxxxxxx>
Date: Thu, 23 Nov 2006 11:47:44 -0500
Message-id: <20061123164505.M82570@xxxxxxxxxxxxxxx>
Jim; I've applied the enclosed suggestion to the wiki page at
http://www.visualknowledge.com/wikikey/A72465S3496911 in the hope that others
will refine the use case into one we could all accept. I hope this is OK; it
matches my understanding of what wikis are for.    (01)

--
Work: Brad Cox, Ph.D; Binary Group; Mail bcox@xxxxxxxxxxxxxxx
Home: 703 361 4751; Chat brdjcx@aim; Web http://virtualschool.edu    (02)


---------- Original Message -----------
From: "Brad Cox, Ph.D." <bcox@xxxxxxxxxxxxxxx>
To: "Haugh, Brian" <bhaugh@xxxxxxx>, "common upper ontology working group"
<cuo-wg@xxxxxxxxxxxxxx>
Sent: Thu, 23 Nov 2006 07:38:22 -0500
Subject: Re: [cuo-wg] An Evolved Approach to CDSI?    (03)

> > with no real semantic interoperability,
> > which I thought was the focus of this group.
> 
> That brings us squarely to the question I've been struggling to expose;
> exactly what is the goal of this group, minus code phrases ("semantic
> interoperability" or "N^2 problem") whose meaning won't be shared by folks
> like me... or those who'll ultimately pass on whatever solutions we propose.
> 
> I'll propose a solution to our internal group semantic interop problem
> (letting the designed vs evolved solutions to external problems stand for
> now).  That is to define the use case for the problem to be solved. The
> group's wiki page will do for now, front and center where it can't be missed.
> This should be done collaboratively, and with more care than I've given it,
> since the use case chosen will have radical impact on solution approaches
> (designed vs evolved, etc). Here's a rough draft, no doubt leaning toward my
> biases, based on another example posted earlier to this list:
> 
> Use Case:
> An officer on the battlefield learns of an entirely new source for weather
> data. He is tasked to integrate that data with existing data sources to
> prepare tomorrow's battle plan.
> 
> Key phrases:
> The words "new" and "tomorrow" challenge both of the proposed solutions. For
> the designed approach, this implies 24 hours to communicate an entirely new
> source to the upper ontology experts, incorporate the new data source into
> their upper ontology database (including whatever support from domain experts
> may be required) and return working translators to the officer. For the
> evolved approach, there are 24 hours to assemble a domain expert team from
> upstream/downstream domains, build a working translation, and put it into
> effect.
> 
> 24 hours may not be possible with either approach, barring war-winning
> benefits. But it does suggest specific tooling differences. For example, the
> evolved approach would demand very elaborate collaboration tools for pulling
> domain experts together on a moment's notice, plus very specialized tools for
> designating which messages must be translated *for that specific interface*,
> and how.
> 
> Similar tools could be proposed for the designed approach. The key difference
> is they'd be concerned with the new interface in relation to all existing
> nodes, whereas the evolved approach focuses on only the two sides of the
> single new interface.
> 
> --
> Work: Brad Cox, Ph.D; Binary Group; Mail bcox@xxxxxxxxxxxxxxx
> Home: 703 361 4751; Chat brdjcx@aim; Web http://virtualschool.edu
> 
> ---------- Original Message -----------
> From: "Haugh, Brian" <bhaugh@xxxxxxx>
> To: <bcox@xxxxxxxxxxxxxxx>, "common upper ontology working group"
> <cuo-wg@xxxxxxxxxxxxxx>
> Sent: Wed, 22 Nov 2006 23:00:21 -0500
> Subject: RE: [cuo-wg] An Evolved Approach to CDSI?
> 
> > Stovepipes are a type of pipeline in which every node is connected in a
> > linear chain, not communicating with nodes outside the chain. Pipelining
> > fails to be an apt example of the (undisputed) point that semantic
> > interoperability between N nodes need not require N^2 connections. One
> > reason cited was that communications between nodes that require
> > translations between systems with very different information models ARE
> > LIKELY to lose or distort information because such translations are
> > bound to be lossy unless they at least share some common semantic
> > foundations. It is not an issue of translations not working the same way
> > each time, but of semantic incompatibilities between languages. Hence, a
> > well-designed architecture for translating between different languages
> > would do well to avoid many sequential translations to get from one
> > language to another. This is the sort of thing human translators of
> > human languages try to avoid.
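
A minimal sketch of the lossy-chain concern described above, under stated assumptions: the two vocabularies (`A_TO_B`, `B_TO_C`) and their weather terms are hypothetical and not taken from any system discussed on this list. Each mapping is deterministic, so it works the same way every time, yet distinctions present in the source vocabulary can still disappear once a coarser vocabulary sits in the chain.

```python
# Hypothetical term mappings between three systems (illustrative only).
# Each dict maps a term in one vocabulary to the closest term in the next.
A_TO_B = {"drizzle": "rain", "downpour": "rain", "sleet": "mixed", "snow": "snow"}
B_TO_C = {"rain": "precipitation", "mixed": "precipitation", "snow": "precipitation"}


def translate_chain(term, mappings):
    """Apply each mapping in order, as a pipeline of translators would."""
    for mapping in mappings:
        term = mapping[term]
    return term


def distinct_outputs(terms, mappings):
    """Count how many source distinctions survive the whole chain."""
    return len({translate_chain(term, mappings) for term in terms})


source_terms = list(A_TO_B)

# Distinctions surviving after 0, 1, and 2 translation hops.
for hops in ([], [A_TO_B], [A_TO_B, B_TO_C]):
    print(len(hops), "hops:", distinct_outputs(source_terms, hops), "distinct terms")
```

In this toy case the chain is perfectly repeatable, but it cannot be inverted: after two hops nothing downstream can tell "sleet" from "snow". Whether real interfaces behave this way depends on how far apart their information models are.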
> > 
> > Effective translation between languages/domains seeks to minimize the
> > number of intermediaries. Hence, pipelined architectures are an unlikely
> > model for promoting CDSI. Hence, if you want accurate translation, the
> > number of translators between N nodes using fundamentally different
> > languages with no common lingua franca is likely to be closer to N^2
> > than to N. Using a lingua franca would nominally require only N
> > interfaces (one for each node) translating to and from the common
> > language, with one intermediary language for all cross-domain
> > communications. But, while this may be an ideal for effective
> > translation, CDSI faces problems in constructing any single lingua
> > franca (or universal ontology). Hence, we look to alternatives, such as
> > common upper/middle ontologies to promote effective translation.
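
The counts being compared in this thread can be sketched as follows. This is only an illustration: the function names are hypothetical, and it assumes that "N^2" is read as one translator per ordered pair of distinct nodes, i.e. N*(N-1), and that a lingua franca needs one two-way interface per node.

```python
from itertools import permutations


def pipeline_interfaces(n):
    """Linear chain of n nodes: one interface between each adjacent pair."""
    return max(n - 1, 0)


def lingua_franca_interfaces(n):
    """One interface per node, translating to and from the common language."""
    return n


def pairwise_translators(n):
    """One translator per ordered pair of distinct nodes (source, target)."""
    return len(list(permutations(range(n), 2)))


print("N  pipeline  lingua_franca  pairwise")
for n in (2, 5, 10, 50):
    print(n, pipeline_interfaces(n), lingua_franca_interfaces(n), pairwise_translators(n))
```

The pipeline and lingua franca counts both grow linearly; the difference between them is the number of hops a message takes, not the number of interfaces to build.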
> > 
> > Even if we assume perfect translations and a pipeline with N-1
> > interfaces, the evolutionary part of the "evolved approach" would
> > require multiple translations for each interface, hence bloating the
> > required work. But, it seems difficult to reconcile the recent assertion
> > (below) that translations don't distort information with the original
> > proposal for a competitive market in translations ("dictionaries"),
> > which would seem to suggest some expected imperfections.
> > 
> > The contrast drawn below between the approaches of either using upper
> > ontologies or just doing direct (syntactic?) translations really
> > misrepresents the former. Both of these approaches depend primarily on
> > human intelligence (domain experts) developing translation rules. So, we
> > are all betting on the humans :-). But, the ontology approach will also
> > capture much of the semantics of the concepts, enabling machine
> > inference that a purely syntactic translation would not support. And,
> > the discipline of formalizing the semantics in a common ontology (or set
> > thereof) also promises to clarify the meanings of information elements,
> > so that common understandings of exchanged information are enhanced for
> > both humans and machines. But, if XSLT is your model for translation, it
> > seems all you have is syntax, with no real semantic interoperability,
> > which I thought was the focus of this group.
> > 
> > Brian
> > __________________________________________
> > Brian A. Haugh, Ph.D.
> > Institute for Defense Analyses 
> > Information Technology and Systems Division  Phone: (703) 845-6678
> > 4850 Mark Center Drive                       Fax: (703) 845-6848
> > Alexandria, VA 22311-1882                    Email: bhaugh@xxxxxxx
> > 
> > > -----Original Message-----
> > > From: cuo-wg-bounces@xxxxxxxxxxxxxx [mailto:cuo-wg-bounces@xxxxxxxxxxxxxx]
> > > On Behalf Of Brad Cox, Ph.D.
> > > Sent: Wednesday, November 22, 2006 8:23 PM
> > > To: common upper ontology working group
> > > Subject: Re: [cuo-wg] An Evolved Approach to CDSI?
> > > 
> > > > "Pipelines," also known as "stovepipes," are not known to promote
> > > > effective interoperability across systems, much less across domains.
> > > 
> > > We seem to have miscommunicated. Pipeline doesn't mean stovepipe, certainly
> > > not in my domain. It means a linear chain of nodes, linearly connected by
> > > interfaces. I never proposed pipelining as a solution, but as a concrete
> > > example to show a lower bound (N-1) that is far smaller than the upper bound
> > > (N*(N-1)). Real systems fall somewhere between.
> > > 
> > > The rest of your argument seems to hinge on that misunderstanding. Any system,
> > > however connected, will have M interfaces between its nodes. The interfaces do
> > > the mappings. Each translation is *NOT* likely to lose or distort information;
> > > each translation works exactly the same way, day in, day out. The internet
> > > would never work if bits degraded along the way.
> > > 
> > > The question is whether those mappings are derived from some explicit upper
> > > ontology that understands all domains of interest, versus independently for
> > > each interface by pairs of domain experts on each side of each interface.
> > > 
> > > The first approach relies on high-tech ontology specification and translator
> > > derivation tools. The second one relies on human intelligence (domain experts)
> > > feeding extremely low-tech tools (XSLT or slightly better).
> > > 
> > > Take your choice. My bet is on humans over AI every time.
> > > 
> > > --
> > > Work: Brad Cox, Ph.D; Binary Group; Mail bcox@xxxxxxxxxxxxxxx
> > > Home: 703 361 4751; Chat brdjcx@aim; Web http://virtualschool.edu
> > > 
> > > 
> > > ---------- Original Message -----------
> > > From: "Haugh, Brian" <bhaugh@xxxxxxx>
> > > To: "common upper ontology working group" <cuo-wg@xxxxxxxxxxxxxx>
> > > Sent: Wed, 22 Nov 2006 18:38:34 -0500
> > > Subject: Re: [cuo-wg] An Evolved Approach to CDSI?
> > > 
> > > > Perhaps some replies to issues raised by Brad Cox (in quotes) may help
> > > > clarify some of the real problems faced by CDSI, which his proposal for
> > > > an "evolved approach" does not adequately address.
> > > >
> > > > But, those who already feel that they understand the "N^2 problem" and
> > > > its ramifications for Brad's proposal and CDSI may want to skip this.
> > > > __________
> > > > "I don't feel I understand what people mean by the term "N^2 problem....
> > > > for N machines in a linear pipeline, the number of interfaces is N-1."
> > > >
> > > > "Pipelines," also known as "stovepipes," are not known to promote
> > > > effective interoperability across systems, much less across domains.
> > > > Translation pipelines, however, are even worse than traditional
> > > > stovepipe systems, which may share some common internal representations.
> > > > A translation "pipeline" would face the results you get from the
> > > > children's game of "telephone", only worse. Each translation is likely
> > > > to lose or distort some aspect of information obtained from a source
> > > > with different semantic foundations, so by the end the results may be
> > > > unrecognizable.
> > > >
> > > > While we may all agree that you won't have the worst N^2 case in every
> > > > context, you really don't want a lot of linear pipelines translating
> > > > between systems/domains. In addition, capability needs are driving
> > > > requirements for direct interoperability between more and more
> > > > systems/domains. Hence, the N^2 upper bound is a real concern.
> > > > ____________
> > > > "The approach doesn't much depend on what standard (language) is used."
> > > >
> > > > While the definition of the ("designed") approach is independent of the
> > > > language used, I expect all would agree that any success achieved by an
> > > > application of this approach will be strongly dependent on the
> > > > expressive power of the language, as well as its semantic foundations,
> > > > breadth of adoption, adequacy of support tools, etc.
> > > > ______________
> > > > [In] "the evolved approach ...groups...address the problem in much the
> > > > same way we solve interoperability with natural languages; by using
> > > > dictionaries and related tools, using interpreters, etc."
> > > >
> > > > While unclear, it sounds like the proposed "evolved approach" involves
> > > > point-to-point translations between systems/domains and a competitive
> > > > marketplace for achieving the best translations. If so, it seems to me
> > > > that any faith in such an approach underestimates:
> > > >
> > > > 1) the number of point-to-point translation components required
> > > >
> > > > 2) the difficulty of doing translation between systems/domains with no
> > > > common semantic foundation (you still need the "domain experts" as in
> > > > the "designed approach")
> > > >
> > > > 3) the exorbitant costs of supporting a competitive marketplace in
> > > > point-to-point translation components. The market for customized
> > > > components to translate between individual systems or even domains does
> > > > not seem adequate to support an effective free market. Providing
> > > > incentives to government employees (or contractors) to produce multiple
> > > > competing translations doesn't sound real cost-effective either. Someone
> > > > has to pay the labor for all those translations (Order K*N^2, where K is
> > > > the number of competing solutions for each pairwise translation).
> > > > General-purpose tools for dictionaries and translation cannot do the
> > > > translations themselves, even if there were a viable market for them.
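
To make the "Order K*N^2" estimate in item 3 concrete, here is a rough sketch. It assumes one translation component per ordered pair of distinct systems, so the exact count is K*N*(N-1); the function names and the sample values of N and K are hypothetical and carry no cost data from any real program.

```python
def competing_translations(n_systems, k_competitors):
    """Total components when every ordered pair gets k competing translations."""
    return k_competitors * n_systems * (n_systems - 1)


def added_by_next_system(n_systems, k_competitors):
    """Extra components needed when one more system joins n existing ones.

    The newcomer needs a translation to and from each existing system,
    and each of those is built k times over.
    """
    return 2 * k_competitors * n_systems


for n in (5, 10, 20):
    total = competing_translations(n, 3)
    extra = added_by_next_system(n, 3)
    print(f"N={n:>2} K=3 total={total:>5} next system adds {extra}")
```

The marginal cost of each new system grows with the number already connected, which is the part of the estimate that a fixed per-interface budget would not absorb.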
> > > > __________________
> > > > "But mainly because people just don't solve ontology differences that
> > > > way in the real (non-IT) world. They just buy a dictionary, or hire a
> > > > translator.  Problem solved."
> > > >
> > > > This analogy with natural language translation by humans doesn't hold up
> > > > too well for a number of obvious reasons. We don't want to use humans to
> > > > do all real-time translations manually due to costs in labor, time, and
> > > > money. Even existing translation of (& information extraction from)
> > > > human readable natural language text is moving towards automation
> > > > (e.g., TIDES, ACE, AFE) due to volume and cost issues.
> > > >
> > > > So, I think we all recognize the need for machines to do the
> > > > cross-domain translations and to use the results for tasks like
> > > > discovery and analysis. But, as acknowledged, machines are not as smart
> > > > as people, so we can't just hand them a dictionary :-). To effectively
> > > > translate and analyze information from multiple disparate sources,
> > > > computers need software translation components that are grounded in
> > > > formal semantics supporting automated inference. But, an "evolved
> > > > approach" dependent on funding a competitive market in machine-readable
> > > > translation components ("dictionaries") for all pairwise unrelated
> > > > languages/systems/domains (which need to interoperate) doesn't sound
> > > > real cost-effective (Order (K*N^2) is a real cost issue).
> > > > __________________
> > > >
> > > > While the "evolved approach" that's been described has long been
> > > > considered a non-starter by many, other concepts of "evolution" and
> > > > "bottom-up" development in semantic interoperability might well have
> > > > something to offer to the CDSI problem. There has been work on
> > > > automating the extraction of ontology elements from text, as well as the
> > > > evolution of ontologies over time to reflect changing usage. The work by
> > > > John Sowa in this area, already cited in this forum, is one example of
> > > > interest. Such approaches have evolved well beyond the simplistic "hire
> > > > a translator" proposal. Still, they seem to need some more evolution
> > > > before they are ready for prime time applications.
> > > >
> > > > And, while the "designed approach" has been dismissed, it has a track
> > > > record of success in promoting data interoperability amongst disparate
> > > > systems. The C2IEDM/GH5/JC3IEDM, which others have mentioned, is one
> > > > good example wherein a multi-national group of domain experts
> > > > collaborated to produce a common information model that serves as the
> > > > "lingua franca" for operational interoperability in Command and Control
> > > > across very different systems maintained by different governments. While
> > > > such efforts have been unfairly disparaged as too "top-down", when
> > > > successful they actually rely heavily on bottom-up inputs from Subject
> > > > Matter Experts working closely with IT folks to get the information
> > > > models to support user needs.
> > > >
> > > > That is not to say that the specific approach taken by the C2IEDM/GH5 is
> > > > adequate for CDSI, as it falls short in its semantic content (being
> > > > inadequate for machine "understanding") and it covers only one domain
> > > > (Command and Control). Still, some variant of the generic "designed
> > > > approach" might be adapted for CDSI by the use of a common upper
> > > > ontology and/or a small set of upper/middle ontologies. But, I.
> > > >
> > > > Brian
> > > > __________________________________________
> > > > Brian A. Haugh, Ph.D.
> > > > Institute for Defense Analyses
> > > > Information Technology and Systems Division  Phone: (703) 845-6678
> > > > 4850 Mark Center Drive                       Fax: (703) 845-6848
> > > > Alexandria, VA 22311-1882                    Email: bhaugh@xxxxxxx
> > > >
> > >  _________________________________________________________________
> > > Message Archives: http://colab.cim3.net/forum/cuo-wg/
> > > Subscribe/Unsubscribe/Config: http://colab.cim3.net/mailman/listinfo/cuo-wg/
> > > To Post: mailto:cuo-wg@xxxxxxxxxxxxxx
> > > Community Portal: http://colab.cim3.net/
> > > Shared Files: http://colab.cim3.net/file/work/SICoP/cuo-wg/
> > > Community Wiki: http://colab.cim3.net/cgi-bin/wiki.pl?SICoP/CommonUpperOntologyWG
> > >
> ------- End of Original Message -------
> 
------- End of Original Message -------    (04)
