- Ref: DataReferenceModel_09_2004 (2KI3)
- Comment: Volume I, Version 1.0 should be strengthened to enable the decentralized network of trusted information sharing envisioned in “Creating a Trusted Network for Homeland Security.” To achieve this decentralized network of trusted information sharing, Volume I requires high-level abstractions of Security and Privacy and high-level abstractions of Semantic Interoperability. (2KI4)
- Security and Privacy (2KI5)
- Information exchange implies the following key elements of Security and Privacy: collaboration among parties in a circle of trust; identity and encryption; and an information exchange protocol. These elements of Security and Privacy should be represented as high-level abstractions in Volume I; details can be deferred to Volume III as specified in the DRM Roadmap. (2KI6)
- For example, Exhibit G: DRM Collaboration Process could be enhanced to represent information sharing within a circle of trust. A circle of trust, as specified in the Liberty Project, implies a set of security agreements among the parties to the information exchange, governed by a specified protocol. (2KI7)
- This figure illustrates trusted information sharing in a circle of trust as a collaboration process based on the OMG Enterprise Distributed Object Computing (EDOC) standard. (2KI8)
- Identity and encryption should be added to the Information Exchange Package, the payload of the information exchange. For example, Exhibit D: Information Exchange Package should be enhanced to include role-based identity as well as encryption of the payload in support of its implied model of message-level security. (2KI9)
- An Information Exchange Protocol uncouples the payload’s information structure from its persistent structure and provides a trust agreement for each specified information exchange. Because the payload’s information structure is uncoupled from its persistent structure, we don’t all have to structure information the same way; we just have to know how to represent the information for the specified payload. For example, the Data Requestor in Figure 1 could present its information structure to the Data Provider and receive the payload in its expected format. The information exchange protocol also specifies the trust agreement, or policy, for the specified information exchange. The Security Assertion Markup Language (SAML) and XML Digital Signature (XML-DSIG) specifications support these key architectural principles of trusted information sharing on a decentralized network. (2KIA)
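- The uncoupling described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a DRM or SAML implementation: the record, the concept names, the `analyst` role, and the functions `build_payload` and `verify_payload` are all hypothetical, and an HMAC over JSON stands in for a real signature mechanism such as XML-DSIG.

```python
import hashlib
import hmac
import json

# Hypothetical persistent record, named the way the Data Provider stores it.
PROVIDER_RECORD = {"surname": "Doe", "given_nm": "Jane", "dob": "1970-01-01"}

# Hypothetical mapping from shared concept names to the provider's own fields.
PROVIDER_CONCEPTS = {
    "familyName": "surname",
    "givenName": "given_nm",
    "birthDate": "dob",
}

# Hypothetical trust agreement: the concepts each role is allowed to receive.
TRUST_POLICY = {"analyst": {"familyName", "givenName"}}


def build_payload(requested, role, shared_key):
    """Return the payload in the requestor's structure, plus an integrity tag.

    `requested` maps the requestor's own field names to shared concept names.
    """
    allowed = TRUST_POLICY.get(role, set())
    denied = set(requested.values()) - allowed
    if denied:
        raise PermissionError(f"role {role!r} may not receive: {sorted(denied)}")
    # Translate from the provider's persistent structure to the requested one.
    payload = {
        field: PROVIDER_RECORD[PROVIDER_CONCEPTS[concept]]
        for field, concept in requested.items()
    }
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    # Assumption: HMAC-SHA256 is only a stand-in for a real digital signature.
    tag = hmac.new(shared_key, body, hashlib.sha256).hexdigest()
    return payload, tag


def verify_payload(payload, tag, shared_key):
    """Check that the payload was not altered after the provider tagged it."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    expected = hmac.new(shared_key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)
```

- In this sketch a requestor that calls `build_payload({"last": "familyName", "first": "givenName"}, "analyst", key)` receives the fields under its own names, while a request for `birthDate` is refused by the policy. Neither party has to adopt the other's storage layout.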
- Semantic Interoperability (2KIB)
- Volume I requires a high-level abstraction of semantic interoperability. Semantics turn data into information at various levels. Defining levels of interoperability allows Volume I to level-set the DRM’s expected outcomes against capabilities. Without a specified level of semantic interoperability, mappings between diverse representations will have no meaning. For example, a mapping between a Unified Modeling Language (UML) representation and a Web Ontology Language (OWL) representation can only have meaning when the semantics of the target representation are specified in the source representation. (2KIC)
- This figure, adapted from work done by Leo Obrst of the MITRE Corporation for the Semantic Interoperability Community of Practice (SICoP), represents levels of semantic interoperability across the ontology spectrum based on W3C standards. (2KID)
- The natural language semantics throughout Volume I are informal, whereas formal semantics are required for semantic interoperability and information exchange. For example, the word “transaction” has well-defined semantics among database administrators. To a database administrator, “transaction” means an insert, update, or delete operation with two-phase rollback and commit capabilities. The semantics of “transaction” in the DRM-I paragraph on Exchange of Data imply nothing of the sort and are a good example of how loosely defined natural language semantics lead to confusion. (2KIE)
- Natural Language Processing (NLP) is an evolving field, and 90% or more of the EA artifacts are defined in natural language. Version 1.0 should contain a high-level abstraction and usage model of NLP of unstructured data. Technologies are now available to infer noun phrases and their predicates, which means we can derive structured information from unstructured information. (2KIF)
- Business Areas and Lines of Business (LoB) may be useful to set context, but there are other, more useful context-setting approaches. For example, technologies are now available to infer context from data. By limiting context to Business Areas and LoBs, we severely limit context setting, especially where 90% of our data does not contain a Business Area or LoB context. And if the context we set has no relationship to the data we provide, then we lose meaning and information once again becomes data. (2KIG)
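- To make the database administrator's sense of “transaction” concrete, here is a small sketch using Python's standard `sqlite3` module. It is an illustration only: the `account` table and the `transfer_demo` function are hypothetical, and it shows commit and rollback on a single local database, not a distributed two-phase protocol.

```python
import sqlite3


def transfer_demo():
    """Attempt a transfer that violates a constraint, then roll it back."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute("INSERT INTO account VALUES (1, 100), (2, 0)")
    conn.commit()  # the initial state is now durable

    try:
        # Both updates belong to one transaction: all or nothing.
        conn.execute("UPDATE account SET balance = balance + 150 WHERE id = 2")
        # This update would make the balance negative and fails the CHECK.
        conn.execute("UPDATE account SET balance = balance - 150 WHERE id = 1")
        conn.commit()
    except sqlite3.IntegrityError:
        # Undo the first update as well, restoring the committed state.
        conn.rollback()

    rows = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
    conn.close()
    return rows
```

- After the failed transfer both balances are unchanged, which is exactly the all-or-nothing guarantee the word carries for a database administrator.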
- Data becomes information when we can derive meaning from its representation. Modern approaches to knowledge representation specify a triangle of meaning through a concept, its object, and its representation. ISO 11179 separates concept from term, but doesn’t adequately clarify the significance of this key principle of semantic interoperability. Thesaurus standards separate concept from term and Volume I must contain this distinction to achieve its stated objectives. (2KIH)
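- The separation of concept from term can be sketched as a small data structure. This is a hedged illustration in the spirit of a thesaurus, not the ISO 11179 metamodel: the `Concept` and `Thesaurus` classes, the concept identifiers, and the context labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """A unit of meaning, identified independently of any word used for it."""
    concept_id: str
    definition: str


@dataclass
class Thesaurus:
    concepts: dict = field(default_factory=dict)  # concept_id -> Concept
    terms: dict = field(default_factory=dict)     # (term, context) -> concept_id

    def add(self, concept, term, context):
        """Register `term` as a label for `concept` within `context`."""
        self.concepts[concept.concept_id] = concept
        self.terms[(term.lower(), context)] = concept.concept_id

    def resolve(self, term, context):
        """Return the Concept a term denotes in a context, or None."""
        concept_id = self.terms.get((term.lower(), context))
        return self.concepts.get(concept_id)

    def same_concept(self, term_a, context_a, term_b, context_b):
        """True only if both terms resolve to the same known concept."""
        a = self.resolve(term_a, context_a)
        b = self.resolve(term_b, context_b)
        return a is not None and a == b


def build_example():
    """Build a tiny thesaurus in which one term carries two meanings."""
    t = Thesaurus()
    atomic = Concept("C1", "Atomic unit of work with commit and rollback")
    exchange = Concept("C2", "Exchange of data between two parties")
    t.add(atomic, "transaction", "database")
    t.add(atomic, "unit of work", "database")
    t.add(exchange, "transaction", "data exchange")
    return t
```

- With `t = build_example()`, the call `t.same_concept("transaction", "database", "unit of work", "database")` is true, while `t.same_concept("transaction", "database", "transaction", "data exchange")` is false. Two different terms can share one concept, and the same term can denote different concepts in different contexts.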
- Volume I interprets ISO 11179 loosely, and terms like Super Type, Data Property, and Data Representation do not map strictly to the concepts ISO 11179 is communicating. This approach introduces another level of indirection and a loss of meaning in a document where clear communication is critical to successful implementation. It also implies yet another set of mapping, traceability, and training requirements with no clear evidence of a return on investment. (2KII)