4.2.1. What is Data Context and Why is it Important: (3X2W)
Data Context is any information that provides additional meaning to data and relates it to the purposes for which it was created and used. It is the information that makes it possible to provision a Context Awareness Service to support a COI or collaboration among COIs. A Context Awareness Service identifies the existence of a Data Asset and enables a user to discover whether it is potentially relevant to a given information need. The service makes Data Context artifacts, developed in accordance with the Data Context section of the DRM abstract model, available for use. These artifacts are chosen by the COI to reflect government-related business needs and contain adequate information to support government-related decision making. Typical examples of Data Context for a given Data Asset include a Topic identifying a subject area, a data stewardship assignment, and sources of record. At a minimum, the Data Context for a given Data Asset should answer the following questions: (3X2X)
- What are the data (subject areas/Topics and entities of interest) contained within the Data Asset? (3X2Y)
- What organization is responsible for maintaining the Data Asset? (3X2Z)
- What is the linkage to the FEA BRM? (3X30)
- What services are available to access the Data Asset? (See Data Sharing) (3X31)
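The four minimum questions above can be read as required fields of a context record. The following is a minimal sketch under that assumption; the class name `DataContext`, its field names, and the sample values are hypothetical illustrations and are not defined by the DRM.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataContext:
    """Hypothetical context record for one Data Asset (names are illustrative)."""
    asset_name: str
    topics: List[str] = field(default_factory=list)           # subject areas / Topics
    entities: List[str] = field(default_factory=list)         # entities of interest
    steward_organization: str = ""                             # who maintains the asset
    brm_linkage: str = ""                                      # linkage to the FEA BRM
    access_services: List[str] = field(default_factory=list)  # see Data Sharing

    def unanswered_questions(self) -> List[str]:
        """Return the minimum questions this record does not yet answer."""
        missing = []
        if not (self.topics or self.entities):
            missing.append("data contained within the Data Asset")
        if not self.steward_organization:
            missing.append("organization responsible for the Data Asset")
        if not self.brm_linkage:
            missing.append("linkage to the FEA BRM")
        if not self.access_services:
            missing.append("services available to access the Data Asset")
        return missing


# Placeholder values only; they do not describe a real asset.
example = DataContext(
    asset_name="Example Asset",
    topics=["Example Topic"],
    steward_organization="Example Agency",
    access_services=["example-query-service"],
)
print(example.unanswered_questions())
```

A check like `unanswered_questions` could be run before a context artifact is published, so that incomplete records are flagged rather than silently shared.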
There may be more than one context for a Data Asset. Context can be considered a “lens”, and a Data Asset may be viewed through a number of different “lenses”, one for each of the contexts in which it may be of interest. Data Context artifacts should be developed to reflect the understanding of the relevant Data Assets from the perspective of a COI. (3X32)
To satisfy a broad, general audience, as in the case of citizen access to public information, modern search engines are an effective means of discovering and retrieving unstructured and semi-structured information. Search technology, such as the popular Google™ search engine, indexes unstructured and semi-structured documents (such as Web pages) and returns a result set in response to a keyword-based query. The speed with which results are returned often offsets the large quantity of hits (or matches) in the result set. In summary, search effectively serves information sharing with citizens, and the techniques expressed in this section effectively serve information sharing within communities of interest. (3X33)
Agencies and organizations participating in COIs are called upon to categorize their data using taxonomies that may be defined and/or exchanged using the DRM’s Data Context standardization area. Once shared in data registries, these taxonomies become vehicles for discovering data that offers value for data sharing. Additionally, data consumers can subscribe to topics published within data registries, further enhancing data discovery. Lastly, for citizen access to semi-structured and unstructured information, enterprise search technologies should be used. (3X34)
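The registry-based discovery and topic subscription described above might be sketched as follows. This is an illustrative in-memory model only: the `DataRegistry` class, its method names, and the sample topic and asset names are assumptions made for this example and do not correspond to any interface specified by the DRM.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class DataRegistry:
    """Hypothetical in-memory registry that categorizes Data Assets by topic."""

    def __init__(self) -> None:
        self._assets_by_topic: DefaultDict[str, List[str]] = defaultdict(list)
        self._subscribers: DefaultDict[str, List[Callable[[str, str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str, str], None]) -> None:
        """Register a consumer callback for assets later published under a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, asset_name: str) -> None:
        """Categorize an asset under a topic and notify that topic's subscribers."""
        self._assets_by_topic[topic].append(asset_name)
        for callback in self._subscribers[topic]:
            callback(topic, asset_name)

    def discover(self, topic: str) -> List[str]:
        """Return a copy of the asset names categorized under a topic."""
        return list(self._assets_by_topic.get(topic, []))


# Placeholder topic and asset names; they do not describe real registries.
registry = DataRegistry()
notifications: List[str] = []
registry.subscribe("Example Topic", lambda topic, asset: notifications.append(asset))
registry.publish("Example Topic", "Example Asset A")
registry.publish("Other Topic", "Example Asset B")
```

In this sketch a consumer learns about new assets in two ways: by querying `discover` for a topic, or by receiving a callback when an asset is published under a subscribed topic.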