Abstract
We describe a strategy for data integration based on the idea of semantic enhancement. The strategy promises a number of benefits: it can be applied incrementally; it creates minimal barriers to the incorporation of new data into the semantically enhanced system; it preserves the existing data (including any existing data-semantics) in their original form, so that all provenance information is retained and no heavy preprocessing is required; and it embraces the full spectrum of data sources, types, models, and modalities (including text, images, audio, and signals). The result of applying this strategy to a given body of data is an evolving Dataspace that allows a variety of integration and analytic processes to be applied to diverse data contents. We conceive of semantic enhancement (SE) as a lightweight and flexible process that leverages the richness of the structured contents of the Dataspace without adding storage and processing burdens to what, in the intelligence domain, will already be a storage- and processing-heavy starting point. SE works not by changing the data to which it is applied, but rather by adding an extra semantic layer to these data. We sketch how the semantic enhancement approach can be applied consistently and cumulatively to new data and data-models that enter the Dataspace.