Chapter Six     |     Context and Rationale     |     Organisational Approach     |     Implementation Approach

Implementation Approach

Definitions and Overview

Data Sets

Data sets are described by metadata and maintained within a data store. Foundation and Framework data sets represent the fundamental core data that may be present within a spatial data infrastructure (see Chapter 2). Data sets are composed of collections of features (e.g. roads, rivers, political boundaries, etc.) and/or coverages (e.g. satellite/airborne imagery, digital elevation models, etc.).

Data Stores

Data stores are used to manage data sets. Data stores may be offline or online repositories. Traditional online data stores are file-based repositories, set up for the delivery of pre-defined data sets. Data stores also contain text and attribute data related to a data set. Data warehouses are data stores that provide seamless access to, and management of, data sets.

Spatial Data Warehouse

A spatial data warehouse provides storage, management and direct access mechanisms. Typically, data warehouses ingest data from legacy file-based or data production systems.

Key characteristics of a spatial data warehouse include:

Commercial data warehouses include Cubestore from Cubewerx (http://www.cubewerx.com/), the Oracle Spatial solution (http://www.oracle.com/database/options/spatial/) and the ESRI Spatial Data Engine (http://www.esri.com/).

Data Access Service

Implementations of data access services include the following:

The Portrayal model is described in OpenGIS (http://www.opengis.org/) Project Document 98-060, "User Interaction with Geospatial Data". Figure 1 shows this model, which illustrates a simple features-based access and portrayal services pipeline.

Figure 1- OGC portrayal model

Data Access Client

Online implementations of data access clients include:

Data Formats

Common spatial data formats include the following:

  1. GIS proprietary (e.g. ESRI, MapInfo, Intergraph, etc.)

    A good overview of GIS formats can be found at http://www.gisdatadepot.com/helpdesk/formats.html

  2. International and community

    Efforts have recently been made to minimise the number of geodata formats and to converge towards a reduced set.

    The Spatial Data Transfer Standard (SDTS), the work of ISO/TC211 (http://www.statkart.no/isotc211/welcome.html) and the Digital Geographic Information Exchange Standard (DIGEST) are examples of this trend.

  3. Exchange formats that allow the use of data outside of closed environments (e.g. Geography Markup Language - http://www.opengis.org/)

Typical data formats for most GIS applications contain only enough information for the originating GIS application to be able to use it properly. The data formats usually carry the features and maybe some basic projection information.

Data Exchange formats are usually more robust. They usually carry information that would allow the use of the data in a variety of systems. Exchange formats usually also carry some minimum metadata to describe the data set as well as data quality statements. Data exchange formats are typically used by producers of data.

Because common standards are currently lacking, spatial data infrastructures must support today's multitude of spatial data formats, as well as emerging data access services.

In the past, the multitude of GIS data formats was very problematic. Currently, most GIS and related access systems support format translation.

Examples of commercial support for format translation include the Feature Manipulation Engine from Safe Software (http://www.safe.com/) and Geogateway from PCI (http://www.pci.com/).

An online data access service that combines data access with format translation is the Open Geospatial Datastore Interface (http://132.156.30.81/iii/).

Unfortunately, format translation systems do little to support the translation of semantics. The real problem for interoperable data access services and formats is the lack of common semantics. Semantic translation and multi-use feature coding catalogs (e.g. DIGEST) attempt to address the issue of cross-domain semantic support.

Web Implementation formats

A vector file has many advantages that will prove useful for WWW spatial interfaces:

A vector file can be delivered to the client where it can be zoomed and panned without the need to expensively conduct every operation on a WWW server.

A vector file is composed of layers that might represent roads, rivers, boundaries.
The layers can be switched on or off.

A vector file allows a mechanism to limit the level of zoom so that spatial data is not pushed beyond its level of reliability.

The size and efficiency of a simple vector file will help with network services and response times.

Most GIS software can directly produce vector files.
A vector file is really an interactive map.

There are a number of candidate file formats for an inline vector file on the WWW:

Scalable Vector Graphics, or SVG (http://www.w3.org/Graphics/SVG/)
Web Computer Graphics Metafile, or WebCGM (http://www.cgmopen.org/webcgmintro/paper.htm)

Recent work on XML-based encoding formats (e.g. Geography Markup Language) allows for Web-based transfer of feature information, for subsequent styling and rendering via a Web client or client plug-ins.
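
As a rough illustration of how feature information might be packaged in XML for transfer to a Web client, the following minimal sketch wraps a point geometry in a gml:Point element with a gml:coordinates child. The feature element names used here (Town, name, location) and the helper function names are hypothetical and are not taken from any published application schema; the output is not claimed to be a conformant GML document.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"  # GML namespace URI
ET.register_namespace("gml", GML_NS)


def point_to_gml(x, y):
    """Return a gml:Point element holding one coordinate pair."""
    point = ET.Element(f"{{{GML_NS}}}Point")
    coords = ET.SubElement(point, f"{{{GML_NS}}}coordinates")
    coords.text = f"{x},{y}"
    return point


def feature_to_xml(feature_name, attributes, x, y):
    """Wrap attributes and a point geometry in a hypothetical feature element."""
    feature = ET.Element(feature_name)
    for key, value in attributes.items():
        ET.SubElement(feature, key).text = str(value)
    # "location" is an illustrative property name, not a standard one.
    location = ET.SubElement(feature, "location")
    location.append(point_to_gml(x, y))
    return ET.tostring(feature, encoding="unicode")


# Attribute and geometry values below are illustrative only.
xml_text = feature_to_xml("Town", {"name": "Ottawa"}, -75.7, 45.4)
print(xml_text)
```

Because the geometry and attributes travel as plain markup, styling and rendering decisions can be left to the receiving client.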

Web/internet delivery of GIS raster formats such as ADRG, BIL and DEM (http://www.gisdatadepot.com/helpdesk/formats.html) is often problematic due to the large size of such files, combined with general lack of Internet bandwidth.
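
The scale of the problem can be estimated with simple arithmetic. The sketch below computes the uncompressed size of a raster and the time needed to move it over a link of a given speed. The image dimensions and link speeds are assumptions chosen for illustration, and protocol overhead and compression are ignored.

```python
def raster_size_bytes(columns, rows, bands=1, bits_per_sample=8):
    """Uncompressed raster size: one sample per cell per band."""
    return columns * rows * bands * bits_per_sample // 8


def transfer_seconds(size_bytes, link_bits_per_second):
    """Idealised transfer time, ignoring overhead and compression."""
    return size_bytes * 8 / link_bits_per_second


# Assumed example: a 10,000 x 10,000 image with three 8-bit bands.
size = raster_size_bytes(10_000, 10_000, bands=3)

# Assumed link speeds: a 56 kbit/s modem and a 10 Mbit/s connection.
modem_hours = transfer_seconds(size, 56_000) / 3600
lan_minutes = transfer_seconds(size, 10_000_000) / 60

print(f"size: {size / 1e6:.0f} MB")
print(f"56 kbit/s: {modem_hours:.1f} hours")
print(f"10 Mbit/s: {lan_minutes:.1f} minutes")
```

Under these assumptions the image occupies 300 MB and would take roughly half a day to deliver over the slower link, which is why full-resolution raster delivery is often impractical without subsetting or compression.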

Raster files typically predominate in Web-based portrayals of both vector and raster data. Common Web formats include GIF, JPEG and PNG (http://www.cdrom.com/pub/png/).

Relationship to other spatial data infrastructure services

Figure 2 illustrates the role of data access in an end-to-end resource discovery, evaluation and access paradigm. Successive iterations of resource discovery via a metadata catalog, followed by resource evaluation (such as Web mapping), lead to data access, either directly as a data set or indirectly via a data access service.

Figure 2 - Geospatial Resource Access Paradigm

A mature spatial data infrastructure will allow both applications and human users to exploit the resource access paradigm. A key element of future spatial data infrastructures is the ability to broker requests for services, based on discovery of and real-time access to online geoprocessing and related services. Future capability for chaining distributed geoprocessing services is also expected.

A system context for data access is given in Figure 3. A data access service provides network access to a data set stored within a data store. Data sets are discovered (and later accessed) via metadata queries from a catalog client to a data catalog service [See Chapter 4].

Data sets can be visualised (and later accessed) via Web Mapping services [See Chapter 5], which are complementary to the data catalog service.

Figure 3 - System context for Geospatial data access services

Standards

In general, standards related to geospatial data access are still in their infancy. The standards of most relevance to access components of spatial data infrastructures include those from ISO/TC211, the Open GIS Consortium (OGC) and Internet-related bodies including the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF).

ISO/TC211

The primary mandate of ISO/TC211 (http://www.statkart.no/isotc211) is international standardisation in the field of digital geographic information.

"This work aims to establish a structured set of standards for information concerning objects or phenomena that are directly or indirectly associated with a location relative to the Earth.

These standards may specify, for geographic information, methods, tools and services for data management (including definition and description), acquiring, processing, analyzing, accessing, presenting and transferring such data in digital/electronic form between different users, systems and locations.

The work shall link to appropriate standards for information technology and data where possible, and provide a framework for the development of sector-specific applications using geographic data."

Work on services is currently underway in both ISO/TC211 and the OGC. The definition of service interfaces will allow a wide range of applications to access and use geospatial resources. The OGC Simple Features Access model for SQL has been submitted to ISO for standardisation.

ISO SQL/MM

The purpose of the Draft Spatial Database Standard SQL/MultiMedia (SQL/MM) Part Three Spatial is to define multimedia and application specific objects and their associated methods (object packages) using the object-oriented features in SQL3 (ISO/IEC Project 1.21.3.4).

SQL/MM is structured as a multi-part standard. It consists of the following parts:

SQL/MM Part 3: Spatial is aimed at providing database capabilities to facilitate increased interoperability and more robust management of spatial data.

Open GIS Consortium (OGC)

Phase 1 of the recent OGC (http://www.opengis.org/) sponsored Web Mapping Testbed (WMT) initiative [ref: Chapter 5] has been successful in "Webmapping" portrayal of spatial data. An XML-based encoding scheme (Geography Markup Language or GML) for OGC Simple Features was also an important output of WMT phase 1. Further evolution of the GML specification and of direct data access is expected in subsequent OGC testbed initiatives, including a WMT phase 2.

Other activities of the OGC include the following:

The Open GIS Consortium has achieved consensus on several families of interfaces, and some of these have now been implemented in Off-The-Shelf software. All OGC consensus interface specifications carry a pledge of commercial or community implementation by their submitting teams.

Three Open GIS Simple Feature Access (SFA) interface specifications have been released: one each for SQL, COM-based, and CORBA distributed computing platforms. Companies belonging to the teams submitting one or more of these include Bentley Systems, ESRI, Oracle, Sun Microsystems, UCLA, Camber, Intergraph, Laser-Scan, MapInfo, Smallworldwide, IBM, and Informix.

The interfaces provide several layers of access to and control over GIS features. At the primitive level, the interfaces provide for the establishment of linear and angular units, spheroids, datums, prime meridians, and map projections that give semantics to coordinates. At the intermediate level, they enable the construction and manipulation of geometric elements such as points, lines, curves, strings, rings, polygons, and surfaces, as well as the topological and geometric and other relationships between them. Included are support for common geometric and topological constructs, such as convex hull, symmetric difference, closure, intersection, buffer, length, distance, and dozens of others.
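
To give a feel for the kind of geometric constructs listed above, the following minimal sketch implements distance, length and convex hull for points in a flat (planar) coordinate system. The function names are illustrative and are not the names defined in the OGC interface specifications; the hull uses the monotone chain method as one possible approach, and geodetic calculations on a spheroid are not attempted.

```python
import math


def distance(p, q):
    """Planar distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def length(line):
    """Length of a line string given as a list of (x, y) points."""
    return sum(distance(a, b) for a, b in zip(line, line[1:]))


def _cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull (monotone chain), counter-clockwise, no repeated end point."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


line = [(0, 0), (3, 4), (3, 10)]
square_with_centre = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
print(length(line))
print(convex_hull(square_with_centre))
```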

At the GIS feature level, the interfaces provide for the creation and management of feature collections, and the ability to access features from such collections using geometric, topological, or attributional modifiers. Features and feature collections may be exchanged in Well-Known-Binary (WKB) or Well-Known-Text (WKT) encodings. Work is underway to specify a Simple Features Access encoding using the Extensible Markup Language (XML) as a well-known packaging of geometric and attribute information.
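
The two well-known encodings can be illustrated for the simplest geometry, a two-dimensional point. The sketch below writes a point as Well-Known-Text and as Well-Known-Binary (a one-byte byte-order flag, a 32-bit geometry type code, then two 64-bit floating point coordinates). The function names are hypothetical, and only the point type is handled.

```python
import struct

WKB_POINT = 1  # geometry type code for Point


def point_to_wkt(x, y):
    """Encode a 2D point as Well-Known-Text."""
    return f"POINT ({x} {y})"


def point_to_wkb(x, y, little_endian=True):
    """Encode a 2D point as Well-Known-Binary (21 bytes)."""
    order = "<" if little_endian else ">"
    byte_order_flag = 1 if little_endian else 0  # 1 = little endian, 0 = big endian
    return struct.pack(order + "BIdd", byte_order_flag, WKB_POINT, x, y)


def point_from_wkb(data):
    """Decode a WKB point, honouring its byte-order flag."""
    order = "<" if data[0] == 1 else ">"
    geom_type, x, y = struct.unpack(order + "Idd", data[1:21])
    if geom_type != WKB_POINT:
        raise ValueError(f"not a point: geometry type {geom_type}")
    return x, y


wkt = point_to_wkt(30.0, 10.0)
wkb = point_to_wkb(30.0, 10.0)
print(wkt)
print(wkb.hex())
print(point_from_wkb(wkb))
```

The text form is convenient for inspection and debugging, while the binary form is more compact and avoids the cost of parsing numbers from text.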

Open Geospatial Datastore Interface (OGDI)

OGDI offers a data access approach that leverages and accelerates standardisation efforts. OGDI is an application programming interface (API) that resides between an application and various geodata products, to provide standardised geospatial access methods. The publicly available OGDI specification and reference implementations are maintained by the Internet Interoperability Institute (http://132.156.30.81/iii/).

OGDI uses a client/server architecture to facilitate the dissemination of geodata products over the Internet/Intranet and a driver-oriented approach to facilitate access to a variety of geodata products and formats.

OGDI features include the following:

  1. the distribution of geodata products via Internet/Intranet. This reduces the space needed to store geographic data and ensures access to up-to-date data "closest to the source".
  2. access to data in native format. There is no need to keep multiple versions of geographic data in order to accommodate different GIS software packages.
  3. the adjustment of coordinate systems and cartographic projections; done on-the-fly so that original data is unaltered.
  4. the retrieval of geometric and attribute data.
  5. access to a large number of geodata products and formats.
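
The driver-oriented idea behind these features can be sketched in general terms: the application talks to a single access layer, and a per-format driver reads each product in its native form. The sketch below is not the OGDI API; every class, method and format name in it is hypothetical, and the two text formats are invented purely for illustration.

```python
from typing import Any, Dict, List, Optional, Tuple

Feature = Dict[str, Any]
BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


class Driver:
    """Common interface that every format driver implements."""

    def read(self, raw: str) -> List[Feature]:
        raise NotImplementedError


class CommaPointDriver(Driver):
    """Invented format: one 'name,x,y' record per line."""

    def read(self, raw: str) -> List[Feature]:
        features = []
        for line in raw.strip().splitlines():
            name, x, y = line.split(",")
            features.append({"name": name, "x": float(x), "y": float(y)})
        return features


class KeyValuePointDriver(Driver):
    """Invented format: 'x=..;y=..;name=..' records, one per line."""

    def read(self, raw: str) -> List[Feature]:
        features = []
        for line in raw.strip().splitlines():
            fields = dict(item.split("=") for item in line.split(";"))
            features.append(
                {"name": fields["name"], "x": float(fields["x"]), "y": float(fields["y"])}
            )
        return features


class DataAccess:
    """Single entry point; the data stays in its native format."""

    def __init__(self) -> None:
        self._drivers: Dict[str, Driver] = {}

    def register(self, format_name: str, driver: Driver) -> None:
        self._drivers[format_name] = driver

    def select(self, format_name: str, raw: str, bbox: Optional[BBox] = None) -> List[Feature]:
        features = self._drivers[format_name].read(raw)
        if bbox is None:
            return features
        min_x, min_y, max_x, max_y = bbox
        return [f for f in features if min_x <= f["x"] <= max_x and min_y <= f["y"] <= max_y]


access = DataAccess()
access.register("comma", CommaPointDriver())
access.register("keyvalue", KeyValuePointDriver())
print(access.select("comma", "A,1,1\nB,5,5", bbox=(0, 0, 2, 2)))
print(access.select("keyvalue", "x=1;y=1;name=A\nx=5;y=5;name=B"))
```

Supporting a new product then means adding one driver, with no change to the applications that call the access layer.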

Web and Internet related

The Internet Engineering Task Force (http://www.ietf.org/) develops and maintains specifications for many Internet-related application, transport, routing and security standards (Requests for Comments, or RFCs), many of which are related to data access (e.g. HTTP, FTP, SMTP).

The World Wide Web Consortium, or W3C (http://www.w3.org/), is responsible for the development of common protocols and specifications to further the evolution of the World Wide Web. Activities of the W3C that relate to spatial data access include work on Web graphic file formats, XML and metadata.

Related Services

Many services are related to data access. A brief listing follows:

  1. Discovery and catalog services [ref Chapter 4]
  2. Webmapping [ref Chapter 5]
  3. Electronic commerce related (e.g. http://www.commerce.net/)
  4. Public Key Infrastructure
  5. Delivery and Packaging
  6. Data subscription services
  7. Data and file transport
  8. Geoprocessing services (e.g. as defined by OGC)
  9. Distributed Computing Platforms

Best Practice Application

GeoGratis (http://geogratis.cgdi.gc.ca/)

One common problem with online access to data through a single infrastructure is the variety of policies and practice in place by the different data custodians. In order to support these different access policies one approach is to develop services to support different basic paradigms. These cases include:

  1. Custodians who restrict access to particular users would benefit from common user authentication/authorisation services;
  2. Custodians who charge for data or services would benefit from electronic commerce services;
  3. Custodians who distribute data free of charge would benefit from an inexpensive mechanism (both time and money) to distribute data.

One example of a service supporting the third paradigm is GeoGratis, which provides common services for the distribution of freely available geospatial data. GeoGratis provides a single ftp/web access point where consumers can discover and download freely available data sets. As a common online service, GeoGratis can be viewed from different perspectives:
  1. The types of data it makes available;
  2. The services it provides;
  3. The distribution model it offers.

GeoGratis makes many types of geospatial data available to the consumer. These data may be national or local in scope, raster or vector, or current or legacy data.

Small-scale national data sets are commonly made publicly available. In the case of GeoGratis, base map data from the National Atlas of Canada is available for download. Additionally many national scale framework data sets are available through GeoGratis. At the other end of the spectrum are data from local test studies/sites that are nominally available free of charge. By offering basic download capabilities GeoGratis supports a wide variety of data types, including raster, vector and tabular. The only restriction is on any value-added service above the basic download capability. A final characteristic of the data available through GeoGratis is the availability of many legacy data sets such as the Canada Land Inventory. These data are typically data sets that suffered through some measure of cost cutting or program termination and as a result are no longer supported. GeoGratis provides a facility to make these data available albeit without background support.

In addition to freely available data GeoGratis provides value-added services.

As a basic service, GeoGratis provides the download of freely available data. Other basic services that GeoGratis provides are the discovery of available data through a search interface, and the evaluation of data sets through detailed metadata and visualisation. Additionally, extra services are provided in support of data download: these include data subsetting, reprojection and reformatting for all types of data available through GeoGratis. More advanced services include the provision of data warehousing capabilities that support seamless access to large-area data sets available through GeoGratis.

Finally, GeoGratis offers a cost-avoidance data distribution model. Since GeoGratis is provided as one of many common services supporting data access, this distribution model does not preclude other models, i.e., private access or fee-based access. Similarly, GeoGratis does not assert that all data should be freely available, but provides an effective service for data that is freely available.

One example of this is the National Atlas of Canada digital data. Originally these data were sold for a nominal fee. However it did not prove cost effective to continue this strategy due to the costs of selling and supporting the data compared to the limited return. Therefore a strategy of cost avoidance was adopted where the data was placed on GeoGratis for free download and support was removed. Access by any other means (such as distribution of the data on CD) was left to the value added private sector community. The result was a dramatic increase in the access and use of these data.

From an implementation and standards perspective, GeoGratis provides an excellent "data rich" operational environment in which to implement emerging spatial data infrastructure standards. GeoGratis currently supports catalog-based discovery services via the Z39.50 Geo profile, and is expected to provide online OGC Web mapping and direct-access spatial data warehouse services in the future.

The new reprojection and reformatting services provided by GeoGratis will also be used to exercise the emerging OGC service specifications within an Intranet environment.

Summary and Readiness Analysis

Organisational readiness

Key organisational issues, related to data access in development of a spatial data infrastructure include:

Implementation readiness

Table 1 illustrates the evolution of data access and related spatial data services. Migration from "classic" towards "infrastructure enabled; standards based; and full functioned" is required to bootstrap a national spatial data infrastructure.

Both "top-down" and "bottom-up" implementation strategies are suggested. Early adoption, and "best practices" should be followed by key government data providers.

Table 1 - evolution of access-related services

References and Linkages

GeoGratis (http://geogratis.cgdi.gc.ca/)

International Organization for Standardization, ISO/TC211 (http://www.statkart.no/isotc211)

Internet Engineering Task Force (http://www.ietf.org/)

Internet Interoperability Institute (http://132.156.30.81/iii/)

World Wide Web Consortium, or W3C (http://www.w3.org/)
