Classifying Schema Transformation Approaches and Tools

In the HUMBOLDT project, considerable effort was spent in mapping the landscape of tools and methods that are used to harmonize spatial data or that might be applied to this process. One area of focus was the process of schema transformation. We thus conducted studies on tools in 2007 and in 2010/2011, and continued this work afterwards. In these studies, we used a framework to classify these approaches, since we felt we were comparing apples and oranges all the time! This post defines the core classification categories for schema transformation approaches, as I also presented them at the INSPIRE KEN Schema Transformation Workshop.

There are multiple aspects or dimensions we can use to classify different approaches for schema transformation. Note that I use the term “approach” to abstract from whether we mean a language, a method or its implementation in a tool.

Activity/Phase

In schema transformation projects, several phases are characteristic, very much like in software development or engineering projects. These phases include:

  • Design: Defining the correspondences and functions to use, independent of implementation details, e.g. in matching tables or in UML
  • Development: Coding in a programming language such as XQuery, or building a pipes-and-filters graph visually as in FME or Talend
  • Debugging: Analysing the schema transformation’s behaviour
  • Validation: Testing with the full range of data, quality assurance
  • Documentation: Documenting the parameters, the process, and the limitations and assumptions of the schema transformation, and providing lineage for the transformed dataset
  • Maintenance: Keeping track of changes, iterating through the other activities for updated or new datasets, new schemas…

Different approaches put their focus on different phases. As an example, a matching table is a good design and documentation tool, but is of very limited use during transformation development. We furthermore differentiate between explicit support and implicit support. Explicit support means that the approach has facilities designed to support the phase, while implicit support means that the approach has facilities that can be (mis)used to support the phase. As an example of implicit support, consider XSLT: since it is plain text, a programmer’s usual maintenance and documentation tools, such as version control and code comments, can be applied to it.

Paradigm

Originally used to classify computer programming languages, paradigms can help us understand what kind of patterns to use in the development phase. We differentiated two major paradigms:

  • Declarative: Describe the logic of a computation without describing its control flow. Leave optimization and actual execution order to the runtime engine.
    • Examples: XSLT, EDOAL/gOML
  • Procedural: Describe a computation by giving its control flow through a series of functions.
    • Examples: Python GeoProcessing Tool, FME

Of course, there are other approaches that don’t fit into these two categories, such as aspect-oriented or agent-based programming. Furthermore, there are approaches that contain elements of both: RIF, for example, has a procedural and a declarative sublanguage, and XQuery likewise combines a declarative core with procedural elements.
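To make the contrast concrete, here is a minimal declarative sketch in XQuery (all namespaces and element names are hypothetical): the FLWOR expression states what the output should look like and leaves the evaluation strategy to the engine, whereas a procedural tool like FME would express the same mapping as an explicit chain of processing steps.

    declare namespace src = "http://example.org/source";  (: hypothetical :)
    declare namespace tgt = "http://example.org/target";  (: hypothetical :)

    (: Declarative style: state the correspondence between source and
       target elements; the engine decides how and when to evaluate it. :)
    for $road in doc("roads.gml")//src:Road
    return
      element tgt:RoadLink {
        element tgt:localId { string($road/src:id) },
        element tgt:name    { normalize-space($road/src:name) }
      }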

Model Level

A classic property of schema transformation approaches is the abstraction level they work on – the meta-model, the conceptual model, the logical model or the physical model.

Schema Transformation – Abstraction levels from the Model Driven Architecture Approach

In practical terms, each level focuses on different aspects of the transformation – conceptual/semantic integrity at the top level, adherence to structural rules at the logical level, and value transformation at the physical level. Consequently, higher-level transformation definitions do not deal with minutiae such as the format of a date string. In a true model-driven architecture, the availability of vertical mappings means that you only have to define the schema transformation on the conceptual level; the necessary transformations for the logical and physical levels are derived automatically. In most cases, the number of decisions or statements a user needs to make increases significantly from the conceptual level to the physical level, as the sketch below illustrates.
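As a hedged illustration of this growth in decisions (all names and namespaces below are made up for the example), a correspondence that is a single statement at the conceptual level – “Watercourse corresponds to tgt:Watercourse” – forces several additional commitments once written down as executable, physical-level XQuery:

    declare namespace tgt  = "http://example.org/target";    (: hypothetical :)
    declare namespace base = "http://example.org/basetypes"; (: hypothetical :)

    (: Conceptual level: one statement ("Watercourse maps to tgt:Watercourse").
       Logical level: the target requires a nested inspireId/Identifier structure.
       Physical level: concrete value encodings and namespaces must be fixed. :)
    element tgt:Watercourse {
      element tgt:inspireId {
        element base:Identifier {
          element base:localId   { "WC.42" },
          element base:namespace { "urn:x-example:hydro" }
        }
      }
    }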

Instance- or Schema-Driven Execution

In this classification, there are two categories:

  • Instance-driven, where the execution of a schema transformation is driven by properties of a (set of) features
  • Schema-driven, where execution of the schema transformation is driven by properties of the schema elements

Furthermore, especially in semantic web research, more and more approaches are being developed that combine the two. As an example, consider EDOAL/OML and its implementation in HALE: HALE sets up a transformation graph based on the schema, but then modifies it during execution when it encounters individual features with specific properties that make this necessary, e.g. varying cardinalities or varying string formats in string-to-date conversions. From a practitioner’s perspective, the main difference between schema-driven and instance-driven is that only schema-driven approaches can be “complete”, i.e. cover all possible kinds of data valid according to a particular schema. With instance-driven methods, however, you often save development time, since the focus is put on the part of the schema that actually contains data. A sketch of instance-driven behaviour follows.
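Here is a hedged sketch in XQuery of what instance-driven behaviour can look like in practice (the formats and the function are my own illustration): which conversion runs depends on the value actually encountered in the instance, not on anything declared in the schema.

    (: Instance-driven: inspect each value at runtime and pick a conversion.
       A purely schema-driven approach would fix one conversion up front. :)
    declare function local:to-date($s as xs:string) as xs:date {
      if (matches($s, "^\d{4}-\d{2}-\d{2}$"))
      then xs:date($s)                                 (: already ISO 8601 :)
      else if (matches($s, "^\d{2}/\d{2}/\d{4}$"))     (: dd/mm/yyyy :)
      then xs:date(concat(substring($s, 7, 4), "-",
                          substring($s, 4, 2), "-",
                          substring($s, 1, 2)))
      else error((), concat("unrecognised date format: ", $s))
    };

    local:to-date("31/12/2012")  (: returns 2012-12-31 as xs:date :)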

Representation

Another means of classifying a schema transformation approach is to look at its primary representation form – textual, graphical or a combination of both. Textual forms have several advantages, such as versioning (and merging) and the fact that they tend to be less tool-bound. You can open an XSLT file in any old text editor, after all. Graphical forms, such as the transformation graphs we have become accustomed to from Talend, FME or GeoKettle, emphasize data flow and often provide a more intuitive syntax than textual forms.

Graphical (in FME) and Textual (XQuery) Representations of Schema Transformation Languages

Expressivity

The final criterion that is typically used is the actual expressivity of the approach – can I do everything I need to with the language or tool? Is it, in other words, powerful enough? Some approaches, such as XSLT, are effectively general-purpose programming languages and have been shown to be Turing-complete. For assessing suitability for spatial data schema transformation, I use Matt Beare’s classification from the 2010 INSPIRE Schema Transformation Network Service Pilot project. This classification has six levels of functions:

  • 1 – Renaming classes and attributes
  • 2 – Simple attribute derivation
  • 3 – Aggregating input records
  • 4 – Complex derivation and dynamic type selection
  • 5 – Deriving values based on multiple features
  • 6 – Conflation and model generalisation

In total, the classification lists 25 functions that an approach would need to support to be considered complete for the given spatial data schema transformation use cases. As an example, the following functions are listed under level 2 (a small XQuery sketch of two of them follows the list):

  • Transforming data types (e.g. numbers into text or strings into timestamps)
  • Transformation based on basic geometric functions (e.g. bounding box, convex hull, area)
  • Transformation based on non-spatial functions (e.g. uppercase, truncate, substring, round, regular expression)
  • Transforming units of measure
  • Setting default values where data is not supplied
  • Replacing values based on lookup tables (e.g. code lists)
  • Entering identifiers for referenced objects (e.g. based on GML xlink or relational database foreign key in the source data).
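To give a feel for how small these building blocks are, here is a sketch in XQuery of two of the listed functions – replacing values based on a lookup table, combined with a default value where the source has no matching entry. The code list content is invented for the example.

    (: A code list lookup table, held as an inline XML fragment. :)
    declare variable $landuse-codes :=
      <codes>
        <code src="11" tgt="urbanFabric"/>
        <code src="21" tgt="arableLand"/>
      </codes>;

    (: Replace a source value via the lookup table; fall back to a
       default value where no entry exists. :)
    declare function local:map-landuse($v as xs:string) as xs:string {
      let $hit := $landuse-codes/code[@src = $v]/@tgt
      return if ($hit) then string($hit) else "unclassified"
    };

    local:map-landuse("21")  (: returns "arableLand" :)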

Conclusion

I have listed six criteria that can be used to assess a specific approach to schema transformation. These are not necessarily all that matter – there are others, such as maturity and verbosity. Furthermore, the actual classification within each criterion is often a subject of discussion. As an example, RIF and 1Spatial use a rule-based paradigm that has elements of both the declarative and the procedural paradigm, but it could be argued that it warrants a category of its own.


Workflow: Loading the INSPIRE Annex II and III schemas

Working with large GML schemas and their dozens of dependencies on ISO, OGC and INSPIRE application schemas can be quite a pain – it’s one of those cases where I can’t help but agree with James Fee’s statement about GIS being complicated. For quite some time, issues related to simply loading them were the most common support queries we got. This workflow post shows you which options you have in HALE to load a complex schema easily.

1. Import from preset (the simple option):

  1. Go to File -> Import source (or target) schema
  2. Select the “From preset” tab
  3. Pick one of the “bundled” schemas (all dependencies are included and don’t need to be fetched online)
  4. Click “Finish”. You are done.


2. Load a schema from an online repository: If the schema you want to work with is not bundled with HALE, you can access one of the online repositories, which usually host all imported schemas and other resources as well.

  1. In HALE, go to File -> Import source (or target) schema
  2. Select the “From URL” tab
  3. Copy or type the URL of your root schema into the text box, e.g. http://inspire.ec.europa.eu/…/PlannedLandUse.xsd. Example repositories include:
    1. INSPIRE draft schemas: http://inspire.ec.europa.eu/draft-schemas/
    2. INSPIRE final schemas: http://inspire.ec.europa.eu/schemas/
    3. OGC schemas: http://schemas.opengis.net/
  4. Click “Finish”. All dependencies are retrieved automatically.

INSPIRE KEN & EuroSDR Workshop in Paris

If you’d like to get an excellent overview of available software and approaches for transforming data to INSPIRE formats, a good opportunity is coming up: the INSPIRE KEN (Knowledge Exchange Network) and EuroSDR are organizing a workshop about schema transformation tools and methods at the premises of the ENSG in Marne-la-Vallée (near Paris), France.

Quoting from the workshop’s information and registration page: “NMCAs, as other data producers, will have to make their data compliant with INSPIRE interoperability Implementing Rules; during [the] next years, this compliance will mainly be achieved through schema and data transformation. The objectives of the workshop [are] to make a state-of-play about (existing or projected) schema transformation tools, to help NMCAs to assess these tools and to help them to choose the most appropriate and possibly, to provide background to disseminate knowledge about schema transformation at national level.”

The workshop is scheduled from Tuesday, 8 October 2013, 09:00 to Wednesday, 9 October 2013, 16:00. This is the draft line-up of presentations:

Tuesday 8 October, morning

Time Topic
09:00 – 09:10 Welcome and introduction
09:10 – 09:35 A Study about schema transformation services
09:35 – 10:00 Approaches & Languages for Schema Transformation: Findings of HUMBOLDT & follow-up Activities
10:00 – 10:25 From production data base to INSPIRE data: potential methods
10:25 – 10:45 Pause
10:45 – 11:30 The ESDIN experience: use of DBMS and WFS, the GeoServer app-schema, catalogue and mapping generators by Politecnico di Milano
11:30 – 12:10 Deegree and its specific developments for INSPIRE

Tuesday 8 October, afternoon: non-commercial tools

Time Topic
13:20 – 13:50 XSLT and its use by Kadaster for ESDIN
13:50 – 14:20 Stetl for INSPIRE transformation
14:20 – 14:50 Talend for INSPIRE Theme Land Use
14:50 – 15:20 Comparison of Talend and GeoKettle
15:20 – 15:40 Pause
15:40 – 16:10 Humboldt Alignment Editor (HALE) and Conceptual Schema Transformer (CST)
16:10 – 16:40 GeoConverter
16:40 – 17:10 ExoMS for INSPIRE themes Species Distribution – Habitat and Biotopes
17:10 – 17:40 model driven Web Feature Service (mdWFS)
17:40 – 18:30 Discussion: main drivers for choosing transformation tool(s) and method(s)

Wednesday 9 October, morning: commercial tools

Time Topic
09:00 – 09:40 Feature Manipulation Engine (FME) and its use for ESDIN
09:40 – 10:20 ArcGIS for INSPIRE – Example of use
10:20 – 10:50 Snowflake GO Publisher
10:50 – 11:10 Pause
11:10 – 11:50 Intergraph GeoMedia Fusion and its use by GUGiK
11:50 – 12:20 Use of Altova MapForce by Lantmäteriet
12:20 – 12:50 Schema transformation by 1Spatial

Wednesday 9 October, afternoon: research and discussions

Time Topic
14:00 – 14:30 Tools to restructure geographic data on the Web
14:30 – 15:00 The ARE3NA project
15:00 – 16:00 Discussion

INSPIRE 2013: Implementation Experience Reports

With the work of the Data Specification teams and the initial implementation efforts completed, there were a lot of experience reports at this year’s INSPIRE conference. This post highlights two that were especially interesting in the context of data transformation and harmonisation.

The first experience-centered talk, titled “A successful experience of implementation of INSPIRE Specifications Data with different tools”, was given by Paloma Abad and her team (A. F. Rodríguez, E. López, A. Sánchez, A. Villena, L. Hernández, I. Serra, M. Juanatey, C. Soteres, C. Ruiz) at IGN, Spain. The presentation contained a very nice list of requirements for any data transformation tool to be useful in the production of INSPIRE data sets – the tool should:

  • make INSPIRE easy
  • be smart and powerful
  • offer automatic transformation capabilities
  • be cheap or free
  • be mature
  • be open source
  • deliver good transformation quality
  • be multi-platform, …

Their experience was that current transformation tools are not easy to use and require considerable experience, aren’t open source and/or free of cost, or have limited functionality. In the end, the group thus selected a combination of tools to evaluate – Esri’s ArcGIS for INSPIRE, Safe Software’s FME and GeoConverter by Geobide. Their conclusion was that the stated goals could be achieved with each tool – i.e. all of them could be used to transform Geographical Names and Administrative Units from multiple input data sets to INSPIRE GML, which would then be validated with the validation tools available through the INSPIRE portal. They stated that, as usual, the applications tested have advantages and disadvantages depending on:

  • the local data model,
  • the INSPIRE data specifications,
  • the complexity of the transformation process and
  • the software used to publish the INSPIRE Network Services.

An interesting suggestion was that the INSPIRE Geoportal should have a list of software applications showing each tool’s degree of conformity. Of course, this would mean that somebody would need to do the conformance testing or even certification – and that would require tools that go beyond the capabilities of those currently available, such as the GDI-DE Test Suite, the dataset validation tool developed in Bavaria together with Interactive Instruments, or the service quality tools developed by Spatineo.

Dominique Laurent from IGN France then gave her presentation, titled “INSPIRE services in NMCAs”. She summarized experiences gathered by the INSPIRE Knowledge Exchange Network (KEN) on the implementation of discovery, view, download and transformation services for Annex I data. She started with the issues encountered with discovery services and stated that “It is difficult to get good quality, INSPIRE compliant metadata from other data providers. Situation may be even worse for annexes II and III themes”. Furthermore, INSPIRE has a thematic approach (one theme, multiple scales), whereas NMCAs have a product approach (several themes, one scale), which induces a mismatch in the creation of the metadata. Technically, a remaining challenge for Discovery Services is the synchronisation of data and metadata.

On View services, there are also multiple challenges, but in general, the situation is quite good: “There are open-source tools (Deegree, Geoserver, Mapserver) that are compliant with INSPIRE requirements (at least, with IR). Commercial solutions (ESRI, Intergraph) also claim INSPIRE compliance.” One item I noticed as well is that “the display of INSPIRE layers gives bad rendering (poor INSPIRE legends); it is almost meaningless for some themes (AD, GN). Users prefer viewing cartographic products, traditional maps, at least as background”. Especially for View services, but also for data analysis, it would be important to have good default SLDs for the INSPIRE themes.

Of special interest was her discussion of INSPIRE transformation services that followed. The KEN group had come to the conclusion that “Transformation services are useless: Coordinate transformation already done by WFS, Schema transformation requires too much knowledge from users“. This might be surprising at first, but actually there are a lot of good reasons why externally accessible transformation services are not (yet) needed by the NMCAs:

  • Annex I Data sets are often not subject to high-frequency updates.
  • The target requirements are fixed – in the form of the INSPIRE Data Specifications.
  • The data sets to be published are known in advance and do not consist of a small portion of a big data set.

Of course, data providers need an internal process to synchronize between their primary systems and the INSPIRE services, which might include different types of transformation services or applications. How this is currently implemented varies widely.

Transformation services have potential especially for data users, not data providers – users need to transform (integrate, harmonize) data so that they can use it in their processes. Each user might have different requirements, but collectively there will be a lot of overlap between users’ transformation service requirements. The main question is whether there will be specific service providers (private or public) who can build a business model around satisfying (INSPIRE) data users’ common requirements.


HALE 2.6.0 brings integration with FME

Last week, we published the new HALE 2.6.0 release. There is one particular feature, added in collaboration with Safe Software, that I’d like to highlight: the integration of HALE and the CST engine as a GML Writer in the upcoming FME 2014 release. This integration reflects what I and others had been doing manually for a while now:

  1. Start FME and use the readers to import from a wide range of formats, such as an Esri File Geodatabase,
  2. perform operations that are only possible in FME, such as geometry calculations,
  3. then write out a simple-feature-style output schema using a GML or SHP writer,
  4. then start HALE to map the data to the actual target schema, e.g. an INSPIRE application schema,
  5. and lastly, use HALE/CST to create the final GML product.

In other words, you can use HALE either to perform the mapping to the complex schema after all other data preparation in FME, or to create a subset of a complex data set before you go to FME. To make the new integrated workflow work, you’ll need FME 2014 and HALE 2.6.0 installed. These are the steps after installing both tools:

1. Start FME. Add readers and transformers as needed.

2. Add the HALE GML/XML Writer.


3. Open the Parameters for the HALE GML/XML Writer and set at least a *.halex project file location and the path to the HALE executable:


4. Add a Feature Type to the Writer, e.g. by right-clicking on the canvas and selecting “Insert Feature Type…”. Either import a Feature Type from any data set or, after creating the Feature Type, manually define its schema using the “User Attributes” tab in the Feature Type’s properties.

5. Connect the last transformer in the workspace to the writer as appropriate to have a complete workspace such as this one:


6. Set any necessary additional writer attributes, such as the number of features you want to use in HALE for interactive transformation. Also pay specific attention to the “Execution Mode” setting: “Schema Transformation” will just execute the project given in step 3, while “Update Mapping” will launch the HALE UI to let you create or update a halex project. “Auto” switches between these two modes depending on whether a file is already present at the indicated halex location:


7. Execute the workbench, and see either HALE or CST get fired up in the final writing phase!

A press release giving more information is also available.

INSPIRE 2013: First day presentations – Code Lists, XQuery, GeoUML and other stuff

While I had a lot of meetings and booth duty, I managed to attend some presentations that turned out to be quite insightful.

First, I visited Michael Lutz’ talk entitled “INSPIRE Data Specifications – What’s New? What’s Next?”. A core change has been made in the way code lists are handled. In preparing our HALE workshop, we had already found that the way code lists are used in Annex II and III has changed – instead of a CodeList type with a namespace indicating which list the value was taken from, there is now a ReferenceType whose href attribute points to a value in a published code list (see the sketch below). These code lists will be made accessible through registries, the first of which is the official INSPIRE code list registry. Code lists can now also be extended in several ways by INSPIRE implementers, e.g. to add narrower values in a classification. Michael then proceeded to explain the work that has been started on registry provision. The INSPIRE Registry already provides themes, code lists and application schemas in various formats such as HTML, XML, Atom and JSON. We’ll adapt HALE so that it can work with the new resources as soon as possible.
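As a hedged illustration of the new encoding (the application schema namespace and the concrete code list value are my assumptions, not taken from the talk), the href now points into the registry rather than carrying an inline code list value; expressed as an XQuery constructor:

    declare namespace xlink = "http://www.w3.org/1999/xlink";
    declare namespace lu = "http://example.org/inspire/plu";  (: hypothetical :)

    (: New style: a ReferenceType element whose xlink:href points to a
       value in a published code list in the INSPIRE registry. :)
    element lu:hilucsLandUse {
      attribute xlink:href {
        "http://inspire.ec.europa.eu/codelist/HILUCSValue/1_1_Agriculture"
      }
    }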

Michael Lutz INSPIRE Presentation

The next presentation I saw was Christine Giger’s “Tips & Tricks for Spatial Data Harmonization”. She explored options for data harmonisation to satisfy INSPIRE and other standards, and suggested that XQuery could be a useful tool for schema transformation and other aspects of data harmonisation. XQuery builds on XPath as its selection/filter language and on several elements of XSLT for the generation part. As a mainstream IT technology, XQuery has quite good tool support, e.g. in Zorba. To cover spatial operations, an extension called EXPath Geo can be used. Christine showed examples of how to use XQuery and summarized her assessment that typical problems in data transformation can be covered. Of course, authoring XQuery (a functional, Turing-complete language) statements is not trivial, and I am not yet aware of an interactive authoring environment. As such, it might be interesting to extend HALE’s XSLT/XPath support to also enable XQuery authoring.
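For a rough idea of what such a query might look like, here is a sketch combining a FLWOR expression with a spatial predicate; I am following the EXPath Geo draft as I understand it, so the geo:area function and all namespaces should be treated as assumptions rather than verified API:

    declare namespace geo = "http://expath.org/ns/geo";       (: assumed :)
    declare namespace gml = "http://www.opengis.net/gml/3.2";
    declare namespace src = "http://example.org/source";      (: hypothetical :)

    (: Keep only parcels whose polygon area exceeds a threshold,
       delegating the geometry computation to the EXPath Geo module. :)
    for $p in doc("parcels.gml")//src:Parcel
    where geo:area($p/src:geometry/gml:Polygon) gt 10000.0
    return $p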

Christine Giger INSPIRE Presentation

Giuseppe Pelagatti presented his group’s work in his talk titled “Application of the GeoUML Tools for the Production and Validation of INSPIRE Datasets”. They built a catalogue viewer for exploring the INSPIRE schemas, very similar to what HALE’s schema explorer provides. However, Alberto, Giuseppe and their colleagues based their tool directly on the UML model, which provides more information than the GML application schemas. It would be really nice to collaborate with them to maybe bring actual UML support back into HALE, after our earlier experiments with importing XMI were not very successful. The talk also covered the group’s work on connecting INSPIRE with spatial database backends, essentially performing vertical mappings. Here, too, I thought it might be worthwhile to collaborate, to bring database support to HALE.

Giuseppe Pelagatti INSPIRE Presentation

The final talk I was able to attend was Astrid Feichtner’s “Testing of INSPIRE Datasets”. She explained the current state of a project at the Bavarian Surveying Authority to perform AAA and INSPIRE dataset conformance testing, which goes well beyond service testing and validation against XML Schema. The project, whose implementation partner is Interactive Instruments, was quite impressive: OCL constraints from the original UML model are evaluated, e.g. testing allowed spatial relationships and verifying references. The current limitations are mainly scalability and the completeness of the spatial tests. I thought that providing this tool as part of the GDI-DE Testsuite, or otherwise making it accessible to the community, might be very valuable. One major caveat I have, however, is that all tools I know of that create INSPIRE geodata are based on compliance with the current GML application schemas. As the information in the schemas is only a subset of what is in the UML model, these data sets will not comply with all aspects of the regulation. This wouldn’t need to be the case; even without moving on to XSD 1.1, there are ways to include more information in the INSPIRE XSDs.

INSPIRE 2013: Full Agenda for the HALE Workshop

We (Silvia Franceschi, Simon Templer and I) look forward to welcoming you at the HALE workshop at the INSPIRE conference. This post brings you the full agenda of the workshop and provides links to associated materials and software.

Resources:

Agenda:

  1. Introduction to the workshop and background presentation (Thorsten, 15 minutes)
  2. Basic HALE Mapping: Converting the INSPIRE Planned Land Use data for the Trento Region of Italy (Silvia, 40 minutes)
    1. Loading resources
    2. Analysing schemas and data
    3. The 3 R’s of basic mapping: Retype, rename, reclassify
    4. INSPIRE Mapping Functions
    5. Dealing with references
    6. Data export and usage
  3. Advanced HALE Mapping: Provide INSPIRE Hydrophysical Waters Data from MERIDIAN2 UK data (Simon, 20 minutes)
    1. Mapping inheritance
    2. Functions and their documentation
    3. Additional Export functions: HTML, XSLT, Excel (CSV/XLS)
    4. One more thing…
  4. Q&A, Discussion

You will get the resources on a nice USB stick at the workshop as well, but of course it is always a good idea to prepare, to get the most out of an event. See you in Florence on Monday, 24th of June!

Feature Wish of the Month Polls

It’s important to us to know what you really need in tools to get to grips with data integration and harmonisation. Therefore, we’ll be asking you once a month for your feature wishes. We’ll offer three features each month; the one that gets the most votes will be prioritized for the next release. Of course, you can always send us additional ideas via the poll’s comment function!

Is there a Data Harmonisation Market?

Determining what kind of market exists for the toolset developed in the original HUMBOLDT project (2006 to 2011) was a major part of the work. Among other activities, we conducted two market studies (2009 and 2011) to determine whether a market for data harmonisation products and services exists, and what the properties of that market are. These market studies used questionnaires and expert interviews to characterize the market, its actors and their needs.

Here are some of our findings from the 2011 study, which had 29 participants.

1. Importance and Use of Data Harmonisation

  • The participants in the study assessed the importance of data harmonisation as very high, but the expected benefit varies by industry. The following diagram shows the expectations voiced:
Assessment of the importance of spatial data harmonisation for different industries (n = 29)

However, many actors perform almost no data harmonisation, and when they do, it is focused on two fields – geographic names and geometric harmonisation. The effort involved is seen as prohibitive for all but the most important data sets. If the costs of data harmonisation and reuse could be reduced, the full benefits of SDIs and INSPIRE could be unlocked.

  • In general, participants in the studies mentioned the following main benefits of data harmonisation activities:
    • Reducing duplication of data collection costs
    • Enabling easier discovery of datasets using standardized metadata and publishing such metadata electronically
    • Improved cross-departmental co-ordination of spatial data collection and publishing regimes due to harmonized datasets
    • Faster access to spatial data, especially using web-based delivery
    • Huge efficiency gains from wider access to better-quality data within and across organizations and disciplines
    • Benefits to society (better foundation for political decisions and monitoring)
    • Development of standardized fundamental core spatial databases from which new products and services can be developed more cheaply and quickly

2. Market structure and actors

For the data harmonisation services and products developed during the HUMBOLDT project, the following primary and secondary markets were identified:

Primary market:

  • National INSPIRE-responsible bodies (LMOs)
  • INSPIRE SDICs
  • GIS developers implementing applications with data harmonisation issues.
  • Parties that have to use/offer heterogeneous data from various sources (Data Custodians/Data Integrators)

Secondary market:

  • People/institutions faced with spatial data interoperability difficulties in a cross-border situation or other application fields.
  • Other parties interested in data harmonisation.
  • Thematically related European or Industry Projects.

In total, these are several thousand potential customers in Europe alone, most of whom – largely due to a lack of tools and processes – are currently not investing in data harmonisation, but instead re-collect data or simply use heterogeneous data sets.

To summarize, the market for data harmonisation services and products is competitive. It is evolving and changing quickly, with both customers and market actors using very different approaches to data harmonisation. Almost no actors actively promote their work as data harmonisation; rather, they perform it under different labels such as data transformation, integration or spatial/business analytics.

What is your take on this? Is there a market for specific data harmonisation products and services or are these just special fields of Business Intelligence, Business Integration, or Spatial Data Value Added Services? What’s the right label?

 

The inspire-foss project

The open source community around data harmonisation and transformation is growing steadily. An interesting project, started in 2010 by Just van den Broecke and others, is called inspire-foss and offers “Transformation, storage and web-based delivery for geodata based on European INSPIRE standards using Free and Open Source Software”. The main focus of the inspire-foss project at this time is ETL – the transformation of data from an arbitrary spatial data source file to an INSPIRE-compliant schema. Technology-wise, a mix of XSLT, GDAL/OGR, Python and Java is used to create the ETL service. A description of the typical implementation process can be found here. As this process involves activities such as XSLT development and Unix shell scripting, which might not be for everyone, the group also offers professional support.

For solving other steps of a data harmonisation process, several components are integrated. The deegree INSPIRE node is used for data delivery, while an OpenLayers-based web map application called Heron Mapping Client is used as a client. In HUMBOLDT, we have also worked closely with the deegree team and recommend using their software as well. This collaboration was tested in projects like GS SOIL.

An interesting topic that came up during INTERGEO, where we met part of the team behind the deegree project, was the format they use internally to configure the mapping between the database and GML. The definition of such mapping models and their usage are still complicated tasks, and there was strong interest in combining efforts to advance a standardised conceptual mapping format. We will definitely follow up on this!