Workflow: 3 Steps to Using SQL Databases in HALE

The headline feature of the HALE 2.8.0 release is support for reading schemas and data from SQL databases. Out of the box, PostgreSQL is supported. This workflow details how you can access a PostgreSQL/PostGIS database in HALE and work with its content, and takes you through the basic steps for mapping a database. In this concrete case the target is the current version of the INSPIRE Planned Land Use GML schema, so the starting point is the “Map to INSPIRE Planned Land Use” template, which already includes the target schema and the respective code lists. You can access the template projects in HALE by clicking File -> New project from template....

1. Importing the Schema

After the template project has been loaded, we have to load the database schema as the source schema. To import the database schema (File -> Import -> Source Schema -> From Database (JDBC)), you need the host your database is running on, the name of the database, and a user name and password with at least read privileges:

Import a Schema from a SQL Database
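
HALE handles the JDBC connection itself, but it can be useful to verify the connection details independently before the import. A minimal sketch in Python with psycopg2, using hypothetical credentials:

    # Quick connectivity check outside HALE, using psycopg2. Host, database,
    # user and password are placeholders; use the same values you would
    # enter in HALE's JDBC import dialog.
    import psycopg2

    conn = psycopg2.connect(
        host="db.example.com",   # host the database is running on
        port=5432,               # default PostgreSQL port
        dbname="plu",            # name of the database
        user="hale_reader",      # account with at least read privileges
        password="secret",
    )
    print(conn.server_version)   # any answer confirms the credentials work
    conn.close()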

HALE then reads the schema from the database, and each table or view is represented as a type with its columns as properties. More information is available in the online help. At this stage, foreign key relations are not yet taken into account:

Source Schema imported from the SQL Database
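
Conceptually, what HALE imports corresponds to the database catalogue. The following sketch (again psycopg2 with placeholder credentials) lists each table with its columns, i.e. each type with its properties:

    # Conceptual equivalent of HALE's schema import: every table (type)
    # with its columns (properties). Credentials are placeholders.
    import psycopg2

    conn = psycopg2.connect(host="db.example.com", dbname="plu",
                            user="hale_reader", password="secret")
    cur = conn.cursor()
    cur.execute("""
        SELECT table_schema, table_name, column_name, data_type
        FROM information_schema.columns
        WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
        ORDER BY table_schema, table_name, ordinal_position
    """)
    for schema, table, column, data_type in cur.fetchall():
        print(f"{schema}.{table}: {column} ({data_type})")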

Depending on your database, the list of tables might include many more than you need for the mapping. Remove any tables that are not relevant – this removes clutter from the Schema Explorer and ensures that when you work with source data, only the relevant data is loaded. Note that the tables are organized per schema in the dialog, so you can easily select or de-select a whole schema. In our example we keep only the tables from the u_plan_v2 schema.

2. Importing Sample Data

An important feature of HALE is the continuous testing of a defined mapping against real data. Using an entire dataset can slow down the workflow, though. To work with only a subset of the data, enable instance sampling, e.g. to load only the first 1000 rows of each table. You can configure instance sampling under Window -> Settings -> Project -> Source Data:

Using Instance Sampling if the Source Dataset is too big for real-time work
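
In SQL terms, instance sampling simply limits the number of rows fetched per table. A sketch of what the first-1000-rows setting boils down to, reusing the connection from the earlier sketch:

    # What "load only the first 1000 rows per table" corresponds to in SQL.
    # Assumes the psycopg2 connection from the earlier sketch.
    cur = conn.cursor()
    cur.execute("SELECT * FROM u_plan_v2.ft_landuse LIMIT 1000")
    print(len(cur.fetchall()), "rows loaded as sample instances")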

Now import the data from your database, as you did before with the schema. To connect to the same database as before, you can use the Recent resources list in the From URL tab of the Load Source Data dialog:

Importing Source Data from the SQL DB

Once the data is loaded you can use the usual methods in HALE to inspect and analyse it.

3. Performing the Mapping

If you have only one relevant source type – i.e., all the information you need for mapping to a target type is in a single table – you can simply use the Retype function. In most cases, however, since we are dealing with a relational database, we need to Join multiple tables to combine the information.
In our example we want to map objects from the ft_landuse table to the ZoningElement type, but only objects that have the value planned in the discr column. So first we add a condition context on the ft_landuse type:

Adding a Type condition to match only the Features of this type that satisfy the condition
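
The condition context is nothing more than a row filter; in SQL terms it corresponds to a WHERE clause:

    # SQL equivalent of the condition context on ft_landuse: only rows with
    # the value 'planned' in the discr column take part in the mapping.
    # Assumes the psycopg2 cursor from the earlier sketches.
    cur.execute("SELECT * FROM u_plan_v2.ft_landuse WHERE discr = 'planned'")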

In the example database, several tables linked to ft_landuse hold information we need to populate ZoningElement properly. So we use the Join function to combine these types for the mapping:

Adding a 5-fold Join Retype Function

After selecting and confirming which source types are to be joined, the next step in configuring the Join is specifying the Join order. The first type must be the main source type that corresponds to the target, in this case ft_landuse. In other words, for every source feature of type ft_landuse, a target feature of type ZoningElement will be created, with the attributes of the other source features merged in. The main source type is followed by the other feature types it is directly related to. Next, for each feature type, you specify the Join condition that defines how it is linked to the other types. This is determined automatically if corresponding foreign key constraints are defined in the database, so you can save yourself this step if your database is fully set up. In most cases, a primary key and a foreign key need to be equal for two features to be joined:

Adding the Join conditions
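
For readers who think in SQL, the configured Join corresponds roughly to the query sketched below. The joined table and key columns are hypothetical stand-ins, since the actual relations depend on the example database; the real 5-fold Join simply chains further JOIN clauses in the same way:

    # Rough SQL equivalent of the Join configuration. ft_landuse is the main
    # type; the joined table name and key columns are hypothetical stand-ins
    # for the foreign-key relations of the example database.
    cur.execute("""
        SELECT lu.*, d.*
        FROM u_plan_v2.ft_landuse AS lu
        JOIN u_plan_v2.ft_landuse_detail AS d   -- hypothetical related table
          ON d.landuse_id = lu.id               -- foreign key = primary key
        WHERE lu.discr = 'planned'              -- the condition context
    """)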

After creating the type relation, the transformation will produce empty ZoningElements. For each type relation, you should then proceed with mapping the properties. With a Join, just map the properties of the individual source types to the target type, as you would with a simple Retype. The complete mapping for the ZoningElement example looks like this:

Complete Mapping with Join and Attribute Mappings

As we configured HALE to load only a subset of the data, the transformation may not find all relevant objects to Join, and thus information may be missing from the transformed data. To transform the complete dataset, either disable the sampling to load all data into HALE, or use Transform project data in the Transformation menu to directly transform and encode the complete data. Having done that, we now have a nice, INSPIRE-compliant dataset:

Completed INSPIRE PLU Data

This post was prepared by Simon Templer. Thanks a lot to Simon for his contribution!

KEN Interviews: Just van den Broecke on Stetl

For the third interview of the INSPIRE KEN series, Just van den Broecke is my guest. Just is an independent Open Source Geospatial professional, self-employed at Just Objects B.V. He provides consultancy, support and development exclusively in the domain of Free and Open Source Software for Geospatial, and is the core developer of the Stetl open source framework.

Thorsten: What was the big idea that led to the development of your software?

Just: Not a single big idea. This goes back to 2009, when I consulted for the Dutch Kadaster. Within the EURADIN and ESDIN projects we prototyped various solutions dealing with INSPIRE Data Transformation and harmonized Data Download Services (via WFS). For the transformation we used a combination of shell scripting, GDAL/OGR, XSLT and PostGIS. For the WFS we (and many others in ESDIN) used the deegree WFS server. It struck us that this combination was really powerful; moreover, we were one of the first members in ESDIN to deliver harmonized and valid (via the ETF test tool) INSPIRE GML via WFS. We then realized that any working INSPIRE solution should be an integrated toolset, i.e. ETL plus datastore (PostGIS) plus WFS (deegree). Later, in 2011-2012, the ETL tooling became more integrated into what is now Stetl: completely rewritten in Python and turned into a more generic ETL tool, not just for INSPIRE harmonization but also for local Dutch GML-based datasets.

Thorsten: What is the main strength of your software? What are you particularly proud of?

Just: I think the ability to produce valid GML from any source, dealing with terabytes, the streaming architecture, the speed, and the versatility to integrate with other geo-tools, in particular in Open Source (Linux) server environments. For example, the integration with the deegree WMS/WFS allows for a turnkey INSPIRE solution. In effect, Stetl is standing on the shoulders of the giants who developed XSLT, GDAL/OGR and PostGIS.

Thorsten: For which problems or use cases is it particularly well suited? What differentiates it from other products or solutions?

Just: Stetl is particularly well suited to Open Source server contexts. The two downsides may be that Stetl does not have a GUI and that XSLT has some learning curve. The first is a matter of time. As for XSLT: this is still a mainstream tool/standard. When well developed, XSLT scripts can be highly structured, as we did for various INSPIRE Annex I transformations. For example, we achieve high reuse for common structures like INSPIRE Identifiers and Geographical Names. A new data theme transformation can often be derived quickly (a matter of days) by working from previous transformations.

Thorsten: Please provide a number of your choice and its meaning.

Just: 2 hours: complete transformation of Dutch Addresses to the INSPIRE Annex I AD theme.

Thorsten: On which phases of a schema transformation project does your software focus?

Just: Mainly on the execution phase.

Thorsten: Describe (ideally use a concrete example) how users of your software should design the schema transformation in a project.

Just: Start with a semi-formal description (e.g. an Excel sheet) of the source and destination features/attributes. Use a small subset of the data.

Thorsten: Describe how users of your software develop the actual schema transformation. Which tools do they need, which knowledge should they have?

Just: At a minimum, just a text editor is required. But to be effective with GML and XSLT, it is best to use an Integrated Development Environment (IDE) like Eclipse or IntelliJ. Also use a version control system (SVN, GitHub).

Thorsten: Describe how users of your software should debug and validate the schema transformation process they have developed. Again, which additional tools and knowledge do they require?

Just: One of the filter modules in Stetl is an XML validator. During development, this filter can be applied to check the output. In addition, the Open Source ESDIN Test Framework can be used to test the WFS output.
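
For context, the same kind of check can be reproduced outside Stetl with a few lines of Python and lxml; the file names in this sketch are placeholders:

    # Standalone sketch of an XML Schema validation step, using lxml.
    # The schema and instance file names are placeholders.
    from lxml import etree

    schema = etree.XMLSchema(etree.parse("inspire_ad.xsd"))
    doc = etree.parse("transformed_addresses.gml")
    if schema.validate(doc):
        print("valid")
    else:
        for error in schema.error_log:
            print(error.line, error.message)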

Thorsten: Describe how users of your software should document a schema transformation project.

Just: In practice there is not that much code, and a lot of it should be self-documenting. XSLT design can be documented as described here.

Thorsten: Describe how users of your software should maintain a schema transformation project.

Just: Best is to have an integrated architecture (ETL + Web Services) and maintain all sources/configs in a version control system like SVN or Git. This is one of the weaknesses of some desktop tools I’ve seen: multiple versions lying around, deployment issues. With Stetl + deegree, everything can be maintained and built from a single repository.

Thorsten: Please classify your software according to the criteria given in this article:

Just:

  • Paradigm: Stetl uses a combined declarative and procedural approach. Stetl uses a text file (.ini format) to specify the ETL as chained processing modules: inputs, filters, outputs.
  • Execution: The processing modules are executed via the “stetl” command with a given Stetl config file (see Paradigm). Typically, input modules use OGR to read from a source and perform coordinate transformation and “schema flattening”, while an XSLT filter module performs the schema transformation.
  • Representation: Stetl uses a text file (.ini format) that specifies the entire ETL chain. Specific processing and transformation steps, for example XSLT scripts, are parameterized and referenced from that specification.
  • Expressiveness: Out of the box, Stetl offers an ETL framework based on a pipes-and-filters pattern. Many modules are already available, such as streaming GML parsing, XSLT processing, and OGR and deegree integration. XSLT scripts need to be provided by the ETL developer. Additional processing elements can be added via Python scripting by inheriting from existing processing modules, as sketched below.
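
To make that last point concrete, a custom processing module is a small Python class. The sketch below follows the inheritance pattern Just describes; the exact base-class import paths and constructor signature are assumptions that should be checked against the Stetl documentation:

    # Sketch of a custom Stetl filter that upper-cases its text payload.
    # Base-class import paths and constructor signature are assumptions;
    # check the Stetl documentation for the exact API.
    from stetl.filter import Filter
    from stetl.packet import FORMAT

    class UppercaseFilter(Filter):
        def __init__(self, configdict, section):
            Filter.__init__(self, configdict, section,
                            consumes=FORMAT.string, produces=FORMAT.string)

        def invoke(self, packet):
            # Each packet carries a chunk of data through the ETL chain.
            if packet.data is not None:
                packet.data = packet.data.upper()
            return packet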

Thorsten: Give an example of a schema transformation project in the INSPIRE context using your software.

Just: The Dutch national INSPIRE SDI PDOK for the Addresses theme (together with deegree WFS/WMS), and various EURADIN and ESDIN projects.

Thorsten: What do you think of creating a schema transformation standard language, e.g. in the OGC?

Just: No strong opinion: the combination of GDAL/OGR + XSLT + PostGIS plus some custom Python coding, i.e. Stetl, should be able to handle any INSPIRE transformation. The advantage is that these are all solid and widely supported tools.

Thorsten: Anything else that you would like to explain? Future plans for the software? The next big thing?

Just: Well, Stetl is used more and more for national data transformations (Dutch Topography and Ordnance Survey MasterMap). Integration with WPS is foreseen, in combination with a (web-based) GUI, mainly for execution and parameterization. There are no plans for developing transformations graphically as in HALE or FME; Stetl will remain first and foremost a command-line approach, like ogr2ogr.

Thorsten: Thank you very much, Just!

HALE 2.8.0 brings PostgreSQL access

Just about a month after the release of HALE 2.7.0, the next intermediate version is ready for you. Despite the short cycle, it brings a lot of improvements, since it includes a few features that were almost, but not entirely, ready for release with the prior version. These are the updates you can expect with 2.8.0 (downloads, documentation):

Database access

Out of the box, you can now connect to PostgreSQL/PostGIS databases as a schema and data source for a transformation, using the new JDBC-based database support in HALE. Each table in the database is loaded as a type; to combine information from different database tables in a mapping, use the Join function, which has also been improved. To support additional database types, HALE’s plug-in mechanisms can be used. For more information, see the Database Import documentation.

Accessing a PostGIS database in HALE

Sample Source Data

Source data is used in the live transformation to directly see the impact of changes to the mapping in the transformed data. Live transformation performance can become sluggish if the source dataset is very big, however. You can therefore now configure HALE to load only a subset of the data, which in most cases also speeds up the import.

Groovy Transformation Scripts

While the transformation functions delivered with HALE cover a lot of ground, advanced users occasionally asked to be able to provide their own functions or customize existing ones. You can now combine the regular HALE transformation functions with powerful Groovy scripts. HALE provides easy-to-use APIs for accessing and creating complex instances. To author the scripts, a new script editor is included that supports syntax highlighting and script validation. Detailed help and example code are available as well.

Improved classification mapping function

The classification mapping function now uses a tabular representation and has some new features:

  • Fill the lookup table with values encountered in the source data (Occurring values) or enumeration values defined in the schema
  • Load the lookup table from a CSV or Excel file
  • Save the lookup table as a CSV or Excel file

Improved Classification Mapping Function in HALE 2.8.0
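
Conceptually, such a lookup table is a plain value-to-value dictionary. A minimal sketch in Python, with hypothetical source values and target codes:

    # Minimal sketch of lookup-table classification: source values are
    # replaced by target code-list values. Values and codes are hypothetical.
    lookup = {
        "residential": "1_1_Residential",
        "industrial": "2_4_Industry",
    }

    def classify(value, table, default=None):
        # An unmapped source value falls back to the configured default.
        return table.get(value, default)

    print(classify("residential", lookup))   # -> 1_1_Residential
    print(classify("farmland", lookup))      # -> None (no rule defined)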

Other new functions

As usual, there has been a set of other improvements (about 15). You can go here to learn more about these.

Simon will post separate workflow descriptions on the usage of PostgreSQL databases and on Groovy scripting in the next days. Enjoy your work with HALE 2.8.0!

INSPIRE KEN: All Presentations and Videos now publicly accessible

Thanks to the work of Dominique Laurent and the folks at EuroGeographics, all presentations, as well as video recordings of them, are now publicly available through the EuroGeographics website. You can also check out the videos directly on EuroGeographics’ YouTube channel.

My two presentations are embedded below:

Approaches and Languages for schema transformation

Create and Use Harmonised Spatial Data With HALE

KEN Interviews: Simon Templer of Fraunhofer IGD

In the second interview of the INSPIRE KEN series, my guest is Simon Templer. He is a researcher at the Fraunhofer Institute for Computer Graphics Research IGD and is the lead developer for HALE. His Spatial Information Management group at Fraunhofer IGD was the coordinator of the HUMBOLDT research project and is still the driving force behind the continuous development of HALE.

Thorsten: What was the big idea that led to the development of your software?

Simon: Enabling domain experts to easily create and maintain Schema Mappings – even for complex schemata.

Thorsten: What is the main strength of your software? What are you particularly proud of?

Simon: The main strength is the instant feedback you get on the transformation during the whole process of creating the Schema Mapping. Starting with the first relation you create, you can observe how the transformation takes form and visually verify the results.

As I’m also a developer working on HALE I see a different strength – HALE is designed to be extendable. It offers a large set of extension points, e.g. to add support for additional schema and data formats, transformation functions or User Interface components.

Thorsten: For which problems or use cases is it particularly well suited? What differentiates it from other products or solutions?

Simon: The INSPIRE GML Application Schemata and many other XML schemata are really complex. Other tools tend to work well with simple feature models, but get exponentially harder to use and understand with increased schema complexity. Scaling an intuitive, simple user experience to schemata of any complexity and data sets of any size is one thing we focused on when developing HALE – but without hard-coding everything, so that power users still have as much choice as they need.

Thorsten: On which phases of a schema transformation project does your software focus?

Simon: The main focus lies on the development and maintenance of the Schema Mapping. The design, development, debug and validation phases are not strictly separated in HALE, but are rather tied into a single, fast feedback loop. However, for each phase you can create specific resources.

Thorsten: Describe (ideally use a concrete example) how users of your software should design the schema transformation in a project.

Simon: To design a schema transformation, people use HALE’s schema explorer as well as the source data view to analyze the schemata and data. An especially helpful function in this phase is the statistics on the schema, such as which elements are actually filled with data, and with what kinds of data. The schemata can also be exported as a matching table to get feedback from people used to working with matching tables. The next step is identifying correspondences and deciding how to express them as relations in HALE, first on the type and then on the property level.

Thorsten: Describe (ideally use a concrete example) how users of your software develop the actual schema transformation. Which tools do they need, which knowledge should they have?

Simon: The development tool is HALE, and it can be used without programming skills – just bring your data model knowledge. Developing the Schema Mapping is a short cycle of defining or adapting individual relations and immediately getting feedback on how the changes influence the transformation. Rapid development is supported through the accessible schema documentation, instant transformation feedback, validation of transformed instances, as well as generated and user defined mapping documentation.

In addition to the regular relations that are used to define the Schema Mapping, advanced users have the possibility to combine them with custom Groovy scripts.

Thorsten: Describe (ideally use a concrete example) how users of your software should debug and validate the schema transformation process they have developed. Again, which additional tools and knowledge do they require?

Simon: During the creation of the Schema Mapping sample data is transformed and validated based on the constraints defined by the associated schema. Transformed objects can be inspected individually or as a whole, and compared with the objects they originated from. Errors in individual relations won’t stop the transformation – HALE always provides as complete results as possible.

Thorsten: Describe (ideally use a concrete example) how users of your software should document a schema transformation project.

Simon: Documentation can be generated automatically for a mapping project in a variety of formats, such as HTML or Excel. It includes detailed information on all defined relations. In addition, notes and comments can be attached to each individual relation as well as the whole project.

Thorsten: Describe (ideally use a concrete example) how users of your software should maintain a schema transformation project.

Simon: Due to the declarative nature of our mapping language, each relation can be independently added, removed, edited or disabled. Mappings can be versioned, forked and merged. Furthermore, it is possible to import existing mappings into your own project. As an example, you can import a CityGML to INSPIRE Buildings base mapping and then create your own mapping to deal with an ADE.

Thorsten: Classify your software according to the criteria given in this article:

Simon:

  • Paradigm: HALE uses a declarative approach.

  • Execution: HALE combines Schema-driven and Instance-driven elements during the transformation – schema and mapping are compiled into a transformation graph which can be modified for individual instances.

  • Representation: HALE uses a graphical representation of relations as well as an RDF/text-based one. The user creates and configures relations step by step, guided by specific wizards.
  • Expressiveness: Out of the box, HALE offers almost complete expressiveness. It only lacks spatial filtering and loop constructs. Both are rarely used (so far they haven’t been requested by the users) and can be added through scripting or custom extensions.

Thorsten: Give an example of a schema transformation project in the INSPIRE context using your software.

Simon: In our newest user project, KU Leuven from Belgium uses HALE to produce Air Quality data compliant with INSPIRE and the EU Air Quality Directive IPR, based on their existing Web Feature Service.

Thorsten: What do you think of creating a schema transformation standard language, e.g. in the OGC?

Simon: A schema transformation standard language should be relatively easy to design and implement. It should focus on defining a framework, with some basic transformation functionality and mechanisms to extend it, coupled with a public registry to discover transformations.

Thorsten: Anything else that you would like to explain? Future plans for the software? The next big thing?

Simon: Be sure to check out the next release which will be out in the first half of November. It adds transformation based on PostgreSQL/PostGIS databases, an improved user interface for defining classifications and advanced scripting functions.

Thorsten: Thank you very much, Simon!

KEN Interviews: Ken Bragg of Safe Software

To support the work of the INSPIRE Knowledge Exchange Network, I have started to interview schema transformation software providers, specifically those who took part in the INSPIRE KEN workshop. These interviews provide a starting point for comparing the strengths and weaknesses of different approaches and outline best practices for phases such as design, development or maintenance. I conducted the first interview with Ken Bragg, the European Services Manager for Safe Software. Safe Software are the makers of FME, perhaps the best-known spatial data translation and transformation tool.

TR: What was the big idea that led to the development of FME?

Ken: We believe you should have complete mastery of and access to your data where and how you need it. FME lets you transform your data to use and share.

TR: What is the main strength of your software? What are you particularly proud of?

Ken: FME supports over 300 data formats and enables users to transform data in limitless ways. We are proud of the way our users simply love working with FME and become incredibly enthusiastic about our products.

TR: For which problems or use cases is it particularly well suited? What differentiates it from other products or solutions?

Ken: FME is very well suited for virtually any kind of data transformation including: format, coordinate system, schema and content transformation. No other product supports the range of formats and transformers supported by FME.

TR: On which phases of a schema transformation project does your software focus?

Ken: FME is well suited for format transformation, coordinate system transformation and particularly attribute mapping including name, values and data type mapping.

TR: Describe (ideally use a concrete example) how users of your software should design the schema transformation in a project.

Ken: Many of our users use FME to migrate data from their own format and schema into an INSPIRE staging database, for example an ArcGIS INSPIRE Geodatabase. This transformation can be designed and edited in FME Workbench, which is an easy-to-use and mature graphical environment. The design can be documented within FME Workbench and saved or edited as an FME Workspace file.

TR: What you are saying is that there is no explicit design step, right? The implementation of the project is the design?

Ken: Yes, you’re right – there is no explicit design step.

TR: Describe (ideally use a concrete example) how users of your software develop the actual schema transformation. Which tools do they need, which knowledge should they have?

Ken: FME Workbench is the key tool we use to develop schema mappings. The basic steps for defining a transformation into an INSPIRE staging database are as follows:

  1. Add a reader for the source data, in whichever format it exists, to FME Workbench. This will add the source feature types and their schema to your workspace.
  2. Add a writer for the destination database and import the required destination feature types and their schema from an existing database or template.
  3. Define the schema mapping by connecting the feature types and using FME transformers such as AttributeRenamer, AttributeCopier, etc. Or use the SchemaMapper transformer in FME to read a set of mapping rules from a table or spreadsheet (see the sketch below).

Performing the actual mapping requires a domain expert in the source data and some knowledge of FME Workbench.
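
The SchemaMapper idea from step 3, keeping the mapping rules in a table instead of wiring them up individually, is easy to sketch outside FME. In the following hypothetical Python equivalent, the CSV file and its rule columns are made-up names:

    # Standalone sketch of table-driven attribute renaming, the idea behind
    # FME's SchemaMapper. The CSV file and its columns (source_name,
    # target_name) are hypothetical.
    import csv

    def load_rules(path):
        with open(path, newline="") as f:
            return {row["source_name"]: row["target_name"]
                    for row in csv.DictReader(f)}

    def map_record(record, rules):
        # Keep only attributes with a mapping rule, under their new names.
        return {rules[k]: v for k, v in record.items() if k in rules}

    rules = load_rules("mapping_rules.csv")
    print(map_record({"STRASSE": "Hauptstrasse", "ORT": "Bonn"}, rules))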

TR: Describe (ideally use a concrete example) how users of your software should debug and validate the schema transformation process they have developed. Again, which additional tools and knowledge do they require?

Ken: Errors in the transformation process (using the above example) can easily be trapped in FME Workbench by disabling and enabling connections and then using the DataInspector and Logger transformers to see features at various points in the transformation. Breakpoints can also be created by inserting Inspection Points along connections.
Output data can be validated with other FME workspaces, which can verify schema, attribute values and geometry.

TR: Describe (ideally use a concrete example) how users of your software should document a schema transformation project.

Ken: FME Workbench includes annotation tools for any object on the canvas and for general annotation. For example, in an INSPIRE schema mapping workspace we might annotate a StringConcatenator transformer to say that this is where the _NationalID is defined. Workbench also includes a rich set of “Workspace Properties” for metadata such as data, history, usage, requirements, etc. These can be edited in the Workspace Properties dialog.

TR: Describe (ideally use a concrete example) how users of your software should maintain a schema transformation project.

Ken: Schema transformation workflows are maintained in workspaces, which can be edited at any time. Also, if the SchemaMapper transformer is used, the mapping rules can be maintained in database tables or spreadsheets if preferable.

TR: Do you have any best practices when it comes to versioning or even merging workspaces?

Ken: No, there aren’t really best practices around versioning or merging workspaces yet. This is something on our list of things to do.

TR: Classify your software according to the criteria given in this article.

Ken:

  1. FME uses a procedural paradigm for schema mapping.
  2. FME can be both schema- and feature-driven, depending on the transformation defined in FME Workbench.
  3. FME uses a graphical representation for defining workflows.
  4. Arguably, FME is completely expressive when it comes to INSPIRE transformation requirements.

TR: Please give an example of a schema transformation project in the INSPIRE context using your software.

Ken: The BKG (the German Federal Agency for Cartography and Geodesy) uses FME to perform schema mapping from Esri Geodatabases into INSPIRE staging Geodatabases or EuroGeographics datasets.

TR: What do you think of creating a schema transformation standard language, e.g. in the OGC?

Ken: In my personal opinion this would add an unnecessary layer of complexity and abstraction to schema transformation.

TR: Anything else that you would like to explain? Future plans for the software? The next big thing?

Ken:

  1. FME’s GML writer in FME 2014 fully supports XSD schema-driven GML writing.
  2. FME Server 2014 has an improved Streaming Service, which allows flexible support for workspace-driven WFS and other web services.
  3. FME already supports the 3D, AIXM and raster features that will be required in INSPIRE Annex III.
  4. FME can support Application Domain Extensions (ADEs) for CityGML, which should also ease the production of 3D building datasets for INSPIRE Annex III.

TR: Thank you very much, Ken!

INSPIRE KEN Schema Transformation Workshop in Marne-la-Vallée – Day 2 summary

On the 8th and 9th of October, around 50 people gathered for the joint EuroSDR/INSPIRE Knowledge Exchange Network (KEN) Workshop on Schema Transformation. For the first part of my report from this very interesting workshop, continue here.

The second day focused on presentations of commercial, closed-source software and was opened by Ken Bragg of Safe Software with an in-depth presentation on how to use FME to create INSPIRE-compliant geodata. He explained improvements added in the latest versions (2013/2014) that make it simpler to create valid, complex GML structures, dwelling on transformers such as the SchemaMapper transformer, which can be configured by anyone with basic Excel skills.

Ken Bragg of Safe Software explains FME’s SchemaMapper Transformer

After the presentation of ArcGIS for INSPIRE by Paul Hardy, which emphasized the importance of having an actual GIS in which to fully edit, create and use INSPIRE data, the Intergraph GeoMedia Fusion presentation followed a comparable angle. Next up was one of the presentations I had most looked forward to coming into this workshop: a presentation of how to use Altova MapForce for the transformation of complex XML data. It was given by an end user, Helen Erikson from Lantmäteriet. She walked the audience through the process of working with the software and highlighted pros such as the large number of built-in functions, the good usability, and the fact that a valid GML file could be created easily. One thing she mentioned, though, is that the mapping can become complex very quickly, and consequently hard to document, understand and maintain. This screenshot from her presentation illustrates the challenge and shows the limits of the “connect left side to right side” approach and its variants:

Using Altova MapForce to create INSPIRE Data

The final presentation of the morning block was given by Robert Chell from 1Spatial. They have come quite far with the Radius Studio/Server products since I last evaluated them in 2010. Their solution helps with the full process, from source data discovery to source data assessment to data reconciliation (which is where the schema transformation and quality assurance take place) and publishing.

A short afternoon presentation block followed, with Robin Smith of the JRC explaining the purpose of the ARE3NA project (A Reusable INSPIRE Reference Platform), and with Sandrine Belley of IGN France exploring the user experience side of schema transformation. She essentially described mismatches and other sources of semantic heterogeneity, which have been at the core of the research (see this paper and this paper) that has led to the development of HALE since 2008.

As on the first day, a discussion round completed the programme. This time the main question was “Missing items in Schema Transformation”, with the findings again based on Dominique Laurent’s summary (my favourites are highlighted):

  1. User Experience and Usability are not sufficiently in the focus of application development in this area; one vendor even explicitly stated that “the number of users is too small to spend significant effort to improve the user experience”. This might be true for INSPIRE-specific solutions, but many of the presented tools have value well beyond INSPIRE.
  2. Tools widely lack semantic-level functionality, so that the impact of a schema transformation is hard to assess.
  3. There is a need for tools to be able to consume INSPIRE data, especially in desktop and web GIS.
  4. Not all constraints in the INSPIRE data specifications (in the PDF and UML model form) are encoded in the .xsd files and so can’t be tested during the transformation phase, though it would be useful to check them as soon as possible. Specifically, we have to check data integrity and consistency across multiple themes.
  5. The language to be selected to express and check constraints is still open to discussion; it is unclear whether OCL is the best option.
  6. There is a need for a standardized mapping from the UML models to a relational database that includes measures for acceptable operational performance. This would allow tools to transform such a relational database to INSPIRE GML data, instead of having one-off solutions.
  7. Establish a standard for a mapping language: it would enable us to provide the mapping rules to users, and not only the transformed data.
  8. We should investigate other formats as optional delivery formats, such as JSON/GeoJSON, because of their importance on the user’s side.

The discussion then shifted to more general issues with the adoption of INSPIRE, such as initiatives to populate missing data or to encourage the use of INSPIRE data. This ended the workshop, which was a really interesting event. Thanks to Dominique and her colleagues for organising it!