iii. Use of Corine Land Cover in OpenStreetMap in France

Interaction type

Government → Public

Trigger Event

Change in the license policy of the Corine Land Cover dataset

Domain

Generic Mapping (update of Land cover/Environmental datasets)

Organisation

European Environment Agency (EEA)

Actors

French OSM Community, EEA

Data sets in use

44 Land cover classes for France

Process

CLC2006 data that did not overlap existing OSM data were imported into OSM (accounting for ~60% of the land). The CLC2006 typology was adjusted to match that of OSM.

Feedback

Goal

The goal was to update the OSM database with land cover information, mainly in rural areas, as the contribution in such areas is limited.

Side effects

i. A number of national OSM communities have followed the example of the French OSM community and integrated CLC2006 into the OSM database; ii. Inconsistencies in Level of Detail (LoD), semantics and metadata between VGI and authoritative datasets have been revealed.

Contact Person

pieren3[at]gmail.com, Dr. Guillaume Touya (guillaume.touya[at]ign.fr)

The Coordination of Information on the Environment (Corine) programme, supported by the European Commission, aims to provide a land cover dataset (known as CLC, Corine Land Cover) for 39 European countries. The CLC2006 project was coordinated by the European Environment Agency (EEA). Image production for the land cover digitization was coordinated centrally by the EEA, while the actual data production was carried out by the EEA member states to “benefit from local knowledge” (EIONET 2012). The integration of the national contributions was managed by the EEA and the European Topic Centre on Land Use and Spatial Information (ETC LUSI).

CLC2006 is the update of CLC for the reference year 2006. Its focus was on identifying and recording all changes that had taken place since CLC2000. To compile CLC2006, all land cover changes larger than 5 ha were mapped, regardless of location, and integrated into CLC2000. Changes in land cover between 2000 and 2006 were identified visually using satellite data from 2006 (+/– 1 year), the so-called IMAGE2006, acquired by two satellites: SPOT-4 (pixel size of 20 m multi-spectral and 10 m panchromatic) and IRS P6 (pixel size of 23 m panchromatic). The technical specifications of CLC2006 follow those set up for earlier CLC projects: the data refer to a 1:100,000 scale, with a minimum mapping unit of 25 hectares and a minimum width of linear elements of 100 metres, which represents a trade-off between production costs and the level of detail of land cover information (Heymann et al., 1994). As in previous CLC projects, CLC2006 includes 44 land cover classes grouped in a 3-level hierarchy. Level 1 includes five main categories: 1) artificial surfaces, 2) agricultural areas, 3) forests and semi-natural areas, 4) wetlands, and 5) water bodies (Heymann et al., 1994). The project covers 5.8 million km² in total.

As permitted by the release terms of CLC2006, the French OSM community imported the CLC2006 dataset into the OSM database. However, only ~60% of the original dataset was imported automatically, in those areas where no OSM data existed. The rest was not imported because it conflicted with existing land cover polygons created manually by OSM contributors. This was deemed more efficient because the OSM community found that land cover polygons created by OSM contributors were more accurate than the CLC2006 dataset (Pieren 2013; personal communication). Moreover, as OSM contributors now have access to high-resolution Bing aerial imagery, their land cover data should be even more accurate than the CLC, which relied on IMAGE2006 for data collection.
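The selection rule described above (import a CLC2006 parcel only where it does not conflict with existing OSM land cover) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used by the French OSM community: parcels are simplified to axis-aligned bounding boxes, the sample data are invented, and the names `Box`, `boxes_overlap` and `select_importable` are hypothetical.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box standing in for a land cover polygon."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def boxes_overlap(a: Box, b: Box) -> bool:
    """Return True if the two boxes share interior area (touching edges do not count)."""
    return (a.xmin < b.xmax and b.xmin < a.xmax
            and a.ymin < b.ymax and b.ymin < a.ymax)


def select_importable(clc_parcels: List[Box], osm_polygons: List[Box]) -> List[Box]:
    """Keep only the CLC parcels that overlap no existing OSM polygon."""
    return [
        parcel
        for parcel in clc_parcels
        if not any(boxes_overlap(parcel, existing) for existing in osm_polygons)
    ]


# Invented data: three CLC parcels and one manually mapped OSM polygon.
clc = [Box(0, 0, 10, 10), Box(10, 0, 20, 10), Box(20, 0, 30, 10)]
osm = [Box(12, 2, 18, 8)]

importable = select_importable(clc, osm)
share = len(importable) / len(clc)  # fraction of parcels imported automatically
```

A real import would test true polygon intersection with a geometry library and a spatial index; the bounding-box test is used here only to keep the sketch self-contained.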

The integration of CLC2006 into the OSM dataset instantly enriched the latter with data covering ~60% of the French territory. Moreover, land cover classification based on imagery interpretation requires considerably more expertise than road classification and generally attracts fewer contributors than the “high-profile” urban fabric. In this case, however, the imported land cover parcels can serve as first-class photo-interpretation keys that help the OSM community complete the task accurately. Finally, the French example has paved the way and built expertise for more countries to follow this practice successfully (nine more countries have imported their national CLC2006) (OpenStreetMap, 2012).

The integration of the French CLC2006 with the OSM dataset has brought to light a number of issues that are characteristic of the co-existence of VGI and authoritative data. First, importing authoritative data into a VGI database injects into the latter both the strengths and the inherent weaknesses of the former, such as the failure to keep the data up to date. For example, CLC2006 was imported into OSM in 2009, which means that the data were already two years old (the project for France finished in 2007). Even worse, the authoritative nature of CLC2006 might give OSM users the false impression that such datasets are more accurate than they actually are, or that they have been created recently by other OSM contributors, thus diverting their attention to other, as yet uncovered, areas. Second, a certain level of semantic inconsistency should be expected. For example, CLC2006 has fewer land cover classes than those used in OSM, while some CLC2006 classes are too vague for OSM and thus were not converted and imported into OSM (Pieren 2013; personal communication). Third, and similarly, inconsistencies in the Level of Detail (LoD) must be expected. In the case of CLC2006, the features were captured at a scale of 1:100,000 (country level), while OSM is a street-level dataset. Touya and Brando-Escobar (2013) have highlighted the nature of LoD inconsistencies between CLC2006 and OSM for France, providing a number of illuminating examples.
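The semantic mismatch described above can be illustrated with a small translation table. The CLC codes and class labels follow the standard CLC nomenclature; the OSM tags chosen for each class are illustrative assumptions, not the mapping actually used for the French import, and `CLC_TO_OSM`, `translate` and the parcel identifiers are hypothetical. Classes without an entry are skipped, mirroring the classes considered too vague for OSM.

```python
from typing import Dict, List, Optional, Tuple

# Illustrative subset only; the OSM tags are assumptions made for this sketch.
CLC_TO_OSM: Dict[str, Dict[str, str]] = {
    "221": {"landuse": "vineyard"},                              # Vineyards
    "311": {"landuse": "forest", "leaf_type": "broadleaved"},    # Broad-leaved forest
    "312": {"landuse": "forest", "leaf_type": "needleleaved"},   # Coniferous forest
    "512": {"natural": "water"},                                 # Water bodies
}


def translate(clc_code: str) -> Optional[Dict[str, str]]:
    """Return a copy of the OSM tags for a CLC class, or None if no mapping exists."""
    tags = CLC_TO_OSM.get(clc_code)
    return dict(tags) if tags is not None else None


def translate_all(
    parcels: List[Tuple[str, str]]
) -> Tuple[List[Tuple[str, Dict[str, str]]], List[str]]:
    """Split (parcel_id, clc_code) pairs into converted parcels and skipped ids."""
    converted, skipped = [], []
    for parcel_id, clc_code in parcels:
        tags = translate(clc_code)
        if tags is None:
            skipped.append(parcel_id)  # class too vague or unmapped: not imported
        else:
            converted.append((parcel_id, tags))
    return converted, skipped


# Class 243 ("Land principally occupied by agriculture, with significant
# areas of natural vegetation") is deliberately left unmapped in this sketch.
parcels = [("FR-1", "311"), ("FR-2", "243"), ("FR-3", "221")]
converted, skipped = translate_all(parcels)
```

Keeping the unmapped classes in an explicit `skipped` list, rather than forcing them onto the nearest OSM tag, makes the information lost in the conversion visible.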

All these reasons have probably played a crucial role in the subsequent stance of the OSM France community towards newer CLC datasets. More specifically, in a discussion thread initiated within the French OSM community about updating the CLC polygons, the prevailing instruction was that the bulk import should be considered only as a starting point, and the community is now advised not to use CLC datasets anymore. Thus, instead of using a new version, contributors are asked to modify the polygons with the available OSM editors (e.g. JOSM) and to increase the granularity of the existing polygons. Moreover, the community is urged to remove the CLC identifiers and to update the “source” tag by adding the actual source of the imagery used to update the polygons (e.g. Bing images).
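The clean-up step recommended to contributors can be sketched as a small tag-rewriting function. The tag keys `CLC:id` and `CLC:code`, the sample values and the function name `refresh_tags` are assumptions made for illustration; the text above states only that the CLC identifiers should be removed and the “source” tag updated. This sketch replaces the source value, although appending the new source would be equally consistent with that instruction.

```python
from typing import Dict


def refresh_tags(tags: Dict[str, str], new_source: str) -> Dict[str, str]:
    """Return a copy of `tags` without CLC identifiers and with an updated source."""
    # Drop every key in the (assumed) "CLC:" namespace left over from the import.
    cleaned = {
        key: value
        for key, value in tags.items()
        if not key.upper().startswith("CLC:")
    }
    # Record the imagery that was actually used to redraw the polygon.
    cleaned["source"] = new_source
    return cleaned


# Invented example of a polygon as it might look after the bulk import.
imported = {
    "landuse": "forest",
    "source": "Corine Land Cover 2006",
    "CLC:id": "FR-0001",
    "CLC:code": "311",
}
updated = refresh_tags(imported, "Bing")
```

The function returns a new dictionary and leaves the input untouched, so the original tags remain available for comparison or review before the change is uploaded.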

The above discussion leads to some interesting conclusions and challenges regarding the co-existence of VGI and authoritative data. Semantic and technical issues of data capture should be taken into consideration, because the conflation of inconsistent datasets might create more problems than it solves. Furthermore, the synergy of authoritative datasets and the government→public→government type of data flow appears to be an open challenge for both parties (i.e. the authoritative source and the VGI community). This case showed that data curation, data management and quality assessment are not straightforward processes. Similarly, transforming data from one nomenclature to another might also prove a challenging task. All these factors might discourage the VGI community from relying on authoritative sources or directly using their datasets, and lead it instead to mobilize the crowd to collect the datasets needed. However, this practice might result in two similar datasets for the same features, a crowdsourced one and an authoritative one, increasing the complexity for the end user and, in a sense, with one dataset undermining the other.

Main lessons:

  • The Coordination of Information on the Environment (Corine) programme is a European Commission programme that aims to provide a land cover dataset for 39 European countries, and it is coordinated by the European Environment Agency (EEA). CLC projects include 44 land cover classes grouped in a 3-level hierarchy. The version discussed here is CLC2006.
  • The case is based on the incorporation of the CLC2006 dataset into the OSM database by the French OSM community and showed major differences from other VGI projects: considerable expertise and specialized interest among the volunteers are required for its success.
  • The result was an automatic incorporation of close to 60% of the dataset, in unmapped areas, and it indicated that OSM data are more up to date than the CLC dataset.
  • The evaluation of the two datasets in terms of quality indicated that semantic inconsistencies and differences in temporal accuracy need to be addressed when authoritative data are incorporated into OSM.
  • The difficulties of conflating authoritative and crowdsourced data might deter VGI communities from relying on or using authoritative data and lead them instead to create the data they need on their own.

References

EIONET, 2012. Corine Land Cover 2006. [online] Available at: [http://sia.eionet.europa.eu/CLC2006] [Accessed 18 December 2013].

Heymann, Y., Steenmans, Ch., Croissille, G. and Bossard, M., 1994. Corine Land Cover. Technical Guide. Office for Official Publications of the European Communities. Luxembourg.

OpenStreetMap, 2012. WikiProject Corine Land Cover. [online] Available at: [http://wiki.openstreetmap.org/wiki/WikiProject_Corine_Land_Cover/] [Accessed 18 December 2013].

Pieren, pieren3[at]gmail.com, 2013. Corine Land Cover and OSM. [email] Message to V. Antoniou (v.antoniou[at]ucl.ac.uk). Sent Saturday 7 December 2013: 11:58.

Touya, G. and Brando-Escobar, C., 2013. Detecting Level-of-Detail Inconsistencies in Volunteered Geographic Information Data Sets. Cartographica, 48(2), pp. 134–143.