1. Access harmonized LUCAS samples

1.1. Install

1.1.1. Note for Google Colab

If this notebook is run in Google Colab, additional steps are needed. Google Colab comes with GDAL 2.2, so we have to update this library to version 3.0. This will take a while.

[ ]:
!add-apt-repository --yes ppa:ubuntugis/ubuntugis-unstable
!apt update
!apt install libgdal26
!pip install gdal==3.0.4
print("INSTALLATION COMPLETED")

1.1.2. Install st_lucas package

First of all, let’s install the st_lucas Python package and additional required packages.

[ ]:
!pip install st_lucas ipyleaflet

Now we have to restart the runtime: Runtime -> Restart runtime (on Google Colab) or Kernel -> Restart (on JupyterLab).

From the st_lucas package, let’s import LucasRequest and LucasIO.

[2]:
from st_lucas import LucasRequest, LucasIO

1.2. Usage

1.2.1. Define a request

A request is defined by LucasRequest. In the example below, the spatial filter is defined by a bounding box (the bbox property). Note that only EPSG:3035 is supported.

For testing purposes, the request can be built by calling the build() method.

[3]:
request = LucasRequest()
request.bbox = (4504276, 3020369, 4689608, 3105290)

request.build()
[3]:
{'typename': 'lucas:lucas_points',
 'filter': '<ogc:BBOX xmlns:ogc="http://www.opengis.net/ogc"><ogc:PropertyName>geom</ogc:PropertyName><gml311:Envelope xmlns:gml311="http://www.opengis.net/gml" srsName="http://www.opengis.net/gml/srs/epsg.xml#3035"><gml311:lowerCorner>4504276 3020369</gml311:lowerCorner><gml311:upperCorner>4689608 3105290</gml311:upperCorner></gml311:Envelope></ogc:BBOX>'}
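To check that a built request encodes the intended extent, the filter string can be parsed back into coordinates. The following is a minimal sketch using only the standard library; the helper name bbox_from_filter is hypothetical and not part of st_lucas, and the sample string is copied from the build() output above.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in the filter printed above.
NS = {"ogc": "http://www.opengis.net/ogc", "gml": "http://www.opengis.net/gml"}


def bbox_from_filter(filter_xml):
    """Return (xmin, ymin, xmax, ymax) parsed from an ogc:BBOX filter string.

    Hypothetical helper for illustration; not part of the st_lucas API.
    """
    root = ET.fromstring(filter_xml)
    envelope = root.find("gml:Envelope", NS)
    lower = envelope.find("gml:lowerCorner", NS).text.split()
    upper = envelope.find("gml:upperCorner", NS).text.split()
    return tuple(float(v) for v in lower + upper)


# Filter string copied from the build() output above.
sample_filter = (
    '<ogc:BBOX xmlns:ogc="http://www.opengis.net/ogc">'
    '<ogc:PropertyName>geom</ogc:PropertyName>'
    '<gml311:Envelope xmlns:gml311="http://www.opengis.net/gml" '
    'srsName="http://www.opengis.net/gml/srs/epsg.xml#3035">'
    '<gml311:lowerCorner>4504276 3020369</gml311:lowerCorner>'
    '<gml311:upperCorner>4689608 3105290</gml311:upperCorner>'
    '</gml311:Envelope></ogc:BBOX>'
)

print(bbox_from_filter(sample_filter))
```

With a live request, the same helper could presumably be applied to request.build()['filter'].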

1.2.2. Download data based on a request

LUCAS data retrieval is controlled by the LucasIO (input/output) class. Data is downloaded by calling the download() method with the prepared request. The number of downloaded LUCAS observations can be retrieved by the count() method.

[4]:
lucasio = LucasIO()
lucasio.download(request)

print("Number of downloaded points:", lucasio.count())
2024-03-09 23:42:58,268 - root - WARNING - __init__.getSRS - Requested srsName 'EPSG:3035' is not declared as being allowed for requested typename 'lucas:lucas_points'. Options are: 'urn:x-ogc:def:crs:EPSG:3035'.
2024-03-09 23:42:59,070 - LUCAS - INFO - io.download - Download process successfuly finished. Size of downloaded data: 14258kb
Warning 1: GDAL was built against curl 8.6.0, but is running against 8.5.0. Runtime failure is likely !
Number of downloaded points: 5222

1.2.3. Get data in specified format

LucasIO allows getting data in various data structures/formats.

By calling the to_geopandas() method, the data is retrieved as a GeoDataFrame object.

We can visualize the downloaded LUCAS observations using the ipyleaflet package.

[5]:
from ipyleaflet import Map, GeoData, basemaps, LayersControl

points = lucasio.to_geopandas(epsg=4326)

center = points.dissolve().centroid
m = Map(center=(float(center.y), float(center.x)), zoom=8, basemap=basemaps.OpenStreetMap.Mapnik)

geo_data = GeoData(geo_dataframe=points,
                   style={'color': 'black', 'fillColor': '#3366cc', 'opacity':0.05, 'weight':1.9, 'dashArray':'2', 'fillOpacity':0.6},
                   point_style={'radius': 2, 'color': 'red', 'fillOpacity': 0.8, 'fillColor': 'blue', 'weight': 3},
                   name='LUCAS points')

m.add_layer(geo_data)
m.add_control(LayersControl())

m
2024-03-09 23:43:01,425 - numexpr.utils - INFO - utils._init_num_threads - Note: NumExpr detected 12 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2024-03-09 23:43:01,426 - numexpr.utils - INFO - utils._init_num_threads - NumExpr defaulting to 8 threads.
/tmp/ipykernel_130678/3023446211.py:5: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.

  center = points.dissolve().centroid
/tmp/ipykernel_130678/3023446211.py:6: FutureWarning: Calling float on a single element Series is deprecated and will raise a TypeError in the future. Use float(ser.iloc[0]) instead
  m = Map(center=(float(center.y), float(center.x)), zoom=8, basemap=basemaps.OpenStreetMap.Mapnik)
[5]:
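
The two warnings in the cell output come from computing a centroid in a geographic CRS and from calling float() on a single-element Series. One way to sidestep both, sketched here under the assumption that an approximate map center is sufficient, is to use the midpoint of the layer bounds. The helper bounds_center is hypothetical, and the bounds below are made-up lon/lat values.

```python
def bounds_center(bounds):
    """Return (lat, lon) of the midpoint of (minx, miny, maxx, maxy) bounds.

    Hypothetical helper for illustration; the result is an approximate
    map center, not a true centroid of the points.
    """
    minx, miny, maxx, maxy = (float(v) for v in bounds)
    return ((miny + maxy) / 2.0, (minx + maxx) / 2.0)


# Made-up lon/lat bounds. With GeoPandas, points.total_bounds yields an
# array in the same (minx, miny, maxx, maxy) order.
example_bounds = (14.0, 49.0, 16.0, 50.0)
print(bounds_center(example_bounds))  # (49.5, 15.0)
```

The returned tuple is already in the (lat, lon) order that the Map center argument in the cell above uses.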

The to_gml() method returns a string containing the data in the OGC GML format.

[6]:
gml_str = lucasio.to_gml()

# check the type of the string
print(type(gml_str), '\n'.join(gml_str.splitlines()[:5]))
<class 'str'> <?xml version="1.0" encoding="utf-8" ?>
<ogr:FeatureCollection
     gml:id="aFeatureCollection"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://ogr.maptools.org/ st_lucas_jtg7jf2l.xsd"
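
Because the GML is returned as a plain string, it can be inspected with the standard library. The sketch below counts feature members regardless of namespace prefix. It assumes that each feature is wrapped in an element whose local name is featureMember, which is not confirmed by the truncated output above. The helper name and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def count_feature_members(gml_text):
    """Count elements whose local name is 'featureMember', ignoring namespaces.

    Hypothetical helper; assumes one featureMember wrapper per feature.
    """
    root = ET.fromstring(gml_text)
    return sum(
        1 for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "featureMember"
    )


# Tiny hand-written GML-like document, used only to exercise the helper.
sample_gml = """<ogr:FeatureCollection xmlns:ogr="http://ogr.maptools.org/"
                       xmlns:gml="http://www.opengis.net/gml/3.2">
  <ogr:featureMember><ogr:pt gml:id="pt.1"/></ogr:featureMember>
  <ogr:featureMember><ogr:pt gml:id="pt.2"/></ogr:featureMember>
</ogr:FeatureCollection>"""

print(count_feature_members(sample_gml))  # 2
```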

The to_gpkg() method stores the data locally in an OGC GeoPackage file.

[7]:
gpkg_file = 'sample.gpkg'
lucasio.to_gpkg(gpkg_file)

# check if the file exists
from pathlib import Path
print(Path(gpkg_file).exists())
True
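
A GeoPackage is an SQLite database, so beyond checking that the file exists, its layer registry can be read with the sqlite3 module. This is a minimal sketch: gpkg_contents is the contents table defined by the GeoPackage standard, while the helper name list_gpkg_layers is hypothetical. The layer names actually written by to_gpkg() are not shown in this notebook.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_gpkg_layers(path):
    """Return (table_name, data_type) rows from a GeoPackage's gpkg_contents table.

    Hypothetical helper for illustration; not part of the st_lucas API.
    """
    with closing(sqlite3.connect(path)) as conn:
        return conn.execute(
            "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
        ).fetchall()


# Only runs if the file written in the cell above is present.
if Path("sample.gpkg").exists():
    print(list_gpkg_layers("sample.gpkg"))
```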

1.3. Request examples

Besides filtering by a bounding box, LucasRequest also supports two other spatial filters:

* by countries
* by user-defined polygon

1.3.1. Filter by countries

The list of countries to filter by is defined by the countries property. Each country is given by its NUTS0 code. In the example below, the spatial filter is limited to the Czech Republic. The LUCAS subset is downloaded by a user-defined download() function.

[8]:
request = LucasRequest()
request.countries = ['CZ']

def download(request):
    lucasio = LucasIO()
    lucasio.download(request)
    print("Number of downloaded points:", lucasio.count())

download(request)
2024-03-09 23:43:19,165 - root - WARNING - __init__.getSRS - Requested srsName 'EPSG:3035' is not declared as being allowed for requested typename 'lucas:lucas_points'. Options are: 'urn:x-ogc:def:crs:EPSG:3035'.
2024-03-09 23:43:22,870 - LUCAS - INFO - io.download - Download process successfuly finished. Size of downloaded data: 74649kb
Number of downloaded points: 27227

The example below demonstrates retrieval of a LUCAS subset covering the Czech Republic and Slovakia.

[9]:
request = LucasRequest()
request.countries = ['CZ', 'SK']

download(request)
2024-03-09 23:43:34,646 - root - WARNING - __init__.getSRS - Requested srsName 'EPSG:3035' is not declared as being allowed for requested typename 'lucas:lucas_points'. Options are: 'urn:x-ogc:def:crs:EPSG:3035'.
2024-03-09 23:43:44,259 - LUCAS - INFO - io.download - Download process successfuly finished. Size of downloaded data: 112898kb
Number of downloaded points: 41618

1.3.2. Filter by polygon

A spatial filter can also be defined by polygon vertices.

The request length is limited, so the number of vertices is limited as well. The AOI polygon can contain only about 190 vertices.

[10]:
request = LucasRequest()
request.aoi_polygon = '''
<Within>
    <PropertyName>lucas:geom</PropertyName>
    <gml:Polygon xmlns:gml="http://www.opengis.net/gml" srsName="urn:ogc:def:crs:EPSG::3035" gml:id="polygon_32.geom.0">
        <gml:exterior>
            <gml:LinearRing>
                <gml:posList>3014669.3195414557 4640226.080241454 2981893.140187475 4628385.701013993 2965449.7283930806 4639492.816821902 2958781.6185918115 4658392.1858341275 2977549.274784839 4672892.4477362465 3004572.819247867 4661017.510044226 3014669.3195414557 4640226.080241454 </gml:posList>
            </gml:LinearRing>
        </gml:exterior>
    </gml:Polygon>
</Within>
'''

download(request)
2024-03-09 23:44:01,896 - root - WARNING - __init__.getSRS - Requested srsName 'EPSG:3035' is not declared as being allowed for requested typename 'lucas:lucas_points'. Options are: 'urn:x-ogc:def:crs:EPSG:3035'.
2024-03-09 23:44:02,136 - LUCAS - INFO - io.download - Download process successfuly finished. Size of downloaded data: 1518kb
Number of downloaded points: 554
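
Since the AOI polygon may hold only about 190 vertices, it can help to count them before sending a request. The sketch below parses a Within filter and counts coordinate pairs in the posList. The helper name, the MAX_VERTICES constant, and the small sample polygon are hypothetical, and 190 is the approximate figure quoted above rather than an exact threshold.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
MAX_VERTICES = 190  # approximate limit quoted in the text, not an exact threshold


def count_poslist_vertices(filter_xml):
    """Return the number of coordinate pairs in the first gml:posList.

    Hypothetical helper for illustration; not part of the st_lucas API.
    """
    root = ET.fromstring(filter_xml)
    pos_list = root.find(f".//{{{GML_NS}}}posList")
    coords = pos_list.text.split()
    if len(coords) % 2:
        raise ValueError("posList must contain an even number of values")
    return len(coords) // 2


# Small made-up polygon (a closed triangle, so 4 vertices).
sample_aoi = """
<Within>
    <PropertyName>lucas:geom</PropertyName>
    <gml:Polygon xmlns:gml="http://www.opengis.net/gml" srsName="urn:ogc:def:crs:EPSG::3035">
        <gml:exterior>
            <gml:LinearRing>
                <gml:posList>3000000 4600000 3010000 4600000 3005000 4610000 3000000 4600000</gml:posList>
            </gml:LinearRing>
        </gml:exterior>
    </gml:Polygon>
</Within>
"""

n_vertices = count_poslist_vertices(sample_aoi)
print(n_vertices, "vertices; within limit:", n_vertices <= MAX_VERTICES)
```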

1.3.3. Filter by years

By default, all survey years are retrieved. The years property limits the request to a list of survey years.

[11]:
request = LucasRequest()
request.countries = ['AT']
request.years = [2006, 2009]

download(request)
2024-03-09 23:44:02,826 - root - WARNING - __init__.getSRS - Requested srsName 'EPSG:3035' is not declared as being allowed for requested typename 'lucas:lucas_points'. Options are: 'urn:x-ogc:def:crs:EPSG:3035'.
2024-03-09 23:44:03,611 - LUCAS - INFO - io.download - Download process successfuly finished. Size of downloaded data: 11222kb
Number of downloaded points: 4961

1.3.4. Thematic groups

A thematic group determines which attributes will describe the downloaded data. There are 5 groups:

* LC_LU - Land cover, Land use;
* LC_LU_SO - Land cover, Land use, Soil;
* CO - Copernicus;
* FO - Forestry;
* IN - INSPIRE.

See the list of LUCAS attributes on the ST_LUCAS website.

The downloaded data always contain the mandatory attributes, which mainly define the location of the point.

A combination of a country filter and a group (Copernicus) is presented below.

[12]:
request = LucasRequest()
request.countries = ['CZ']
request.group = 'CO'

lucasio = LucasIO()
lucasio.download(request)

points = lucasio.to_geopandas()
print("Number of attributes:", len(points.columns))
2024-03-09 23:44:06,105 - root - WARNING - __init__.getSRS - Requested srsName 'EPSG:3035' is not declared as being allowed for requested typename 'lucas:lucas_points_co'. Options are: 'urn:x-ogc:def:crs:EPSG:3035'.
2024-03-09 23:44:08,340 - LUCAS - INFO - io.download - Download process successfuly finished. Size of downloaded data: 31995kb
Number of attributes: 42
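
The WARNING lines in this notebook show that the requested layer name changes with the group and with space-time aggregation (for example lucas:lucas_points, lucas:lucas_points_co, lucas:lucas_st_points_lc_lu). The sketch below reproduces that naming pattern as inferred from those log lines only. The helper layer_typename is hypothetical, it is not how st_lucas is known to build the name, and groups not seen in the logs may behave differently.

```python
def layer_typename(group=None, st_aggregated=False):
    """Guess the layer typename from the pattern visible in the log output.

    Hypothetical helper; the pattern is inferred from log lines in this
    notebook and is not taken from the st_lucas source code.
    """
    base = "lucas:lucas_st_points" if st_aggregated else "lucas:lucas_points"
    return f"{base}_{group.lower()}" if group else base


print(layer_typename())                             # lucas:lucas_points
print(layer_typename(group="CO"))                   # lucas:lucas_points_co
print(layer_typename("LC_LU", st_aggregated=True))  # lucas:lucas_st_points_lc_lu
```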

1.3.5. Filter by attributes

The attribute filter can use any LUCAS attribute (see the list of LUCAS attributes) or a combination of attributes.

In the example below, only LUCAS locations that were visited 5 times are retrieved.

[13]:
from owslib.fes import PropertyIsEqualTo

request = LucasRequest()
request.bbox = (4504276, 3020369, 4689608, 3105290)
request.propertyname = 'SURVEY_COUNT'
request.operator = PropertyIsEqualTo
request.literal = 5
request.st_aggregated = True

download(request)
2024-03-09 23:44:31,290 - root - WARNING - __init__.getSRS - Requested srsName 'EPSG:3035' is not declared as being allowed for requested typename 'lucas:lucas_st_points'. Options are: 'urn:x-ogc:def:crs:EPSG:3035'.
2024-03-09 23:44:31,721 - LUCAS - INFO - io.download - Download process successfuly finished. Size of downloaded data: 3011kb
Number of downloaded points: 199
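
To make the attribute filter less opaque, the sketch below builds the OGC Filter Encoding element that a "property equals literal" comparison corresponds to, using only the standard library. It illustrates the general XML shape of such a filter. Whether st_lucas sends exactly this fragment is an assumption, and the helper name property_is_equal_to is hypothetical.

```python
import xml.etree.ElementTree as ET

OGC_NS = "http://www.opengis.net/ogc"
ET.register_namespace("ogc", OGC_NS)


def property_is_equal_to(name, literal):
    """Serialize an ogc:PropertyIsEqualTo comparison as an XML string.

    Hypothetical helper for illustration; not part of st_lucas or OWSLib.
    """
    node = ET.Element(f"{{{OGC_NS}}}PropertyIsEqualTo")
    ET.SubElement(node, f"{{{OGC_NS}}}PropertyName").text = name
    ET.SubElement(node, f"{{{OGC_NS}}}Literal").text = str(literal)
    return ET.tostring(node, encoding="unicode")


filter_xml = property_is_equal_to("SURVEY_COUNT", 5)
print(filter_xml)
```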

1.3.6. Space-time aggregation

The st_aggregated property is used to determine whether the data should be space-time aggregated. Space-time aggregation means that one record in the attribute table represents one point with all values measured in all years. Otherwise, every single record in the attribute table represents one survey.

In the example below, space-time aggregated LUCAS points located in the Czech Republic are queried.

[14]:
request = LucasRequest()
request.group = 'LC_LU'
request.countries = ['CZ']
request.st_aggregated = True

lucasio = LucasIO()
lucasio.download(request)

points = lucasio.to_geopandas()
points.columns
2024-03-09 23:44:32,665 - root - WARNING - __init__.getSRS - Requested srsName 'EPSG:3035' is not declared as being allowed for requested typename 'lucas:lucas_st_points_lc_lu'. Options are: 'urn:x-ogc:def:crs:EPSG:3035'.
2024-03-09 23:44:36,715 - LUCAS - INFO - io.download - Download process successfuly finished. Size of downloaded data: 65435kb
[14]:
Index(['point_id', 'nuts0', 'nuts1_2006', 'nuts1_2009', 'nuts1_2012',
       'nuts1_2015', 'nuts1_2018', 'nuts2_2006', 'nuts2_2009', 'nuts2_2012',
       ...
       'ex_ante_2015', 'ex_ante_2018', 'survey_count', 'survey_dist_2006',
       'survey_dist_2009', 'survey_dist_2012', 'survey_dist_2015',
       'survey_dist_2018', 'survey_maxdist', 'geometry'],
      dtype='object', length=360)
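
The column names above (nuts1_2006, nuts1_2009, ..., survey_count) reflect the wide layout that space-time aggregation produces. The following sketch imitates the idea on plain dictionaries: per-survey records are pivoted into one record per point, with the year appended to each attribute name. The function, the survey_year field name, and the lc1 sample values are hypothetical; this is not the aggregation code used by ST_LUCAS.

```python
from collections import defaultdict


def aggregate_surveys(records, key="point_id", year_field="survey_year"):
    """Pivot per-survey records into one record per point with <attr>_<year> keys.

    Hypothetical illustration of space-time aggregation on plain dicts.
    """
    points = defaultdict(dict)
    counts = defaultdict(int)
    for rec in records:
        point = points[rec[key]]
        point[key] = rec[key]
        counts[rec[key]] += 1
        for attr, value in rec.items():
            if attr not in (key, year_field):
                point[f"{attr}_{rec[year_field]}"] = value
    for point_key, point in points.items():
        point["survey_count"] = counts[point_key]
    return list(points.values())


# Made-up survey records: point 1 was surveyed twice, point 2 once.
surveys = [
    {"point_id": 1, "survey_year": 2015, "lc1": "B11"},
    {"point_id": 1, "survey_year": 2018, "lc1": "B16"},
    {"point_id": 2, "survey_year": 2018, "lc1": "C10"},
]

for row in aggregate_surveys(surveys):
    print(row)
```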

In the example below, space-time aggregated LUCAS points are queried only for the years 2015 and 2018.

[15]:
request = LucasRequest()

request.years = [2015, 2018]
request.bbox = (4624127, 2998330, 4650393, 3013986)
request.st_aggregated = True

download(request)
2024-03-09 23:48:49,146 - root - WARNING - __init__.getSRS - Requested srsName 'EPSG:3035' is not declared as being allowed for requested typename 'lucas:lucas_st_points'. Options are: 'urn:x-ogc:def:crs:EPSG:3035'.
2024-03-09 23:48:49,391 - LUCAS - INFO - io.download - Download process successfuly finished. Size of downloaded data: 432kb
Number of downloaded points: 54

1.3.7. Show photos of selected LUCAS point

List the point_id values of the downloaded points using the GeoPandas library.

[16]:
request = LucasRequest()

request.years = [2015, 2018]
request.bbox = (4624127, 2998330, 4650393, 3013986)

lucasio = LucasIO()
lucasio.download(request)

df = lucasio.to_geopandas()
df[["point_id"]]
2024-03-09 23:48:51,969 - root - WARNING - __init__.getSRS - Requested srsName 'EPSG:3035' is not declared as being allowed for requested typename 'lucas:lucas_points'. Options are: 'urn:x-ogc:def:crs:EPSG:3035'.
2024-03-09 23:48:52,091 - LUCAS - INFO - io.download - Download process successfuly finished. Size of downloaded data: 257kb
[16]:
point_id
0 46283000
1 46423000
2 46463000
3 46363012
4 46363000
... ...
58 46343004
59 46403006
60 46343006
61 46383004
62 46323002

63 rows × 1 columns

1.3.7.1. Display photos

Show photos of a selected LUCAS point by calling the get_images() method with the year and the point_id of the point. This returns a dictionary of URL addresses of 5 photos: the point itself and the views to the north, south, east, and west.

[17]:
# take the point_id of the first downloaded point (avoid shadowing the built-in id)
point_id = df["point_id"].iloc[0]
images = lucasio.get_images(2015, point_id)

Let’s display the photo of the point itself (key "P").

[18]:
import requests
from IPython.display import Image

r = requests.get(images["P"])
Image(r.content)
[18]:
../_images/notebooks_01_st_lucas_access_40_0.jpg

1.3.7.2. Download photos

LUCAS photos can be downloaded by the download_images() method. The method returns the path to the created zip archive with the photos.

[19]:
filename = lucasio.download_images(images, "/tmp")
2024-03-09 23:49:04,587 - LUCAS - INFO - io.download_images - LUCAS photos downloaded to: /tmp/46283000_2015.zip

Let’s check the downloaded zip archive.

[20]:
from zipfile import ZipFile

print(filename)
with ZipFile(filename) as fd:
    print(fd.namelist())
/tmp/46283000_2015.zip
['P.jpg', 'S.jpg', 'N.jpg', 'E.jpg', 'W.jpg']
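
Not every point necessarily has all five photos, so a quick completeness check on the archive can be useful. The sketch below reports which of the expected file names are absent. The expected names are taken from the listing above; the helper missing_photos is hypothetical, and the demo archive is built in memory rather than downloaded.

```python
import io
import zipfile

# File names as listed in the archive above (point, north, east, south, west).
EXPECTED_PHOTOS = ("P.jpg", "N.jpg", "E.jpg", "S.jpg", "W.jpg")


def missing_photos(zip_file, expected=EXPECTED_PHOTOS):
    """Return the expected photo names that are not present in the archive.

    Hypothetical helper; zip_file may be a path or a file-like object.
    """
    with zipfile.ZipFile(zip_file) as archive:
        present = set(archive.namelist())
    return [name for name in expected if name not in present]


# In-memory demo archive holding only two of the five photos.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("P.jpg", b"")
    archive.writestr("N.jpg", b"")

print(missing_photos(buffer))  # ['E.jpg', 'S.jpg', 'W.jpg']
```

For the archive downloaded above, missing_photos(filename) would be expected to return an empty list.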