Economic activities in Zürich

The Zürich Statistical Office collects data on the city and its residents and publishes it as Linked Data.

In this tutorial, we show how to work with Linked Data, focusing on data about economic activities.
We will look at how to query, process, and visualize it.

SPARQL endpoint

Data on some economic activities is published as Linked Data. It can be accessed with SPARQL queries.
You can send queries using HTTP requests. The API endpoint is https://ld.integ.stadt-zuerich.ch/query.
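
As a rough illustration of what such an HTTP request looks like, the sketch below builds a SPARQL 1.1 Protocol POST request using only Python's standard library. The helper name build_sparql_request and the example query are assumptions made for illustration, and this is not a description of how graphly works internally. The request is only constructed here, not sent.

```python
import urllib.parse
import urllib.request

ENDPOINT = "https://ld.integ.stadt-zuerich.ch/query"


def build_sparql_request(endpoint, query):
    """Build (but do not send) a SPARQL protocol POST request.

    Hypothetical helper for illustration only.
    """
    # The SPARQL 1.1 Protocol accepts the query as a URL-encoded form field.
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            # Ask for the standard SPARQL JSON results format.
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


request = build_sparql_request(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(request.get_method(), request.full_url)
```

Passing the request object to urllib.request.urlopen would actually send it. In the rest of this tutorial, that step is left to graphly.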

Let's use SparqlClient from graphly to communicate with the database. Graphly will allow us to:

  • send SPARQL queries
  • automatically add prefixes to all queries
  • format responses as pandas or geopandas data frames
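
To give a feeling for the formatting step, here is a minimal sketch that flattens a response in the W3C SPARQL JSON results format into plain Python dictionaries. The sample response is hand-written for illustration, and the helper name bindings_to_rows is an assumption; graphly's own conversion may work differently.

```python
import json

# A tiny hand-written response in the W3C SPARQL JSON results format.
sample_response = json.loads("""
{
  "head": {"vars": ["time", "restaurants"]},
  "results": {"bindings": [
    {"time": {"type": "literal", "value": "1934-12-31"},
     "restaurants": {"type": "literal", "value": "1328"}}
  ]}
}
""")


def bindings_to_rows(response):
    """Flatten SPARQL JSON bindings into a list of plain dicts.

    Hypothetical helper for illustration only.
    """
    variables = response["head"]["vars"]
    return [
        # Unbound variables are absent from a binding, so fall back to None.
        {var: binding.get(var, {}).get("value") for var in variables}
        for binding in response["results"]["bindings"]
    ]


rows = bindings_to_rows(sample_response)
print(rows)
```

Note that all values arrive as strings in this format. Converting them to dates and numbers is a separate step.
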
In [1]:
# Uncomment to install dependencies in Colab environment
#!pip install mapclassify
#!pip install git+https://github.com/zazuko/graphly.git
In [2]:
import mapclassify
import matplotlib
import matplotlib.cm

import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

from graphly.api_client import SparqlClient
In [3]:
sparql = SparqlClient("https://ld.integ.stadt-zuerich.ch/query")
wikisparql = SparqlClient("https://query.wikidata.org/sparql")

sparql.add_prefixes({
    "schema": "<http://schema.org/>",
    "cube": "<https://cube.link/>",
    "property": "<https://ld.stadt-zuerich.ch/statistics/property/>",
    "measure": "<https://ld.stadt-zuerich.ch/statistics/measure/>",
    "skos": "<http://www.w3.org/2004/02/skos/core#>",
    "ssz": "<https://ld.stadt-zuerich.ch/statistics/>"
})

SPARQL queries can become very long. To improve readability, we will work with prefixes.

Using the add_prefixes method, we can define persistent prefixes. Every time you send a query, graphly will automatically add these prefixes for you.
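
Conceptually, adding prefixes means placing PREFIX declarations in front of the query text. The sketch below shows this idea with a hypothetical with_prefixes helper and an example query. It is an assumption about the general mechanism, not graphly's actual implementation.

```python
prefixes = {
    "schema": "<http://schema.org/>",
    "cube": "<https://cube.link/>",
}


def with_prefixes(prefixes, query):
    """Prepend one PREFIX declaration per entry to a SPARQL query.

    Hypothetical helper for illustration only.
    """
    header = "\n".join(f"PREFIX {name}: {iri}" for name, iri in prefixes.items())
    return f"{header}\n{query.strip()}"


full_query = with_prefixes(prefixes, "SELECT * WHERE { ?cube a cube:Cube } LIMIT 5")
print(full_query)
```

With the declarations in place, cube:Cube in the query body stands for the full IRI https://cube.link/Cube.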

Restaurants over time

Let's find the number of restaurants in Zürich over time. This information is available in the AST-BTA data cube. To put the restaurant numbers in context, let's scale them by population size. The number of inhabitants over time can be found in the BEW data cube.

The query for the number of inhabitants and restaurants over time is as follows:

In [4]:
query = """
SELECT *
FROM <https://lindas.admin.ch/stadtzuerich/stat>
WHERE {
  {
    # Restaurants over time, from the AST-BTA cube
    SELECT ?time (SUM(?ast) AS ?restaurants)
    WHERE {
      ssz:AST-BTA a cube:Cube ;
                  cube:observationSet/cube:observation ?obs_rest .
      ?obs_rest property:TIME ?time ;
                property:RAUM <https://ld.stadt-zuerich.ch/statistics/code/R30000> ;
                property:BTA <https://ld.stadt-zuerich.ch/statistics/code/BTA5000> ;
                measure:AST ?ast .
    }
    GROUP BY ?time
  }
  {
    # Population over time, from the BEW cube
    SELECT ?time ?pop
    WHERE {
      ssz:BEW a cube:Cube ;
              cube:observationSet/cube:observation ?obs_pop .
      ?obs_pop property:TIME ?time ;
               property:RAUM <https://ld.stadt-zuerich.ch/statistics/code/R30000> ;
               measure:BEW ?pop .
    }
  }
}
ORDER BY ?time
"""

df = sparql.send_query(query)
df.head()
Out[4]:
         time  restaurants       pop
0  1934-12-31       1328.0  315864.0
1  1935-12-31       1327.0  317157.0
2  1936-12-31       1321.0  317712.0
3  1937-12-31       1321.0  318926.0
4  1938-12-31       1334.0  326979.0

Let's calculate the number of restaurants per 10 000 inhabitants.

In [5]:
# Forward-fill missing values; fillna(method="ffill") is deprecated in recent pandas
df = df.ffill()
df["Restaurants per 10 000 inhabitants"] = df["restaurants"] / df["pop"] * 10000
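
To make the scaling concrete, the following sketch repeats the same arithmetic in plain Python for the first two rows of the query result shown above. The helper name per_10_000 is an assumption introduced for illustration.

```python
# First rows of the query result: (time, restaurants, population).
rows = [
    ("1934-12-31", 1328.0, 315864.0),
    ("1935-12-31", 1327.0, 317157.0),
]


def per_10_000(count, population):
    """Scale a raw count to a rate per 10 000 inhabitants."""
    return count / population * 10_000


rates = {time: per_10_000(restaurants, pop) for time, restaurants, pop in rows}
for time, rate in rates.items():
    print(time, round(rate, 1))
```

For 1934 this gives roughly 42 restaurants per 10 000 inhabitants.
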
In [6]:
fig = px.line(df, x="time", y="Restaurants per 10 000 inhabitants", labels={"time": "Years"})
fig.update_layout(title_text="Restaurants in Zürich over time", title_x=0.5)