Out on the streets, leveraging geographical data - Part 1
Table of Contents
Introduction
This post will focus on GIS data, visualization and querying geographical data in PostgreSQL and PostGIS.
We'll be looking at ways of finding things, using distances and leveraging SQL, visualizing geograpical data.
This is part1 of a series:
Setup
Install
You will need PostgreSQL 9.4 please follow the steps outlined here in order to get installed.
In order to install the latest Postgis compatible with PostgreSQL see the instructions here.
After you follow those, you can install the needed packages:
sudo aptitude install \ libxml2-dev libxslt1-dev \ postgis postgis-doc postgresql \ libpq-dev osm2pgsql \ postgresql-9.4 \ postgresql-client-9.4 \ postgresql-client-9.4 \ postgresql-9.4-postgis-2.1 \ postgresql-9.4-pgrouting postgresql-contrib-9.4
Now switch to the postgres user and create a database
CREATE USER "user" WITH PASSWORD 'somepass'; CREATE DATABASE geodb; GRANT ALL PRIVILEGES ON DATABASE "geodb" TO "user";
Now connect to that database and add a few extensions
\connect geodb CREATE EXTENSION postgis; CREATE EXTENSION pgrouting; CREATE EXTENSION postgis_topology; CREATE EXTENSION hstore;
Geographical data import
We'll be using data from Openstreetmap.
Openstreetmap has country-level data dumps for Romania listed here. We'll work with
the .osm.gz data dump for Bucharest (capital of Romania), but since
it's only available at country level, we'll download the data for the
entire country from here (there are other places where
one can get the same data, but we'll be using the data dumps from geofabrik.de)
Since the data for Romania is delivered at country-level and we just
want part of it, we can use the bounding box parameter -b to get
only the data within a bounding box defined by lower-left point and
upper-right points.
(The data import took around 2 minutes)
#!/bin/bash export OSM_CREATE='osm2pgsql -k -c -d geodb -j --number-processes 4 -U user -H localhost -P 5432' export OSM_APPEND='osm2pgsql -k -a -d geodb -j --number-processes 4 -U user -H localhost -P 5432' # extract Bucharest from Romania OSM file $OSM_CREATE romania-latest.osm.bz2 -b 25.8941,44.3312,26.4317,44.5408
Leveraging GeoJSON to use data from PostgreSQL in LeafletJS
At the time of writing this, the latest specification of GeoJSON is a draft from IETF. A collection of points represented with GeoJSON contains multiple "features". Each feature has a geometry and some properties. This is roughly how it looks:
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [102.0, 0.5]
},
"properties": {
"prop0": "value0"
}
},
...
]
}
Now let's write a query to generate GeoJSON data in that format. We'll use a constant table with some predefined locations and names.
SELECT json_build_object( 'type' , 'FeatureCollection', 'features', json_agg( json_build_object( 'type' , 'Feature', 'geometry' , ST_AsGeoJSON(sample.geom)::json, 'properties', json_build_object( 'name', name ) ) ) ) FROM ( VALUES ('Carol Park' , ST_GeomFromEWKT('SRID=4326;POINT(26.096894 44.4136414 )')), ('Herastrau Park' , ST_GeomFromEWKT('SRID=4326;POINT(26.0829941 44.4705056)')) ) AS sample (name, geom);
Now we can visualize the GeoJSON data on the map using LeafletJS
Queries
Finding some parks
Park point query
In this query, we're selecting all the polygons tagged with the park label. We're then computing the centroid of each of those polygons and then we're going to put them on a map.
The planet_osm_polygon table and the way column is of type
geometry and contains ST_Polygon objects.
-- SHOW SERVER_ENCODING; -- SHOW CLIENT_ENCODING; \encoding "utf-8" \o 'map-data-4.json' WITH parks AS ( SELECT name AS name, ST_TRANSFORM(ST_Centroid(way),4326) AS geom, ST_Area(way) AS area FROM planet_osm_polygon WHERE tags->'leisure'='park' AND name IS NOT NULL ORDER BY name DESC ), gis_data AS ( SELECT json_build_object( 'type' , 'FeatureCollection', 'features', json_agg( json_build_object( 'type' , 'Feature', 'geometry' , ST_AsGeoJSON(geom)::json, 'properties', json_build_object( 'name', name, 'area', area ) ) ) ) AS json_data FROM parks ) SELECT json_data::text FROM gis_data; \o
Wait, what happened ? There can't be that many parks.. something's not right.
Let's only get the parks larger than 500,000 squared meters.
Park polygon query
PostgreSQL can serialize the polygons as well and GeoJSON and LeafletJS also support this type of geometry.
A polygon in GeoJSON format looks like this, it has a list of all the vertices, and ends with the first vertex.
{
"geometry" : {
"type" : "Polygon",
"coordinates" : [
[
26.149838354955,
44.4222853535813
],
[
26.1499807379275,
44.4224712823627
],
...
...
[
26.149838354955,
44.4222853535813
]
]
...
}
}
Now we're going to look at park polygons instead of their centroids.
-- SHOW SERVER_ENCODING; -- SHOW CLIENT_ENCODING; \encoding "utf-8" \o 'map-data-6.json' WITH parks AS ( SELECT name AS name, ST_TRANSFORM(way,4326) AS geom, ST_Area(way) AS area FROM planet_osm_polygon WHERE tags->'leisure'='park' AND name IS NOT NULL AND ST_Area(way) > 500000 ORDER BY name DESC ), gis_data AS ( SELECT json_build_object( 'type' , 'FeatureCollection', 'features', json_agg( json_build_object( 'type' , 'Feature', 'geometry' , ST_AsGeoJSON(geom)::json, 'properties', json_build_object( 'name', name, 'area', area ) ) ) ) AS json_data FROM parks ) SELECT json_data::text FROM gis_data; \o
Finding cafes and pubs
You can usually find data about a location in Openstreetmap by looking at the amenity column which is just TEXT,
or you can look at the tags column which is of HSTORE type and holds key-value pairs (you may find the available
keys in this multi-page listing of the keys)
\encoding "utf-8" \o 'map-data-7.json' WITH cafes AS ( SELECT name AS name, ST_TRANSFORM(way,4326) AS geom FROM planet_osm_point WHERE amenity IN ('cafe','restaurant','fast_food','pub','coffeehouse') AND name IS NOT NULL ORDER BY name DESC ), gis_data AS ( SELECT json_build_object( 'type' , 'FeatureCollection', 'features', json_agg( json_build_object( 'type' , 'Feature', 'geometry' , ST_AsGeoJSON(geom)::json, 'properties', json_build_object( 'name', name ) ) ) ) AS json_data FROM cafes ) SELECT json_data::text FROM gis_data; \o
Other locations
Now here's some other interesting queries for common locations you can find in the Openstreetmap data.
Cinemas
SELECT name, way FROM planet_osm_point WHERE amenity='cinema';
ATMs/banks
SELECT name, way FROM planet_osm_point WHERE tags->'atm'='yes' OR amenity='bank';
Pharmacies
SELECT name, way FROM planet_osm_point WHERE amenity='pharmacy';
Places that sell internet USB sticks (we're looking for the most common mobile operators)
SELECT name, way FROM planet_osm_point WHERE LOWER(name) SIMILAR TO '%(euska|eroski|digimobil|movistar|vodafone|orange|yoigo|lebara|lycamob|tuenti)%'
Places with public wifi
SELECT name, way FROM planet_osm_point WHERE tags->'internet_access' = 'yes' OR tags->'internet_access' = 'wlan' OR tags->'internet_access' = 'public' OR tags->'wifi' = 'yes'
Hospitals
SELECT name, way FROM planet_osm_point WHERE building='hospital' OR tags->'building' = 'hospital'
Laundrettes
SELECT name, way FROM planet_osm_point WHERE shop='laundry';
Conclusion
We've seen how to use PostGIS + PostgreSQL and Leaflet to visualize data. We've mentioned the GeoJSON standard and used it to transfer data between PostGIS and LeafletJS in order to visualize it on a map.