Developer corner: cross layer filtering coming down to GeoServer

Ever needed to find the answer to a question involving the spatial relationship between two layers? Something like “get me all the coffee shops within 100 meters of the subway M exits”? Or “show me all the industrial buildings in counties X, Y and Z”?

Both of those queries require finding features in one layer, possibly with an attribute filter, and then locating all the features in another layer that have some kind of spatial relationship with the first result set.
This is known as a spatial join, and it is used to find objects that relate to each other spatially, for example by proximity or containment.

Now, GeoServer does not natively support layer joining. Until today, answering a question like the ones above through WFS would have entailed:

  • make a first WFS GetFeature request to get the locations of subway M
  • parse the result, build a MultiPoint geometry
  • use it in a second WFS GetFeature against the commercial activities layer using a DWithin filter against the above constructed geometries
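
The three steps above can be sketched roughly as follows. This is only an illustration in Python: the layer names `subway_exits` and `shops`, the base URL and the sample coordinates are made up, and parsing the first response is left out (we pretend it already gave us a list of points). The `cql_filter` parameter is the GeoServer vendor parameter for GET requests.

```python
from urllib.parse import urlencode

def multipoint_wkt(points):
    """Step 2: collapse the (x, y) locations from the first response into one WKT MultiPoint."""
    coords = ", ".join(f"({x} {y})" for x, y in points)
    return f"MULTIPOINT ({coords})"

def dwithin_cql(geom_attr, wkt, distance, units="meters"):
    """Step 3: CQL filter matching features within `distance` of the given geometry."""
    return f"DWITHIN({geom_attr}, {wkt}, {distance}, {units})"

def get_feature_url(base_url, type_name, cql_filter=None):
    """Build a WFS GetFeature GET URL, optionally filtered with CQL."""
    params = {
        "service": "WFS",
        "version": "1.0.0",
        "request": "GetFeature",
        "typeName": type_name,
    }
    if cql_filter:
        params["cql_filter"] = cql_filter
    return base_url + "?" + urlencode(params)

BASE = "http://localhost:8080/geoserver/wfs"  # assumed local install

# Step 1: first round trip (hypothetical layer name).
first_request = get_feature_url(BASE, "subway_exits")

# Pretend these came back from parsing the first response.
exits = [(1000, 2000), (1500, 2500)]

# Steps 2 and 3: second round trip carrying the whole geometry in the URL.
second_request = get_feature_url(
    BASE, "shops", dwithin_cql("the_geom", multipoint_wkt(exits), 100)
)
```

Note how the second URL grows with every point returned by the first request, which is exactly where the trouble described below starts.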

If you are working with a web-based client, this is troublesome. First, there are two round trips to the server. Second, a web-based client normally cannot deal with many points or large geometries (state boundaries, rivers and the like), and may well run out of memory handling requests that pull them up.

Doing the same with WMS is even more aggravating: the filtering is most likely done with CQL, but a large geometry simply won’t fit within the maximum length allowed for GET requests (this can be worked around with dynamic SLD or POST requests, but that adds other complications).

Enter the new “querylayer” GeoServer module.
The module provides filter functions that query and summarize the data of GeoServer layers, so that the whole process can be done in a single WFS request or a single, compact CQL filter.

Let’s make a simple example using the GeoServer sample data: we want to find all bugsites inside the restricted area whose “cat” attribute equals 3.
The WFS request would be:
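
A minimal sketch of what such a request might look like, wrapped in a bit of Python so it can be posted to a server. It assumes a WFS 1.0.0 POST body, the `sf` workspace of the sample data with its default namespace URI, and the same three `querySingle` arguments used in the CQL form; depending on the configuration the layer name may need its workspace prefix. The helper `post_get_feature` is ours, not part of GeoServer.

```python
import urllib.request
import xml.etree.ElementTree as ET

OGC = "http://www.opengis.net/ogc"

# GetFeature on bugsites, intersecting with the geometry of the
# restricted area selected by the attribute filter "cat = 3".
GET_FEATURE = """<wfs:GetFeature xmlns:wfs="http://www.opengis.net/wfs"
    xmlns:ogc="http://www.opengis.net/ogc"
    xmlns:sf="http://www.openplans.org/spearfish"
    service="WFS" version="1.0.0">
  <wfs:Query typeName="sf:bugsites">
    <ogc:Filter>
      <ogc:Intersects>
        <ogc:PropertyName>the_geom</ogc:PropertyName>
        <ogc:Function name="querySingle">
          <ogc:Literal>restricted</ogc:Literal>
          <ogc:Literal>the_geom</ogc:Literal>
          <ogc:Literal>cat = 3</ogc:Literal>
        </ogc:Function>
      </ogc:Intersects>
    </ogc:Filter>
  </wfs:Query>
</wfs:GetFeature>"""

def post_get_feature(url, body=GET_FEATURE):
    """POST the request body to a WFS endpoint and return the raw response bytes."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read()

# Sanity check that the body is well-formed before sending it anywhere.
root = ET.fromstring(GET_FEATURE)
function_call = root.find(f".//{{{OGC}}}Function")
```

Against a local install, the call would be something like `post_get_feature("http://localhost:8080/geoserver/wfs")`.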

Whilst a WMS request doing the same would use a CQL filter as simple as:

INTERSECTS(the_geom, querySingle('restricted', 'the_geom', 'cat = 3'))
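
To give an idea of how compact the resulting GetMap request is, here is a small Python sketch that builds the URL. The base URL, the image size and the bounding box are placeholders (pick the extent you actually want to render), and we assume the sample layers are published in EPSG:26713 under the `sf` workspace.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

CQL = "INTERSECTS(the_geom, querySingle('restricted', 'the_geom', 'cat = 3'))"

def get_map_url(base_url, layer, cql_filter, bbox, srs, size=(512, 512)):
    """Build a WMS 1.1.1 GetMap URL filtered through the cql_filter vendor parameter."""
    params = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": layer,
        "styles": "",
        "bbox": ",".join(str(value) for value in bbox),
        "srs": srs,
        "width": size[0],
        "height": size[1],
        "format": "image/png",
        "cql_filter": cql_filter,
    }
    return base_url + "?" + urlencode(params)

url = get_map_url(
    "http://localhost:8080/geoserver/wms",  # assumed local install
    "sf:bugsites",
    CQL,
    bbox=(589000, 4913000, 610000, 4929000),  # placeholder extent
    srs="EPSG:26713",
)
```

The filter stays the same size no matter how complex the restricted area geometry is, so the GET length limit is no longer a concern.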

The following map shows the result with some context:

Want to find all the bug sites within 200 meters of any road? Here we go:
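
Again as a sketch only, this time building the request body programmatically in Python to make the nesting of the two functions explicit. It assumes a WFS 1.0.0 request, the `sf` sample workspace, and a `units` attribute on `ogc:Distance` spelled `meter`; the helper names are ours.

```python
import xml.etree.ElementTree as ET

OGC = "http://www.opengis.net/ogc"
WFS = "http://www.opengis.net/wfs"
ET.register_namespace("ogc", OGC)
ET.register_namespace("wfs", WFS)

def literal(value):
    """An ogc:Literal holding a constant argument."""
    element = ET.Element(f"{{{OGC}}}Literal")
    element.text = str(value)
    return element

def function(name, *args):
    """An ogc:Function call whose arguments are other filter expressions."""
    element = ET.Element(f"{{{OGC}}}Function", {"name": name})
    element.extend(args)
    return element

def dwithin_request(type_name, geom_attr, geometry_expr, distance, units):
    """GetFeature body selecting features within `distance` of `geometry_expr`."""
    root = ET.Element(f"{{{WFS}}}GetFeature", {"service": "WFS", "version": "1.0.0"})
    # Declare the prefix used inside the typeName attribute value.
    root.set("xmlns:sf", "http://www.openplans.org/spearfish")
    query = ET.SubElement(root, f"{{{WFS}}}Query", {"typeName": type_name})
    filter_ = ET.SubElement(query, f"{{{OGC}}}Filter")
    dwithin = ET.SubElement(filter_, f"{{{OGC}}}DWithin")
    ET.SubElement(dwithin, f"{{{OGC}}}PropertyName").text = geom_attr
    dwithin.append(geometry_expr)
    ET.SubElement(dwithin, f"{{{OGC}}}Distance", {"units": units}).text = str(distance)
    return ET.tostring(root, encoding="unicode")

# queryCollection returns every road geometry, collectGeometries merges them
# into a single geometry collection that DWithin can work against.
all_roads = function(
    "queryCollection", literal("roads"), literal("the_geom"), literal("INCLUDE")
)
body = dwithin_request(
    "sf:bugsites", "the_geom", function("collectGeometries", all_roads), 200, "meter"
)
```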

and the CQL equivalent would be:

DWITHIN(the_geom, collectGeometries(queryCollection('roads', 'the_geom', 'INCLUDE')), 200, meters)

which results in the following map:

Pretty cool eh? Naturally all roses have their thorns and this is no exception:

  • The requests, while valid, are going to work only with GeoServer; they are not portable to other servers
  • The machinery is not actually able to do an in-database join. It just loads the data in memory first and replaces the function call with the result, so that intersects/dwithin can hit the database directly. This means the intermediate data is memory bound, and we actually have some admin-configurable limits to make sure the requests are not loading too much data in memory (when the threshold is crossed an exception will be thrown)
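
The second point can be pictured with a toy model. This is not the module's code, just a conceptual Python sketch with made-up names (`query_collection`, `TooMuchDataError`, `max_features`): the inner query is evaluated eagerly in memory, guarded by a limit, and only its result reaches the outer filter.

```python
class TooMuchDataError(Exception):
    """Raised when the intermediate result exceeds the configured limit."""

def query_collection(features, attribute, matches, max_features):
    """Collect `attribute` from every matching feature, refusing to go past the limit."""
    collected = []
    for feature in features:
        if not matches(feature):
            continue
        if len(collected) >= max_features:
            raise TooMuchDataError(f"more than {max_features} features matched")
        collected.append(feature[attribute])
    return collected

# A stand-in for the roads layer; the geometries are plain WKT strings here.
roads = [
    {"cat": 1, "the_geom": "LINESTRING (0 0, 10 0)"},
    {"cat": 2, "the_geom": "LINESTRING (0 5, 10 5)"},
    {"cat": 2, "the_geom": "LINESTRING (0 9, 10 9)"},
]

# The function call is replaced by this in-memory list before the outer
# spatial filter is handed to the data store.
geometries = query_collection(roads, "the_geom", lambda f: f["cat"] == 2, max_features=10)
```

With `max_features=1` the same call would raise `TooMuchDataError` instead of returning two geometries, which is the behaviour the admin limits are meant to enforce.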

The “querylayer” module is now a GeoServer community module that can be downloaded as part of the trunk nightly builds (starting tomorrow). It is available for both GeoServer trunk and GeoServer 2.1.x, though in the latter case you’ll want to add the -Dorg.geotools.filter.function.simplify=true JVM option to get the best performance out of the module (it enables a specific querying optimization that is available by default on trunk).

We’d like to thank GeoSmart for sponsoring the development of this module. It’s a pretty useful addition to the GeoServer data access arsenal.

As said, joining support could be improved even further. Interested? Let us know!

The GeoSolutions team