Developer’s Corner: Disk Quota Clustering in GeoWebCache

Dear All,
recently one of our clients asked us to work on GeoWebCache to improve its support for clustered deployments.

As you might know, GeoWebCache has historically had problems in enterprise set-ups where multiple instances were used in front of the same cache, with or without disk quota enabled, each able not only to read existing tiles but also to create new ones (hence acting as writers). This was due to various existing mechanisms (e.g., the metastore) trying to acquire exclusive locks on the same objects, preventing multiple instances of GeoWebCache from working in parallel.

In order to tackle these limitations, the metastore has been removed and all the relevant information is now stored on the file system; the file handling has also been modified to eliminate the obvious points that could have led to tile corruption in clustered set-ups (parallel writers, readers trying to get a tile before it is fully written to disk). In addition, the disk quota subsystem has been modified so that it can work not only off the existing Berkeley DB (which cannot be used in a clustered set-up), but also off embedded H2, PostgreSQL and Oracle, where the last two allow GeoWebCache to work in a clustered set-up with disk quota enabled.

In order to use an external database for the disk-quota mechanism, you’ll have to specify the following in the geowebcache-diskquota.xml file:


<gwcQuotaConfiguration>
  <enabled>true</enabled>
  <quotaStore>JDBC</quotaStore> <!-- this is the switch -->
  …
</gwcQuotaConfiguration>

You'll also have to create a new geowebcache-diskquota-jdbc.xml file containing information about the chosen database as well as the data source.

If you use Oracle with a JNDI data source, the file will look as follows:


<gwcJdbcConfiguration>
  <dialect>Oracle</dialect>
  <JNDISource>java:comp/env/jdbc/oralocal</JNDISource>
</gwcJdbcConfiguration>

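For completeness, here is what the matching container-side resource definition might look like. This is only a sketch assuming Tomcat as the servlet container; the resource name must match the JNDISource above, while the driver class, URL, credentials and pool sizes are hypothetical placeholders to adapt to your environment:

  <!-- hypothetical entry in Tomcat's context.xml backing the JNDISource above -->
  <Resource name="jdbc/oralocal"
            auth="Container"
            type="javax.sql.DataSource"
            driverClassName="oracle.jdbc.OracleDriver"
            url="jdbc:oracle:thin:@localhost:1521:XE"
            username="gwc"
            password="secret"
            maxActive="10"
            maxIdle="2"/>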
If you use PostgreSQL with an internally managed connection pool, the file will look as follows instead:


<gwcJdbcConfiguration>
  <dialect>PostgreSQL</dialect>
  <connectionPool>
    <driver>org.postgresql.Driver</driver>
    <url>jdbc:postgresql:gttest</url>
    <username>cite</username>
    <password>cite</password>
    <minConnections>1</minConnections>
    <maxConnections>10</maxConnections>
    <connectionTimeout>1000</connectionTimeout>
    <maxOpenPreparedStatements>50</maxOpenPreparedStatements>
    <validationQuery>select 1</validationQuery>
    <fetchSize>50</fetchSize>
  </connectionPool>
</gwcJdbcConfiguration>

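If you would rather experiment with the embedded H2 store mentioned above, the file follows the same pattern; the sketch below is a guess on our side (the dialect name, driver and database path may need adjusting to your installation), and keep in mind that an embedded database still ties the quota store to a single instance:

  <gwcJdbcConfiguration>
    <dialect>H2</dialect>
    <connectionPool>
      <driver>org.h2.Driver</driver>
      <!-- file-based embedded database, e.g. stored alongside the cache directory -->
      <url>jdbc:h2:/path/to/gwc_cache/diskquota_h2/diskquota</url>
      <username>sa</username>
      <password></password>
      <minConnections>1</minConnections>
      <maxConnections>10</maxConnections>
    </connectionPool>
  </gwcJdbcConfiguration>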
As said before, the metastore has been eliminated, but you'll still have to set up the proper lock provider in geowebcache.xml to use active/active clustering:



<?xml version="1.0" encoding="utf-8"?>
<gwcConfiguration xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns="http://geowebcache.org/schema/1.3.0"
  xsi:schemaLocation="http://geowebcache.org/schema/1.3.0 http://geowebcache.org/schema/1.3.0/geowebcache.xsd">
  <version>1.3.0</version>
  <backendTimeout>120</backendTimeout>
  <lockProvider>nioLock</lockProvider> <!-- here! -->

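The nioLock provider uses file system level locks, so separate GWC instances pointing at the same cache directory can coordinate with each other; if we recall correctly, the default is the in-memory provider, which only protects threads within a single instance and would look like this:

  <lockProvider>memoryLock</lockProvider>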
Finally, here's a trick that is unrelated to the discussion above but worth remembering: in order to have GWC fetch images from an external WMS in a format different from the one we intend to encode (e.g., to encode PNG we usually ask for an uncompressed format and then compress internally, such as requesting TIFF to encode PNG locally), you'll have to specify format modifiers for each of the MIME types you are caching. The following makes sure the image/png tiles are actually requested in image/tiff format (which is quicker to write than PNG):

  <formatModifiers>
    <formatModifier>
      <responseFormat>image/png</responseFormat>
      <requestFormat>image/tiff</requestFormat>
    </formatModifier>
  </formatModifiers>

The nice side effect of the configuration above is that, due to the differences in how the underlying readers and writers are implemented, it speeds things up a lot. In some quick tests, the first 8 zoom levels of topp:states got seeded in 45 seconds without the setting above, and in just 28 seconds with it. The layer in question is very light to paint; the speedup you'll get will also depend on how much time it takes to paint your layer: the longer it takes, the smaller the advantage of using TIFF as the seeding format.
Of course, we ran this test on a LAN, where the fact that the TIFF is uncompressed (hence larger) matters less than the speed of its compression/decompression compared to PNG. If the WMS is remote and/or bandwidth is very limited, the results may vary a lot.
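For reference, a seeding run like the one above can be kicked off through GWC's REST interface by POSTing a seed request to http://localhost:8080/geowebcache/rest/seed/topp:states.xml (host, port, SRS and thread count below are illustrative and should be adapted to your setup):

  <seedRequest>
    <name>topp:states</name>
    <srs><number>4326</number></srs>
    <zoomStart>0</zoomStart>
    <zoomStop>7</zoomStop> <!-- first 8 zoom levels -->
    <format>image/png</format>
    <type>seed</type>
    <threadCount>4</threadCount>
  </seedRequest>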

We’ve also posted here the full config files from which we’ve extracted the above examples. Notice that if you are using JNDI as the provider, the configuration file will still need to be named geowebcache-diskquota-jdbc.xml; the one attached is called geowebcache-diskquota-jndi.xml only to avoid conflicts with the other configuration sample.

This work has already been committed to the GeoWebCache master branch on GitHub and will be part of the upcoming 1.4 release.

If you are willing to act as a beta tester for this new feature, you can download the nightly builds of the 1.4.x series on this website.

There are still plenty of improvements that could be made to GeoWebCache, such as experimenting with database tile storage, or combining existing tile layers into virtual group layers whose tiles are built on the fly. If you are interested in these topics, just let us know.

The GeoSolutions team,