out of time
Sunspot 1.2 released (finally)
After a firmly ridiculous amount of time in release candidate status (mainly owing to lack of time on my part), Sunspot 1.2 final is out. Here’s the inside scoop.
Upgrading
First, if you’re using Sunspot::Rails, you no longer need to explicitly load the ‘sunspot/rails’ source file (in fact, if you do, things won’t work right). So if you’re using Rails 3 (or bundler with Rails 2), your Gemfile just needs:
    gem 'sunspot_rails'
And if you’re using Rails 2 without Bundler, it’s just:
    config.gem 'sunspot_rails'
The other major change is in spatial search: Sunspot 1.2 has a complete rewrite of spatial search functionality, and both the API and the underlying implementation are quite different. I’ll go into quite a bit of depth a little later in this post, but for now, here’s a quick before-and-after on the API.
Previously, you configured a (the) spatial field like this:
    coordinates :coordinates
In this case, :coordinates is just a method that’s used to return the coordinate information, which could be either a two-element array, or an object that responded to #lat and #lng, or some other variants on those attribute names. Each document got exactly one set of coordinates, so there was no explicit field name associated with the information.
Now, you set it up like this:
    location :coordinates
Seems pretty similar, but there are a couple of crucial differences. First, :coordinates is an actual field name, with a Location type. You can think of it like any other field, and you can pass all the usual options in. You can also specify more than one location field:
    location :hq_coordinates
    location :field_office_coordinates, :multiple => true
Also, Sunspot 1.2 is stricter about the data that’s used to populate location fields: it has to be an object (or array of objects, if it’s a multi-valued field) that responds to #lat and #lng. You can use Sunspot::Util::Coordinates if you’re not working with objects that already fit the bill.
OK, now on to performing geo search. Before, you’d do this:
    near [40.0, -70.0], :distance => 5
Now you’ll do something like:
    with(:hq_coordinates).near 40.0, -70.0, :precision => 8
Don’t worry too much about what that :precision option means right now; we’ll get into that later.
What’s new in Sunspot 1.2
Spatial search with GeoHash
By far the biggest change in Sunspot 1.2 is a complete rewrite of the spatial search component. Instead of relying on solr-spatial-light, a Solr plugin I wrote, to perform spatial search, Sunspot now uses a geohash-based spatial search strategy that is implemented completely in Sunspot itself; no special functionality is needed from Solr. This has some major advantages, but it also has some disadvantages.
The good:
- Performs well at large scale, since under the hood it is executing what amounts to a relatively simple fulltext search in Solr. Contrast with solr-spatial-light, which has severe performance problems at scale.
- Allows multiple location fields in a single document, and also multi-valued location fields.
- Allows searches to incorporate both fulltext relevance and spatial proximity when calculating result score, resulting in a very “natural” default result ordering when the search contains both fulltext and spatial components.
The bad:
- Control over search “radius” is severely constrained: Sunspot provides only ten precision levels, ranging from 389 miles (precision 3) to 8 feet (precision 12).
- Proximity search matches locations that inhabit the same square on a fixed grid on the globe as the search origin; if the origin is near the edge of the square, then nearby documents will be missed, whereas more distant documents will be matched.
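The edge effect in that last point is easy to see with a toy grid. This is only a rough sketch: it uses a flat grid of one-degree squares (a made-up cell size) rather than Sunspot’s actual geohash cells, and the `grid_cell` helper is hypothetical.

```ruby
# Simplified illustration: a flat grid of square cells, NOT Sunspot's geohash cells.
CELL_SIZE = 1.0 # degrees per cell side (hypothetical value)

# Returns the [row, column] of the grid cell that contains the point.
def grid_cell(lat, lng, cell_size = CELL_SIZE)
  [(lat / cell_size).floor, (lng / cell_size).floor]
end

origin = [40.5, -70.01] # just west of the cell boundary at longitude -70
near   = [40.5, -69.99] # 0.02 degrees away, but on the other side of the boundary
far    = [40.5, -70.95] # 0.94 degrees away, but inside the same cell

same_cell_as_near = grid_cell(*origin) == grid_cell(*near)
same_cell_as_far  = grid_cell(*origin) == grid_cell(*far)

puts "near point matches: #{same_cell_as_near}"
puts "far point matches:  #{same_cell_as_far}"
```

The nearby point is missed and the distant one is matched, purely because of where the origin sits inside its square.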
How it works:
When locations are indexed, they are converted into GeoHash values. GeoHash is a clever algorithm that encodes coordinates on the globe into a single string which has the property that, on average, the shared prefix of two geohashes increases as the distance between the locations decreases. Thus, proximity search can be performed very efficiently by simply searching for documents which share a prefix with the origin point, and closer documents can be given higher relevance by boosting matches with longer shared prefixes.
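To make the prefix property concrete, here is a minimal sketch of the standard public GeoHash encoding in plain Ruby. It is an illustration only: Sunspot’s own implementation may differ in detail, and the `geohash` and `shared_prefix_length` helpers are names made up for this example.

```ruby
# Standard GeoHash base-32 alphabet (no a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

# Encodes a latitude/longitude pair as a geohash string of the given length.
def geohash(lat, lng, precision = 12)
  lat_range = [-90.0, 90.0]
  lng_range = [-180.0, 180.0]
  bits = []
  use_lng = true # bits alternate between longitude and latitude, longitude first

  while bits.length < precision * 5
    range, value = use_lng ? [lng_range, lng] : [lat_range, lat]
    mid = (range[0] + range[1]) / 2.0
    if value >= mid
      bits << 1
      range[0] = mid # keep the upper half of the interval
    else
      bits << 0
      range[1] = mid # keep the lower half of the interval
    end
    use_lng = !use_lng
  end

  # Every group of five bits becomes one base-32 character.
  bits.each_slice(5).map { |five| BASE32[five.join.to_i(2)] }.join
end

# Counts how many leading characters two geohashes have in common.
def shared_prefix_length(a, b)
  a.chars.zip(b.chars).take_while { |x, y| x == y }.length
end

new_york    = geohash(40.7128, -74.0060)
nearby      = geohash(40.7130, -74.0062)
los_angeles = geohash(34.0522, -118.2437)

puts shared_prefix_length(new_york, nearby)
puts shared_prefix_length(new_york, los_angeles)
```

The two points in New York share a longer prefix than New York and Los Angeles do, which is what lets a prefix match stand in for a proximity match.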
For complete documentation on using Sunspot’s new GeoHash spatial search, see the API documentation.
Rails 3 Compatibility
Sunspot::Rails 1.2 is fully compatible with Rails 3. There’s not much more to say about that — just include it in your Gemfile, and it’ll work. Sunspot::Rails is still tied to ActiveRecord, though; I’d like to make it more ORM-agnostic in the future, much as Rails 3 is today.
Support legacy Solr schemas with :as
If you’ve got a legacy Solr schema where the field names don’t follow Sunspot’s naming conventions, you can now explicitly tell Sunspot what a field’s name in Solr is using the :as option:
    string :title, :as => :legacy_title
As well as supporting legacy schemas, this option can be useful if you want to set up new field types in your Solr schema.
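As a rough picture of what :as changes, the sketch below contrasts a convention-derived Solr field name with an explicit one. The `indexed_name` helper and the suffix table are assumptions made up for illustration; they are not Sunspot’s internals, and the real naming rules cover more cases (stored and multi-valued fields, for instance).

```ruby
# Hypothetical helper, not part of Sunspot. The suffix table is an assumed subset.
TYPE_SUFFIXES = { string: "s", integer: "i", float: "f", boolean: "b" }.freeze

# Returns the Solr field name: the explicit :as name if given,
# otherwise the field name plus a type suffix.
def indexed_name(name, type, options = {})
  return options[:as].to_s if options[:as]
  "#{name}_#{TYPE_SUFFIXES.fetch(type)}"
end

puts indexed_name(:title, :string)                        # convention-derived name
puts indexed_name(:title, :string, :as => :legacy_title)  # explicit legacy name
```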
Other enhancements
- The SilentFailSessionProxy will swallow errors that occur during write operations, useful if you’ve got an unreliable Solr instance and don’t want to throw application errors when a non-critical Solr write fails.
- You can now include documents by identity, e.g. with(some_instance). Presumably the primary use for this would be inside a disjunction.
- You can call Sunspot.optimize to trigger a Solr optimize from inside your application.
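The silent-fail idea in the first item above is a plain proxy pattern. Here is a minimal sketch of that pattern in standalone Ruby; it is not the source of SilentFailSessionProxy, and the class names, the list of write methods, and the error callback are all assumptions for illustration.

```ruby
# Hypothetical sketch of the pattern, NOT Sunspot's SilentFailSessionProxy.
class SilentWriteProxy
  WRITE_METHODS = [:index, :remove, :commit].freeze # assumed set of write operations

  def initialize(session, on_error = nil)
    @session = session
    @on_error = on_error # optional callable, e.g. for logging
  end

  # Write operations: forward to the session, swallow any StandardError.
  WRITE_METHODS.each do |name|
    define_method(name) do |*args, &block|
      begin
        @session.public_send(name, *args, &block)
      rescue StandardError => e
        @on_error.call(e) if @on_error
        nil
      end
    end
  end

  # Everything else (searches, for example) is forwarded and may still raise.
  def method_missing(name, *args, &block)
    if @session.respond_to?(name)
      @session.public_send(name, *args, &block)
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    @session.respond_to?(name, include_private) || super
  end
end

# Stand-in session whose writes always fail.
class FlakySession
  def index(*)
    raise IOError, "Solr unreachable"
  end

  def search(query)
    "results for #{query}"
  end
end

errors = []
proxy = SilentWriteProxy.new(FlakySession.new, ->(e) { errors << e })
index_result = proxy.index(:some_document) # error is swallowed and recorded
search_result = proxy.search("sunspot")    # reads pass straight through
puts errors.first.message
puts search_result
```

Only writes are silenced here, so a failing search would still surface as an error to the application.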
That’s all, folks!
Well, that’s all for this release, friends. But we’ve got some big, big plans for Sunspot 1.3, due out in January 2011:
- Support for [Field Collapsing](http://wiki.apache.org/solr/FieldCollapsing) (group results by field value)
- Add NGram and EdgeNGram field types for easy prefix/substring search
- Improve edge-case spatial matching by searching proximate n+1-precision geohash boxes
- More stuff that you want!
Thank you!
Each release of Sunspot has been less of an individual effort and more of a community effort than the last, and 1.2 is by far the biggest example of that yet. From now on, it’s my official policy to give committer rights to anyone who submits a good, robust patch; I’d like for ongoing Sunspot development to be entirely community-driven, and we’re already a lot of the way there. Big thanks to everyone who has contributed to the library so far and thanks in advance to those whose patches are still to come.