Reverse Search with Elasticsearch
I am working on a Rails application that allows construction companies to manage a portfolio of projects (AscribeHQ.com). Other users can also create a “Group Portfolio” that displays projects uploaded by other users, selected by potentially complicated criteria. (For example, a ‘City of Chicago Group Portfolio’ might show construction projects that have budgets over $250k and are located within the city limits.)
The problem
A Group Portfolio searches the existing projects to show its owner which projects they can display, which we accomplish with Elasticsearch. The real challenge is that we want a company user who uploads a new project to find all the Group Portfolios that match it. Essentially, we need a “Reverse Search”.
The solution
Elasticsearch has an amazing feature called Percolation, which allows us to save the complicated searches of each Group Portfolio into an index. Then, when a new project is added, we can ‘search the searches’ to return the IDs of the matching Group Portfolios. Even better, there is a Ruby gem called Tire that supports this percolation feature.
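To make the ‘search the searches’ idea concrete, here is a minimal pure-Ruby sketch. It is only a conceptual illustration and not how Elasticsearch or Tire implement percolation: the ToyPercolator class, its method names, and the sample project fields are all made up for this example.

```ruby
# Conceptual sketch only: stored searches are kept as predicates, and a new
# document is checked against every one of them. ToyPercolator is a
# hypothetical class, not part of Elasticsearch or Tire.
class ToyPercolator
  def initialize
    @queries = {} # query id => lambda that takes a document hash
  end

  # Store a search under an id, much like registering a percolator query.
  def register(id, &predicate)
    @queries[id] = predicate
  end

  # "Search the searches": return the ids of every stored query that the
  # given document satisfies.
  def percolate(document)
    @queries.select { |_id, predicate| predicate.call(document) }.keys
  end
end

percolator = ToyPercolator.new
percolator.register(1) { |doc| doc[:city] == 'chicago' && doc[:budget] > 250_000 }
percolator.register(2) { |doc| doc[:state] == 'mi' }

# A newly uploaded project (sample data) is matched against the stored searches.
new_project = { city: 'chicago', state: 'il', budget: 400_000 }
puts percolator.percolate(new_project).inspect # prints [1]
```

The point of the sketch is the direction of the lookup: the queries are the stored data, and the document is the thing being submitted.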
I won’t go into the setup of Elasticsearch itself, as there are many great blog posts on that already, but this is how we set up Percolation using the Tire gem:
UserProject model:
class UserProject < ActiveRecord::Base
  include Tire::Model::Search
  include Tire::Model::Callbacks

  # Magic method that returns all the searches that match the new project
  def find_matching_groups
    Portfolio::Group.find(UserProject.index.percolate(self))
  end

  # The normal mapping that is required to set up ElasticSearch
  mapping do
    indexes :title, :as => 'title', :boost => 2
    indexes :custom_title, :as => 'custom_title', :boost => 2
    indexes :owner, :as => 'owner.company_name', :boost => 2
    indexes :subtitle, :as => 'subtitle'
    indexes :street, :as => 'street'
    indexes :city, :as => 'city'
    indexes :state, :as => 'state'
    indexes :zip, :as => 'zip'
  end
end
Group Portfolio model:
class Portfolio::Group < Portfolio
  include Tire::Model::Search
  include Tire::Model::Callbacks

  after_save do |group_portfolio|
    if group_portfolio.project_criteria.present?
      group_portfolio.save_query
    end
  end

  # Adds and updates query in ElasticSearch database
  def save_query
    UserProject.index.register_percolator_query(self.id) do |q|
      params = {}
      # project_criteria is saved on the Group Portfolio object. ex:
      # [{"filter_type": "state", "states": ["MI"]},
      #  {"filter_type": "proj_type", "types": ["28", "29"]}]
      self.project_criteria.from_json.each do |criteria|
        params = params.merge(criteria)
      end
      q.filtered do
        query do
          boolean do
            must { terms :phase_id, params['phases'] } if params['phases']
            must { terms :project_type_id, params['types'] } if params['types']
            must { terms :green_id, params['greens'] } if params['greens']
            must { terms :delivery_method_id, params['delivery_methods'] } if params['delivery_methods']
            must { terms :project_definition_id, params['project_definitions'] } if params['project_definitions']
            must { terms :state, params['states'].map(&:downcase) } if params['states']
            must { terms :city, params['cities'] } if params['cities']
            must { terms :zip, params['zips'] } if params['zips']
          end
        end
      end
    end
  end
end
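To see what save_query boils down to, the sketch below rebuilds the same logic in plain Ruby without Tire: it merges the criteria hashes into one params hash and emits one terms clause per key that is present. The CRITERIA_FIELDS constant and the percolator_query_for method are hypothetical names introduced for this illustration, and the resulting hash only approximates the bool portion of the query; the exact JSON Tire sends to Elasticsearch may differ.

```ruby
require 'json'

# Hypothetical lookup table mirroring the must/terms lines in save_query:
# criteria key => field on the indexed project.
CRITERIA_FIELDS = {
  'phases' => :phase_id,
  'types' => :project_type_id,
  'greens' => :green_id,
  'delivery_methods' => :delivery_method_id,
  'project_definitions' => :project_definition_id,
  'states' => :state,
  'cities' => :city,
  'zips' => :zip
}.freeze

# Hypothetical helper: merge the list of criteria hashes into one params hash,
# then build one terms clause for each criteria key that is present.
def percolator_query_for(criteria_json)
  params = JSON.parse(criteria_json).reduce({}) { |merged, criteria| merged.merge(criteria) }

  clauses = CRITERIA_FIELDS.each_with_object([]) do |(key, field), must|
    next unless params[key]
    # States are lowercased, as in save_query.
    values = key == 'states' ? params[key].map(&:downcase) : params[key]
    must << { terms: { field => values } }
  end

  { bool: { must: clauses } }
end

criteria = '[{"filter_type": "state", "states": ["MI"]}, ' \
           '{"filter_type": "proj_type", "types": ["28", "29"]}]'
puts JSON.pretty_generate(percolator_query_for(criteria))
```

Note that merging the hashes means a repeated key such as "filter_type" is overwritten by the later entry, which is harmless here because only the value lists are read.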
Action returning Group Portfolios:
class Manage::PublishingsController < ApplicationController
  def index
    @project = Project.find(params[:project_id])
    @user_project = UserProject.where(:portfolio_id => current_portfolio.id, :project_id => @project.id).first
    @group_portfolios = Portfolio::Group.where(:id => @user_project.find_matching_groups)
  end
end
Results
We were previously trying to accomplish this same type of result with Delayed Job and some very complicated code. It often took around 5 minutes to do this ‘reverse search’. Now the user sees the results in about half a second, and the code is simpler. This is a big win for us and will help us offer better service to our customers.
Applications
There seem to be endless applications for this. On dating sites, a new user can be told how many people (and even who) searched for someone like them before they signed up. Auto dealerships could easily see whether a car they could buy matches any recent searches for vehicles on their site. An advertising site could estimate how many views an ad will get based on previous queries. All of these things could be done in other ways, but the code would likely be very complicated and slow.
references: Percolation