Payloads: Boost Popular Products for Queries

New Support for Payloads in Solr

Lucene has supported payloads for a long time, but good out-of-the-box support in Solr arrived only in version 6.6. A payload can be understood as a per-document map of terms to values (string or numeric). Payloads can store metadata for documents, and they lend themselves well to implementing a per-document weighted attribute set that can be used to filter or score those documents.
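As a mental model only, the "map of terms to values" idea can be sketched in a few lines of Python. This is not how Lucene stores payloads internally, and the document ids and weights are made up for illustration.

```python
# Conceptual model: every document carries a map of term -> numeric payload.
# Ids and weights below are hypothetical and exist only for this sketch.
docs = {
    "doc-a": {"iphone": 20.0, "apple": 15.0},
    "doc-b": {"iphone": 3.0},
    "doc-c": {},  # no payloads at all
}


def weight(doc_id, term, default=0.0):
    """Return the payload stored for `term` on a document, or `default`."""
    return docs[doc_id].get(term, default)


def rank(term):
    """Order document ids by their payload for `term`, highest first."""
    return sorted(docs, key=lambda doc_id: weight(doc_id, term), reverse=True)


print(rank("iphone"))
```

Documents without a payload for the term fall back to the default and sort last, which is the behaviour the rest of this walkthrough relies on.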

In this example, let us create a collection from a tiny portion of the Open Product Data. For this e-commerce-oriented use case, we’ll add clickstream-based payloads to improve relevance for user queries.

Start SolrCloud

First of all, let’s start Solr 8.5 (or any version >= 7.0) in cloud mode.

Docker:

docker run -it -p 8983:8983 solr:8.5.2 /opt/solr/bin/solr -c -f

Without Docker:

wget http://www-us.apache.org/dist/lucene/solr/8.5.2/solr-8.5.2.tgz
tar -xvf solr-8.5.2.tgz
cd solr-8.5.2
bin/solr -c

Note: If you prefer Postman, here is the Postman collection that you can import.

Create a collection called “products”

curl -X POST "http://localhost:8983/api/collections" \
    -H 'content-type: application/json' -d \
'{"create": {"name": "products", "numShards": 1}}'

Index a few documents into the collection

curl -X POST \
    'http://localhost:8983/api/collections/products/update?commit=true' \
    -H 'content-type: application/json' \
    -d '[
    {"id":"5000147030156", "title":"bargains4you | Bargains4You Iphone 4, Iphone 4S, Iphone s, Iphone 5, Iphone Signal & Wifi Boosters X 2"},
    {"id":"0602956", "title":"Marware Membrane iPhone Case - iPhone - Smoke - Polypropylene"},
    {"id":"0602956009542", "title":"Marware Membrane iPhone Case - iPhone - Smoke - Polypropylene"},
    {"id":"087956904", "title":"Fosmon iPhone Case - iPhone - Purple - Silicone, Thermoplastic Polyurethane (TPU)"},
    {"id":"0879569046442", "title":"Belkin BodyGuard Hue iPhone Case - iPhone - Yellow, Light Graphite"},
    {"id":"0879569046459", "title":"Belkin Bodyguard Hue iPhone Case - iPhone - Garnet, Light Graphite"},
    {"id":"0879569046541", "title":"Fosmon iPhone Case - iPhone - Blue - Silicone, Thermoplastic Polyurethane (TPU)"},
    {"id":"0879569046565", "title":"Fosmon iPhone Case - iPhone - Orange - Silicone, Thermoplastic Polyurethane (TPU)"},
    {"id":"0879569046572", "title":"Fosmon iPhone Case - iPhone - Green - Silicone, Thermoplastic Polyurethane (TPU)"},
    {"id":"0879569046589", "title":"Fosmon iPhone Case - iPhone - Purple - Silicone, Thermoplastic Polyurethane (TPU)"},
    {"id":"0885909229352", "title":"Iphone Mb048ll A Smartphone"},
    {"id":"4713507011535", "title":"iMarkCase | Girrafe iPhone 4 / 4S Case - iPhone Designer Phone Case"},
    {"id":"4713507", "title":"iMarkCase | Tiger iPhone 4 / 4S Covers - iPhone 4S Custom Phone Case"},
    {"id":"8801105", "title":"iMarkCase | Spot iPhone 4 / 4S Covers - Design An iPhone Phone Case"},
    {"id":"4713507019135", "title":"iMarkCase | Tiger iPhone 4 / 4S Covers - iPhone 4S Custom Phone Case"},
    {"id":"8801105000535", "title":"iMarkCase | Spot iPhone 4 / 4S Covers - Design An iPhone Phone Case"},
    {"id":"0758302", "title":"Ipad Ipod & Iphone Wall Charger"},
    {"id":"0660543008323", "title":"Iphone 4 Impact Case Blue"},
    {"id":"0758302638604", "title":"Ipad Ipod & Iphone Wall Charger"},
    {"id":"0885909459865", "title":"Iphone 4s 32gb Black White"},
    {"id":"0885909503865", "title":"Iphone 4 32gb In Black"},
    {"id":"8801105000177", "title":"iMarkCase | Girl iPhone 4 / 4S Covers - Design Your Own iPhone 4S Phone Case"},
    {"id":"0098689", "title":"Lucky Hard Case Feather Iphone 4"},
    {"id":"0047532893908", "title":"Alarm Clock For Iphone Ipod Black"},
    {"id":"0098689392233", "title":"Lucky Hard Case Feather Iphone 4"}]'

Query for iPhone

curl "http://localhost:8983/solr/products/select?q=title:iphone"

{
  "response":{"numFound":259,"start":0,"docs":[
  {
    "id":"5000147030156",
    "title":["bargains4you | Bargains4You Iphone 4, Iphone ... Boosters X 2"]},
  {
    "id":"0602956",
    "title":["Marware Membrane iPhone Case - iPhone - Smoke - Polypropylene"]},
  {
    "id":"0602956009542",
    "title":["Marware Membrane iPhone Case - iPhone - Smoke - Polypropylene"]},
  {
    "id":"087956904",
    "title":["Fosmon iPhone Case - iPhone - Purple - ... Polyurethane (TPU)"]},
  {
    ...
}}

As you can see, the results are all over the place in terms of relevance to user expectations. Usually, the intent of such a query is to explore the actual phone referred to as “iphone”, not cases and accessories for the iPhone.

Adding payloads

Assume that we have gathered clickstream statistics through offline processing and know the click count for each query-product pair. Using payloads, we can attach to each product the queries that led to it and the number of clicks it received for each query. Here’s an example (for one of the most popular products):

curl -X POST 'http://localhost:8983/solr/products/update?commit=true' \
-H 'content-type: application/json' \
-d '[{"id":"0885909459865", "queries_dpf": {"set": "iphone|20 apple|15"}}]'

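Here, queries_dpf is a dynamic float payload field (it matches the *_dpf dynamic field rule that ships in Solr’s default configset). Each whitespace-separated entry has the form term|value. This update records that the product “0885909459865” was clicked 20 times for the query “iphone” and 15 times for the query “apple”.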
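In practice the payload strings would be generated from the clickstream rather than typed by hand. The following is a minimal sketch of that step, assuming the click counts arrive as (product id, query, clicks) tuples; the function name and input shape are hypothetical. It only builds the atomic-update documents and does not send them to Solr.

```python
import json
from collections import defaultdict


def build_payload_updates(click_counts, field="queries_dpf"):
    """Turn (product_id, query, clicks) tuples into atomic-update documents.

    Assumes the payload field splits entries on whitespace and uses "|" as
    the delimiter, so queries containing either character are rejected.
    """
    per_product = defaultdict(dict)
    for product_id, query, clicks in click_counts:
        if " " in query or "|" in query:
            raise ValueError(f"query not representable as a single term: {query!r}")
        per_product[product_id][query] = clicks
    return [
        {"id": pid, field: {"set": " ".join(f"{q}|{c}" for q, c in terms.items())}}
        for pid, terms in per_product.items()
    ]


updates = build_payload_updates([
    ("0885909459865", "iphone", 20),
    ("0885909459865", "apple", 15),
])
print(json.dumps(updates))
```

The printed JSON can be used as the request body of the update call shown above. Multi-word queries would need a different encoding (for example, joining the words with an underscore), which this sketch deliberately leaves out.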

Sorting the results by the payload

To let the click counts participate in the ranking of the results, we can sort the results by a payload function: payload(queries_dpf,iphone,0). The first parameter is the payload field name, the second is the term whose payload we want, and the last is the default value used when a document has no payload for that term. In the request below, the user’s query is passed once as the userquery parameter and referenced elsewhere as $userquery.

curl \
'http://localhost:8983/solr/products/select?userquery=iphone&q=%7B!query%20v%3D%24userquery%7D&df=title&fl=id,title,payload(queries_dpf,%24userquery,0)&sort=payload(queries_dpf,%24userquery,0)%20DESC,score%20DESC'

{
"response":{"numFound":259,"start":0,"docs":[
{
  "id":"0885909459865",
  "title":["Iphone 4s 32gb Black White"],
  "payload(queries_dpf,iphone,0)":20.0},
{
  "id":"5000147030156",
  "title":["bargains4you | Bargains4You ... Wifi Boosters X 2"],
  "payload(queries_dpf,iphone,0)":0.0},
{
  "id":"0602956",
  "title":["Marware Membrane iPhone Case ... Polypropylene"],
  "payload(queries_dpf,iphone,0)":0.0},
{ ...
  ...}]
}

As you can see, the most clicked document for the query “iphone” is now shown first, followed by the remaining matches in score order, which is much closer to what the user expects.
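The hand-encoded URL in the request above is easy to get wrong. As a sketch, the same parameters can be assembled with Python’s standard library, which takes care of the percent-encoding. The function name is hypothetical; the base URL and field name are the ones used in this walkthrough.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def payload_sort_url(user_query,
                     base="http://localhost:8983/solr/products/select",
                     field="queries_dpf"):
    """Build a select URL that sorts by the payload for the user's query."""
    # $userquery is dereferenced by Solr, so the term is supplied only once.
    fn = f"payload({field},$userquery,0)"
    params = {
        "userquery": user_query,
        "q": "{!query v=$userquery}",
        "df": "title",
        "fl": f"id,title,{fn}",
        "sort": f"{fn} DESC,score DESC",
    }
    return base + "?" + urlencode(params)


url = payload_sort_url("iphone")
print(url)
```

Passing the printed URL to curl (in single quotes) should be equivalent to the request shown earlier; spaces are encoded as + rather than %20, which is an equally valid encoding inside a query string.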

Instead of sorting, you can also use the payload function in scoring, or filter by payloads. See the Solr Reference Guide for the payload query parsers (payload_score and payload_check) and the payload function.
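One possible way to fold the payload into the score, rather than sorting by it, is a multiplicative boost. The sketch below is untested against a live Solr and rests on two assumptions: that the edismax query parser’s boost parameter is used, and that a default payload of 1 is chosen so that products without click data keep their plain text score. The function name is hypothetical.

```python
from urllib.parse import urlencode


def payload_boost_params(user_query, field="queries_dpf"):
    """Request parameters that multiply the text score by the click payload."""
    return {
        "defType": "edismax",
        "q": user_query,
        "qf": "title",
        "userquery": user_query,
        # Default of 1 (not 0): multiplying by 0 would wipe out the score
        # of every product that has no payload for this query.
        "boost": f"payload({field},$userquery,1)",
        "fl": "id,title,score",
    }


params = payload_boost_params("iphone")
print("http://localhost:8983/solr/products/select?" + urlencode(params))
```

Raw click counts can easily dominate the text score with this approach, so dampening them before indexing (or inside the boost function) is worth considering.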