In the last post, I discussed aggregating data needed for search in a custom Lucene index. In this post, I’ll review how I implemented the query logic.

Date Conversion Exception

I started querying the Lucene index and immediately hit exceptions. Sitecore and Lucene were not happy with how my datetime data was stored in the index, so I added a custom IndexFieldDateTimeValueConverter that falls back to parsing the raw yyyyMMdd value when the base converter throws.


public class IndexFieldDateTimeValueConverter : Sitecore.ContentSearch.Converters.IndexFieldDateTimeValueConverter
{
    public override object ConvertFrom(ITypeDescriptorContext context, System.Globalization.CultureInfo culture, object value)
    {
        try
        {
             return base.ConvertFrom(context, culture, value);
        }
        catch (Exception)
        {
            // Fall back to parsing the raw yyyyMMdd value
            string fieldValue = value as string;

            DateTime dReturn;

            if (DateTime.TryParseExact(fieldValue, "yyyyMMdd", culture, DateTimeStyles.None, out dReturn))
                return dReturn;
            else
                throw; // rethrow the original exception with its stack trace intact
        }
    }
}
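The fallback branch is easier to reason about in isolation. Here is a minimal sketch of just that parsing step, outside of Sitecore. ParseIndexDate is a hypothetical helper name, and I am assuming the problem values are plain yyyyMMdd strings, as in the converter above.

```csharp
using System;
using System.Globalization;

public static class DateFallbackSketch
{
    // Hypothetical helper: returns null instead of throwing when the value
    // is not a plain yyyyMMdd string (the format assumed by the converter).
    public static DateTime? ParseIndexDate(string fieldValue)
    {
        DateTime parsed;
        if (DateTime.TryParseExact(fieldValue, "yyyyMMdd", CultureInfo.InvariantCulture,
                                   DateTimeStyles.None, out parsed))
            return parsed;

        return null;
    }

    public static void Main()
    {
        // A value in the assumed format parses; anything else is rejected.
        Console.WriteLine(ParseIndexDate("20150314").HasValue);
        Console.WriteLine(ParseIndexDate("2015-03-14").HasValue);
    }
}
```

TryParseExact only accepts the exact pattern it is given, so anything that is not eight digits in year, month, day order falls through to the rethrow in the real converter.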

I created an include file to register the class. Sitecore applies include files in alphabetical order, so I had to prefix the file name with z (zCustomIndexValueConverters.config) to make it load after the Sitecore Content Search Lucene include files.


<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
    <sitecore>
        <contentSearch>
            <indexConfigurations>
                <defaultLuceneIndexConfiguration>
                    <!-- DateTimeConverter -->
                    <indexFieldStorageValueFormatter type="Sitecore.ContentSearch.LuceneProvider.Converters.LuceneIndexFieldStorageValueFormatter, Sitecore.ContentSearch.LuceneProvider">
                        <converters hint="raw:AddConverter">
                            <converter handlesType="System.DateTime">
                                <patch:attribute name="typeConverter">Someproject.ContentSearch.Converters.IndexFieldDateTimeValueConverter, Someproject</patch:attribute>
                            </converter>
                        </converters>
                    </indexFieldStorageValueFormatter>
                </defaultLuceneIndexConfiguration>
            </indexConfigurations>
        </contentSearch>
    </sitecore>
</configuration>

I then applied the converter to my date properties with the TypeConverter attribute.

[TypeConverter(typeof(IndexFieldDateTimeValueConverter))]
public virtual DateTime MetadataDate { get; set; }

Am I proud of myself? Nope. Did this work? Yep.

POCO and SearchResultItem

I needed classes to store the search results and facet data. I created four classes.

Facet Classes

The facet classes are fairly straightforward. The FacetValue class and the SearchFacet class are POCO (Plain Old CLR Object) classes that store the facet data and return it to the presentation layer.

[Serializable]
[DataContract(Name = "FacetValue")]
public class FacetValue
{
    [DataMember(Name = "Value")]
    public string Value { get; set; }

    [DataMember(Name = "FacetCount")]
    public int FacetCount { get; set; }
}

[Serializable]
[DataContract(Name = "SearchFacet")]
public class SearchFacet
{
    private List<FacetValue> _values;

    public SearchFacet()
    {
        _values = new List<FacetValue>();
    }

    [DataMember(Name = "FacetName")]
    public string FacetName { get; set; }

    [DataMember(Name = "Values")]
    public List<FacetValue> Values
    {
        get { return _values; }
        set { _values = value; }
    }
}
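To show what the DataContract attributes buy us, here is a minimal sketch of serializing a facet to JSON. I am assuming DataContractJsonSerializer as the serializer. The FacetValueSketch and SearchFacetSketch classes are trimmed stand-ins for the classes above, and the facet name and count are made-up sample values.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

// Trimmed stand-ins for the facet POCOs (names are illustrative only).
[DataContract(Name = "FacetValue")]
public class FacetValueSketch
{
    [DataMember(Name = "Value")]
    public string Value { get; set; }

    [DataMember(Name = "FacetCount")]
    public int FacetCount { get; set; }
}

[DataContract(Name = "SearchFacet")]
public class SearchFacetSketch
{
    public SearchFacetSketch() { Values = new List<FacetValueSketch>(); }

    [DataMember(Name = "FacetName")]
    public string FacetName { get; set; }

    [DataMember(Name = "Values")]
    public List<FacetValueSketch> Values { get; set; }
}

public static class FacetJsonSketch
{
    // Serializes one facet; only members marked [DataMember] are written.
    public static string ToJson(SearchFacetSketch facet)
    {
        var serializer = new DataContractJsonSerializer(typeof(SearchFacetSketch));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, facet);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static void Main()
    {
        var facet = new SearchFacetSketch { FacetName = "CategoryFacet" };
        facet.Values.Add(new FacetValueSketch { Value = "News", FacetCount = 3 }); // sample data
        Console.WriteLine(ToJson(facet));
    }
}
```

The Name values on the DataMember attributes become the JSON property names, which is why I set them explicitly on every property.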

Search Results & Search Entity Classes

The search entity class stores all of the search result data we want to return to the presentation layer, as well as the properties used in the Where and Filter clauses. I inherited from the Sitecore SearchResultItem class and then hid the data I did not want to return to the presentation layer, both for security and to avoid bloating the JSON. The search results class is the container for everything.

[Serializable]
[DataContract(Name = "SiteSearchEntity")]
public class SiteSearchEntity : SearchResultItem 
{
    [TypeConverter(typeof(IndexFieldIDValueConverter))]
    [IndexField("_id")]
    public Guid Id { get; set; }

    [DataMember(Name = "ComputedUrl")]
    [IndexField("LinkProviderUrl")]
    public virtual string ComputedUrl { get; set; }

    [DataMember(Name = "ComputedMetaTitle")]
    [IndexField("Title")]
    public virtual string ComputedMetaTitle { get; set; }

    [DataMember(Name = "ComputedMetaDescription")]
    [IndexField("Description")]
    public virtual string ComputedMetaDescription { get; set; }

    [IgnoreDataMember]
    [IndexField("Keywords")]
    public virtual string ComputedKeywords { get; set; }

    [DataMember(Name = "ComputedDocumentDate")]
    [IndexField("DocumentDate")]
    public virtual DateTime ComputedDocumentDate { get; set; }

    [DataMember(Name = "ComputedCategory")]
    [IndexField("ComputedCategory")]
    public virtual List<string> ComputedCategory { get; set; }

    [DataMember(Name = "ComputedImageUrl")]
    [IndexField("ImageURL")]
    public virtual string ComputedImageUrl { get; set; }

    [DataMember(Name = "ComputedSearchUrl")]
    [IndexField("SearchURL")]
    public virtual string ComputedSearchUrl { get; set; }

    #region Hide Some Data Members
    [IgnoreDataMember]
    public new string Version { get; set; }

    [IgnoreDataMember]
    [IndexField("_group")]
    [TypeConverter(typeof(IndexFieldIDValueConverter))]
    public new ID ItemId { get; set; }

    [IgnoreDataMember]
    [IndexField("_uniqueid")]
    [TypeConverter(typeof(IndexFieldItemUriValueConverter))]
    [XmlIgnore]
    public new ItemUri Uri { get; set; }

    [IgnoreDataMember]
    [IndexField("_templatename")]
    public new string TemplateName { get; set; }

    [IgnoreDataMember]
    [IndexField("_template")]
    [TypeConverter(typeof(IndexFieldIDValueConverter))]
    public new ID TemplateId { get; set; }

    [IgnoreDataMember]
    [IndexField("__semantics")]
    [TypeConverter(typeof(IndexFieldEnumerableConverter))]
    public new IEnumerable<ID> Semantics { get; set; }

    [IgnoreDataMember]
    [IndexField("_fullpath")]
    public new string Path { get; set; }

    [IgnoreDataMember]
    [IndexField("_path")]
    [TypeConverter(typeof(IndexFieldEnumerableConverter))]
    public new IEnumerable<ID> Paths { get; set; }

    [IgnoreDataMember]
    [IndexField("_name")]
    public new string Name { get; set; }

    [IgnoreDataMember]
    [IndexField("_language")]
    public new string Language { get; set; }

    [IgnoreDataMember]
    [IndexField("__smallcreateddate")]
    public new DateTime CreatedDate { get; set; }

    [IgnoreDataMember]
    [IndexField("_content")]
    public new string Content { get; set; }

    [IgnoreDataMember]
    [IndexField("parsedcreatedby")]
    public new string CreatedBy { get; set; }

    [IgnoreDataMember]
    [IndexField("__smallupdateddate")]
    public new DateTime Updated { get; set; }

    [IgnoreDataMember]
    [IndexField("parsedupdatedby")]
    public new string UpdatedBy { get; set; }

    [IgnoreDataMember]
    [IndexField("_datasource")]
    public new string Datasource { get; set; }

    [IgnoreDataMember]
    [IndexField("_database")]
    public new string DatabaseName { get; set; }

    [IgnoreDataMember]
    [IndexField("_parent")]
    public new ID Parent { get; set; }

    [IgnoreDataMember]
    [IndexField("urllink")]
    public new string Url { get; set; }

    #endregion

    #region Work Around for Facets with Spaces
    [IndexField("CategoryFacet")]
    public virtual List<string> CategoryFacet { get; set; }
    #endregion
}

[DataContract(Name = "SearchResults")]
[Serializable]
public class SerializableSearchResults
{
    List<SiteSearchEntity> _entities = new List<SiteSearchEntity>();
    List<SearchFacet> _facets = new List<SearchFacet>();

    [DataMember(Name = "TotalCount")]
    public int TotalCount { get; set; }

    [DataMember(Name = "SearchTerm")]
    public string SearchTerm { get; set; }

    [DataMember(Name = "entities")]
    public List<SiteSearchEntity> entities
    {
        get { return _entities; }
        set { _entities = value; }
    }

    [DataMember(Name = "facets")]
    public List<SearchFacet> facets
    {
        get { return _facets; }
        set { _facets = value; }
    }
}

Search Logic & the Predicate Builder

Now for the fun part. How do we build a search algorithm that returns accurate results? I modeled my work after Matt Burke’s blog post. I split the search term on the space character into an array of strings. I then built the filter and term predicates separately and joined them together. The Sitecore PredicateBuilder makes working with Lucene fairly easy. The search algorithm below is a simplified version of what I implemented.

public static SerializableSearchResults GetSearchResultsLucene(string searchTerm, int page, Dictionary<string, string> facets)
{
    SerializableSearchResults oReturn = new SerializableSearchResults();
    string sDatabase = "web";
    ISitecoreService service;

    ISearchIndex searchIndex = ContentSearchManager.GetIndex("sitesearch_web");

    SearchResults<SiteSearchEntity> results = null;
    IQueryable<SiteSearchEntity> query = null;

    service = new SitecoreService(sDatabase);

    using (IProviderSearchContext searchContext = searchIndex.CreateSearchContext())
    {
        // Parse search term into a collection of strings
        string[] terms = searchTerm.ToLower().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Filters start from True so each facet clause can be ANDed on
        Expression<Func<SiteSearchEntity, bool>> filterPredicate = PredicateBuilder.True<SiteSearchEntity>();

        // Get facets
        string category = GetFacetValue("category", facets);

        // Build facets clauses
        if (!String.IsNullOrEmpty(category))
            filterPredicate = filterPredicate.And(se => se.CategoryFacet.Contains(category));

        // Terms start from False so each term clause can be ORed on
        Expression<Func<SiteSearchEntity, bool>> termPredicate = PredicateBuilder.False<SiteSearchEntity>();

        foreach (string term in terms)
        {
            // Boost() sits inside each lambda so it becomes part of the query expression
            termPredicate = termPredicate
                .Or(p => p.ComputedMetaTitle.Contains(searchTerm).Boost(5.0f))
                .Or(p => p.ComputedMetaTitle.Like(term, 0.75f).Boost(2.0f))
                .Or(p => p.ComputedMetaDescription.Contains(searchTerm).Boost(3.0f))
                .Or(p => p.ComputedMetaDescription.Like(term, 0.75f).Boost(2.0f))
                .Or(p => p.ComputedKeywords.Contains(searchTerm).Boost(2.5f))
                .Or(p => p.ComputedKeywords.Like(term, 0.75f).Boost(1.5f));
        }

        Expression<Func<SiteSearchEntity, bool>> fullPredicate = filterPredicate.And(termPredicate);

        query = searchContext.GetQueryable<SiteSearchEntity>().Where(fullPredicate);

        FacetResults searchFacets = searchContext.GetQueryable<SiteSearchEntity>().Filter(fullPredicate).FacetOn(x => x.CategoryFacet).GetFacets();

        query = query.Page(page - 1, 20);
        results = query.GetResults();
        oReturn.entities = results.Hits.Select(hit => hit.Document).ToList();

        foreach(SiteSearchEntity s in oReturn.entities)
        {
            service.Map(s);
        }

        oReturn.SearchTerm = searchTerm.ToLower();
        oReturn.TotalCount = results.TotalSearchResults;

        oReturn.facets = GetFacetResults(searchFacets);
    }

    return oReturn;
}

private static string GetFacetValue(string FacetName, Dictionary<string, string> facets)
{
    string sReturn = String.Empty;

    if (facets.ContainsKey(FacetName))
        sReturn = facets[FacetName];

    return sReturn;
}
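The two predicate seeds are the part that trips people up, so here is a minimal sketch of the same composition pattern using plain delegates instead of Sitecore expression trees. This is not the Sitecore PredicateBuilder. The Doc class and the And/Or helpers are hypothetical stand-ins that only illustrate why the filter starts from true and the term predicate starts from false.

```csharp
using System;

// Hypothetical stand-in for an indexed document.
public class Doc
{
    public string Title { get; set; }
    public string Category { get; set; }
}

public static class PredicateSeedSketch
{
    // Illustrative combinators over plain delegates.
    public static Func<T, bool> And<T>(this Func<T, bool> left, Func<T, bool> right)
    {
        return x => left(x) && right(x);
    }

    public static Func<T, bool> Or<T>(this Func<T, bool> left, Func<T, bool> right)
    {
        return x => left(x) || right(x);
    }

    public static Func<Doc, bool> Build(string searchTerm, string category)
    {
        string[] terms = searchTerm.ToLower().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // "True" seed: with no facet selected, the filter lets everything through.
        Func<Doc, bool> filter = d => true;
        if (!String.IsNullOrEmpty(category))
            filter = filter.And(d => d.Category == category);

        // "False" seed: a document must match at least one term to be included.
        Func<Doc, bool> anyTerm = d => false;
        foreach (string term in terms)
            anyTerm = anyTerm.Or(d => d.Title.ToLower().Contains(term));

        return filter.And(anyTerm);
    }

    public static void Main()
    {
        Func<Doc, bool> predicate = Build("lucene search", "News");
        Console.WriteLine(predicate(new Doc { Title = "Lucene tips", Category = "News" }));
        Console.WriteLine(predicate(new Doc { Title = "Lucene tips", Category = "Events" }));
    }
}
```

One consequence of the false seed is that an empty search term matches nothing, which is the behavior I wanted for the site search.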

The method below loads the facet data into the POCO facet objects.

private static List<SearchFacet> GetFacetResults(FacetResults results)
{
    List<SearchFacet> f = new List<SearchFacet>();
    foreach (FacetCategory fc in results.Categories)
    {
        SearchFacet sf = new SearchFacet();
        sf.FacetName = fc.Name;
        foreach(Sitecore.ContentSearch.Linq.FacetValue fv in fc.Values)
        {
            sf.Values.Add(new Someproject.Models.Search.FacetValue() { FacetCount = fv.AggregateCount, Value = fv.Name });
        }
        f.Add(sf);
    }
    return f;
}

Paginated search results and the faceted breakdown of the search results are neatly packaged and are ready to be serialized into JSON for the presentation layer.
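The paging arithmetic is worth a quick sketch. The search method passes page - 1 with a page size of 20, which suggests the page number coming from the presentation layer is 1-based while the index handed to the query is zero-based. The sketch below reproduces that arithmetic with LINQ to Objects as a stand-in for the search query. GetPage and PageCount are hypothetical helpers, not part of the Sitecore API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
    // Hypothetical helper: pageNumber is 1-based, as passed in by the caller.
    public static List<T> GetPage<T>(IEnumerable<T> source, int pageNumber, int pageSize)
    {
        int pageIndex = Math.Max(pageNumber - 1, 0); // zero-based, clamped at the first page
        return source.Skip(pageIndex * pageSize).Take(pageSize).ToList();
    }

    // Number of pages needed to show totalCount results (ceiling division).
    public static int PageCount(int totalCount, int pageSize)
    {
        return (totalCount + pageSize - 1) / pageSize;
    }

    public static void Main()
    {
        // 45 fake results: the third page of 20 holds the last five.
        List<int> lastPage = GetPage(Enumerable.Range(1, 45), 3, 20);
        Console.WriteLine(lastPage.Count);
        Console.WriteLine(PageCount(45, 20));
    }
}
```

Returning TotalCount alongside the page of entities is what lets the presentation layer work out the page count for itself.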

Conclusion

The Sitecore PredicateBuilder and Content Search LINQ interface make building a site search solution very manageable. Computed fields let you store anything you need in the Lucene index.
