Sitecore MVP Applications Open!

The yearly Sitecore MVP application is now open, and I’m ever so excited to apply once again.

Recently I’ve had a number of people ask me about the whole process. Over the years there have been times when the application was open to everyone, but for the last few years it has been invitation only. That brings up the question – how do you get invited?

A lot of times prospects will reach out to current MVPs for invitations and recommendations, but this process is unpredictable, and subjective at best.

First things first: you need to have contributed to the community over the course of the year, and you need to gather up all of those contributions for the application. There is a comprehensive Stack Exchange answer on this: https://sitecore.stackexchange.com/questions/14790/what-information-is-needed-for-the-sitecore-mvp-application

The contribution factors have not changed much over the years, but one thing new this year is the community introduction forum. This is an excellent way to introduce yourself to Sitecore and the MVP community in general. This puts everything into a much more streamlined method, and everybody that is interested in becoming an MVP can get exposure, and Sitecore and the community can see what the prospect has contributed.

So, if you have been doing things over the year to contribute your time and knowledge to the Sitecore community, and you would like to apply for the 2021 MVP, go ahead to the community forum and introduce yourself!

Saving Sitecore 9 Forms in Multiple Locations (other than the default)

If you’ve worked with Sitecore 9 Forms recently, you’ve probably come to know that all forms are saved under a default location, which is ‘/Sitecore/Forms’. This is fine if you have a single site on your instance, but when you get into a multi-site implementation, this gets tricky because you want to separate the folders for the forms for each site – for multiple reasons, a major concern being security, and information architecture in general.

There have been a number of issues reported and resolved by fellow MVPs Jason and Toby. Jason found an issue where the forms don’t show up because they didn’t get added to the index. The second issue, which Toby found, is more related to this post, as it sheds some light on how Sitecore knows where the forms should get saved. It turns out that the item ID of the default forms folder that comes out of the Sitecore installation is stored in an item in the Core database – I believe this value is used to configure the location to index.

Now that we know where the search index looks to index the forms, I got the idea that it reads this value at some point in the SPEAK application that the Forms Designer uses. At this point I figured I could override where it reads this, but I still didn’t know where. I’m not very fluent in SPEAK (yet), so my next approach was to try to investigate what code saves the forms.

After some Chrome debugging and dotPeek, I found the web API that gets called:

/sitecore/api/ssc/forms/formdesign/formdesign/save?sc_formmode=new&sc_formlang=en-GB

…and the resulting pipeline that saves the forms:

<forms.saveForm>
  <processor type="Sitecore.ExperienceForms.Client.Pipelines.SaveForm.CreateModels, Sitecore.ExperienceForms.Client" resolve="true" />
  <processor type="Sitecore.ExperienceForms.Client.Pipelines.SaveForm.GenerateNames, Sitecore.ExperienceForms.Client">
    <defaultItemName>Form Item</defaultItemName>
  </processor>
  <processor type="Sitecore.ExperienceForms.Client.Pipelines.SaveForm.UpdateItems, Sitecore.ExperienceForms.Client" resolve="true" />
</forms.saveForm>

The pipeline receives JSON post data from the call, which looks something like this:

You can see here that the JSON data being passed has a bunch of models – they aren’t in any sort of hierarchy, except for sortOrder, but that’s really just to make sure the form elements are in the correct order. Each model has a template ID (the type of item it is) and a parent ID (the location where it should be saved). You can also see that the parentId field of the first model has the magic value – the item ID of the forms folder (‘/Sitecore/Forms’), which is also in the SearchConfig item in the Core database. Now that we know that, all we need to do is override that value. You can do this in the SPEAK application if you are more adept in SPEAK, but I decided to do it in the pipeline.

First things first – you can either use the default forms folder and make sub-folders for each site, or you can make a brand new folder and make sub-folders for each site under the new folder. If you do the latter, you must update the SearchConfig item (as noted in Toby’s post) with the item ID of the folder you created.

Next, you can make a sub-folder for each site under the root folder. You probably want to put this in a config setting somewhere, different for each site.
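As a rough illustration of the per-site config setting idea, here is a minimal sketch of a lookup helper. The setting names ("FormsLocationRoot.<siteName>" with a fallback to "FormsLocationRoot") are a hypothetical convention I made up for this example, and how you work out which site a form belongs to is left to your own logic.

```csharp
using System;

public static class FormsLocationResolver
{
    // Tries the per-site setting "FormsLocationRoot.<siteName>" first and
    // falls back to the shared "FormsLocationRoot" setting.
    // Both setting names are hypothetical conventions for this sketch.
    public static string Resolve(string siteName, Func<string, string> getSetting)
    {
        if (getSetting == null) throw new ArgumentNullException(nameof(getSetting));

        string perSite = getSetting("FormsLocationRoot." + siteName);
        return string.IsNullOrEmpty(perSite)
            ? getSetting("FormsLocationRoot")
            : perSite;
    }
}
```

The lookup is passed in as a delegate so the helper can be tried out without a running Sitecore instance; inside Sitecore you would presumably pass Sitecore.Configuration.Settings.GetSetting as the getSetting argument.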

The rest is easy – write a class to set the parentId of the model that contains the information for the main form element, to the item ID of your forms folder:


using System;
using System.Linq;
using Sitecore.Diagnostics;
using Sitecore.ExperienceForms.Client.Models.Builder;
using Sitecore.ExperienceForms.Client.Pipelines.SaveForm;
using Sitecore.Mvc.Pipelines;

namespace Custom.Pipelines
{
    public class UpdateParentLocation : MvcPipelineProcessor<SaveFormEventArgs>
    {
        //Template ID of the form item itself (not of the fields inside it)
        private const string FormTemplateId = "{6ABEE1F2-4AB4-47F0-AD8B-BDB36F37F64C}";

        public override void Process(SaveFormEventArgs args)
        {
            Assert.ArgumentNotNull(args, nameof(args));
            if (args.ViewModelWrappers == null)
            {
                return;
            }

            //Only look for the model that has the form item (searching by template ID, ignoring case)
            ViewModelWrapper vm = (ViewModelWrapper)args.ViewModelWrappers.FirstOrDefault(v =>
                string.Equals(v.Model.TemplateId, FormTemplateId, StringComparison.OrdinalIgnoreCase));

            if (vm == null)
            {
                return;
            }

            //Use whatever logic is necessary to set where the form should be saved
            vm.ParentId = Sitecore.Configuration.Settings.GetSetting("FormsLocationRoot");
        }
    }
}

Put whatever logic you need in the above set statement to determine the location where the forms should be saved. It could be based on logic that results in creating more folders, etc., but that’s up to you.

Note: There are multiple models being passed, starting with the main form, then all the form elements. Because each form element is a child of its parent, you must be careful not to set the parentId of ALL the models, which is why we are looking only for the model that has the template ID of the form.
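To make the rule in the note above concrete, here is a minimal, self-contained sketch that runs without Sitecore. FormModel is a hypothetical stand-in for the posted models (it is not the real ViewModelWrapper), and the IDs in the usage example are made up; only the form template ID comes from the processor above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one posted model; not a Sitecore type.
public class FormModel
{
    public string TemplateId { get; set; }
    public string ParentId { get; set; }
}

public static class FormReparenter
{
    // Template ID of the form item, as used in the processor above.
    public const string FormTemplateId = "{6ABEE1F2-4AB4-47F0-AD8B-BDB36F37F64C}";

    // Re-parents only the form model and leaves every other model untouched.
    // Returns false when no form model was posted.
    public static bool Reparent(IEnumerable<FormModel> models, string newParentId)
    {
        FormModel form = models.FirstOrDefault(m =>
            string.Equals(m.TemplateId, FormTemplateId, StringComparison.OrdinalIgnoreCase));

        if (form == null)
        {
            return false;
        }

        form.ParentId = newParentId;
        return true;
    }
}
```

If the list holds one form model and several field models whose ParentId points at the form, only the form model's ParentId changes after calling FormReparenter.Reparent(models, "{NEW-FOLDER-ID}"), so the fields stay attached to their form.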

Make a patch config file to insert this step right before the items are saved/updated, and you should be all done.

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/">
  <sitecore role:require="ContentManagement">
    <pipelines>
      <forms.saveForm>
        <processor patch:before="processor[@type='Sitecore.ExperienceForms.Client.Pipelines.SaveForm.UpdateItems, Sitecore.ExperienceForms.Client']" type="Custom.Pipelines.UpdateParentLocation, Custom.Pipelines" />
      </forms.saveForm>
    </pipelines>
  </sitecore>
</configuration>

Addendum: I *think* it would also be possible to save the forms under multiple root folders, but then the indexes would need to be reconfigured to look in multiple locations – since there is only one SearchConfig item, it would need to be customized to handle a list of item IDs, or something of that sort.

I hope future updates to Sitecore 9 Forms will take multi-site implementations into account, but until then, this pipeline should help!

#Protip: SOLR throws java.lang.OutOfMemoryError when re-indexing Sitecore Indexes

As you may know, Sitecore 9 has its default search engine configured with SOLR, instead of Lucene. This requires you to install SOLR in order for Sitecore 9 to work. There are some really informative, thorough posts on how to accomplish this, so I won’t go into that.

What I want to focus on is when you set up SOLR to run as a Windows service. The default command for running SOLR:

solr.cmd -f -p 8983

…uses the default memory allocation for the SOLR heap, which is 512MB, because we don’t actually set a parameter for the heap size. Depending on the content you have in your Sitecore instance, you may get java.lang.OutOfMemoryError errors when re-indexing. To fix this, you need to explicitly specify the heap size, which you set as such:

solr.cmd -f -p 8983 -m 1024m

This sets it to 1GB. You can adjust as needed – how much you need will depend on your situation – this article has some pointers on how to decide the size. This should allow you to re-index your large Sitecore indexes.

Securing your SOLR Instance for Sitecore

If you didn’t know already, SOLR is the default search engine that comes with Sitecore 9, and presumably future versions. The choice to use SOLR instead of Lucene is an obvious one – SOLR is scalable, while Lucene is not. SOLR is more fault-tolerant and can be load-balanced for failover, resulting in a more reliable environment. This also means that Sitecore developers are going to need to know more about managing SOLR. Fortunately, there is plenty of information out there to learn about SOLR, so we aren’t totally in the dark.

Security being an important factor in most infrastructures, securing the SOLR instance when it is used with Sitecore is an important topic. The default installation for SOLR is open for anonymous visits, so I’m going to outline some steps to make it harder to get into the SOLR instance.

Some simple steps to take:

  1. As mentioned, by default, the SOLR instance is open. The first thing you can do is lock it down by IP, so only your Sitecore instances can see it (CM, CD, xConnect, etc.). This is simple enough and can be done without much effort.
  2. Make sure the SOLR instance is internal (i.e. behind a firewall). SOLR instances for Sitecore do not need to be accessed by public visitors, so there is no need for them to be exposed outside your internal network.
  3. Add SSL to your SOLR instance. There are various ways to do this, and a lot of the documentation refers to using a self-signed cert. If you are running SOLR on Apache, you’ll need to generate the Java keystore with your real SSL certificate (make sure you have the .pfx file, which has both the public and private keys). If you are running SOLR on Windows, you can use the .pfx file directly.
  4. Add Basic Authentication

The last step is the most involved and requires a small change in Sitecore as well. It enables basic authentication on SOLR, so Sitecore will need to authenticate to access SOLR. To do this, you’ll need to do the following.

Enable Basic Authentication

Add a new file, security.json to your SOLR instance with the below code – save this file in [path to solr]\server\solr.


{
  "authentication": {
    "blockUnknown": true,
    "class": "solr.BasicAuthPlugin",
    "credentials": {"solr": "IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "user-role": {"solr": "admin"},
    "permissions": [{"name": "security-edit", "role": "admin"}]
  }
}

So what does this file do?

  • Enables authentication and authorization
  • A user called solr is created with a password SolrRocks – note that we have to add it so at least you have one user. You can change the password later.
  • The user solr is assigned a role admin
  • The permissions to edit security is now restricted to the role admin. All other resources are unprotected and the user should configure more rules to secure them.

You can get very granular with the security rules – I wasn’t able to find all the possible permissions, but there is a short list here.
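If you would rather not start out with the well-known SolrRocks password at all, the credentials value in security.json can in principle be generated up front. The sketch below assumes the format is "base64(SHA-256(SHA-256(salt + password))) base64(salt)", which is my understanding of how SOLR's BasicAuthPlugin hashes passwords; treat that as an assumption and check it against your SOLR version. SolrCredential is a name I made up for this example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SolrCredential
{
    // Builds "<base64 hash> <base64 salt>" for the given salt.
    // Assumed scheme: hash = SHA-256(SHA-256(salt + UTF-8 password)).
    public static string Create(string password, byte[] salt)
    {
        byte[] pwd = Encoding.UTF8.GetBytes(password);
        byte[] salted = new byte[salt.Length + pwd.Length];
        Buffer.BlockCopy(salt, 0, salted, 0, salt.Length);
        Buffer.BlockCopy(pwd, 0, salted, salt.Length, pwd.Length);

        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(sha.ComputeHash(salted));
            return Convert.ToBase64String(hash) + " " + Convert.ToBase64String(salt);
        }
    }

    // Same as above, with a fresh random 32-byte salt.
    public static string Create(string password)
    {
        byte[] salt = new byte[32];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }
        return Create(password, salt);
    }
}
```

One way to sanity-check the assumed scheme is to call SolrCredential.Create("SolrRocks", Convert.FromBase64String("Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c=")) and compare the result with the credentials entry in the file above. If they differ, your SOLR version hashes differently and you should stick with the REST API for setting passwords.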

Make sure Basic Authentication works

Once you do the above, your SOLR instance should no longer be accessible without logging in. Restart the SOLR service and give it a try, and you should see this screen:

If you use the solr/SolrRocks credentials, you should be able to get into your SOLR instance.

Configuring Sitecore to use the credentials

At this point, Sitecore can’t access the SOLR instance, so your instance is probably not working correctly. You’ll need to add the credentials to the configs:


\App_Config\Sitecore\ContentSearch\Sitecore.ContentSearch.Solr.DefaultIndexConfiguration.config

Comment this out:


<solrHttpWebRequestFactory type="HttpWebAdapters.HttpWebRequestFactory, SolrNet" />

And add this:


<solrHttpWebRequestFactory type="HttpWebAdapters.BasicAuthHttpWebRequestFactory, SolrNet">
<param hint="username">solr</param>
<param hint="password">SolrRocks</param>
</solrHttpWebRequestFactory>

…right before:


</indexConfigurations>
</contentSearch>

Note: Don’t edit the config file directly, ya’know – best practice – make an include config patch file.

Manage Users

Now that Sitecore is able to authenticate to access SOLR, you should change the default password. The easiest way to do this is via the REST API that SOLR has, and the easiest way to access that is via curl. Once you get this downloaded, open up the command prompt and fire off the command to change the password for the ‘solr’ user:


curl --user solr:SolrRocks https://localhost:8983/solr/admin/authentication -H "Content-type:application/json" -d "{ \"set-user\": {\"solr\" : \"MyNewPassword\"}}"

You can also add new users:


curl --user solr:Password.1 https://localhost:8983/solr/admin/authentication -H "Content-type:application/json" -d "{ \"set-user\": {\"newuser\" : \"newpassword\"}}"

Note that if you changed the password for the ‘solr’ user before running these commands, you have to use the new password in the --user argument of the request.

If your request is successful, you should see the security.json file in [path to solr]\server\solr change: the hashed password in the file should be different. You should be able to use these credentials without restarting SOLR.
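If you prefer to script the user changes from .NET rather than curl, the same request can be sketched with HttpClient. This is only an illustration of what the curl commands above send: SolrUserAdmin and its method names are made up for this example, and the endpoint path is taken from the curl commands rather than verified against every SOLR version.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

public static class SolrUserAdmin
{
    // Parameter for the "Authorization: Basic ..." header: base64("user:password").
    public static string BasicAuthParameter(string user, string password)
    {
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(user + ":" + password));
    }

    // JSON body equivalent to the -d argument of the curl commands above.
    public static string SetUserPayload(string user, string newPassword)
    {
        return "{\"set-user\": {\"" + Escape(user) + "\": \"" + Escape(newPassword) + "\"}}";
    }

    // Minimal escaping for backslashes and double quotes only.
    private static string Escape(string value)
    {
        return value.Replace("\\", "\\\\").Replace("\"", "\\\"");
    }

    // Posts the set-user command; solrBaseUrl is e.g. "https://localhost:8983/solr".
    public static async Task<bool> SetUserAsync(
        string solrBaseUrl, string adminUser, string adminPassword,
        string user, string newPassword)
    {
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Authorization =
                new AuthenticationHeaderValue("Basic", BasicAuthParameter(adminUser, adminPassword));

            var content = new StringContent(
                SetUserPayload(user, newPassword), Encoding.UTF8, "application/json");

            HttpResponseMessage response = await client.PostAsync(
                solrBaseUrl.TrimEnd('/') + "/admin/authentication", content);
            return response.IsSuccessStatusCode;
        }
    }
}
```

Calling SetUserAsync("https://localhost:8983/solr", "solr", "SolrRocks", "solr", "MyNewPassword") should be roughly equivalent to the first curl command; the escaping helper is deliberately minimal, so avoid control characters in user names and passwords.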

#Bug: Rich Text Editor Insert Sitecore Link wipes out css classes

I found this bug in the Sitecore content editor the other day – in the Rich Text Editor, if you are editing HTML and want to change a link to point to a Sitecore link, it will wipe out any existing css classes on your link.

So, something like this:

<a href="http://mylink" class="test">This is the Text</a>

Will become like this:

<a href="http://mynewlink">This is the Text</a>

I figured that the custom command that edits this is not saving any other attributes of the link – the command is in \sitecore\shell\Controls\Rich Text Editor\RichText Commands.js; I opened a support ticket and Sitecore Support managed to fix this for me pretty quickly. It has been registered as a bug, and the fix is available here:

https://github.com/SitecoreSupport/Sitecore.Support.95393.165781/releases/tag/9.0.1.0

Note: This has been tested in Sitecore 9 update 1

Addendum: Sitecore 9.1 Infrastructure Roles

Sitecore 9.1 was released recently after the Sitecore Symposium 2018, and there are a lot of goodies to be found. Pieter Brinkman has a really good series on all the new features of Sitecore 9.1.

I want to add a little bit of an addendum to my original post on the Sitecore 9 Infrastructure roles. If you attended some of the technical talks at the Symposium, you’ll have seen a pattern of breaking the monolithic architecture into a much more manageable micro-component architecture. This means some of the functions from the main application will get split out. If you look at the recently updated documentation, you’ll find that there are now a LOT of infrastructure roles. Note – my earlier blog post only refers to application roles. You’ve already seen this initiative in the case of xConnect and the publishing service (and various other cases), which are both separate roles, and are slated to run independently of the main CM or CD instance.

Sitecore Identity Server

Sitecore 9.1 continues this trend, and introduces a new application role, called the Identity Server. The Identity Server is a separate ASP.NET Core application that manages the authentication for Sitecore. After installing 9.1, you’ll see that when you go to log into Sitecore CM, you’ll get redirected from https://{instanceName}/sitecore to https://{instanceName}-IdentityServer.

The beauty of this is that authentication is now delegated to a separate role, and we can use this role to set up SSO (Single Sign-On) across Sitecore services and applications. The SI server is based on an OpenID Connect-compliant security token service and can manage/update/refresh tokens. These tokens can then be used to manage authentication for Sitecore services. This will also allow users to sign into various sites and services that are hosted separately, even when you do not have a running instance of Sitecore XP.

Planning

From an infrastructure planning perspective, this is a minor change. If you’re on Azure, this is another app service. If you are on-premise, then this is another IIS Site. This can be scaled separately from the other roles if you plan to use it extensively. Otherwise, it has a pretty small footprint. This is a benefit of the split up from the monolithic architecture, otherwise, you would need to scale out the whole CM/CD role just to scale out maybe a single function.

All in all, I think this is a great direction for Sitecore to go into. Initially, it can be overwhelming, but once you understand the roles, it is much easier to manage.

Sitecore 9 Infrastructure Roles

Not to be confused with Server Roles that are required for Configuration, this post is about all the different elements (or infrastructure roles) required to run an instance of Sitecore 9 Experience Platform. Whether you are in the cloud or on-premise, these roles are all necessary (albeit they run differently on-premise vs in the cloud) for all the features of the Experience Platform to work.

Content Management and Content Delivery

These are obvious and unchanged – can be scaled up as necessary, and their functions continue to be the same.

Processing and Reporting

Processing servers are responsible for aggregating the collected xDB data and putting it into the reporting database, in a format that is suitable for reporting. A processing server is another Sitecore instance, configured to serve the processing role.

Reporting servers are primarily used as the instance to get all the data for displaying analytics. Whenever an author/content editor goes into the content management to see data such as Experience Analytics, or similar analytics, the CM server then uses the Reporting API to access the aggregated data that was processed by the Processing server.

xConnect

Brand new for Sitecore 9, xConnect is a middle layer that sits between the xDB and CM/CD, effectively acting as the abstraction layer for the collection of all xDB data. In addition, it’s open to be used by any trusted client to interact with xDB, so you’re not limited to having only Sitecore data in xDB. It has multiple APIs, and a different set of databases. In short, it is comprised of:

  1. The xConnect Collection service
  2. The xConnect Search service
  3. The xConnect Search Indexer
  4. The Collection databases

This is too big of a topic to cover in a blog post, but there is some fantastic documentation on it in the xConnect developer center documents.

Marketing Automation Service

The Marketing Automation Worker Service is a standalone service (Windows service on-premise, web job in Azure) that is responsible for going through the marketing automation plans and enrolling contacts that meet the criteria for those plans. You can install this on any server that has connectivity to the processing pools.

xConnect Search Indexer

The xConnect Search Indexer is a standalone service (Windows service on-premise, web job in Azure) that is responsible for adding xDB data into the search index for easy access. A lot of the data is automatically added, but this service primarily adds contact and interaction data, and runs on a schedule to check for updates.

Databases

Sitecore 9 supports SQL Server 2016 SP1 for its main content databases, xConnect and xDB databases, and MongoDB is not a requirement. In fact, currently, there is no support for MongoDB, but it is forthcoming.

That’s about all the infrastructure elements needed to run a full Sitecore 9 Experience Platform. There are scaled out and scaled down versions of these, which I will cover in the next post.

A Not So Quick Sitecore Search Primer: Part 2

After a quick foray into Sitecore Search, I decided to jot down a more thorough step-by-step of the things needed to make Sitecore Search work from start to finish. All this stuff is probably very well known by Sitecore veterans, but I wanted to get something down for somebody just starting to venture into Sitecore Search. In Part 1, we set up the index. In this post, we’ll write the code for the search. Please note: I’ve abandoned using Lucene for searches since then and use SOLR now. So the indexing setup will be different, but the search mechanism is more or less the same. This is a big benefit of Sitecore Search.

After the Index

Once the index is set up, the basics of searching using Content Search are pretty simple – you literally call the index and query based on the Content field:

using (var context = ContentSearchManager.GetIndex("indexname").CreateSearchContext())
{
   IQueryable<SearchResultItem> query = context.GetQueryable<SearchResultItem>().Where(p => p.Content.Contains("someterm"));

   foreach (SearchResultItem sri in query)
   {
         ...
   }

}

Advanced Stuff

The above obviously is not enough for anyone to get by with – even the most basic search requirements are more complex than that.

Basic Field Mapping

One of the first things you will need to do is map your item entities to the index document entity. SearchResultItem is the base object that Sitecore Content Search provides – this maps to a basic Lucene document.

If you want to search by specific fields other than content (for filters, etc), it would be a good idea to extend this class:

public class ResourceSearchResultItem : SearchResultItem

After that, you can add your own fields from your item templates that you indexed using the IndexField attribute:

[IndexField("title")]
public string Title { get; set; }

This is pretty straightforward for string fields – for any multilist values, you have to make sure you define your property as an IEnumerable (for example, IEnumerable<string>) so that you can get the facets from the search result list.

Computed Fields

If you want a custom value for a field (computed), you will need to create a computed field and then add that to the index, and use that field name for the attribute:

[IndexField("fieldcontent")]
public string FieldString { get; set; }

The computed field class is very simple as well, implemented using IComputedIndexField.

public class FieldContentField : IComputedIndexField
    {
        //Plain properties, so the values from the index configuration can be assigned without throwing
        public string FieldName { get; set; }

        public string ReturnType { get; set; }

        public object ComputeFieldValue(IIndexable indexable)
        {
            Assert.ArgumentNotNull(indexable, "indexable");
            try
            {
                Item item = indexable as SitecoreIndexableItem;

                if (item == null)
                {
                    return null;
                }

                //using the item, you can run business rules and return whatever value you need here

            }
            catch (Exception ex)
            {
                //log the error instead of swallowing it
                Log.Error("FieldContentField could not compute a value", ex, this);
            }
            return null;
        }
    }

Once done, it needs to get added to the index as a computed field:

<fields hint="raw:AddComputedIndexField">
     <field fieldName="fieldcontent">FieldContentField, AssemblyName</field>
</fields>

Once added, you can now use your own class to do searches:

using (var context = ContentSearchManager.GetIndex("indexname").CreateSearchContext())
{
   IQueryable<ResourceSearchResultItem> query = context.GetQueryable<ResourceSearchResultItem>().Where(p => p.Title.Contains("someterm"));

   foreach (ResourceSearchResultItem sri in query)
   {
         ...
   }

}

Filters

If you are only searching on one field, the above will work fine. If you want to search on multiple fields, you need to build a predicate.

using (var context = ContentSearchManager.GetIndex("indexname").CreateSearchContext())
{
   //Create an initial predicate - use .True<T> since we'll be AND'ing this clause together 
   var filterPredicate = Sitecore.ContentSearch.Linq.Utilities.PredicateBuilder.True<ResourceSearchResultItem>();
   
   filterPredicate = filterPredicate.And(x => x.ResourceType == "someValue");

}

The above is to chain a string of ‘AND’s. To string together a list of ‘OR’s, you would start with a .False and then string together .Or

If you want to have nested conditions, you’ll need to create a new Predicate, and then add it to the parent:

//create a new predicate
var resourceTypePredicate = PredicateBuilder.False<ResourceSearchResultItem>();
resourceTypePredicate = resourceTypePredicate.Or(x => x.CategoryType == "someValue");

//add it to the parent predicate
filterPredicate = filterPredicate.And(resourceTypePredicate);

And then finally, instead of searching on a specific field, you can now search using the Predicate:

IQueryable<ResourceSearchResultItem> query = context.GetQueryable<ResourceSearchResultItem>().Where(filterPredicate);
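To show why an AND chain starts from True and an OR chain starts from False, here is a small self-contained sketch that runs against in-memory objects. MiniPredicate is a simplified stand-in I wrote for illustration, not Sitecore's PredicateBuilder, and the Resource class and its values are made up.

```csharp
using System;
using System.Linq.Expressions;

// Simplified stand-in for a predicate builder; illustration only.
public static class MiniPredicate
{
    public static Expression<Func<T, bool>> True<T>() { return x => true; }
    public static Expression<Func<T, bool>> False<T>() { return x => false; }

    public static Expression<Func<T, bool>> And<T>(
        this Expression<Func<T, bool>> left, Expression<Func<T, bool>> right)
    {
        ParameterExpression p = left.Parameters[0];
        return Expression.Lambda<Func<T, bool>>(
            Expression.AndAlso(left.Body, Expression.Invoke(right, p)), p);
    }

    public static Expression<Func<T, bool>> Or<T>(
        this Expression<Func<T, bool>> left, Expression<Func<T, bool>> right)
    {
        ParameterExpression p = left.Parameters[0];
        return Expression.Lambda<Func<T, bool>>(
            Expression.OrElse(left.Body, Expression.Invoke(right, p)), p);
    }
}

// Hypothetical result type for the demo.
public class Resource
{
    public string ResourceType { get; set; }
    public string CategoryType { get; set; }
}

public static class PredicateDemo
{
    // ResourceType == "article" AND (CategoryType == "news" OR CategoryType == "events")
    public static Func<Resource, bool> Build()
    {
        // OR chain: seeded with False so the seed never matches on its own
        var category = MiniPredicate.False<Resource>()
            .Or(x => x.CategoryType == "news")
            .Or(x => x.CategoryType == "events");

        // AND chain: seeded with True so the seed never filters anything out
        var filter = MiniPredicate.True<Resource>()
            .And(x => x.ResourceType == "article")
            .And(category);

        return filter.Compile();
    }
}
```

Seeding the wrong way round is an easy mistake to make: an OR chain that starts from True matches everything, and an AND chain that starts from False matches nothing.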

Facets

Once you have a search result, you get facets on the results if you’ve made a property for them in your result object. Getting facets is pretty straightforward:

var categoryFacets = new FacetResults();

categoryFacets = query
		 .FacetOn(x => x.CategoryString)
		 .GetFacets();

And then you can get the category values from the Categories and Values properties (see very crude example below):

foreach (var facetCategory in categoryFacets.Categories)
{
    foreach (var facet in facetCategory.Values)
    {
        string theValue = facet.Name;

    }
}

Sorting and Paging

Sorting is very simple – just use the LINQ extension methods to sort on one or multiple fields:

IQueryable<ResourceSearchResultItem> query = context.GetQueryable<ResourceSearchResultItem>()
                                                    .Where(filterPredicate)
                                                    .OrderBy(x => x.ResourceType)
                                                    .ThenBy(x => x.Created);

Paging is also pretty simple, since the results are IQueryable – you can use the LINQ methods Skip() and Take():

IQueryable<ResourceSearchResultItem> query = context.GetQueryable<ResourceSearchResultItem>()
                                                    .Where(filterPredicate)
                                                    .Skip(0).Take(10); 
                                                    

One issue with this is that the paged query only gives you the current page, not the total number of results, but Sitecore Search gives you a way to get that from the results object:

var results = query.GetResults();
var numberOfSearchResults = results.TotalSearchResults;

I’m also aware there is a LINQ method called Page(), but I have never tried it.
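The Skip() and Take() arithmetic for a given page is easy to get off by one, so here is a minimal sketch of it against a plain in-memory IQueryable. The Paging class and its method names are made up for this example, and it assumes 1-based page numbers.

```csharp
using System;
using System.Linq;

public static class Paging
{
    // Returns the items for a 1-based page number.
    public static IQueryable<T> GetPage<T>(IQueryable<T> query, int pageNumber, int pageSize)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException(nameof(pageNumber));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));

        return query.Skip((pageNumber - 1) * pageSize).Take(pageSize);
    }

    // Number of pages needed for the total result count (integer ceiling).
    public static int PageCount(int totalResults, int pageSize)
    {
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));

        return (totalResults + pageSize - 1) / pageSize;
    }
}
```

With 25 results and a page size of 10, Paging.PageCount(25, 10) gives 3 pages, and Paging.GetPage(query, 3, 10) skips 20 items and returns the last 5.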

Conclusion

The above areas will account for 90% of searches. There are other extension methods for the search, such as Like() and Matches() – you can use them as you need them to get the search behavior you are looking for. A lot of this information is already available, but I was having trouble finding it all in one place. Hopefully, this is informative and convenient for someone who is looking for a soup-to-nuts guide on searches.

A Not So Quick Sitecore Search Primer: Part 1

After a quick foray into Sitecore Search, I decided to jot down a more thorough step-by-step of the things needed to make Sitecore Search from start to finish. All this stuff is probably very well known by Sitecore veterans, but I wanted to get something down for somebody just starting to venture into Sitecore Search.

Note: This post will use Lucene as the Search Provider.

First things first – setting up the index

Before doing any kind of search, you have to make configuration files for the index. We will start with a very basic index that will start the search. For that, you need two basic parts – first, you have to set up what items you are indexing:

      <indexConfigurations>
      <MySearchConfiguration type="Sitecore.ContentSearch.LuceneProvider.LuceneIndexConfiguration, Sitecore.ContentSearch.LuceneProvider">
          <indexAllFields>true</indexAllFields>
          <initializeOnAdd>true</initializeOnAdd>
          <analyzer ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/analyzer" />
          <fieldMap type="Sitecore.ContentSearch.FieldMap, Sitecore.ContentSearch">
            <fieldNames hint="raw:AddFieldByFieldName">
              <field fieldName="_uniqueid" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
                <analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
              </field>
              <fieldType fieldName="_id" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
                <Analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
              </fieldType>
              <field fieldName="category" storageType="YES" indexType="TOKENIZED" vectorType="YES" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
                <analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
              </field>
              <field fieldName="tags" storageType="YES" indexType="TOKENIZED" vectorType="YES" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
                <analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
              </field>
            </fieldNames>
          </fieldMap>

          <fields hint="raw:AddComputedIndexField">
            <field fieldName="customcontent">MyLibrary.CustomContentField, MyLibrary</field>
          </fields>

          <include hint="list:IncludeTemplate">
            <NewsTemplateID>{B179CB04-3ACC-4737-ADA0-B45D7E98C213}</NewsTemplateID>
          </include>

          <fieldReaders ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/fieldReaders"/>
          <indexFieldStorageValueFormatter ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/indexFieldStorageValueFormatter"/>
          <indexDocumentPropertyMapper ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/indexDocumentPropertyMapper"/>
          <documentBuilderType>Sitecore.ContentSearch.LuceneProvider.LuceneDocumentBuilder, Sitecore.ContentSearch.LuceneProvider</documentBuilderType>
        </MySearchConfiguration>
        
      </indexConfigurations>

Let’s break this down into the different sections:

Starting from the indexConfigurations, we open up a new configuration node. You can name this whatever you want, but note the name of the node, so we can refer to it later.

<MySearchConfiguration type="Sitecore.ContentSearch.LuceneProvider.LuceneIndexConfiguration, Sitecore.ContentSearch.LuceneProvider">

The next part is a couple of options:

<indexAllFields>true</indexAllFields>
<initializeOnAdd>true</initializeOnAdd>

Next comes the reference to the analyzer:

<analyzer ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/analyzer" />

If you look in the Sitecore.ContentSearch.Lucene.DefaultConfigurations.config file in the App_Config folder, you’ll find that it has default index settings for a bunch of the references that are needed. In your index, you can create your own (if you need your own analyzer) and refer to it here, or you can just refer directly to the one in the default configuration node. The ref attribute points directly to that node.

Next come all the different fields that should be indexed. Even though we have indexAllFields set to true, what this will do is index all the text in a field called _content, so if you want separate fields to refer to and search on, you’ll need to add them here.

<fieldMap type="Sitecore.ContentSearch.FieldMap, Sitecore.ContentSearch">

This is one field you have to have – this assigns a unique ID to the document in the index, so when the item in Sitecore gets updated, it doesn’t add a new document to the index – instead it just updates it. In some older versions, this is not needed.

<field fieldName="_uniqueid" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
 <analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
</field>

You’ll need to choose the indexType and storageType for each field:

storageType = “YES” or “NO”: pretty straightforward – in the sense that the value of the field is either stored in the index, or not. This is useful for when you don’t want to go back to the database to retrieve the item for values that you want to display.

indexType = “TOKENIZED” or “UN-TOKENIZED” or “NO” or “NO_NORMS”

  • TOKENIZED: Any phrases with multiple words will be split up
  • UN-TOKENIZED: Phrases will be stored as a whole – the entire value of the field, essentially
  • NO_NORMS: Phrases will not be split up, same as UN-TOKENIZED and also will not be analyzed, which means that it won’t store boost factors.
  • NO: The field value won’t be searchable, and the only reason to have this option is if you have storageType = “YES”, so you can retrieve the value.
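
As a rough illustration of how these two attributes combine, here is a sketch of a fieldNames block with two fields. The field names (title and thumbnailurl) are made up for the example, and the attribute values are the ones described above, so check the spelling against the default configuration file for your Sitecore version:

```xml
<fieldNames hint="raw:AddFieldByFieldName">
  <!-- Hypothetical field: split into words for searching, and stored so it can be displayed straight from the index -->
  <field fieldName="title" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider" />
  <!-- Hypothetical field: stored for display only, not searchable -->
  <field fieldName="thumbnailurl" storageType="YES" indexType="NO" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider" />
</fieldNames>
```

The idea is that title can be both searched and shown in results, while thumbnailurl only rides along with the document so you don’t have to go back to the database for it.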

If you have indexAllFields set to true, you don’t need to specify the fields – however, if you want to refer to the fields directly as members (for custom search result classes), they need to be added.

You can add fields by name, by type, or exclude them by name or by type (in our example, we added by name):

<fieldNames hint="raw:AddFieldByFieldName">

To add by type:

<fieldTypes hint="raw:AddFieldByFieldTypeName">
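
For comparison with the by-name example, a by-type entry might look something like the sketch below. I’m assuming here that the entries are fieldType nodes with a fieldTypeName attribute, which is how the default configuration file lays them out; the field type name itself is just an example and should match the field types used by your templates:

```xml
<fieldTypes hint="raw:AddFieldByFieldTypeName">
  <!-- Assumed example: applies to every field of type "single-line text", whatever the field is called -->
  <fieldType fieldTypeName="single-line text" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider" />
</fieldTypes>
```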

Include fields, or exclude fields:

<include hint="list:IncludeField">
  <fieldId>{B179CB04-3ACC-4737-ADA0-B45D7E98C213}</fieldId>
</include>

OR

<exclude hint="list:ExcludeField">
  <fieldId>{B179CB04-3ACC-4737-ADA0-B45D7E98C213}</fieldId>
</exclude>

The next section is computed fields. Computed fields are great for when a field value points to another field, or to multiple fields, and you have to derive a value based on some specific logic. They’re also useful if you want to have related items of some sort. You can calculate the related documents’ values on the fly and add them as a one-to-one field with the document. I’ll get into this in Part 2.

There are a bunch of fields that get added by Sitecore regardless of your config:

  • _content
  • _created
  • _creator
  • _database
  • _datasource
  • _displayname
  • _editor
  • _fullpath
  • _group
  • _indexname
  • _language
  • _latestversion
  • _name
  • _parent
  • _path
  • _template
  • _templatename
  • _updated
  • _version

I called out _latestversion because this will be important when you do the searches – when you have multiple versions of the same item, each version gets indexed as a separate document, so when you search, you have to make sure you get the latest one. This only really matters on the CM server for previewing, because the web database only ever has one version.

<fields hint="raw:AddComputedIndexField">

Next step is to add the type of templates you want to index. The node name doesn’t really matter – you can name it anything, just include the GUID of the template in the node.

<include hint="list:IncludeTemplate">
     <NewsTemplateID>{B179CB04-3ACC-4737-ADA0-B45D7E98C213}</NewsTemplateID>
</include>

Alternatively, you can choose to include all templates, and add a directive to exclude the templates you don’t want to index.

<exclude hint="list:ExcludeTemplate">
     <NewsTemplateID>{B179CB04-3ACC-4737-ADA0-B45D7E98C213}</NewsTemplateID>
</exclude>

And then, you have to define some other values, such as the field readers, valueformatters, document property mappers, and builder type. You can point them all to the default config node.

<fieldReaders ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/fieldReaders"/>
<indexFieldStorageValueFormatter ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/indexFieldStorageValueFormatter"/>
<indexDocumentPropertyMapper ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/indexDocumentPropertyMapper"/>
<documentBuilderType>Sitecore.ContentSearch.LuceneProvider.LuceneDocumentBuilder, Sitecore.ContentSearch.LuceneProvider</documentBuilderType>

Once you’ve set up what you are indexing, the next step is to define the actual index:

<configuration type="Sitecore.ContentSearch.ContentSearchConfiguration, Sitecore.ContentSearch">
        <indexes hint="list:AddIndex">
          <index id="my_index_name" type="Sitecore.ContentSearch.LuceneProvider.LuceneIndex, Sitecore.ContentSearch.LuceneProvider">
            <param desc="name">$(id)</param>
            <param desc="folder">$(id)</param>
            <!-- This initializes index property store. Id has to be set to the index id -->
            <param desc="propertyStore" ref="contentSearch/indexConfigurations/databasePropertyStore" param1="$(id)" />
            <configuration ref="contentSearch/indexConfigurations/MySearchConfiguration" />
            <strategies hint="list:AddStrategy">
              <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/rebuildAfterFullPublish" />
             </strategies>
            <commitPolicyExecutor type="Sitecore.ContentSearch.CommitPolicyExecutor, Sitecore.ContentSearch">
              <policies hint="list:AddCommitPolicy">
                <policy type="Sitecore.ContentSearch.TimeIntervalCommitPolicy, Sitecore.ContentSearch" />
              </policies>
            </commitPolicyExecutor>
            <locations hint="list:AddCrawler">
              <crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">
                <Database>master</Database>
                <Root>/sitecore/content/User Content/Site Level/Mindshift/Resources</Root>
              </crawler>
            </locations>
            <enableItemLanguageFallback>false</enableItemLanguageFallback>
            <enableFieldLanguageFallback>false</enableFieldLanguageFallback>

          </index>
        </indexes>
      </configuration>

This starts out by naming the index:

<indexes hint="list:AddIndex">
  <index id="my_index_name" type="Sitecore.ContentSearch.LuceneProvider.LuceneIndex, Sitecore.ContentSearch.LuceneProvider">

Next come some requisite parameters for the index name and folder, both of which reuse the index id:

<param desc="name">$(id)</param>
<param desc="folder">$(id)</param>

Next thing to note is the reference provided to what the index should store. Here is where we point the ref attribute to the index configuration we made earlier:

      <configuration ref="contentSearch/indexConfigurations/MySearchConfiguration" />

Next is the add strategy section – this section defines how and when the index is updated as items are added or changed, for both CM and CD. For basic indexing, I’ve added rebuildAfterFullPublish, which will rebuild the index on all remote servers after a full publish.

          
<strategies hint="list:AddStrategy">
      <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/rebuildAfterFullPublish" />
</strategies>  

For all the different index update strategies, go here: http://bit.ly/2h6FDCv
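
To give an idea of what a different choice might look like, here is a sketch of two alternatives. It assumes the stock syncMaster and onPublishEndAsync strategy nodes are defined under indexUpdateStrategies in your default configuration, so check that they exist in your version before pointing at them:

```xml
<!-- Sketch for an index on the master database (CM): update as items are saved -->
<strategies hint="list:AddStrategy">
  <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/syncMaster" />
</strategies>

<!-- Sketch for an index on the web database (CD): update after each publish -->
<strategies hint="list:AddStrategy">
  <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsync" />
</strategies>
```

You would only use one strategies block per index; the two are shown together here just for comparison.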

The next important part is the crawler. We will use the default crawler – there are many implementations of crawlers out there, and if you have a need to index your items in a very specific way, you can inherit from the default crawler and build upon it. In that case, your custom crawler is the type you would specify here.

   
<locations hint="list:AddCrawler">
              <crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">
                <Database>master</Database>
                <Root>/sitecore/content/User Content/Site Level/Resources</Root>
              </crawler>
</locations>

You also have to specify the root of the content tree where indexing will start. The crawler will traverse from there.

   
<Root>/sitecore/content/User Content/Site Level/Resources</Root>
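
If the content you want to index lives under more than one branch of the tree, one option is to add a second crawler to the same locations list. The sketch below assumes the list:AddCrawler hint accepts several crawler entries, and both root paths are made up for the example:

```xml
<locations hint="list:AddCrawler">
  <!-- Hypothetical first root -->
  <crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">
    <Database>master</Database>
    <Root>/sitecore/content/Site A/Resources</Root>
  </crawler>
  <!-- Hypothetical second root, crawled into the same index -->
  <crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">
    <Database>master</Database>
    <Root>/sitecore/content/Site B/Resources</Root>
  </crawler>
</locations>
```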
          

Last but not least, you need to surround both of these with:

   
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
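
Putting it together, the overall shape of the patch file looks roughly like this. It is only a skeleton: the comments stand in for the two sections covered above, and the node names are the ones used in this post:

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
      <indexConfigurations>
        <MySearchConfiguration type="Sitecore.ContentSearch.LuceneProvider.LuceneIndexConfiguration, Sitecore.ContentSearch.LuceneProvider">
          <!-- options, analyzer, fieldMap, computed fields, templates, readers, etc. -->
        </MySearchConfiguration>
      </indexConfigurations>
      <configuration type="Sitecore.ContentSearch.ContentSearchConfiguration, Sitecore.ContentSearch">
        <indexes hint="list:AddIndex">
          <index id="my_index_name" type="Sitecore.ContentSearch.LuceneProvider.LuceneIndex, Sitecore.ContentSearch.LuceneProvider">
            <!-- params, configuration ref, strategies, commit policy, crawler -->
          </index>
        </indexes>
      </configuration>
    </contentSearch>
  </sitecore>
</configuration>
```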

Once this is done, you can deploy and you should see your indexes in the /data/indexes folder. You can use SPE (Sitecore PowerShell Extensions) or a program like Luke Index Viewer to check the indexes and the fields being indexed.

Before you deploy to CM and CD, you must make sure that you follow the configuration file setup, as there are a bunch of indexes that need to be disabled for CD – if they aren’t disabled, errors get thrown, and they interfere with your custom indexes. Go here for the configuration options: http://bit.ly/2fYtJ8y

In Part 2, we’ll get into the code on how to perform basic searches.