Obtree Case Studies

Hounslow Borough Council Cludo Search integration

Overview

Hounslow Borough Council (HBC), a long-time Obtree WCM customer, tasked JCMS with implementing an integration between Obtree and Cludo Search. The integration covered both the then public-facing website and the intranet, both of which were powered by Obtree WCM. (Note: the public-facing site is no longer powered by Obtree.)

Cludo offers a remote indexing service; the corresponding search is implemented via a simple JavaScript plugin and, for intranet content, a server-side authentication module. Customer source content that needs indexing must therefore be reachable remotely by Cludo. Integration with the HBC public site was straightforward because all pages are freely available on the web. Integration with the intranet site was more challenging because of the requirement for proxy authentication.

Public facing site

The Cludo working model is to analyse the customer's source HTML and provide a document describing the HTML and JavaScript alterations required to implement the integration. The JCMS task was therefore essentially to make straightforward Obtree WCM template changes to satisfy the Cludo integration document.

These included some HTML alterations and some additional inline and referenced JavaScript. The key values set in the JavaScript configuration object were the customer ID and site ID, so that the AJAX request for search results retrieves matches from the correct customer and site. The indexed data is all held remotely on the Cludo servers.
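The role of the two identifiers can be illustrated with a minimal sketch. It is not Cludo's actual configuration format: the field names, the parameter names in the query string, the page size and the numeric ID type are all assumptions made for illustration, and the real values come from the Cludo integration document.

```typescript
// Hypothetical shape of the client-side configuration object.
// Field names are illustrative only, not Cludo's documented API.
interface SearchConfig {
  customerId: number; // identifies the customer (assumed numeric)
  siteId: number;     // identifies one site belonging to that customer
  resultsPerPage: number;
}

function buildSearchConfig(customerId: number, siteId: number): SearchConfig {
  if (customerId <= 0 || siteId <= 0) {
    throw new Error("customerId and siteId must be positive");
  }
  return { customerId: customerId, siteId: siteId, resultsPerPage: 10 };
}

// Builds the query string for one page of results. Both IDs travel with
// every request so the remote index can select the right customer and site.
function buildSearchQuery(config: SearchConfig, query: string, page: number): string {
  const params: [string, string][] = [
    ["customerId", String(config.customerId)],
    ["siteId", String(config.siteId)],
    ["query", query],
    ["page", String(page)],
    ["perPage", String(config.resultsPerPage)],
  ];
  return params
    .map(function (pair) {
      return encodeURIComponent(pair[0]) + "=" + encodeURIComponent(pair[1]);
    })
    .join("&");
}
```

With placeholder IDs, `buildSearchQuery(buildSearchConfig(1234, 5678), "bin collection", 2)` would produce a query string carrying both identifiers alongside the search term and page number.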

The results were presented as a simple paginated list.

Intranet site

The requirement for remote indexing of the internal content meant that the implementation for the intranet was considerably more involved. The required steps can be summarised as:

  • HBC needed to allow remote access to servers behind the firewall (i.e. to a rendered instance of the intranet).
  • Cludo needed to access this opened tunnel and crawl the content.
  • Because this content is private to HBC but stored on the potentially publicly accessible Cludo servers, access to search results had to be restricted.
  • HBC needed to implement a secure server-side proxy providing an authentication token to gain access to the crawled content.
  • HBC therefore needed to provide access to the Cludo servers from the intranet client machines and from the intranet web server.
  • HBC needed to implement the client-side JavaScript that tied this all together.

Indexing

Allowing access to the material to be indexed required a network and firewall configuration change. This task was performed by the HBC network administrators. The subsequent indexing and storage were performed remotely by Cludo, so JCMS was not involved with this step.

Search result access

As noted, access to the indexed content required a server-side authentication token to be passed as part of the search request. This required a combination of tasks. The underlying network connectivity that allows search requests to reach the Cludo servers from within the firewalled HBC intranet was set up by the HBC network administrators.

The server-side proxy, the subsequent CMS integration and the template changes for the client-side JavaScript were implemented by JCMS.

Authentication proxy development

The initial instructions provided by Cludo for intranet integration only covered the client-side HTML and JavaScript integration. Although an HBC-specific authentication token was provided in addition to the customer and (different) site IDs, it was unclear how this should be used. Eventually, a GitHub repository location was provided that contained C# skeleton code for the server-side proxy.

This was a .NET 4.5 application with various hooks for authentication, a test HTML and JavaScript file and, crucially, a configuration file with a placeholder for a customer key. Once the HBC customer key was entered into the web.config file and the site and customer IDs were added to the client-side JavaScript configuration, HBC-specific search results were presented in the test HTML file.

Cludo proxy web.config with customer key

IIS setup

The Cludo proxy code runs as an IIS application, so a suitable virtual directory, pointing at the Cludo C# code, was added and converted to an IIS application. The virtual path becomes important later on for the integration with the Obtree WCM system.

IIS Cludo Application
IIS Cludo Application configuration

Client side script and proxy testing

The GitHub skeleton code contains a sample HTML file that can be used to test the application. This has some inline JavaScript that embeds the remote Cludo JavaScript. The inline JavaScript was amended with the customer ID and site ID. Coupled with the customer key held in the server-side code, this meant that sufficient authentication was sent with the search query request to allow Cludo both to identify the customer and site and to ensure that the request for secure data had come from a server that had sent a valid customer token.
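The division of responsibility between browser and proxy can be sketched as follows. This is an illustration of the general pattern only, not the Cludo skeleton code (which is C#): the header name `X-Customer-Key`, the `/search` path, the JSON body and all type names are hypothetical, and the real authentication scheme is whatever the Cludo proxy implements.

```typescript
// Values held only on the server (in the case study, in web.config).
interface ProxyServerConfig {
  customerKey: string;     // secret; never sent to the browser
  upstreamBaseUrl: string; // hypothetical search service base URL
}

// What the browser sends to the same-origin proxy: IDs and query, no secret.
interface BrowserSearchRequest {
  customerId: number;
  siteId: number;
  query: string;
}

interface UpstreamRequest {
  url: string;
  headers: { [name: string]: string };
  body: string;
}

// Builds the request the proxy forwards upstream, attaching the secret key
// server-side so it is never present in client-side code or traffic.
function buildUpstreamRequest(
  server: ProxyServerConfig,
  request: BrowserSearchRequest
): UpstreamRequest {
  if (server.customerKey.trim() === "") {
    throw new Error("customer key is not configured");
  }
  return {
    url: server.upstreamBaseUrl + "/search",
    headers: {
      "Content-Type": "application/json",
      "X-Customer-Key": server.customerKey, // hypothetical header name
    },
    body: JSON.stringify({
      customerId: request.customerId,
      siteId: request.siteId,
      query: request.query,
    }),
  };
}
```

The point of the design is that the browser only ever knows the customer and site IDs; the key that proves the request came from an authorised server is added after the request has reached the proxy.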

The simple test harness is sufficient to confirm that the intranet site has been indexed and that the credentials are valid, as shown below.

Initial view of demo search screen
View of demo search screen with autocomplete
View of demo page with search results

Integration between Obtree WCM and Cludo proxy

The Cludo proxy URL needed to be on a path of the same domain as the intranet, so that the client-side AJAX JavaScript did not encounter any cross-origin request problems.

However, Obtree CMS works by intercepting all requests for a given domain (or path in that domain) and attempting to dynamically serve content held in its own repository. It is of course possible to configure the system to ignore requests on certain paths with so-called EXCLUDEMAGIC entries in the configuration file.

Therefore, in addition to the simple template modifications to include the HTML and JavaScript changes, the Obtree configuration was adapted with an EXCLUDEMAGIC entry matching the IIS Application name set when the Cludo code was added to the IIS website running the Obtree engine. This is shown in the following screenshot:

Obtree configuration to exclude a specified path

This configuration change ensured that any HTTP requests that included /cludosearch/ in the URL would not be processed by the Obtree engine, but passed through to IIS to handle (in this case, the Cludo .NET application).
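The effect of the exclusion can be modelled with a short sketch. It does not reproduce Obtree's configuration syntax or its actual matching rules: the function and type names are hypothetical, and the plain case-sensitive substring match is an assumption based only on the description above.

```typescript
// Which component ends up serving a request (illustrative model only).
type RequestHandler = "cms" | "iis";

// Returns "iis" when the URL path contains any excluded segment, so the
// request bypasses the CMS engine; otherwise the CMS serves it.
// Assumption: a plain, case-sensitive substring match.
function routeRequest(urlPath: string, excludedSegments: string[]): RequestHandler {
  for (const segment of excludedSegments) {
    if (urlPath.indexOf(segment) !== -1) {
      return "iis";
    }
  }
  return "cms";
}
```

Under this model, `routeRequest("/cludosearch/api/search", ["/cludosearch/"])` yields `"iis"`, while an ordinary content path such as `/news/index.html` yields `"cms"`. Because the proxy sits on a path of the intranet's own domain, the browser's search requests remain same-origin.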

CMS template integration

The final step was to implement the HTML and JavaScript changes detailed by Cludo following their analysis of the rendered source code. Essentially, the existing search form was re-used, but a DOM identifier was added to allow the Cludo code to initialise with the existing search markup. The appropriate initialisation object was also embedded as inline JavaScript to load all required remote libraries and to ensure the correct output DOM targets and format were specified.

Obtree template showing JavaScript Cludo configuration object

Unlike the internet content, the intranet content was indexed with facet attributes, and the results are presented with a breakdown of matches per facet. Each displayed facet is a filter for the current result list. This output is all generated by the Cludo controller code. For example, a search for 'councillors':

Search outcome showing all matching results and available facets
Search outcome showing all matching results filtered by a facet
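The facet behaviour described above (a count of matches per facet, with each facet acting as a filter on the current result list) can be sketched conceptually as below. In the real integration this output is generated by the Cludo code; the types and function names here are hypothetical and assume, for simplicity, that each result carries exactly one facet value.

```typescript
// Hypothetical result record; the real result format is defined by Cludo.
interface SearchResult {
  title: string;
  url: string;
  facet: string; // assumed: one facet value per result
}

// Counts how many results fall under each facet value.
function countByFacet(results: SearchResult[]): { [facet: string]: number } {
  const counts: { [facet: string]: number } = {};
  for (const result of results) {
    counts[result.facet] = (counts[result.facet] || 0) + 1;
  }
  return counts;
}

// Narrows the current result list to a single facet value.
function filterByFacet(results: SearchResult[], facet: string): SearchResult[] {
  return results.filter(function (result) {
    return result.facet === facet;
  });
}
```

The counts drive the facet breakdown shown beside the results, and selecting a facet simply replaces the displayed list with the filtered subset.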