Achieving a federated single view of the customer

I am writing this post having just delivered a pilot implementation of a portlet-driven web presentation architecture, so it seems like a good time to reflect on the concepts that fed into its design to date, as well as on its long-term prospects.

I would like to propose modern portal and portlet technology standards as key to a realistic solution to the challenging problem of achieving a “single view” of the customer. The technique I will describe can be extended to any type of object whose information is spread across disparate applications over which the solutions architect often has very limited control.

There has been substantial discussion on this topic, but in my experience researching it, few approaches have emerged that offer a solution to “single view” when systems have not been specifically architected to fit into such a solution.

This is especially important for local government organisations, where there is a substantial number of third-party-supplied applications, often the result of a lack of corporate governance over application procurement.

Often, in order to make information held in such applications searchable, the appropriate information is extracted from the application and indexed externally. This is identical to how popular www search engines such as Google provide search capability. However, this search strategy is not fit for organisations that take access security seriously.

Security can quite simply be split into two categories: infrastructure security and application security. Infrastructure security prevents unauthorised access to the environments in which information resides in data stores, including direct access to enterprise data stores or control of servers. If your infrastructure is not secure, then it is irrelevant what degree of security you apply at the application level. However, infrastructure security is out of scope for this discussion.

Application security is, however, far more interesting in my opinion. Applications dictate the security model through which access to information is facilitated: they are the gatekeepers. It is important to consider that whenever information is retrieved from such an application, it leaves its secure environment. For this precise reason, the use of centralised indexing technology must be considered a security risk to sensitive enterprise information.

So how can we balance security with usefulness? Bring on federated systems!

Federated Search Strategy

Before discussing what value a federated search strategy can bring to an enterprise, it is important to understand what the term “federation” means.

“[A federation is] the formation of a political unity, with a central government, by a number of separate states, each of which retains control of its own internal affairs.” (sourced from www.dictionary.com)

This definition is highly applicable in the context of solutions architecture, where there is a need to facilitate information and process governance.

Interestingly, as mentioned earlier, information governance (of a sort) is often forced upon the systems architect in the form of unmodifiable third-party-supplied applications promoting their own data stores. Unfortunately, it is not useful information governance, as it leads to information redundancy, which causes issues when trying to provide a comprehensive view of the information held on a subject or customer.

However, this form of governance is useful in the area of security. The application, although potentially duplicating information held elsewhere, has been carefully designed (if it holds sensitive information) to prevent unauthorised access to the information it holds. Consequently the application would naturally define user types or roles which govern user permissions. I will hereafter refer to these as “application security roles”.

Federated search is all about allowing each application to govern its own search functionality, on its own data, according to its own rules.

So how does federated search relate to the single view of the customer, you may ask? In generic terms, a customer is just another object type modelled by the organisation’s applications.

An organisation has its own roles that make sense at the organisational level. These could be described as “organisational security roles”. These are rarely the same as the application roles, so a mapping scheme between the two sets is required.
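
As a minimal sketch of what such a mapping scheme might look like (all role and application names below are hypothetical, and a real implementation would likely source the mappings from configuration or a directory service rather than code):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of an organisational-to-application role mapping.
// All role and application identifiers are illustrative only.
public final class RoleMapper {

    // organisational security role -> (application id -> application security role)
    private static final Map<String, Map<String, String>> MAPPINGS = new HashMap<>();

    static {
        Map<String, String> caseworker = new HashMap<>();
        caseworker.put("housingApp", "CASE_VIEWER");
        caseworker.put("benefitsApp", "READ_ONLY");
        MAPPINGS.put("org:caseworker", caseworker);
    }

    /** Returns the application security role to use, falling back to a guest role. */
    public static String applicationRoleFor(String orgRole, String applicationId) {
        Map<String, String> applicationRoles = MAPPINGS.get(orgRole);
        if (applicationRoles == null || !applicationRoles.containsKey(applicationId)) {
            return "GUEST";
        }
        return applicationRoles.get(applicationId);
    }
}
```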

Commonly, in a federated search system, a user enters search criteria once into a common component, such as a form field in the header of a website. The system then delegates the search to all the participating applications. If the user has previously been authenticated, this delegation will include the specific application role identifiers derived through the mapping scheme discussed earlier; if not, a “guest” role is used. The system then waits for all the results to become available before displaying a consolidated results list back to the user.
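
In rough Java terms, the delegation and consolidation step might look something like the sketch below. The connector interface and all names are hypothetical, not part of any standard; each participating application would sit behind its own connector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical connector that each participating application exposes to the federation.
interface ApplicationSearchConnector {
    String applicationId();
    List<String> search(String criteria, String applicationRole);
}

public class FederatedSearch {

    private final ExecutorService pool = Executors.newFixedThreadPool(8);

    /**
     * Delegates the search to every connector, then waits for all the results.
     * applicationRoles maps application id -> the user's application security role,
     * as derived from the role mapping discussed above ("GUEST" if unauthenticated).
     */
    public List<String> search(List<ApplicationSearchConnector> connectors,
                               String criteria,
                               Map<String, String> applicationRoles) {
        List<Future<List<String>>> pending = new ArrayList<>();
        for (ApplicationSearchConnector connector : connectors) {
            String appRole = applicationRoles.getOrDefault(connector.applicationId(), "GUEST");
            pending.add(pool.submit(() -> connector.search(criteria, appRole)));
        }

        List<String> consolidated = new ArrayList<>();
        for (Future<List<String>> future : pending) {
            try {
                // Don't let one slow application hold up the consolidated view indefinitely.
                consolidated.addAll(future.get(10, TimeUnit.SECONDS));
            } catch (Exception e) {
                // A failing or timed-out application is simply omitted from the results.
            }
        }
        return consolidated;
    }
}
```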

One challenge is deciding which fields are acceptable as search criteria and providing appropriate form field(s) through which the user can express them. If you are searching for information on a particular customer, it would make sense to provide search fields such as first name, last name and, of course, customer reference.

Yes that would be very nice – but entirely unrealistic!

The reason is that not all applications participating in the federation will store the same customer reference against a customer object. Each is likely to store a reference of some sort, but there is no guarantee that it will even be in the same format as its counterparts in the other applications in the federation. In addition, a customer’s ID within an application is often generated rather than input by that application’s administrator.

In my opinion, the only way around this issue is to maintain an index of cross-references externally to each application. This is unfortunate because, as mentioned earlier, that information then bypasses the applications’ security models. To mitigate the risk, only the IDs should be kept in such an index.
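
A minimal sketch of such an index follows; the point is that it holds nothing but identifiers (all names are hypothetical):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Sketch of an ID-only cross-reference index. No names, addresses or other
// sensitive attributes are ever stored here, only identifiers.
public class CustomerCrossReferenceIndex {

    // organisational customer id -> (application id -> that application's local id)
    private final Map<String, Map<String, String>> index = new HashMap<>();

    public void register(String orgCustomerId, String applicationId, String localId) {
        index.computeIfAbsent(orgCustomerId, key -> new HashMap<>())
             .put(applicationId, localId);
    }

    public Optional<String> localIdFor(String orgCustomerId, String applicationId) {
        return Optional.ofNullable(
                index.getOrDefault(orgCustomerId, Collections.emptyMap())
                     .get(applicationId));
    }
}
```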

On the flip side, one could consider such an index to be an application in its own right, which participates in a federation. After all, the cross referencing is in itself information, and access to it should be governed by user security roles.

Another issue is that your search criteria fields must refer to data items common to all the applications in order to ensure searches are comprehensive; otherwise, only a subset of applications will retrieve information on the customer. Again, there is a way to work around this, but it involves storing meta-information about which fields exist in which applications, and then performing preparatory searches against at least one of those applications to return an ID that is common to all the federated applications (potentially requiring further cross-referencing indexes) and that can be used in the main search. This approach would also need precedence rules to decide which system to run the preparatory search against when more than one option is available. In other words, you can very easily end up with a very complex implementation whose complexity grows rapidly every time a new searchable field is added.

There must be a better way… and there is!

Federated search through modern portal technology

In recent years, portal standards have matured substantially with the advent of web 2.0 technologies and increased user expectations of seamless integration between applications.

Portals are often seen as simple content aggregation engines. However, they are capable of much more. With the release of the JSR-286 (a.k.a. Portlet 2.0) standard in June 2008, portals became capable of acting as message brokers to enable IPC (inter-portlet communication) without resorting to proprietary implementations, as was the case with JSR-168 (a.k.a. Portlet 1.0).

It is common practice to make use of messaging technologies in middleware to benefit from:

  • reliable delivery of information
  • asynchronous processing
  • loose coupling of component parts/services

With the advent of JSR-286, we are starting to see the same benefits in the development of presentational components that can be mashed up to produce useful composite applications.

The means through which this is achieved is JSR-286 eventing for IPC. Portlets are now able to trigger named events in the scope of a namespace. These events can transport any object as long as it implements the Serializable interface.
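
As an illustration, publishing such an event from a portlet’s action phase looks roughly like the sketch below. The portlet class, namespace URI and parameter name are hypothetical; the payload here is a plain String ID, which satisfies the Serializable requirement.

```java
import java.io.IOException;

import javax.portlet.ActionRequest;
import javax.portlet.ActionResponse;
import javax.portlet.GenericPortlet;
import javax.portlet.PortletException;
import javax.xml.namespace.QName;

// Sketch: a search portlet belonging to one application publishes a
// "customerSelected" event when the user picks a result from its own list.
public class HousingSearchPortlet extends GenericPortlet {

    // Event name in this application's own namespace (hypothetical URI).
    private static final QName CUSTOMER_SELECTED =
            new QName("http://example.org/housing", "customerSelected");

    @Override
    public void processAction(ActionRequest request, ActionResponse response)
            throws PortletException, IOException {
        String localCustomerId = request.getParameter("customerId");
        // The payload must implement Serializable; a String ID is the minimum that works.
        response.setEvent(CUSTOMER_SELECTED, localCustomerId);
    }
}
```

For the portal to route the event, it would also need to be declared in portlet.xml (an event-definition plus supported-publishing-event and supported-processing-event entries on the relevant portlets).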

I would suggest that Apache XMLBeans can provide ideal object types to send as payloads. XMLBeans fully implements the XML Schema (XSD) specification and can therefore be used to model very complex information structures.

So how do namespaces relate to the concept of federated search? In a federation of applications, namespaces are vitally important due to the high likelihood of identically named fields between systems. Namespaces give context. Two types of context should be appreciated:

  • Organisational context
  • Application context

It is the role of the solutions architect to define an appropriately designed organisational context namespace in which all the object types of interest to the organisation can be represented.

Likewise, it is the role of the application developer or supplier to define an appropriately designed application context namespace in which all the object types of interest to the application can be represented.

In order to facilitate effective loosely coupled IPC, applications should only ever have their namespace translated to the organisational namespace and never directly to the namespace of another application. At this time, I believe this is best achieved through broker portlets that must be placed on the same page (JSR-286 event messaging is page scoped).

Earlier in this article I used the example of customer references not matching up between applications. A solution would be to deploy a broker portlet whose job is to listen for events in a specific application’s namespace and translate them into the organisational namespace and, finally, into the namespaces of other applications. Once translated, the portlet can fire events that are indistinguishable from events originating from the destination application’s own portlets.
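
One possible shape of such a broker portlet is sketched below using the JSR-286 eventing API. The namespace URIs, event names and the ID lookup are all hypothetical; in a real implementation the lookup would go against the cross-reference index discussed earlier.

```java
import java.io.IOException;

import javax.portlet.Event;
import javax.portlet.EventRequest;
import javax.portlet.EventResponse;
import javax.portlet.GenericPortlet;
import javax.portlet.PortletException;
import javax.portlet.ProcessEvent;
import javax.xml.namespace.QName;

// Sketch: the broker receives an event in one application's namespace, translates
// the ID, and re-publishes the event in the organisational namespace. Application
// portlets never need to know about each other's namespaces.
public class CustomerSelectionBrokerPortlet extends GenericPortlet {

    private static final QName ORG_CUSTOMER_SELECTED =
            new QName("http://example.org/organisation", "customerSelected");

    @ProcessEvent(qname = "{http://example.org/housing}customerSelected")
    public void onHousingCustomerSelected(EventRequest request, EventResponse response)
            throws PortletException, IOException {
        Event event = request.getEvent();
        String housingLocalId = (String) event.getValue();

        // Translate the application-local ID into the organisational ID...
        String orgCustomerId = lookupOrgId("housingApp", housingLocalId);

        // ...and re-publish in the organisational namespace for other brokers/portlets.
        response.setEvent(ORG_CUSTOMER_SELECTED, orgCustomerId);
    }

    private String lookupOrgId(String applicationId, String localId) {
        // Placeholder: a reverse lookup against the cross-reference index sketched earlier.
        return "org-" + localId;
    }
}
```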

In fact, if federated search is implemented through JSR-286, there is no real need to pass anything but cross-references between portlets. The less information you pass, the better from a security point of view.

If only references are communicated, then the application’s own portlet(s) will be needed in order to display any information relevant to the search from its secure data store. This means that the “application security roles” assigned to the user dictate how much or how little information is shown. In addition, any change to the user’s “application security roles” in the middle of a search takes effect immediately.
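
A sketch of how an application’s own portlet might honour its application security roles when rendering details for the selected customer is shown below. The role names and the render parameter are hypothetical; the ID is assumed to have been stored as a render parameter by the portlet’s own event-processing method.

```java
import java.io.IOException;
import java.io.PrintWriter;

import javax.portlet.GenericPortlet;
import javax.portlet.PortletException;
import javax.portlet.RenderRequest;
import javax.portlet.RenderResponse;

// Sketch: what this portlet shows is decided entirely by its own application
// security roles; the only thing it receives from the federation is an ID.
public class BenefitsDetailPortlet extends GenericPortlet {

    @Override
    protected void doView(RenderRequest request, RenderResponse response)
            throws PortletException, IOException {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();

        // Hypothetical render parameter set by this portlet's own processEvent method.
        String orgCustomerId = request.getParameter("orgCustomerId");
        if (orgCustomerId == null) {
            out.println("<p>No customer selected yet.</p>");
            return;
        }

        if (request.isUserInRole("BENEFITS_CASEWORKER")) {
            out.println("<p>Full benefits record for customer " + orgCustomerId + "</p>");
        } else if (request.isUserInRole("READ_ONLY")) {
            out.println("<p>Summary benefits record for customer " + orgCustomerId + "</p>");
        } else {
            out.println("<p>You are not authorised to view this customer's benefits information.</p>");
        }
    }
}
```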

The final piece of the puzzle is how to initiate a search in the first place. The traditional concept of a single search box bears little relevance in a JSR-286 federated search. I would argue that it should be possible to initiate a search from any application portlet you have access to, and that once an object of interest has been selected within that portlet, the eventing described earlier should take place to propagate that selection across all portlets on the page being viewed. With such a design, it is possible to drill down through the information in a recursive fashion, much like how one browses the world wide web.

Perhaps the concepts discussed in this post would be better described as federated browsing, triggered by an initial application-centric search?

On to discussions!

I hope you have enjoyed reading my thoughts on this matter. I think it’s an interesting topic, and I’m very aware that I’ve only just begun to scratch the surface of it.

8 thoughts on “Achieving a federated single view of the customer”

  1. Hi Stian,

    A very good post on your learning, and I am looking forward to hearing more about this as you develop your thinking further.

    One of the key business benefits, over and above the security of information, is that as local councils look to commission services, this approach provides a framework that allows the systems being provided to sit outside of our core infrastructure without directly compromising the information held in those systems.

    Carl

  2. Pingback: Going under the hood of a revised web architecture – federated search « Carl's Notepad

  3. Federated searching is what I advocated six years ago. I may have used different words, but the objectives and outcomes should be comparable. What I like about the posting is that it has taken a relatively simple concept to the next stage of identifying some software technology (not my forté). You will find that the Establishment do not like federated solutions – my bids always were rejected. Too many people in power are hanging onto the mistaken belief that they can enforce common Unique Identifiers (UIDs). UIDs have never worked between agencies and never will. Hence we need a method of managing Multiple Identifiers – and this is what you seem to be offering.

    My recent work has been with a message hub – see http://wp.me/p14MGf-8m – and I see this as a stop-gap measure. It reinforces the absolute need for metadata. I go further and say it needs standards. The education sector has one in SIF (the Systems Interoperability Framework), and this could be extended to more public sector objects.

    The full flexibility of mapping any system to any other system is described in ISO 18876 (briefly mentioned in http://wp.me/p14MGf-7u and http://wp.me/p14MGf-1R). It is often criticised for being too simple and obvious; in which case why has it not been adopted and converted into software? The cynic in me thinks that there is too much invested in proprietary information systems and standards begin to break down monopolies.

  4. There seems to be a desire at present, within the local authority I work for, to explore federated solutions. This would probably not have been the case six years ago when you were advocating this, Lenand. In UK local government there was an emphasis on building eForms, as they were seen as a good way to standardise data capture. I consider this the opposite of a federated solutions approach, but it has increased demand for loose-coupling technologies, as these eForms were only a conduit to back-office applications. In addition, portals were quite ill-defined six years ago: the term was used frequently, in many different contexts, and often meant little more than bringing together hyperlinks to different information resources. The standardisation of portlets through JSR-168 provided some direction for portal development, which is now starting to yield real dividends. The advent of social networking has certainly helped too!

    I’ve been reading your blog, Lenand. Your views on identity are very interesting, and I think you’re very correct in what you’re saying. It’s something I’m going to need to give further thought to. I especially like how you describe identity as a sufficient level of trust to meet the requirements of a specific entitlement.

    This of course doesn’t translate into needing to provide repeat information (bad user experience!): through technologies such as message enrichment (often found in ESBs), the little information supplied (i.e. a single identity in a specific context) can be enriched with further information linked to that identity, yielding the other identities required in other contexts. Local government processes representing applications for services or entitlements often require cross-referencing information held in other systems (often, if you are in receipt of one service, you are eligible for another), which is accessible only after successful identification of the individual in that system (i.e. another identity).

    This is definitely a topic worth exploring further…

  5. Thanks for taking the time to explore further. It needs a caucus of like-minded people to demonstrate the case for federation, and an over-riding commitment to standards.

    I have started my blog with broad-based Information Governance http://wp.me/p14MGf-dD in which federation eventually has a place.

  6. I just stumbled across an article which describes the case for federation very precisely: http://www.ideaeng.com/tabId/98/itemId/145/Whats-in-a-name-Federated-Search.aspx

    The author of this article describes two definitions of federated search very clearly and suggests two new terms to better define the concept: Federated *Index* Search and Federated *Silo* Search.

    There appears to be great synergy between the description of the latter and that of my post. I would suggest that the portal standards I described in my blog post provide a solution to two of the challenges described by the article:

    * Translating user searches into the syntax of each underlying search engine / database
    * Mapping user credentials and access rights to each of the repository security models

    Where my post appears to differ in opinion is for the remaining “challenges”:

    * Combining results that may use radically different relevancy scores
    * Rendering the combined results in a pleasing and intuitive way
    * Providing page-at-a-time navigation
    * Combining Navigators from each result, such as faceted search, parametric search, taxonomies and auto-generate clusters

    I’m not sure what the general opinion on this is (which is largely why I wrote this comment!), but my personal opinion is that if you’re implementing a federated search, you should not try to combine the results from the individual searches being performed. As the article describes, this is highly complex (strong taxonomy integration and meta-information management are a must) and I believe ranking becomes ill-defined. I would suggest the better approach is simply to display multiple result lists, each rendered purely according to its federated system’s own relevance-ranking mechanism.

    Is a further term needed to describe this presentation method maybe?

  7. Given a reasonably short list of options, e.g. one page, the human brain is the most cost-effective processor. It is usually obvious which result is the best fit and which assertions of identity to make. My architecture included a function to give weighted assertions with a time stamp and the identity of the asserter. These assertions can be incorporated into federated searches.

  8. Pingback: The Portal Challenge: Integrating software developed in isolation « Metaphorm .blog
