I am writing this post having just delivered a pilot implementation of a portlet-driven web presentation architecture. It seems like a good time to reflect on the concepts that fed into its design to date, and on its long-term prospects.
I would like to propose modern portal and portlet technology standards as key to a realistic solution to the challenging problem of achieving a “single view” of the customer. The technique I will describe can be extended to any type of object whose information is spread across disparate applications over which the solutions architect often has very limited control.
There has been substantial discussion on this topic but, in my research, few approaches have emerged that offer a solution to “single view” when the systems involved were not specifically architected to fit into one.
This is especially important for local government organisations, which typically run a substantial number of third-party applications, often as a result of weak corporate governance over application procurement.
Often, in order to make information held in such applications searchable, the appropriate information is extracted from the application and indexed externally. This is the same approach that popular web search engines such as Google use to provide search capability. However, this search strategy is not suitable for organisations that take access security seriously.
Security can be split quite simply into two categories: infrastructure security and application security. Infrastructure security prevents unauthorised access to the environments in which information resides in data stores. This includes preventing direct access to enterprise data stores or control of servers. If your infrastructure is not secure, then it is irrelevant what degree of security you apply at the application level. However, it is outside the scope of this discussion.
Application security is, in my opinion, far more interesting. Applications dictate the security model through which information access is facilitated. They are the gatekeepers. It is important to consider that whenever information is retrieved from such an application, it leaves its secure environment. For this precise reason, the use of centralised indexing technology must be considered a security risk to sensitive enterprise information.
So how can we balance security with usefulness? Bring on federated systems!
Federated Search Strategy
Before discussing what value a federated search strategy can bring to an enterprise, it is important to understand what the term “federation” means.
“[A federation is] the formation of a political unity, with a central government, by a number of separate states, each of which retains control of its own internal affairs.” (sourced from www.dictionary.com)
This definition is highly applicable to solutions architecture, where there is a need to facilitate information and process governance.
Interestingly, as mentioned earlier, information governance (of a sort) is often forced upon the systems architect in the form of unmodifiable third-party applications, each promoting its own data store. Unfortunately it is not useful information governance, as it leads to information redundancy, which causes problems when providing a comprehensive view of the information held on a subject or customer.
However, this form of governance is useful in the area of security. The application, although potentially duplicating information held elsewhere, has been carefully designed (if it holds sensitive information) to prevent unauthorised access to the information it holds. Consequently the application would naturally define user types or roles which govern user permissions. I will hereafter refer to these as “application security roles”.
Federated search is all about allowing each application to govern its own search functionality, on its own data, according to its own rules.
So how does federated search relate to the single view of the customer, you might ask? In generic terms, a customer is just another object type modelled by the organisation’s applications.
An organisation has its own roles that make sense at the organisational level. These could be described as “organisational security roles”. These are rarely the same as the application roles, so a mapping scheme between the two sets is required.
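To make the mapping scheme concrete, here is a minimal sketch in plain Java. The class name `RoleMapper`, the role names and the application IDs are all hypothetical, and the fallback to a "guest" role when nothing is mapped is my assumption about how an unmapped user should be treated.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: maps organisational roles onto each application's own roles.
final class RoleMapper {
    static final String GUEST = "guest"; // assumed fallback role

    // organisational role -> (application id -> application role)
    private final Map<String, Map<String, String>> mapping = new HashMap<>();

    void map(String orgRole, String appId, String appRole) {
        mapping.computeIfAbsent(orgRole, k -> new HashMap<>()).put(appId, appRole);
    }

    // Resolve the application roles a user holds in one application.
    // If none of the user's organisational roles map to it, fall back to guest.
    Set<String> resolve(Collection<String> orgRoles, String appId) {
        Set<String> appRoles = new TreeSet<>();
        for (String orgRole : orgRoles) {
            Map<String, String> perApp = mapping.get(orgRole);
            if (perApp != null && perApp.containsKey(appId)) {
                appRoles.add(perApp.get(appId));
            }
        }
        if (appRoles.isEmpty()) {
            appRoles.add(GUEST);
        }
        return appRoles;
    }
}
```

The point of keeping the table keyed by organisational role is that each application's role vocabulary stays private to its own column of the mapping.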
Commonly, in a federated search system, a user enters search criteria once into a common component such as a form field in the header of a website. The system then delegates the search to all the participating applications. If the user has been authenticated previously, this delegation will include the specific application role identifiers derived through the mapping scheme discussed earlier. If no such authentication took place, then a “guest” role is used. The system then waits for all the results to become available before displaying a consolidated results list back to the user.
One challenge is deciding which fields are acceptable search criteria and providing appropriate form field(s) through which the user can express them. If you are searching for information on a particular customer, then it would make sense to provide search fields for information such as: first name, last name, and of course, customer reference.
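The delegate-then-consolidate flow described above could be sketched roughly as follows. `SearchableApplication` and `FederatedSearch` are hypothetical names of my own, not part of any portal standard, and I have assumed that each application returns its hits as a simple list of strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical contract that each participating application would expose.
interface SearchableApplication {
    String id();

    // The application applies its own security rules for the given role.
    List<String> search(String criteria, String applicationRole);
}

final class FederatedSearch {
    // Send the same criteria to every application, wait for all of them,
    // then merge the results into one list (in application order).
    static List<String> search(String criteria,
                               List<SearchableApplication> apps,
                               Map<String, String> roleByApp) {
        List<CompletableFuture<List<String>>> pending = new ArrayList<>();
        for (SearchableApplication app : apps) {
            // No mapped role for this application: search as guest.
            String role = roleByApp.getOrDefault(app.id(), "guest");
            pending.add(CompletableFuture.supplyAsync(() -> app.search(criteria, role)));
        }
        List<String> consolidated = new ArrayList<>();
        for (CompletableFuture<List<String>> result : pending) {
            consolidated.addAll(result.join()); // blocks until this application answers
        }
        return consolidated;
    }
}
```

Note that the slowest application dictates how long the user waits, which is one practical cost of consolidating everything into a single results list.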
Yes that would be very nice – but entirely unrealistic!
The reason is that not all applications participating in the federation will store a customer reference against a customer object. Each is likely to store a reference of some sort, but there is no guarantee that it will be in the same format as that of its counterparts in other applications in the federation. In addition, customer IDs within an application are often generated rather than entered by the administrator of that application.
In my opinion, the only way around this issue is to maintain an index of cross references externally to each application. This is unfortunate because, as mentioned earlier, this information then bypasses the application’s security model. Only the IDs should be kept in such an index to mitigate this risk.
On the flip side, one could consider such an index to be an application in its own right, which participates in a federation. After all, the cross referencing is in itself information, and access to it should be governed by user security roles.
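As a rough illustration of such an index, the sketch below stores nothing but identifiers and refuses to answer unless the caller holds a permitted role. `CrossReferenceIndex`, the application IDs and the ID formats are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical cross-reference index: stores identifiers only, never customer data.
final class CrossReferenceIndex {
    // "appId:localId" -> every known (appId -> localId) pair for the same customer
    private final Map<String, Map<String, String>> byKey = new HashMap<>();
    private final Set<String> rolesAllowedToRead;

    CrossReferenceIndex(Set<String> rolesAllowedToRead) {
        this.rolesAllowedToRead = rolesAllowedToRead;
    }

    // Record that the given per-application IDs all refer to the same customer.
    void link(Map<String, String> idsByApp) {
        Map<String, String> shared = new HashMap<>(idsByApp);
        for (Map.Entry<String, String> e : idsByApp.entrySet()) {
            byKey.put(e.getKey() + ":" + e.getValue(), shared);
        }
    }

    // Translate an ID from one application into the ID used by another.
    // Returns empty when the role may not read the index or no link is known.
    Optional<String> translate(String role, String fromApp, String fromId, String toApp) {
        if (!rolesAllowedToRead.contains(role)) {
            return Optional.<String>empty();
        }
        Map<String, String> shared = byKey.get(fromApp + ":" + fromId);
        if (shared == null) {
            return Optional.<String>empty();
        }
        return Optional.ofNullable(shared.get(toApp));
    }
}
```

Treating the role check as part of `translate` reflects the idea that the index is itself an application with its own security roles.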
Another issue is that your search criteria fields must refer to data items common to all the applications in order to ensure searches are comprehensive. Otherwise, only a subset of applications will return information on the customer. Again there is a way to work around this, but it involves storing meta-information about which fields exist in which applications, then running preparatory searches against at least one of those applications to return an ID that is common to all the federated applications (potentially requiring further cross-referencing indexes) and that can be used in the main search. This approach also needs precedence rules to decide which system to run the preparatory search in when more than one option is available. In other words, you can very easily end up with a complex implementation whose complexity grows rapidly whenever new searchable fields need to be added.
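To give a feel for the bookkeeping involved, here is a small sketch of just the "which application do I ask first?" part. `PreparatorySearchPlanner` and the field names are hypothetical, and I have assumed that registration order doubles as the precedence rule.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical planner for the preparatory search step.
final class PreparatorySearchPlanner {
    // application id -> searchable field names; insertion order is the precedence
    private final Map<String, Set<String>> fieldsByApp = new LinkedHashMap<>();

    void register(String appId, String... fields) {
        fieldsByApp.put(appId, new HashSet<>(Arrays.asList(fields)));
    }

    // Pick the highest-precedence application that understands the field.
    Optional<String> chooseApplication(String field) {
        for (Map.Entry<String, Set<String>> e : fieldsByApp.entrySet()) {
            if (e.getValue().contains(field)) {
                return Optional.of(e.getKey());
            }
        }
        return Optional.empty();
    }
}
```

Even this fragment needs its metadata revisited every time a field is added or an application changes, and it does not yet run the preparatory search or resolve the common ID.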
There must be a better way... and there is!
Federated search through modern portal technology
In recent years, portal standards have matured substantially with the advent of web 2.0 technologies and increased user expectation for seamless integration between applications.
Portals are often seen as simple content aggregation engines. However, they are capable of much more. With the release of the JSR-286 (a.k.a. portlet 2.0) standard in June 2008, portals became capable of acting as message brokers to enable IPC (Inter-Portlet Communication) without resorting to proprietary implementations, as was the case with JSR-168 (a.k.a. portlet 1.0).
It is common practice to make use of messaging technologies in middleware to benefit from
- reliable delivery of information
- asynchronous processing
- loose coupling of component parts/services
With the advent of JSR-286, we are starting to see the same benefits in the development of presentational components that can be mashed up to produce useful composite applications.
The means through which this is achieved is JSR-286 eventing for IPC. Portlets are now able to trigger named events in the scope of a namespace. These events can transport any object as long as it implements the Serializable interface.
I would suggest that Apache XMLBeans can provide ideal object types to send as payload. It fully implements the XML Schema (XSD) specification and hence can be used to model very complex information structures.
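The eventing model can be imitated in a few lines of plain Java, which may help to picture what the portal does on the portlets' behalf. To be clear, `PageEventBus` below is a stand-in I have made up and is not the javax.portlet API; in a real JSR-286 portlet the sending side would call `setEvent(QName, Serializable)` on its response and the receiving side would implement `processEvent(EventRequest, EventResponse)`. The namespace URIs used with it are placeholders.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.namespace.QName;

// Stand-in for the portal's event broker on one page; NOT the javax.portlet API.
final class PageEventBus {
    private final Map<QName, List<Consumer<Serializable>>> listeners = new HashMap<>();

    // A portlet declares interest in an event by its qualified name.
    void subscribe(QName event, Consumer<Serializable> listener) {
        listeners.computeIfAbsent(event, k -> new ArrayList<>()).add(listener);
    }

    // Deliver the payload to every listener registered for exactly this QName.
    // The same local name in a different namespace is a different event.
    void publish(QName event, Serializable payload) {
        List<Consumer<Serializable>> targets =
                listeners.getOrDefault(event, Collections.emptyList());
        for (Consumer<Serializable> listener : targets) {
            listener.accept(payload);
        }
    }
}
```

Because a `QName` compares both the namespace URI and the local part, two applications can each publish an event called `customerSelected` without ever receiving one another's events by accident.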
So how do namespaces relate to the concept of federated search? In a federation of applications, namespaces are going to be vitally important due to the high likelihood of identically named fields between systems. Namespaces give context. Two types of contexts should be appreciated:
- Organisational context
- Application context
It is the role of the solutions architect to define an appropriately designed organisational context namespace in which all the object types of interest to the organisation can be represented.
Likewise, it is the role of the application developer or supplier to define an appropriately designed application context namespace in which all the object types of interest to the application can be represented.
In order to facilitate effective, loosely coupled IPC, applications should only ever have their namespace translated to the organisational namespace and never directly to the namespace of another application. At this time, I believe this is best achieved through broker portlets that must be placed on the same page (JSR-286 event delivery is typically page scoped, although the exact scope depends on the portal implementation).
Earlier in this article I used the example of customer references not matching up between applications. A solution to this would be to deploy a broker portlet whose job is to listen for events in a specific application’s namespace and translate these into the organisational namespace and finally into the namespaces of other applications. Once translated, the portlet can create events that are indistinguishable from events originating from the destination application’s own portlets.
In fact, if federated search is implemented through JSR-286, then there is no real need to pass anything but cross references between portlets. The less information you pass, the better from a security point of view.
If only references are communicated, then the application’s portlet(s) will be needed in order to display any information from its secure data store relevant to the search. This means that the “application security roles” assigned to the user will dictate how much or how little information is shown. In addition, you get instant gratification should the user’s “application security roles” change in the middle of a search.
The final piece of the puzzle is how to initiate a search in the first place. The traditional concept of a single search box bears little relevance in a JSR-286 federated search. I would argue that it should be possible to initiate a search from any application’s portlet that you have access to and that, once an object of interest has been selected within that portlet, the eventing described earlier should take place to propagate that selection across all portlets on the page being viewed. With such a design, it is possible to drill down through the information in a recursive fashion, much like how one browses the world wide web.
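The two translation steps a broker performs could look something like the sketch below. It is plain Java rather than a real portlet, and everything in it is assumed for illustration: the `RefEvent` and `NamespaceBroker` classes, the `urn:example:...` namespaces, and the idea that the broker holds a table from organisational customer IDs to each application's local IDs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import javax.xml.namespace.QName;

// Minimal event value: a qualified name plus a reference (ID) as its payload.
final class RefEvent {
    final QName name;
    final String reference;

    RefEvent(QName name, String reference) {
        this.name = name;
        this.reference = reference;
    }
}

// Hypothetical broker: application namespace -> organisational namespace -> other applications.
final class NamespaceBroker {
    static final String ORG_NS = "urn:example:org"; // assumed organisational namespace

    // application namespace -> (organisational ID -> that application's local ID)
    private final Map<String, Map<String, String>> localIdByOrgId = new TreeMap<>();

    void register(String appNs, String orgId, String localId) {
        localIdByOrgId.computeIfAbsent(appNs, k -> new TreeMap<>()).put(orgId, localId);
    }

    // Step 1: lift an application event into the organisational namespace.
    Optional<RefEvent> toOrganisation(RefEvent appEvent) {
        Map<String, String> ids = localIdByOrgId.getOrDefault(
                appEvent.name.getNamespaceURI(), Collections.emptyMap());
        for (Map.Entry<String, String> e : ids.entrySet()) {
            if (e.getValue().equals(appEvent.reference)) {
                QName orgName = new QName(ORG_NS, appEvent.name.getLocalPart());
                return Optional.of(new RefEvent(orgName, e.getKey()));
            }
        }
        return Optional.empty();
    }

    // Step 2: lower an organisational event into one event per other application.
    List<RefEvent> toApplications(RefEvent orgEvent, String originNs) {
        List<RefEvent> out = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> app : localIdByOrgId.entrySet()) {
            String localId = app.getValue().get(orgEvent.reference);
            if (!app.getKey().equals(originNs) && localId != null) {
                QName appName = new QName(app.getKey(), orgEvent.name.getLocalPart());
                out.add(new RefEvent(appName, localId));
            }
        }
        return out;
    }
}
```

Going through the organisational namespace in both directions means that adding a new application requires one new mapping, rather than one mapping per existing application.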
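A sketch of what this means for the receiving portlet: it is handed a bare reference and decides for itself, at render time, what the current role may see. `CustomerPortlet`, the field names and the role names are hypothetical, and a real portlet would of course read from the application's data store rather than an in-memory map.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical application portlet: receives only a reference, reads its own store,
// and decides at render time which fields the current role may see.
final class CustomerPortlet {
    private final Map<String, Map<String, String>> store = new HashMap<>();        // local ID -> fields
    private final Map<String, Set<String>> visibleFieldsByRole = new HashMap<>();  // role -> field names

    void addRecord(String id, Map<String, String> fields) {
        store.put(id, fields);
    }

    void allow(String role, String... fields) {
        visibleFieldsByRole.put(role, new HashSet<>(Arrays.asList(fields)));
    }

    // The role is looked up on every render, so a role change takes effect
    // the next time the portlet is drawn.
    Map<String, String> render(String reference, String currentRole) {
        Map<String, String> record = store.getOrDefault(reference, Collections.emptyMap());
        Set<String> visible = visibleFieldsByRole.getOrDefault(currentRole, Collections.emptySet());
        Map<String, String> view = new TreeMap<>();
        for (Map.Entry<String, String> e : record.entrySet()) {
            if (visible.contains(e.getKey())) {
                view.put(e.getKey(), e.getValue());
            }
        }
        return view;
    }
}
```

Nothing sensitive travels in the event itself, so a portlet whose user lacks the right role simply renders less, or nothing at all.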
Maybe the concepts discussed in this post could be described better as federated browsing triggered by an initial application centric search?
On to discussions!
I hope you have enjoyed reading my thoughts on this matter. I think it’s an interesting topic, and I’m very aware that I’ve only just begun to scratch the surface of it.