Sunday 25 September 2011

How to design? - the data

How to design? This is the basic question for creating reliable and well-working OSB components. What are those repetitive steps which the designers or the developers must do?

OSB is a communication platform so the data (requests / responses) are very important. The main task of  the OSB is to deal with the data and not with the business logic.
Usually we have to process the incoming data before sending to the other systems or third parties. These process steps are called the message flow. I emphasize the purpose of these processes are to handle the data and not to implement business workflows.
We have to know the answer of the next question as well: what will we do if an error occurs? We have got lots of choices what to do in these cases but the circumstances (business requirements, used protocols, ...) are different and limit our possibilities.
OSB is a gateway among systems it might slow down the communication and cause bottlenecks. You should be concerned with the performance issues as well.
It is very important to check what is happening inside the OSB. The technical and business users want to monitor the events or just become aware of the errors or the more important happenings.

These are the key factors what You should be concerned with:

  - input / output data
  - message flows
  - error handling
  - performance
  - monitoring

Input / output data (requests / responses)


The handling of the requests or responses is different in case of
  - business services
  - proxy services.

Usually the data structure of requests of the business services is determined as the services which must be connected to OSB have been working already. So You might have less tasks to do. You have to create the business service, set the protocol and the structure of input / output data (in case of webservice just use the XSD in the WSDL).
In some cases (e.g. attachment of email) You can't set exactly the data structure for the business service (anything can be sent in attachment), You have to find out and validate it in the message flow (e.g. XSD validation). In these cases You must check the data because You mustn't encumber the source system unnecessarily.
Sometimes You need to create a request or a response message. If You want to convert an asynchronous service to a synchronous one inside the OSB then You need to create a virtual response because the source system won't send a response back. (e.g. webservice → email)

If You might receive unstructured data then You can't validate it simply. Why? The best example is the attachment of emails for this. You can receive ANYTHING in an email. One part of the incoming email is incorrect (for example a JPG file was sent instead of XML) and the other part seems to be good.
If the same data is being sent to an email address then there is no problem. But what if the partners send orders and complaints to the same email address? And what's more there are more versions of the orders and one of the partners send version 1.0, the other partner sends version 2.0 and so on. In these cases You can't validate so simply.
To consider the different business data (orders, complaints) You have to validate the attachment using one of the XSD's and if an error happens You have to use an other schema and so on. If there are more versions of the incoming data and there is version information in the XML (e.g. <order version="1.0">) then You are able to choose the correct schema to validate the XML. If You haven't got version information then You have to use the strictest schema to validate and the second strictest one and so on.

Proxy service is a new service created by You. In this sense You have got freedom to define the input / output data. But this is not true entirely. If You implement an atomic service usually You have to use the requests / responses format of the business service. In the case of the domain proxy services the input / output data are similar to the data of business service but this is not certain. You have to process the input data and to transform them so the difference between the requests / responses  of the business and proxy service may be huge. This is determined by the business requirements. And if we are talking about enterprise services we have to work out the structure completely ourselves.

To allow more request format for an operation is a bad design pattern. I mean You can use the 'choice' XSD element which let You use different request formats. For example:
<xss:choice>
   <xs:element name="mouse" type="mouse"/>
   <xs:element name="cat" type="cat"/>
</xs:choice>
In this case the request may be this one:
<soap:Envelope ...>
   <soap:Body>
       <mouse>
           ...
       </mouse>
   </soap:Body>
</soap:Envelope>
and this one as well:
<soap:Envelope ...>
   <soap:Body>
       <cat>
           ...
       </cat>
   </soap:Body>
</soap:Envelope>
It seems to be simpler to use the 'choice' XSD element but it is a better and cleaner solution to create an operation for the cats and an other one for mice.

I suggest to put the generally used schemas to a separate project which is for common components. You mustn't modify these schemas even if You should do it because of a proxy service. If You need to modify one of them then copy it to your project and modify that one not the original.

2 comments:

  1. excellent stuff Victor.
    Incidentally the plural of "mouse" is "mice", not "mouses". This is the only error I could find in your article :o)

    ReplyDelete
  2. Thanks Pierluigi, i will fix this! :)

    ReplyDelete