One crucial element Solution Architects forget to focus on

A little while back I had the opportunity to be part of a discussion with a number of IT leaders from some of Australia’s largest companies across varying industries. The discussion echoed one of the key pain points with their IT solutions, one that I believe Solution Architects forget to focus on and think through sufficiently.

When looking at solutions, the focus is, as expected, predominantly on:

  • Addressing business requirements
  • Designing or selecting the right solution to address the business problems
  • Keeping the solution in line with the strategic objectives and roadmaps
  • Designing for the correct SLAs and solution robustness
  • Total cost of the solution and cost-effective ways to deliver it

However, how supportable the solution is, and how easy it is to diagnose failures once it is in production and the project team is long gone, does not get much attention. I admit that at first glance this seems trivial and less important than I make it out to be, but let me explain why it is more crucial now by focusing on 3 major characteristics of a modern IT environment.

1) Outsourcing

Almost every major company is looking at outsourcing its IT support or has been in an outsourced model for a number of years. In some cases, there can be multiple vendors from across the world providing IT support to a single company. Therefore, in a complex IT environment with many software and hardware components, you may end up with a mix of outsourced vendors supporting your solutions.

In many cases when a business-critical solution is impacted, IT vendors are more focused on their own SLAs and often blame other parties for the issues. This can turn into a finger-pointing exercise, and ultimately the customer pays the price.

Therefore, when your IT solutions with multiple stacks of technology components fail, how easy is it to pinpoint the failure? Can the technical teams diagnose a problem within minutes or hours? Does the solution itself provide capabilities to self-diagnose and assist?

Sometimes, depending on the use case, designing a solution that is easy to support and has smart self-diagnosing capabilities can be more effective than engineering multiple levels of technical redundancy.
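To make the idea of self-diagnosis a little more concrete, here is a minimal Python sketch. It assumes each component in a solution can expose a simple health probe, and that we know which team or vendor owns it. The component names, owners and probes below are hypothetical and only for illustration; a real solution would plug in its own checks.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Component:
    name: str                   # hypothetical component label
    owner: str                  # hypothetical support team or vendor
    probe: Callable[[], bool]   # returns True when the component is healthy


def diagnose(chain: List[Component]) -> Optional[Component]:
    """Return the first unhealthy component in dependency order, or None."""
    for component in chain:
        try:
            healthy = component.probe()
        except Exception:
            # A probe that raises is treated as a failed component.
            healthy = False
        if not healthy:
            return component
    return None


def report(chain: List[Component]) -> str:
    """Summarise where the failure is and who owns it."""
    failed = diagnose(chain)
    if failed is None:
        return "All components healthy"
    return f"Failure at '{failed.name}', owned by {failed.owner}"


# Hypothetical three-tier solution supported by different parties.
chain = [
    Component("database", "Vendor A", lambda: True),
    Component("integration layer", "Vendor B", lambda: False),
    Component("web front end", "In-house team", lambda: True),
]
print(report(chain))
```

Listing the components in dependency order means the first failing probe points at the most likely root cause and at the party responsible for it, which is exactly the information that gets lost in a finger-pointing exercise.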

2) Lack of control

In the past, companies had more change control over their IT solutions. New software updates or enhancements were usually released only after multiple rounds of testing and QA. More importantly, companies controlled when the updates were released, which meant business-critical periods were shielded from any change impacting the business.

However, with the adoption of many cloud solutions, this control is no longer within the organisation. The solution providers deploy updates to their solutions, and in some cases you only find out the hard way, when your solution suddenly breaks due to a change. One of the CIOs from the discussion said they had to ring up the cloud vendor’s local Australian-based support team, which was not aware that updates had been deployed.

Therefore, in a solution with multiple cloud and non-cloud components that integrate into your other business-critical systems, how can you quickly identify when a cloud vendor’s update has had an impact? Are your solutions designed to be resilient to cloud updates and releases that can happen when you least expect them?
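One simple way to spot an unannounced vendor release is to record the version each cloud service reports and compare it against the last known value. The Python sketch below assumes the vendor exposes some kind of version or build identifier, which is not always the case; the service names and version strings are hypothetical.

```python
from typing import Dict, List


def detect_vendor_changes(last_known: Dict[str, str],
                          current: Dict[str, str]) -> List[str]:
    """Compare the versions seen now against the versions recorded earlier."""
    changes = []
    for service, version in sorted(current.items()):
        previous = last_known.get(service)
        if previous is None:
            changes.append(f"{service}: first seen at {version}")
        elif previous != version:
            changes.append(f"{service}: {previous} -> {version}")
    return changes


# Hypothetical snapshot taken yesterday versus the one taken today.
last_known = {"crm": "2024.1", "payroll": "8.3"}
current = {"crm": "2024.2", "payroll": "8.3"}
print(detect_vendor_changes(last_known, current))
```

Run on a schedule and tied into alerting, a check like this lets the support team correlate a new failure with a vendor release within minutes rather than finding out from a phone call.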

3) Driving costs down

Many companies are going through cost reduction exercises and IT budgets seem to be reducing every year. This is putting a lot of pressure on solutions and projects to implement the most cost-effective solution from a project budget perspective. Typically in larger companies, solutions have a shelf life of 3-7 years depending on the industry. Over the life of the solution, the support expenses can therefore easily outweigh the implementation and licensing costs. This is another key reason why companies suddenly discover their run costs are too high and start looking for outsourced support options.

Alternatively, we could design a solution with smarter self-diagnosis or failover capabilities, so that only a minimal number of staff are required to keep the lights on. This type of solution would cost more to implement initially; however, the total cost of ownership (TCO) over its life is likely to be far lower than that of a solution that is cheap to deliver but expensive to support.
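The arithmetic behind this is straightforward. The Python sketch below uses purely hypothetical figures, chosen only to illustrate the shape of the comparison over a 5-year life; they are not drawn from any real project.

```python
def total_cost_of_ownership(implementation: int,
                            annual_support: int,
                            years: int) -> int:
    """Implementation cost plus support cost over the life of the solution."""
    return implementation + annual_support * years


# Hypothetical figures: cheaper to deliver, but costly to support.
cheap_to_deliver = total_cost_of_ownership(1_000_000, 600_000, 5)

# Hypothetical figures: dearer to deliver, but far cheaper to support.
easy_to_support = total_cost_of_ownership(1_500_000, 250_000, 5)

print(cheap_to_deliver)   # 4000000
print(easy_to_support)    # 2750000
```

Under these assumed numbers the solution that costs 50% more to implement ends up well ahead over five years, because the support cost is paid every year while the implementation cost is paid once.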

Therefore, when selecting or designing your enterprise solutions, pay particular attention to the level of effort needed to maintain and support the solution. This is where the solution architect needs to ensure projects focus on the long-term effects and the true cost of support to the enterprise, not just the immediate cost-effectiveness of delivery.