Tracking solution architecture risk is a critical activity for a great solution architect. In prior articles, we provided a primer on solution architecture risks and strategies for identifying them. Now we are going to delve into the solution architecture risk register.
First, and this is important, while all risks are worth tracking, the risks that are the responsibility of a solution architect are the risks related to architecture decisions. If you refer back to the primer on solution architecture risks, you may remember that we defined risk as uncertain outcomes with negative consequences. Therefore, when a solution architect makes an architecture decision and the outcome is not certain, he or she should identify and track the potential negative consequences of that decision.
Solution Architecture Delivery Risks
In the risk primer, we introduced the concept of delivery vs. production risks, separating the risks leading up to production deployment (delivery risks) from those that exist once the solution is live in production (production risks). There was a method to that madness! Both can result from solution architecture decisions. Still, the delivery manager (project manager, scrum master, etc.) who is responsible for the delivery is the best person to manage the delivery risks. As a result, our recommended best practice for delivery risks is to hand them off to the delivery manager.
Solution Architecture Production Risks
If you recall from the primer, production risks are the category of risks that exist once the solution has gone live. In an organization that has formal Enterprise Risk Management, these would be tracked and managed in an operational risk register. If that does not describe your organization, then they can be tracked in the architecture document for the design to which they apply. Position them near the top of the document so they are always top of mind for you and your audience.
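The routing described in the two sections above can be sketched as a small decision function. This is only an illustration: the names `RiskCategory` and `tracking_location`, and the returned labels, are hypothetical and not part of any standard or tool.

```python
from enum import Enum


class RiskCategory(Enum):
    """The two categories introduced in the risk primer."""
    DELIVERY = "delivery"      # exists up to production deployment
    PRODUCTION = "production"  # exists once the solution is live


def tracking_location(category: RiskCategory, has_formal_erm: bool) -> str:
    """Return where a risk should be tracked (hypothetical labels)."""
    if category is RiskCategory.DELIVERY:
        # Delivery risks are handed off to the delivery manager.
        return "delivery manager"
    if has_formal_erm:
        # Formal Enterprise Risk Management: use the operational register.
        return "operational risk register"
    # Otherwise, keep the risks near the top of the architecture document.
    return "architecture document"
```

For example, `tracking_location(RiskCategory.PRODUCTION, has_formal_erm=False)` returns `"architecture document"`.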
Solution Architecture Risk Register Fields
Risk: What could happen that has a negative consequence? Describe the risk that an architecture choice has created.
Business Impact: Why do we care? What will the impact on the organization be if the risk is realized, i.e., it actually happens?
Severity: Identify the severity of the risk on a scale: negligible, marginal, critical, catastrophic. Provide rubrics for each value appropriate for your business or organization. For example, if you are a bank, a non-financial impact on a single customer might be negligible, while unrecoverable financial transaction errors for all customers might be catastrophic. You can also use an alternative scale, such as high, medium, low. Whatever scale you use, you should always provide a rubric.
Likelihood: Identify the likelihood of the risk occurring on a scale: rare, unlikely, possible, likely, certain. As with severity, it can be helpful to provide a rubric. Rubrics could be based on likelihood percentage, the expected frequency of occurrence, or even a comparison with some event with which the audience would be familiar. For example, a meteor hitting the data center is rare, but a power brownout is likely. As with severity, you can also use an alternative scale, but you should provide a rubric.
Response: The narrative for handling the risk. All the different ways to treat risk could fill an article, a book, or a Wikipedia article. Some common ones for architecture risks are:
- Avoidance. Do something to remove or reduce the risk; in this case, change the design.
- Acceptance. The “do nothing” response. Someone with appropriate authority agrees that the risk is worth the reward!
- Mitigation. Put additional processes, systems, or controls in place to reduce likelihood or severity of the risk.
- Transference. Execute an agreement that transfers responsibility for the risk from one organization to another, e.g., to a vendor.
- Monitoring/Preparation. Put processes or systems in place to identify when the bad outcome occurs and have a plan for when it happens.
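The fields described in this section could be captured in a small data structure. The sketch below is a minimal illustration, not a prescribed format: the class and function names are hypothetical, the numeric values only encode the order of each scale, and the sorting rule (severity first, then likelihood) is an assumption rather than something this article defines.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Severity(IntEnum):
    NEGLIGIBLE = 1
    MARGINAL = 2
    CRITICAL = 3
    CATASTROPHIC = 4


class Likelihood(IntEnum):
    RARE = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    CERTAIN = 5


@dataclass
class RiskEntry:
    risk: str                  # what could happen
    business_impact: str       # why we care
    severity: Severity         # rated against your own rubric
    likelihood: Likelihood     # rated against your own rubric
    response: list[str] = field(default_factory=list)  # handling narrative


def most_pressing_first(entries: list[RiskEntry]) -> list[RiskEntry]:
    """Order entries by severity, then likelihood (assumed ordering rule)."""
    return sorted(entries, key=lambda e: (e.severity, e.likelihood), reverse=True)
```

Using `IntEnum` keeps the scale values comparable, so a register can be sorted or filtered without a separate lookup table. The rubric for each value still has to be written down for the audience; the code only records the rating.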
Sample Solution Architecture Risk Register Entry
|Risk||Customer information in the Operational Data Store (ODS) will not match the admin platform if the data warehouse daily loading batch cycle does not complete within the processing window.|
|Business Impact||Customer-facing systems utilizing the ODS (call center, web, mobile) will report incorrect balances.|
|Response||1. Prioritize ODS data loads that include customer-accessed data. 2. Add a “last updated” field to customer-facing systems and display a warning if the data is more than 24 hours old. 3. Monitor critical load processes and define manual procedures to load.|
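The second response in the sample entry, a "last updated" warning, could look something like the following. This is a hedged sketch only: the function name, the message wording, and the use of timezone-aware UTC timestamps are assumptions; only the 24-hour threshold comes from the entry.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Threshold taken from the sample register entry.
STALENESS_THRESHOLD = timedelta(hours=24)


def staleness_warning(last_updated: datetime,
                      now: Optional[datetime] = None) -> Optional[str]:
    """Return a warning if the data is more than 24 hours old, else None.

    Both timestamps are assumed to be timezone-aware.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    age = now - last_updated
    if age > STALENESS_THRESHOLD:
        hours = int(age.total_seconds() // 3600)
        return (f"Warning: this information was last updated {hours} hours "
                "ago and may be out of date.")
    return None
```

A customer-facing system would call this with the timestamp of the most recent successful ODS load and show the message next to the balance when one is returned.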
Tracking Solution Architecture Risks
Now what? Well, that depends. It depends on how formal and mature risk processes are within your organization. If processes exist, we recommend aligning with those existing processes for approving and managing operational risks. If such processes are not in place, tracking these risks is still worth the effort. Capturing the risks, even as an embedded solution architecture risk register in the architecture products and processes, enables better fact-based discussions, which result in better design outcomes, and provides a trigger for improvement in the future when further design occurs.
No More Risk?
Close! Regarding risk articles, anyway. We may be done with the solution architecture risk register, but we are not yet done with solution architecture risk. We have one more article left in us on how to use risk as a lever for better architecture.
Risk is just one of the topics included in our solution architecture training curriculum. Drop us a line by email, call (401) 340-1400, or contact us to learn more. Like the tagline says, our reputation is our success. If we can do great things for you, we will. If we can’t, we’ll say so.