What is quality? What is testing? And how can we make stakeholders and team members care? These may be the basic questions you have when wondering about a methodical approach for quality engineering of Information Technology (IT) systems.
A brief introduction is given in the video below by Rik Marselis.
In today's world, many people and organizations rely on IT systems; many things would not be possible without them. So we need to be able to trust that these IT systems work well enough to support business processes. This means that their quality must be at a level that fits the purpose and delivers business value.
Definition - Quality
Quality is the totality of features and characteristics of a product or service that bear on its ability to satisfy stated or implied needs.
The right quality can best be achieved by an integral approach. We call this quality engineering.
Definition - Quality Engineering
Quality Engineering is about team members and their stakeholders taking joint responsibility to continuously deliver IT systems with the right quality at the right moment to the businesspeople and their customers. It is a principle of software engineering concerned with applying quality measures to assure the quality of IT systems.
Core to quality engineering is people taking joint responsibility to deliver the right quality by applying various quality measures. These quality measures may be “preventive” (such as user story refinement or pair programming), “detective” (such as static testing and dynamic testing) and “corrective” (such as fixing problems, not only in the product but also in process and/or people’s skills).
In the DevOps IT delivery model, there is a continuous focus on quality engineering. In fact, DevOps teams commonly try to implement "continuous everything", which means that they strive to automate as many tasks and activities as possible. This leads to, among other things, Continuous Integration and Continuous Deployment (commonly abbreviated to CI/CD).
To implement continuous quality engineering, of which continuous testing is a part, DevOps teams must use state-of-the-art tools powered by artificial intelligence and machine learning. This will enable them to deliver quality at speed, for example by forecasting quality problems and solving them before anyone experiences a failure.
If we want to know whether our IT system indeed satisfies the needs, we must measure its quality. We need to define quality indicators and measure them. Most of this measuring is a testing task. Since in Scrum or DevOps quality is the responsibility of the whole team, measuring these indicators can be done by any team member who has the required quality engineering skills.
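As a minimal sketch, defining and measuring quality indicators could look like this; the indicator names and target values below are invented for illustration, not a prescribed set:

```python
from dataclasses import dataclass

@dataclass
class Indicator:
    """One quality indicator with an expected and an observed level."""
    name: str
    target: float      # the level the stakeholders expect
    measured: float    # the level observed by testing or monitoring

    def is_met(self) -> bool:
        return self.measured >= self.target

# Hypothetical indicators for a release candidate.
indicators = [
    Indicator("unit-test pass rate (%)", target=100.0, measured=100.0),
    Indicator("requirements covered (%)", target=95.0, measured=97.0),
    Indicator("usability score", target=90.0, measured=85.0),
]

for ind in indicators:
    status = "OK" if ind.is_met() else "ATTENTION"
    print(f"{ind.name:30s} {ind.measured:6.1f} / {ind.target:6.1f}  {status}")
```

Any team member with the required skills could maintain such a list; the point is that "quality" becomes a set of measurable indicators rather than a gut feeling.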
Definition - Testing
Testing consists of verification, validation and exploration activities that provide information about the quality and the related risks, to establish the level of confidence that a test object will be able to deliver the pursued business value.
Testing is one of many quality assurance measures, although it was long seen as the only one, as illustrated below. Since the start of Information Technology, the quality of information systems has been an often-discussed subject.
When faults exist in IT products (ranging from mistaken requirements specifications to wrong program code and broken hardware) failures may occur, and people will not get the services they need. Testing helps detect faults and failures so that they can be fixed before they cause problems for the users. And as part of all quality measures, testing supports the prevention, identification and elimination of waste.
Testing should address both functional and non-functional quality aspects. Read more about it here.
Monitoring is another quality assurance measure. Monitoring observes indicators of an IT system that is in live operation. Testing observes indicators primarily before an IT system goes live.
Do not implement a "fixing phase"
Until the 1980s, the typical approach of IT teams coping with insufficient quality was to try to find all faults by testing and then fix them. To expose this fallacy, the term "fixing phase" helped a lot [Weinberg 2008]: it made people aware that testing cannot insert quality at the end, nor can a team fix every problem at the end.
In the 1980s and 1990s people discovered that it is impossible to detect all faults in a complex system; there will never be a perfect system by testing at the end and trying to find and fix all faults. Therefore, a balanced set of quality measures throughout the entire lifecycle needs to assure quality.
Requirements, Design, Development and Operations refer to tasks, not to phases or stages, and in a DevOps context these tasks may be performed simultaneously. And since according to the third DevOps principle of DASA [DASA 2019] the team has an end-to-end responsibility, quality assurance and testing must be implemented as a continuous focus throughout the lifecycle.
Needless to say, not having a fixing phase applies to every kind of IT delivery model, whether you work in a sequential (e.g. waterfall), a high-performance (e.g. DevOps) or a hybrid (e.g. SAFe) way (for more information see the section about the IT delivery models).
By implementing quality assurance and testing throughout the entire IT delivery lifecycle, we reach the often-praised continuous testing.
Keep in mind: not all faults need to be fixed. People want continuous insight into the quality level and the remaining risks, and, when necessary, they improve quality throughout the lifecycle. If this means some faults remain, that is fine as long as the stakeholders know it and have the necessary workarounds. Faults that need to be fixed later can be put on the team's backlog.
An insurance company set up a new insurance system for a new product. Since this company was the first in the market with this product, they expected a great market share. When the deadline for going live approached, the new system could create new entries and delete earlier ones; however, updating existing entries was impossible. This looked like a problem: customers of the new product could not move house or change the insured artifact. Nevertheless, the insurance system did go live, based on the expected market share and the business value expected from that market share.
Part of the realized revenues was used to mitigate the problem. The entries of customers moving house were deleted from the system, and new entries based on their new address were created. Extra staff was hired to make sure that no automatically generated letters regarding the cancellation or the new insurance were sent out.
Thus, the business value was not as high as initially projected, but it was still quite a bit higher than the value would have been if the insurance company had not been the first in the market.
Measuring quality provides information for establishing confidence
Over the years, people working in IT projects and operations learned that the level of quality and the number of remaining risks are not directly related to the decision whether an information system should go live or not.
The quality level is determined by measuring various indicators, for example based on quality characteristics. When the quality matches the expectations, this confirms that the quality for a certain aspect is OK, which contributes to the confidence that the IT system will be able to deliver the pursued business value.
A detected anomaly indicates that the quality may not be good enough. After analyzing the cause of the anomaly, the conclusion may be that there is a fault. Such a fault can be fixed or, when the quality level is still good enough despite it, registered as a known fault.
Sometimes a low level of quality (i.e. many known faults and/or remaining risks) did not stop stakeholders from deciding to use an IT system, because they could still achieve their business value. In other cases, an information system contained no faults at all, but because it did not contribute to the business value, it still was not used ("a good system in the wrong process").
Nowadays, business value is the goal of all IT activities. And the activities related to quality and testing aim to establish the level of confidence so that the stakeholders can decide whether or not the IT system is likely to provide the pursued business value. Based on this notion of confidence, the people involved will decide to put the IT system in live operation or not.
Information on quality and risks establishes confidence
Information about the quality and the related quality risks (also called product risks) is the key input for stakeholders to establish their confidence.
Definition - Quality risk
A quality risk is a specific chance that the product fails in relation to the expected impact if this occurs. The chance of failure is determined by the chance of faults and the frequency of use. The impact is related to the operational use of the product.
When determining the risk level, a team performs some sort of quality risk analysis, as explained in Building Block "Quality risk analysis & test strategy". The chance of faults depends on the IT characteristics, e.g. when new technology is used, the chance of faults is higher than with old technology. The frequency of use of an app on a smartphone is much higher than that of an operations function that adjusts some system parameters, and with a high frequency of use, the risk of failures is higher. The impact can relate to many negative consequences; often, damage to the corporate image of the responsible organization is seen as a high impact.
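The reasoning above can be sketched in a few lines, assuming a simple ordinal scale (1 = low, 3 = high) for each factor; multiplying the factors is an illustrative choice for combining them, not a prescribed formula:

```python
def risk_level(chance_of_faults: int, frequency_of_use: int, impact: int) -> int:
    """The chance of failure is driven by the chance of faults and the
    frequency of use; the risk level combines that chance with the impact.
    All factors on an ordinal scale from 1 (low) to 3 (high)."""
    chance_of_failure = chance_of_faults * frequency_of_use
    return chance_of_failure * impact

# A smartphone app built on new technology, used many times a day,
# with high damage to the corporate image when it fails:
app_risk = risk_level(chance_of_faults=3, frequency_of_use=3, impact=3)

# An operations function that adjusts system parameters: proven
# technology, rarely used, moderate impact:
ops_risk = risk_level(chance_of_faults=1, frequency_of_use=1, impact=2)

print(app_risk, ops_risk)  # prints 27 2
```

The resulting numbers only serve to rank risks against each other, so the team can spend its testing effort where the risk is highest.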
Testing measures the indicators related to the pursued value
The process of quality assurance and testing aims at gathering and reporting all the information that stakeholders need to establish their level of confidence in the business value of an IT system. This is done by setting objectives related to business value and defining indicators for these objectives. The indicators are measured by testing and other quality assurance measures. For example, testing can measure the behavior of the IT system when used by specific target audiences, which provides information about usability. Monitoring will measure the behavior in live operation, for example to see whether the use of memory tends to go beyond limits, which calls for corrective action.
The quality risks, their risk level and their possible impact, are determined in a quality risk analysis. These risks are part of the indicators that will be measured by testing. Together with measuring other indicators, testing activities will supply the information that stakeholders need to establish their confidence that the pursued business value can be achieved. Read more about the VOICE model.
Information about quality and risks is shared through some kind of reporting, either in a document, on a team board, on an automated dashboard or by oral communication (preferably only to support other communication).
To further support establishing their level of confidence, stakeholders often want to witness testing activities or do some testing of their own.
Testing consists of verification, validation and exploration
Still you may wonder: what exactly is testing? We provided our definition of testing at the beginning of this page. In this definition you see a number of terms that may require explanation. Testing is generally referred to as one activity, but when you look closer, testing is a combination of verification, validation and exploration activities.
Verification checks whether the test object complies with specified requirements (for example from a user story). Validation tests whether the test object is fit for purpose, that is, whether it fulfills the user needs and contributes to the business value. Because verification and validation can only focus on what is known, they have to be complemented with exploration to gather information about the as-yet-unknowns.
Verification focuses on "Are we building the IT system right?". Validation focuses on "Are we building the right IT system?" Exploration focuses on "How could the IT system be (mis)used?" Together, verification, validation and exploration provide a complete view on the quality and risks of the IT system.
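To make the distinction concrete, here is a minimal sketch around a hypothetical discount function; the function and its requirement are invented for illustration:

```python
def discount(order_total: float) -> float:
    """Hypothetical test object. Specified requirement:
    orders of 100 or more get a 10% discount."""
    return order_total * 0.9 if order_total >= 100 else order_total

# Verification -- are we building the IT system right?
# Check the behavior against the specified requirement.
assert discount(100) == 90.0
assert discount(99) == 99

# Validation -- are we building the right IT system?
# Check fitness for purpose: a customer should never be
# charged more than the order total.
assert discount(250) <= 250

# Exploration -- how could the IT system be (mis)used?
# Probe an input the specification says nothing about. The outcome
# is a question to raise with the product owner, not a pass/fail verdict.
print(discount(-50))
```

Note how the exploration step does not assert anything: its value lies in surfacing behavior nobody specified, so the team can decide whether it is a fault.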
Definition - Verification
Verification is confirmation by examination and through provision of objective evidence that specified requirements have been fulfilled.
Definition - Validation
Validation is confirmation by examination and through provision of objective evidence that the demands for a specific intended use have been fulfilled.
Definition - Exploration
Exploration is the activity of investigating, and establishing the quality and risks of the use of, an IT system through examination, inquiry and analysis.
Why, you may wonder, does exploration add benefit beyond verification and validation? Where verification and validation address things we know about, exploration also investigates the unknowns.
In July 2019, Michael Bolton said on Twitter: "Exploration is not something that we tack on to the end of a test cycle. It is fundamental to how we develop software — yes, to programming, too, not just to testing. If we want to do excellent formal work, it has to come from informal, self-directed, exploratory work." [Bolton 2019]
Testing is about providing different levels of information
The tragedy of testing is that most stakeholders rarely get to see any product created by testing activities (such as test plans, test scenarios, test logs and other testware). In general, most stakeholders are only interested in one thing that testing provides: the information needed to decide whether they can take the IT system to live operation (the go/no-go decision). In other words: all stakeholders want is some kind of report or dashboard. But to fill such a source of information, many activities need to be done and a lot of testware needs to be created.
It is very important to differentiate the information towards different audiences. In the following figure, the brightly colored boxes show three levels of reporting and how they relate to the various layers of expectations, specifications and testware that are relevant to creating these levels of reporting.
[Note: In this figure, the shaded boxes are intended to be illustrative only. They are shown and explained in detail in the section "Test Design", where it is also made clear why the levels of reporting are read from right to left.]
These three levels (from right to left) range from detailed to high-level. The people in the IT delivery team will need very detailed information to know about the quality of the new and changed objects they delivered and to investigate anomalies in the IT system and take corrective action where necessary. The product owner, businesspeople, project-level managers etc. will need overview reports to keep track of the status and progress which enables them to support the teams to adjust priorities or ways of working whenever necessary. High-level management will only need very brief information to supervise the IT delivery process or to support go/no-go decisions.
When multiple teams work together on one IT system, they will need to take care of aggregating information. Especially the overview reports and high-level reports will need to contain information provided by multiple teams to give an overview of the status for the end-to-end IT system.
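A minimal sketch of such aggregation, with invented team names and numbers, could look like this:

```python
# Detailed level: each team keeps its own test results.
team_results = {
    "team-payments": {"passed": 180, "failed": 4, "open_risks": 2},
    "team-frontend": {"passed": 95,  "failed": 0, "open_risks": 0},
    "team-backend":  {"passed": 210, "failed": 1, "open_risks": 1},
}

# Overview level: one aggregated line for the end-to-end IT system.
totals = {
    key: sum(result[key] for result in team_results.values())
    for key in ("passed", "failed", "open_risks")
}
print(f"End-to-end: {totals['passed']} passed, "
      f"{totals['failed']} failed, {totals['open_risks']} open risks")

# High-level: a single verdict to support the go/no-go decision.
verdict = ("GO candidate"
           if totals["failed"] == 0 and totals["open_risks"] == 0
           else "attention needed")
print(verdict)
```

The three print levels mirror the three levels of reporting: detailed data stays with the teams, while aggregated figures and a single verdict travel upward to overview and high-level stakeholders.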
Static and dynamic testing
Testing generally consists of two main groups of activities.
The first is static testing, also known as reviewing. This can be done without the test object actually running, for example by reviewing requirements or design documents, source code, user manuals, project and test plans and test cases.
In high-performance IT delivery models, such as Scrum and DevOps, a major part of static testing activities is done during the refinement of user stories, so it is not implemented as a separate activity, but it still requires specific focus. Further, static analysis of code is done to comply with coding standards. Static testing is supported by tools as much as possible. Actually, nowadays there are tools that not only do static analysis but are also capable of refactoring the code based on standard refactoring patterns. Still, in general, static testing is a combination of human intelligence and tool power.
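As a small illustration of tool-supported static testing, the following sketch uses Python's standard-library ast module to examine source code without running it; the coding-standard rule checked here (no bare `except:`) is just one example of such a rule:

```python
import ast

def find_bare_excepts(source: str) -> list[int]:
    """Return the line numbers of bare 'except:' handlers,
    a common coding-standard violation."""
    tree = ast.parse(source)
    return [node.lineno
            for node in ast.walk(tree)
            if isinstance(node, ast.ExceptHandler) and node.type is None]

code_under_review = """
def load_config(path):
    try:
        return open(path).read()
    except:
        return None
"""

print(find_bare_excepts(code_under_review))  # prints [5]
```

The program under review is never executed; the tool reasons purely about its structure, which is exactly what makes static testing possible so early in the lifecycle.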
Definition - Static testing
Static testing is testing by examining products (such as requirements specifications, manuals or source code) without programs being executed.
When we run a test, in other words, execute the test object (which may vary from one single program module to a full end-to-end business process across multiple organizations), that is what we call dynamic testing. Dynamic testing can be done manually but for many varieties of testing, support by test automation tools is highly recommended.
Definition - Dynamic testing
Dynamic testing is testing by execution of the test object, that is, the running of an application.
Always use a mix of static and dynamic testing
In a good setup of testing and quality assurance, both static and dynamic testing are very important. The notion of early quality is implemented by static testing of all deliverables at a very early stage. For example, when reviewing a user story, the team may discover that some statements in the user story are not clear or ambiguous; the product owner can explain or investigate this right away and fix it. This way, there are no delays later in the development process. Other quality aspects can only be tested dynamically when (a part of) the IT system is ready.
Testing is about assessing quality based on criteria
In testing we use criteria to determine if the test object complies with the expectations about quality and risks. The descriptions below are based on one team, but the same also applies when there are multiple teams that deliver an IT system together. In that case, extra effort may be needed to align the various criteria and to measure whether the criteria have been fulfilled.
Definition - Entry criteria
Entry criteria are the criteria an object (for example, a test basis document or a test object) must satisfy to be ready to be used in a specific activity.
Entry criteria define what we expect of a product before it can be used for a specific activity. For example, requirement specifications such as a user story must meet specific entry requirements before an activity can start that uses such requirement specifications as input. In Scrum, we call this the Definition of Ready (DoR).
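Such a Definition of Ready can be thought of as a simple checklist; the criteria below are common examples, not a prescribed set:

```python
# Hypothetical Definition of Ready for user stories.
definition_of_ready = [
    "story has a clear business goal",
    "acceptance criteria are defined",
    "story is estimated by the team",
    "dependencies are identified",
]

def is_ready(story_checks: dict[str, bool]) -> bool:
    """A story satisfies the entry criteria only when every
    criterion on the checklist holds."""
    return all(story_checks.get(criterion, False)
               for criterion in definition_of_ready)

story = {
    "story has a clear business goal": True,
    "acceptance criteria are defined": True,
    "story is estimated by the team": False,   # refinement not finished
    "dependencies are identified": True,
}
print(is_ready(story))  # prints False
```

One unmet criterion is enough to keep the story out of the sprint, which is precisely the gatekeeping role entry criteria play.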
If we want to determine if testing is ready, we use exit criteria, which are part of the Definition of Done (DoD) in Scrum.
Definition - Exit criteria
Exit criteria are the criteria an object (for example, a test basis document or a test object) must satisfy to be ready at the end of a specific project activity or stage (for example, an iteration).
With exit criteria there is something specific to take into account, namely what the team has agreed to deliver. A cross-functional team, which is common in Scrum and DevOps environments, will agree to deliver an IT product with a specific quality level. This quality level is defined in acceptance criteria and as soon as the product meets the criteria, it is "accepted". The team uses the results from testing to obtain information about the needed changes to improve the quality. The team can now deliver the agreed quality level. In high-performance IT delivery, the team will typically deliver IT systems in an iterative way; they will deliver parts of the IT system and make sure each part has the agreed quality level.
Definition - Acceptance criteria
Acceptance criteria are the criteria a test object must satisfy to be accepted by a user, client or other stakeholder.
In the case of an independent testing team – i.e., a team that is not involved in any development activities but only organizes and performs testing, which is more common in traditional sequential IT delivery models – the team cannot be held responsible for the quality of the IT product: they deliver information about the quality, but they do not have any influence on the quality itself. The only criteria they can agree to are completion criteria, which define when the team has done a good job of testing. When they have done everything they promised and delivered the information that was requested, their job is done, regardless of whether the IT product is accepted or not.
Definition - Completion criteria
Completion criteria are the criteria a team must satisfy to have completed a (group of) activity(ies).
Acceptance criteria are defined in terms of quality characteristics and quality risks. Completion criteria are defined in terms of information delivered and effort spent.