These are my thoughts about SW and HW testing methodology and practices, primarily regarding embedded system testing. It began as a quick dump of thoughts, but I intend to keep expanding it as I think of more aspects of SW testing, based on my years of experience in embedded SW development, embedded system testing, and I/O system testing. There are other excellent sources on testing methods and practices, but this is a compendium of my personal discoveries and experiences. It is not intended to apply to testing in front-end/back-end development, although I'm sure many of these pearls of wisdom still apply.
Verification And Validation
Product testing has two significant aspects: verification and validation. We often use these terms interchangeably, but they are distinct in their meaning with respect to testing:
Verification is a static test, or non-execution check, of the software and hardware product early in the product design process. It is a QA process that assures the design and implementation meet the requirements and specifications, using static methods such as walk-throughs and inspections to review design, schematics, and significant code. Verification is the time to uncover ambiguities, discrepancies, and anomalies in the requirements as they are incorporated into the design, and to assure that applicable standards are being addressed. Verification is all about answering whether the product under development will properly address the requirements of customers, standards, and internal corporate policies.
Validation is the dynamic testing of the product as it is being built or upon its completion, using various means of testing (unit test, interface and API testing, integration test, system test, black-box and white-box testing, etc.). Validation demonstrates that software and hardware components meet the requirements and expectations of customers, with the result being a product ready to be delivered. Validation is all about proving that the right product is being, or has been, developed.
Validation relies heavily on the successful verification that the design meets requirements. Although traditional testing is for the purpose of validating product, the test architect and validation test designers must also be included in the verification process, not only to assist in reviewing the verification process, but also to begin developing the strategy, or approach, to dynamic testing and to begin consideration of system-level “use case” testing.
Even minor updates of product code or hardware must undergo the verification and validation processes, albeit on a much smaller scale. An important observation I have here is that fixing a bug and validating the fix without considering verification against requirements introduces additional bugs that will bite later.
A common theme in many tutorials regarding testing is "Testing is a process for checking functional and non-functional attributes of both software and hardware in a system and ensuring that the final product is defect free." Here, I disagree with the use of the term "ensure." No, we do not ensure a defect-free product, declaring that it has no defects. Instead, we "assure" the customer that we have a high degree of certainty that the product has no debilitating defect, since we can never be sure the product has no defect at all. Verification and validation testing is the means of qualifying, and even quantifying, that assurance.
This blog is a memory dump focusing on validation testing, but some thoughts of verification may creep in.
Validation Testing Categories
These categories apply to both software and hardware testing, with some variation.
- Unit Test (UT)
- “Smoke” Test
- Built-in Self Test (BIST)
- Integration Test
- Functional Validation Test (FVT)
- System Test
- Manufacturing Test
- Release/Acceptance Test
Testing Subcategories
- Performance validation
- Robustness (recovery from anomalies)
- Regression testing
- Testing through simulation
- Tool qualification testing (verify compilers, libraries, language support, and scripts)
Test Development Approach and General Thoughts
- Understand requirements, work out discrepancies and ambiguities.
- Identify prerequisite conditions and assure they are met.
- Produce test plan(s), describing approach to testing development and execution.
- Initial planning should address all aspects of testing: unit, integration, system, release testing.
- Identify system-level “use cases” in test plan—how would a user/customer use various aspects of the product.
- Select an appropriate testing framework, depending on testing language, automation, continuous integration facility, etc.
- Decide between white-box and black-box testing. Generally, if source code is available, diving into the code can be quite revealing of corner cases that need to be tested. System testing requires a black-box approach, since it relies solely on published requirements.
- Define what unit test and functional test are to do. Generally, UT is limited to short runs, especially in automated regression testing, whereas FVT can be as long running as needed to explore deeper into source code.
- Identify the need for a hardware test harness and work with HW engineers as needed.
- Understand that 100% test coverage is not possible; initially focus on important areas, and plan (hope) for future coverage.
- Plan a coverage schedule: 25% coverage (subjective selection), 50%, 75%, 90%
- Plan for test automation.
- Decide who develops test modules and test cases. SW developers focused on SW development are generally not good candidates for test development, with the possible exception of some unit tests, unless strong discipline is instilled in their test development.
- Design the test environment: test harness, remote testing, PC test server, target as client or as server, etc.
- Decide when testing is complete and acceptable. Although this especially applies to system test, completion of integration test of major components is generally of internal interest. Publish intended milestones with summaries of how they are to be achieved.
- Plan test modules based on individual feature/component being tested.
- Use source control (git, bitbucket, etc.) for test planning and test module development.
- Frequently consult with developers.
- Note that some developers may be more protective of their code, so be gentle in explaining issues with sufficiently convincing backup information and analysis.
- Important point: the purpose of testing is not to eliminate all bugs, which is almost impossible and too costly, but to locate and allow removal of bugs that may arise during normal and stressed operation of a product, achieving a sufficient level of acceptance of the product by the customer. The customer should permit a minimal set of noncritical issues in order to take a timely delivery at reasonable cost.
- Initial design of the product software should include “test hooks” that are ever-present but have little impact on performance or memory. These hooks would allow incorporation of test code into software so that when the SW (and HW) is under test, the testing still uses original production software, but with minor detours in paths to provide simulation and fake data.
- Use of test hooks eliminates (or reduces) the need for conditionally compiled code. In fact, coding standards should require there be no conditionally compiled code (even variations in code for different products should be handled by virtual implementations).
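As an illustration of such test hooks, here is a minimal Python sketch. The names (install_test_data_provider, adc_read_raw) are hypothetical, and a C/C++ product would use an equivalent function-pointer detour; the point is that the hook is always present in production code, costs only a null check, and requires no conditional compilation.

```python
# Hypothetical production module with an always-present test hook.
_test_data_provider = None  # installed only by test code, never in production


def install_test_data_provider(provider):
    """Called by test code to detour data acquisition through fake data."""
    global _test_data_provider
    _test_data_provider = provider


def adc_read_raw(channel):
    """Production read path; detours to the test provider when one is installed."""
    if _test_data_provider is not None:
        return _test_data_provider(channel)
    return _hw_read_adc(channel)


def _hw_read_adc(channel):
    # Placeholder for the real hardware access used in production builds.
    raise NotImplementedError("hardware access is outside this sketch")
```

Under test, a call such as install_test_data_provider(lambda ch: 0x7FF) would feed simulated samples while every other code path remains the production path.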
Unit Test Case Design
- Plan a “test module” as a source file (Python is a good choice) which addresses the complete unit testing of a specific SW or HW component. However, multiple test modules may be needed to test subordinate features. The choice of testing framework will dictate the test module organization.
- Limit each test case to one point of failure or very few points of failure. If a test case has several points of failure, and the first fails, you don’t know the status of subsequent tests in that test case method.
- Name a test case according to what is being tested.
- A test case must not rely upon the outcome of a previous test case.
- Do not assume test cases are performed in source order.
- Provide trace logging for each test case beyond what the system under test provides. If the system provides tracing, add output to the system log such that the period of the test case is also indicated.
- Do not be too verbose in a test case; show only information that the test framework (pytest, unittest, robot) does not show or that varies run-to-run.
- Avoid creating test cases that vary run to run (i.e., avoid randomness).
- Use unit test support tools (Pytest, Unittest, etc.), but enhance (wrap, decorate, fixture) as needed to reduce redundant or excessive test code.
- Avoid breakage in test case runs. Do not allow an exception to escape without handling it and reporting it.
- Increase complexity in related test cases (provide simple and complex test cases)
- Apply “equivalence class partitioning” (see below)
- If randomness must be used, show the random seed so that subsequent analysis can repeat the run with the same test parameters.
- Plan for expansion with additional test cases as new issues are uncovered.
- Minimize setup and overhead for each test case.
- Plan for automation (for example, no manual inputs or time-sensitive dependencies).
- Focus on problem areas, adding new test cases as test development proceeds.
- Organize test cases/modules with conditional testing as needed for smoke test, unit testing, functional testing.
- Use peer review of test plan and test code.
- Generate fake data within UT module (avoid separately stored files which can become lost or obsolete).
- During testing, report new issues (through JIRA), but isolate failing test cases until the problem is fixed. Identify that a test case is skipped for a specific reason.
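The following is a minimal pytest sketch illustrating several of the points above: descriptive test case names, one point of failure per test case, fake data generated inside the module, independence from test ordering, and skipping a failing case with a stated reason. The component under test (RingBuffer) and the JIRA ticket number are hypothetical.

```python
import pytest

from ring_buffer import RingBuffer  # hypothetical component under test


def make_fake_samples(count):
    """Generate deterministic fake data inside the test module (no external files)."""
    return [i % 256 for i in range(count)]


def test_push_single_item_is_readable():
    buf = RingBuffer(capacity=4)
    buf.push(42)
    assert buf.pop() == 42


def test_push_beyond_capacity_raises():
    buf = RingBuffer(capacity=2)
    buf.push(1)
    buf.push(2)
    with pytest.raises(OverflowError):
        buf.push(3)


@pytest.mark.skip(reason="JIRA-1234: wraparound loses oldest sample; re-enable when fixed")
def test_wraparound_preserves_order():
    buf = RingBuffer(capacity=4)
    for sample in make_fake_samples(8):
        buf.push_overwrite(sample)
    assert buf.pop() == 4
```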
Equivalence Class Partitioning (ECP)
- Divide testable inputs into equivalence classes and their boundaries:
- Acceptable inputs, limit to a few selections
- Outside of acceptable inputs, verify handling of invalid inputs
- Empty, null inputs
- Invalid input types
- Varying codecs
- Avoid repeated testing of inputs that fall within the same class.
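A small sketch of ECP using pytest parametrization follows. The function under test (parse_baud_rate in a serial_config module) is hypothetical; the point is one or two representative values per equivalence class rather than exhaustive inputs.

```python
import pytest

from serial_config import parse_baud_rate  # hypothetical function under test


@pytest.mark.parametrize("text", ["9600", "115200"])           # acceptable inputs, a few selections
def test_valid_baud_rate_accepted(text):
    assert parse_baud_rate(text) == int(text)


@pytest.mark.parametrize("text", ["0", "-300", "999999999"])   # outside acceptable inputs
def test_out_of_range_baud_rate_rejected(text):
    with pytest.raises(ValueError):
        parse_baud_rate(text)


@pytest.mark.parametrize("text", ["", None])                   # empty, null inputs
def test_empty_or_null_input_rejected(text):
    with pytest.raises((ValueError, TypeError)):
        parse_baud_rate(text)


@pytest.mark.parametrize("value", [9600.5, ["9600"]])          # invalid input types
def test_invalid_input_type_rejected(value):
    with pytest.raises(TypeError):
        parse_baud_rate(value)
```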
Test Automation
- Identify the automation environment and tools needed and/or available.
- Provide recovery methods if testing goes awry (power-cycle, rebuild network, etc.).
- Select remote test agent that can perform total setup (Docker, GitHub agent, etc.); limit testing performed on the automation server.
- Produce run logs with levels of detail (summary, run details, test logs, system logs).
- Retain logs for important test runs.
- Perform trend analysis, comparing multiple runs, and report anomalies.
- When an issue arises, report (JIRA) with details of situation, SW commit ID, test case, test module, method of manually reproducing if possible.
- If a problem is investigated by the tester (typically it is not), the problem report must detail that investigation.
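As a sketch of the log retention and trend-analysis points above, the snippet below records one run's metadata (commit ID, start time, per-module results) as JSON so that multiple runs can be compared later. The output directory and record layout are my own assumptions, not any standard format.

```python
import json
import subprocess
import time
from pathlib import Path


def current_commit_id(repo_dir="."):
    """Ask git for the commit ID of the source under test."""
    return subprocess.check_output(
        ["git", "rev-parse", "HEAD"], cwd=repo_dir, text=True
    ).strip()


def record_run(results, out_dir="test-history"):
    """Append one run record (dict of module -> pass/fail) to the run history."""
    record = {
        "commit": current_commit_id(),
        "started": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "results": results,
    }
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    (out / f"run-{int(time.time())}.json").write_text(json.dumps(record, indent=2))


# Example use:
# record_run({"test_uart.py": "pass", "test_spi.py": "fail"})
```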
Automated Testing
- A major aspect of automated testing is automated regression testing, in which unit tests are applied to a system soon after a software build (triggered by a source change) to demonstrate that the change does not have an obvious adverse effect on the system under test.
- Automated regression testing should be relatively short in overall duration. If only software is being tested, this might be a few seconds or at most a few minutes, and it should be performed just after the build, all as part of the CI process.
- Automated testing of software which controls hardware or an embedded system may be more lengthy and thus require a different approach to automation, such as scheduled nightly runs of the automated testing.
- Regardless of source changes, regression testing should be performed frequently and regularly, such as nightly, or on weekends, with variations in start times, to:
- Reassure stability in the product.
- Establish a testing history with which trends can be explored.
- Demonstrate resistance to influence by external environmental changes (network issues, power fluctuations, impact by storage changes, etc.)
- Automated testing is applied to repeatable tests on a software/hardware/system product in development, each time with an expected consistent outcome: not just an expected "pass" of all test cases, but a consistent testing process with consistent results in logs, generated data, etc.
- Where hardware or system environment is a factor in testing, the automated regression testing should use a reset mechanism, such as remote power cycling, between test modules or test suites.
- For embedded system testing, remote operations should be considered such that a CI system (Bamboo or GitHub Action, for examples), controls an agent system which in turn drives the remote testing of the embedded system.
- Note that for some products, the time of day may have an effect. Automated testing should be considered for various times of day, especially over midnight and February 29.
- Some build and test processes are necessarily disjointed such that the test process does not actually test what was formally built, requiring a rebuild during the test process. This must be avoided. The package under test must be the same binary image as formally built in the CI process, which is generally the same as released. Otherwise, there is a potential break in the "chain of custody." For example, a build performed during test will have different timestamp strings in the binary; changes in string length could force an offset across processor L1 cache boundaries, thereby affecting performance ever so slightly relative to the release product.
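One minimal way to guard that chain of custody, sketched below, is to compare a checksum of the image deployed on the test target against the checksum of the CI-built artifact before any testing begins. The file paths shown are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Checksum of a binary image on disk."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def assert_same_image(built_image, image_under_test):
    """Refuse to test unless the image under test is byte-identical to the CI build."""
    built = sha256_of(built_image)
    under_test = sha256_of(image_under_test)
    if built != under_test:
        raise AssertionError(
            f"image under test ({under_test[:12]}...) does not match "
            f"the CI-built image ({built[:12]}...); refusing to test"
        )


# Example use (paths are hypothetical):
# assert_same_image("artifacts/firmware-1.4.2.bin", "/mnt/target/firmware.bin")
```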
Randomness in Automated Testing
Sometimes, randomness should be introduced into the test process, but it should be a controlled randomness:
- The random “seed” must be reported in the log, where the seed is the initialization value, such as system time (in milliseconds or finer).
- The test procedure must be implemented with the means of specifying the seed so that the same test sequence can be repeated in the event of a testing failure.
- The random algorithm must be deterministic: given the same seed, it must produce the same sequence of values across all test cases (there is no randomness in how the next value is generated).
- If a test case performs operations that may vary from one run to another based on the random algorithm, the algorithm should be re-seeded for that test case with the same seed value when overall randomness is not selected. The seed value must always be reported at the test case level.
- Consider that use of random values may alter the performance and timing of the testing process.
- Do consider using controlled randomness in higher-level testing (integration, system) so that over numerous test runs, different use case scenarios can be explored, and different hardware settings and environmental conditions can be exercised.
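A brief sketch of controlled randomness in a pytest conftest.py follows: the seed defaults to the system time in milliseconds, can be pinned with a command-line option to repeat a failing run, and is always reported per test case. The option name --random-seed is my own choice, not a standard pytest option.

```python
import random
import time

import pytest


def pytest_addoption(parser):
    parser.addoption("--random-seed", action="store", default=None,
                     help="fix the random seed to repeat a previous run")


@pytest.fixture
def seeded_random(request):
    """Return a random.Random seeded per test case, reporting the seed used."""
    option = request.config.getoption("--random-seed")
    seed = int(option) if option is not None else int(time.time() * 1000)
    print(f"[{request.node.name}] random seed = {seed}")
    return random.Random(seed)
```

A test case then takes seeded_random as an argument and uses it for any shuffling or value generation, so a failing sequence can be reproduced by rerunning with the reported seed.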
Hardware Testing Considerations
- Determine whether a test harness or simulation capability is required (e.g., hardware-in-the-loop, HIL).
- Plan for “bring-up” if needed such that processor and critical devices are verified as functional.
- Design built-in self-test (BIST) capability, invoked on power-on.
- Design a test module per HW component under test.
- Consider testing of:
- Power variation
- Fan speed variation (i.e., temperature variation)
- Register masking of floating signals (identify unsourced/ungrounded signals on schematics)
- ECP testing on registers
- Fault injection via test harness or other means
- Share HW component register specifications in original SW source with testing source (i.e., one source base for register definitions, with automated translation scripts as needed).
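As a sketch of that single-source approach, the script below extracts #define register definitions from the product's C header (file names are hypothetical) and emits Python constants for the test code, so the test source never carries a hand-copied list that can drift out of date.

```python
import re

DEFINE_RE = re.compile(r"^\s*#define\s+(\w+)\s+(0x[0-9A-Fa-f]+|\d+)")


def translate_header(header_path, out_path):
    """Emit Python constants for each simple #define in the product's C header."""
    lines = [f"# Generated from {header_path} -- do not edit by hand\n"]
    with open(header_path) as header:
        for line in header:
            match = DEFINE_RE.match(line)
            if match:
                name, value = match.groups()
                lines.append(f"{name} = {value}\n")
    with open(out_path, "w") as out:
        out.writelines(lines)


# Example use (paths are hypothetical):
# translate_header("fw/include/uart_regs.h", "tests/generated/uart_regs.py")
```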
Concurrency Testing
Software testing may meet stated requirements but may not work well when a program is run with multiple concurrent or simultaneous processes, users, or network nodes. Unfortunately, correcting failures due to concurrency often requires a redesign of the software to handle concurrency, thereby forcing a restart of single-process testing. The original SW must account for concurrency.
- The test plan must include provisions for performing concurrent testing.
- The number of multiple processes or users supported by the SW should be stated in requirements, but generally it far exceeds the practical number that can be tested. The test plan must also indicate the maximum number of processes under test.
- Concurrency testing of an entire SW product is generally impractical or impossible in a test environment. Identify major SW components that are most likely to interact with multiple processes and focus on concurrent testing of them.
- Identify shared resources and points of potential deadlock or unrestrained access to shared data.
- Consider that programs running on a computer may be forced into lockstep with each other such that access to shared resources is single-threaded by the OS, that process scheduling prevents tests from encountering such situations, or that test processes may inadvertently interact with each other.
- Multiple independent nodes in a network are essential to concurrency testing.
- With a design for the maximum processes to test, start test development for two concurrent processes.
- Test logs and reports need to show interleaved activity among nodes.
- Solicit people to participate in hands-on concurrent testing, if necessary (a test framework for concurrency can avoid that).
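Below is a minimal sketch of the "start with two concurrent processes" advice, written as a pytest case using multiprocessing. The shared log file stands in for whatever shared resource the real component under test would contend for; the record counts are arbitrary.

```python
import multiprocessing


RECORDS_PER_WORKER = 1000


def worker(worker_id, path):
    # In the real test this would drive the component under test; here a
    # shared file stands in as the contended resource.
    for i in range(RECORDS_PER_WORKER):
        with open(path, "a") as shared:
            shared.write(f"worker={worker_id} seq={i}\n")


def test_two_workers_do_not_lose_records(tmp_path):
    path = tmp_path / "shared.log"
    procs = [multiprocessing.Process(target=worker, args=(n, str(path)))
             for n in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    lines = path.read_text().splitlines()
    assert len(lines) == 2 * RECORDS_PER_WORKER
    assert all(line.startswith("worker=") for line in lines)
```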
Manufacturing Test Considerations
Manufacturing of HW devices needs to test individual components, but quickly and at low cost (a 1-minute delay in the mfg process can add $100s to the overall cost of production).
- Use of built-in self-test designed for MFG
- Report success or failure
- Provide copious logs if a failure
- May use test harness to assist in BIST, although there is an added cost to mfg.
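A tiny sketch of that approach: keep the pass path fast and quiet, and dump detailed logs only on failure. The device interface (Dut, run_bist, read_fault_log) is hypothetical.

```python
def manufacturing_test(dut):
    """Run the device's built-in self-test and report success or failure."""
    result = dut.run_bist()  # power-on self-test designed for MFG
    if result.passed:
        print("PASS")
        return 0
    # On failure, provide copious logs for later failure analysis.
    print(f"FAIL: {result.failing_component} code={result.error_code:#06x}")
    for entry in dut.read_fault_log():
        print(entry)
    return 1


# Example use (Dut is a hypothetical device interface):
# exit_code = manufacturing_test(Dut(port="/dev/ttyUSB0"))
```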
Product Software Language Considerations
- Consider the volatility of the source language used for the product software, including the language compiler, support libraries, fixture/temp libraries.
- Often, updates to compilers and language versions can introduce bugs.
- Revision of compiler code generation can change memory footprint or performance (especially where tight code loops are placed on different cache line boundaries).
- Separate compiler qualification testing may be essential. Of course, compilers are "fully" tested before release, but specific code paths in the product source may not have been included in that testing.
- For any language, the development team should have a documented coding standard endorsed by all team members.
- For Python, there is a tendency to use the “PEP 8” coding standard, but invariably everyone will bend its oft-arcane rules so that there seems to be no coding standard. The project should call out an explicitly documented standard rather than rely on PEP 8.
- A common IDE should be specified for the team for a given programming language, at least for the source characteristics, such as maximum line length and soft tabbing of N spaces. The IDE configuration files should be distributed among team members to assure that commonality.
Software Development Pitfalls Discovered in Testing
Besides the issues that may arise from concurrency testing, there are several aspects of software development that are often overlooked until integration or system testing. Failures in these are so significant that correction may only be possible with SW redesign or major rework. They must be taken into consideration together during initial SW design; otherwise, overlooking one will impact the others. Obviously, testing must be planned for these, based on requirements. Early integration testing of key components, even if using prototypes, is suggested to avoid these gross problems.
- Performance of key components
- Error handling and recovery (i.e., robustness)
- Memory usage, both code footprint and runtime stack and heap space
- Adherence to project coding standards
- Endianness of processor and external protocols (a byte-order sketch follows this list)
- Usage of “legal” third-party SW (violations of open-source requirements, use of copyrighted source code, plagiarism from external sources).
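The byte-order sketch below makes endianness explicit at a protocol boundary with Python's struct module: '>' packs big-endian (network order) and '<' packs little-endian, whereas relying on the native order is what leads to the network-controller anecdote below. The sample values are arbitrary.

```python
import struct

samples = [0x1234, 0xABCD]

# Pack explicitly as big-endian (network order), never native order.
network_bytes = struct.pack(">HH", *samples)
assert network_bytes == b"\x12\x34\xab\xcd"

# Unpack with the same explicit order on the receiving side.
received = struct.unpack(">HH", network_bytes)
assert list(received) == samples
```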
Anecdotes of these failures:
- A development team decided to work strictly on function. After completing testing for function, they then worked on error handling but found they had to rework the design to accommodate reasonable error handling. Then, after completing those changes and subsequent testing, performance was considered. A rework was again necessary.
- In a memory-limited embedded system, 500K lines of C++ were used without regard to compiled bloat, not only in the generated code but also in the symbol table required for a linking load. Testing quickly demonstrated that the compiled code could load but left little memory for heap space, which was quickly exhausted. As the support developer, I resolved the problem by eliminating nonessential symbols (each of which was immense) and mapping the "mangled" symbols to reduced symbols.
- Critical looping functions had to be tested and examined after each build to assure the generated code fit properly within processor L1 cache lines; occasionally, performance was thrown off by several percent, depending on what the compiler generated for the same code. The solution was to assure sufficient padding was incorporated in the generated code to always force the same code placement within cache lines, and to notify of source changes in those areas.
- Code reviews often reveal coding standard violations. They are fixed but some are overlooked, yet testing may still be successful. When a pedantic “lint” tool is then used, there may be significant rework to placate the lint tool. Testing must be repeated.
- SW for a network controller was nearly complete before pairing with another "off the shelf" network node. Byte endianness was overlooked when sending 16-bit data on the network. It was quickly resolved with byte-swapping in buffers, but performance was impacted because the HW-assisted byte swap had not been specified.
Code Revision and Review
- Source changes or additions must be reviewed by people of several roles: other SW developers, HW engineers if HW is involved, test developers, and project owner. A minimum number of signoffs should be required before a change set is committed to the common source base (in “git” nomenclature, the approved pull request is merged).
- Code should not (must not) be changed unless there is an accepted JIRA ticket, and only for changes addressing the issue identified by the ticket.
- Multiple issues should not (must not) be covered by a single JIRA ticket. Each item should be described by individual tickets. This allows a simpler code revision and review process.
- If massive cosmetic changes to source code are performed, such as renaming variables or functions, then separate JIRA tickets and code reviews should be performed only for those cosmetic changes. This permits a quick code review and acceptance (i.e., a “rubber stamp” of the pull request).
- Mixing many cosmetic changes with functional changes makes the review difficult to understand and accept. Totally stumped reviewers should not be hesitant to reject the pull request, asking for multiple pull requests to distinguish the changes.
- In reviewing, apply project coding standards. Be adamant about adherence to the standards.
- A pull request (PR) should have a summary description to introduce the changes to the reviewer. The summary does not need to cover every change in the PR but should include a bullet list of significant changes as well as introductory information to the reviewer for understanding the rationale for the changes.
- Avoid “emotive” emojis in code reviews since they convey no useful information with respect to the code. However, certain graphic symbols could be used (such as a smiley face) as a shorthand means to indicate understanding of a review comment or that change work is underway.
- Before submitting a pull request, a modicum of unit testing must be performed by the developer to assure some functionality of the code (I have encountered people who think that if the source compiles, it works).
- Various static and dynamic analysis tools should be used by the developer before submitting the pull request to be reviewed by others, such as lint for C/C++ or pylint for Python to find some programming issues.
- The developer should have the same access as project regression testing to continuous integration tools to evaluate the pre-committed source changes, such as running a git development branch under CI.
- When someone asks a question in a review comment, and the response is significant, then discuss the change with that person or the team. Update the source code to include that response as a documenting comment. Reviews of pull requests may be retained forever but their useful information quickly fades from memory. The important information often provided in a pull request is best captured in the source code for posterity.
- Keep emotion out of the review process and be aware of the developer's sensitivity to criticism. A code review is all about the code and not the developer, but most developers don't see it that way and will react to any negativity suggested in the review comments. Managers: avoid associating code reviews with performance reviews.
Software Development Practices
There are many principles and practices for developing software modules and products, covered by millions of books and web pages. These are just a few of MY observations and personal rules which will be extended over time.
- Regardless of the size of the project or module, plan for object-oriented implementation from the start. In Python for example, you can easily create a module without classes, running everything in the file-scope global space. But, more often than not, some class scheme must be introduced in even the simplest of modules, forcing a revision of the source.
- In more formal designs (i.e., for production), consider using "setter" and "getter" functions or properties on what would normally be exposed public data, rendering them private, accessible only through those methods (see the class sketch after this list).
- In the OOP class design, consider providing a utility base class, inherited by most significant classes, to offer test and diagnostic services to the derived class. For example, a “RegressionTest” method in a class could be used to perform regression unit testing of the class, using the inherited services.
- For embedded system software, be aware that OOP features may have hidden costs that impact memory usage or performance. For example, using a C++ STL design pattern to manage a linked list could introduce enormous overhead, when all you need is a simple scheme which you can develop in minutes.
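Here is a brief Python sketch of the two ideas above: private data exposed only through properties, and a small utility base class that gives derived classes a common self-test entry point. The class and method names are illustrative only, not from any particular project.

```python
class Testable:
    """Utility base class offering a diagnostic/self-test service to derived classes."""

    def regression_test(self):
        """Run the class's own checks; derived classes override self_check()."""
        failures = self.self_check()
        return (len(failures) == 0, failures)

    def self_check(self):
        return []  # no checks by default


class MotorController(Testable):
    def __init__(self, max_rpm):
        self._max_rpm = max_rpm  # private; exposed only through the property
        self._rpm = 0

    @property
    def rpm(self):
        return self._rpm

    @rpm.setter
    def rpm(self, value):
        if not 0 <= value <= self._max_rpm:
            raise ValueError(f"rpm {value} outside 0..{self._max_rpm}")
        self._rpm = value

    def self_check(self):
        return [] if 0 <= self._rpm <= self._max_rpm else ["rpm out of range"]


# Example use:
# mc = MotorController(max_rpm=6000)
# mc.rpm = 1500
# ok, failures = mc.regression_test()
```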
© Copyright Dale Harris, 2022. All Rights Reserved.