Managing test data represents a major challenge in system development projects.
For background material on the subject, read Evolutionary Database Design by Fowler and Sadalage.
Test data is used for:
- Unit tests
- Integration tests
- System tests
- Acceptance tests and other manual testing
- Load tests
- Migration tests (deploying new versions of a system on top of previous installations)
Try to minimize the number of test data sets. Often only two types of test data are sufficient:
- A small, compact, well-known test data set that people are familiar with. It should represent all types of variations in the system,
and the data values should mirror real data (i.e. logical, meaningful values).
- A volume-based set, primarily for load testing purposes.
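As a minimal sketch of how the two sets could relate, the volume set can be generated from the variations found in the small set, using a fixed seed so every run yields the same data. The record layout, the names `SMALL_SET` and `generate_volume_set`, and the seed value are assumptions made for illustration only.

```python
import random

# Hypothetical small, well-known set: one record per variation in the system.
SMALL_SET = [
    {"id": 1, "name": "Alice Example", "status": "active"},
    {"id": 2, "name": "Bob Example", "status": "suspended"},
    {"id": 3, "name": "Carol Example", "status": "closed"},
]


def generate_volume_set(count, seed=42):
    """Generate `count` records for load testing.

    A fixed seed makes the generated set reproducible, so two load test
    runs work on identical data.
    """
    rng = random.Random(seed)
    # Reuse the variations of the small set so both sets stay consistent.
    statuses = [row["status"] for row in SMALL_SET]
    return [
        {"id": i, "name": f"Customer {i:06d}", "status": rng.choice(statuses)}
        for i in range(1, count + 1)
    ]
```

Generating the volume set instead of storing it keeps only the small set and the generator under version control.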
For unit testing purposes, there might be a third, fragmented type of test set which serves as input data and mock data in tests.
These sets are implemented in test code, are informal, and are not used outside the unit tests.
The rest of this article therefore leaves them out, as they are of no importance for the challenge of managing test data.
Mechanisms for loading test data into a system:
- By code
- By database scripts
- By a (live) master database
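The three mechanisms above can be sketched side by side as follows. This is an illustration only, assuming an in-memory SQLite database and a hypothetical `customer` table; the function names are invented for the example, and a real system would use its own database and schema.

```python
import sqlite3

# Hypothetical schema, used only for illustration.
SCHEMA_SQL = """
CREATE TABLE customer (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
"""

# "By database scripts": the data lives in a SQL script under version control.
SEED_SCRIPT = """
INSERT INTO customer (id, name) VALUES (1, 'Alice Example');
INSERT INTO customer (id, name) VALUES (2, 'Bob Example');
"""


def new_db():
    """Create an empty in-memory database with the schema applied."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_SQL)
    return conn


def load_by_script(conn, script):
    """Load test data by running a SQL script."""
    conn.executescript(script)


def load_by_code(conn, rows):
    """Load test data from rows built in code."""
    conn.executemany("INSERT INTO customer (id, name) VALUES (?, ?)", rows)
    conn.commit()


def load_from_master(conn, master):
    """Load test data by copying rows from a master database."""
    rows = master.execute("SELECT id, name FROM customer").fetchall()
    load_by_code(conn, rows)
```

For example, `load_by_script(new_db(), SEED_SCRIPT)` fills a fresh database from the script, and `load_from_master` can then copy that data into another fresh database.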
When should you use what?
Where is the test data stored, and in what format?
How often should the test data be reloaded?
Does it depend on the type of environment?
How do you deal with relative test dates, i.e. how do you update relative date values to the current date?
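One possible approach to the relative date question, sketched under assumptions: record the date on which the data set was captured and, at load time, shift every date field by the distance between that baseline and the current date. The `BASELINE` value, the row layout, and the name `shift_dates` are hypothetical.

```python
from datetime import date

# Hypothetical reference date: the day the test data set was captured.
BASELINE = date(2020, 1, 15)


def shift_dates(rows, date_fields, today, baseline=BASELINE):
    """Return copies of `rows` with the named date fields moved forward
    by the distance between `baseline` and `today`.

    A date that was 5 days after the baseline ends up 5 days after today.
    """
    offset = today - baseline
    shifted = []
    for row in rows:
        new_row = dict(row)  # leave the stored data set untouched
        for field in date_fields:
            if new_row.get(field) is not None:
                new_row[field] = new_row[field] + offset
        shifted.append(new_row)
    return shifted
```

With this sketch, an invoice due on 2020-01-20 in the stored set (5 days after the baseline) becomes due 5 days after whatever date is passed as `today`. An alternative with the same effect is to store offsets instead of absolute dates.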
How do you keep the test data set intact when the application is changing,
and when developers, testers, business analysts, people in marketing, product owners, etc. need new tests?
Who should be responsible for managing the test data?
- A single person (characteristics and roles for this person)?
- The dev team/test team?
How do you maintain different versions of each test data set?
How do you automate this task (tools, error reporting, separate jobs in CI, dependencies)?