You Need Data for Performance and Stress Testing?

You usually won't get your product accepted without performance and stress testing. You can be sure your customer will want to know how their shiny new system behaves with large amounts of data. For such tests you need a large quantity of data, of course. But there is a problem.

Your database is basically empty.

You have many tables, and they need to be populated with test data. For some of those tables you need thousands or even millions of records. And if you are building an ERP, a CRM, or a data warehouse, the number of tables alone can easily run into the hundreds.

So what are your options?

Copying Production Data - Not Always a Good Idea

If you do have production data, you might want to try populating your database with a copy of the real data.

But is there enough real data to work with? If not, you simply need more data.

Also, don't forget about data confidentiality. Sometimes you have to address privacy concerns, or you simply must hide critical data. Or you have regulatory compliance to worry about.

Not to mention the transformation headaches if your product is built to replace another system.
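
One common way to handle the confidentiality concern above is to pseudonymize sensitive columns before a production copy reaches the test environment. The following is a minimal sketch in Python, not a complete anonymization solution. The function names `pseudonymize` and `mask_email`, the `user_` prefix, and the `example.com` domain are all hypothetical choices made for illustration.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret: bytes) -> str:
    """Return a stable, keyed pseudonym for a sensitive value.

    The same input and secret always give the same output, so values
    that matched in production still match after masking.
    """
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability (illustrative choice)


def mask_email(email: str, secret: bytes) -> str:
    """Replace a real address with a pseudonymous one (hypothetical format)."""
    return f"user_{pseudonymize(email.lower(), secret)}@example.com"
```

Because the pseudonym is deterministic for a given secret, joins between masked tables keep working. Whether this level of masking satisfies your regulations is a separate question that a code sketch cannot answer.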

Generating a Large Quantity of Test Data - a Lot of Tedious Work

You can write database scripts yourself to generate a large amount of test data. But it's very time-consuming. And error-prone. And tedious all around.

Chances are you have tried this before.

You wrote all the scripts. That was quite an undertaking and took days of your working time. You made sure all the formats were correct. And that all foreign key relationships were resolved: every foreign key had an actual value in its parent table, and the exact order of script execution was defined.
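
To make the foreign key and ordering problem concrete, here is a minimal hand-rolled sketch in Python using the standard `sqlite3` module. The `customers` and `orders` tables, their columns, and the `populate` function are hypothetical and exist only for illustration. A real schema with hundreds of tables needs the same care for every relationship.

```python
import random
import sqlite3


def populate(conn, n_customers=100, n_orders=1000, seed=42):
    """Fill two hypothetical tables, parents first so every foreign key resolves."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "customer_id INTEGER NOT NULL REFERENCES customers(id), "
        "amount_cents INTEGER NOT NULL)"
    )
    # Parent rows must exist before any child row refers to them.
    conn.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(i, f"Customer {i}") for i in range(1, n_customers + 1)],
    )
    # Child rows only pick customer ids that were just inserted.
    conn.executemany(
        "INSERT INTO orders (customer_id, amount_cents) VALUES (?, ?)",
        [
            (rng.randint(1, n_customers), rng.randint(100, 100_000))
            for _ in range(n_orders)
        ],
    )
    conn.commit()
```

Calling `populate(sqlite3.connect(":memory:"))` fills both tables. If the two inserts were swapped, the database would reject the orders, which is exactly the kind of ordering detail that has to be tracked by hand across the whole model.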

That was only the beginning.

You ran the scripts and monitored the results. Every now and then a script failed, and you had to find out what went wrong. It was usually a tiny detail, which you fixed before running the scripts again. If you were lucky, you could resume from the script that failed. If not, you had to run everything from scratch. After emptying the database, of course. You worked late for a few days, but you made it.
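
Resuming from the failed script, rather than starting over, usually means keeping track of which scripts have already completed. A minimal sketch of that idea in Python follows. The `run_steps` function and the in-memory `done` set are hypothetical. In practice the record of completed steps would have to be stored somewhere durable, and resuming is only safe if the failed script left no partial data behind.

```python
from typing import Callable, Iterable, Set, Tuple


def run_steps(steps: Iterable[Tuple[str, Callable[[], None]]], done: Set[str]) -> None:
    """Run named steps in order, skipping any already recorded in `done`."""
    for name, step in steps:
        if name in done:
            continue  # completed in an earlier run
        step()  # an exception here stops the run; earlier steps stay recorded
        done.add(name)
```

After fixing the failing script, calling `run_steps` again with the same `done` set picks up at the step that failed and leaves the earlier ones alone.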

Finally you thought you were done.

There was no way you could be persuaded to go through this again without dire need. So you made a backup of the pristine populated database. If you were lucky enough to have storage available, of course. And needless to say, two weeks later the database model changed. Not much, but enough.

You had to do most of the work all over again.

Generate a Large Quantity of Test Data Automatically

So if copying production data is a bad idea and generating data yourself means countless hours of tedious work, what remains?

Use a specialized test data generator tool.