4 difficulty levels of creating test data

Ivan Karaman

Experienced QA Engineer | Test Automation Coach | Quality Advocate

Published Dec 26, 2023

🕹️ If you have played computer games, you have seen a difficulty selection screen. It lets you decide whether you want to enjoy the game without any worries or are looking for a challenge.

Strangely enough, when people need some data for testing, they often make a choice that is unnecessarily HARD.😟🙅

This could happen for a multitude of reasons, the most common being “I’ve always done it like that” and “I don’t know any other way”.

This article will help you make your life easier by showing 4 different ways to generate test data, from the hardest to the easiest!

1 - Hell 😈

You have probably guessed it! Yes, adding data manually is the hardest difficulty. Famous last words are: “I’ll just quickly register some new users!”. Generally, you should AVOID DOING that! ⛔

There are some exceptions, of course! Do it when it’s:

Quick (you only need a few entries)
One-off (this is the first and only time you will have to do it)
Restricted (no UI, no/limited access to the backend and/or database)

You can win on “hell”, but would you be smiling after that? 😫

2 - Nightmare 😱

If doing it by hand is not a good way, then we need to automate!? Many automation engineers would start by trying to automate data entry via the UI. Your sample code (#cypress) might look like this:
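A minimal sketch of what such a spec could look like. The "/register" path, the "data-test" selectors, the user fields, and the helper names are assumptions made for illustration, not taken from a real application; cy.visit, cy.get, .type and .click are standard Cypress commands. The "cy" object is passed in as a parameter so the helper can also be exercised outside the Cypress runner.

```javascript
// Hypothetical helper: registers one user by driving the registration form.
// The URL and selectors below are placeholders for your own application.
function registerUserViaUi(cy, user) {
  cy.visit('/register');
  cy.get('[data-test="username"]').type(user.username);
  cy.get('[data-test="email"]').type(user.email);
  cy.get('[data-test="password"]').type(user.password);
  cy.get('[data-test="submit"]').click();
}

// Builds `count` distinct users; the field names are assumptions as well.
function buildUsers(count) {
  return Array.from({ length: count }, (_, i) => ({
    username: `testuser${i + 1}`,
    email: `testuser${i + 1}@example.com`,
    password: 'Str0ng-Passw0rd!',
  }));
}

// Inside a Cypress spec file, `describe`, `it` and `cy` are globals.
// The guard keeps this file harmless when loaded by plain Node.js.
if (typeof describe === 'function' && typeof cy !== 'undefined') {
  describe('test data: users', () => {
    buildUsers(5).forEach((user) => {
      it(`registers ${user.username} via the UI`, () => {
        registerUserViaUi(cy, user);
      });
    });
  });
}
```

Every user costs a page load plus four form interactions, which is where the time goes once the number of users grows.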

This is often better than clicking around and typing, but UI automation is:

Not always reliable (your goal is to have enough data to continue your exploration, not to fight the UI automation framework)
Slow (a "cy.visit" action alone adds a second or two to the execution, on top of starting the framework and starting a browser; that is OK for a few users, but creating a lot of data entries could take hours or even days)

Nightmare difficulty is not too good either... It’s not enjoyable! 😵‍💫 Let’s play on normal!

3 - Normal 😙

API! Application Programming Interface. Even the acronym suggests that “test data entry” is one of the intended usages. We are doing some programming here! 🤪

Sample code for the API user creation might look like this:
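A minimal sketch of such a script. The base URL, the "/registerUser" endpoint, and the payload fields are assumptions made for illustration. The HTTP client is passed in as a parameter; Axios fits here because axios.post(url, data) returns a promise that resolves to a response object with "status" and "data" properties.

```javascript
// Builds one user payload; the field names are assumptions.
function buildUser(index) {
  return {
    username: `testuser${index}`,
    email: `testuser${index}@example.com`,
    password: 'Str0ng-Passw0rd!',
  };
}

// Creates `count` users, one POST request per user.
// `httpClient` is anything with an Axios-style `post(url, data)` method.
async function createUsers(httpClient, baseUrl, count) {
  const created = [];
  for (let i = 1; i <= count; i += 1) {
    const response = await httpClient.post(`${baseUrl}/registerUser`, buildUser(i));
    created.push(response.data);
  }
  return created;
}

// Usage, assuming `npm install axios` and an API listening on localhost:3000:
//   const axios = require('axios');
//   createUsers(axios, 'http://localhost:3000', 100)
//     .then((users) => console.log(`Created ${users.length} users`))
//     .catch((error) => console.error(error.message));
```

The requests run one after another to keep the sketch simple and to avoid flooding a test environment.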

Example of using Node.js with Axios HTTP client to generate test users via API

If you ran this code side by side with the UI automation, you would notice how fast it is. It might even be an order of magnitude (10x) faster.

The reason is that your computer doesn’t have to spin up a heavy UI framework, open the browser, navigate to the page and render it. It just makes API calls! 😇

If you don't know how to write Node.js code, you could use a tool with a UI. Postman allows you to do data-driven test runs; see an example here: https://www.youtube.com/watch?v=eJJHDXqIWf0

Another possible benefit of the API is the fact that it MIGHT have endpoints that are MORE SUITABLE for your task. For example, our API may allow the creation of many users via the “/registerUsers” endpoint. In this case, we would only need to make 1 HTTP call with an array of user objects:
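A minimal sketch of that single call. The "/registerUsers" endpoint comes from the example above; the base URL, the user fields, and the assumption that the endpoint accepts a plain JSON array are illustrative only. As before, "httpClient" stands for any client with an Axios-style post(url, data) method.

```javascript
// Sends all users in a single request to the bulk endpoint.
// The payload shape (a plain JSON array) is an assumption about the API.
async function createUsersInBulk(httpClient, baseUrl, count) {
  const users = Array.from({ length: count }, (_, i) => ({
    username: `testuser${i + 1}`,
    email: `testuser${i + 1}@example.com`,
    password: 'Str0ng-Passw0rd!',
  }));
  const response = await httpClient.post(`${baseUrl}/registerUsers`, users);
  return response.data;
}

// Usage, assuming `npm install axios` and an API listening on localhost:3000:
//   const axios = require('axios');
//   createUsersInBulk(axios, 'http://localhost:3000', 1000)
//     .then((result) => console.log(result))
//     .catch((error) => console.error(error.message));
```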

4 - Easy 🤩

Can it be even simpler than using the API? Really?! Well… maybe. We could go and do our inserts directly into the database!

Beware! Inserting data directly sometimes might be harder than via the API.

To do that, we just need to generate the sample “user insert” code:
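A minimal sketch of such a generator. The "users" table and its columns are hypothetical, and the Faker calls (faker.person.fullName and faker.internet.email) follow the @faker-js/faker v8 API, so they may be named differently in other versions. The Faker instance is passed in as a parameter to keep the function easy to try with a stub.

```javascript
// Escapes single quotes so values such as O'Connor do not break the SQL.
function sqlString(value) {
  return `'${String(value).replace(/'/g, "''")}'`;
}

// Builds one multi-row INSERT statement for `count` users.
// `faker` is an instance from @faker-js/faker (v8 API assumed);
// the `users` table and its columns are hypothetical.
function buildUserInsert(faker, count) {
  const rows = Array.from({ length: count }, () => {
    const name = faker.person.fullName();
    const email = faker.internet.email();
    return `  (${sqlString(name)}, ${sqlString(email)})`;
  });
  return `INSERT INTO users (full_name, email) VALUES\n${rows.join(',\n')};`;
}

// Usage, assuming `npm install @faker-js/faker`:
//   const { faker } = require('@faker-js/faker');
//   console.log(buildUserInsert(faker, 10));
```

Doubling single quotes is enough for generated test data; it is not meant as a general defense against SQL injection.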

Example of using Node.js with Faker library to generate SQL code for inserting users

If you run this program, it will print a SQL statement, ready to be copy-pasted and run on your DB side!

Downsides of this approach:

The database is complicated. Entities live across multiple tables, the database has triggers on data entry, etc. These things can make it really hard to prepare the insert statement.
Access is restricted. You might want to run this script in an environment you have no “write access” to.
The DB structure (schema) changes. This will force you to maintain the script.

If this happens (or could happen), use the API!

Summary 🗒️

Playing on lower difficulties will make your experience better!

Of course, “how you want to play the game” is ultimately your choice. I can only ask you to play on “normal” or “easy” and make you aware that these options are out there. At times, they might not be the easier option for you… But if you spend some time learning how to use them (do it, it’s not that hard), it will reward you with:

a lot of saved time (fast and reliable)
satisfaction from learning something new
less stress from doing mundane manual tasks or aimlessly fighting the UI frameworks
…and more! ☺️

The end

If you read till the end, you are the best! If you enjoyed it, don’t forget to react or leave a comment. This helps the algorithm to show it to other relevant people. More people see and react, which means I am more motivated to write more. I write more and YOU will benefit from reading it. ❤️😉

Are you a manual tester? Or a beginner automation engineer? Want to be better? Want to learn more about test automation and not sure how to go on this journey? I can help!

I coach people on test automation within the JavaScript ecosystem. UI/API/Unit/Performance/Contracts, you name it… You can book a free first consultation to talk about your needs and goals here: https://ivanandcode.com/coaching

Anna Kovalova

Building and Leading QA Teams | Co-founder and CEO

11mo

Fantastic insights on the nuanced difficulty levels in creating test data! Navigating the challenges in data creation is indeed a crucial aspect of effective testing.

Simon Devon

11mo

Great article! IMHO, I prefer to use API calls to create data, it's the safest way and you get a bit of extra testing thrown in for good measure. I find that DB calls can add complexity to automation. Cypress handles API calls so well it's a bit of a no brainer...

Raúl Telo Sánchez

Software Engineer · Senior SDET

11mo

I liked the article 😊. In complex scenarios, when multiple tables need to be populated before performing a test, I found it easiest to use test fixtures. You can approach this in different ways; I opted to create snapshots of data that I could rely on, and make the backend point to those snapshots when running the tests.

Chris Van Bael

★ Freelance security & test specialist/trainer ★

11mo

Nice article! However, from my experience, I wouldn't put direct DB entry as the easiest. I'd prefer API. Unfortunately, when working with +20 year old systems, you don't always have an API. Trying to understand the DB layout and values that go into the tables can be difficult, as you've mentioned.

Bret R.

I Deliver ERP Data Migrations | Driving Seamless Delivery of SoW, Data, and Testing for all ERP Applications | Delivering ERP Success using SAP Data Services | Cloud CRM & HR | Over 50 ERP Projects Delivered

12mo

I prefer creating data through the interface, as it serves a twofold purpose. 1. It puts the data in as intended and doesn’t miss any additional flags or seemingly meaningless settings. 2. It creates valid regression tests for your pack when using the CI method. Also, the interface way is quick if you run it overnight, and the business can relate to it. Mind you, some testers could see the API route as complex and you need to keep me on to keep it running, thus job security. I’ve been doing this for 30 years. Automate the functional process and it works much more smoothly: there are no missed settings or data elements, and it starts the regression pack. Being too clever is almost always a bad choice and makes it harder to maintain. Always use module automation so you can pick and choose your path of data generation and have smaller chunks of code to maintain.
