4 difficulty levels of creating test data

Ivan Karaman

Published Dec 26, 2023

🕹️ If you have played computer games, you have seen a difficulty selection screen. It lets you decide whether you want to enjoy the game without any worries or are looking for a challenge.

Strangely enough, when people need some data for testing, they often make a choice that is unnecessarily HARD.😟🙅

This could happen for a multitude of reasons, the most common being “I’ve always done it like that” and “I don’t know any other way”.

This article will help you make your life easier by showing 4 different ways to generate data for testing, from the hardest to the easiest!

1 - Hell 😈

You have probably guessed it! Yes, adding data manually is the hardest difficulty. Famous last words are: “I’ll just quickly register some new users!”. Generally, you should AVOID DOING that! ⛔

There are some exceptions, of course! Do it when it’s:

Quick (you only need a few entries)
One-off (this is the first and only time you will have to do it)
Restricted (no UI, no/limited access to the backend and/or database)

You can win on “hell”, but would you be smiling after that? 😫

2 - Nightmare 😱

If doing it by hand is not a good way, then we need to automate!? Many automation engineers would start by trying to automate data entry via the UI. Your sample code (#cypress) might look like this:
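A minimal sketch of such a helper, under stated assumptions: the "/register" route, the "data-test" selectors, the success message and the sample users are all hypothetical and need to be replaced with whatever your application actually uses. The "cy.visit", "cy.get", "cy.contains", ".type", ".click" and ".should" calls are standard Cypress commands.

```javascript
// Hypothetical test users; in practice these might come from a fixture file.
const users = [
  { email: 'test.user1@example.com', password: 'Passw0rd!1' },
  { email: 'test.user2@example.com', password: 'Passw0rd!2' },
];

// Fills in and submits the (assumed) registration form for one user.
// `cy` is the global object that the Cypress runner provides in spec files.
function registerUserViaUi(user) {
  cy.visit('/register'); // assumed route, relative to Cypress baseUrl
  cy.get('[data-test=email]').type(user.email); // assumed selectors
  cy.get('[data-test=password]').type(user.password);
  cy.get('[data-test=submit]').click();
  // Assumed confirmation text; waiting for it keeps entries from overlapping.
  cy.contains('Registration successful').should('be.visible');
}

// Inside a spec file this would be wrapped in a test, for example:
// it('registers test users', () => users.forEach(registerUserViaUi));
```

Every user costs one full page load plus several DOM interactions, which is where the time goes.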

This is often better than clicking around and typing, but UI automation is:

Not always reliable (your goal is to have enough data to continue your exploration, not to fight the UI automation framework)
Slow. The "cy.visit" action alone can add a second or two per entry, on top of starting the framework and the browser (OK for a few users, but creating a lot of data entries could take hours or even days)

Nightmare difficulty is not too good either... It’s not enjoyable! 😵‍💫 Let’s play on normal!

3 - Normal 😙

API! Application Programming Interface. Even the acronym suggests that “test data entry” is one of the intended usages. We are doing some programming here! 🤪

Sample code for the API user creation might look like this:
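A minimal sketch, assuming a hypothetical "POST /register" endpoint that accepts a JSON body with "email" and "password" and answers with JSON. The base URL, the endpoint path and the payload shape are assumptions. The code relies on the global "fetch" that ships with Node.js 18 and later; the optional "fetchImpl" parameter only exists so the function can be tried out without a running server.

```javascript
// Assumed base URL; replace with the address of the system under test.
const BASE_URL = 'http://localhost:3000';

// Builds `count` distinct, hypothetical users.
function buildUsers(count) {
  return Array.from({ length: count }, (_, i) => ({
    email: `test.user${i + 1}@example.com`,
    password: `Passw0rd!${i + 1}`,
  }));
}

// Registers one user with a single HTTP call to the assumed endpoint.
async function registerUser(user, fetchImpl = fetch) {
  const response = await fetchImpl(`${BASE_URL}/register`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(user),
  });
  if (!response.ok) {
    throw new Error(`Registration failed for ${user.email}: HTTP ${response.status}`);
  }
  return response.json();
}

// Registers users one after another so a failure points at a specific user.
async function registerUsers(users, fetchImpl = fetch) {
  const created = [];
  for (const user of users) {
    created.push(await registerUser(user, fetchImpl));
  }
  return created;
}
```

Calling "registerUsers(buildUsers(100))" against a running application would create a hundred users without ever opening a browser.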

If you ran this code side by side with the UI version, you would notice how fast it is. It might even be an order of magnitude (10x) faster.

The reason is that your computer doesn’t have to spin up a heavy UI framework, open the browser, navigate to the page and render it. It just makes API calls! 😇

If you don't know how to write Node.js code, you could use a tool with a UI. Postman allows you to do data-driven test runs. See an example here: https://www.youtube.com/watch?v=eJJHDXqIWf0

Another possible benefit of the API is the fact that it MIGHT have endpoints that are MORE SUITABLE for your task. For example, our API may allow the creation of many users via the “/registerUsers” endpoint. In this case, we would only need to make 1 HTTP call with an array of user objects:
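A minimal sketch of that single call, assuming the “/registerUsers” endpoint accepts a POST with a JSON array and responds with JSON. The base URL, the HTTP method and the shape of each user object are assumptions; only the endpoint name comes from the example above.

```javascript
// Assumed base URL; replace with the address of the system under test.
const BASE_URL = 'http://localhost:3000';

// Describes the one request that carries every user at once.
function buildBulkRegisterRequest(users) {
  return {
    url: `${BASE_URL}/registerUsers`,
    options: {
      method: 'POST', // assumed method
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(users), // the whole array travels in one body
    },
  };
}

// Sends the bulk request; `fetchImpl` defaults to the global fetch (Node.js 18+).
async function registerUsersInBulk(users, fetchImpl = fetch) {
  const { url, options } = buildBulkRegisterRequest(users);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`Bulk registration failed: HTTP ${response.status}`);
  }
  return response.json();
}
```

Compared with one call per user, this trades per-user error reporting for a single round trip, so it fits best when the data is known to be valid.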

4 - Easy 🤩

Can it be even simpler than using the API? Really?! Well… maybe. We could go and do our inserts directly into the database!

Beware! Inserting data directly sometimes might be harder than via the API.

To do that, we just need to generate the sample “user insert” code:
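A minimal sketch of such a generator, assuming a hypothetical "users" table with "email" and "password_hash" columns; your schema will almost certainly differ. The hash values are placeholders, since a real application would expect hashes produced by its own algorithm. Quoting follows the standard SQL rule of doubling single quotes, which is adequate for data you generate yourself but is not a substitute for parameterized queries.

```javascript
// Wraps a value in single quotes and doubles any embedded single quote.
function sqlString(value) {
  return `'${String(value).replace(/'/g, "''")}'`;
}

// Builds one multi-row INSERT for the assumed `users` table.
function buildInsertStatement(users) {
  if (users.length === 0) {
    throw new Error('At least one user is required');
  }
  const rows = users.map(
    (user) => `  (${sqlString(user.email)}, ${sqlString(user.passwordHash)})`
  );
  return `INSERT INTO users (email, password_hash)\nVALUES\n${rows.join(',\n')};`;
}

// Hypothetical test users with placeholder hashes.
const users = Array.from({ length: 3 }, (_, i) => ({
  email: `test.user${i + 1}@example.com`,
  passwordHash: `placeholder-hash-${i + 1}`,
}));

console.log(buildInsertStatement(users));
```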

If you run this program, it will print a SQL statement, ready to be copy-pasted and run on your DB side!

Downsides of this approach:

The database is complicated. Entities live across multiple tables, the database has triggers on data entry, etc. These things can make it really hard to prepare the insert statement.
Access is restricted. You might want to run this script in an environment where you have no “write access”.
The DB structure (schema) changes. This will force you to maintain the script

If this happens (or could happen), use the API!

Summary 🗒️

Playing on lower difficulties will make your experience better!

Of course, “how you want to play the game” is ultimately your choice. I can only ask you to play on “normal” or “easy” and make you aware that these options are out there. At times, they might not be the easier ones for you… But if you spend some time learning them (do it, it’s not that hard), you will be rewarded with:

a lot of saved time (fast and reliable)
satisfaction from learning something new
less stress from doing mundane manual tasks or aimlessly fighting the UI frameworks
…and more! ☺️

The end

If you read till the end, you are the best! If you enjoyed it, don’t forget to react or leave a comment. This helps the algorithm to show it to other relevant people. More people see and react, which means I am more motivated to write more. I write more and YOU will benefit from reading it. ❤️😉

Are you a manual tester? Or a beginner automation engineer? Want to be better? Want to learn more about test automation and not sure how to go on this journey? I can help!

I coach people on test automation within the JavaScript ecosystem. UI/API/Unit/Performance/Contracts, you name it… You can book a free first consultation to talk about your needs and goals here: https://ivanandcode.com/coaching

Anna Kovalova

Fantastic insights on the nuanced difficulty levels in creating test data! Navigating the challenges in data creation is indeed a crucial aspect of effective testing.

Simon Devon

Great article! IMHO, I prefer to use API calls to create data, it's the safest way and you get a bit of extra testing thrown in for good measure. I find that DB calls can add complexity to automation. Cypress handles API calls so well it's a bit of a no brainer...

Raúl Telo Sánchez

I liked the article 😊. In complex scenarios, when multiple tables need to be populated before performing a test, I found it easiest to use test fixtures. You can approach this in different ways; I opted to create snapshots of data that I could rely on, and make the backend point to those snapshots when running the tests.


Chris Van Bael

Nice article! However, from my experience, I wouldn't put direct DB entry as the easiest. I'd prefer API. Unfortunately, when working with 20+ year old systems, you don't always have an API. Trying to understand the DB layout and values that go into the tables can be difficult, as you've mentioned.


Bret R.

I prefer creating data through the interface, which serves a twofold purpose. 1. It puts the data in as intended and doesn’t miss any additional flags or seemingly meaningless settings. 2. It creates valid regression tests for your pack when using the CI method. Also, the interface way is quick if you run it overnight, and the business can relate to it. Mind you, some testers could see the API route as complex and you need to keep me on to keep it running, thus job security. I’ve been doing this for 30 years. Automate the functional process and it works much more smoothly: there are no missed settings or data elements, and it starts the regression pack. Being too clever is almost always a bad choice and makes it harder to maintain. Always use module automation so you can pick and choose your path of data generation and have smaller chunks of code to maintain.
