How to Create Good Test Cases

Skip to end of metadata
Go to start of metadata

There are a lot of strict rules and rules of thumb which may increase the quality of tests written for the tool pair mysqltest/mysqltest-run.pl (MTR). The following presentation should give you some hints. This document should not replace reading our excellent manual about testinghttp://dev.mysql.com/doc/mysqltest/en/writing-tests.html.Some of the examples might not workwith MTR1 oron some operating system.MTR2 means the second version of mysql-test-run.pl/mysqltest. MTR2 replaced MTR1 ~ Jan 2009 in MySQL 5.1 and 6.0.The term "protocol" means in most cases "output from a test case." October 2007 - Matthias Leich - Create this document for a MySQL University sessionMarch 2009 - Matthias Leich - Updates and corrections, add "Examples of suspicious scripts"March 2009 - Patrick Crews - Corrections, add "Notes on MTR"

Formal stuff

Rules about formatting within scripts and similar stuff. 

Coding style

Please have a look at the C/C++ coding guidelines and apply them when they make sense: Coding Guidelines

Some more or less strict rules:

  • Try to have a good readable text flow. Lines with very different length are a pain.
  • Avoid lines longer than ~ 100 characters unless there is no other choice. Think about a comparison of the new and the old version of a test script within a graphical diff tool.Having the difference frequent at the end of long lines is very uncomfortable.
  • mysqltest only accepts comment lines starting with '#'.
  • Use spaces, not tabs.
  • Lines must have no trailing spaces.
  • Write SQL statements in the style of the MySQL Reference Manual
    • SQL keywords and reserved words: uppercase
    • Identifiers (table names, column names, etc.): lowercase Ignore this guideline if your intent is to check the processing of mixed lettercases ;-)Please follow this guideline for new tests. Rewriting existing tests to use better formatting is good, but can be tedious - a rule of thumb: don't rewrite unless you intend to touch the entire subtest (not the entire .test file, but the statements that make up a complete testing 'unit')
  • If an SQL statement is long, add line breaks to reformat it and make it easier to read.
SQL statement example
Please avoid too dense code

except you intend to check the parser or the performance of the reader of the code ;-).

Header of scripts

For each test or auxiliary script, create a header that contains the following information:

  • Purpose of the test or script
  • Corresponding WL task, if there is any
  • Creator of the test and date of creation
  • Author of last significant change + date of change + what was changed
  • Dates should be in ISO format (ISO 8601): YYYY-MM-DD
  • In case that the script assigns values to some variables and sources some master test script, please explain the purpose of these variables and why you use these values

Note: The header, like the rest of the test, should not mention confidential information. Remember, our tests are available publicly.

Header example
Comments within boxes

Please work 100% perfect when using boxes made of '#'. Such boxes "jump" into the eye of most readers. Bad example similar to what I found in some tests:

Please note that this is caused by spaces, not tabs.

Comments Everywhere

Write comments, they save the time of others.

Complicated test architecture or tricky code

If your test is very complicated, write a sufficient explanation of the test architecture. This helps avoid having someone else come along and unintentially change the test in ways that:

  • Shift the goal of the check
  • Destroy your test architecture
  • etc.

Comment your code when you do or check something that someone else may think is not trivial. Please write what you intend to check if it is not obvious.

A subtest case is able to reveal a bug

Write a comment if the next subtest revealed a bug in history. Please mention number and title.

Please use exact the formatting "# Bug#nnnnn <Title>"

because it does not look nice to see so many different variants like "BUG #nnnnn", "bug #nnnnn - Title" often even within the same test.

Subtests

At least in cases where your file contains many subtests

  • Mark these subtests for better readability of the script
  • Write also a message into the protocol.

And please explain what each subtest checks, unless it is obvious.

Example:

Make test protocols more understandable

Please have the test write comments into the protocol if this makes the surrounding protocol content much more understandable. This is especially true for tests that do the following:

  • Suppress the printing of SQL statements and result sets
  • Work with more than one connection
Better protocols example

Good script with message about switching the connection:

Protocol of good script:

This is what the protocol looks like without the comment. The output becomes less clear because there is no indication that the connection has been changed.

Some Rules Outside of the Formal Stuff

Error masking

Use error names instead of error numbers whenever possible.

Example:

You can find the error names in the include/mysqld_error.h file of a MySQL source distribution, or in the Server Error Codes and Messages section of the MySQL Reference Manual]. NOTE: Don't do this with error numbers > 2000. Use of the error name does not seem to work well in these cases.

Use of the option "--disable_abort_on_error"

This option is very useful when starting to write a new test because the test will not abort if your script contains some failing statements or SQL syntax errors. By disabling the abort, you get to see more of the errors per test run and can fix more of them at a time.

But there are only rare situations where the final versions of a test should use "--disable_abort_on_error", either at all or during a sequence of several SQL statements.

Perfect cleanup at the end of a test
  • Cleaning up becomes much more comfortable and less error prone if you create your "own" database and create all tables there. At the end of the test, you need to drop just this database.
  • Do not forget to remove all users you created and all permissions you granted. Otherwise the next test might fail when checking grants in general.
  • Close all connections which you have explicitly created and please ensure that the disconnect is finished = the sessions are no more visible within the processlist.
  • Remove all auxiliary files created within your test.

MTR2 runs now by default with "check-testcases" enabled. It checks if there are additional objects like user or tables, modifed system table etc. "check-testcases" will be soon improved. However, proper cleanup should still be a responsibility of both the test author and reviewer.

Use of OS-specific commands

The exec and system commands enable tests to execute external commands. However, many of such commands are available only on certain platforms. (For example, rm is Unix-specific and not available on Windows.) Please avoid these commands if possible. They harm the portability and stability of tests.

Have a look at the mysqltest manual and the t/mysqltest.test test file. There are now several mysqltest built-in commands such as

which are reliable on all operating systems. Sometimes OS commands could be also replaced by SQL statement sequences writing to and reading from files and some SQL string functions.

"Negative" Tests

A "negative" test is a test for which you expect to see a failure. If an error does not occur, that itself indicates a problem.

DO NOT FORGET "NEGATIVE" TESTS where we expect to see fine error messages from the server.

This section contains just a few examples of what to test. Please be creative and imagine what could go wrong in rough reality.

Column with numeric data type:

  • NULL, 0
  • Minimum - 1, Minimum, Minimum + 1
  • Maximum - 1, Maximum, Maximum + 1
  • Negative values if data type is unsigned
  • Garbage like 'abc', '1a', 'a1'

Column with string data type:

  • Null, <empty string>, "exotic" characters like 'ä', single quotes, ...
  • String longer than column

Limited number of tables, columns, variables, ...

  • Maximum - 1, Maximum, Maximum + 1

Assume an SQL operation that affects the filesystem (LOAD DATA, CREATE SCHEMA or TABLE, backup/restore, ...). What will happen if the following conditions occur for a file or directory to be read, written, or implicitly created or deleted?

  • Exists/does not exist
  • Is writable/not writable(missing permission)
  • Is empty/not empty
  • Contains the expected content (Example: text)/unexpected content like maybe a JPG)
  • Is a regular file/a directory/a softlink pointing to .../a pipe
  • Is assigned via an extremely long path
  • Becomes victim of file system full

Tests with Several Variants

Please think twice before you create

  • engine-specific variants of a test

or

  • let runs with ps-protocol/sp-protocol/cursor-protocol/view-protocol

happen. They might be

  • of low value and a permanent wasting of resources and/or
  • fail (protocol variants only)
Variation of the storage engine
  • Checks of the INFORMATION_SCHEMA The storage engines to be used for information_schema tables are hardcoded. Therefore tests focussed on permissions, optimizer strategies, column data types etc. when selecting on INFORMATION_SCHEMA tables should not run with storage engine variations.
  • Please prepare the scripts at least for use with different storage engines. That means mostly:DO NOT use hardcoded storage engine assignments within CREATE TABLE statements.
  • If you assume that there is no significant impact of storage engines on your testing object, create/run with storage engine variants and check this.
    Solution 1:

Do not assign the storage engine within your CREATE TABLE statements at all. The default storage engine MyISAM will be used for your tables. Check your assumption with:

Solution 2:

Assign the storage engine to be used via $variable.

Top level script:

The same script or a sourced script:

Check your assumption by creating and running storage engine variants of the top level test.

Variation of the protocol option

It is usual to run tests with and without the mysql-test-run.pl startup option "--ps-protocol". And there are also attempts of System QA to run the other protocols.

  • Effect of the "--ps-protocol" option: mysqltest will run as many SQL statements as possible as prepared statements.
  • Effect of the "-sp-protocol"/"cursor-protocol"/"-view-protocol" options: As far as I know, mysqltest takes many statements and transforms them into a statement sequence checking the corresponding feature (stored procedures, cursors, or views).

We have a lot of tests running very similar and extremely simple SQL just for the creation of a situation to be tested, check of table content, etc. So it can be assumed that the n'th test running again simple statements does not improve the coverage.

Conclusion:

If your test contains

  • Simple statements: Prevent the non valuable runs for "protocols".
  • Storage engine variations: Prevent the protocol runs for all except one storage engine (my take would be MyISAM).
  • "Unique" and complex statements:
    • ps-protocol statements = all SQL
    • sp-protocol statements = DML
    • view/cursor-protocols statements = SELECTs Do not prevent the protocol runs.Another problem around runs with such protocols is that there are cases where we get different protocol content.

Example:

Script:

Protocol content if running without any "--*-protocol":

Protocol content if running with "--ps-protocol":

So please check whether every new test gives the same result with every protocol but at least with "--ps-protocol". If not, do something of the following:

  • Exclude some protocol variants from execution.
  • Disable the the use of the protocols for problematic statements.
  • Write protocol variant specific tests.

Example solution (code within the top level scripts):

or

Stability checks for random timing problems
  • Recommended Several test runs (<number>), use an memory based (extreme fast) filesystem if available ./mysql-test-run.pl --mem .... --repeat=<number> <your test>
  • Several test runs, use a "slow" disk based filesystem rmdir var # "var" must not be a symlink pointing into a RAM based filesystem
    ./mysql-test-run.pl .... --repeat=<number> <your test>
  • Recommended Generate parallel load on the CPUs and/or the disk where MTR2 reads and writes data (in most cases directory "var"). Examples: all OS: Compile a MySQL version from sourceUnix derivates (OpenSolaris,Linux,OSX,...): tar -chvf - <some path> > /dev/nullUnix derivates: root: dd if=<area of disk containing the "var" directory> of=/dev/nullWindows: Defragmentationrmdir var # "var" must not be a symlink pointing into a RAM based filesystem
    ./mysql-test-run.pl .... --repeat=<number> <your test>
    all OS:./mysql-test-run.pl .... --repeat=<number> --parallel=8 <your test>
Stability checks for hostname-related problems sudo hostname <other hostname>

Good values for <other hostname> to check are "0", "bbb", "mmm", "zzz".

Stability checks for unexpected problems

Ask a colleague to run your test or try another testing box (preferably one with a different operating system).

Row order within result sets

Please keep in mind that the row order within a significant number of result sets depends on storage engine properties and in worst cases the current load (NDB!, partitioning?, parallel query?) on the testing box.

Either use

or, if possible, "decorate" your SELECT with ORDER BY to make the row order static.

Test case behavior should depend on the result of a SHOW

Please have a look at the t/mysqltest.test file where "query_get_value()" is checked.

Does a test execution hang?

Symptom: No text flooding through the command window where a test is running.

Run this command in a different window:

If you see some changes, your test is alive.

Your test is complicated, support debugging

Top level test script:...

At various places within your scripts:

or

You need to know some syntax, option etc., but hate reading manuals

Try this:

Notes on MTR2

MTR2 has brought some new utility that can improve testing, but also some elements that can be pitfalls. This section attempts to describe both.

  • Changes $MYSQLTEST_VARDIR/master-data -> $MYSQLD_DATADIRThis change was introduced to work with parallel. Since MTR2 can run several instances of itself, one path is needed for each instance.You must also use let $MYSQLD_DATADIR= `SELECT @@datadir`;
  • New options max-test-fail: The number of test cases that can fail before the test run aborts. Default=10, set to 0 to continue the run regardless of failure count.NOTE: --force alone is no longer sufficient to guarantee a full test suite run. If the number of failing tests == max-test-fail, then the test run will abort.parallel: Set n number of workers to run tests in parallel. The current recommendation is 2x the number of processors on your machine.NOTE: You can no longer be sure of executing test cases in a particular order due to this and other changes brought on by MTR2. Do not try to create multi-part tests (test_pt1.test, test_pt2.test, etc).NOTE: Be cautious when using a high (>2) value for parallel and --mem in combination. It is possible to use vast amounts of resources on your system and cause problematic performance till a crash of the operating system while MTR2 is running.repeat: Run the input test(s) n times in succession. Very good for diagnosing random failures - set a high value for 'n', create a high load on your machine, and see if you can duplicate the failure.

Examples of "suspicious" tests and scripts

Probably mismatch of focus of test and code sequence

This test will

  • fail if the feature AUTO_INCREMENT is temporary broken
  • will (depending on existence of prerequisite checks) fail or get skipped if the default storage engine does not support AUTO_INCREMENT or LONGTEXT ==> no coverage for other features or properties checked within this testSome questions with recommended action depending on the answer:
  • Is the table t1 just an auxiliary table and not the test object? Yes: Please ensure that the test does not break or gets skipped if the default storage engine does not support AUTO_INCREMENT or LONGTEXT and you are done.No: no action
  • Do we check AUTO_INCREMENT and the corresponding column is t1.id? Yes: no actionNo: Remove the use of AUTO_INCREMENT
  • Do we check LONGTEXT and the corresponding column is t1.name? Yes: no actionNo: Remove the use of LONGTEXT
  • Do we check AUTO_INCREMENT and LONGTEXT in combination? Yes: no actionNo: Split the test at least if it should be a test of basic functionality
    Too greedy test

Let's assume we have to check that every new created table causes a row in information_schema.tables.

  • the number of columns within information_schema.tables changes
  • the data type of a column within ... changes
  • the content within one of the columns within ... changes Lets assume something intentional like VERSION is now 11 instead of 10.This means we should try to avoid "SELECT *".Final statement:

    Are we really consequent? No, but we could

    It's on you which variant you choose. I decide depending on the situation.

Risky handling around additional sessions

Disconnects are performed asynchronous. This is most probably good for the throughput of usual application but has some annoying consequences for the development of tests.

If a

  • disconnect <connection> or
  • KILL <connection_id> or
  • --exec <client program which connects to the server>

occurs within a script for mysqltest than the next statements will be executed even if the disconnect or kill is not complete finished.

I guess this also applies to the end (-> end of statements to be executed by mysqltest) of a test. It triggers some disconnect for any open session but this is also asynchronous.

Not complete finished means that the session is visible within the processlist and the entry about the 'quit' of the session is not already appended to the general log.

So it could happen that a succeeding test suffers by unexpected events like

  • observing the session belonging to the previous test within the processlist
  • being surprised by an entry (the 'quit' of the disconnected session) within the general log which was not caused by its own activity etc.

According to my experiences the likelihood that some test could harm a successing test by unfinished disconnects increases with

  • additional parallel load on the testing box
  • "disconnect <connection>" is missing within the test
  • "shorter" distance between "disconnect <connection>", "KILL <connection_id>" or "--exec ..." and end of test
  • the session runs a "heavy" SQL statement just before its disconnect

In my opinion the most robust solution is to add a

just after the disconnect.

In case you don't believe me, run the sniplet following later

and you will most probably observe something like

Sniplet for demonstration purposes:

The sniplet is intended for demonstration purposes and contains code which is obvious "crap" but needed to enforce the intended effect. Some notes:

  • (1) Such a statement at the begin of a test makes this test very sensitive to not finished disconnects.
  • (2) The wait routine (3) prevents that our look on the process list (4) happens too early = before the server has started the processing of (2). Otherwise we get random result set differences for (4) under high parallel load.
  • How to to prevent that this test harms the succeeding test?
  • Where is the "crap"? There should be a 'reap' for every preceeding 'send <statement>'. This is not just for academic completeness it also ensures here that (2) is really finished.Do we really need 10 seconds within the SLEEP? According to my experiences sleep times <= 2 seconds are critical under high load. What about SLEEP(5)?
Labels:
None
Enter labels to add to this page:
Please wait 
Looking for a label? Just start typing.

Sign up or Log in to add a comment or watch this page.


The individuals who post here are part of the extended Oracle community and they might not be employed or in any way formally affiliated with Oracle. The opinions expressed here are their own, are not necessarily reviewed in advance by anyone but the individual authors, and neither Oracle nor any other party necessarily agrees with them.