.\"
.\" Disktest raw man text
.\" Copyright (c) International Business Machines Corp., 2001
.\"
.\"
.\" This program is free software; you can redistribute it and/or modify
.\" it under the terms of the GNU General Public License as published by
.\" the Free Software Foundation; either version 2 of the License, or
.\" (at your option) any later version.
.\"
.\" This program is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public License
.\" along with this program; if not, write to the Free Software
.\" Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
.\"
.\"  Please send e-mail to yardleyb@us.ibm.com if you have
.\"  questions or comments.
.\"
.\"  Project Website:  TBD

.\" Process this file with
.\" groff -man -Tascii disktest.1
.\"
.\" $Id: disktest.1,v 1.5 2008/02/14 08:22:25 subrata_modak Exp $
.\"

.TH DISKTEST 1 "March 2007" OS "Diag Tools"
.SH NAME
disktest \- Test tool for exercising disk devices
.SH SYNOPSIS
.B disktest [-q] [-Q] [-r] [-w] [-E
.I cmp_length
.B ] [-a
.I seed
.B ] [ -A
.I action
.B ] [-z|c|n|f
.I fixed_pattern
.B ] [-B
.I sBLK[:eBLK]
.B | -C
.I cycles
.B ] [-d ] [-D
.I r%:w%
.B ] [-F] [-h
.I heartbeat
.B ] [-K
.I threads
.B | -L
.I seeks
.B ] [-P
.I TXPRCA
.B ] [ -m ] [-N
.I sectors
.B ] [-R
.I retry[:retryDelay]
.B ] [-o
.I offset
.B ] [-s
.I sLBA[:eLBA]
.B ] [-S
.I sBLK[:eBLK]
.B ] [-t
.I delayMin[[:[delayMax]][:ioTimeout]]
.B ] [-T
.I seconds
.B ] [-p
r
.B |
R
.B |
l
.B [
u
.B |
d
.B ]
.B ] |
L
.B [
u
.B |
d
.B ]] [-I [
d
.B ]
r
.B |
b
.B |
f
.B [
s [ delay ]
.B ]
.B ]
.I filespec
.SH DESCRIPTION
.B Disktest
does repeated accesses to a
.I filespec
and optionally writes to, reads from, and verifies the data.  By default,
.B disktest
makes assumptions about the running environment, which allows for a quick start of IO generation.  However,
.B Disktest
has a large number of command line options which can be used to adapt the test for a variety of uses including data integrity, medium integrity, performance, and simple application simulation.

.B Disktest
will use the device specified by
.I filespec.
If no option is specified otherwise,
.B disktest
will attempt to determine
.I filespec
type.  A fully qualified path must be given when
.I filespec
is not a normal file.  This will help to determine its type.
.SH OPTIONS
Most options have multipliers that can be used to specify larger amounts.  The following is a list of these multipliers.
.RS

k = 1024, K = 10^3, m = 1024^2, M = 10^6, g = 1024^3, G = 10^9

.RE
These can be used on options such as -B, -L, -N, -S, -s.  The time options also have multipliers.
.RS

m = 60, h = 60*60, d = 60*60*24

.RE
These can be used on options such as -h and -T.
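
As a rough illustration of how these suffixes scale a value, the following Python sketch applies the two tables above.  The helper names parse_count and parse_seconds are hypothetical and are not taken from the disktest source; the sketch also assumes the multiplier is a single trailing letter, as in 8k.
.nf
```python
# Illustrative only: parse_count and parse_seconds are hypothetical helpers,
# not functions taken from the disktest source.
SIZE_MULTIPLIERS = {
    "k": 1024, "K": 10**3,
    "m": 1024**2, "M": 10**6,
    "g": 1024**3, "G": 10**9,
}
TIME_MULTIPLIERS = {"m": 60, "h": 60 * 60, "d": 60 * 60 * 24}


def _scale(text, multipliers):
    """Return the integer value of text, applying an optional one-letter suffix."""
    if text and text[-1] in multipliers:
        return int(text[:-1]) * multipliers[text[-1]]
    return int(text)


def parse_count(text):
    """Parse a count such as 8k (8192) or 8K (8000)."""
    return _scale(text, SIZE_MULTIPLIERS)


def parse_seconds(text):
    """Parse a time value such as 2h (7200 seconds)."""
    return _scale(text, TIME_MULTIPLIERS)


print(parse_count("8k"), parse_count("8K"), parse_seconds("2h"))
```
.fi
With the tables above, this prints 8192 8000 7200.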
.IP -?
Displays a short description of the command line options and exits normally.
.IP "-a seed"
Use seed for all random number generation when constructing blocks of pseudo-random data and random seeks.
By default, seed is set to the process id number.
To reproduce a previous test run, use the process id number outputted to stdout.
.IP "-A action"
These options are used to modify the default behavior during IO runtime.
The following options can be used for these modifications.
.RS
.RS
.IP g
all threads, even to multiple targets, will be killed and the disktest process will stop.
By default, when an access failure occurs only the threads to the failing target will stop.
.IP c
after error, all threads for all targets will continue to run.
.IP r
the reread that would normally occur on a data miscompare is not performed.
.IP m
after error, a special IO is sent to LBA 0 of the target.
This is so that a trigger can be set on the special data.
The IO begins with the string "DISKTEST ERROR OCCURRED".
This will overwrite any mark data or normal pattern data written.
.IP s
disable block level synchronization.
By default disktest will synchronize IO on a block level according to the POSIX standard that states that an application must serialize IO to a block or raw device.
Disktest will not allow a write if the block is already being written or read, but will allow a read if other reads are occurring.
This option will turn off this checking.
.IP S
enable IO serialization.
This adds to the synchronization of the threads, so that there is no more than one IO outstanding to a target, no matter how many threads are running to the target.
.IP t
causes an IO timeout to fail an IO run.
Normally an IO timeout (see the -t option) will just display a warning.
.IP w
If running the -pR seek type, and this action is specified, then blocks will only be written once, and be read many times, i.e. WORM style testing.
This option does nothing with other seek pattern types, or if not reading and writing.

.RE
.B NOTE:
Options g and c are exclusive.
Option m cannot be specified with c.
Option m cannot be specified if only reading from a device.
When using option s, it is possible on some operating systems that a data miscompare may result, due to a kernel that does not guarantee exclusion or does not handle multiple readers and writers to the same block well.
.RE
.IP "-B sBlockSize[:eBlockSize]"
Set the size of the data block transfer.  If only
.I sBlockSize
is specified, then the transfer length is always a fixed length of
.I sBlockSize.
If
.I eBlockSize
is specified, then the transfer length will be randomly chosen between
.I sBlockSize
and
.I eBlockSize.
The transfer size will always be a multiple of the sector size.
If either parameter is greater than 256, then the value will be integer divided by the default sector size, which is 512 bytes. If either parameter is followed by a 'k', e.g. 8k, then the value will be multiplied by 1024. Otherwise, the parameters will be taken literally.  The default block size is (1*sector size) or 512.  Note that
.B O_DIRECT
(no filesystem buffering) and some file systems may not be able to perform accesses as small as 512 bytes.  This will result in an IO failure with a transfer length of -1.
.IP -c
Use a counting sequence for the bytes within each block.  The count starts at 0 and increments to 255 then begins again at 0.  Each sector is filled with two of these sequences.
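
A minimal Python sketch of one sector of this pattern, assuming the default 512 byte sector size.  The name counting_sector is hypothetical and is used only for illustration.
.nf
```python
SECTOR_SIZE = 512  # default sector size stated in this manual


def counting_sector():
    """Build one sector: the byte values 0..255, repeated to fill 512 bytes."""
    return bytes(range(256)) * (SECTOR_SIZE // 256)


sector = counting_sector()
# Length, last byte of the first sequence, first byte of the second sequence.
print(len(sector), sector[255], sector[256])
```
.fi
This prints 512 255 0, showing the count wrapping back to 0 halfway through the sector.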
.IP "-C cycles"
Run until the total number of
.I cycles
are complete.  When cycles is set to zero, disktest will run until killed.  The -L or -T option is used to specify how long each cycle will run for.  If neither -T nor -L is specified, the cycle length will be calculated as the number of seeks based on the device size, -N, divided by the transfer size, -B. If -C is not specified, only a single cycle is run.
.IP -d
Force
.B disktest
to dump to stdout the amount of data at the location specified by the other command-line options, e.g. -d -s50 -B 32k will dump 32768 bytes of data to stdout starting at LBA 50. The data is formatted into lines of 16 bytes with the location offset and ASCII equivalent.
.IP "-D r%:w%"
Duty cycle used while reading and/or writing.  For example, -D 20:80 would cause
.B disktest
to generate a read 20% of the total run time and generate a write 80%.  If only read or write is given then the percentage is always set to 100 for the specified option.  If the total percentage does not add up to 100, e.g. -D 20:70, then
.B disktest
will split the remaining percentage, resulting in 25% reads and 75% writes.
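
The arithmetic behind the 20:70 example can be sketched in Python as follows.  The even split is inferred from that example alone, normalize_duty is a hypothetical name, and rejecting totals above 100 is an assumption, since this manual does not describe that case.
.nf
```python
def normalize_duty(read_pct, write_pct):
    """Split any unassigned percentage evenly between reads and writes."""
    total = read_pct + write_pct
    if total > 100:
        # Assumption: this manual does not say what happens above 100.
        raise ValueError("percentages exceed 100")
    remainder = 100 - total
    return read_pct + remainder / 2, write_pct + remainder / 2


print(normalize_duty(20, 70))
```
.fi
For -D 20:70 the unassigned 10% is halved, giving (25.0, 75.0).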
.IP "-E compare_length"
Turn on error checking.  Data read from
.I filespec
will be checked for correctness up to the number of bytes specified by
.I compare_length.
If
.I compare_length
is 0 then the block size is used as the compare length. By default, data read is not checked for errors.
.IP "-f fixed_pattern"
Use a data pattern consisting of a fixed value.
.I fixed_pattern
can be entered using decimal (any number not starting with a 0), octal (any number starting with a 0), or hexadecimal (any digits and [A-F] starting with 0x). The magnitude of the value can be no greater than 2^63.
.IP -F
Used to specify that the
.I filespec
is actually a file listing the targets that
.B disktest
should operate on.  This allows
.B disktest
to run to multiple targets using the same options from a single command-line.
.IP "-h heartbeat"
Performance data will be sent to stdout every
.I heartbeat
seconds. During a linear test, -pL, only heartbeat statistics for the current operational cycle will be displayed. The default is to only display performance data at the end of the test, which is cumulative for all IO performed throughout the test.
.IP "-I IO_type"
Set the data transfer type to IO_type. Valid IO types are
.B r
(raw),
.B b
(block), or
.B f
(file) I/O.  These options are case sensitive.

The
.B r
(raw) type is used when binding a block device to a raw device, see
.B raw(8).
.B Disktest
will align its buffers correctly to support raw devices.

The
.B b
(block) type is used when block IO is desired.  The buffer_cache will be used during testing.  Buffer alignment is not required for this type of IO operation.

The
.B f
(file) type is used when accessing a file.  If the file does not exist then it will be created.  If the file exists, then it will be opened; see
.B O_CREAT
in
.B open(2)
for more details. Access to the file is performed through the file system that the file is stored on. Adding an s modifier to the f (file) type operations will force an
.B fsync(2)
to occur on every write.

Adding
.B d
will open with the
.B O_DIRECT
flag set.
If this option is used, then I/O is limited to being aligned to the file system's block size.
When transferring to a block device without a file system, alignment is to 1k.
These limits have been verified with the 2.4.9 kernel and the o_direct patch from AA.

Adding
.B s
.I sync_interval
specifies that a sync should occur at
.I sync_interval
number of write IO operations. The default is to sync on every IO.

.B Disktest
will report a failure if
.I filespec
does not match the
.I IO_type
specified.
If no type is specified, then disktest will attempt to determine the file type by using stat(2).
.IP "-K threads"
Set the number of test threads to threads.  Each child can read or write based on the specified criteria.  The default number of test threads is 4.
.IP "-L seeks"
Total number of seeks to occur during testing.  This option specifies the exact number of times a seek occurs on a resource.  By default
.B disktest
will calculate the number of seeks by taking the difference between the start block and the stop block.  If the difference is 0 then the default is 1000 seeks.
.IP "-m"
This option will add the lba, pass count, seed, cycle start time, hostname, and target to each LBA as header data before any IO operation occurs. The mark replaces the first n bytes of data in each LBA.
.IP "-M marker"
This option will override the cycle start time in the -m option with the specified value. This is useful when you are writing in one disktest instance and reading/verifying in another, so that the mark data can be unique and also deterministic.
.IP -n
Use a data pattern that consists of the LBA number.  An LBA value of up to 64 bits can be stored.  The 64-bit value is repeated to fill the transfer buffer.
.IP "-N sectors"
Set the number of available sectors to
.I sectors.
If
.I sectors
is not specified, and the size of the device can be determined, then the number of sectors as reported by the device is used; otherwise, the default number of sectors is 2000.
.IP "-o offset"
This option specifies the LBA
.I offset
to shift all alignment of IO by.
For example, if a test is to perform a full stride write on a storage device, and the os and/or storage device offset the strides by a number of LBAs, this parameter can be used to set that offset, so that IO is aligned to the stride on the storage device.
By default the offset is set to zero.
.IP "-p seek_pattern"
Set the pattern of seeks to
.I seek_pattern.
Valid patterns are
.B l
(linear interleaved writes/reads),
.B L
(linear writes then reads),
.B R
(random),
.B r
(random interleaved writes/reads).  Linear may also specify what happens when the last block is reached.  Option
.B u
specifies that the test should start back at first block after reaching the last block.
.B d
specifies that the test, after reaching the last block, should start at the last block and go to the first block. The default extra option for linear is 'u'. The default seek is random.
.IP "-P perf_opts"
Record performance statistics to stdout. Perf_opts is a string of characters representing which statistics should be reported.  The possible options are:

.RS
.RS
.B T
- Disk throughput

.B X
- Number of transfers

.B P
- Display performance data in ';' delimited format

.B R
- Display runtime

.B C
- Display cycle performance details

.B A
- Display all performance options

.RE
.RE
.IP -q
Suppress all the 'INFO' level messages that are sent to stdout.  This includes all the assumption messages that
.B disktest
will print when it finds that an option was not specified in the command line arguments.
.IP -Q
Suppress header data from messages that are sent to stdout.
.IP -r
Read from
.I filespec.
This is the default option if neither -w nor -r is specified.  -E must be specified if data integrity checking is desired.
.IP "-R retry[:retryDelay]"
Specifies that on a seek or transfer error, the IO should be retried.
.I retry
specifies the number of retry attempts that should be made.
.I retryDelay
specifies the amount of time delay (msec) before the retry occurs.  If no
.I retryDelay
is specified, then retries will occur immediately.  By default, IO errors are not retried.
.IP "-S start_block[:end_block]"
Set the starting test block to
.I start_block
and the ending test block to
.I end_block.
By default,
.I start_block
is 0 and
.I end_block
is 2000.  If
.I end_block
is not given, and the size of
.I filespec
can be determined, then
.I end_block
is set to the volume capacity reported by the device divided by the transfer length.
This option can only be used when there is a fixed transfer length.
The range given is inclusive, so if -S0:10 is specified, this will be 11 blocks.
The -S and -s options are exclusive.
.IP "-s start_LBA[:end_LBA]"
Set the starting test LBA to
.I start_LBA
and the ending test LBA to
.I end_LBA.
By default,
.I start_LBA
is 0 and
.I end_LBA
is 2000.  If
.I end_LBA
is not given, and the size of
.I filespec
can be determined, then
.I end_LBA
is set to the volume capacity reported by the device.
This option can only be used when there is a fixed transfer length.
The range given is inclusive, so if -s0:10 is specified, this will be 11 LBAs.
The -S and -s options are exclusive.
.IP "-t delayMin[[:delayMax][:ioTimeout]]"
Wait
.I
delayMin
milliseconds between each IO.
This is used when attempting to simulate a static load from an application that has some known processing time between IO operations.
.I
delayMax
can be added to specify that a per thread random IO delay should be used, between delayMin and delayMax.
When combined with multiple threads, -K, these can be used to model IO load for simulating application processing after IO.
By default, disktest will issue as many IO requests as possible (delayMin and delayMax are set to zero), which may overdrive some disk subsystems when multiple hosts running disktest are attached to the same disk subsystem.

If no IOs complete in
.I ioTimeout
seconds, then disktest will consider the test to fail.
The default is 60 seconds: if no IO operation to a target completes from any thread within 60 seconds, the test stops with a failed status and an ERROR message stating that there is a possible hung IO condition.
If it is a true hung IO condition, the disktest IO threads will not terminate on a non-preemptible kernel, and the only error message will be the ioTimeout ERROR message.
To disable this feature, set the IO timeout to 0, which means that the IO timeout will never be reached; this is how disktest operated before this feature was added.
The minute (m), hour (h), and day (d) multipliers can also be used on these parameters.
The following are examples of -t usage.

.RS
.RS
-t0:0, is the default behavior as in previous versions

-t0:2h, is no IO delay, with a 2 hour IO timeout.

-t30, is a 30 msec delay, with default IO timeout.

-t300:1000:1m, is a random delay between 300 and 1000 msec, with a 1 minute IO timeout.

.RE
.RE
.IP "-T runtime"
Run until
.I runtime
seconds have elapsed.
.I Runtime
must always be greater than zero.  -T and -L are exclusive of one another.
.IP -v
The version information will be displayed and
.B disktest
will exit normally.
.IP -w
Write to
.I filespec.
Data will be written as fast as possible and not read back to check for data corruption. This option can be combined with the -r option to do read/write testing and with -E to perform data integrity checking.
.IP -z
Use a randomly generated data pattern based on the seed for the bytes within each block.  The data pattern is random for the first 512 bytes, one LBA.  The pattern is then repeated for each LBA, creating a pseudo-random data pattern across the given
.I filespec.
This is done for two reasons.  One, it saves on the memory footprint and the time required to generate the data, and two, an LBA is the smallest unit of work
.B disktest
operates on.  Therefore,
.B disktest
can maintain the ability to do data checking, random block size transfers, and random block offsets when using random data.
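
The idea of one seeded 512 byte block repeated for every LBA can be sketched in Python as follows.  This is an illustration only: random_pattern is a hypothetical name, and Python's random module stands in for whatever generator disktest uses, so the bytes produced here will not match disktest's output for the same seed.
.nf
```python
import random

LBA_SIZE = 512  # one LBA, per the description above


def random_pattern(seed, lba_count):
    """Generate 512 seeded pseudo-random bytes and repeat them for each LBA."""
    rng = random.Random(seed)
    one_lba = bytes(rng.randrange(256) for _ in range(LBA_SIZE))
    return one_lba * lba_count


buf = random_pattern(seed=1314, lba_count=4)
# Every LBA holds the same 512 bytes, so any LBA can be verified on its own.
print(buf[:LBA_SIZE] == buf[LBA_SIZE:2 * LBA_SIZE])
```
.fi
Because only one LBA of data has to be generated and kept, a reader can check any LBA at any offset by regenerating the same 512 bytes from the seed.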
.SH FILES
.I ./disktest
.SH ENVIRONMENT
None.
.SH EXAMPLES
The following are some examples on how to use the options in
.B disktest
to create different types of workloads.  Please use these as a guideline to get started.

.RS
disktest -r -S10:15 -pld -L35 -B 256k -K3 -PTX /dev/sdaa

This will start a read test to blocks 10 through 15.  Seeks are linear and will be performed starting at 10 going to 15 then back to 10.  35 seeks will be performed.  The block size is 256k and there will be three threads.  Also, total transfer and throughput will be displayed at the end of the test.

disktest -r -w -D30:70 -K2 -E32 -B 8192 -T 600 -pR -Ibd /dev/sdzz

This will start a write and read test where the work load is 30% reads and 70% writes.  There will be two threads and all read data will be checked for errors up to 32 bytes.  The block size is 8k and the test will run for 600 seconds.  Seeks will be random and /dev/sdzz will be opened with the
.B O_DIRECT
flag set.

disktest -K8 -t500:15000:120s ./testfile

This will start eight read threads, with a minimum read delay of 500 milliseconds, and a maximum of 15 seconds.

disktest -w -Is200 -R3:60000 -Ac -PRTX -B128k -T10 -pr ./afile

This will start four write threads, syncing every 200 IOs. If there is an error on any write, then the same IO will be retried up to 3 times, and the thread will wait for 60 seconds before attempting each retry. If there is an error during the test, just continue on, reporting all errors as warnings.

disktest -Ag -Am -B 16k -C 100 -K 1 -z -ma -pL -P A -S 0:20000 -r -w -E 0 -N 640032 ./afile

Start a read/write test with error checking for 100 cycles.  If there is an error, then write out the special marker to LBA 0 of the target, and stop all testing.  Use random data, and all header markers.

.SH DIAGNOSTICS
Output Format
.RS
All output has a header string that displays in the following format:

.RS
| <date>-<time> | <level> | <pid> | <version> | <device> | <message>

.RE
The first value is the system date and time.  It is expressed as:
.RS
<MONTH>/<DAY>/<YEAR>-<HOUR>:<MIN>:<SEC>.

.RE
The second value is the level of the message.  Current levels include START, END, DEBUG, INFO, WARN, STAT, and ERROR.  The third value is the process id.  This can be used to match up the test processes with the output information if more than one test process is outputting to the same context, such as a file. It can also be used to regenerate a test with the same seeks and random data using the -a option. The fourth value is the revision number of the test process. The fifth is the target device.  The sixth is the informational message.  The following are some examples:
.RS

| 11/12/01-02:05:01 | START | 1314 | v1.2.3 | /dev/sdaa | Start args: -S100:105 -K5 -pid -r -PTX -L 25 -B 1 -z /dev/sdaa

| 11/12/01-02:05:01 | STAT  | 1314 | v1.2.3 | /dev/sdaa | 12800 bytes read in 25 transfers.

| 11/12/01-02:05:01 | STAT  | 1314 | v1.2.3 | /dev/sdaa | Read Throughput 12800B/s, IOPS 25/s.

| 11/12/01-02:05:01 | END   | 1314 | v1.2.3 | /dev/sdaa | Test Done (Passed)

.RE
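
For post-processing logs, the header layout above can be split into fields with a short Python sketch like the one below.  It is not part of disktest; parse_line is a hypothetical helper, and the two-digit year format is an assumption taken from the sample lines.  Output produced with -Q has no header and would not parse this way.
.nf
```python
from datetime import datetime


def parse_line(line):
    """Split one disktest output line into its six header fields."""
    # maxsplit=6 keeps any further separator characters inside the message.
    parts = line.split("|", 6)
    if len(parts) != 7 or parts[0].strip():
        raise ValueError("not a disktest output line")
    stamp, level, pid, version, device, message = (p.strip() for p in parts[1:])
    return {
        "time": datetime.strptime(stamp, "%m/%d/%y-%H:%M:%S"),
        "level": level,
        "pid": int(pid),
        "version": version,
        "device": device,
        "message": message,
    }


sample = "| 11/12/01-02:05:01 | END   | 1314 | v1.2.3 | /dev/sdaa | Test Done (Passed)"
record = parse_line(sample)
print(record["level"], record["pid"], record["message"])
```
.fi
For the sample END line this prints END 1314 Test Done (Passed).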
.RE
Error Checking
.RS
When error checking is enabled, each read is compared with data that is generated by the command line options specified or assumptions made where no command line option is given.  If a data miscompare results, the expected and actual data from the first 16 bytes of the LBA where the error occurred are printed to STDOUT, and the IO thread will die without completing any other IO operations, unless the -A option is specified. If the compare_length is not zero, then only the first compare_length bytes are compared, and only if those bytes miscompare will a data miscompare be reported. When using the mark option, data miscompares can be more readily detected.

.RE
Decoding Mark Data
.RS
When using the -m option, it will replace the first 32+ bytes of each LBA with mark information.
The + reflects the fact that it places the complete target information in the mark, so it can consume more or less of the LBA depending on the filespec.
The mark information looks as follows:

.RS
00 00 00 00 00 00 00 D4 00 00 00 00 00 00 00 03
.br
00 00 00 00 42 FD 08 FD 00 00 00 00 00 00 43 FA
.br
69 6F 61 72 6B 00 00 00 00 00 00 00 00 00 00 00
.br
2E 2F 74 65 73 74 66 69 6C 65 3A 3B 3C 3D 3E 3F

.RE
The first 8 bytes is the LBA, in this case
.RS
00 00 00 00 00 00 00 D4 : Which equals LBA #212

.RE
The second 8 bytes is the pass count, in this case
.RS
00 00 00 00 00 00 00 03 : Which equals pass count 3

.RE
The third 8 bytes is the start time, in this case
.RS
00 00 00 00 42 FD 08 FD : Which equals 0x42FD08FD or 1123879165 or Fri Aug 12 13:39:25 PDT 2005.
.br
This value is decoded using date --date="1970-01-01 1123879165 sec UTC" .

.RE
The fourth 8 bytes is the random seed, in this case
.RS
00 00 00 00 00 00 43 FA : Which equals 0x43FA

.RE
The next 16 bytes is the first 16 bytes of the name of the host, in this case
.RS
69 6F 61 72 6B 00 00 00 00 00 00 00 00 00 00 00 : Which equals: ioark

.RE
From the 49th byte on is the filespec, in this case
.RS
2E 2F 74 65 73 74  66 69 6C 65 : Which equals: ./testfile
.RE
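
The field-by-field decoding above can be reproduced with a short Python sketch.  The layout used here (four big-endian 64-bit integers followed by a 16 byte host name) is inferred from the example in this section, and decode_mark is a hypothetical helper, not part of disktest.
.nf
```python
import struct

# The example mark from this section, as raw bytes.
MARK = bytes.fromhex(
    "00 00 00 00 00 00 00 D4 00 00 00 00 00 00 00 03 "
    "00 00 00 00 42 FD 08 FD 00 00 00 00 00 00 43 FA "
    "69 6F 61 72 6B 00 00 00 00 00 00 00 00 00 00 00 "
    "2E 2F 74 65 73 74 66 69 6C 65 3A 3B 3C 3D 3E 3F"
)


def decode_mark(data):
    """Decode the fixed 48-byte part of a mark; return the rest undecoded."""
    # Four big-endian 64-bit integers, then a 16-byte NUL-padded host name.
    lba, passes, start, seed, host = struct.unpack(">QQQQ16s", data[:48])
    return {
        "lba": lba,
        "pass_count": passes,
        "start_time": start,
        "seed": seed,
        "host": host.rstrip(bytes(1)).decode("ascii"),
        "tail": data[48:],  # filespec followed by ordinary pattern data
    }


mark = decode_mark(MARK)
print(mark["lba"], mark["pass_count"], mark["start_time"], hex(mark["seed"]), mark["host"])
```
.fi
For the example mark this prints 212 3 1123879165 0x43fa ioark.  The tail is left as raw bytes because the mark does not record where the filespec ends; here it begins with ./testfile and continues with pattern data.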

.RE
Seeking/Accessing
.RS
When a seek failure occurs, the following information is sent to STDOUT:

.RS
| 11/12/01-02:05:01 | ERROR | 2250 | v1.2.3 | /dev/sdzz | lseek failed seek 10, lba = 32714, request pos = 1284, seek pos = -1, errno = 5

.RE
When an access failure occurs, the following information is sent to STDOUT:

.RS
| 11/12/01-02:05:01 | ERROR | 4492 | v1.2.3 | /dev/sdxp | disk access failed: seek 10, lba = 32714, got = 0, asked for = 8192, errno = 2

.RE
An access failure can also occur on a partial access.  In this case, 'got' will equal the number of bytes that were transferred.
Currently, disktest treats partial accesses as failures, as disktest always attempts to make sure that the LBA target and transfer size fit inside the specified volume size.

.RE
Performance
.RS
Performance options will display information about throughput, IO per second, and runtime. This information can be printed at the end of the test only, or throughout the test at a given interval using the heartbeat option, -h.

.RE
Dumping
.RS
When dumping data from filespec you will specify -d along with other command-line options.  Here is an example:

.RS
disktest -d -B 1k -s25 /dev/sddz

.RE
This will dump 1024 bytes of data to stdout starting at LBA 25.

.RE
File I/O
.RS
Disktest can be used to perform filesystem IO testing.  There is some setup required for this, however.  Disktest will not automatically create a file on the filesystem for read only testing, so the file must be initialized first.  Write and read/write testing will create the file if it does not already exist.  Also note that when creating a file using random I/O, not all the LBAs in the file may be written.  This can cause disktest to show an error if a request is made to an LBA in the file that has not been previously written.  The following is an example to initialize a file for filesystem IO testing.

disktest -w -pl -N200000 -B128k test.fil

This will create a ~97MB file named test.fil in the current directory, writing 131072 bytes per transfer.  Once this completes, any type of IO test can be performed to this file. This can also be done by creating a sparse file by doing the following:

disktest -w -pl -K1 -L1 -S200000 test.fil

.RE
.SH TODO
The following are options that are forthcoming, ideas, and other good stuff:
.RS
Add the following options:
.RS
butterfly: seek option: test will seek lba start/end/start+1/end-1/etc...

non-destructive: will read lba/write lba with read data/then read lba to verify

min seek: force a minimum seek distance during any IO access

max seek: force a maximum seek distance during any IO access

WORO: all blocks will be written and read only once

WRWR: a block will be written then read then written then read

retry: number of times an I/O should be retried, after an error, before counting as a failure

.SH AUTHOR
Brent Yardley (yardleyb@us.ibm.com)