
microSD Card benchmarking on BeagleBone Black

Hi all, I have recently been benchmarking two FreeBSD device drivers, SDHCI and the SDIO driver, to compare their relative performance under different circumstances. Here I will summarize my results, possible conclusions, and the benchmarking procedure.

Before moving on to benchmarking, I’d recommend going through this article. It is really good and gives a great overview of how data is transferred to disk and which APIs are available. It will surely help you select options for the benchmark.

Benchmarking procedure

I initially experimented with different benchmarks such as iorate, IOzone, fio, and bonnie++, but the one I really liked was IOzone. It provides a multitude of features and, more importantly, its results are fairly consistent across multiple runs and agree with diskinfo results. It has a flexible license, which makes it suitable for use with open-source and proprietary applications alike.

IOzone

IOzone is a filesystem benchmark tool. It tests file I/O performance for the following operations: read, write, re-read, re-write, read backwards, read strided, fread, fwrite, random read, pread, mmap, aio_read, aio_write. Builds are available for AIX, BSDI, HP-UX, IRIX, FreeBSD, Linux, OpenBSD, NetBSD, OSFV3, OSFV4, OSFV5, SCO OpenServer, Solaris, Mac OS X, and Windows (95/98/Me/NT/2K/XP), so this tutorial is relatively independent of the operating system you use.

IOzone can be built from source, or a pre-built binary might be available for your OS.

Building IOzone from its source

Source files are available at http://www.iozone.org/src/current/  . There are 4 main files:

  • iozone.c – Main C file. IOzone is structured like a single big application and is not divided into a large number of different files.
  • libasync.c – library for POSIX async read
  • libbif.c – Responsible for writing the output in Excel format, so the results can be imported directly into Excel.
  • Makefile
To build, first type: make
The makefile will display a list of supported platforms. Pick the one that matches your configuration and then type: make target
That’s it, you’re done. Now just run the executable with the appropriate options.
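The build steps above can be scripted. Here is a minimal sketch that maps the output of `uname -s` to a make target. The helper name iozone_target is hypothetical, and the target names (freebsd, linux, macosx) are assumptions about what the makefile lists, so check them against the output of a plain make first.

```shell
# Map an OS name (as printed by `uname -s`) to an IOzone make target.
# The target names are assumptions; confirm them against the list
# that a plain `make` prints.
iozone_target() {
    case "$1" in
        FreeBSD) echo freebsd ;;
        Linux)   echo linux ;;
        Darwin)  echo macosx ;;
        *)       echo unknown; return 1 ;;
    esac
}

# Usage (run inside the directory holding the IOzone sources):
#   make "$(iozone_target "$(uname -s)")"
```

The function returns a non-zero status for an unrecognized OS, so a script can stop instead of calling make with a bad target.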

IOzone CLI options

There are a number of options that can be used to configure the benchmark. I won’t be able to go through all of them, but I’ll cover the options I used and those commonly used for benchmarking.

Command I used: iozone -e -I -a -s 100M -r 4k -r 512k -r 16M -R -i 0 -i 1 -i 2 -f /dev/sdda0s1

-I : To benchmark a block device correctly, one needs to disable the cache. The -I option enables direct reads and writes to the device, bypassing the kernel’s page cache. It’s similar to using the O_DIRECT flag on Linux. Just have a look at its implementation:

	case 'I':	/* Use VXFS direct advisory or O_DIRECT from Linux or AIX , or O_DIRECTIO for TRU64  or Solaris directio */
#ifdef VXFS
			direct_flag++;
			sprintf(splash[splash_line++],"\tVxFS advanced feature SET_CACHE, VX_DIRECT enabled\n");
			break;
#endif
#if ! defined(DONT_HAVE_O_DIRECT)
#if defined(linux) || defined(__AIX__) || defined(IRIX) || defined(IRIX64) || defined(Windows) || defined(__FreeBSD__) || defined(solaris) || defined(IOZ_macosx)
			direct_flag++;
			sprintf(splash[splash_line++],"\tO_DIRECT feature enabled\n");
			break;
#endif
#if defined(TRU64)
			direct_flag++;
			sprintf(splash[splash_line++],"\tO_DIRECTIO feature enabled\n");
			break;
#endif
#else
			break;
#endif
#if defined(Windows)
			sprintf(splash[splash_line++],"\tO_DIRECTIO feature not available in Windows version.\n");
			break;
#endif

It clearly states that this feature isn’t available on Windows. On FreeBSD the code enables O_DIRECT, but in my experience it is better to use the -U option along with it, to make sure all cached data is purged.

-e : Includes flush (fsync/fflush) in the timing calculations.


Note:
The following synchronous modes are defined by POSIX:

  • O_SYNC: File data and all file metadata are written synchronously to disk.
  • O_DSYNC: Only file data and metadata needed to access the file data are written synchronously to disk.
  • O_RSYNC: Not implemented

O_SYNC provides synchronized I/O file integrity completion, meaning write operations will flush data and all associated metadata to the underlying hardware. O_DSYNC provides synchronized I/O data integrity completion, meaning write operations will flush data to the underlying hardware, but will only flush metadata updates that are required to allow a subsequent read operation to complete successfully.


Thus, using fsync/fflush guarantees that the timing also includes the time required for writing file metadata.

-s # : Specifies the file size, i.e. the amount of data to be transferred. A large size is recommended, as it gives more consistent, averaged results.

-i # : Specifies the tests to be performed (0=write/rewrite, 1=read/re-read, 2=random-read/write, 3=read-backwards, 4=re-write-record, 5=stride-read, 6=fwrite/re-fwrite, 7=fread/re-fread, 8=random mix, 9=pwrite/re-pwrite, 10=pread/re-pread, 11=pwritev/re-pwritev, 12=preadv/re-preadv).

-a : Enables automatic mode.

-r # : Specifies the record length used for transferring data. Giving N different record lengths means N different tests are performed, one per record length. Filesystem I/O usually happens in small records of 4, 16, 32 KB and so on, so large record lengths such as 16M don’t reveal much about the filesystem.

-R : Generates Excel-compatible output.
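One way to see how differently the record lengths exercise the card is to count how many records it takes to move the file. A small sketch of that arithmetic follows; the helper name records_per_pass is hypothetical and sizes are given in KB.

```shell
# Number of whole records needed to move a file of size_kb kilobytes
# using records of record_kb kilobytes (integer division).
records_per_pass() {
    size_kb=$1
    record_kb=$2
    echo $(( size_kb / record_kb ))
}

# A 100M file (102400 KB) at the three record lengths used in my command:
records_per_pass 102400 4       # 4k records   -> prints 25600
records_per_pass 102400 512     # 512k records -> prints 200
records_per_pass 102400 16384   # 16M records  -> prints 6
```

With 4k records the file is moved in 25600 separate operations, while with 16M records it takes only 6.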

-f : Points to the file the data is transferred to. It is important to specify this correctly, as write operations can destroy data that is already on the disk. In my case I pointed it to my SD card’s filesystem.

-U : Unmounts and remounts the disk’s filesystem between tests. Unmounting and remounting purges the buffer cache associated with the block device. On FreeBSD, data is cached at the file level, and only once the device is mounted, unlike Linux, where the block device itself is cached.
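To keep repeated runs identical, the options described above can be assembled by a small helper instead of being retyped. This is only a sketch: the function name iozone_args is hypothetical, and it simply rebuilds the argument list of the command I used.

```shell
# Build the IOzone argument list used in this post.
#   $1 = target file, $2 = file size (e.g. 100M),
#   remaining arguments = record lengths (e.g. 4k 512k 16M)
iozone_args() {
    target=$1
    size=$2
    shift 2
    args="-e -I -a -s $size"
    for rec in "$@"; do
        args="$args -r $rec"
    done
    echo "$args -R -i 0 -i 1 -i 2 -f $target"
}

# Usage (left unquoted on purpose so the shell splits the arguments):
#   iozone $(iozone_args /dev/sdda0s1 100M 4k 512k 16M)
```

Printing the arguments instead of running iozone directly makes it easy to inspect the command before it writes to the card.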

SD Card Preparation before benchmarking

Each time before running IOzone, it’s advisable to reformat the filesystem, so that every run starts from the same state and the results are reproducible. On FreeBSD, this can be done with the following commands:

gpart destroy -F mmcsd0
gpart create -s BSD mmcsd0
gpart add -t freebsd-ufs mmcsd0
newfs /dev/mmcsd0a

Here, mmcsd0 is my device, i.e. the SD card. Moreover, when using the -U option, a mount configuration for the device is expected in /etc/fstab. It contains the mount point, filesystem type, mount options, etc. For example: /dev/mmcsd0a /mnt ufs rw,noauto
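Before relying on -U, it may help to confirm that the fstab entry is really there. The sketch below checks an fstab-style file for a given mount point; the function name has_fstab_entry and the test file name /mnt/iozone.tmp in the usage comment are hypothetical.

```shell
# Succeed if an fstab-style file has a non-comment entry whose second
# field (the mount point) equals $1. $2 defaults to /etc/fstab.
has_fstab_entry() {
    mount_point=$1
    fstab=${2:-/etc/fstab}
    awk -v mp="$mount_point" '
        $1 !~ /^#/ && $2 == mp { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$fstab"
}

# Usage: only start the benchmark when /mnt is configured.
#   has_fstab_entry /mnt && \
#       iozone -e -I -a -s 100M -r 4k -i 0 -i 1 -U /mnt -f /mnt/iozone.tmp
```

Note that -U takes the mount point as its argument, so the file given with -f should live on that mounted filesystem.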

The -U option has other limitations: it doesn’t work at all when the benchmark runs across a bunch of clients (distributed mode), as it has no way to quiesce the load across the nodes for -U to do its work.

Please note that flash write tests are extremely sensitive to hidden move/rewrite and erase cycles performed by the flash translation layer. Results may also vary between flash implementations, so always use the same SD card, and one of the same size, as seek time varies with the size of the SD card.

After preparing the SD card, run iozone multiple times with the same options to see whether the results stay constant. If they do, you have probably taken all precautions carefully and the results are valid.
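To put a number on "constant", one option is to compare the spread of the throughput values from repeated runs with their mean. The sketch below does that with awk; the function name spread_percent is hypothetical, the sample values are made up, and what counts as an acceptable spread is left to your judgment.

```shell
# Print (max - min) / mean * 100 for the throughput values given as
# arguments, rounded to one decimal place.
spread_percent() {
    printf '%s\n' "$@" | awk '
        NR == 1 { min = $1 + 0; max = $1 + 0 }
        {
            v = $1 + 0
            sum += v
            if (v < min) min = v
            if (v > max) max = v
        }
        END { printf "%.1f\n", (max - min) / (sum / NR) * 100 }
    '
}

# Usage with made-up throughput values from three runs:
spread_percent 90 100 110   # prints 20.0
```

A spread close to zero suggests the runs agree; a large one hints that caching or the card’s internal housekeeping is interfering.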

Benchmarking Results

So, I performed various tests with different objectives. Here are the raw data and graphs: https://docs.google.com/spreadsheets/d/1_lf9S136z0tJyni9W3t1__Rlrkal1fd6L7-vnlmYQpc/edit?usp=sharing

Test#1 : Effect of the SD card’s filesystem on its performance

The graphs show clearly that the SD card’s filesystem doesn’t affect read performance at all. Write speed, however, is significantly affected by the filesystem in the case of the SDIO/MMCCAM driver.

Test#2 : Performance comparison of SDHCI and SDIO driver

Now, with a freebsd-ufs filesystem on the SD card, we see that the SDHCI driver has a slightly higher read speed than the SDIO driver, while the write speed of the SDHCI driver depends on the block size used. Maximum performance is at a 512K block size; a larger block size doesn’t guarantee better performance.

Test#3 : Effect of caching on test results

Just look at the tremendous difference in the SD card’s performance with the disk cache on and off. It’s therefore recommended to enable the disk cache when using an SD card for general applications. For benchmarking purposes, however, disable caching.

 
