LMBench2

LMBench is a system benchmark tool designed to be portable across many *nix-type systems. The Gelato@UNSW LinuxScalability project will use LMBench2 to benchmark IA64 multi-processor machines.

LMBench2 scalability

A question remains: does LMBench2 scale?

Running LMBench

lmbsum

lmbsum is a Perl script that produces more useful output when you use make results.

The lmbench default output isn't that helpful when you want to aggregate a lot of data, so a revised version is available.

The original script was written by Larry McVoy, the author of lmbench. It is distributed with lmbench as getsummary and simply lists the output without any aggregation. Randy Hron <rwhron AT earthlink DOT net> updated it to aggregate multiple runs (see http://home.earthlink.net/~rwhron/), and Ian Wienand <ianw AT gelato DOT unsw DOT edu DOT au> updated it to report the standard deviation of those runs. That last version is the one available here.
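To make the aggregation concrete, here is a minimal Python sketch of reducing several runs of one benchmark to a mean and a standard deviation. It is not taken from lmbsum. The per-run values are hypothetical, and the use of the sample (n - 1) formula is an assumption. It is at least consistent with the Page Fault entry for 2.5.67plain in the example further down (mean 2.20, s.d. 0.45), which five runs of 2, 2, 2, 2, 3 would reproduce, but that is an inference rather than something the script documents.

```python
import statistics

# Hypothetical per-run results for one benchmark on one kernel.
# These values are illustrative; the real per-run numbers are not published here.
runs = [2.0, 2.0, 2.0, 2.0, 3.0]

mean = statistics.mean(runs)
# statistics.stdev uses the sample (n - 1) formula and needs at least two values,
# which matches the requirement of at least two runs.
std_dev = statistics.stdev(runs)

print(f"{mean:.2f}")
print(f"  s.d. ({len(runs)} runs) {std_dev:.2f}")
```

With a single run, statistics.stdev raises an error, which is one reason an aggregate of this kind needs at least two runs.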

This new script

Download

You can download the patch to getsummary here: lmbench-getsummary.diff

You should be able to download the Perl script from here: lmbsum.pl

lmbsum has also been modified to produce a comma-separated file, which allows easy insertion into a spreadsheet if desired. It can be found here: lmbsum-CSV.pl

Usage

Apply the patch over getsummary in scripts/getsummary. To get the lmbench results, use make results and make rerun as usual. I usually run make rerun around 5 times for each kernel.

Then pipe the output of make see to the updated script (you need a minimum of two runs to make this work):

make see | perl lmbsum.pl
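The whole sequence can be scripted. The following is a rough sketch only: the function names build_commands and run_all are hypothetical, the default of 5 reruns follows the habit described above, and it assumes it is started from the lmbench directory with lmbsum.pl in the current directory, as the command above does.

```python
import subprocess

def build_commands(reruns=5):
    """Return the shell commands for one kernel, in the order they should run."""
    commands = ["make results"]
    commands += ["make rerun"] * reruns
    # The final step summarises every run collected so far.
    commands.append("make see | perl lmbsum.pl")
    return commands

def run_all(commands, runner=subprocess.run):
    """Run each command through the shell, stopping at the first failure.

    For the piped command, check=True only sees the exit status of the
    last stage (perl), which is how a plain shell pipeline behaves.
    """
    for command in commands:
        runner(command, shell=True, check=True)
```

Calling run_all(build_commands()) would execute the sequence. The runner parameter exists only so the command list can be inspected or dry-run without invoking make.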

Example

                  L M B E N C H  2 . 0   S U M M A R Y
                 ------------------------------------

Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
                                 null     null                       open    signal   signal    fork    execve  /bin/sh
kernel                           call      I/O     stat    fstat    close   install   handle  process  process  process
-----------------------------  -------  -------  -------  -------  -------  -------  -------  -------  -------  -------
2.5.67lvhpt-1mb-rid              0.440  0.61038    3.767    0.713    5.360    0.722    2.890    262.4    897.5   3725.4
  s.d. (5 runs)                  0.000  0.00301    0.018    0.006    0.027    0.007    0.047      0.0      9.0     13.3
2.5.67plain                      0.439  0.61803    3.588    0.744    5.081    0.685    2.971    252.9    880.2   3642.9
  s.d. (5 runs)                  0.003  0.00262    0.031    0.005    0.008    0.007    0.025      0.0      7.3      4.0

File select - times in microseconds - smaller is better
-------------------------------------------------------
                                select   select   select   select   select   select   select   select
kernel                           10 fd   100 fd   250 fd   500 fd   10 tcp  100 tcp  250 tcp  500 tcp
-----------------------------  -------  -------  -------  -------  -------  -------  -------  -------
2.5.67lvhpt-1mb-rid              2.101   11.255   26.478   51.686    2.923  19.4062  46.8187  92.3938
  s.d.                           0.009    0.014    0.055    0.061    0.008  0.00501  0.02374  0.02502
2.5.67plain                      2.106   11.317   26.553   52.020    2.979  19.4679  46.8949  92.4567
  s.d.                           0.005    0.058    0.158    0.298    0.003  0.00839  0.01286  0.02236

Context switching with 0K - times in microseconds - smaller is better
---------------------------------------------------------------------
                                2proc/0k   4proc/0k   8proc/0k  16proc/0k  32proc/0k  64proc/0k  96proc/0k
kernel                         ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch
-----------------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
2.5.67lvhpt-1mb-rid               14.170     14.314     14.306     14.646     14.916     16.484     17.680
  s.d.                             0.127      0.049      0.032      0.055      0.086      0.249      0.172
2.5.67plain                       14.118     14.246     14.252     14.456     15.088     16.530     18.316
  s.d.                             0.087      0.050      0.013      0.104      0.239      0.073      0.013

Context switching with 4K - times in microseconds - smaller is better
---------------------------------------------------------------------
                                2proc/4k   4proc/4k   8proc/4k  16proc/4k  32proc/4k  64proc/4k  96proc/4k
kernel                         ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch
-----------------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
2.5.67lvhpt-1mb-rid               14.552     14.548     14.810     15.352     16.604     18.498     21.594
  s.d.                             0.116      0.091      0.045      0.223      0.494      0.223      0.105
2.5.67plain                       14.486     14.536     14.694     15.292     16.180     18.418     21.378
  s.d.                             0.055      0.086      0.032      0.240      0.429      0.449      0.029

Context switching with 8K - times in microseconds - smaller is better
---------------------------------------------------------------------
                                2proc/8k   4proc/8k   8proc/8k  16proc/8k  32proc/8k  64proc/8k  96proc/8k
kernel                         ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch
-----------------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
2.5.67lvhpt-1mb-rid               14.984     15.162     15.456     15.958     17.306     19.678     24.158
  s.d.                             0.147      0.042      0.379      0.248      0.864      0.703      0.341
2.5.67plain                       14.920     15.068     15.204     15.946     17.892     20.182     23.570
  s.d.                             0.142      0.081      0.128      0.473      0.860      0.375      0.495

Context switching with 16K - times in microseconds - smaller is better
----------------------------------------------------------------------
                               2proc/16k  4proc/16k  8proc/16k  16prc/16k  32prc/16k  64prc/16k  96prc/16k
kernel                         ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch
-----------------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
2.5.67lvhpt-1mb-rid               15.994     16.072     16.554     16.948     18.852     22.542     27.492
  s.d.                             0.043      0.128      0.055      0.358      0.409      0.736      0.277
2.5.67plain                       15.904     15.944     16.118     16.684     18.720     22.418     27.146
  s.d.                             0.013      0.071      0.236      0.021      0.585      1.030      0.223

Context switching with 32K - times in microseconds - smaller is better
----------------------------------------------------------------------
                               2proc/32k  4proc/32k  8proc/32k  16prc/32k  32prc/32k  64prc/32k  96prc/32k
kernel                         ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch
-----------------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
2.5.67lvhpt-1mb-rid               17.682     17.906     18.322     19.988     23.844     35.618     49.260
  s.d.                             0.061      0.047      0.083      0.215      1.838      2.022      1.007
2.5.67plain                       17.366     17.820     18.232     20.202     23.318     35.624     48.986
  s.d.                             0.227      0.051      0.095      0.236      1.189      2.397      1.125

Context switching with 64K - times in microseconds - smaller is better
----------------------------------------------------------------------
                               2proc/64k  4proc/64k  8proc/64k  16prc/64k  32prc/64k  64prc/64k  96prc/64k
kernel                         ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch
-----------------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
2.5.67lvhpt-1mb-rid               20.998     21.784     24.602     26.742     47.128     89.956     97.148
  s.d.                             0.129      0.149      0.070      0.397      8.989      0.941      0.509
2.5.67plain                       21.000     21.880     24.326     26.420     38.282     88.464     96.212
  s.d.                             0.118      0.352      0.089      0.855      3.184      2.311      0.443

File create/delete and VM system latencies in microseconds - smaller is better
----------------------------------------------------------------------------
                                 0K       0K       1K       1K       4K       4K      10K      10K     Mmap     Prot    Page
kernel                         Create   Delete   Create   Delete   Create   Delete   Create   Delete   Latency  Fault   Fault
------------------------------ -------  -------  -------  -------  -------  -------  -------  -------  -------  ------  ------
2.5.67lvhpt-1mb-rid              24.47    17.95    43.30    30.40    43.36    30.40    62.20    32.42    856.8   0.597    3.00
  s.d.                            0.02     0.06     0.47     0.08     0.39     0.08     0.56     0.06      3.3   0.066    0.00
2.5.67plain                      23.88    17.39    42.54    29.72    42.55    29.76    61.50    31.80    764.0   0.281    2.20
  s.d.                            0.04     0.09     0.17     0.03     0.11     0.04     0.57     0.17      4.7   0.038    0.45

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
kernel                           Pipe   AF/Unix     UDP   RPC/UDP     TCP   RPC/TCP  TCPconn
-----------------------------  -------  -------  -------  -------  -------  -------  -------
2.5.67lvhpt-1mb-rid              6.981   10.197  17.5772  29.4254  23.2528  37.8682   69.528
  s.d.                           0.208    0.210  0.10141  0.21507  0.15138  0.18517    9.868
2.5.67plain                      6.889   10.406  17.2844  29.9277  27.9096  38.0136   64.159
  s.d.                           0.196    0.117  0.16448  0.18959  11.0327  0.10987   11.741

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------
                                                             File     Mmap    Bcopy    Bcopy   Memory   Memory
kernel                           Pipe   AF/Unix    TCP     reread   reread   (libc)   (hand)     read    write
-----------------------------  -------  -------  -------  -------  -------  -------  -------  -------  -------
2.5.67lvhpt-1mb-rid            2086.84  1641.90   429.31  1086.42   589.41   659.75   387.78   589.55   561.75
  s.d.                           12.10    18.85   145.50     1.62     0.27     4.29     0.37     0.41     1.44
2.5.67plain                    2128.07  1640.46   587.65  1088.26   590.27   663.11   388.13   590.19   560.69
  s.d.                           10.49    17.16   204.02     2.01     0.37     1.81     0.32     0.27     3.44

*Local* More Communication bandwidths in MB/s - bigger is better
----------------------------------------------------------------
                                  File     Mmap  Aligned  Partial  Partial  Partial  Partial
OS                                open     open    Bcopy    Bcopy     Mmap     Mmap     Mmap    Bzero
                                 close    close   (libc)   (hand)     read    write   rd/wrt     copy     HTTP
-----------------------------  -------  -------  -------  -------  -------  -------  -------  -------  -------
2.5.67lvhpt-1mb-rid            1095.94   545.80   661.22   677.37   772.32  1603.56   473.73  2374.12   17.134
  s.d.                            2.17     0.50     2.30     2.56     0.43    12.74     0.27     1.98    0.274
2.5.67plain                    1097.69   549.52   664.05   679.93   773.72  1580.05   474.04  2369.03   17.596
  s.d.                            1.32     0.73     1.31     3.06     0.43    46.48     0.35    12.39    0.256

Memory latencies in nanoseconds - smaller is better
---------------------------------------------------
kernel                          Mhz     L1 $     L2 $    Main mem
-----------------------------  -----  -------  -------  ---------
2.5.67lvhpt-1mb-rid              900    2.227    6.706     122.88
  s.d.                             0    0.274    0.274       0.27
2.5.67plain                      900    2.227    6.692     122.52
  s.d.                             0    0.256    0.256       0.26
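
One way to read the s.d. rows is to ask whether the gap between two kernels is large compared with the run-to-run spread. The sketch below computes Welch's t statistic from the means and standard deviations printed above. This is an added illustration, not something lmbsum does, and it assumes 5 runs per kernel for every table, although the run count is only printed in the first one. The function name welch_t is hypothetical.

```python
import math

def welch_t(mean_a, sd_a, mean_b, sd_b, runs=5):
    """Welch's t statistic for two kernels with the same number of runs.

    Undefined (division by zero) when both standard deviations are 0.
    """
    standard_error = math.sqrt(sd_a ** 2 / runs + sd_b ** 2 / runs)
    return (mean_a - mean_b) / standard_error

# Mmap Latency: 2.5.67lvhpt-1mb-rid vs 2.5.67plain (values from the table above).
mmap_t = welch_t(856.8, 3.3, 764.0, 4.7)
# 32proc/64k context switch: same two kernels.
ctx_t = welch_t(47.128, 8.989, 38.282, 3.184)

print(f"Mmap latency t = {mmap_t:.1f}")
print(f"32proc/64k ctx switch t = {ctx_t:.1f}")
```

On these figures the Mmap latency gap is dozens of standard errors wide, while the 32proc/64k context-switch gap is only about two, so the latter is much weaker evidence of a real difference between the kernels.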

A tip

This file is now about 2000 lines of perly goodness -- modifying it in any significant way will probably give you a serious case of RSI or really work out your sed/awk skills.

IA64wiki: lmbench (last edited 2009-12-10 03:13:47 by localhost)
