HPLのインストール
動作したHPL.dat
HPLのパラメータ設定はいろいろと大変。
特に問題サイズNと粒度NBの設定には気を遣わないとすぐ止まる。
初期設定から適当にいじってどうさしたやつ
N=10k
NB=10,5
Beta1のときはN=8.3kを越えたくらいで動作しなくなってたので,より高いGFLOPS値を得られるようになった。
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out output file name (if any)
6 device out (6=stdout,7=stderr,file)
1 # of problems sizes (N)
10000 Ns
2 # of NBs
10 5 40 NBs
0 PMAP process mapping (0=Row-,1=Column-major)
1 # of process grids (P x Q)
4 2 4 Ps
4 2 1 Qs
16.0 threshold
1 # of panel fact
0 1 2 PFACTs (0=left, 1=Crout, 2=Right)
1 # of recursive stopping criterium
2 2 4 NBMINs (>= 1)
1 # of panels in recursion
2 NDIVs
1 # of recursive panel fact.
0 1 2 RFACTs (0=left, 1=Crout, 2=Right)
1 # of broadcast
0 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1 # of lookahead depth
0 DEPTHs (>=0)
2 SWAP (0=bin-exch,1=long,2=mix)
64 swapping threshold
0 L1 in (0=transposed,1=no-transposed) form
0 U in (0=transposed,1=no-transposed) form
1 Equilibration (0=no,1=yes)
8 memory alignment in double (> 0)
結果はこんな感じ
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR00L2L2 10000 10 4 4 31.16 2.140e+01
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0045379 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR00L2L2 10000 5 4 4 51.84 1.286e+01
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0037335 ...... PASSED
================================================================================
最終更新:2009年07月07日 14:23