Building an MPI PC cluster with CentOS 5.5 x86_64 + OpenMPI

Tomonori Kouya

Last updated: January 5, 2011

Abstract: This page describes how to build an MPI PC cluster using the OpenMPI packages provided as standard on CentOS 5.5. It may contain misunderstandings or outright mistakes; if you find any, please report them to the contact address below or via Twitter (@tkouya).

References & URLs

[1] Tomonori Kouya, "How to build MPI PC clusters using standard Linux distributions"
[2] The Community Enterprise Operating System
[3] Open MPI


Enter or verify the underlined items.

  1. Install CentOS 5.5 x86_64 and set up the network to satisfy the following configuration.

    cs-room443-i701 (NIS server + NFS server; exports /home and /usr/local)
        eth0 (GbE)     ... 192.168.2.32
        eth1 (100BASE) ... 192.168.24.146 (DHCP) -> the Internet
    cs-room443-i702 (NIS client)
        eth0 (GbE)     ... 192.168.2.33
        eth1 (100BASE) ... 192.168.24.145 (DHCP) -> the Internet
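    Both machines later resolve each other's names via /etc/hosts (the NIS step below edits /etc/hosts before building the maps, but never shows its contents). As a sketch, the entries implied by the addresses above would look like this; the -sist names for the 133.88.120.x side are taken from later transcripts. Treat it as an illustration to check against your own network plan, not a dump of the actual file:

    ```shell
    # Sketch of /etc/hosts entries implied by the addresses in step 1
    # (the -sist names for the 133.88.120.x interfaces appear later in the transcripts).
    hosts_fragment='192.168.2.32    cs-room443-i701
    192.168.2.33    cs-room443-i702
    133.88.120.73   cs-room443-i701-sist
    133.88.120.74   cs-room443-i702-sist'
    printf '%s\n' "$hosts_fragment"
    ```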

  2. The steps marked ☆ below are common to both cs-room443-i701 and i702.

  3. ☆ Install gcc, gfortran, and g++.

    # yum install gcc gcc-gfortran gcc-c++
    Installed:
      gcc.x86_64 0:4.1.2-48.el5
      gcc-c++.x86_64 0:4.1.2-48.el5
      gcc-gfortran.x86_64 0:4.1.2-48.el5
    Dependency Installed:
      glibc-devel.x86_64 0:2.5-49.el5_5.7
      glibc-headers.x86_64 0:2.5-49.el5_5.7
      kernel-headers.x86_64 0:2.6.18-194.26.1.el5
      libgfortran.x86_64 0:4.1.2-48.el5
      libgomp.x86_64 0:4.4.0-6.el5
      libstdc++-devel.x86_64 0:4.1.2-48.el5
    Complete!

    (Note: GOMP (OpenMP) is also installed.)

    # gcc -v
    Using built-in specs.
    Target: x86_64-redhat-linux
    Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=x86_64-redhat-linux
    Thread model: posix
    gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)

    # gfortran -v
    (same output as gcc -v)

    # g++ -v
    (same output as gcc -v)

  4. ☆ Install OpenMPI (x86_64 version only).

    # yum install openmpi-devel.x86_64
    Installed:
      openmpi-devel.x86_64 0:1.4-4.el5
    Dependency Installed:
      libibcm.x86_64 0:1.0.5-1.el5
      libibverbs.x86_64 0:1.1.3-2.el5
      librdmacm.x86_64 0:1.0.10-1.el5
      mpi-selector.noarch 0:1.0.2-1.el5
      openib.noarch 0:1.4.1-5.el5
      openmpi.x86_64 0:1.4-4.el5
      openmpi-libs.x86_64 0:1.4-4.el5
    Complete!

  5. ☆ Register the OpenMPI environment settings with the mpi-selector command.

    $ mpi-selector --list
    openmpi-1.4-gcc-x86_64
    # mpi-selector --set openmpi-1.4-gcc-x86_64 --system
    (log out and back in once)
    $ mpi-selector --query
    default:openmpi-1.4-gcc-x86_64
    level:user
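    Under the hood, mpi-selector arranges for login shells to source the selected package's mpivars.sh (here /usr/lib64/openmpi/1.4-gcc/etc/mpivars.sh, visible in the etc directory listing of step 12), which is why a fresh login is needed. If a session has not picked the selection up yet, the effect can be approximated by hand; the exact variable list below is my assumption, and mpivars.sh remains the authoritative source:

    ```shell
    # Manual approximation of what sourcing mpivars.sh achieves
    # (variable list is an assumption; prefer `. /usr/lib64/openmpi/1.4-gcc/etc/mpivars.sh`).
    MPI_PREFIX=/usr/lib64/openmpi/1.4-gcc
    export PATH="$MPI_PREFIX/bin:$PATH"
    export LD_LIBRARY_PATH="$MPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    echo "$PATH"
    ```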

  6. ☆ Confirm that commands such as mpicc and mpirun are now available.

    $ mpirun --version
    mpirun (Open MPI) 1.4

    Report bugs to http://www.open-mpi.org/community/help/
    $ mpicc --version
    gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48)
    Copyright (C) 2006 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

  7. ☆ Compile and run an MPI program.

    $ cat mpi_hellow.c
    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int myrank, numprocs, length_name;
        char nodename[128];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
        MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
        MPI_Get_processor_name(nodename, &length_name);
        printf("Hellow, MPI! (%0d/%0d)-- %s\n", myrank, numprocs, nodename);
        MPI_Finalize();

        return 0;
    }
    $ mpicc mpi_hellow.c
    $ mpirun -np 1 ./a.out
    Hellow, MPI! (0/1)-- cs-room443-i70x
    $ mpirun -np 2 ./a.out
    Hellow, MPI! (0/2)-- cs-room443-i70x
    Hellow, MPI! (1/2)-- cs-room443-i70x
    $ mpirun -np 3 ./a.out
    Hellow, MPI! (1/3)-- cs-room443-i70x
    Hellow, MPI! (0/3)-- cs-room443-i70x
    Hellow, MPI! (2/3)-- cs-room443-i70x
    $ mpirun -np 4 ./a.out
    Hellow, MPI! (1/4)-- cs-room443-i70x
    Hellow, MPI! (2/4)-- cs-room443-i70x
    Hellow, MPI! (0/4)-- cs-room443-i70x
    Hellow, MPI! (3/4)-- cs-room443-i70x

  8. Set up the NFS server (cs-room443-i701) and export /home and /usr/local.

    [root@cs-room443-i701 user01]# /sbin/chkconfig --list nfs
    nfs     0:off 1:off 2:off 3:off 4:off 5:off 6:off
    [root@cs-room443-i701 user01]# /sbin/chkconfig nfs on
    [root@cs-room443-i701 user01]# /sbin/service nfs start
    Starting NFS services:  [ OK ]
    Starting NFS quotas:    [ OK ]
    Starting NFS daemon:    [ OK ]
    Starting NFS mountd:    [ OK ]
    [root@cs-room443-i701 user01]# cat /etc/exports
    [root@cs-room443-i701 user01]# vi /etc/exports
    [root@cs-room443-i701 user01]# cat /etc/exports
    /home 133.88.120.0/255.255.255.0(rw,async)
    /usr/local 133.88.120.0/255.255.255.0(rw,async)
    [root@cs-room443-i701 user01]# /usr/sbin/exportfs -a -v
    exporting 133.88.120.0/255.255.255.0:/usr/local
    exporting 133.88.120.0/255.255.255.0:/home
    [root@cs-room443-i701 user01]# /usr/sbin/exportfs -v
    /usr/local 133.88.120.0/255.255.255.0(rw,async,wdelay,root_squash,no_subtree_check,anonuid=65534,anongid=65534)
    /home 133.88.120.0/255.255.255.0(rw,async,wdelay,root_squash,no_subtree_check,anonuid=65534,anongid=65534)

  9. Confirm that cs-room443-i702 can mount /usr/local from cs-room443-i701.

    [root@cs-room443-i702 user01]# cd /etc/
    [root@cs-room443-i702 etc]# cp fstab fstab.org
    [root@cs-room443-i702 etc]# vi fstab
    [root@cs-room443-i702 etc]# cat fstab
    /dev/VolGroup00/LogVol00 /         ext3   defaults        1 1
    LABEL=/boot              /boot     ext3   defaults        1 2
    tmpfs                    /dev/shm  tmpfs  defaults        0 0
    devpts                   /dev/pts  devpts gid=5,mode=620  0 0
    sysfs                    /sys      sysfs  defaults        0 0
    proc                     /proc     proc   defaults        0 0
    /dev/VolGroup00/LogVol01 swap      swap   defaults        0 0
    # 2010-12-24 by T.Kouya
    cs-room443-i701-sist:/usr/local /usr/local nfs rw,hard,intr 0 0
    cs-room443-i701-sist:/home      /home      nfs rw,hard,intr 0 0
    [root@cs-room443-i702 etc]# mount /usr/local
    Unsupported nfs mount option: hard.intr
    (the options had been mistyped as "hard.intr"; fix them to "hard,intr" and retry)
    [root@cs-room443-i702 etc]# mount /usr/local
    [root@cs-room443-i702 etc]# mount
    /dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
    proc on /proc type proc (rw)
    sysfs on /sys type sysfs (rw)
    devpts on /dev/pts type devpts (rw,gid=5,mode=620)
    /dev/sda1 on /boot type ext3 (rw)
    tmpfs on /dev/shm type tmpfs (rw)
    none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
    sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
    cs-room443-i701-sist:/usr/local on /usr/local type nfs (rw,hard,intr,addr=133.88.120.73)
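    The two appended fstab lines are the whole point of this step. A sketch of generating them for an arbitrary server name; mind the comma in `hard,intr`, since the `hard.intr` typo is exactly what produced the "Unsupported nfs mount option" error in the transcript:

    ```shell
    # Sketch: the two NFS fstab lines for a given server name.
    # Note "hard,intr" (comma) -- "hard.intr" is rejected by mount.
    nfs_server=cs-room443-i701-sist
    fstab_lines="$nfs_server:/usr/local /usr/local nfs rw,hard,intr 0 0
    $nfs_server:/home      /home      nfs rw,hard,intr 0 0"
    printf '%s\n' "$fstab_lines"
    ```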

  10. Configure cs-room443-i701 as the NIS server (NIS domain name: cs-pccluster4) and, via ypbind, verify that it also works as a NIS client.

    [root@cs-room443-i701 user01]# yum install ypserv
    Loaded plugins: fastestmirror
    Loading mirror speeds from cached hostfile
     * addons: ftp.nara.wide.ad.jp
     * base: ftp.nara.wide.ad.jp
     * extras: ftp.nara.wide.ad.jp
     * updates: ftp.nara.wide.ad.jp
    Setting up Install Process
    Resolving Dependencies
    --> Running transaction check
    ---> Package ypserv.x86_64 0:2.19-5.el5 set to be updated
    --> Finished Dependency Resolution

    Dependencies Resolved

    ==========================================================================
     Package        Arch        Version          Repository        Size
    ==========================================================================
    Installing:
     ypserv         x86_64      2.19-5.el5       base             138 k

    Transaction Summary
    ==========================================================================
    Install       1 Package(s)
    Upgrade       0 Package(s)

    Total download size: 138 k
    Is this ok [y/N]: y
    Downloading Packages:
    ypserv-2.19-5.el5.x86_64.rpm                     | 138 kB     00:00
    Running rpm_check_debug
    Running Transaction Test
    Finished Transaction Test
    Transaction Test Succeeded
    Running Transaction
      Installing     : ypserv                                        1/1

    Installed:
      ypserv.x86_64 0:2.19-5.el5

    Complete!
    [root@cs-room443-i701 user01]# /sbin/chkconfig --list
    portmap    0:off 1:off 2:off 3:on  4:on  5:on  6:off
    ypbind     0:off 1:off 2:off 3:off 4:off 5:off 6:off
    yppasswdd  0:off 1:off 2:off 3:off 4:off 5:off 6:off
    ypserv     0:off 1:off 2:off 3:off 4:off 5:off 6:off
    ypxfrd     0:off 1:off 2:off 3:off 4:off 5:off 6:off
    [root@cs-room443-i701 user01]# /sbin/chkconfig ypbind on
    [root@cs-room443-i701 user01]# /sbin/chkconfig yppasswdd on
    [root@cs-room443-i701 user01]# /sbin/chkconfig ypserv on
    [root@cs-room443-i701 user01]# /sbin/chkconfig ypxfrd on
    [root@cs-room443-i701 user01]# /sbin/chkconfig --list
    ypbind     0:off 1:off 2:on  3:on  4:on  5:on  6:off
    yppasswdd  0:off 1:off 2:on  3:on  4:on  5:on  6:off
    ypserv     0:off 1:off 2:on  3:on  4:on  5:on  6:off
    ypxfrd     0:off 1:off 2:on  3:on  4:on  5:on  6:off
    [root@cs-room443-i701 user01]# vi /etc/hosts
    [root@cs-room443-i701 user01]# cat /etc/yp.conf
    # /etc/yp.conf - ypbind configuration file
    (snip: stock comments)
    domain cs-pccluster4 server cs-room443-i701-sist
    [root@cs-room443-i701 user01]# vi /etc/sysconfig/network
    [root@cs-room443-i701 user01]# cat /etc/sysconfig/network
    NETWORKING=yes
    NETWORKING_IPV6=no
    HOSTNAME=cs-room443-i701
    NISDOMAIN=cs-pccluster4
    [root@cs-room443-i701 user01]# domainname
    (none)
    [root@cs-room443-i701 user01]# domainname cs-pccluster4
    [root@cs-room443-i701 user01]# domainname
    cs-pccluster4
    [root@cs-room443-i701 user01]# vi /etc/nsswitch.conf
    [root@cs-room443-i701 user01]# cat /etc/nsswitch.conf
    #
    # /etc/nsswitch.conf
    (snip: stock comments)
    #passwd:    files
    passwd:     db files nisplus nis
    #shadow:    files
    shadow:     db files nisplus nis
    #group:     files
    group:      db files nisplus nis
    hosts:      db files nisplus nis dns
    #hosts:     files dns
    (snip: stock comments)
    bootparams: nisplus [NOTFOUND=return] files
    ethers:     files
    netmasks:   files
    networks:   files
    protocols:  files
    rpc:        files
    services:   files
    netgroup:   nisplus
    publickey:  nisplus
    automount:  files nisplus
    aliases:    files nisplus
    [root@cs-room443-i701 user01]# cd /var/yp
    [root@cs-room443-i701 yp]# /sbin/service ypserv start
    Starting YP server services: [ OK ]
    [root@cs-room443-i701 yp]# make
    gmake[1]: Entering directory `/var/yp/cs-pccluster4'
    Updating passwd.byname...
    Updating passwd.byuid...
    Updating group.byname...
    Updating group.bygid...
    Updating hosts.byname...
    Updating hosts.byaddr...
    Updating rpc.byname...
    Updating rpc.bynumber...
    Updating services.byname...
    Updating services.byservicename...
    Updating netid.byname...
    Updating protocols.bynumber...
    Updating protocols.byname...
    Updating mail.aliases...
    gmake[1]: Leaving directory `/var/yp/cs-pccluster4'
    [root@cs-room443-i701 yp]# /sbin/service ypserv restart
    Stopping YP server services: [ OK ]
    Starting YP server services: [ OK ]
    [root@cs-room443-i701 yp]# /sbin/service ypbind restart
    Shutting down NIS services: [ OK ]
    Binding to the NIS domain: [ OK ]
    Listening for an NIS domain server.
    [root@cs-room443-i701 yp]# ypcat hosts
    133.88.120.88 cs-bacchus
    133.88.120.73 cs-room443-i701-sist
    133.88.121.242 cs-athena-sist
    133.88.160.142 cs-webserver03
    133.88.121.79 cs-muse
    192.168.2.1 cs-northpole-g
    133.88.120.197 cs-hera-sist
    133.88.160.141 cs-room509-wlan
    127.0.0.1 localhost.localdomain localhost
    127.0.0.1 localhost.localdomain localhost
    133.88.121.136 cs-northpole
    192.168.2.31 cs-hera
    133.88.120.72 cs-room443-02
    133.88.120.74 cs-room443-i702-sist
    192.168.2.32 cs-room443-i701
    133.88.120.87 cs-minerva
    133.88.160.140 cs-webserver01
    192.168.2.30 cs-athena
    192.168.2.33 cs-room443-i702
    133.88.120.84 cs-hestia
    [root@cs-room443-i701 yp]# ypcat passwd
    user01:$1$VtixqAtB$y4MjxaTGiX1FBJkjwUbl//:500:500:General User No.1 :/home/user01:/bin/bash
    (reboot)
    [user01@cs-room443-i701 ~]$ ypcat hosts
    (same host list as above)

  11. Run cs-room443-i702 as a NIS client.

    [root@cs-room443-i702 etc]# vi /etc/yp.conf
    [root@cs-room443-i702 etc]# vi /etc/sysconfig/network
    [root@cs-room443-i702 etc]# domainname cs-pccluster4
    [root@cs-room443-i702 etc]# /sbin/chkconfig ypbind on
    [root@cs-room443-i702 etc]# /sbin/service ypbind start
    Binding to the NIS domain: [ OK ]
    Listening for an NIS domain server.
    [root@cs-room443-i702 etc]# ypcat hosts
    (same host list as on cs-room443-i701)
    [root@cs-room443-i702 etc]# mount /home
    (reboot)
    [user01@cs-room443-i702 ~]$ ypcat hosts
    (same host list as on cs-room443-i701)
    [user01@cs-room443-i702 ~]$ mount
    /dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
    proc on /proc type proc (rw)
    sysfs on /sys type sysfs (rw)
    devpts on /dev/pts type devpts (rw,gid=5,mode=620)
    /dev/sda1 on /boot type ext3 (rw)
    tmpfs on /dev/shm type tmpfs (rw)
    none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
    sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
    cs-room443-i701-sist:/usr/local on /usr/local type nfs (rw,hard,intr,addr=133.88.120.73)
    cs-room443-i701-sist:/home on /home type nfs (rw,hard,intr,addr=133.88.120.73)
    [user01@cs-room443-i702 ~]$
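    Stripped of the transcript noise, making a node a client of this NIS domain comes down to two configuration lines plus starting ypbind. A sketch with the values used above:

    ```shell
    # The two client-side settings for the NIS domain used above.
    nis_domain=cs-pccluster4
    nis_server=cs-room443-i701-sist
    yp_conf_line="domain $nis_domain server $nis_server"   # goes into /etc/yp.conf
    network_line="NISDOMAIN=$nis_domain"                   # goes into /etc/sysconfig/network
    printf '%s\n%s\n' "$yp_conf_line" "$network_line"
    # In a real run, follow with: domainname "$nis_domain";
    # /sbin/chkconfig ypbind on; /sbin/service ypbind start
    ```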

  12. Modify the OpenMPI configuration on cs-room443-i701.

    [root@cs-room443-i701 user01]# cd /usr/lib64/openmpi/
    [root@cs-room443-i701 openmpi]# ls
    1.4-gcc
    [root@cs-room443-i701 openmpi]# cd 1.4-gcc
    [root@cs-room443-i701 1.4-gcc]# ls
    bin  etc  include  lib  man  share
    [root@cs-room443-i701 1.4-gcc]# cd etc
    [root@cs-room443-i701 etc]# ls
    mpivars.csh  openmpi-default-hostfile  openmpi-totalview.tcl
    mpivars.sh   openmpi-mca-params.conf
    [root@cs-room443-i701 etc]# cp openmpi-default-hostfile openmpi-default-hostfile.org
    [root@cs-room443-i701 etc]# vi openmpi-default-hostfile
    [root@cs-room443-i701 etc]# cp openmpi-mca-params.conf openmpi-mca-params.conf.org
    [root@cs-room443-i701 etc]# vi openmpi-mca-params.conf
    [root@cs-room443-i701 etc]# cat openmpi-default-hostfile
    # This is the default hostfile for Open MPI.
    (snip: stock comments)
    cs-room443-i701 cpu=4
    cs-room443-i702 cpu=4
    [root@cs-room443-i701 etc]# cat openmpi-mca-params.conf
    # This is the default system-wide MCA parameters defaults file.
    (snip: stock comments)
    btl=^openib,udapl
    btl_tcp_if_include=eth0
    (snip: stock comments)
    orte_default_hostfile = /usr/lib64/openmpi/1.4-gcc/etc/openmpi-default-hostfile
    plm_rsh_agent = ssh
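    Filtering out the stock comments, the cluster-specific MCA defaults added above do four things. An annotated restatement (the parameter lines are verbatim from the transcript; the comments are my reading of them -- `^openib,udapl` presumably excludes the InfiniBand/uDAPL transports because these nodes have none):

    ```shell
    # Annotated restatement of the MCA defaults set above.
    mca_sketch='# do not try the openib (InfiniBand) or udapl transports
    btl=^openib,udapl
    # carry MPI TCP traffic only over the GbE interface
    btl_tcp_if_include=eth0
    # node list used when mpirun is started without an explicit hostfile
    orte_default_hostfile = /usr/lib64/openmpi/1.4-gcc/etc/openmpi-default-hostfile
    # launch remote processes over ssh
    plm_rsh_agent = ssh'
    printf '%s\n' "$mca_sketch"
    ```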

  13. Check that the MPI program runs across both nodes.

    [user01@cs-room443-i701 ~]$ cat mpi_hellow.c
    #include <stdio.h>
    #include "mpi.h"
    (same source as in step 7)
    [user01@cs-room443-i701 ~]$ mpirun -np 8 ./a.out
    The authenticity of host 'cs-room443-i702 (192.168.2.33)' can't be established.
    RSA key fingerprint is 17:bb:ac:f1:93:1d:e8:8a:a7:23:c4:0d:76:70:7f:5d.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'cs-room443-i702' (RSA) to the list of known hosts.
    user01@cs-room443-i702's password: YOUR PASSWORD (not echoed)
    Hellow, MPI! (0/8)-- cs-room443-i701
    Hellow, MPI! (3/8)-- cs-room443-i701
    Hellow, MPI! (1/8)-- cs-room443-i701
    Hellow, MPI! (2/8)-- cs-room443-i701
    Hellow, MPI! (4/8)-- cs-room443-i702
    Hellow, MPI! (6/8)-- cs-room443-i702
    Hellow, MPI! (7/8)-- cs-room443-i702
    Hellow, MPI! (5/8)-- cs-room443-i702

  14. Register a public key so that ssh logins work without a password (because /home is shared over NFS, registering the key once covers both nodes).

    [user01@cs-room443-i701 ~]$ ssh-keygen
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/user01/.ssh/id_rsa):
    Enter passphrase (empty for no passphrase): (just press ENTER = no passphrase)
    Enter same passphrase again: (just press ENTER)
    Your identification has been saved in /home/user01/.ssh/id_rsa.
    Your public key has been saved in /home/user01/.ssh/id_rsa.pub.
    The key fingerprint is:
    b4:cb:0f:21:15:45:0a:50:e8:e2:09:d8:00:b9:18:45 user01@cs-room443-i701
    [user01@cs-room443-i701 ~]$ cd .ssh
    [user01@cs-room443-i701 .ssh]$ cp id_rsa.pub authorized_keys
    [user01@cs-room443-i701 .ssh]$ ssh cs-room443-i702
    Last login: Fri Dec 24 19:59:23 2010 from 133.88.121.171
    [user01@cs-room443-i702 ~]$   <-- confirm that you are logged in without a password!
    [user01@cs-room443-i702 ~]$ exit
    logout
    Connection to cs-room443-i702 closed.
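    The key mechanics (unattended key generation plus an authorized_keys copy) can be rehearsed safely in a scratch directory before touching ~/.ssh; the real run is exactly the ssh-keygen/cp pair shown above:

    ```shell
    # Rehearsal of the passwordless-key setup in a scratch directory
    # (real run: ~/.ssh/id_rsa and ~/.ssh/authorized_keys as shown above).
    workdir=$(mktemp -d)
    ssh-keygen -q -t rsa -N "" -f "$workdir/id_rsa"   # -N "" = empty passphrase
    cp "$workdir/id_rsa.pub" "$workdir/authorized_keys"
    ls "$workdir"
    ```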

  15. Confirm that the MPI program now runs without asking for a password.

    [user01@cs-room443-i701 .ssh]$ cd
    [user01@cs-room443-i701 ~]$ mpirun -np 8 ./a.out
    Hellow, MPI! (1/8)-- cs-room443-i701
    Hellow, MPI! (2/8)-- cs-room443-i701
    Hellow, MPI! (0/8)-- cs-room443-i701
    Hellow, MPI! (3/8)-- cs-room443-i701
    Hellow, MPI! (5/8)-- cs-room443-i702
    Hellow, MPI! (4/8)-- cs-room443-i702
    Hellow, MPI! (6/8)-- cs-room443-i702
    Hellow, MPI! (7/8)-- cs-room443-i702
    [user01@cs-room443-i701 ~]$

  16. If you apply the same OpenMPI configuration (see step 12) on cs-room443-i702 as well, multi-node MPI runs can also be launched from cs-room443-i702.

    [root@cs-room443-i702 user01]# cd /usr/lib64/openmpi/1.4-gcc/etc
    [root@cs-room443-i702 etc]# ls
    mpivars.csh  openmpi-default-hostfile  openmpi-totalview.tcl
    mpivars.sh   openmpi-mca-params.conf
    [root@cs-room443-i702 etc]# cp openmpi-default-hostfile openmpi-default-hostfile .org
    cp: target `.org' is not a directory
    [root@cs-room443-i702 etc]# cp openmpi-default-hostfile openmpi-default-hostfile.org
    [root@cs-room443-i702 etc]# scp user01@cs-room443-i701:/usr/lib64/openmpi/1.4-gcc/etc/openmpi-default-hostfile ./
    The authenticity of host 'cs-room443-i701 (192.168.2.32)' can't be established.
    RSA key fingerprint is 27:d6:4d:3a:08:9d:cd:38:7f:83:1f:a7:c2:d8:ed:4a.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'cs-room443-i701,192.168.2.32' (RSA) to the list of known hosts.
    user01@cs-room443-i701's password: YOUR PASSWORD
    openmpi-default-hostfile                  100% 1551     1.5KB/s   00:00
    [root@cs-room443-i702 etc]# cp openmpi-mca-params.conf openmpi-mca-params.conf.org
    [root@cs-room443-i702 etc]# scp user01@cs-room443-i701:/usr/lib64/openmpi/1.4-gcc/etc/openmpi-mca-params.conf ./
    user01@cs-room443-i701's password: YOUR PASSWORD
    openmpi-mca-params.conf                   100% 2955     2.9KB/s   00:00
    [root@cs-room443-i702 etc]# exit
    exit
    [user01@cs-room443-i702 ~]$ ls
    Desktop  a.out  hellow-mpi.c  mpi_hellow.c  pool
    how_to_setup_openmpi_cluster_with_centos.txt
    how_to_setup_openmpi_cluster_with_centos.txt~
    [user01@cs-room443-i702 ~]$ mpirun -np 8 ./a.out
    The authenticity of host 'cs-room443-i701 (192.168.2.32)' can't be established.
    RSA key fingerprint is 27:d6:4d:3a:08:9d:cd:38:7f:83:1f:a7:c2:d8:ed:4a.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'cs-room443-i701,192.168.2.32' (RSA) to the list of known hosts.
    Hellow, MPI! (0/8)-- cs-room443-i701
    Hellow, MPI! (2/8)-- cs-room443-i701
    Hellow, MPI! (1/8)-- cs-room443-i701
    Hellow, MPI! (3/8)-- cs-room443-i701
    Hellow, MPI! (4/8)-- cs-room443-i702
    Hellow, MPI! (5/8)-- cs-room443-i702
    Hellow, MPI! (6/8)-- cs-room443-i702
    Hellow, MPI! (7/8)-- cs-room443-i702
    [user01@cs-room443-i702 ~]$ ssh-keygen
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/user01/.ssh/id_rsa): ./.ssh/id_rsa_2
    Enter passphrase (empty for no passphrase): (just press ENTER)
    Enter same passphrase again: (just press ENTER)
    Your identification has been saved in ./.ssh/id_rsa_2.
    Your public key has been saved in ./.ssh/id_rsa_2.pub.
    The key fingerprint is:
    28:7d:85:50:86:ba:d8:4d:91:64:e8:85:b8:34:e2:44 user01@cs-room443-i702
    [user01@cs-room443-i702 ~]$ cd .ssh
    [user01@cs-room443-i702 .ssh]$ cat id_rsa_2.pub >> authorized_keys
    [user01@cs-room443-i702 .ssh]$ ssh cs-room443-i701
    Last login: Fri Dec 24 19:53:38 2010
    [user01@cs-room443-i701 ~]$ exit
    logout
    Connection to cs-room443-i701 closed.
    [user01@cs-room443-i702 .ssh]$ cd
    [user01@cs-room443-i702 ~]$ mpirun -np 8 ./a.out
    Hellow, MPI! (5/8)-- cs-room443-i702
    Hellow, MPI! (6/8)-- cs-room443-i702
    Hellow, MPI! (4/8)-- cs-room443-i702
    Hellow, MPI! (7/8)-- cs-room443-i702
    Hellow, MPI! (0/8)-- cs-room443-i701
    Hellow, MPI! (1/8)-- cs-room443-i701
    Hellow, MPI! (2/8)-- cs-room443-i701
    Hellow, MPI! (3/8)-- cs-room443-i701
    [user01@cs-room443-i702 ~]$

  17. Run a benchmark to measure performance. For IMB, download IMB_3.2.tgz from http://software.intel.com/en-us/articles/intel-mpi-benchmarks/, unpack it, move to imb/src, then compile and run. Details follow.

    [user01@cs-room443-i701 imb]$ tar zxvf IMB_3.2.tgz.tgz
    imb/
    imb/ReadMe_first
    imb/WINDOWS/
    imb/WINDOWS/IMB-EXT_VS_2005/
    imb/WINDOWS/IMB-EXT_VS_2005/IMB-EXT.rc
    (snip)
    imb/src/make_ict
    imb/src/make_mpich
    imb/versions_news/
    imb/versions_news/Version_history
    [user01@cs-room443-i701 imb]$ ls
    IMB_3.2.tgz.tgz  imb
    [user01@cs-room443-i701 imb]$ cd imb
    [user01@cs-room443-i701 imb]$ ls
    ReadMe_first  WINDOWS  doc  license  src  versions_news
    [user01@cs-room443-i701 imb]$ cd src
    [user01@cs-room443-i701 src]$ ls
    (source file listing: GNUmakefile, Makefile.base, make_ict, make_mpich, IMB.c, IMB_*.c, IMB_*.h, ...)
    [user01@cs-room443-i701 src]$ make -f make_mpich MPI_HOME=/usr/lib64/openmpi/1.4-gcc
    touch exe_io *.c; rm -rf exe_ext exe_mpi1
    make -f Makefile.base IO CPP=MPIIO
    make[1]: Entering directory `/home/user01/pool/imb/imb/src'
    /usr/lib64/openmpi/1.4-gcc/bin/mpicc -I/usr/lib64/openmpi/1.4-gcc/include -DMPIIO -O3 -c IMB.c
    /usr/lib64/openmpi/1.4-gcc/bin/mpicc -I/usr/lib64/openmpi/1.4-gcc/include -DMPIIO -O3 -c IMB_declare.c
    (snip)
    make[1]: Leaving directory `/home/user01/pool/imb/imb/src'
    [user01@cs-room443-i701 src]$ ls
    IMB-EXT  IMB-IO  IMB-MPI1  (sources and object files, ...)
    (snip)
    [user01@cs-room443-i701 src]$ mpirun -np 8 ./IMB-MPI1
    #---------------------------------------------------
    # Intel (R) MPI Benchmark Suite V3.2, MPI-1 part
    #---------------------------------------------------
    # Date                  : Fri Dec 24 20:27:52 2010
    # Machine               : x86_64
    # System                : Linux
    # Release               : 2.6.18-194.26.1.el5
    # Version               : #1 SMP Tue Nov 9 12:54:20 EST 2010
    # MPI Version           : 2.1
    # MPI Thread Environment: MPI_THREAD_SINGLE

    # New default behavior from Version 3.2 on:
    # the number of iterations per message size is cut down
    # dynamically when a certain run time (per message size sample)
    # is expected to be exceeded. Time limit is defined by variable
    # "SECS_PER_SAMPLE" (=> IMB_settings.h)
    # or through the flag => -time

    # Calling sequence was:
    # ./IMB-MPI1

    # Minimum message length in bytes:   0
    # Maximum message length in bytes:   4194304
    #
    # MPI_Datatype                   :   MPI_BYTE
    # MPI_Datatype for reductions    :   MPI_FLOAT
    # MPI_Op                         :   MPI_SUM
    #
    # List of Benchmarks to run:
    # PingPong, PingPing, Sendrecv, Exchange, Allreduce, Reduce,
    # Reduce_scatter, Allgather, Allgatherv, Gather, Gatherv,
    # Scatter, Scatterv, Alltoall, Alltoallv, Bcast, Barrier

    #---------------------------------------------------
    # Benchmarking PingPong
    # #processes = 2
    # ( 6 additional processes waiting in MPI_Barrier)
    #---------------------------------------------------
           #bytes #repetitions      t[usec]   Mbytes/sec
                0         1000         1.16         0.00
                1         1000         0.95         1.01
                2         1000         0.96         1.99
                4         1000         0.95         4.01
                8         1000         0.96         7.96
               16         1000         0.51        30.15
               32         1000         0.52        59.20
               64         1000         0.53       115.47
              128         1000         0.52       236.54
              256         1000         0.54       452.60
              512         1000         0.59       823.98
             1024         1000         0.66      1476.30
             2048         1000         0.89      2185.99
             4096         1000         3.02      1293.64
             8192         1000         2.59      3020.37
            16384         1000         3.86      4051.06
            32768         1000         7.34      4259.87
            65536          640        12.01      5204.58
           131072          320        23.72      5270.80
           262144          160        45.98      5437.00
           524288           80        77.31      6467.83
          1048576           40       178.55      5600.71
          2097152           20       451.68      4427.93
          4194304           10      1062.89      3763.31
    (snip)
    #---------------------------------------------------
    # Benchmarking Barrier
    # #processes = 2
    # ( 6 additional processes waiting in MPI_Barrier)
    #---------------------------------------------------
     #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
             1000         0.84         0.84         0.84
    #---------------------------------------------------
    # Benchmarking Barrier
    # #processes = 4
    # ( 4 additional processes waiting in MPI_Barrier)
    #---------------------------------------------------
     #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
             1000         3.68         3.69         3.69
    #---------------------------------------------------
    # Benchmarking Barrier
    # #processes = 8
    #---------------------------------------------------
     #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
             1000        68.50        69.61        69.06

    # All processes entering MPI_Finalize
    [user01@cs-room443-i701 src]$
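    As a sanity check on the PingPong column headings: the Mbytes/sec figure is the message size in 2^20-byte MBytes divided by the reported one-way time. Recomputing one row (524288 bytes, 77.31 usec) reproduces the table value up to rounding, since IMB divides by the unrounded measured time:

    ```shell
    # Recompute the PingPong Mbytes/sec figure for one row of the table above.
    awk 'BEGIN {
        bytes = 524288     # message size in bytes
        t     = 77.31      # reported one-way time in microseconds
        printf "%.2f\n", (bytes / 1048576) / (t / 1e6)   # MBytes per second
    }'
    ```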


Tomonori Kouya < tkouya [ at ] cs.sist.ac.jp >
