MPICH2 User’s Guide∗

Version 1.0.7

Mathematics and Computer Science Division

Argonne National Laboratory

William Gropp, Ewing Lusk, David Ashton, Pavan Balaji, Darius Buntinas, Ralph Butler, Anthony Chan, David Goodell, Jayesh Krishna, Guillaume Mercier, Rob Ross, Rajeev Thakur, Brian Toonen

April 2, 2008

∗This work was supported by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Advanced Scientific Computing Research, SciDAC Program, Office of Science, U.S. Department of Energy, under Contract DE-AC02-06CH11357.

Contents

1 Introduction
2 Migrating to MPICH2 from MPICH1
    2.1 Default Runtime Environment
    2.2 Starting Parallel Jobs
    2.3 Command-Line Arguments in Fortran
3 Quick Start
4 Compiling and Linking
    4.1 Specifying Compilers
    4.2 Shared Libraries
    4.3 Special Issues for C++
    4.4 Special Issues for Fortran
5 Running Programs with mpiexec
    5.1 Standard mpiexec
    5.2 Extensions for All Process Management Environments
    5.3 Extensions for the MPD Process Management Environment
        5.3.1 Basic mpiexec arguments for MPD
        5.3.2 Other Command-Line Arguments to mpiexec for MPD
        5.3.3 Environment Variables Affecting mpiexec for MPD
    5.4 Extensions for SMPD Process Management Environment
        5.4.1 mpiexec arguments for SMPD
    5.5 Extensions for the gforker Process Management Environment
        5.5.1 mpiexec arguments for gforker
    5.6 Restrictions of the remshell Process Management Environment
    5.7 Using MPICH2 with SLURM and PBS
        5.7.1 MPD in the PBS environment
        5.7.2 OSC mpiexec
6 Managing the Process Management Environment
    6.1 MPD
7 Debugging
    7.1 gdb via mpiexec
    7.2 TotalView
8 MPE
    8.1 MPI Logging
    8.2 User-defined logging
    8.3 MPI Checking
    8.4 MPE options
9 Other Tools Provided with MPICH2
10 MPICH2 under Windows
    10.1 Directories
    10.2 Compiling
    10.3 Running
A Frequently Asked Questions
    A.1 General Information
        A.1.1 Q: What is MPICH2?
        A.1.2 Q: What does MPICH stand for?
        A.1.3 Q: Can MPI be used to program multicore systems?
    A.2 Building MPICH2
        A.2.1 Q: What is the difference between the MPD and SMPD process managers?
        A.2.2 Q: Do I have to configure/make/install MPICH2 each time for each compiler I use?
        A.2.3 Q: How do I configure to use the Absoft Fortran compilers?
        A.2.4 Q: When I configure MPICH2, I get a message about FD_ZERO and the configure aborts.
        A.2.5 Q: When I use the g95 Fortran compiler on a 64-bit platform, some of the tests fail.
        A.2.6 Q: When I run make, it fails immediately with many errors beginning with “sock.c:8:24: mpidu_sock.h: No such file or directory In file included from sock.c:9: ../../../../include/mpiimpl.h:91:21: mpidpre.h: No such file or directory In file included from sock.c:9: ../../../../include/mpiimpl.h:1150: error: syntax error before ”MPID_VCRT” ../../../../include/mpiimpl.h:1150: warning: no semicolon at end of struct or union”.
        A.2.7 Q: When building the ssm or sshm channel, I get the error “mpidu_process_locks.h:234:2: error: #error *** No atomic memory operation specified to implement busy locks ***”
        A.2.8 Q: When using the Intel Fortran 90 compiler (version 9), the make fails with errors in compiling statement that reference MPI_ADDRESS_KIND.
        A.2.9 Q: The build fails when I use parallel make.
    A.3 Windows version of MPICH2
        A.3.1 I am having trouble installing and using the Windows version of MPICH2
    A.4 Compiling MPI Programs
        A.4.1 C++ and SEEK_SET
        A.4.2 C++ and Errors in Nullcomm::Clone
    A.5 Running MPI Programs
        A.5.1 Q: How do I pass environment variables to the processes of my parallel program
        A.5.2 Q: How do I pass environment variables to the processes of my parallel program when using the mpd process manager?
        A.5.3 Q: What determines the hosts on which my MPI processes run?
        A.5.4 Q: On Windows, I get an error when I attempt to call MPI_Comm_spawn.
        A.5.5 Q: My output does not appear until the program exits
        A.5.6 Q: Fortran programs using stdio fail when using g95.
        A.5.7 Q: How do I run MPI programs in the background when using the default MPD process manager?

1 Introduction

This manual assumes that MPICH2 has already been installed. For instructions on how to install MPICH2, see the MPICH2 Installer’s Guide, or the README in the top-level MPICH2 directory. This manual explains how to compile, link, and run MPI applications, and use certain tools that come with MPICH2. This is a preliminary version and some sections are not complete yet. However, there should be enough here to get you started with MPICH2.

2 Migrating to MPICH2 from MPICH1

If you have been using MPICH 1.2.x (1.2.7p1 is the latest version), you will find a number of things about MPICH2 that are different (and hopefully better in every case). Your MPI application programs need not change, of course, but a number of things about how you run them will be different.

MPICH2 is an all-new implementation of the MPI Standard, designed to implement all of the MPI-2 additions to MPI (dynamic process management, one-sided operations, parallel I/O, and other extensions) and to apply the lessons learned in implementing MPICH1 to make MPICH2 more robust, efficient, and convenient to use. The MPICH2 Installer’s Guide provides some information on changes between MPICH1 and MPICH2 to the process of configuring and installing MPICH. Changes to compiling, linking, and running MPI programs between MPICH1 and MPICH2 are described below.

2.1 Default Runtime Environment

In MPICH1, the default configuration used the now-old p4 portable programming environment. Processes were started via remote shell commands (rsh or ssh) and the information necessary for processes to find and connect with one another over sockets was collected and then distributed at startup time in a non-scalable fashion. Furthermore, the entanglement of process management functionality with the communication mechanism led to confusing behavior of the system when things went wrong.

MPICH2 provides a separation of process management and communication. The default runtime environment consists of a set of daemons, called mpd’s, that establish communication among the machines to be used before application process startup, thus providing a clearer picture of what is wrong when communication cannot be established and providing a fast and scalable startup mechanism when parallel jobs are started. Section 6.1 describes the MPD process management system in more detail. Other process managers are also available.

2.2 Starting Parallel Jobs

MPICH1 provided the mpirun command to start MPICH1 jobs. The MPI-2 Forum recommended a standard, portable command, called mpiexec, for this purpose. MPICH2 implements mpiexec and all of its standard arguments, together with some extensions. See Section 5.1 for standard arguments to mpiexec and various subsections of Section 5 for extensions particular to various process management systems.

MPICH2 also provides an mpirun command for simple backward compatibility, but MPICH2’s mpirun does not provide all the options of mpiexec or all of the options of MPICH1’s mpirun.

2.3 Command-Line Arguments in Fortran

MPICH1 (more precisely MPICH1’s mpirun) required access to command line arguments in all application programs, including Fortran ones, and MPICH1’s configure devoted some effort to finding the libraries that contained the right versions of iargc and getarg and including those libraries with which the mpif77 script linked MPI programs. Since MPICH2 does not require access to command line arguments to applications, these functions are optional, and configure does nothing special with them. If you need them in your applications, you will have to ensure that they are available in the Fortran environment you are using.

3 Quick Start

To use MPICH2, you will have to know the directory where MPICH2 has been installed. (Either you installed it there yourself, or your systems administrator has installed it. One place to look in this case might be /usr/local. If MPICH2 has not yet been installed, see the MPICH2 Installer’s Guide.) We suggest that you put the bin subdirectory of that directory into your path. This will give you access to assorted MPICH2 commands to compile, link, and run your programs conveniently. Other commands in this directory manage parts of the run-time environment and execute tools.

One of the first commands you might run is mpich2version to find out the exact version and configuration of MPICH2 you are working with. Some of the material in this manual depends on just what version of MPICH2 you are using and how it was configured at installation time.

You should now be able to run an MPI program. Let us assume that the directory where MPICH2 has been installed is /home/you/mpich2-installed, and that you have added its bin subdirectory to your path, using

    setenv PATH /home/you/mpich2-installed/bin:$PATH

for tcsh and csh, or

    export PATH=/home/you/mpich2-installed/bin:$PATH

for bash or sh. Then to run an MPI program, albeit only on one machine, you can do:

    mpd &
    cd /home/you/mpich2-installed/examples
    mpiexec -n 3 cpi
    mpdallexit

Details for these commands are provided below, but if you can successfully execute them here, then you have a correctly installed MPICH2 and have run an MPI program.

4 Compiling and Linking

A convenient way to compile and link your program is by using scripts that use the same compiler that MPICH2 was built with. These are mpicc, mpicxx, mpif77, and mpif90, for C, C++, Fortran 77, and Fortran 90 programs, respectively. If any of these commands are missing, it means that MPICH2 was configured without support for that particular language.

4.1 Specifying Compilers

You need not use the same compiler that MPICH2 was built with, but not all compilers are compatible. By default, the compiling and linking commands from the previous section use the compiler that MPICH2 itself was built with, as reported by mpich2version. The environment variables MPICH_CC, MPICH_CXX, MPICH_F77, and MPICH_F90 may be used to specify alternate C, C++, Fortran 77, and Fortran 90 compilers, respectively.

4.2 Shared Libraries

Currently shared libraries are only tested on Linux and Mac OS X, and there are restrictions. See the Installer’s Guide for how to build MPICH2 as a shared library. If shared libraries have been built, you will get them automatically when you link your program with any of the MPICH2 compilation scripts.

4.3 Special Issues for C++

Some users may get error messages such as

    SEEK_SET is #defined but must not be for the C++ binding of MPI

The problem is that both stdio.h and the MPI C++ interface use SEEK_SET, SEEK_CUR, and SEEK_END. This is really a bug in the MPI-2 standard. You can try adding

    #undef SEEK_SET
    #undef SEEK_END
    #undef SEEK_CUR

before mpi.h is included, or add the definition

    -DMPICH_IGNORE_CXX_SEEK

to the command line (this will cause the MPI versions of SEEK_SET etc. to be skipped).

4.4 Special Issues for Fortran

MPICH2 provides two kinds of support for Fortran programs. For Fortran 77 programmers, the file mpif.h provides the definitions of the MPI constants such as MPI_COMM_WORLD. Fortran 90 programmers should use the MPI module instead; this provides all of the definitions as well as interface definitions for many of the MPI functions. However, this MPI module does not provide full Fortran 90 support; in particular, interfaces for the routines, such as MPI_Send, that take “choice” arguments are not provided.

5 Running Programs with mpiexec

If you have been using the original MPICH, or any of a number of other MPI implementations, then you have probably been using mpirun as a way to start your MPI programs. The MPI-2 Standard describes mpiexec as a suggested way to run MPI programs. MPICH2 implements the mpiexec standard, and also provides some extensions. MPICH2 provides mpirun for backward compatibility with existing scripts, but it does not support all of the options of mpiexec, nor all of the options of MPICH1’s mpirun.

5.1 Standard mpiexec

Here we describe the standard mpiexec arguments from the MPI-2 Standard [1]. The simplest form of a command to start an MPI job is

    mpiexec -n 32 a.out

to start the executable a.out with 32 processes (providing an MPI_COMM_WORLD of size 32 inside the MPI application). Other options are supported, for specifying hosts to run on, search paths for executables, working directories, and even a more general way of specifying a number of processes. Multiple sets of processes can be run with different executables and different values for their arguments, with “:” separating the sets of processes, as in:

    mpiexec -n 1 -host loginnode master : -n 32 -host smp slave

The -configfile argument allows one to specify a file containing the specifications for process sets on separate lines in the file. This makes it unnecessary to have long command lines for mpiexec. (See pg. 353 of [2].)

It is also possible to start a one-process MPI job (with an MPI_COMM_WORLD whose size is equal to 1), without using mpiexec. This process will become an MPI process when it calls MPI_Init, and it may then call other MPI functions. Currently, MPICH2 does not fully support calling the dynamic process routines from MPI-2 (e.g., MPI_Comm_spawn or MPI_Comm_accept) from processes that are not started with mpiexec.
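As an illustration of the -configfile form mentioned above, the following sketch turns a colon-separated argument list into one process set per line. The helper name args_to_configfile is hypothetical and not part of MPICH2, and the sketch assumes that sets are separated by a colon with a space on each side.

```shell
#!/bin/sh
# Illustration only: args_to_configfile is a hypothetical helper, not an
# MPICH2 command. It prints each process set of a colon-separated mpiexec
# argument list on its own line.
args_to_configfile() {
    # Only " : " (colon with surrounding spaces) is treated as a separator,
    # so a colon inside an argument such as host:port is left untouched.
    printf '%s\n' "$1" | awk '{ gsub(/ : /, "\n"); print }'
}

args_to_configfile "-n 1 -host loginnode master : -n 32 -host smp slave"
```

Redirecting the output to a file (for example job.cfg, a name chosen here for illustration) would give something that could be passed as mpiexec -configfile job.cfg.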

5.2 Extensions for All Process Management Environments

Some mpiexec arguments are specific to particular communication subsystems (“devices”) or process management environments (“process managers”). Our intention is to make all arguments as uniform as possible across devices and process managers. For the time being we will document these separately.

5.3 Extensions for the MPD Process Management Environment

MPICH2 provides a number of process management systems. The default is called MPD. MPD provides a number of extensions to the standard form of mpiexec.

5.3.1 Basic mpiexec arguments for MPD

The default configuration of MPICH2 chooses the MPD process manager and the “simple” implementation of the Process Management Interface. MPD provides a version of mpiexec that supports both the standard arguments described in Section 5.1 and other arguments described in this section. MPD also provides a number of commands for querying the MPD process management environment and interacting with jobs it has started. Before running mpiexec, the runtime environment must be established. In the case of MPD, the daemons must be running. See Section 6.1 for how to run and manage the MPD daemons.

We assume that the MPD ring is up and the installation’s bin directory is in your path; that is, you can do:

    mpdtrace

and it will output a list of nodes on which you can run MPI programs. Now you are ready to run a program with mpiexec. Let us assume that you have compiled and linked the program cpi (in the installdir/examples directory) and that this directory is in your PATH, or that it is your current working directory and ‘.’ (“dot”) is in your PATH. The simplest thing to do is

    mpiexec -n 5 cpi

to run five cpi processes. The process management system (such as MPD) will choose machines to run them on, and cpi will tell you where each is running.

You can use mpiexec to run non-MPI programs as well. This is sometimes useful in making sure all the machines are up and ready for use. Useful examples include

    mpiexec -n 10 hostname

and

    mpiexec -n 10 printenv

5.3.2 Other Command-Line Arguments to mpiexec for MPD

The MPI-2 standard specifies the syntax and semantics of the arguments -n, -path, -wdir, -host, -file, -configfile, and -soft. All of these are currently implemented for MPD’s mpiexec. Each of these is what we call a “local” option, since its scope is the processes in the set of processes described between colons, or on separate lines of the file specified by -configfile. We add some extensions that are local in this way and some that are “global” in the sense that they apply to all the processes being started by the invocation of mpiexec.

The MPI-2 Standard provides a way to pass different arguments to different application processes, but does not provide a way to pass environment variables. MPICH2 provides an extension that supports environment variables. The local parameter -env does this for one set of processes. That is,

    mpiexec -n 1 -env FOO BAR a.out : -n 2 -env BAZZ FAZZ b.out

makes BAR the value of environment variable FOO on the first process, running the executable a.out, and gives the environment variable BAZZ the value FAZZ on the other two processes, running the executable b.out. To set an environment variable without giving it a value, use '' as the value in the above command line.

The global parameter -genv can be used to pass the same environment variables to all processes. That is,

    mpiexec -genv FOO BAR -n 2 a.out : -n 4 b.out

makes BAR the value of the environment variable FOO on all six processes. If -genv appears, it must appear in the first group. If both -genv and -env are used, the -env’s add to the environment specified or added to by the -genv variables. If there is only one set of processes (no “:”), the -genv and -env are equivalent.

The local parameter -envall is an abbreviation for passing the entire environment in which mpiexec is executed. The global version of it is -genvall. This global version is implicitly present. To pass no environment variables, use -envnone and -genvnone. So, for example, to set only the environment variable FOO and no others, regardless of the current environment, you would use

    mpiexec -genvnone -env FOO BAR -n 50 a.out

In the case of MPD, we currently make an exception for the PATH environment variable, which is always passed through. This exception was added to make it unnecessary to explicitly pass this variable in the default case.

A list of environment variable names whose values are to be copied from the current environment can be given with the -envlist (respectively, -genvlist) parameter; for example,

    mpiexec -genvnone -envlist HOME,LD_LIBRARY_PATH -n 50 a.out

sets the HOME and LD_LIBRARY_PATH in the environment of the a.out processes to their values in the environment where mpiexec is being run. In this situation you can’t have commas in the environment variable names, although of course they are permitted in values.
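The effect of combining -genvnone with -envlist can be approximated on a single machine with the standard env utility. The sketch below is only an analogy: run_with_envlist is a hypothetical helper, not an MPICH2 command, and it says nothing about how mpiexec itself is implemented.

```shell
#!/bin/sh
# Illustration only: run_with_envlist is a hypothetical helper, not an
# MPICH2 command. It starts a command with an otherwise empty environment,
# copying just the comma-separated variable names given in $1, similar in
# spirit to "mpiexec -genvnone -envlist NAME1,NAME2 ...".
run_with_envlist() {
    names=$1
    shift
    old_ifs=$IFS
    IFS=,
    for name in $names; do
        # Copy the variable only if it is set in the current environment.
        eval "is_set=\${$name+yes}"
        if [ "$is_set" = yes ]; then
            eval "value=\$$name"
            set -- "$name=$value" "$@"
        fi
    done
    IFS=$old_ifs
    # PATH is passed through as well, mirroring the MPD exception for PATH.
    env -i PATH="$PATH" "$@"
}

FOO=BAR
BAZZ=FAZZ
export FOO BAZZ
run_with_envlist FOO sh -c 'echo "FOO=${FOO-unset} BAZZ=${BAZZ-unset}"'
```

In this run only FOO (plus PATH) reaches the child process, so the child reports BAZZ as unset.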

Some extension parameters have only global versions. They are

-l provides rank labels for lines of stdout and stderr. These are a bit obscure for processes that have been explicitly spawned, but are still useful.

-usize sets the “universe size” that is retrieved by the MPI attribute MPI_UNIVERSE_SIZE on MPI_COMM_WORLD.

-bnr is used when one wants to run executables that have been compiled and linked using the ch_p4mpd or myrinet device in MPICH1. The MPD process manager provides backward compatibility in this case.

-machinefile can be used to specify information about each of a set of machines. This information may include the number of processes to run on each host when executing user programs. For example, assume that a machinefile named mf contains:

    # comment line
    hosta
    hostb:2
    hostc ifhn=hostc-gige
    hostd:4 ifhn=hostd-gige

In addition to specifying hosts and number of processes to run on each, this machinefile indicates that processes running on hostc and hostd should use the gige interface on hostc and hostd respectively for MPI communications. (ifhn stands for “interface host name” and should be set to an alternate host name for the machine that is used to designate an alternate communication interface.) This interface information causes the MPI implementation to choose the alternate host name when making connections. When the alternate hostname specifies a particular interface, MPICH communication will then travel over that interface.

You might use this machinefile in the following way:

    mpiexec -machinefile mf -n 7 p0

Process rank 0 is to run on hosta, ranks 1 and 2 on hostb, rank 3 on hostc, and ranks 4-6 on hostd. Note that the file specifies information for up to 8 ranks and we only used 7. That is OK. But, if we had used “-n 9”, an error would be raised. The file is not used as a pool of machines that are cycled through; the processes are mapped to the hosts in the order specified in the file.
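The rank-to-host mapping described above can be sketched as a small script. This is an illustration of the documented behavior (hosts filled in file order, no cycling, an error when more ranks are requested than the file provides), not the code mpiexec uses; map_ranks is a hypothetical helper name.

```shell
#!/bin/sh
# Illustration only: map_ranks is a hypothetical helper, not an MPICH2
# command. $1 is a machinefile, $2 the number of ranks requested.
map_ranks() {
    awk -v n="$2" '
        /^[ \t]*#/ || NF == 0 { next }        # skip comments and blank lines
        {
            split($1, part, ":")              # "hostb:2" -> host hostb, 2 slots
            slots = (part[2] == "") ? 1 : part[2] + 0
            for (i = 0; i < slots; i++)
                host[total++] = part[1]       # hosts are used in file order
        }
        END {
            if (total < n) exit 1             # no cycling through the file
            for (r = 0; r < n; r++) print r, host[r]
        }
    ' "$1" || { echo "map_ranks: machinefile has fewer than $2 slots" >&2; return 1; }
}

# Recreate the machinefile from the text and map seven ranks.
mf=$(mktemp)
cat > "$mf" <<'EOF'
# comment line
hosta
hostb:2
hostc ifhn=hostc-gige
hostd:4 ifhn=hostd-gige
EOF
map_ranks "$mf" 7
rm -f "$mf"
```

The ifhn= field is ignored here because it affects which interface is used, not where a rank is placed.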

A more complex command-line example might be:

    mpiexec -l -machinefile mf -n 3 p1 : -n 2 p2 : -n 2 p3

Here, ranks 0-2 all run program p1 and are executed placing rank 0 on hosta and ranks 1-2 on hostb. Similarly, ranks 3-4 run p2 and are executed on hostc and hostd, respectively. Ranks 5-6 run on hostd and execute p3.

-s can be used to direct the stdin of mpiexec to specific processes in a parallel job. For example:

    mpiexec -s all -n 5 a.out

directs the stdin of mpiexec to all five processes.

    mpiexec -s 4 -n 5 a.out

directs it to just the process with rank 4, and

    mpiexec -s 1,3 -n 5 a.out

sends it to processes 1 and 3, while

    mpiexec -s 0-3 -n 5 a.out

sends stdin to processes 0, 1, 2, and 3.

The default, if -s is not specified, is to send mpiexec’s stdin to process 0 only.
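To make the -s forms shown above concrete, the following sketch expands a specification into the list of ranks it selects. The helper expand_stdin_spec is hypothetical and only mimics the documented forms (all, a single rank, a comma-separated list, and a range); it is not taken from the mpiexec sources.

```shell
#!/bin/sh
# Illustration only: expand_stdin_spec is a hypothetical helper, not an
# MPICH2 command. $1 is a -s style specification, $2 the process count.
expand_stdin_spec() {
    spec=$1
    nprocs=$2
    case $spec in
        all) spec="0-$((nprocs - 1))" ;;      # "all" means every rank
    esac
    old_ifs=$IFS
    IFS=,
    out=
    for part in $spec; do
        case $part in
            *-*) first=${part%-*}; last=${part#*-} ;;   # a range such as 0-3
            *)   first=$part; last=$part ;;             # a single rank
        esac
        r=$first
        while [ "$r" -le "$last" ]; do
            out="$out${out:+ }$r"
            r=$((r + 1))
        done
    done
    IFS=$old_ifs
    echo "$out"
}

expand_stdin_spec all 5
expand_stdin_spec 1,3 5
expand_stdin_spec 0-3 5
```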

The redirection of stdin through mpiexec to various MPI processes is intended primarily for interactive use. Because of the complexity of buffering large amounts of data at various processes that may not have read it yet, the redirection of large amounts of data to mpiexec’s stdin is discouraged, and may cause unexpected results. That is,

    mpiexec -s all -n 5 a.out < bigfile

should not be used if bigfile is more than a few lines long. Have one of the processes open the file and read it instead. The functions in MPI-IO may be useful for this purpose.

A “:” can optionally be used between global args and normal argument sets, e.g.:

    mpiexec -l -n 1 -host host1 pgm1 : -n 4 -host host2 pgm2

is equivalent to:

    mpiexec -l : -n 1 -host host1 pgm1 : -n 4 -host host2 pgm2

This option implies that the global arguments can occur on a separate line in the file specified by -configfile when it is used to replace a long command line.

5.3.3 Environment Variables Affecting mpiexec for MPD

A small number of environment variables affect the behavior of mpiexec.

MPIEXEC_TIMEOUT The value of this environment variable is the maximum number of seconds this job will be permitted to run. When time is up, the job is aborted.

MPIEXEC_PORT_RANGE If this environment variable is defined then the MPD system will restrict its usage of ports for connecting its various processes to ports in this range. If this variable is not assigned, but MPICH_PORT_RANGE is assigned, then it will use the range specified by MPICH_PORT_RANGE for its ports. Otherwise, it will use whatever ports are assigned to it by the system. Port ranges are given as a pair of integers separated by a colon.

MPIEXEC_BNR If this environment variable is defined (its value, if any, is currently insignificant), then MPD will act in backward-compatibility mode, supporting the BNR interface from the original MPICH (e.g. versions 1.2.0 – 1.2.7p1) instead of its native PMI interface, as a way for application processes to interact with the process management system.

MPD_CON_EXT Adds a string to the default Unix socket name used by mpiexec to find the local mpd. This allows one to run multiple mpd rings at the same time.
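The lookup order for the port range variables described above can be sketched as follows. The helper effective_port_range is hypothetical, and treating an empty value like an unset one is an assumption of this sketch rather than documented behavior.

```shell
#!/bin/sh
# Illustration only: effective_port_range is a hypothetical helper, not an
# MPICH2 command. It prefers MPIEXEC_PORT_RANGE, falls back to
# MPICH_PORT_RANGE, and otherwise reports that the system chooses ports.
effective_port_range() {
    range=${MPIEXEC_PORT_RANGE:-${MPICH_PORT_RANGE:-}}
    if [ -z "$range" ]; then
        echo "system-assigned"
        return 0
    fi
    case $range in
        *:*) low=${range%%:*}; high=${range#*:} ;;
        *)   echo "bad port range: $range" >&2; return 1 ;;
    esac
    # Both halves must be non-empty strings of digits.
    case $low in ''|*[!0-9]*) echo "bad port range: $range" >&2; return 1 ;; esac
    case $high in ''|*[!0-9]*) echo "bad port range: $range" >&2; return 1 ;; esac
    echo "$low $high"
}

unset MPIEXEC_PORT_RANGE
MPICH_PORT_RANGE=10000:10100
effective_port_range
```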

5.4 Extensions for SMPD Process Management Environment

SMPD is an alternate process manager that runs on both Unix and Windows. It can launch jobs across both platforms if the binary formats match (big/little endianness and size of C types such as int, long, and void*).

5.4.1 mpiexec arguments for SMPD

mpiexec for smpd accepts the standard MPI-2 mpiexec options. Execute

    mpiexec

or

    mpiexec -help2

to print the usage options. Typical usage:

    mpiexec -n 10 myapp.exe

All options to mpiexec:

-n x
-np x
    launch x processes

-localonly x
-np x -localonly
    launch x processes on the local machine

-machinefile filename
    use a file to list the names of machines to launch on

-host hostname
    launch on the specified host.

-hosts n host1 host2 ... hostn
-hosts n host1 m1 host2 m2 ... hostn mn
    launch on the specified hosts. In the second version the number of processes = m1 + m2 + ... + mn

-dir drive:\my\working\directory
-wdir /my/working/directory
    launch processes with the specified working directory. (-dir and -wdir are equivalent)

-env var val
    set environment variable before launching the processes

-exitcodes
    print the process exit codes when each process exits.

-noprompt
    prevent mpiexec from prompting for user credentials. Instead errors will be printed and mpiexec will exit.

-localroot
    launch the root process directly from mpiexec if the host is local. (This allows the root process to create windows and be debugged.)

-port port
-p port
    specify the port that smpd is listening on.

-phrase passphrase
    specify the passphrase to authenticate connections to smpd with.

-smpdfile filename
    specify the file where the smpd options are stored including the passphrase. (unix only option)

-path search_path
    search path for executable, ; separated

-timeout seconds
    timeout for the job.

Windows specific options:

-map drive:\\host\share
    map a drive on all the nodes; this mapping will be removed when the processes exit

-logon
    prompt for user account and password

-pwdfile filename
    read the account and password from the file specified.
    put the account on the first line and the password on the second

-nopopup_debug
    disable the system popup dialog if the process crashes

-priority class[:level]
    set the process startup priority class and optionally level.
    class = 0,1,2,3,4 = idle, below, normal, above, high
    level = 0,1,2,3,4,5 = idle, lowest, below, normal, above, highest
    the default is -priority 2:3

-register
    encrypt a user name and password to the Windows registry.

-remove
    delete the encrypted credentials from the Windows registry.

-validate [-host hostname]
    validate the encrypted credentials for the current or specified host.

-delegate
    use passwordless delegation to launch processes.

-impersonate
    use passwordless authentication to launch processes.

-plaintext
    don’t encrypt the data on the wire.

5.5 Extensions for the gforker Process Management Environment

gforker is a process management system for starting processes on a single machine, so called because the MPI processes are simply forked from the mpiexec process. This process manager supports programs that use MPI_Comm_spawn and the other dynamic process routines, but does not support the use of the dynamic process routines from programs that are not started with mpiexec. The gforker process manager is primarily intended as a debugging aid as it simplifies development and testing of MPI programs on a single node or processor.

5.5.1 mpiexec arguments for gforker

In addition to the standard mpiexec command-line arguments, the gforker mpiexec supports the following options:

-np <num> A synonym for the standard -n argument

-env <name> <value> Set the environment variable <name> to <value> for the processes being run by mpiexec.

-envnone Pass no environment variables (other than ones specified with other -env or -genv arguments) to the processes being run by mpiexec. By default, all environment variables are provided to each MPI process (rationale: principle of least surprise for the user)

-envlist <list> Pass the listed environment variables (names separated by commas), with their current values, to the processes being run by mpiexec.

-genv <name> <value> The -genv options have the same meaning as their corresponding -env version, except they apply to all executables, not just the current executable (in the case that the colon syntax is used to specify multiple executables).

-genvnone Like -envnone, but for all executables

-genvlist <list> Like -envlist, but for all executables

-usize <n> Specify the value returned for the value of the attribute MPI_UNIVERSE_SIZE.

-l Label standard out and standard error (stdout and stderr) with the rank of the process

-maxtime <n> Set a timelimit of <n> seconds.

-exitinfo Provide more information on the reason each process exited if there is an abnormal exit

In addition to the command-line arguments, the gforker mpiexec provides a number of environment variables that can be used to control the behavior of mpiexec:

MPIEXEC_TIMEOUT Maximum running time in seconds. mpiexec will terminate MPI programs that take longer than the value specified by MPIEXEC_TIMEOUT.

MPIEXEC_UNIVERSE_SIZE Set the universe size

MPIEXEC_PORT_RANGE Set the range of ports that mpiexec will use in communicating with the processes that it starts. The format of this is <low>:<high>. For example, to specify any port between 10000 and 10100, use 10000:10100.

MPICH_PORT_RANGE Has the same meaning as MPIEXEC_PORT_RANGE and is used if MPIEXEC_PORT_RANGE is not set.

MPIEXEC_PREFIX_DEFAULT If this environment variable is set, output to standard output is prefixed by the rank in MPI_COMM_WORLD of the process and output to standard error is prefixed by the rank and the text (err); both are followed by an angle bracket (>). If this variable is not set, there is no prefix.

MPIEXEC_PREFIX_STDOUT Set the prefix used for lines sent to standard output. A %d is replaced with the rank in MPI_COMM_WORLD; a %w is replaced with an indication of which MPI_COMM_WORLD in MPI jobs that involve multiple MPI_COMM_WORLDs (e.g., ones that use MPI_Comm_spawn or MPI_Comm_connect).

MPIEXEC_PREFIX_STDERR Like MPIEXEC_PREFIX_STDOUT, but for standard error.

MPIEXEC_STDOUTBUF Sets the buffering mode for standard output. Valid values are NONE (no buffering), LINE (buffering by lines), and BLOCK (buffering by blocks of characters; the size of the block is implementation defined). The default is NONE.

MPIEXEC_STDERRBUF Like MPIEXEC_STDOUTBUF, but for standard error.

5.6 Restrictions of the remshell Process Management Environment

The remshell “process manager” provides a very simple version of mpiexec that makes use of the secure shell command (ssh) to start processes on a collection of machines. As this is intended primarily as an illustration of how to build a version of mpiexec that works with other process managers, it does not implement all of the features of the other mpiexec programs described in this document. In particular, it ignores the command line options that control the environment variables given to the MPI programs. It does support the same output labeling features provided by the gforker version of mpiexec. However, this version of mpiexec can be used much like the mpirun for the ch_p4 device in MPICH-1 to run programs on a collection of machines that allow remote shells. A file by the name of machines should contain the names of machines on which processes can be run, one machine name per line. There must be enough machines listed to satisfy the requested number of processes; you can list the same machine name multiple times if necessary.

For more complex needs or for faster startup, we recommend the use of the mpd process manager.

5.7 Using MPICH2 with SLURM and PBS

MPICH2 can be used in both SLURM and PBS environments. If configured with SLURM, use the srun job launching utility provided by SLURM. For PBS, MPICH2 jobs can be launched in two ways: (i) using MPD or (ii) using the OSC mpiexec.

5.7.1 MPD in the PBS environment

PBS specifies the machines allocated to a particular job in the file $PBS_NODEFILE. But the format used by PBS is different from that of MPD. Specifically, PBS lists each node on a single line; if a node (n0) has two processors, it is listed twice. MPD on the other hand uses an identifier (ncpus) to describe how many processors a node has. So, if n0 has two processors, it is listed as n0:2.

One way to convert the node file to the MPD format is as follows:

    sort $PBS_NODEFILE | uniq -c | awk '{ printf("%s:%s\n", $2, $1); }' > mpd.nodes
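As a sketch of the conversion just described, the pipeline can be wrapped in a function and tried on a sample node file before it is used in a job script. The function name pbs_to_mpd_hosts and the sample host names are made up for illustration. Note that the count option of uniq is the lowercase -c, and that each output record ends with a newline so that every host lands on its own line.

```shell
#!/bin/sh
# Illustration only: pbs_to_mpd_hosts is a hypothetical wrapper around the
# sort | uniq -c | awk pipeline. $1 is a PBS-style node file that lists
# each node once per processor.
pbs_to_mpd_hosts() {
    # uniq -c prints "<count> <host>"; awk reorders that to "<host>:<count>".
    sort "$1" | uniq -c | awk '{ printf("%s:%s\n", $2, $1) }'
}

# Try it on a sample node file: n0 has two processors, n1 has one.
sample=$(mktemp)
printf 'n0\nn1\nn0\n' > "$sample"
pbs_to_mpd_hosts "$sample"
rm -f "$sample"
```

Inside a PBS job the same function could be applied to "$PBS_NODEFILE" and its output redirected to mpd.nodes.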

Once the PBS node file is converted, MPD can be started normally within the PBS job script using mpdboot and torn down using mpdallexit.

    mpdboot -f mpd.nodes -n [NUM NODES REQUESTED]
    mpiexec -n [NUM PROCESSES] ./mytestprogram
    mpdallexit

5.7.2 OSC mpiexec

Pete Wyckoff from the Ohio Supercomputer Center provides an alternate utility called OSC mpiexec to launch MPICH2 jobs on PBS systems without using MPD. More information about this can be found here: http://www.osc.edu/pw/mpiexec

6 Managing the Process Management Environment

Some of the process managers supply user commands that can be used to interact with the process manager and to control jobs. In this section we describe user commands that may be useful.

6.1 MPD

mpd starts an mpd daemon.

mpdboot starts a set of mpd’s on a list of machines.

mpdtrace lists all the MPD daemons that are running. The -l option lists full hostnames and the port where the mpd is listening.

mpdlistjobs lists the jobs that the mpd’s are running. Jobs are identified by the name of the mpd where they were submitted and a number.

mpdkilljob kills a job specified by the name returned by mpdlistjobs.

mpdsigjob delivers a signal to the named job. Signals are specified by name or number.

You can use keystrokes to provide signals in the usual way, where mpiexec stands in for the entire parallel application. That is, if mpiexec is being run in a Unix shell in the foreground, you can use ^C (control-C) to send a SIGINT to the processes, or ^Z (control-Z) to suspend all of them. A suspended job can be continued in the usual way.

Precise argument formats can be obtained by passing any MPD command the --help or -h argument. More details can be found in the README in the mpich2 top-level directory or the README file in the MPD directory mpich2/src/pm/mpd.

7 Debugging

Debugging parallel programs is notoriously difficult. Here we describe a number of approaches, some of which depend on the exact version of MPICH2 you are using.

7.1 gdb via mpiexec

If you are using the MPD process manager, you can use the -gdb argument to mpiexec to execute a program with each process running under the control of the gdb sequential debugger. The -gdb option helps control the multiple instances of gdb by sending stdin either to all processes or to a selected process and by labeling and merging output. The current implementation has some minor limitations. For example, we do not support setting your own prompt. This is because we capture the gdb output and examine it before processing it, e.g. merging identical lines. Also, we set a breakpoint at the beginning of main to get all processes synchronized at the beginning. Thus, the user will have a duplicate, unusable breakpoint if one is set at the very first executable line of main. Otherwise, to the extent possible, we try to simply pass user input through to gdb and let things progress normally.

The following script of a -gdb session gives an idea of how this works. Input keystrokes are sent to all processes unless specifically directed by the “z” command.

    ksl2% mpiexec -gdb -n 10 cpi
    0-9: (gdb) l
    0-9: 5   double f(double);
    0-9: 6
    0-9: 7   double f(double a)
    0-9: 8   {
    0-9: 9       return (4.0 / (1.0 + a*a));
    0-9: 10  }
    0-9: 11
    0-9: 12  int main(int argc,char *argv[])
    0-9: 13  {
    0-9: 14      int done = 0, n, myid, numprocs, i;
    0-9: (gdb)
    0-9: 15      double PI25DT = 3.141592653589793238462643;
    0-9: 16      double mypi, pi, h, sum, x;
    0-9: 17      double startwtime = 0.0, endwtime;
    0-9: 18      int namelen;
    0-9: 19      char processor_name[MPI_MAX_PROCESSOR_NAME];
    0-9: 20
    0-9: 21      MPI_Init(&argc,&argv);
    0-9: 22      MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
    0-9: 23      MPI_Comm_rank(MPI_COMM_WORLD,&myid);
    0-9: 24      MPI_Get_processor_name(processor_name,&namelen);
    0-9: (gdb)
    0-9: 25
    0-9: 26      fprintf(stdout,"Process %d of %d is on %s\n",
    0-9: 27              myid, numprocs, processor_name);
    0-9: 28      fflush(stdout);
    0-9: 29
    0-9: 30      n = 10000;  /* default # of rectangles */
    0-9: 31      if (myid == 0)
    0-9: 32          startwtime = MPI_Wtime();
    0-9: 33
    0-9: 34      MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
    0-9: (gdb) b 30
    0-9: Breakpoint 2 at 0x4000000000002541: file /home/lusk/mpich2/examples/cpi.c, line 30.
    0-9: (gdb) r
    0-9: Continuing.
    0: Process 0 of 10 is on ksl2
    1: Process 1 of 10 is on ksl2
    2: Process 2 of 10 is on ksl2
    3: Process 3 of 10 is on ksl2
    4: Process 4 of 10 is on ksl2
    5: Process 5 of 10 is on ksl2
    6: Process 6 of 10 is on ksl2
    7: Process 7 of 10 is on ksl2
    8: Process 8 of 10 is on ksl2
    9: Process 9 of 10 is on ksl2
    0-9:
    0-9: Breakpoint 2, main (argc=1, argv=0x60000fffffffb4b8)
    0-9:     at /home/lusk/mpich2/examples/cpi.c:30
    0-9: 30      n = 10000;  /* default # of rectangles */
    0-9: (gdb) n
    0-9: 31      if (myid == 0)
    0-9: (gdb) n
    0: 32          startwtime = MPI_Wtime();
    1-9: 34      MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
    0-9: (gdb) z 0
    0: (gdb) n
    0: 34      MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
    0: (gdb) z
    0-9: (gdb) where
    0-9: #0  main (argc=1, argv=0x60000fffffffb4b8)
    0-9:     at /home/lusk/mpich2/examples/cpi.c:34
    0-9: (gdb) n
    0-9: 36      h = 1.0 / (double) n;
    0-9: (gdb)
    0-9: 37      sum = 0.0;
    0-9: (gdb)
    0-9: 39      for (i = myid + 1; i <= n; i += numprocs)
    0-9: (gdb)
    0-9: 41          x = h * ((double)i - 0.5);
    0-9: (gdb)
    0-9: 42          sum += f(x);
    0-9: (gdb)
    0-9: 39      for (i = myid + 1; i <= n; i += numprocs)
    0-9: (gdb)
    0-9: 41          x = h * ((double)i - 0.5);
    0-9: (gdb)
    0-9: 42          sum += f(x);
    0-9: (gdb)
    0-9: 39      for (i = myid + 1; i <= n; i += numprocs)
    0-9: (gdb)
    0-9: 41          x = h * ((double)i - 0.5);
    0-9: (gdb)
    0-9: 42          sum += f(x);
    0-9: (gdb)
    0-9: 39      for (i = myid + 1; i <= n; i += numprocs)
    0-9: (gdb)
    0-9: 41          x = h * ((double)i - 0.5);
    0-9: (gdb)
    0-9: 42          sum += f(x);
    0-9: (gdb)
    0-9: 39      for (i = myid + 1; i <= n; i += numprocs)
    0-9: (gdb)
    0-9: 41          x = h * ((double)i - 0.5);
    0-9: (gdb)
    0-9: 42          sum += f(x);
    0-9: (gdb)
    0-9: 39      for (i = myid + 1; i <= n; i += numprocs)
    0-9: (gdb)
    0-9: 41          x = h * ((double)i - 0.5);
    0-9: (gdb)
    0-9: 42          sum += f(x);
    0-9: (gdb) p sum
    0: $1 = 19.999875951497799
    1: $1 = 19.999867551672725
    2: $1 = 19.999858751863549
    3: $1 = 19.999849552071328
    4: $1 = 19.999839952297158
    5: $1 = 19.999829952542203
    6: $1 = 19.999819552807658
    7: $1 = 19.999808753094769
    8: $1 = 19.999797553404832
    9: $1 = 19.999785953739192
    0-9: (gdb) c
    0-9: Continuing.
    0: pi is approximately 3.1415926544231256, Error is 0.0000000008333325
    1-9:
    1-9: Program exited normally.
    1-9: (gdb) 0: wall clock time = 44.909412
    0:
    0: Program exited normally.
    0: (gdb) q
    0-9: MPIGDB ENDING
    ksl2%
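The values printed by "p sum" in the session differ from rank to rank because the loop at line 39 gives each rank a different, strided subset of the rectangles. The following is a minimal serial sketch of that decomposition, reconstructed only from the lines of cpi.c listed above. The function names partial_pi and approximate_pi are illustrative and do not appear in cpi.c, and the combination of the per-rank results, which the transcript does not show, is replaced here by an ordinary loop over the ranks.

```c
#include <assert.h>

/* Integrand from the listing: the integral of 4/(1+x*x) over [0,1] is pi. */
static double f(double a)
{
    return 4.0 / (1.0 + a * a);
}

/* Share of the midpoint-rule sum that rank myid of numprocs would compute
 * for n rectangles (hypothetical helper, not part of cpi.c). */
double partial_pi(int myid, int numprocs, int n)
{
    assert(n > 0 && numprocs > 0 && myid >= 0 && myid < numprocs);
    double h = 1.0 / (double)n;
    double sum = 0.0;
    /* Same loop bounds as lines 39-42 of the listing. */
    for (int i = myid + 1; i <= n; i += numprocs) {
        double x = h * ((double)i - 0.5);
        sum += f(x);
    }
    return h * sum;
}

/* Serial stand-in for combining the ranks: add up every rank's share. */
double approximate_pi(int numprocs, int n)
{
    double pi = 0.0;
    for (int myid = 0; myid < numprocs; myid++)
        pi += partial_pi(myid, numprocs, n);
    return pi;
}
```

Calling approximate_pi(10, 10000) mirrors the ten-process run with the default number of rectangles; the low-order digits may differ from the session because the additions happen in a different order.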

You can attach to a running job with

    mpiexec -gdba

where the job identifier given to -gdba comes from mpdlistjobs.

7.2 TotalView

MPICH2 supports use of the TotalView debugger from Etnus, through the MPD process manager only. If MPICH2 has been configured to enable debugging with TotalView (see the section on configuration of the MPD process manager in the Installer's Guide), then one can debug an MPI program started with MPD by adding -tv to the global mpiexec arguments, as in

    mpiexec -tv -n 3 cpi

You will get a popup window from TotalView asking whether you want to start the job in a stopped state. If so, when the TotalView window appears, you may see assembly code in the source window. Click on main in the stack window (upper left) to see the source of the main function. TotalView will show that the program (all processes) is stopped in the call to MPI_Init.

When debugging with TotalView using the above startup sequence, MPICH2 jobs cannot be restarted without exiting TotalView. In MPICH2 version 1.0.6 or later, TotalView can be invoked on an MPICH2 job as follows:

    totalview python -a `which mpiexec` -tvsu \

and the MPICH2 job will be fully restartable within TotalView.

If you have MPICH2 version 1.0.6 or later and TotalView 8.1.0 or later, you can use a TotalView feature called indirect launch with MPICH2. Invoke TotalView as:

    totalview -a

Then select the Process/Startup Parameters command. Choose the Parallel tab in the resulting dialog box and choose MPICH2 as the parallel system. Then set the number of tasks using the Tasks field and enter other needed mpiexec arguments into the Additional Starter Arguments field.

If you want to be able to attach to a running MPICH2 job using TotalView, you must use the -tvsu option to mpiexec when starting the job. Using this option will add a barrier inside MPI_Init and hence may affect startup performance slightly. It will have no effect on the running of the job once all tasks have returned from MPI_Init.

In order to debug a running MPICH2 job, you must attach TotalView to the instance of Python that is running the mpiexec script. If you have just one task running on the node where you invoked mpiexec, and no other Python scripts running, there will be three instances of Python running on the node. One of these is the parent of the MPICH2 task on that node, and one is the parent of that Python process. Neither of those is the instance of Python you want to attach to; they are both running the MPD script. The third instance of Python has no children and is not the child of a Python process. That is the one that is running mpiexec and is the one you want to attach to.

8 MPE

MPICH2 comes with the same MPE (Multi-Processing Environment) tools that are included with MPICH1. These include several trace libraries for recording the execution of MPI programs, the Jumpshot and SLOG tools for performance visualization, and an MPI collective and datatype checking library. The MPE tools are built and installed by default and should be available without requiring any additional steps. The easiest way to use MPE profiling libraries is through the -mpe= switch provided by MPICH2's compiler wrappers, mpicc, mpicxx, mpif77, and mpif90.

8.1 MPI Logging

MPE provides automatic MPI logging. For instance, to view the MPI communication pattern of a program, fpilog.f, one can simply link the source file as follows:

    mpif90 -mpe=mpilog -o fpilog fpilog.f

The -mpe=mpilog option will link with the appropriate MPE profiling libraries. Running the program through mpiexec will then produce a logfile, Unknown.clog2, in the working directory. The final step is to convert and view the logfile through Jumpshot:

    jumpshot Unknown.clog2

8.2 User-defined logging

In addition to using the predefined MPE logging to log MPI calls, MPE logging calls can be inserted into the user's MPI program to define and log states. These states are called User-Defined states. States may be nested, allowing one to define a state describing a user routine that contains several MPI calls, and display both the user-defined state and the MPI operations contained within it.

The typical way to insert user-defined states is as follows:

• Get handles from the MPE logging library: MPE_Log_get_state_eventIDs() has to be used to get unique event IDs (MPE logging handles). This is important if you are writing a library that uses the MPE logging routines from the MPE system. Hardwiring the eventIDs is considered a bad idea since it may cause eventID conflicts, and so the practice isn't supported.

• Set the logged state's characteristics: MPE_Describe_state() sets the name and color of the states.

• Log the events of the logged states: MPE_Log_event() is called twice to log the user-defined states.

Older MPE libraries provide MPE_Log_get_event_number(), which is still supported but has been deprecated. Users are strongly urged to use MPE_Log_get_state_eventIDs() instead.

Below is a simple example that uses the 3 steps outlined above.

    int eventID_begin, eventID_end;
    ...
    MPE_Log_get_state_eventIDs( &eventID_begin, &eventID_end );
    ...
    MPE_Describe_state( eventID_begin, eventID_end,
                        "Multiplication", "red" );
    ...
    MyAmult( Matrix m, Vector v )
    {
        /* Log the start event of the red "Multiplication" state */
        MPE_Log_event( eventID_begin, 0, NULL );
        ... Amult code, including MPI calls ...
        /* Log the end event of the red "Multiplication" state */
        MPE_Log_event( eventID_end, 0, NULL );
    }

The logfile generated by this code will have the MPI routines nested within the routine MyAmult().

Besides user-defined states, MPE2 also provides support for user-defined events, which can be defined through use of MPE_Log_get_solo_eventID() and MPE_Describe_event(). For more details, see, e.g., cpilog.c.

8.3 MPI Checking

To validate all the MPI collective calls in a program, link the source file as follows:

    mpif90 -mpe=mpicheck -o wrong_reals wrong_reals.f

Running the program will result in the following output:

    > mpiexec -n 4 wrong_reals
    Starting MPI Collective and Datatype Checking!
    Process 3 of 4 is alive
    Backtrace of the callstack at rank 3:
    At [0]: wrong_reals(CollChk_err_han+0xb9)[0x8055a09]
    At [1]: wrong_reals(CollChk_dtype_scatter+0xbf)[0x8057bff]
    At [2]: wrong_reals(CollChk_dtype_bcast+0x3d)[0x8057ccd]
    At [3]: wrong_reals(MPI_Bcast+0x6c)[0x80554bc]
    At [4]: wrong_reals(mpi_bcast_+0x35)[0x80529b5]
    At [5]: wrong_reals(MAIN__+0x17b)[0x805264f]
    At [6]: wrong_reals(main+0x27)[0x80dd187]
    At [7]: /lib/libc.so.6(__libc_start_main+0xdc)[0x9a34e4]
    At [8]: wrong_reals[0x8052451]
    [cli_3]: aborting job:
    Fatal error in MPI_Comm_call_errhandler:
    Collective Checking: BCAST (Rank 3) --> Inconsistent datatype signatures
    detected between rank 3 and rank 0.

The error message here shows that MPI_Bcast has been used with inconsistent datatypes in the program wrong_reals.f.

8.4 MPE options

Other MPE profiling options that are available through MPICH2 compiler wrappers are

    -mpe=mpilog    : Automatic MPI and MPE user-defined states logging.
                     This links against -llmpe -lmpe.
    -mpe=mpitrace  : Trace MPI program with printf.
                     This links against -ltmpe.
    -mpe=mpianim   : Animate MPI program in real-time.
                     This links against -lampe -lmpe.
    -mpe=mpicheck  : Check MPI Program with the Collective & Datatype
                     Checking library. This links against -lmpe_collchk.
    -mpe=graphics  : Use MPE graphics routines with X11 library.
                     This links against -lmpe.
    -mpe=log       : MPE user-defined states logging.
                     This links against -lmpe.
    -mpe=nolog     : Nullify MPE user-defined states logging.
                     This links against -lmpe_null.
    -mpe=help      : Print the help page.

For more details of how to use MPE profiling tools, see mpich2/src/mpe2/README.

9 Other Tools Provided with MPICH2

MPICH2 also includes a test suite for MPI-1 and MPI-2 functionality; this suite may be found in the mpich2/test/mpi source directory and can be run with the command make testing. This test suite should work with any MPI implementation, not just MPICH2.

10 MPICH2 under Windows

10.1 Directories

The default installation of MPICH2 is in C:\Program Files\MPICH2. Under the installation directory are three sub-directories: include, bin, and lib. The include and lib directories contain the header files and libraries necessary to compile MPI applications. The bin directory contains the process manager, smpd.exe, and the MPI job launcher, mpiexec.exe. The dlls that implement MPICH2 are copied to the Windows system32 directory.

10.2 Compiling

The libraries in the lib directory were compiled with MS Visual C++ .NET 2003 and Intel Fortran 8.1. These compilers and any others that can link with the MS .lib files can be used to create user applications. gcc and g77 for cygwin can be used with the libmpich*.a libraries.

For MS Developer Studio users: Create a project and add

    C:\Program Files\MPICH2\include

to the include path and

    C:\Program Files\MPICH2\lib

to the library path. Add mpi.lib and cxx.lib to the link command. Add cxxd.lib to the Debug target link instead of cxx.lib.

Intel Fortran 8 users should add fmpich2.lib to the link command.

Cygwin users should use libmpich2.a libfmpich2g.a.

10.3 Running

MPI jobs are run from a command prompt using mpiexec.exe. See Section 5.4 on mpiexec for smpd for a description of the options to mpiexec.

A Frequently Asked Questions

This is the content of the online FAQ, as of June 23, 2006.

A.1 General Information

A.1.1 Q: What is MPICH2?

MPICH2 is a freely available, portable implementation of MPI, the Standard for message-passing libraries. It implements both MPI-1 and MPI-2.

A.1.2 Q: What does MPICH stand for?

A: MPI stands for Message Passing Interface. The CH comes from Chameleon, the portability layer used in the original MPICH to provide portability to the existing message-passing systems.

A.1.3 Q: Can MPI be used to program multicore systems?

A: There are two common ways to use MPI with multicore processors or multiprocessor nodes:

• Use one MPI process per core (here, a core is defined as a program counter and some set of arithmetic, logic, and load/store units).

• Use one MPI process per node (here, a node is defined as a collection of cores that share a single address space). Use threads or compiler-provided parallelism to exploit the multiple cores. OpenMP may be used with MPI; the loop-level parallelism of OpenMP may be used with any implementation of MPI (you do not need an MPI that supports MPI_THREAD_MULTIPLE when threads are used only for computational tasks). This is sometimes called the hybrid programming model.

A.2 Building MPICH2

A.2.1 Q: What is the difference between the MPD and SMPD process managers?

MPD is the default process manager for MPICH2 on Unix platforms. It is written in Python. SMPD is the primary process manager for MPICH2 on Windows. It is also used for running on a combination of Windows and Linux machines. It is written in C.

A.2.2 Q: Do I have to configure/make/install MPICH2 each time for each compiler I use?

No, in many cases you can build MPICH2 using one set of compilers and then use the libraries (and compilation scripts) with other compilers. However, this depends on the compilers producing compatible object files. Specifically, the compilers must

• Support the same basic datatypes with the same sizes. For example, the C compilers should use the same sizes for long long and long double.

• Map the names of routines in the source code to names in the object files in the same way. This can be a problem for Fortran and C++ compilers, though you can often force the Fortran compilers to use the same name mapping. More specifically, most Fortran compilers map names in the source code into all lower-case with one or two underscores appended to the name. To use the same MPICH2 library with all Fortran compilers, those compilers must make the same name mapping. There is one exception to this that is described below.

• Perform the same layout for C structures. The C language does not specify how structures are laid out in memory. For 100% compatibility, all compilers must follow the same rules. However, if you do not use any of the MPI_MINLOC or MPI_MAXLOC datatypes, and you do not rely on the MPICH2 library to set the extent of a type created with MPI_Type_struct or MPI_Type_create_struct, you can often ignore this requirement.

• Require the same additional runtime libraries. Not all compilers will implement the same version of Unix, and some routines that MPICH2 uses may be present in only some of the runtime libraries associated with specific compilers.

The above may seem like a stringent set of requirements, but in practice, many systems and compiler sets meet these needs, if for no other reason than that any software built with multiple libraries will have requirements similar to those of MPICH2 for compatibility.
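Two of the conditions in the list above, name mapping and structure layout, can be made concrete with a small sketch. It is only an illustration under stated assumptions: the enum name_mapping, the function fortran_symbol, and the struct double_int are hypothetical names that are not part of MPICH2, and the four mappings listed are common conventions rather than a complete catalogue. The struct models a double/int pair of the kind used with MPI_MINLOC and MPI_MAXLOC.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Common (not exhaustive) Fortran-to-object-file name mappings. */
enum name_mapping {
    MAP_LOWER,              /* mpi_bcast   */
    MAP_LOWER_UNDERSCORE,   /* mpi_bcast_  */
    MAP_LOWER_2UNDERSCORE,  /* mpi_bcast__ */
    MAP_UPPER               /* MPI_BCAST   */
};

/* Store the object-file name for a Fortran routine name in out.
 * Returns 0 on success, -1 if out (of size outlen) is too small. */
int fortran_symbol(const char *name, enum name_mapping map,
                   char *out, size_t outlen)
{
    assert(name != NULL && out != NULL);
    size_t len = strlen(name);
    size_t extra = (map == MAP_LOWER_UNDERSCORE) ? 1
                 : (map == MAP_LOWER_2UNDERSCORE) ? 2 : 0;
    if (len + extra + 1 > outlen)
        return -1;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];
        out[i] = (char)(map == MAP_UPPER ? toupper(c) : tolower(c));
    }
    memset(out + len, '_', extra);   /* append zero, one or two underscores */
    out[len + extra] = '\0';
    return 0;
}

/* A double/int pair of the kind reduced with MPI_MINLOC or MPI_MAXLOC. */
struct double_int {
    double value;
    int    index;
};

/* Byte offset of the int member; padding after the double would raise it. */
size_t double_int_index_offset(void)
{
    return offsetof(struct double_int, index);
}

/* Total size including trailing padding. */
size_t double_int_size(void)
{
    return sizeof(struct double_int);
}
```

Two Fortran compilers can share one library build only if they produce the same string from fortran_symbol for every routine; the backtrace in Section 8.3, for example, shows the Fortran binding of MPI_Bcast as mpi_bcast_. Likewise, printing double_int_index_offset() and double_int_size() from a program built with each C compiler gives a quick, partial check that the compilers agree on this one structure.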

If your compilers are completely compatible, down to the runtime libraries, you may use the compilation scripts (mpicc etc.) by either specifying the compiler on the command line, e.g.

    mpicc -cc=icc -c foo.c

or with the environment variables MPICH_CC etc. (this example assumes a c-shell syntax):

    setenv MPICH_CC icc
    mpicc -c foo.c

If the compiler is compatible except for the runtime libraries, then this same format works as long as a configuration file that describes the necessary runtime libraries is created and placed into the appropriate directory (the "sysconfdir" directory in configure terms). See the installation manual for more details.

In some cases, MPICH2 is able to build the Fortran interfaces in a way that supports multiple mappings of names from the Fortran source code to the object file. This is done by using the "multiple weak symbol" support in some environments. For example, when using gcc under Linux, this is the default.

A.2.3 Q: How do I configure to use the Absoft Fortran compilers?

A: You have several options. One is to use the Fortran 90 compiler for both F77 and F90. Another (if you do not need Fortran 90) is to use --disable-f90 when configuring. The options with which we test MPICH2 and the Absoft compilers are the following:

    setenv FFLAGS "-f -B108"
    setenv F90FLAGS "-YALL_NAMES=LCS -B108"
    setenv F77 f77
    setenv F90 f90

A.2.4 Q: When I configure MPICH2, I get a message about FD_ZERO and the configure aborts

A: FD_ZERO is part of the support for the select calls (see "man select" or "man 2 select" on Linux and many other Unix systems). What this means is that your system (probably a Mac) has a broken version of the select call and related data types. This is an OS bug; the only repair is to update the OS to get past this bug. This test was added specifically to detect this error; if there was an easy way to work around it, we would have included it (we don't just implement FD_ZERO ourselves because we don't know what else is broken in this implementation of select).

If this configure works with gcc but not with xlc, then the problem is with the include files that xlc is using; since this is an OS call (even if emulated), all compilers should be using consistent if not identical include files. In this case, you may need to update xlc.

A.2.5 Q: When I use the g95 Fortran compiler on a 64-bit platform, some of the tests fail

A: The g95 compiler incorrectly defines the default Fortran integer as a 64-bit integer while defining Fortran reals as 32-bit values (the Fortran standard requires that INTEGER and REAL be the same size). This was apparently done to allow a Fortran INTEGER to hold the value of a pointer, rather than requiring the programmer to select an INTEGER of a suitable KIND. To force the g95 compiler to correctly implement the Fortran standard, use the -i4 flag. For example, set the environment variable F90FLAGS before configuring MPICH2:

    setenv F90FLAGS "-i4"

G95 users should note that there are (at this writing) two distributions of g95 for 64-bit Linux platforms. One uses 32-bit integers and reals (and conforms to the Fortran standard) and one uses 64-bit integers and 32-bit reals. We recommend using the one that conforms to the standard (note that the standard specifies the ratio of sizes, not the absolute sizes, so a Fortran 95 compiler that used 64 bits for both INTEGER and REAL would also conform to the Fortran standard. However, such a compiler would need to use 128 bits for DOUBLE PRECISION quantities).

A.2.6 Q: When I run make, it fails immediately with many errors beginning with "sock.c:8:24: mpidu_sock.h: No such file or directory In file included from sock.c:9: ../../../../include/mpiimpl.h:91:21: mpidpre.h: No such file or directory In file included from sock.c:9: ../../../../include/mpiimpl.h:1150: error: syntax error before "MPID_VCRT" ../../../../include/mpiimpl.h:1150: warning: no semicolon at end of struct or union"

Check if you have set the environment variable CPPFLAGS. If so, unset it and use CXXFLAGS instead. Then rerun configure and make.

A.2.7 Q: When building the ssm or sshm channel, I get the error "mpidu_process_locks.h:234:2: error: #error *** No atomic memory operation specified to implement busy locks ***"

The ssm and sshm channels do not work on all platforms because they use special interprocess locks (often assembly) that may not work with some compilers or machine architectures. They work on Linux with gcc, Intel, and Pathscale compilers on various Intel architectures. They also work in Windows and Solaris environments.

A.2.8 Q: When using the Intel Fortran 90 compiler (version 9), the make fails with errors in compiling statements that reference MPI_ADDRESS_KIND.

Check the output of the configure step. If configure claims that ifort is a cross compiler, the likely problem is that programs compiled and linked with ifort cannot be run because of a missing shared library. Try to compile and run the following program (named conftest.f90):

    program conftest
    integer, dimension(10) :: n
    end

If this program fails to run, then the problem is that your installation of ifort either has an error or you need to add additional values to your environment variables (such as LD_LIBRARY_PATH). Check your installation documentation for the ifort compiler. See http://softwareforums.intel.com/ISN/Community/en-US/search/SearchResults.aspx?q=libimf.so for an example of problems of this kind that users are having with version 9 of ifort.

If you do not need Fortran 90, you can configure with --disable-f90.

A.2.9 Q: The build fails when I use parallel make

Parallel make (often invoked with make -j4) will cause several job steps in the build process to update the same library file (libmpich.a) concurrently. Unfortunately, neither the ar nor the ranlib programs correctly handle this case, and the result is a corrupted library. For now, the solution is to not use a parallel make when building MPICH2.

A.3 Windows version of MPICH2

A.3.1 I am having trouble installing and using the Windows version of MPICH2

See the tips for installing and running MPICH2 on Windows provided by a user, Brent Paul. Or see the MPICH2 Windows Development Guide.

A.4 Compiling MPI Programs

A.4.1 C++ and SEEK_SET

Some users may get error messages such as

    SEEK_SET is #defined but must not be for the C++ binding of MPI

The problem is that both stdio.h and the MPI C++ interface use SEEK_SET, SEEK_CUR, and SEEK_END. This is really a bug in the MPI-2 standard. You can try adding

    #undef SEEK_SET
    #undef SEEK_END
    #undef SEEK_CUR

before mpi.h is included, or add the definition

    -DMPICH_IGNORE_CXX_SEEK

to the command line (this will cause the MPI versions of SEEK_SET etc. to be skipped).

A.4.2 C++ and Errors in Nullcomm::Clone

Some users, particularly with older C++ compilers, may see error messages of the form

    "error C2555: 'MPI::Nullcomm::Clone' : overriding virtual function differs from 'MPI::Comm::Clone' only by return type or calling convention".

This is caused by the compiler not implementing part of the C++ standard. To work around this problem, add the definition

    -DHAVE_NO_VARIABLE_RETURN_TYPE_SUPPORT

to the CXXFLAGS variable or add a

    #define HAVE_NO_VARIABLE_RETURN_TYPE_SUPPORT 1

before including mpi.h.

A.5 Running MPI Programs

A.5.1 Q: How do I pass environment variables to the processes of my parallel program

A: The specific method depends on the process manager and version of mpiexec that you are using. See the appropriate specific section.

A.5.2 Q: How do I pass environment variables to the processes of my parallel program when using the mpd process manager?

A: By default, all the environment variables in the shell where mpiexec is run are passed to all processes of the application program. (The one exception is LD_LIBRARY_PATH when the mpd's are being run as root.) This default can be overridden in many ways, and individual environment variables can be passed to specific processes using arguments to mpiexec. A synopsis of the possible arguments can be listed by typing

    mpiexec -help

and further details are available in the Users Guide.

A.5.3 Q: What determines the hosts on which my MPI processes run?

A: Where processes run, whether by default or by specifying them yourself, depends on the process manager being used.

If you are using the gforker process manager, then all MPI processes run on the same host where you are running mpiexec.

If you are using the mpd process manager, which is the default, then many options are available. If you are using mpd, then before you run mpiexec, you will have started, or will have had started for you, a ring of processes called mpd's (multi-purpose daemons). It is likely, but not necessary, that each mpd will be running on a separate host. You can find out what this ring of hosts consists of by running the program mpdtrace. One of the mpd's will be running on the "local" machine, the one where you will run mpiexec. The default placement of MPI processes, if one runs

    mpiexec -n 10 a.out

is to start the first MPI process (rank 0) on the local machine and then to distribute the rest around the mpd ring one at a time. If there are more processes than mpd's, then wraparound occurs. If there are more mpd's than MPI processes, then some mpd's will not run MPI processes. Thus any number of processes can be run on a ring of any size. While one is doing development, it is handy to run only one mpd, on the local machine. Then all the MPI processes will run locally as well.

The first modification to this default behavior is the -1 option to mpiexec (not a great argument name). If -1 is specified, as in

    mpiexec -1 -n 10 a.out

then the first application process will be started by the first mpd in the ring after the local host. (If there is only one mpd in the ring, then this will be on the local host.) This option is for use when a cluster of compute nodes has a "head node" where commands like mpiexec are run but not application processes.
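The default placement and the effect of -1 can be sketched as simple modular arithmetic. This is a model of the description above, not code taken from MPD: the function name mpd_index_for_rank is hypothetical, index 0 stands for the mpd on the machine where mpiexec runs, and the other mpd's are numbered in ring order. What happens when the ring wraps back to the local host under -1 is not stated above; the sketch simply continues around the ring.

```c
#include <assert.h>

/* Index of the mpd that starts a given rank. Index 0 is the mpd on the
 * machine running mpiexec; the others follow in ring order.
 * skip_local models the -1 option: rank 0 starts one position later. */
int mpd_index_for_rank(int rank, int ring_size, int skip_local)
{
    assert(rank >= 0 && ring_size > 0);
    int first = skip_local ? 1 : 0;      /* mpd that starts rank 0 */
    return (first + rank) % ring_size;   /* wraparound when ranks outnumber mpd's */
}
```

With a ring of four mpd's, mpd_index_for_rank(5, 4, 0) places rank 5 on the second mpd (index 1), and with a ring of one mpd every rank lands on index 0 whether or not -1 is given, matching the parenthetical remark above.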

If an mpd is started with the --ncpus option, then when it is its turn to start a process, it will start several application processes rather than just one before handing off the task of starting more processes to the next mpd in the ring. For example, if the mpd is started with

    mpd --ncpus=4

then it will start as many as four application processes, with consecutive ranks, when it is its turn to start processes. This option is for use in clusters of SMP's, when the user would like consecutive ranks to appear on the same machine. (In the default case, the same number of processes might well run on the machine, but their ranks would be different.)

(A feature of the --ncpus=[n] argument is that it has the above effect only until all of the mpd's have started n processes at a time once; afterwards each mpd starts one process at a time. This is in order to balance the number of processes per machine to the extent possible.)
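The two-phase behavior of --ncpus can be modeled in a few lines. The sketch below rests on assumptions that go beyond the text: every mpd in the ring is taken to have been started with the same --ncpus value, the -1 option is not in effect, and the function name mpd_index_with_ncpus is hypothetical. It follows the description above rather than MPD's source code.

```c
#include <assert.h>

/* Index of the mpd (0 = local machine, then ring order) that starts a rank
 * when every mpd was started with the same --ncpus value. */
int mpd_index_with_ncpus(int rank, int ring_size, int ncpus)
{
    assert(rank >= 0 && ring_size > 0 && ncpus > 0);
    int first_pass = ring_size * ncpus;   /* ranks placed ncpus at a time */
    if (rank < first_pass)
        return rank / ncpus;              /* consecutive ranks share an mpd */
    /* After one full trip around the ring, one process per mpd per turn. */
    return (rank - first_pass) % ring_size;
}
```

For a ring of two mpd's started with --ncpus=4, ranks 0 to 3 go to the first mpd and ranks 4 to 7 to the second; ranks 8, 9, 10 then alternate between the two.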

Other ways to control the placement of processes are by direct use of arguments to mpiexec. See the Users Guide.

A.5.4 Q: On Windows, I get an error when I attempt to call MPI_Comm_spawn.

A: On Windows, you need to start the program with mpiexec for any of the MPI-2 dynamic process functions to work.

A.5.5 Q: My output does not appear until the program exits

A: Output to stdout and stderr may not be written from your process immediately after a printf or fprintf (or PRINT in Fortran) because, under Unix, such output is buffered unless the program believes that the output is to a terminal. When the program is run by mpiexec, the C standard I/O library (and normally the Fortran runtime library) will buffer the output. For C programmers, you can either use a call fflush(stdout) to force the output to be written or you can set no buffering by calling

    #include <stdio.h>
    setvbuf( stdout, NULL, _IONBF, 0 );

on each stream (stdout in this example) whose output you want sent immediately to your terminal or file.
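The other option mentioned above, flushing after each write, can be wrapped in a small helper. This is a minimal sketch using only standard C I/O calls; the name print_and_flush is hypothetical and is not something MPICH2 provides.

```c
#include <assert.h>
#include <stdio.h>

/* Write one message to the stream and flush it at once, so the text is
 * handed to the operating system instead of waiting in the stdio buffer.
 * Returns 0 on success, -1 if the write or the flush failed. */
int print_and_flush(FILE *stream, const char *message)
{
    assert(stream != NULL && message != NULL);
    if (fputs(message, stream) == EOF)
        return -1;
    return fflush(stream) == 0 ? 0 : -1;
}
```

Compared with setvbuf, this keeps buffering enabled for bulk output and pays the cost of a flush only for the messages that should appear promptly, such as progress reports.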

There is no standard way to either change the buffering mode or to flush the output in Fortran. However, many Fortrans include an extension to provide this function. For example, in g77,

    call flush()

can be used. The xlf compiler supports

    call flush_(6)

where the argument is the Fortran logical unit number (here 6, which is often the unit number associated with PRINT). With the G95 Fortran 95 compiler, set the environment variable G95_UNBUFFERED_6 to cause output to unit 6 to be unbuffered.

A.5.6 Q: Fortran programs using stdio fail when using g95

A: By default, g95 does not flush output to stdout. This also appears to cause problems for standard input. If you are using the Fortran logical units 5 and 6 (or the * unit) for standard input and output, set the environment variable G95_UNBUFFERED_6 to yes.

A.5.7 Q: How do I run MPI programs in the background when using the default MPD process manager?

A: To run MPI programs in the background when using MPD, you need to redirect stdin from /dev/null. For example,

    mpiexec -n 4 a.out < /dev/null &

References

[1] Message Passing Interface Forum. MPI2: A Message Passing Interface standard. International Journal of High Performance Computing Applications, 12(1–2):1–299, 1998.

[2] Marc Snir, Steve W. Otto, Steven Huss-Lederman, David W. Walker, and Jack Dongarra. MPI - The Complete Reference: Volume 1, The MPI Core, 2nd edition. MIT Press, Cambridge, MA, 1998.
