Continuous Optimization and Coordinated Power Management

IBM University Faculty Award Program Proposal

Prof. Stephen W. Keckler, Department of Computer Sciences, The University of Texas at Austin

IBM Technical Sponsor: Ron Kalla (Systems Group)

March 10, 2005

1 Technical Areas of Research

VLSI design, power management, continuous optimization

2 Project Description

As computer chip designers have pushed aggressively for higher performance processes, circuits, and systems, design margins have shrunk dramatically. For example, in the relatively recent past, peak power consumption was well below packaging limits for high-performance systems. However, today's package limitations on both power consumption and heat dissipation have risen to first-order design constraints. While most chips are designed for a particular maximum thermal operating point, they typically either operate far from that point or are throttled in a coarse-grain fashion to prevent thermal violations. A typical workload for a computer system, and how the workload uses the system components, changes drastically over time. Web and e-commerce servers see varying load depending on the time of day. The ambient temperature seen by a computer in a machine room may vary not just on load, but also when other systems are added to or removed from the room. Applications with large data sets and irregular access patterns often exhibit poor cache behavior, leaving the processor stalled for extended periods of time, while other applications may be more processor or disk intensive. Even a single application goes through phases which place varying burdens on the system [1]. A system designed for maximum activity rates of all its components would certainly not exceed its power limits, but would typically operate far from its true capabilities. The key challenge is not solely to reduce power consumption, but instead to deliver energy to where it is most useful in the system at any given time.

We propose to use on-line continuous optimization to dynamically tune the system to meet power, temperature, and energy constraints. While substantial opportunities for reducing power consumption are available, extending the current strategy of localized control of individual power management techniques is not viable. Without coordinated control, a collection of individual techniques may be enabled in destructive or ineffective combinations. Existing open-loop control techniques do not guarantee effective operation throughout the wide range of process variability, application space, and operating conditions. A simple reactive approach of enabling a power (or other metric) saving technique based on a pre-defined set of events such as "after 1000 cycles of inactivity, transition to sleep mode" does not take into account the effectiveness of the action at run time. Techniques that are applied globally, such as Intel's recently announced "Demand Based Switching" which allows fine grain adjustments to chip-wide voltage and frequency, lack the ability to channel the energy to different parts of the chip at different times. We propose to examine and evaluate closed-loop power management mechanisms that monitor and measure their effectiveness over time and across applications, as well as allocate energy to the most critical resources at any given time.
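To make the limitation of a fixed trigger concrete, the following minimal sketch estimates the net energy effect of a timeout-based sleep policy over a single idle period. The function name and all energy constants are hypothetical, chosen only for illustration; a closed-loop manager would measure these quantities at run time rather than assume them.

```python
def sleep_net_savings(idle_cycles, timeout=1000, active_leak=1.0,
                      sleep_leak=0.1, transition_cost=2000.0):
    """Energy saved (positive) or wasted (negative) by a fixed-timeout policy.

    All constants are illustrative assumptions in arbitrary energy units.
    """
    if idle_cycles <= timeout:
        return 0.0                      # the unit never entered sleep mode
    asleep = idle_cycles - timeout      # cycles actually spent asleep
    # Leakage avoided while asleep, minus the cost of entering and leaving sleep.
    return asleep * (active_leak - sleep_leak) - transition_cost

short_idle = sleep_net_savings(1500)    # negative: the transition cost dominates
long_idle = sleep_net_savings(10000)    # positive: the sleep period pays off
```

Under these assumed constants the same trigger wastes energy on short idle periods and saves it on long ones, which is the kind of run-time effectiveness that an open-loop rule cannot observe.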

Opportunities for power management: In our prior work, we investigated opportunities for reducing both dynamic power [2] and static power [3] in the context of an out-of-order microprocessor. In our studies of dynamic power we tracked dynamic power consumption throughout a pipeline model of the Alpha 21264 processor, noting the power tax of mis-prediction and over-provisioning. We found that mis-prediction accounted for approximately 6% of energy, and that over-provisioned structures that are designed for maximum throughput but not fully used by typical programs accounted for about 17% of pipeline energy. In our study of static power, we compared the effectiveness (from the microarchitectural perspective) of different mechanisms that reduce static power consumption in caches, including power gating and dynamic threshold voltage modulation. We found that these varying techniques provided different benefits to different caches, and could improve the energy-delay product of these caches by a factor of 20-50, depending on the cache and the technique.

The literature on microarchitectural mechanisms is already large and continues to grow. Dynamic management techniques include clock gating, dynamic voltage/frequency scaling, pipeline gating, pipeline throttling, and dynamic microarchitectural structure size modulation. Additional strategies to dynamically manage leakage energy include instruction-cache resizing and drowsy caches, each of which places a portion of a cache into a low-power state.

Challenges for combining techniques: Simply extending the existing class of microarchitectural management techniques to encompass power, energy, and temperature constraints falls short of a robust management system. Employing multiple simultaneous power management techniques poses two main concerns. First, power management parameters are typically determined with incomplete knowledge of physical environment, operating conditions, and application characteristics. If code profiling and pre-fabrication processor simulations do not accurately match actual runtime conditions, the mismatch can lead to ineffective management. For example, changing the frequency and voltage settings based on recent program behavior via a performance monitor may provide excellent control for the test benchmark suite yet result in a pathological case for a customer's proprietary software.

Second, runtime events could repeatedly trigger conflicts between management policies. For example, an energy-saving policy might set the frequency at a fast rate for a program so that it can complete the task quickly and then power down to conserve static energy. A separate temperature policy might set a lower frequency to cool the chip in the event of excessive heat dissipation. During program execution, the chip could breach a temperature threshold, causing oscillations between management mechanisms that trigger a slower frequency for cooling and faster frequency to optimize leakage. Avoiding such conflicts requires testing each combination of techniques, adding to the cost and complexity of processor verification.
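The oscillation described above can be reproduced in a toy simulation. The sketch below is an illustration under stated assumptions, not a model of any real processor: the first-order thermal response, the frequency levels, and the constants in `steady_temp` are all hypothetical. It counts how often the frequency setting changes when two policies write the same knob independently, compared with a single decision that accounts for the thermal limit.

```python
def steady_temp(freq_ghz):
    """Assumed steady-state temperature (deg C) at a given frequency."""
    return 40.0 + 18.0 * freq_ghz

def count_switches(coordinated, steps=200, temp_limit=85.0):
    levels = [1.5, 2.0, 2.5, 3.0]            # hypothetical frequency settings, GHz
    temp, freq, switches = 70.0, levels[-1], 0
    for _ in range(steps):
        if coordinated:
            # Single decision: fastest level whose predicted steady state is safe.
            wanted = max(f for f in levels if steady_temp(f) <= temp_limit)
        else:
            # Energy policy asks for the fastest level (finish early, then sleep) ...
            wanted = levels[-1]
            # ... and the thermal policy independently overrides it when hot.
            if temp > temp_limit:
                wanted = levels[0]
        if wanted != freq:
            switches += 1
        freq = wanted
        # First-order thermal response toward the steady state for this level.
        temp += 0.2 * (steady_temp(freq) - temp)
    return switches

uncoordinated_switches = count_switches(coordinated=False)
coordinated_switches = count_switches(coordinated=True)
```

In this toy model the uncoordinated pair toggles the frequency every few intervals for the whole run, while the coordinated decision changes it once and then holds.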

Coordinated Power Management: We propose to control the power management mechanisms in a coordinated fashion, adjusting them in concert to achieve the desired performance goals within the constraints of limited power, energy, and temperature levels. The infrastructure for coordinated power management includes a collection of sensors (which could include temperature sensors as well as activity counters), a set of actuators for adjusting the various power management parameters, and a controller that makes policy decisions. We expect that the algorithms and changing policies may require programmability in the form of a simple embedded processor. While we will initially focus on a single-chip microprocessor with an embedded power manager, we also foresee this approach complementing a system-level strategy (such as that across an SMP) in which the different nodes in the system are run at varying frequencies and power consumption according to load [4].

Current power managers react to specific events with pre-determined responses, such as the Pentium 4 thermal control policy paraphrased as "if temperature exceeds the threshold, then enable intermittent clock gating." A goal-driven management approach adapts to a wider range of operating conditions and resource use, allowing processors to run closer to the edge of power, temperature, and energy limits. A goal-seeking approach is flexible, unlike trigger-driven decisions that react to specific events with pre-determined responses. For example, a goal-driven controller facing an impending thermal emergency selects the most effective choice for the situation, choosing the best combination of clock gating, thread migration, voltage and frequency scaling, or other options. It can provide safer operating conditions for run-time environments and configurations not expected during design and validation phases.

Our coordinated approach would supply the power manager with a goal such as "maximum performance within set temperature and energy limits," and the manager would then select the appropriate mechanisms to achieve the goal. The manager maintains a model of the system and understands the first-order sensitivity of performance, temperature, and power to the management actuators at its disposal. We will explore a family of algorithms, including constrained-optimization approaches, which can use gradient descent techniques to drive the configuration toward the desired goal using feedback from the sensors. Consequently, the manager can track system behavior and shift goal objectives in synchrony with changing application demands and energy resources. This closed-loop feedback system is a very powerful paradigm for empirically finding good configurations, but good control systems engineering methods must be applied.
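One possible member of this family of algorithms can be sketched as follows. The sketch uses a switching-gradient scheme: while the measured power is within its cap, the knobs move along the estimated gradient of performance; once the cap is exceeded, they move against the estimated gradient of power. Gradients are estimated by finite differences on sensor readings. The `measure` function is a stand-in for real sensors, and its performance and power formulas, the two knobs, and every constant are assumptions made purely for illustration.

```python
import math

def measure(knobs):
    """Hypothetical sensor model: knobs = (frequency scale, pipeline width scale)."""
    f, w = knobs
    perf = f * math.sqrt(w)          # assumed performance response
    power = 100.0 * f ** 3 * w       # assumed watts, with voltage tracking frequency
    return perf, power

def optimize(x0, power_cap, step=0.02, iters=300, eps=1e-3, lo=0.5, hi=1.0):
    x, best = list(x0), None
    for _ in range(iters):
        reading = measure(x)
        feasible = reading[1] <= power_cap
        # Remember the best configuration that respected the cap.
        if feasible and (best is None or reading[0] > best[0]):
            best = (reading[0], reading[1], tuple(x))
        # Climb performance while feasible; otherwise descend power.
        idx, sign = (0, 1.0) if feasible else (1, -1.0)
        grad = []
        for i in range(len(x)):
            probe = list(x)
            probe[i] += eps          # small perturbation of one knob
            grad.append((measure(probe)[idx] - reading[idx]) / eps)
        norm = math.sqrt(sum(g * g for g in grad)) or 1.0
        # Fixed-length step, clamped to the legal knob range.
        x = [min(hi, max(lo, xi + sign * step * g / norm))
             for xi, g in zip(x, grad)]
    return best

best_perf, best_power, best_knobs = optimize((0.5, 0.5), power_cap=60.0)
```

The fixed step length keeps the sketch short but makes the configuration hover around the constraint boundary; this is one place where the control-engineering care mentioned above (step-size schedules, filtering of noisy sensor data, stability analysis) would be needed in practice.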

As an example of continuous optimization for power, consider the following scenario. The operating system notifies the coordinated manager to seek the goal of high throughput within limits of a strict upper bound on temperature with moderate power and energy thresholds. The processor is currently operating with a mid-range voltage level; sensor data indicate that the temperature is within an acceptable range and that the performance is less than the goal. The manager directs the voltage regulator to step up the supply voltage and monitors the temperature rise and performance counters, and continues to raise the frequency and voltage until achieving the desired performance target. If a running application causes a thermal spike, the manager takes immediate action to coordinate a response between the voltage, frequency, and activity migration controls, while postponing a cache leakage policy that would have created a temporary increase in write-back traffic at an inopportune moment. With coordinated information from multiple sources and a goal-driven algorithm, a hierarchical power/energy/temperature manager can adapt to the system environment and push the operating conditions to the edge of acceptable limits.
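The scenario can be traced with a deliberately simplified decision step. The sketch below reduces the manager to one rule set per control interval; the operating-point table, the thresholds, and the sensor readings are hypothetical, and a real goal-driven manager would weigh several actuators against a system model rather than follow fixed rules.

```python
# Hypothetical (voltage, frequency in GHz) operating points, lowest to highest.
OPERATING_POINTS = [(0.9, 1.0), (1.0, 1.5), (1.1, 2.0), (1.2, 2.5), (1.3, 3.0)]

def decide(level, perf, temp, perf_goal, temp_limit):
    """Return (new_level, run_cache_leakage_policy) for one control interval."""
    if temp > temp_limit:
        # Thermal spike: step down now and postpone the cache leakage policy,
        # whose write-back traffic would arrive at a bad moment.
        return max(level - 1, 0), False
    if perf < perf_goal and level < len(OPERATING_POINTS) - 1:
        return level + 1, True      # below goal and cool enough: step up
    return level, True              # goal met: hold the current point

def run(readings, level=1, perf_goal=0.8, temp_limit=90.0):
    trace = []
    for perf, temp in readings:
        level, leakage_ok = decide(level, perf, temp, perf_goal, temp_limit)
        trace.append((level, leakage_ok))
    return trace

# Made-up (performance, temperature) samples: ramp up, reach the goal,
# hit a thermal spike, then recover.
trace = run([(0.6, 60.0), (0.7, 65.0), (0.8, 70.0), (0.8, 92.0), (0.75, 80.0)])
```

With these samples the level climbs from 1 to 3, holds once the goal is met, drops to 2 during the spike with the leakage policy deferred, and returns to 3 afterwards.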

The coordinated manager design integrates the fundamental principles of closed-loop and goal-driven control through the following basic mechanisms:

(1) Sensors: The manager requires access to temperature sensors and event counters (collectively referred to as sensors) throughout the chip at appropriate sampling intervals. The manager can also use activity counter data to track decision effectiveness and determine cost functions for knob settings.

(2) Actuators: A coordinated manager requires useful knobs to turn, such as DVFS, pipeline width modulation, and sleep mode techniques in our experiments. A selection of knobs that encompasses a range of options from coarse-grain global control to fine-grain localized control provides resolution for tuning the processor's operation to its goal state.

(3) Feedback algorithms: A robust algorithm directs knob settings by synthesizing information from sensors and counters. The algorithm must be stable over a wide range of input and goal functions in order to prevent system failure from errant control decisions.

(4) Hierarchy: The manager will span hardware and software for a combination of immediate control and flexibility. A hierarchy within the manager distributes decisions according to required response time: quick response in hardware for phenomena with short time constants, such as a jump in leakage power when a unit exits sleep mode; and software to handle longer intervals between decisions for slow-moving trends like gradual chip warming.

(5) Granularity: Some responses such as universal clock reduction are applied at a global level, while others, such as cache sleep modes, target only a localized area. The advent of techniques such as voltage islands and globally asynchronous, locally synchronous (GALS) designs will enable techniques such as DVFS to be applied non-uniformly across the chip. A coordinated manager can tune a wide range of coarse and fine grain management techniques to efficiently manage resources.
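The split between fast and slow decisions in item (4) can be sketched as a two-rate loop. The class below is a minimal illustration, assuming a single temperature sensor and a single frequency actuator; the class name, the thresholds, and the tick period are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class HierarchicalManager:
    temp_limit: float = 85.0
    slow_period: int = 4          # fast ticks per software decision (assumed)
    freq_cap: float = 3.0         # GHz ceiling owned by the slow loop
    window: list = field(default_factory=list)

    def fast_tick(self, temp):
        """Hardware-style path: throttle immediately on an excursion."""
        self.window.append(temp)
        return 0.5 * self.freq_cap if temp > self.temp_limit else self.freq_cap

    def slow_tick(self):
        """Software-style path: adjust the ceiling from the recent average."""
        avg = sum(self.window) / len(self.window)
        if avg > self.temp_limit - 5.0:
            self.freq_cap = max(1.0, self.freq_cap - 0.25)   # warming trend
        elif avg < self.temp_limit - 15.0:
            self.freq_cap = min(3.0, self.freq_cap + 0.25)   # ample headroom
        self.window.clear()

def run(manager, temps):
    freqs = []
    for tick, temp in enumerate(temps, start=1):
        freqs.append(manager.fast_tick(temp))
        if tick % manager.slow_period == 0:
            manager.slow_tick()
    return freqs

freqs = run(HierarchicalManager(), [82.0, 82.0, 90.0, 82.0, 60.0])
```

The fast path reacts to the single 90-degree sample within one tick, while the slow path lowers the ceiling only after seeing that the whole window ran warm.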

Evaluation: We have completed the development of an architectural simulation infrastructure to quantify the effect of power, temperature, and energy management decisions. Our infrastructure combines our detailed and validated microarchitectural simulator (sim-alpha) with the Wattch power model and the HotSpot temperature model. We have already extended the simulator to include power management techniques such as dynamic frequency and voltage scaling, pipeline throttling, and cache leakage control. Our initial experiments measured the system with no power management, uncoordinated power management, and fixed power management (trying all possible power management parameter settings and picking the best one, a technique not feasible in reality) [5]. The results show that the best power management settings outperform both the uncoordinated manager and no power management by a wide margin on a subset of the SPEC2000 benchmarks.
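The fixed power management baseline amounts to an exhaustive search over parameter settings. A minimal sketch of that search follows; the `evaluate` function stands in for a full simulation run, and its formulas, the parameter grids, and the temperature limit are assumptions for illustration rather than values from the experiments.

```python
import itertools

FREQS = [1.0, 1.5, 2.0, 2.5, 3.0]      # GHz, hypothetical
WIDTHS = [2, 4]                        # pipeline issue width, hypothetical
CACHE_SLEEP = [False, True]            # cache leakage control off/on

def evaluate(freq, width, sleep):
    """Stand-in for one simulation run; returns (runtime, peak_temp)."""
    runtime = 100.0 / (freq * width ** 0.5) * (1.05 if sleep else 1.0)
    peak_temp = 45.0 + 4.0 * freq * width - (3.0 if sleep else 0.0)
    return runtime, peak_temp

def best_fixed_setting(temp_limit):
    best = None
    # Try every combination and keep the fastest one that stays within the limit.
    for setting in itertools.product(FREQS, WIDTHS, CACHE_SLEEP):
        runtime, peak = evaluate(*setting)
        if peak <= temp_limit and (best is None or runtime < best[0]):
            best = (runtime, setting)
    return best

runtime, setting = best_fixed_setting(temp_limit=85.0)
```

The search is only practical off-line because every candidate requires a complete run of the workload, and the number of candidates grows multiplicatively with each added knob.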

In the coming year, we will evaluate algorithms for dynamically managing and allocating energy subject to temperature, power, and performance constraints. We hope to surpass the performance of the optimal off-line algorithm with a good dynamic on-line algorithm that operates in conjunction with application execution. We will extend our simulation infrastructure to include performance counters and a sensor network to provide data for the coordinated manager's online algorithm and measure the system response at realistic sampling intervals. Future work may examine the viability of using finer-grained voltage/frequency modulation, as afforded through fabrication and circuit techniques such as voltage/frequency islands.

References:

[1] "Discovering and Exploiting Program Phases," T. Sherwood, E. Perelman, G. Hamerly, S. Sair, and B. Calder, IEEE Micro, 23(6), pp. 84-93, November/December 2003.

[2] "Microprocessor Pipeline Energy Analysis," R. Natarajan, H. Hanson, S. W. Keckler, C. R. Moore, and D. Burger, IEEE International Symposium on Low Power Electronics and Design (ISLPED), pp. 282-287, August 2003.

[3] "Static Energy Reduction Techniques for Microprocessor Caches," H. Hanson, M. S. Hrishikesh, V. Agarwal, S. W. Keckler, and D. Burger, IEEE Transactions on VLSI Systems, 11(3), pp. 303-313, June 2003.

[4] "Scheduling for Heterogeneous Processors in Server Systems," S. Ghiasi, T. Keller, and F. Rawson (IBM Austin Research Laboratory), Computing Frontiers Conference, May 2005.

[5] "A Case for Coordinated Management of Performance, Power, Energy, and Temperature," H. Hanson and S. Keckler, submitted to the IEEE International Symposium on Low Power Electronics and Design (ISLPED), 2005.

3 Project Objectives and Goals

Our primary goals are to answer the following research questions:

What are the limits of individual power management techniques applied in isolation?

How do these different power management techniques interact when applied simultaneously, but controlled independently? Are the interactions complementary or confrontational?

What are the appropriate metrics for power/thermal optimization (temperature, power consumption?), and what are the most appropriate means of measuring them on-line?

What are the natural time constants of the power management techniques? How long does it take to invoke each technique, what is the overhead, and how long does it take for the technique to take effect?

What are the limits of a coordinated approach to power management, in which all of the power management techniques are controlled in a cooperative fashion?

How close can real control algorithms approach the optimal limits of power management? What are the benefits of feedback control algorithms over open-loop algorithms?

What is the right balance between hardware and software in implementing an embedded power manager?

How do the best power management policies on an aggressive conventional architecture compare to a more conservative simpler (and perhaps more inherently power efficient) architecture with less extensive power management, in terms of power and performance?

In addition, we expect that the insights developed during this study will be of interest to the IBM Systems Group. We expect to interact with Ron Kalla and Carl Anderson (among others at IBM) to ensure relevance of the work to IBM and to provide a conduit for the insights back into IBM.

4 Long Term Impact

Aggressive power management is necessary to limit the packaging and system costs for power delivery and cooling in both high-end and low-end systems. On-line continuous optimization represents a departure from the conventional approach of designing a chip/system for the worst case. Such optimization will allow designs to potentially exceed their power and temperature limits, but will rely on on-line mechanisms to ensure safe operating conditions. This approach will allow the system to run closer to the edge of the power/performance envelope than current strategies that overly restrict the system at design time. IBM has recognized the need for power management and has established a corporate-wide low power initiative centered at the IBM Austin Research Laboratory (ARL). In addition, new IBM initiatives in autonomic computing are well-matched with the notion of continuous optimization described in this proposal. Combining the circuits and systems work from the ARL with the expected microarchitectural results from this research will likely prove beneficial to future designs within the IBM Systems Group. We are uniquely positioned to investigate this area because of our strong ties to IBM in both the TRIPS and PERCS projects.
