Tutorial on Hardware Accelerators for Deep Neural Networks
Joel Emer, MIT / NVIDIA
Vivienne Sze, MIT
Yu-Hsin Chen, MIT
Tien-Ju Yang, MIT
Email: eyeriss at mit dot edu

Welcome to the DNN tutorial website!
- A summary of all DNN-related papers from our group can be found here.
- DNN-related websites and resources can be found here.
- To find out more about the Eyeriss project, please go here.
- To find out more about other ongoing research in the Energy-Efficient Multimedia Systems (EEMS) group at MIT, please go here.
- Follow @eems_mit or subscribe to our mailing list for updates on the tutorial (e.g., notification of when slides will be posted or updated).

We will be giving a two-day short course on Designing Efficient Deep Learning Systems on June 21-22, 2022 on the MIT campus. To find out more, please visit MIT Professional Education.

Recent News
- 9/1/2020: New article on "How to Evaluate Deep Neural Network Processors: TOPS/W (Alone) Considered Harmful" in SSCS Magazine is now available here.
- 6/25/2020: Our book on Efficient Processing of Deep Neural Networks is now available here.
- 6/15/2020: Excerpt of the forthcoming book on Efficient Processing of Deep Neural Networks, chapter on "Key Metrics and Design Objectives," available here.
- 5/29/2020: Videos of the ISCA Timeloop/Accelergy tutorial, Tools for Evaluating Deep Neural Network Accelerator Designs, available here.
- 4/17/2020: Our book on Efficient Processing of Deep Neural Networks is now available for pre-order here.
- 2/16/2020: Excerpt of the forthcoming book on Efficient Processing of Deep Neural Networks, chapter on "Advanced Technologies," available here.
- 12/09/2019: Video and slides of the NeurIPS tutorial on Efficient Processing of Deep Neural Networks: from Algorithms to Hardware Architectures available here.
- 11/11/2019: We will be giving a two-day short course on Designing Efficient Deep Learning Systems at MIT in Cambridge, MA on July 20-21, 2020. To find out more, please visit MIT Professional Education.
- 9/22/2019: Slides for the ICIP tutorial on Efficient Image Processing with Deep Neural Networks available here.
- 9/20/2019: Code released for NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications here.
All News

Overview

Deep neural networks (DNNs) are currently widely used for many AI applications, including computer vision, speech recognition, and robotics. While DNNs deliver state-of-the-art accuracy on many AI tasks, this accuracy comes at the cost of high computational complexity. Accordingly, designing efficient hardware architectures for deep neural networks is an important step towards enabling the wide deployment of DNNs in AI systems.

This tutorial provides a brief recap of the basics of deep neural networks and is for those who are interested in understanding how those models are mapped to hardware architectures. We will provide frameworks for understanding the design space for deep neural network accelerators, including managing data movement, handling sparsity, and the importance of flexibility. This is an intermediate-level tutorial that will go beyond the material in the previous incarnations of this tutorial.

An overview paper based on the tutorial, "Efficient Processing of Deep Neural Networks: A Tutorial and Survey," is available here.
Our book based on the tutorial, Efficient Processing of Deep Neural Networks, is available here.
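To make the mapping problem concrete, the sketch below writes a single convolutional layer as the nested loop kernel that an accelerator must map onto its datapath and memory hierarchy (the topic of the "DNN Kernel Computation" slides). This is a minimal illustrative sketch, not code from the tutorial materials; the function name, layer shapes, and example sizes are assumptions. The choice of which loops to tile, reorder, and run in parallel is exactly the design space of dataflows and data movement that the tutorial discusses.

```python
# Minimal sketch (illustrative, not from the tutorial materials): a single
# convolutional layer expressed as the loop nest that a DNN accelerator must
# map onto its hardware. Loop order, tiling, and which loops run in parallel
# determine the dataflow and the resulting data movement cost.

import numpy as np

def conv_layer(inp, weights, stride=1):
    """inp: (C, H, W) input feature map; weights: (M, C, R, S) filters.
    Returns the (M, H_out, W_out) output feature map."""
    C, H, W = inp.shape
    M, Cw, R, S = weights.shape
    assert C == Cw, "input channels must match filter channels"
    H_out = (H - R) // stride + 1
    W_out = (W - S) // stride + 1
    out = np.zeros((M, H_out, W_out), dtype=inp.dtype)

    # The nested loops below (plus a batch loop in practice) are the DNN
    # kernel computation. An accelerator chooses which loops to unroll onto
    # parallel MACs and which to tile into on-chip buffers in order to
    # maximize reuse of weights, inputs, and partial sums.
    for m in range(M):                      # output channels
        for y in range(H_out):              # output rows
            for x in range(W_out):          # output columns
                acc = 0.0
                for c in range(C):          # input channels
                    for r in range(R):      # filter rows
                        for s in range(S):  # filter columns
                            acc += inp[c, y * stride + r, x * stride + s] \
                                   * weights[m, c, r, s]
                out[m, y, x] = acc
    return out

if __name__ == "__main__":
    # Tiny hypothetical shapes: 3-channel 8x8 input, four 3x3 filters.
    rng = np.random.default_rng(0)
    inp = rng.standard_normal((3, 8, 8)).astype(np.float32)
    w = rng.standard_normal((4, 3, 3, 3)).astype(np.float32)
    print(conv_layer(inp, w).shape)  # (4, 6, 6)
```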
This book provides a structured treatment of the key principles and techniques for enabling efficient processing of deep neural networks (DNNs). The book includes background on DNN processing; a description and taxonomy of hardware architectural approaches for designing DNN accelerators; key metrics for evaluating and comparing different designs; features of DNN processing that are amenable to hardware/algorithm co-design to improve energy efficiency and throughput; and opportunities for applying new technologies. Readers will find a structured introduction to the field as well as a formalization and organization of key concepts from contemporary work that provides insights that may spark new ideas.

Excerpts of the book on "Key Metrics and Design Objectives" and "Advanced Technologies" are available here.

Participant Takeaways
- Understand the key design considerations for DNNs
- Be able to evaluate different DNN hardware implementations with benchmarks and comparison metrics (see the sketch at the end of this page)
- Understand the trade-offs between various architectures and platforms
- Assess the utility of various optimization approaches
- Understand recent implementation trends and opportunities

Slides from ISCA Tutorial (June 22, 2019)
- Overview of Deep Neural Networks [slides]
- Popular DNNs and Datasets [slides]
- Benchmarking Metrics [slides]
- DNN Kernel Computation [slides]
- DNN Accelerators (part 1) [slides]
- DNN Accelerators (part 2) [slides]
- DNN Model and Hardware Co-Design (precision) [slides]
- DNN Processing Near/In Memory [slides]
- DNN Model and Hardware Co-Design (sparsity) [slides]
- Sparse DNN Accelerators [slides]
- Tutorial Summary [slides]

Slides from Previous Versions of Tutorial
- ISCA 2017, CICS/MTL 2017, MICRO 2016

Video
- MLSys (March 2020): How to Evaluate DNN Accelerators
- NeurIPS Tutorial (December 2019): Efficient Processing of Deep Neural Networks: from Algorithms to Hardware Architectures
- SSCS Webinar (April 2018): Energy-Efficient Deep Learning: Challenges and Opportunities

BibTeX

@article{2017_dnn_piee,
  title   = {Efficient processing of deep neural networks: A tutorial and survey},
  author  = {Sze, Vivienne and Chen, Yu-Hsin and Yang, Tien-Ju and Emer, Joel},
  journal = {Proceedings of the IEEE},
  volume  = {105},
  number  = {12},
  pages   = {2295--2329},
  year    = {2017}
}

Related Papers
- V. Sze, Y.-H. Chen, T.-J. Yang, J. S. Emer, "How to Evaluate Deep Neural Network Processors: TOPS/W (Alone) Considered Harmful," IEEE Solid-State Circuits Magazine, vol. 12, no. 3, pp. 28-41, Summer 2020. [PDF]
- T.-J. Yang, V. Sze, "Design Considerations for Efficient Deep Neural Networks on Processing-in-Memory Accelerators," IEEE International Electron Devices Meeting (IEDM), Invited Paper, December 2019. [paper PDF | slides PDF]
- Y.-H. Chen, T.-J. Yang, J. Emer, V. Sze, "Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices," IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS), vol. 9, no. 2, pp. 292-308, June 2019. [paper PDF | earlier version arXiv]
- D. Wofk*, F. Ma*, T.-J. Yang, S. Karaman, V. Sze, "FastDepth: Fast Monocular Depth Estimation on Embedded Systems," IEEE International Conference on Robotics and Automation (ICRA), May 2019. [paper PDF | poster PDF | project website LINK | summary video | code github]
- T.-J. Yang, A. Howard, B. Chen, X. Zhang, A. Go, M. Sandler, V. Sze, H. Adam, "NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications," European Conference on Computer Vision (ECCV), September 2018. [paper arXiv | poster PDF | project website LINK | code github]
- Y.-H. Chen*, T.-J. Yang*, J. Emer, V. Sze, "Understanding the Limitations of Existing Energy-Efficient Design Approaches for Deep Neural Networks," SysML Conference, February 2018. [paper PDF | talk video] Selected for Oral Presentation
- V. Sze, T.-J. Yang, Y.-H. Chen, J. Emer, "Efficient Processing of Deep Neural Networks: A Tutorial and Survey," Proceedings of the IEEE, vol. 105, no. 12, pp. 2295-2329, December 2017. [paper PDF]
- T.-J. Yang, Y.-H. Chen, J. Emer, V. Sze, "A Method to Estimate the Energy Consumption of Deep Neural Networks," Asilomar Conference on Signals, Systems and Computers, Invited Paper, October 2017. [paper PDF | slides PDF]
- T.-J. Yang, Y.-H. Chen, V. Sze, "Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017. [paper arXiv | poster PDF | DNN energy estimation tool LINK | DNN models LINK] Highlighted in MIT News
- Y.-H. Chen, J. Emer, V. Sze, "Using Dataflow to Optimize Energy Efficiency of Deep Neural Network Accelerators," IEEE Micro's Top Picks from the Computer Architecture Conferences, May/June 2017. [PDF]
- A. Suleiman*, Y.-H. Chen*, J. Emer, V. Sze, "Towards Closing the Energy Gap Between HOG and CNN Features for Embedded Vision," IEEE International Symposium on Circuits and Systems (ISCAS), Invited Paper, May 2017. [paper PDF | slides PDF | talk video]
- V. Sze, Y.-H. Chen, J. Emer, A. Suleiman, Z. Zhang, "Hardware for Machine Learning: Challenges and Opportunities," IEEE Custom Integrated Circuits Conference (CICC), Invited Paper, May 2017. [paper arXiv | slides PDF] Received Outstanding Invited Paper Award
- Y.-H. Chen, T. Krishna, J. Emer, V. Sze, "Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks," IEEE Journal of Solid-State Circuits (JSSC), ISSCC Special Issue, vol. 52, no. 1, pp. 127-138, January 2017. [PDF]
- Y.-H. Chen, J. Emer, V. Sze, "Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks," International Symposium on Computer Architecture (ISCA), pp. 367-379, June 2016. [paper PDF | slides PDF] Selected for IEEE Micro's Top Picks special issue on "most significant papers in computer architecture based on novelty and long-term impact" from 2016
- Y.-H. Chen, T. Krishna, J. Emer, V. Sze, "Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks," IEEE International Solid-State Circuits Conference (ISSCC), pp. 262-264, February 2016. [paper PDF | slides PDF | poster PDF | demo video | project website] Highlighted in EE Times and MIT News.

* Indicates authors contributed equally to the work

Related Websites and Resources
- Eyeriss Project Website [LINK]
- DNN Energy Estimation Website [LINK]
- DNN Processor Benchmarking Website [LINK]

© MIT EEMS 2016
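As referenced from the participant takeaway on evaluating DNN hardware with benchmarks and comparison metrics, and in the spirit of "How to Evaluate Deep Neural Network Processors: TOPS/W (Alone) Considered Harmful" listed above, here is a minimal sketch of why a peak TOPS/W number by itself can mislead. All numbers, the layer workload, and the function name are hypothetical assumptions for illustration; the point is only that achieved throughput and energy per inference depend on utilization and data movement, not just peak MACs.

```python
# Hypothetical numbers for illustration only (not measured results): two
# accelerators with identical peak TOPS/W can deliver very different
# throughput and energy per inference once utilization and off-chip data
# movement are taken into account.

def evaluate(name, peak_tops, power_w, utilization, dram_bytes_per_inference,
             macs_per_inference, dram_energy_pj_per_byte=100.0):
    """Return a small report of derived metrics for one accelerator."""
    effective_tops = peak_tops * utilization          # achieved, not peak
    ops_per_inference = 2 * macs_per_inference        # 1 MAC = 2 ops
    latency_s = ops_per_inference / (effective_tops * 1e12)
    compute_energy_j = power_w * latency_s            # crude power * time model
    dram_energy_j = dram_bytes_per_inference * dram_energy_pj_per_byte * 1e-12
    return {
        "accelerator": name,
        "peak TOPS/W": peak_tops / power_w,
        "effective TOPS": effective_tops,
        "latency (ms)": latency_s * 1e3,
        "energy/inference (mJ)": (compute_energy_j + dram_energy_j) * 1e3,
    }

if __name__ == "__main__":
    # Same peak TOPS/W (4.0), different utilization and off-chip traffic.
    workload = dict(macs_per_inference=0.5e9)  # assumed small-CNN workload
    a = evaluate("A", peak_tops=4.0, power_w=1.0, utilization=0.9,
                 dram_bytes_per_inference=2e6, **workload)
    b = evaluate("B", peak_tops=4.0, power_w=1.0, utilization=0.2,
                 dram_bytes_per_inference=20e6, **workload)
    for report in (a, b):
        print(report)
```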
Related Articles
1. IBM/AccDNN: A compiler from AI model to RTL ... - GitHub
   A compiler from AI model to RTL (Verilog) accelerator in FPGA hardware with auto design space exp...
2. (PDF) Perceptron Algorithm and Its Verilog Design
   simple perceptron design can replace the defect-tolerant registers and the simple memory units, ...
3. Automatically Generate Deep Neural Network Accelerator in ...
4. IoT Applications and AI Chip Design
   Project course planning. First semester (elective Electronics Project or Project Implementation): become familiar with Verilog; practice a cell-based or FPGA design flow; read related papers; propose a project idea.
5. How to make your own deep learning accelerator chip!
   AI Landscape by Shan Tang: Source. Making any chip (ASIC, SOC etc) is a costly, difficult and le...