Basic of AI Accelerator Design using Verilog HDL - Slideshare


Joohan KIM • 27 Mar 2020 • Engineering • 155 slides
git: https://github.com/matbi86/01_ai_accelerator_basic_for_student
ref: http://eyeriss.mit.edu/tutorial.html
1. Basic of AI Accelerator Design using Verilog HDL
Joohan KIM
Ver. 20200512
https://blog.naver.com/chacagea
2. Participant Takeaways
• Understand the key knowledge for DNNs
• Understand the trend to improve DNNs
• Be able to design a CNN core using Verilog HDL
• Be able to follow the development flow for HW design
• You will be interested in hardware-accelerated design!!
3. Resources
• Training resources: https://github.com/matbi86/01_ai_accelerator_basic_for_student
• QnA: https://blog.naver.com/chacagea/memo/221865835735
• Shoveling Log!!: https://blog.naver.com/chacagea
4. Contents (Day 1)
Theory: Basic of Deep Learning
• Background of Deep Neural Networks
• Overview of Deep Neural Networks
  - Perceptron
  - Deep Neural Network (DNN)
  - Inference vs. Training
• Why Study CNN?
  - Image Classification
  - Super Resolution
  - Generative Adversarial Networks (GAN)
• Understanding Convolutional Neural Networks
  - Channel
  - Stride
  - Padding
  - Pooling
Training: CNN Operation Coding with C
• How to use Vim/GCC
• CNN core C coding (the HW will be implemented from this code)
5. Contents (Day 2)
Theory: Understanding CNN and HW Implementation
• Survey of DNN Development Resources
  - Popular DNNs: LeNet (1998), AlexNet (2012), OverFeat (2013), VGGNet (2014), GoogLeNet (2014), ResNet (2015)
  - Frameworks
  - Data Sets
• Understanding CNN (Advanced)
• CPU vs GPU vs FPGA vs ASIC (Why use HW?)
Training: Vivado Simulation Environment Test
• Introduction to Vivado
• 4-bit counter design and simulation
6. Contents (Day 3)
Theory: CNN Core Design using Verilog HDL
• Short Review of Verilog HDL
  - RTL
  - Parameter
  - Generate
  - Indexed Part Select
• CNN Core Spec
• CNN Core HW Architecture
Training: CNN Core Design using Verilog HDL
• CNN Design with Verilog HDL
• CNN Design Review
7. Day 1 Theory: Basic of Deep Learning (Ver. 20200512)
8. Contents (Day 1)
Theory: Basic of Deep Learning
Training: CNN Operation Coding with C
9. Background of Deep Neural Networks
http://eyeriss.mit.edu/tutorial.html
10. Artificial Intelligence
11. AI and Machine Learning
12. Brain-Inspired Machine Learning
13. How Does the Brain Work?
14. Spiking-based Machine Learning
15. Spiking Architecture
16. Machine Learning with Neural Networks
17. Neural Networks: Weighted Sum
18. Many Weighted Sums
19. Deep Learning
20. What is Deep Learning?
• Number of layers: 5 to over 1000
• DNN (Deep Neural Networks)
• Visualizing CNN
• Low-level features / higher-level features
21. Why is Deep Learning Hot Now?
22. ImageNet Challenge
23. ImageNet: Image Classification Task
24. GPU Usage for ImageNet Challenge
25. Deep Learning on Games
26. Emerging Applications
• Medical (Cancer Detection, Pre-Natal)
• Finance (Trading, Energy Forecasting, Risk)
• Infrastructure (Structure Safety and Traffic)
• Weather Forecasting and Event Detection
• Self-driving Cars
27. Opportunities
From EE Times, September 27, 2016:
"Today the job of training machine learning models is limited by compute, if we had faster processors we'd run bigger models… in practice we train on a reasonable subset of data that can finish in a matter of months. We could use improvements of several orders of magnitude - 100x or greater."
- Greg Diamos, Senior Researcher, SVAIL, Baidu
28. Overview of Deep Neural Networks
29. DNN Timeline
30. So Many Neural Networks!
http://www.asimovinstitute.org/neural-network-zoo/
31. Perceptron
What is a perceptron? It takes multiple signals as input and produces a single output signal.
AND: (w1, w2, θ) = (0.5, 0.5, 0.7), (0.5, 0.5, 0.8) or (1.0, 1.0, 1.0)
NAND: (w1, w2, θ) = (-0.5, -0.5, -0.7), (-0.5, -0.5, -0.8) or (-1.0, -1.0, -1.0)
OR: (w1, w2, θ) = (0.3, 0.3, 0.5)
A perceptron with the same structure can express different logic functions purely through the values of its parameters.
https://ko.wikipedia.org/wiki/%ED%8D%BC%EC%85%89%ED%8A%B8%EB%A1%A0
32. Problem of Perceptron
Single-layer perceptron: a single-layer perceptron cannot express XOR.
Multi-layer perceptron: stacking perceptron layers makes it possible to solve non-linear problems. Non-linearity is important for solving complex problems.
33. Activation Function
34. DNN Terminology 101
Neurons, Synapses
35-40. DNN Terminology 101
41. DNN Terminology 101
Matrix Dot Product, Bias, Layer 1, Layer 2, Layer 3
42. Popular Types of DNNs
43-44. Inference vs. Training
45. Why Study CNN?
46. Image Classification
https://becominghuman.ai/building-an-image-classifier-using-deep-learning-in-python-totally-from-a-beginners-perspective-be8dbaf22dd8
47. Super Resolution
Enhanced Deep Residual Networks for Single Image Super-Resolution
https://awesomeopensource.com/project/limbee/NTIRE2017
48-49. Super Resolution (EDSR)
https://awesomeopensource.com/project/limbee/NTIRE2017
50. Generative Adversarial Networks (GAN)
https://www.freecodecamp.org/news/an-intuitive-introduction-to-generative-adversarial-networks-gans-7a2264a81394/
51. Generative Adversarial Networks (GAN)
https://www.youtube.com/watch?v=C1YUYWP-6rE (4:00)
52. Generative Adversarial Networks (GAN)
https://hoya012.github.io/blog/SIngle-Image-Super-Resolution-Overview/
53. Understanding Convolutional Neural Networks
54. MNIST
MNIST database (Modified National Institute of Standards and Technology database)
- Handwritten digit images
- 28x28 pixels
- Training set: 60,000, test set: 10,000
55. Convolutional Neural Networks
CNN (Convolutional Neural Network)
http://taewan.kim/post/cnn/
Kernel size: Kx * Ky
Feature map size: X * Y
Channel depth: M (input channels) * N (output channels)
56. Convolutional Neural Networks
Reasons to use convolution in the image field:
- 2D image
- Similarity of neighboring pixels
57. Channel
- Domain
- More information
58. Stride
http://cs231n.github.io/convolutional-networks/
- Reduces the feature map size
59. Padding
- Maintains the feature map size
- Improves accuracy
60. Pooling
- Reduces the feature map size
- Keeps accuracy
61. Deep Convolutional Neural Networks
62. Day 1 Training: CNN Operation Coding with C (Ver. 20200512)
63. How to use Vim/gcc
64. Vim
Vim (Vi IMproved) is a vi-compatible text editor created by Bram Moolenaar. There is Vim for the CUI and gVim for the GUI. It was originally a program for the Amiga computer, but it now supports many environments, including Microsoft Windows, Linux and Mac OS X.
Vim stays compatible with vi while adding many features of its own for user convenience. Its strengths include free customization and extension of the editing environment through Vim script, an extended regular expression syntax, powerful syntax highlighting, multi-level undo, multilingual support including Unicode, and syntax checking. On the other hand, like vi, it is considered hard to learn at first, and an easy Vim mode is provided to address this.
https://ko.wikipedia.org/wiki/Vim
65. Vim
66. GCC
The GNU Compiler Collection (GCC) is a widely used compiler developed as part of the GNU Project.
One of the best-known pieces of free software, GCC originally supported only C and was called the "GNU C Compiler". For this reason GCC is still sometimes used as an abbreviation for the GNU C Compiler, which is part of the GNU Compiler Collection. It took its current name once it became able to compile other languages such as C++, Java, Fortran and Ada.
https://ko.wikipedia.org/wiki/GNU_%EC%BB%B4%ED%8C%8C%EC%9D%BC%EB%9F%AC_%EB%AA%A8%EC%9D%8C
GNU stands for "GNU's Not Unix!"
67. Makefile
The build file used by the make command. It records rules of the form "if a source file is newer than its object file, recompile and relink it (conversely, if the source is older than the object, the source has not changed since it was compiled, so there is no need to recompile or relink)". Developers therefore do not have to remember how to compile and link after changing a file, which reduces software development effort accordingly.
[Naver Knowledge Encyclopedia] Makefile (Dictionary of Computer, Internet and IT Terms, 2011.1.20., Computer Terminology Dictionary Compilation Committee)
• Saves time by automating the commands repeated for each file
• Makes the dependency structure of the program quick to grasp and easy to manage
• Minimizes simple repetitive work and rewriting
Source: https://bowbowbow.tistory.com/12 [멍멍멍]
68. CNN Core C Coding
69. CNN Core
• CO: number of output channels (Channel Out)
• CI: number of input channels (Channel In)
• KY: kernel size in Y
• KX: kernel size in X
70. Day 2 Theory: Understanding CNN and HW Implementation (Ver. 20200512)
71. Contents (Day 2)
Theory: Understanding CNN and HW Implementation
Training: Vivado Simulation Environment Test
72. Survey of DNN Development Resources
73. Popular DNNs
[O. Russakovsky et al., IJCV 2015]
74. LeNet-5
[Y. Lecun et al., Proceedings of the IEEE, 1998]
75. AlexNet
76. Large Sizes with Varying Shapes
[Krizhevsky et al., NIPS, 2012]
61 million weights and 724 million MACs
77. VGG-16
[Simonyan et al., arXiv 2014, ICLR 2015]
78-79. GoogLeNet (v1)
[Szegedy et al., arXiv 2014, CVPR 2015]
80. ResNet-50
[He et al., arXiv 2015, CVPR 2016]
81. Revolution of Depth
Image source: http://icml.cc/2016/tutorials/icml2016_tutorial_deep_residual_networks_kaiminghe.pdf
82-83. Summary of Popular DNNs
84. Frameworks
85. Example: Layers in Caffe
86. Benefits of Frameworks
• Rapid development
• Sharing models
• Workload profiling
• Network hardware co-design
87. Image Classification Datasets
88. MNIST
89-90. IMAGENET
91. Image Classification Summary
92. Next Tasks: Localization and Detection
93. Other Popular Datasets
94. Recently Introduced Datasets
• Google Open Images (~9M images)
  - https://github.com/openimages/dataset
• Youtube-8M (8M videos)
  - https://research.google.com/youtube8m/
• AudioSet (2M sound clips)
  - https://research.google.com/audioset/index.html
95. Summary
• Development resources presented in this section enable us to evaluate hardware using the appropriate DNN model and dataset
  - Difficult tasks typically require larger models
  - Different datasets for different tasks
  - Number of datasets growing at a rapid pace
96. Understanding CNN (Advanced)
97-99. Deep Convolutional Neural Networks
100-102. Deep Convolutional Neural Networks
103-109. Convolution (CONV) Layer
110. CNN Decoder Ring
CO: number of output channels (Channel Out)
CI: number of input channels (Channel In)
KY: kernel size in Y
KX: kernel size in X
111. CONV Layer Tensor Computation
112. CONV Layer Implementation
Ux: x_pos offset
Uy: y_pos offset
113. Fully-Connected (FC) Layer
(M x CHW) ∙ (CHW x N)
114. FC Layer - from CONV Layer POV (Point of View)
115. Traditional Activation Functions
116. Modern Activation Functions
117. Pooling (POOL) Layer
118. POOL Layer Implementation
119. CPU vs GPU vs FPGA vs ASIC (Why use HW?)
120. CPUs Are Targeting Deep Learning
FLOPS (FLoating point Operations Per Second) is a unit commonly used to express computer performance as a number. TeraFLOPS (1×10^12 FLOPS) is the unit most often used.
https://ko.wikipedia.org/wiki/%ED%94%8C%EB%A1%AD%EC%8A%A4
(Thermal Design Power)
(Multi-Channel DRAM)
121. GPUs Are Targeting Deep Learning
High Bandwidth Memory (HBM)
NVLink is a wire-based communications protocol developed by Nvidia, a serial multi-lane near-range communication link.
122. FPGAs for Deep Learning
123. SOCs for Deep Learning
Low Power
124. Silicon alternatives
https://docs.microsoft.com/ko-kr/azure/machine-learning/how-to-deploy-fpga-web-service
125. A Breakthrough in FPGA-Based Deep Learning Inference
https://www.eeweb.com/profile/lauro/articles/a-breakthrough-in-fpga-based-deep-learning-inference
https://blog.naver.com/chacagea/221745783425
126. Day 2 Training: Vivado Simulation Environment Test (Ver. 20200512)
127. Introduction to Vivado
128. Xilinx Vivado
Vivado Design Suite is a software suite produced by Xilinx for synthesis and analysis of HDL designs, superseding Xilinx ISE with additional features for system on a chip development and high-level synthesis. Vivado represents a ground-up rewrite and re-thinking of the entire design flow (compared to ISE), and has been described by reviewers as "well conceived, tightly integrated, blazing fast, scalable, maintainable, and intuitive".
Like the later versions of ISE, Vivado includes the in-built logic simulator ISIM. Vivado also introduces high-level synthesis, with a toolchain that converts C code into programmable logic. Vivado has been described as a "state-of-the-art comprehensive EDA tool with all the latest bells and whistles in terms of data model, integration, algorithms, and performance".
Replacing the 15 year old ISE with Vivado Design Suite took 1000 person-years and cost US$200 million.
https://en.wikipedia.org/wiki/Xilinx_Vivado
129. How to Install
• Versions under 2019.2: https://blog.naver.com/chacagea/221723857954
• Vitis 2019.2: https://blog.naver.com/chacagea/221820288607
130. High Level Synthesis
• High Level Synthesis
  - A method for designing and developing HW in a high-level language (C/C++/SystemC).
  - It can shorten verification and design time.
  - https://en.wikipedia.org/wiki/High-level_synthesis
• Draw Line IP: https://blog.naver.com/chacagea/221415323270
• Resize IP: https://blog.naver.com/chacagea/221458559128
• Simple CNN IP (MNIST): https://blog.naver.com/chacagea/221365781890
131. 4-bit counter design and simulation
132. 4-bit counter Spec
    module counter_4b (
        input            clk,
        input            reset_n,
        input            i_data_en,
        input      [3:0] i_cnt_value,
        output reg [3:0] o_cnt
    );
133-135. How to Check in Waveform
Drag and drop here!
136. Day 3 Theory: CNN Core Design Using Verilog HDL (Ver. 20200512)
137. Contents (Day 3)
Theory: CNN Core Design using Verilog HDL
Training: CNN Core Design using Verilog HDL
138. Short Review of Verilog HDL
139. Register Transfer Level (RTL)
The RTL design is usually captured using a hardware description language (HDL) such as Verilog or VHDL.
https://en.wikipedia.org/wiki/Register-transfer_level
    always @(posedge clk) begin
        Q <= D;
    end
140-141. Parameter
142. Generate
143. Indexed Part Select [+:]
    logic [31:0] a_vect;
    logic [0:31] b_vect;
    logic [63:0] dword;
    integer      sel;
    a_vect[ 0 +: 8]    // == a_vect[ 7 : 0]
    a_vect[15 -: 8]    // == a_vect[15 : 8]
    b_vect[ 0 +: 8]    // == b_vect[0 : 7]
    b_vect[15 -: 8]    // == b_vect[8 : 15]
    dword[8*sel +: 8]  // variable part-select with fixed width
144. Example Operation in Kernel
145. CNN Core HW Architecture
146-147. CNN Core Architecture
148. CNN Core Spec
149. CNN Core Spec
defines_cnn_core.vh
150. Day 3 Training: CNN Core Design Using Verilog HDL (Ver. 20200512)
151. CNN Design with Verilog HDL
152. Tree and run.py
153. CNN Design Review
154. Waveform
155. Terminal


