Six Types of Neural Networks You Need to Know About


Deep Learning and AI · February 25, 2021 · 13 min read

An Introduction to the Most Common Neural Networks

Neural networks have become pretty popular today, but there remains a dearth of understanding about them. For one, we've seen a lot of people unable to recognize the various types of neural networks and the problems they solve, let alone distinguish between them. And second, which is somehow even worse, people indiscriminately use the words "Deep Learning" when talking about any neural network, without breaking down the differences.

In this post, we will talk about the most popular neural network architectures that everyone should be familiar with when working in AI research.

1. Feed-Forward Neural Network

This is the most basic type of neural network. It came to prominence in large part thanks to technological advancements that allowed us to add many more hidden layers without worrying too much about computational time, and thanks to the backpropagation algorithm, popularized by David Rumelhart, Geoffrey Hinton, and Ronald Williams in 1986.

Source: Wikipedia

This type of neural network essentially consists of an input layer, multiple hidden layers, and an output layer. There are no loops, and information only flows forward. Feed-forward neural networks are generally suited for supervised learning with numerical data, though they have their disadvantages too:

1) they cannot be used with sequential data;
2) they don't work too well with image data, as the performance of this model is heavily reliant on features, and finding the features for image or text data manually is a pretty difficult exercise on its own.
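To make the structure concrete, here is a minimal sketch of such a network in PyTorch. The framework choice, the layer sizes, and the single regression output are our own illustrative assumptions, not something prescribed by the architecture:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 64),   # input layer -> first hidden layer
    nn.ReLU(),
    nn.Linear(64, 32),   # second hidden layer
    nn.ReLU(),
    nn.Linear(32, 1),    # output layer (e.g. one regression target)
)

x = torch.randn(8, 10)   # a batch of 8 examples with 10 numeric features
y_hat = model(x)         # information flows forward only; no loops
print(y_hat.shape)       # torch.Size([8, 1])
```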
This brings us to the next two classes of neural networks: Convolutional Neural Networks and Recurrent Neural Networks.

2. Convolutional Neural Networks (CNN)

There are a lot of algorithms that people used for image classification before CNNs became popular. People used to create features from images and then feed those features into some classification algorithm like an SVM. Some algorithms also used the pixel-level values of images as a feature vector. To give an example, you could train an SVM with 784 features, where each feature is the value of one pixel in a 28x28 image.

So why CNNs, and why do they work so much better?

CNNs can be thought of as automatic feature extractors for images. An algorithm that works on a raw pixel vector loses a lot of the spatial interaction between pixels, whereas a CNN effectively uses adjacent-pixel information to downsample the image, first through convolutions, and then uses a prediction layer at the end.

This concept was first presented by Yann LeCun in 1998 for digit classification, where he used convolutional layers to predict digits. It was later popularized by AlexNet in 2012, which used multiple convolutional layers to achieve state of the art on ImageNet, making CNNs the algorithm of choice for image classification challenges ever since.

Over time, various advancements have been made in this area, with researchers coming up with CNN architectures like VGG, ResNet, Inception, and Xception, which have continually moved the state of the art for image classification.

CNNs are also used for object detection, a harder problem because, apart from classifying images, we also want to detect the bounding boxes around the various objects in an image. Researchers have come up with many architectures, like YOLO, RetinaNet, and Faster R-CNN, to solve the object detection problem, all of which use CNNs as part of their architecture.
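As a rough sketch of the "convolve and downsample, then predict" idea, here is a tiny CNN in PyTorch sized for the 28x28 grayscale images mentioned above. The number of filters and layers is arbitrary and chosen purely for illustration:

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Convolutions + pooling act as the automatic feature extractor.
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # 28x28 -> 28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 14x14 -> 7x7
        )
        # A dense prediction layer sits at the end.
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

logits = TinyCNN()(torch.randn(4, 1, 28, 28))  # four fake 28x28 grayscale images
print(logits.shape)                            # torch.Size([4, 10])
```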
Here are a few articles you might want to look at:

- End to End Pipeline for setting up Multiclass Image Classification for Data Scientists
- Object Detection: An End to End Theoretical Perspective
- How to Create an End to End Object Detector using Yolov5?

3. Recurrent Neural Networks (LSTM/GRU/Attention)

What CNNs are for images, Recurrent Neural Networks are for text. RNNs can help us learn the sequential structure of text, where each word is dependent on the previous word, or on a word in the previous sentence.

For a simple explanation of an RNN, think of an RNN cell as a black box taking as input a hidden state (a vector) and a word vector, and giving out an output vector and the next hidden state. This box has some weights which need to be tuned by backpropagating the losses. Also, the same cell is applied to all the words, so that the weights are shared across the words in the sentence. This is called weight sharing.

Hidden state, Word vector -> (RNN Cell) -> Output vector, Next hidden state

Below is the expanded version of the same RNN cell, where each RNN cell runs on one word token and passes a hidden state to the next cell. For a sequence of length 4 like "the quick brown fox", the RNN cell gives 4 output vectors, which can be concatenated and then used as part of a dense feed-forward architecture to solve the final task, such as language modeling or classification (a code sketch of this loop appears at the end of this section).

Long Short Term Memory networks (LSTM) and Gated Recurrent Units (GRU) are a subclass of RNNs specialized in remembering information for extended periods, something plain RNNs struggle with because of the vanishing gradient problem. They do this by introducing various gates which regulate the cell state by adding or removing information from it.

From a very high-level view, you can understand LSTM/GRU as a play on RNN cells to learn long-term dependencies. RNNs/LSTMs/GRUs have been predominantly used for various language modeling tasks, where the objective is to predict the next word given a stream of input words, or for tasks which have a sequential pattern to them. If you want to learn how to use RNNs for text classification tasks, take a look at this post.

The next thing we should mention is attention-based models, but let's only talk about the intuition here, as diving deep into those can get pretty technical (if interested, you can look at this post). In the past, conventional methods like TF-IDF/CountVectorizer were used to find features from text by doing keyword extraction: some words are more helpful in determining the category of a text than others. However, with this approach we lose the sequential structure of the text. With LSTMs and deep learning methods, we can take care of the sequence structure, but we lose the ability to give higher weight to more important words. Can we have the best of both worlds? The answer is yes: attention is all we need. In the authors' words:

"Not all words contribute equally to the representation of the sentence's meaning. Hence, we introduce an attention mechanism to extract such words that are important to the meaning of the sentence and aggregate the representation of those informative words to form a sentence vector."
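Here is a minimal sketch of the recurrent loop described above, using PyTorch's RNNCell. The vocabulary size, embedding size, hidden size, and token ids are all hypothetical, and a trained model would of course learn its weights rather than use random ones:

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 1000, 32, 64
embedding = nn.Embedding(vocab_size, embed_dim)
cell = nn.RNNCell(embed_dim, hidden_dim)     # one cell, reused for every token (weight sharing)

tokens = torch.tensor([[2, 54, 311, 9]])     # hypothetical ids for "the quick brown fox"
h = torch.zeros(1, hidden_dim)               # initial hidden state
outputs = []
for t in range(tokens.size(1)):
    word_vec = embedding(tokens[:, t])       # word vector for token t
    h = cell(word_vec, h)                    # (hidden state, word vector) -> next hidden state
    outputs.append(h)                        # the per-step output vector

sentence_repr = torch.cat(outputs, dim=-1)   # concatenate the 4 output vectors
classifier = nn.Linear(4 * hidden_dim, 2)    # dense layer for the final (e.g. classification) task
print(classifier(sentence_repr).shape)       # torch.Size([1, 2])
```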
4. Transformers

Transformers have become the de facto standard for any Natural Language Processing (NLP) task, and the recently introduced GPT-3 is the biggest one yet.

In the past, LSTM and GRU architectures, along with the attention mechanism, used to be the state-of-the-art approach for language modeling problems and translation systems. The main problem with these architectures is that they are recurrent in nature, so their runtime increases as the sequence length increases. That is, these architectures take a sentence and process each word sequentially, so as the sentence length increases, so does the whole runtime.

The Transformer, a model architecture first introduced in the paper "Attention Is All You Need", lets go of this recurrence and instead relies entirely on an attention mechanism to draw global dependencies between input and output. That makes it fast, more accurate, and the architecture of choice for solving various problems in the NLP domain.

If you want to know more about transformers, take a look at the following two posts:

- Understanding Transformers, the Data Science Way
- Understanding Transformers, the Programming Way
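To give a flavor of what dropping recurrence looks like in code, here is a minimal sketch using PyTorch's built-in Transformer encoder. The sizes are arbitrary, and a real model would add token embeddings and positional encodings, which are omitted here:

```python
import torch
import torch.nn as nn

d_model, nhead, seq_len, batch = 64, 4, 16, 2
layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead, dim_feedforward=128)
encoder = nn.TransformerEncoder(layer, num_layers=2)

x = torch.randn(seq_len, batch, d_model)  # already-embedded tokens, shape (seq, batch, features)
out = encoder(x)                          # every position attends to every other position in parallel
print(out.shape)                          # torch.Size([16, 2, 64])
```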
5. Generative Adversarial Networks (GAN)

Source: All of them are fake

People in data science have seen a lot of AI-generated people in recent times, whether in papers, blogs, or videos. We've reached a stage where it's becoming increasingly difficult to distinguish between actual human faces and faces generated by artificial intelligence. All of this is made possible through GANs. GANs will most likely change the way we generate video games and special effects: using this approach, you can create realistic textures or characters on demand, opening up a world of possibilities.

GANs typically employ two dueling neural networks to train a computer to learn the nature of a dataset well enough to generate convincing fakes. One of these neural networks generates fakes (the generator), and the other tries to classify which images are fake (the discriminator). These networks improve over time by competing against each other.

Perhaps it's best to imagine the generator as a robber and the discriminator as a police officer. The more the robber steals, the better he gets at stealing things. At the same time, the police officer also gets better at catching the thief.

The losses in these neural networks are primarily a function of how the other network performs:

- Discriminator network loss is a function of generator network quality: loss is high for the discriminator if it gets fooled by the generator's fake images.
- Generator network loss is a function of discriminator network quality: loss is high if the generator is not able to fool the discriminator.

In the training phase, we train the discriminator and the generator networks sequentially, intending to improve performance for both. The end goal is to end up with weights that help the generator create realistic-looking images. In the end, we'll use the generator network to generate high-quality fake images from random noise.

If you want to learn more about GANs, here is another post: What are GANs, and How do they Work?
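Here is a toy sketch of one training step that mirrors the two losses described above. The tiny fully connected networks and the random "real" data are placeholders purely for illustration:

```python
import torch
import torch.nn as nn

noise_dim, data_dim = 16, 32
generator = nn.Sequential(nn.Linear(noise_dim, 64), nn.ReLU(), nn.Linear(64, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

real = torch.randn(8, data_dim)              # stand-in for a batch of real examples
noise = torch.randn(8, noise_dim)

# Discriminator step: real examples should be labeled 1, generated fakes 0.
fake = generator(noise).detach()             # don't backprop into the generator here
d_loss = bce(discriminator(real), torch.ones(8, 1)) + bce(discriminator(fake), torch.zeros(8, 1))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# Generator step: its loss is high when the discriminator is NOT fooled.
g_loss = bce(discriminator(generator(noise)), torch.ones(8, 1))
opt_g.zero_grad()
g_loss.backward()
opt_g.step()
```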
6. Autoencoders

Autoencoders are deep learning functions which approximate a mapping from X to X, i.e. input = output. They first compress the input features into a lower-dimensional representation and then reconstruct the output from that representation. In a lot of places, this representation vector can be used as model features, which is why autoencoders are used for dimensionality reduction.

Autoencoders are also used for anomaly detection: we try to reconstruct our examples using the autoencoder, and if the reconstruction loss is too high, we can predict that the example is an anomaly.
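A minimal sketch of this idea follows, assuming simple fully connected encoder and decoder networks and an arbitrary anomaly threshold:

```python
import torch
import torch.nn as nn

input_dim, code_dim = 64, 8
encoder = nn.Sequential(nn.Linear(input_dim, 32), nn.ReLU(), nn.Linear(32, code_dim))
decoder = nn.Sequential(nn.Linear(code_dim, 32), nn.ReLU(), nn.Linear(32, input_dim))

x = torch.randn(4, input_dim)                 # a batch of examples
code = encoder(x)                             # lower-dimensional representation (usable as features)
x_hat = decoder(code)                         # reconstruction: the target is the input itself

recon_error = ((x - x_hat) ** 2).mean(dim=1)  # per-example reconstruction loss
is_anomaly = recon_error > 1.0                # flag examples above a (hypothetical) threshold
print(code.shape, recon_error.shape)          # torch.Size([4, 8]) torch.Size([4])
```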
Conclusion

Neural networks are essentially one of the greatest models ever invented, and they generalize pretty well to most of the modeling use cases we can think of. Today, these different versions of neural networks are being used to solve important problems in domains like healthcare, banking, and the automotive industry, and by big companies like Apple, Google, and Facebook to provide recommendations and power search queries. For example, Google uses BERT, a model based on Transformers, to power its search queries.

If you want to know more about deep learning applications and use cases, take a look at the Sequence Models course in the Deep Learning Specialization by Andrew Ng.