Neural Network: Architecture, Components & Top Algorithms


by upGrad | May 6, 2020

Artificial Neural Networks (ANNs) make up an integral part of the Deep Learning process. They are inspired by the neurological structure of the human brain. According to AILabPage, ANNs are "complex computer code written with the number of simple, highly interconnected processing elements which is inspired by human biological brain structure for simulating human brain working & processing data (information) models."

Deep Learning focuses on five core Neural Networks:

- Multi-Layer Perceptron
- Radial Basis Network
- Recurrent Neural Networks
- Generative Adversarial Networks
- Convolutional Neural Networks
Table of Contents
- Neural Network: Architecture
- Neural Network: Components
- Neural Network: Algorithms
- What is the Learning Problem?
- Conclusion
- FAQs

Neural Network: Architecture

Neural Networks are complex structures made of artificial neurons that can take in multiple inputs to produce a single output. This is the primary job of a Neural Network: to transform input into a meaningful output. Usually, a Neural Network consists of an input layer and an output layer, with one or more hidden layers in between.

In a Neural Network, all the neurons influence each other, and hence, they are all connected. The network can acknowledge and observe every aspect of the dataset at hand and how the different parts of the data may or may not relate to each other. This is how Neural Networks are capable of finding extremely complex patterns in vast volumes of data.

Read: Machine Learning vs Neural Networks

In a Neural Network, the flow of information occurs in two ways:

- Feedforward Networks: In this model, signals travel in one direction only, towards the output layer. Feedforward Networks have an input layer and a single output layer, with zero or more hidden layers. They are widely used in pattern recognition.
- Feedback Networks: In this model, recurrent or interactive networks use their internal state (memory) to process sequences of inputs. Signals can travel in both directions through the loops (hidden layers) in the network. They are typically used in time-series and sequential tasks.

Neural Network: Components

[Figure: a layered network diagram; the outermost yellow layer is the input layer, the blue layers are hidden layers.]

Input Layers, Neurons, and Weights

In the picture given above, the outermost yellow layer is the input layer. A neuron is the basic unit of a neural network. Neurons receive input from an external source or from other nodes. Each node is connected to a node in the next layer, and each such connection has a particular weight. Weights are assigned to a neuron based on its relative importance against the other inputs.

When all the node values from the yellow layer are multiplied by their weights and summed up, a value is generated for the first hidden layer. Based on this summed value, the blue layer applies a predefined "activation" function that determines whether or not this node will be "activated" and how "active" it will be. (A minimal code sketch of this step follows at the end of this section.)

Let's understand this using a simple everyday task: making tea. In the tea-making process, the ingredients used to make tea (water, tea leaves, milk, sugar, and spices) are the "neurons", since they make up the starting points of the process. The amount of each ingredient represents its "weight". Once you put the tea leaves in the water and add the sugar, spices, and milk to the pan, all the ingredients mix and transform into another state. This transformation process represents the "activation function".

Learn about: Deep Learning vs Neural Networks

Hidden Layers and Output Layer

The layer or layers hidden between the input and output layers are known as hidden layers. They are called hidden because they are not visible to the external world. The main computation of a Neural Network takes place in the hidden layers: the hidden layer takes all the inputs from the input layer and performs the necessary calculations to generate a result. This result is then forwarded to the output layer so that the user can view the result of the computation.

In our tea-making example, when we mix all the ingredients, the formulation changes its state and color on heating. The ingredients represent the hidden layers, and heating represents the activation process that finally delivers the result: tea.
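To make this concrete, here is a minimal sketch (Python with NumPy) of the weighted-sum-and-activation step described above. The numbers, the bias term, and the sigmoid activation are illustrative assumptions; the article itself does not prescribe a particular activation function.

```python
import numpy as np

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of the inputs followed by
    an activation function that decides how 'active' the node is."""
    z = np.dot(weights, inputs) + bias   # summed, weighted node values
    return 1.0 / (1.0 + np.exp(-z))      # sigmoid activation (an example choice)

# Hypothetical values echoing the tea analogy: five ingredient "neurons",
# with amounts as inputs and relative importance as weights.
x = np.array([1.0, 0.5, 0.8, 0.2, 0.3])   # water, tea leaves, milk, sugar, spices
w = np.array([0.4, 0.9, 0.3, 0.6, 0.2])   # relative importance of each input
print(neuron(x, w, bias=-1.0))            # value passed on to the next layer
```

The same pattern, applied layer by layer, is all a feedforward pass is: each hidden node computes a weighted sum of the previous layer's outputs and pushes it through its activation function.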
Neural Network: Algorithms

In a Neural Network, the learning (or training) process is initiated by dividing the data into three different sets:

- Training dataset: This dataset allows the Neural Network to learn the weights between nodes.
- Validation dataset: This dataset is used for fine-tuning the performance of the Neural Network.
- Test dataset: This dataset is used to determine the accuracy and margin of error of the Neural Network.

Once the data is segmented into these three parts, Neural Network algorithms are applied to them to train the network. The procedure used to facilitate the training process in a Neural Network is known as optimization, and the algorithm used is called the optimizer. There are different types of optimization algorithms, each with unique characteristics and aspects such as memory requirements, numerical precision, and processing speed.

Before we dive into the discussion of the different Neural Network algorithms, let's understand the learning problem first.

Also read: Neural Network Applications in the Real World

What is the Learning Problem?

We represent the learning problem in terms of the minimization of a loss index f. Here, f is the function that measures the performance of a Neural Network on a given dataset. Generally, the loss index consists of an error term and a regularization term. While the error term evaluates how well a Neural Network fits the dataset, the regularization term helps prevent overfitting by controlling the effective complexity of the network.

The loss function f(w) depends on the adaptive parameters of the Neural Network, its weights and biases. These parameters can be grouped into a single n-dimensional weight vector w.

[Figure: the loss function plotted over the weight space; its minimum occurs at the point w*.]

According to this diagram, the minimum of the loss function occurs at the point w*. At any point, you can calculate the first and second derivatives of the loss function. The first derivatives are grouped in the gradient vector, whose components are:

∇ᵢf(w) = ∂f/∂wᵢ, for i = 1, …, n.

The second derivatives of the loss function are grouped in the Hessian matrix:

Hᵢ,ⱼf(w) = ∂²f/∂wᵢ∂wⱼ, for i, j = 1, …, n.

Now that we know what the learning problem is, we can discuss the five main Neural Network algorithms.

1. One-dimensional optimization

Since the loss function depends on multiple parameters, one-dimensional optimization methods are instrumental in training a Neural Network. Training algorithms first compute a training direction d and then calculate the training rate η that minimizes the loss along that training direction, f(η).

[Figure: the loss f plotted along the training direction; the points η1 and η2 bracket the minimum η*.]

In the diagram, the points η1 and η2 define the interval containing the minimum of f, η*.

Thus, one-dimensional optimization methods aim to find the minimum of a given one-dimensional function. Two of the most commonly used one-dimensional algorithms are the Golden Section Method and Brent's Method.

Golden Section Method

The golden section search algorithm is used to find the minimum or maximum of a single-variable function f(x). If we already know that a function has a minimum between two points, then we can perform an iterative search, just as we would in the bisection search for a root of the equation f(x) = 0. If we can find three points x0 < x1 < x2 in the neighborhood of the minimum such that f(x0) > f(x1) < f(x2), then we can deduce that a minimum exists between x0 and x2. To find this minimum, we can consider another point x3 between x1 and x2, which will give us one of the following outcomes:

- If f(x3) > f(x1), the minimum lies inside the interval x3 − x0, bracketed by the three new points x0 < x1 < x3.
- If f(x3) < f(x1), the minimum lies inside the interval x2 − x1, bracketed by the three new points x1 < x3 < x2.
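Since the golden section method is easy to state but fiddly to get right, here is a minimal sketch in Python. It assumes, as the text above does, that f is unimodal (has a single minimum) on the starting interval; the function and interval in the usage line are made up for illustration.

```python
import math

def golden_section_minimize(f, a, b, tol=1e-6):
    """Golden-section search for the minimum of a single-variable
    function f assumed to be unimodal on the interval [a, b]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0          # 1/phi, about 0.618
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)                             # cached probe values
    while b - a > tol:
        if fc < fd:                  # minimum must lie in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                        # minimum must lie in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0

# Example: f(x) = (x - 2)^2 has its minimum at x = 2.
print(golden_section_minimize(lambda x: (x - 2.0) ** 2, 0.0, 5.0))
```

Each iteration shrinks the bracketing interval by the constant golden-ratio factor, which is exactly the three-point bracketing logic described above.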
Brent's Method

Brent's method is a root-finding algorithm that combines root bracketing, bisection, and inverse quadratic interpolation. (In the Wolfram Language, it is available as the method Brent in FindRoot[eqn, x, x0, x1].)

In Brent's method, we use a Lagrange interpolating polynomial of degree 2. In 1973, Brent claimed that this method will always converge, provided the values of the function are computable within a specific region, including a root. If there are three points x1, x2, and x3, Brent's method fits x as a quadratic function of y, using the interpolation formula:

x = [y − f(x1)][y − f(x2)]·x3 / {[f(x3) − f(x1)][f(x3) − f(x2)]} + [y − f(x2)][y − f(x3)]·x1 / {[f(x1) − f(x2)][f(x1) − f(x3)]} + [y − f(x3)][y − f(x1)]·x2 / {[f(x2) − f(x3)][f(x2) − f(x1)]}

The subsequent root estimates are obtained by setting y = 0, which produces the following equation:

x = x2 + P/Q

Here, P = S[T(R − T)(x3 − x2) − (1 − R)(x2 − x1)] and Q = (T − 1)(R − 1)(S − 1), with

R = f(x2)/f(x3), S = f(x2)/f(x1), and T = f(x1)/f(x3).

2. Multidimensional optimization

By now, we already know that the learning problem for Neural Networks aims to find the parameter vector w* for which the loss function f takes a minimum value. According to the standard necessary condition, if the Neural Network is at a minimum of the loss function, the gradient is the zero vector.

Since the loss function is a non-linear function of the parameters, it is impossible to find closed-form training algorithms for the minimum. However, if we search through the parameter space in a series of steps, then at each step the loss can be reduced by adjusting the parameters of the Neural Network.

In multidimensional optimization, a Neural Network is trained by choosing a random initial parameter vector and then generating a sequence of parameters to ensure that the loss function decreases with each iteration of the algorithm. This variation of loss between two subsequent steps is known as the "loss decrement". The process of loss decrement continues until the training algorithm reaches or satisfies a specified stopping condition.

Here are three examples of multidimensional optimization algorithms:

Gradient descent

The gradient descent algorithm is probably the simplest of all training algorithms. As it relies on the information provided by the gradient vector, it is a first-order method. In this method, we write f[w(i)] = f(i) and ∇f[w(i)] = g(i). The starting point of this training algorithm is w(0), and it keeps progressing until the specified criterion is satisfied: it moves from w(i) to w(i+1) in the training direction d(i) = −g(i). Hence, gradient descent iterates as follows:

w(i+1) = w(i) − g(i)·η(i), for i = 0, 1, …

The parameter η represents the training rate. You can set a fixed value for η, or set it to the value found by one-dimensional optimization along the training direction at every step. In general, the optimal value of the training rate obtained by line minimization at each step is preferred.

[Figure: gradient descent zig-zagging towards the minimum of a loss function with a long, narrow valley.]

This algorithm has many limitations, since it requires numerous iterations for functions that have long and narrow valley structures. While the loss function decreases most rapidly in the direction of the downhill gradient, this does not always ensure the fastest convergence.
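As a complement to the description above, here is a minimal sketch of gradient descent with a fixed training rate. The quadratic loss and its gradient are toy assumptions used only to show the update rule w(i+1) = w(i) − g(i)·η.

```python
import numpy as np

def gradient_descent(grad, w0, eta=0.1, n_iters=100):
    """Plain gradient descent with a fixed training rate eta:
    w(i+1) = w(i) - eta * g(i)."""
    w = np.asarray(w0, dtype=float)
    for _ in range(n_iters):
        w = w - eta * grad(w)    # step along the direction d(i) = -g(i)
    return w

# Toy loss f(w) = ||w||^2, whose gradient is g(w) = 2w; minimum at w* = 0.
grad = lambda w: 2.0 * w
print(gradient_descent(grad, w0=[3.0, -4.0]))   # converges towards [0, 0]
```

A line-minimization variant would replace the fixed eta with a one-dimensional search (for example, the golden section method above) along the direction −g(i) at every step.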
Newton's method

This is a second-order algorithm, as it leverages the Hessian matrix. Newton's method aims to find better training directions by making use of the second derivatives of the loss function. Here, we denote f[w(i)] = f(i), ∇f[w(i)] = g(i), and Hf[w(i)] = H(i). Now, consider the quadratic approximation of f at w(0) using the Taylor series expansion:

f = f(0) + g(0)·[w − w(0)] + 0.5·[w − w(0)]²·H(0)

Here, H(0) is the Hessian matrix of f calculated at the point w(0). By setting g = 0 for the minimum of f(w), we get the following equation:

g = g(0) + H(0)·(w − w(0)) = 0

As a result, starting from the parameter vector w(0), Newton's method iterates as follows:

w(i+1) = w(i) − H(i)⁻¹·g(i), for i = 0, 1, …

The vector H(i)⁻¹·g(i) is referred to as "Newton's step". You must remember that the parameter change may move towards a maximum instead of going in the direction of a minimum. This usually happens if the Hessian matrix is not positive definite, in which case the function evaluation is not guaranteed to be reduced at each iteration. To avoid this issue, the method's equation is usually modified as follows:

w(i+1) = w(i) − (H(i)⁻¹·g(i))·η, for i = 0, 1, …

You can either set the training rate η to a fixed value or to the value obtained via line minimization. The vector d(i) = H(i)⁻¹·g(i) thus becomes the training direction for Newton's method.

[Figure: Newton's method taking a more direct path to the minimum than gradient descent.]

The major drawback of Newton's method is that the exact evaluation of the Hessian and its inverse are computationally expensive.

Conjugate gradient

The conjugate gradient method falls between gradient descent and Newton's method. It is an intermediate algorithm: while it aims to accelerate the slow convergence of the gradient descent method, it also eliminates the need for the evaluation, storage, and inversion of the Hessian matrix usually required by Newton's method.

The conjugate gradient training algorithm performs its search along conjugate directions, which deliver faster convergence than gradient descent directions. These training directions are conjugate with respect to the Hessian matrix. Here, d denotes the training direction vector. If we start with an initial parameter vector w(0) and an initial training direction vector d(0) = −g(0), the conjugate gradient method generates a sequence of training directions represented as:

d(i+1) = −g(i+1) + d(i)·γ(i), for i = 0, 1, …

Here, γ is the conjugate parameter, and the training direction for all conjugate gradient algorithms is periodically reset to the negative of the gradient. The parameters are then improved, with the training rate η obtained via line minimization, according to the expression shown below:

w(i+1) = w(i) + d(i)·η(i), for i = 0, 1, …

Conclusion

Each algorithm comes with unique advantages and drawbacks. These are only a few of the algorithms used to train Neural Networks, and their functions only demonstrate the tip of the iceberg: as Deep Learning frameworks advance, so will the functionalities of these algorithms.

If you're interested in learning more about neural networks, machine learning programs & AI, check out IIIT-B & upGrad's Executive PG Programme in Machine Learning & AI, which is designed for working professionals and offers 450+ hours of rigorous training, 30+ case studies & assignments, IIIT-B alumni status, 5+ practical hands-on capstone projects & job assistance with top firms.

What is a neural network?

Neural Networks are multi-input, single-output systems made up of artificial neurons. A Neural Network's principal function is to convert input into meaningful output. A Neural Network usually has an input and an output layer, as well as one or more hidden layers. All of the neurons in a Neural Network influence each other, so they are all connected. The network can recognize and observe every facet of the dataset in question, as well as how the various pieces of data may or may not be related to one another. This is how Neural Networks can detect incredibly complicated patterns in massive amounts of data.

What is the difference between feedback and feedforward networks?

The signals in a feedforward model move in only one direction, towards the output layer. Feedforward networks have one input layer and one output layer, with zero or more hidden layers. Pattern recognition makes extensive use of them. The recurrent or interactive networks in the feedback model process a series of inputs using their internal state (memory). Signals can move in both directions through the network's loops (hidden layers). They are commonly used in tasks that require a succession of events to happen in a certain order.

What do you mean by the learning problem?

The learning problem is modelled as the minimization of a loss index f. Here, f denotes the function that evaluates a Neural Network's performance on a given dataset. The loss index is made up of two terms: an error term and a regularization term. While the error term analyses how well a Neural Network fits the dataset, the regularization term prevents overfitting by limiting the Neural Network's effective complexity. The Neural Network's adaptive parameters, its weights and biases, determine the loss function f(w). These parameters can be bundled together into a single n-dimensional weight vector w.
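To tie this last answer back to code, here is a minimal sketch of a loss index f(w) with an error term and a regularization term, where all parameters are flattened into a single n-dimensional vector w. The linear model, the mean-squared error, the L2 penalty, and the toy data are all illustrative assumptions, not the article's prescription.

```python
import numpy as np

def loss_index(w, X, y, alpha=0.01):
    """Loss index f(w) = error term + regularization term for a single
    linear neuron whose weights and bias are packed into one vector w."""
    weights, bias = w[:-1], w[-1]
    preds = X @ weights + bias
    error = np.mean((preds - y) ** 2)         # how well the model fits the data
    regularization = alpha * np.sum(w ** 2)   # controls effective complexity
    return error + regularization

# Hypothetical toy dataset: y is roughly 2*x0 - x1 plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 2 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=50)
print(loss_index(np.zeros(3), X, y))   # loss at the all-zero weight vector
```

Any of the optimizers discussed above (gradient descent, Newton's method, the conjugate gradient method) can then be pointed at this single function of w.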


