The Components of a Neural Network | by Dhruva Krishna


Published in Towards Data Science

A summary of the key parts that build up one of the most commonly used Deep Learning methods.

This article is a continuation of a series I am writing on key theoretical concepts in Machine Learning. In addition to an introduction to ML, I have written articles on Classification and Regression that can be accessed on my page.

[Photo by Markus Spiske on Unsplash]

Introduction

Neural Networks are the poster boy of Deep Learning, a branch of Machine Learning characterised by its use of a large number of interwoven computations. The individual computations themselves are relatively straightforward, but it is the complexity of the connections that gives them their advanced analytic ability.

The Neuron

The building block of a neural network is the single neuron. The diagram below shows the structure of a neuron with one input.

[Image by Author]

The input to the neuron is x, which has a weight w associated with it. The weight is the intrinsic parameter, the parameter the model has control over in order to get a better fit for the output. When we pass an input into a neuron, we multiply it by its weight, giving us x * w.

The second element of the input is called the bias. The bias is determined solely by the value b, since the value of its node is fixed at 1. The bias shifts the neuron's output independently of the input, which gives our model the flexibility to adapt to different unseen inputs when using testing data.

The combination of the bias and input produces our output y, giving us the formula w * x + b = y. This should look familiar as a modification of the equation of a straight line, y = mx + c. Neural Networks are made up of tens, hundreds or even thousands of interconnected neurons, each of which runs its own regression. It's essentially a regression on steroids.

Multiple Inputs

Naturally, we will not be able to analyse most datasets we come across in the real world using a regression as simple as the diagram above. We will expect to see many more inputs that are combined to estimate the output. This is achieved in a similar way to the neuron with one input.

[Image by Author]

The formula for the diagram above reads x0 * w0 + x1 * w1 + x2 * w2 + b = y.

Layers

Neural networks organise neurons into layers. A layer in which every neuron is connected to every neuron in the next layer is called a dense layer.

[Image by Author]

Through this increasing complexity, neural networks are able to transform data and infer relationships in a variety of complex ways. As we add more layers and nodes to our network, this complexity increases.

Activation Function

Currently our model is only good for predicting linear relationships in our data. In the previous diagram, there is no benefit to running this neural network as opposed to a series of regressions. Neural Networks provide a solution to this in two ways.

The first is the ability to add more layers to our network between the input and output, known as hidden layers. Each of these hidden layers has a predefined number of nodes, and this added complexity starts to separate the neural network from its regression counterpart.

The second way that Neural Networks add complexity is through the introduction of an activation function at every node that isn't an input or output. If you're unfamiliar with the term, I would definitely check out a previous article I wrote on Linear Classification, which looks at activation functions in far more depth. To summarise from there: an activation function is a function that transforms our input data in a nonlinear way. Sigmoid and ReLU are the most commonly used activation functions.

[Image by Author]

The fact that both of these functions are nonlinear means that we add another element of adaptability to our model, because it can now predict classes that do not have linear decision boundaries, or approximate nonlinear functions. In the simplest of terms, without an activation function, neural networks can only learn linear relationships. Fitting an object as simple as an x² curve would not be possible without the introduction of an activation function.

So the role of a neuron in a hidden layer is to take the sum of the products of the inputs and their weights and pass this value into an activation function. This will then be the value passed as the input to the next neuron, be it another hidden neuron or the output.

[Image by Author]

Optimising Weights

When a Neural Network is initialised, its weights are randomly assigned. The power of the neural network comes from its access to a huge amount of control over the data, through the adjusting of these weights. The network iteratively adjusts weights and measures performance, continuing this procedure until the predictions are sufficiently accurate or another stopping criterion is reached.

The accuracy of our predictions is determined by a loss function. Also known as a cost function, this function compares the model output with the actual outputs and determines how badly our model estimates our dataset. Essentially, we provide the model a function that it aims to minimise, and it does this through the incremental tweaking of weights. A common loss function is Mean Absolute Error (MAE). This measures the mean of the absolute vertical differences between the estimates and their actual values.

[Image by Author]

The job of finding the best set of weights is conducted by the optimiser. In neural networks, the optimisation method used is stochastic gradient descent. Every epoch, the stochastic gradient descent algorithm repeats a certain set of steps in order to find the best weights:

1. Start with some initial value for the weights
2. Keep updating the weights in the direction that reduces the cost function
3. Stop when we have reached the minimum error on our dataset

Gradient Descent requires a differentiable loss function, because when we come to finding the minimum value, we do this by calculating the gradient at our current position and then deciding which direction to move to reach a gradient of 0. We know that the point at which the gradient of our error function equals 0 is the minimum point on the curve, as the diagrams below show.

[Images by Author]

The update we iterate over, step 2 of our gradient descent algorithm, takes our current weight and subtracts from it the derivative of the cost function multiplied by what is called a learning rate, the size of which determines how quickly we converge to, or diverge from, the minimum value. I explain the process of gradient descent in greater detail in my article on Linear Regression.

Over and Underfitting

Overfitting and Underfitting are two of the most important concepts in machine learning, because they can help give you an idea of whether your ML algorithm is capable of its true purpose: being unleashed on the world and encountering new, unseen data. Overfitting is defined as the situation where the accuracy on your training data is greater than the accuracy on your testing data. Underfitting is generally defined as poor performance on both the training and testing side.

So what do these two actually tell us about our model? Well, in the case of overfitting, we can essentially infer that our model does not generalise well to unseen data. It has taken the training data and, instead of finding the complex, sophisticated relationships we are looking for, it has built a rigid framework based on the observed behaviour, taking the training data as gospel. This model doesn't have any predictive power, because it has attached itself too strongly to the initial data it was provided, instead of trying to generalise and adapt to slightly different datasets.

In the case of underfitting, we find the opposite: our model has not attached itself to the data at all. Similar to before, the model has been unable to find strong relationships, but in this case it has generated loose rules that provide crude estimations of the data rather than anything concrete. An underfit model will therefore also perform poorly on training data, because of its lack of understanding of the relationships between the variables.

Avoiding underfitting is generally more straightforward than its counterpart, because the general belief is that an underfit model is one that isn't complex enough. We can avoid underfitting by adding layers, neurons or features to our model, or by increasing the training time.

Some of the methods used to avoid overfitting are simply the direct opposites of those for avoiding underfitting. We can remove some features, particularly those that are correlated with others already present in the dataset or that have very little correlation with our output. Stopping the model earlier also ensures that we capture a more general model, instead of allowing it to over-analyse our data.

In some cases, overfitting may occur due to a model's over-reliance on a certain set of weights, or path, in our neural network. The model may have found, during training, that a certain set of weights in a section of the network provides a very strong correlation with the output, but this is more a coincidence than the discovery of an actual relationship. If this occurs, then when presented with testing data, the model will not be able to deliver the same level of accuracy. Our solution here is to introduce the concept of dropout. The idea behind dropout is to exclude a random section of the network at every step of our training process. This helps us generate weights that are more even across the entire network, and ensures that our model is not too reliant on any one subsection.

That's your summary of the components of Neural Networks. I'm looking to go through a lot more concepts in more detail in further articles, so keep an eye out for those! Evaluating models is up next. If you're interested in any of my previous articles, give my page a follow as well. Until then, ✌️.
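As a footnote, the multi-input neuron from the Multiple Inputs section can be sketched in a few lines of plain Python. The function name and variable names here are illustrative, not from any particular library.

```python
def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias: x0*w0 + x1*w1 + ... + b."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

# A neuron with three inputs, matching the formula x0*w0 + x1*w1 + x2*w2 + b = y
y = neuron_output([1.0, 2.0, 3.0], weights=[0.5, -1.0, 0.25], bias=2.0)
print(y)  # 0.5 - 2.0 + 0.75 + 2.0 = 1.25
```

With one input and no activation, this is exactly the straight-line formula w * x + b = y from The Neuron section.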
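The two activation functions named in the Activation Function section, and the role of a hidden neuron (weighted sum, then activation), can be sketched like this; `hidden_neuron` is an illustrative name, not a library API.

```python
import math

def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Passes positive values through unchanged; clamps negatives to 0."""
    return max(0.0, z)

def hidden_neuron(inputs, weights, bias, activation=relu):
    """A hidden-layer neuron: weighted sum plus bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)

print(sigmoid(0.0))           # 0.5
print(relu(-3.0), relu(3.0))  # 0.0 3.0
print(hidden_neuron([1.0, 2.0], [0.5, -1.0], 0.25))  # relu(-1.25) = 0.0
```

Because sigmoid and ReLU bend their inputs nonlinearly, stacking such neurons lets the network represent curves and nonlinear decision boundaries rather than only straight lines.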
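The Mean Absolute Error loss from the Optimising Weights section is simple enough to write out directly, as a minimal sketch:

```python
def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute vertical differences between estimates and actuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Errors are 0.5, 0.0 and 1.5, so the MAE is 2.0 / 3
print(mean_absolute_error([3.0, 5.0, 2.5], [2.5, 5.0, 4.0]))
```

This is the number the optimiser tries to drive down as it tweaks the weights.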
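The three gradient-descent steps listed above can be sketched for a single neuron, y = w*x + b. For simplicity this sketch uses mean squared error rather than MAE, because its derivative is easy to write down; the function name and hyperparameter values are illustrative.

```python
def gradient_descent_fit(xs, ys, lr=0.1, epochs=300):
    """Fit y = w*x + b by repeatedly stepping the weights against the loss gradient."""
    w, b = 0.0, 0.0  # step 1: start with some initial value for the weights
    n = len(xs)
    for _ in range(epochs):  # step 2: keep updating weights to reduce the cost
        # Derivatives of mean squared error with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # current weight minus (learning rate * differentiated cost function)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b  # step 3: stop after a fixed number of epochs

# Data generated from y = 2x + 1; the fit should land very close to w=2, b=1
w, b = gradient_descent_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(round(w, 3), round(b, 3))
```

Note how the learning rate `lr` scales each step: too small and convergence is slow, too large and the updates overshoot the minimum and diverge.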
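Finally, the dropout idea from the last section can be sketched as a random mask over a layer's activations. This sketch uses the common "inverted dropout" convention of scaling the surviving values up, which is one standard way of implementing it rather than the only one.

```python
import random

def apply_dropout(activations, drop_rate=0.5, rng=random):
    """Zero out a random subset of activations so no single path dominates.

    Surviving values are scaled by 1/keep ("inverted dropout") so the
    expected magnitude of the layer's output is unchanged during training.
    """
    keep = 1.0 - drop_rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

random.seed(0)
layer_output = [0.2, 1.5, -0.7, 0.9]
print(apply_dropout(layer_output))  # roughly half the activations become 0.0
```

At test time dropout is switched off and the full network is used, which is why the training-time scaling matters.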


