A Beginner's Guide to Neural Networks and Deep Learning
Contents

- Neural Network Definition
- A Few Concrete Examples
- Neural Network Elements
- Key Concepts of Deep Neural Networks
- Example: Feedforward Networks & Backpropagation
- Multiple Linear Regression
- Gradient Descent
- Logistic Regression & Classifiers
- Neural Networks & Artificial Intelligence
- Updaters
- Custom Layers, Activation Functions and Loss Functions

Neural Network Definition

Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns. They interpret sensory data through a kind of machine perception, labeling or clustering raw input. The patterns they recognize are numerical, contained in vectors, into which all real-world data, be it images, sound, text or time series, must be translated.

Neural networks help us cluster and classify. You can think of them as a clustering and classification layer on top of the data you store and manage. They help to group unlabeled data according to similarities among the example inputs, and they classify data when they have a labeled dataset to train on. (Neural networks can also extract features that are fed to other algorithms for clustering and classification; so you can think of deep neural networks as components of larger machine-learning applications involving algorithms for reinforcement learning, classification and regression.)

What kind of problems does deep learning solve, and more importantly, can it solve yours? To know the answer, you need to ask a few questions:

- What outcomes do I care about? In a classification problem, those outcomes are labels that could be applied to data: for example, spam or not_spam in an email filter, good_guy or bad_guy in fraud detection, angry_customer or happy_customer in customer relationship management. Other types of problems include anomaly detection (useful in fraud detection and predictive maintenance of manufacturing equipment), and clustering, which is useful in recommendation systems that surface similarities.
- Do I have the right data? For example, if you have a classification problem, you'll need labeled data. Is the dataset you need publicly available, or can you create it (with a data annotation service like Scale or AWS Mechanical Turk)? In this example, spam emails would be labeled as spam, and the labels would enable the algorithm to map from inputs to the classifications you care about. You can't know that you have the right data until you get your hands on it. If you are a data scientist working on a problem, you can't trust anyone to tell you whether the data is good enough. Only direct exploration of the data will answer this question.

A Few Concrete Examples

Deep learning maps inputs to outputs. It finds correlations. It is known as a "universal approximator", because it can learn to approximate an unknown function f(x) = y between any input x and any output y, assuming they are related at all (by correlation or causation, for example). In the process of learning, a neural network finds the right f, or the correct manner of transforming x into y, whether that be f(x) = 3x + 12 or f(x) = 9x - 0.1. Here are a few examples of what deep learning can do.

Classification

All classification tasks depend upon labeled datasets; that is, humans must transfer their knowledge to the dataset in order for a neural network to learn the correlation between labels and data. This is known as supervised learning.

- Detect faces, identify people in images, recognize facial expressions (angry, joyful)
- Identify objects in images (stop signs, pedestrians, lane markers…)
- Recognize gestures in video
- Detect voices, identify speakers, transcribe speech to text, recognize sentiment in voices
- Classify text as spam (in emails), or fraudulent (in insurance claims); recognize sentiment in text (customer feedback)

Any labels that humans can generate, any outcomes that you care about and which correlate to data, can be used to train a neural network.
Clustering

Clustering or grouping is the detection of similarities. Deep learning does not require labels to detect similarities. Learning without labels is called unsupervised learning. Unlabeled data is the majority of data in the world. One law of machine learning is: the more data an algorithm can train on, the more accurate it will be. Therefore, unsupervised learning has the potential to produce highly accurate models.

- Search: Comparing documents, images or sounds to surface similar items.
- Anomaly detection: The flip side of detecting similarities is detecting anomalies, or unusual behavior. In many cases, unusual behavior correlates highly with things you want to detect and prevent, such as fraud.

Predictive Analytics: Regressions

With classification, deep learning is able to establish correlations between, say, pixels in an image and the name of a person. You might call this a static prediction. By the same token, exposed to enough of the right data, deep learning is able to establish correlations between present events and future events. It can run regression between the past and the future. The future event is like the label, in a sense. Deep learning doesn't necessarily care about time, or the fact that something hasn't happened yet. Given a time series, deep learning may read a string of numbers and predict the number most likely to occur next.

- Hardware breakdowns (data centers, manufacturing, transport)
- Health breakdowns (strokes, heart attacks based on vital stats and data from wearables)
- Customer churn (predicting the likelihood that a customer will leave, based on web activity and metadata)
- Employee turnover (ditto, but for employees)

The better we can predict, the better we can prevent and pre-empt. As you can see, with neural networks, we're moving towards a world of fewer surprises. Not zero surprises, just marginally fewer. We're also moving toward a world of smarter agents that combine neural networks with other algorithms like reinforcement learning to attain goals.

With that brief overview of deep learning use cases, let's look at what neural nets are made of.

Neural Network Elements

Deep learning is the name we use for "stacked neural networks"; that is, networks composed of several layers.
The layers are made of nodes. A node is just a place where computation happens, loosely patterned on a neuron in the human brain, which fires when it encounters sufficient stimuli. A node combines input from the data with a set of coefficients, or weights, that either amplify or dampen that input, thereby assigning significance to inputs with regard to the task the algorithm is trying to learn; e.g. which input is most helpful in classifying data without error? These input-weight products are summed, and then the sum is passed through a node's so-called activation function, to determine whether and to what extent that signal should progress further through the network to affect the ultimate outcome, say, an act of classification. If the signal passes through, the neuron has been "activated."

Here's a diagram of what one node might look like.

A node layer is a row of those neuron-like switches that turn on or off as the input is fed through the net. Each layer's output is simultaneously the subsequent layer's input, starting from an initial input layer receiving your data.

Pairing the model's adjustable weights with input features is how we assign significance to those features with regard to how the neural network classifies and clusters input.

Key Concepts of Deep Neural Networks

Deep-learning networks are distinguished from the more commonplace single-hidden-layer neural networks by their depth; that is, the number of node layers through which data must pass in a multistep process of pattern recognition.

Earlier versions of neural networks such as the first perceptrons were shallow, composed of one input and one output layer, and at most one hidden layer in between. More than three layers (including input and output) qualifies as "deep" learning. So deep is not just a buzzword to make algorithms seem like they read Sartre and listen to bands you haven't heard of yet. It is a strictly defined term that means more than one hidden layer.

In deep-learning networks, each layer of nodes trains on a distinct set of features based on the previous layer's output. The further you advance into the neural net, the more complex the features your nodes can recognize, since they aggregate and recombine features from the previous layer.
This is known as feature hierarchy, and it is a hierarchy of increasing complexity and abstraction. It makes deep-learning networks capable of handling very large, high-dimensional datasets with billions of parameters that pass through nonlinear functions.

Above all, these neural nets are capable of discovering latent structures within unlabeled, unstructured data, which is the vast majority of data in the world. Another word for unstructured data is raw media; i.e. pictures, texts, video and audio recordings. Therefore, one of the problems deep learning solves best is processing and clustering the world's raw, unlabeled media, discerning similarities and anomalies in data that no human has organized in a relational database or ever put a name to.

For example, deep learning can take a million images, and cluster them according to their similarities: cats in one corner, icebreakers in another, and in a third all the photos of your grandmother. This is the basis of so-called smart photo albums.

Now apply that same idea to other data types: Deep learning might cluster raw text such as emails or news articles. Emails full of angry complaints might cluster in one corner of the vector space, while satisfied customers, or spambot messages, might cluster in others. This is the basis of various messaging filters, and can be used in customer-relationship management (CRM). The same applies to voice messages.

With time series, data might cluster around normal/healthy behavior and anomalous/dangerous behavior. If the time series data is being generated by a smartphone, it will provide insight into users' health and habits; if it is being generated by an auto part, it might be used to prevent catastrophic breakdowns.

Deep-learning networks perform automatic feature extraction without human intervention, unlike most traditional machine-learning algorithms. Given that feature extraction is a task that can take teams of data scientists years to accomplish, deep learning is a way to circumvent the chokepoint of limited experts. It augments the powers of small data science teams, which by their nature do not scale.
When training on unlabeled data, each node layer in a deep network learns features automatically by repeatedly trying to reconstruct the input from which it draws its samples, attempting to minimize the difference between the network's guesses and the probability distribution of the input data itself. Restricted Boltzmann machines, for example, create so-called reconstructions in this manner.

In the process, these neural networks learn to recognize correlations between certain relevant features and optimal results – they draw connections between feature signals and what those features represent, whether it be a full reconstruction, or with labeled data.

A deep-learning network trained on labeled data can then be applied to unstructured data, giving it access to much more input than machine-learning nets. This is a recipe for higher performance: the more data a net can train on, the more accurate it is likely to be. (Bad algorithms trained on lots of data can outperform good algorithms trained on very little.) Deep learning's ability to process and learn from huge quantities of unlabeled data gives it a distinct advantage over previous algorithms.

Deep-learning networks end in an output layer: a logistic, or softmax, classifier that assigns a likelihood to a particular outcome or label. We call that predictive, but it is predictive in a broad sense. Given raw data in the form of an image, a deep-learning network may decide, for example, that the input data is 90 percent likely to represent a person.

Example: Feedforward Networks

Our goal in using a neural net is to arrive at the point of least error as fast as possible. We are running a race, and the race is around a track, so we pass the same points repeatedly in a loop. The starting line for the race is the state in which our weights are initialized, and the finish line is the state of those parameters when they are capable of producing sufficiently accurate classifications and predictions.
The race itself involves many steps, and each of those steps resembles the steps before and after. Just like a runner, we will engage in a repetitive act over and over to arrive at the finish. Each step for a neural network involves a guess, an error measurement and a slight update in its weights, an incremental adjustment to the coefficients, as it slowly learns to pay attention to the most important features.

A collection of weights, whether they are in their start or end state, is also called a model, because it is an attempt to model data's relationship to ground-truth labels, to grasp the data's structure. Models normally start out bad and end up less bad, changing over time as the neural network updates its parameters.

This is because a neural network is born in ignorance. It does not know which weights and biases will translate the input best to make the correct guesses. It has to start out with a guess, and then try to make better guesses sequentially as it learns from its mistakes. (You can think of a neural network as a miniature enactment of the scientific method, testing hypotheses and trying again – only it is the scientific method with a blindfold on. Or like a child: they are born not knowing much, and through exposure to life experience, they slowly learn to solve problems in the world. For neural networks, data is the only experience.)

Here is a simple explanation of what happens during learning with a feedforward neural network, the simplest architecture to explain.

Input enters the network. The coefficients, or weights, map that input to a set of guesses the network makes at the end.

input * weight = guess

Weighted input results in a guess about what that input is. The neural network then takes its guess and compares it to a ground truth about the data, effectively asking an expert "Did I get this right?"

ground truth - guess = error

The difference between the network's guess and the ground truth is its error. The network measures that error, and walks the error back over its model, adjusting weights to the extent that they contributed to the error.
error * weight's contribution to error = adjustment

The three pseudo-mathematical formulas above account for the three key functions of neural networks: scoring input, calculating loss and applying an update to the model – to begin the three-step process over again. A neural network is a corrective feedback loop, rewarding weights that support its correct guesses, and punishing weights that lead it to err.

Let's linger on the first step above.

Multiple Linear Regression

Despite their biologically inspired name, artificial neural networks are nothing more than math and code, like any other machine-learning algorithm. In fact, anyone who understands linear regression, one of the first methods you learn in statistics, can understand how a neural network works. In its simplest form, linear regression is expressed as

Y_hat = bX + a

where Y_hat is the estimated output, X is the input, b is the slope and a is the intercept of a line on the vertical axis of a two-dimensional graph. (To make this more concrete: X could be radiation exposure and Y could be the cancer risk; X could be daily pushups and Y_hat could be the total weight you can bench press; X the amount of fertilizer and Y_hat the size of the crop.) You can imagine that every time you add a unit to X, the dependent variable Y_hat increases proportionally, no matter how far along you are on the X axis. That simple relation between two variables moving up or down together is a starting point.

The next step is to imagine multiple linear regression, where you have many input variables producing an output variable. It's typically expressed like this:

Y_hat = b_1*X_1 + b_2*X_2 + b_3*X_3 + a

(To extend the crop example above, you might add the amount of sunlight and rainfall in a growing season to the fertilizer variable, with all three affecting Y_hat.)

Now, that form of multiple linear regression is happening at every node of a neural network. For each node of a single layer, input from each node of the previous layer is recombined with input from every other node. That is, the inputs are mixed in different proportions, according to their coefficients, which are different leading into each node of the subsequent layer. In this way, a net tests which combination of input is significant as it tries to reduce error.
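The multiple linear regression happening at a single node can be sketched in a few lines of Python. This is a minimal illustration, not any library's API; the function name and the crop-example numbers are invented here:

```python
# One node computing a weighted sum of its inputs plus an intercept (bias):
# Y_hat = b_1*X_1 + b_2*X_2 + b_3*X_3 + a
def node_weighted_sum(inputs, weights, bias):
    return sum(w * x for w, x in zip(weights, inputs)) + bias

# Hypothetical example: fertilizer, sunlight and rainfall feeding a crop-size estimate.
y_hat = node_weighted_sum([2.0, 1.0, 3.0], [0.5, 0.25, 0.1], 0.3)
print(y_hat)  # 0.5*2.0 + 0.25*1.0 + 0.1*3.0 + 0.3 = 1.85
```

Every node in a layer performs this same computation, each with its own set of coefficients.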
Once you sum your node inputs to arrive at Y_hat, it's passed through a non-linear function. Here's why: if every node merely performed multiple linear regression, Y_hat would increase linearly and without limit as the X's increase, but that doesn't suit our purposes.

What we are trying to build at each node is a switch (like a neuron…) that turns on and off, depending on whether or not it should let the signal of the input pass through to affect the ultimate decisions of the network.

When you have a switch, you have a classification problem. Does the input's signal indicate the node should classify it as enough, or not_enough, on or off? A binary decision can be expressed by 1 and 0, and logistic regression is a non-linear function that squashes input to translate it to a space between 0 and 1.

The nonlinear transforms at each node are usually s-shaped functions similar to logistic regression. They go by the names of sigmoid (from the Greek word for "S"), tanh, hard tanh, etc., and they shape the output of each node. The output of all nodes, each squashed into an s-shaped space between 0 and 1, is then passed as input to the next layer in a feedforward neural network, and so on until the signal reaches the final layer of the net, where decisions are made.

Gradient Descent

One commonly used optimization function that adjusts weights according to the error they caused is called "gradient descent."

Gradient is another word for slope, and slope, in its typical form on an x-y graph, represents how two variables relate to each other: rise over run, the change in money over the change in time, etc. In this particular case, the slope we care about describes the relationship between the network's error and a single weight; i.e. how does the error vary as the weight is adjusted?

To put a finer point on it, which weight will produce the least error? Which one correctly represents the signals contained in the input data, and translates them to a correct classification? Which one can hear "nose" in an input image, and know that it should be labeled as a face and not a frying pan?

As a neural network learns, it slowly adjusts many weights so that they can map signal to meaning correctly. The relationship between network Error and each of those weights is a derivative, dE/dw, that measures the degree to which a slight change in a weight causes a slight change in the error.
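The guess / error / adjustment loop, driven by gradient descent on a single weight, can be sketched as a toy Python example. The data, learning rate and variable names here are made up for illustration; a real network repeats this over many weights and many examples:

```python
# Gradient descent on one weight for a one-node "network" y_hat = w * x,
# minimizing squared error E = (y - y_hat)^2, whose slope is dE/dw = -2 * x * (y - y_hat).
x, y = 2.0, 8.0        # one training example; the true relationship is y = 4x
w = 0.0                # the network is born in ignorance: the weight starts at zero
learning_rate = 0.1

for step in range(50):
    y_hat = w * x                          # score input (the guess)
    error = y - y_hat                      # calculate loss (ground truth - guess)
    w += learning_rate * 2 * x * error     # apply update: step w against dE/dw

print(round(w, 4))  # 4.0 -- the weight that produces the least error
```

Each pass nudges the weight in the direction that reduces the error, until the error can't be reduced any more.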
Each weight is just one factor in a deep network that involves many transforms; the signal of the weight passes through activations and sums over several layers, so we use the chain rule of calculus to march back through the network's activations and outputs and finally arrive at the weight in question, and its relationship to overall error.

The chain rule in calculus states that, for a variable z that depends on y, which in turn depends on x,

dz/dx = dz/dy * dy/dx

In a feedforward network, the relationship between the net's error and a single weight will look something like this:

dError/dweight = dError/dactivation * dactivation/dweight

That is, given two variables, Error and weight, that are mediated by a third variable, activation, through which the weight is passed, you can calculate how a change in weight affects a change in Error by first calculating how a change in activation affects a change in Error, and how a change in weight affects a change in activation.

The essence of learning in deep learning is nothing more than that: adjusting a model's weights in response to the error it produces, until you can't reduce the error any more.

Logistic Regression

On a deep neural network of many layers, the final layer has a particular role. When dealing with labeled input, the output layer classifies each example, applying the most likely label. Each node on the output layer represents one label, and that node turns on or off according to the strength of the signal it receives from the previous layer's input and parameters.

Each output node produces two possible outcomes, the binary output values 0 or 1, because an input variable either deserves a label or it does not. After all, there is no such thing as a little pregnant.

While neural networks working with labeled data produce binary output, the input they receive is often continuous. That is, the signals that the network receives as input will span a range of values and include any number of metrics, depending on the problem it seeks to solve.

For example, a recommendation engine has to make a binary decision about whether to serve an ad or not. But the input it bases its decision on could include how much a customer has spent on Amazon in the last week, or how often that customer visits the site.

So the output layer has to condense signals such as $67.59 spent on diapers, and 15 visits to a website, into a range between 0 and 1; i.e. a probability that a given input should be labeled or not.
The mechanism we use to convert continuous signals into binary output is called logistic regression. The name is unfortunate, since logistic regression is used for classification rather than regression in the linear sense that most people are familiar with. It calculates the probability that a set of inputs matches the label.

Let's examine this little formula:

F(x) = 1 / (1 + e^-x)

For continuous inputs to be expressed as probabilities, they must output positive results, since there is no such thing as a negative probability. That's why you see input as the exponent of e in the denominator – because exponents force our results to be greater than zero. Now consider the relationship of e's exponent to the fraction 1/1. One, as we know, is the ceiling of a probability, beyond which our results can't go without being absurd. (We're 120% sure of that.)

As the input x that triggers a label grows, the expression e^-x shrinks toward zero, leaving us with the fraction 1/1, or 100%, which means we approach (without ever quite reaching) absolute certainty that the label applies. Input that correlates negatively with your output will have its value flipped by the negative sign on e's exponent, and as that negative signal grows, the quantity e^-x becomes larger, pushing the entire fraction ever closer to zero.

Now imagine that, rather than having x as the exponent, you have the sum of the products of all the weights and their corresponding inputs – the total signal passing through your net. That's what you're feeding into the logistic regression layer at the output layer of a neural network classifier.

With this layer, we can set a decision threshold above which an example is labeled 1, and below which it is not. You can set different thresholds as you prefer – a low threshold will increase the number of false positives, and a higher one will increase the number of false negatives – depending on which side you would like to err on.
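A minimal Python sketch of that output layer, using the spend and visit figures from the example above; the weights, bias and threshold are invented for illustration, not learned values:

```python
import math

def logistic(z):
    # squashes any real-valued signal into the range (0, 1): 1 / (1 + e^-z)
    return 1.0 / (1.0 + math.exp(-z))

# Condense continuous signals -- dollars spent and site visits -- into a probability.
spend, visits = 67.59, 15
weights = (0.01, 0.05)   # hypothetical coefficients a trained net might hold
bias = -1.0
z = weights[0] * spend + weights[1] * visits + bias  # total signal through the net
probability = logistic(z)

threshold = 0.5          # decision threshold: label 1 above it, 0 below it
label = 1 if probability > threshold else 0
```

Raising the threshold trades false positives for false negatives, exactly as described above.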
Neural Networks & Artificial Intelligence

In some circles, neural networks are synonymous with AI. In others, they are thought of as a "brute force" technique, characterized by a lack of intelligence, because they start with a blank slate, and they hammer their way through to an accurate model. By this interpretation, neural networks are effective, but inefficient in their approach to modeling, since they don't make assumptions about functional dependencies between output and input.

For what it's worth, the foremost AI research groups are pushing the edge of the discipline by training larger and larger neural networks. Brute force works. It is a necessary, if not sufficient, condition for AI breakthroughs. OpenAI's pursuit of more general AI emphasizes a brute-force approach, which has proven effective with well-known models such as GPT-3.

Algorithms such as Hinton's capsule networks require far fewer instances of data to converge on an accurate model; that is, present research has the potential to resolve the brute-force inefficiencies of deep learning.

While neural networks are useful as function approximators, mapping inputs to outputs in many tasks of perception, to achieve a more general intelligence they can be combined with other AI methods to perform more complex tasks. For example, deep reinforcement learning embeds neural networks within a reinforcement learning framework, where they map actions to rewards in order to achieve goals. DeepMind's victories in video games and the board game of Go are good examples.
Further Reading

- Reinforcement Learning and Neural Networks
- Recurrent Neural Networks (RNNs) and LSTMs
- Word2vec and Neural Word Embeddings
- Convolutional Neural Networks (CNNs) and Image Processing
- Accuracy, Precision and Recall
- Attention Mechanisms and Transformers
- Eigenvectors, Eigenvalues, PCA, Covariance and Entropy
- Graph Analytics and Deep Learning
- Symbolic Reasoning and Machine Learning
- Markov Chain Monte Carlo, AI and Markov Blankets
- Generative Adversarial Networks (GANs)
- AI vs Machine Learning vs Deep Learning
- Multilayer Perceptrons (MLPs)
- Simulations, Optimization and AI
- A Recipe for Training Neural Networks, by Andrej Karpathy

Optimization Algorithms

Some examples of optimization algorithms include:

- ADADELTA
- ADAGRAD
- ADAM
- NESTEROVS
- NONE
- RMSPROP
- SGD
- CONJUGATE GRADIENT
- HESSIAN FREE
- LBFGS
- LINE GRADIENT DESCENT

Activation Functions

The activation function determines the output that a node will generate, based upon its input. Some examples include:

- CUBE
- ELU
- HARD SIGMOID
- HARD TANH
- IDENTITY
- LEAKY RELU
- RATIONAL TANH
- RELU
- RRELU
- SIGMOID
- SOFTMAX
- SOFTPLUS
- SOFTSIGN
- TANH

Chris Nicholson is the CEO of Pathmind. He previously led communications and recruiting at the Sequoia-backed robo-advisor FutureAdvisor, which was acquired by BlackRock. In a prior life, Chris spent a decade reporting on tech and finance for The New York Times, Businessweek and Bloomberg, among others.