Types of artificial neural networks - Wikipedia
Classification of Artificial Neural Networks (ANNs)

There are many types of artificial neural networks (ANN).

Artificial neural networks are computational models inspired by biological neural networks, and are used to approximate functions that are generally unknown. Particularly, they are inspired by the behaviour of neurons and the electrical signals they convey between input (such as from the eyes or nerve endings in the hand), processing, and output from the brain (such as reacting to light, touch, or heat). The way neurons semantically communicate is an area of ongoing research.[1][2][3][4] Most artificial neural networks bear only some resemblance to their more complex biological counterparts, but are very effective at their intended tasks (e.g. classification or segmentation).

Some artificial neural networks are adaptive systems and are used for example to model populations and environments, which constantly change.

Neural networks can be hardware-based (neurons are represented by physical components) or software-based (computer models), and can use a variety of topologies and learning algorithms.

Feedforward
Main article: Feedforward neural network
The feedforward neural network was the first and simplest type. In this network the information moves only from the input layer directly through any hidden layers to the output layer without cycles/loops. Feedforward networks can be constructed with various types of units, such as binary McCulloch–Pitts neurons, the simplest of which is the perceptron. Continuous neurons, frequently with sigmoidal activation, are used in the context of backpropagation.
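The forward pass of such a unit is just a weighted sum followed by an activation. As a rough illustration (not from the article), a minimal sketch of a single perceptron with a step activation and the classic mistake-driven update rule, in Python/NumPy:

```python
import numpy as np

def perceptron_train(X, y, epochs=20, lr=0.1):
    """Classic perceptron rule on binary labels y in {0, 1}.
    X: (n_samples, n_features). Returns weights and bias."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1.0 if xi @ w + b > 0 else 0.0   # step activation
            w += lr * (yi - pred) * xi               # update only on mistakes
            b += lr * (yi - pred)
    return w, b

# Toy example: learn logical AND
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)
w, b = perceptron_train(X, y)
print([(1.0 if xi @ w + b > 0 else 0.0) for xi in X])  # expect [0, 0, 0, 1]
```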
Group method of data handling
Main article: Group method of data handling
The Group Method of Data Handling (GMDH)[5] features fully automatic structural and parametric model optimization. The node activation functions are Kolmogorov–Gabor polynomials that permit additions and multiplications. It uses a deep multilayer perceptron with eight layers.[6] It is a supervised learning network that grows layer by layer, where each layer is trained by regression analysis. Useless items are detected using a validation set and pruned through regularization. The size and depth of the resulting network depend on the task.[7]

Autoencoder
Main article: Autoencoder
An autoencoder, autoassociator or Diabolo network[8]: 19  is similar to the multilayer perceptron (MLP), with an input layer, an output layer and one or more hidden layers connecting them. However, the output layer has the same number of units as the input layer. Its purpose is to reconstruct its own inputs (instead of emitting a target value). Therefore, autoencoders are unsupervised learning models. An autoencoder is used for unsupervised learning of efficient codings,[9][10] typically for the purpose of dimensionality reduction and for learning generative models of data.[11][12]

Probabilistic
Main article: Probabilistic neural network
A probabilistic neural network (PNN) is a four-layer feedforward neural network. The layers are input, hidden, pattern/summation and output. In the PNN algorithm, the parent probability distribution function (PDF) of each class is approximated by a Parzen window and a non-parametric function. Then, using the PDF of each class, the class probability of a new input is estimated and Bayes' rule is employed to allocate it to the class with the highest posterior probability.[13] It was derived from the Bayesian network[14] and a statistical algorithm called Kernel Fisher discriminant analysis.[15] It is used for classification and pattern recognition.

Time delay
Main article: Time delay neural network
A time delay neural network (TDNN) is a feedforward architecture for sequential data that recognizes features independent of sequence position. In order to achieve time-shift invariance, delays are added to the input so that multiple data points (points in time) are analyzed together.

It usually forms part of a larger pattern recognition system. It has been implemented using a perceptron network whose connection weights were trained with backpropagation (supervised learning).[16]
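The time-shift invariance comes from presenting the network with a window of delayed copies of the signal, so several points in time are seen at once. A rough sketch of building such delayed inputs (the window length and data are illustrative assumptions, not from the article):

```python
import numpy as np

def delayed_inputs(signal, n_delays):
    """Stack each sample with its n_delays previous samples, so a
    feedforward net sees several points in time at once (TDNN-style input)."""
    windows = [signal[i - n_delays:i + 1] for i in range(n_delays, len(signal))]
    return np.array(windows)

signal = np.sin(np.linspace(0, 6.28, 50))
X = delayed_inputs(signal, n_delays=4)   # shape (45, 5): current sample plus 4 delays
print(X.shape)
```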
Convolutional
Main article: Convolutional neural network
A convolutional neural network (CNN, or ConvNet or shift invariant or space invariant) is a class of deep network, composed of one or more convolutional layers with fully connected layers (matching those in typical ANNs) on top.[17][18] It uses tied weights and pooling layers, in particular max-pooling.[19] It is often structured via Fukushima's convolutional architecture.[20] They are variations of multilayer perceptrons that use minimal preprocessing.[21] This architecture allows CNNs to take advantage of the 2D structure of input data.

Its unit connectivity pattern is inspired by the organization of the visual cortex. Units respond to stimuli in a restricted region of space known as the receptive field. Receptive fields partially overlap, over-covering the entire visual field. Unit response can be approximated mathematically by a convolution operation.[22]

CNNs are suitable for processing visual and other two-dimensional data.[23][24] They have shown superior results in both image and speech applications. They can be trained with standard backpropagation. CNNs are easier to train than other regular, deep, feed-forward neural networks and have many fewer parameters to estimate.[25]

Capsule Neural Networks (CapsNet) add structures called capsules to a CNN and reuse output from several capsules to form more stable (with respect to various perturbations) representations.[26]

Examples of applications in computer vision include DeepDream[27] and robot navigation.[28] They have wide applications in image and video recognition, recommender systems[29] and natural language processing.[30]

Deep stacking network
A deep stacking network (DSN)[31] (deep convex network) is based on a hierarchy of blocks of simplified neural network modules. It was introduced in 2011 by Deng and Dong.[32] It formulates the learning as a convex optimization problem with a closed-form solution, emphasizing the mechanism's similarity to stacked generalization.[33] Each DSN block is a simple module that is easy to train by itself in a supervised fashion without backpropagation for the entire blocks.[34]

Each block consists of a simplified multi-layer perceptron (MLP) with a single hidden layer. The hidden layer h has logistic sigmoidal units, and the output layer has linear units. Connections between these layers are represented by weight matrix U; input-to-hidden-layer connections have weight matrix W. Target vectors t form the columns of matrix T, and the input data vectors x form the columns of matrix X. The matrix of hidden units is {\displaystyle {\boldsymbol {H}}=\sigma ({\boldsymbol {W}}^{T}{\boldsymbol {X}})}. Modules are trained in order, so lower-layer weights W are known at each stage. The function performs the element-wise logistic sigmoid operation. Each block estimates the same final label class y, and its estimate is concatenated with the original input X to form the expanded input for the next block. Thus, the input to the first block contains the original data only, while downstream blocks' input adds the output of preceding blocks. Then learning the upper-layer weight matrix U given other weights in the network can be formulated as a convex optimization problem:

{\displaystyle \min _{U^{T}}f=\|{\boldsymbol {U}}^{T}{\boldsymbol {H}}-{\boldsymbol {T}}\|_{F}^{2},}

which has a closed-form solution.[31]

Unlike other deep architectures, such as DBNs, the goal is not to discover the transformed feature representation. The structure of the hierarchy of this kind of architecture makes parallel learning straightforward, as a batch-mode optimization problem. In purely discriminative tasks, DSNs outperform conventional DBNs.

Tensor deep stacking networks
This architecture is a DSN extension. It offers two important improvements: it uses higher-order information from covariance statistics, and it transforms the non-convex problem of a lower layer to a convex sub-problem of an upper layer.[35] TDSNs use covariance statistics in a bilinear mapping from each of two distinct sets of hidden units in the same layer to predictions, via a third-order tensor.

While parallelization and scalability are not considered seriously in conventional DNNs,[36][37][38] all learning for DSNs and TDSNs is done in batch mode, to allow parallelization.[39][40] Parallelization allows scaling the design to larger (deeper) architectures and data sets.

The basic architecture is suitable for diverse tasks such as classification and regression.
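The convex sub-problem for the DSN block described above is an ordinary least-squares problem, so the upper-layer weights can be obtained in closed form. A minimal sketch of a single block (the random, fixed hidden weights W and the small ridge term are simplifying assumptions for illustration, not the exact published procedure):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_dsn_block(X, T, n_hidden=50, ridge=1e-3, seed=None):
    """One DSN-style block: fixed W, hidden H = sigmoid(W^T X), then solve
    min_U ||U^T H - T||_F^2 in closed form (small ridge added for stability)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[0], n_hidden))           # (d, h)
    H = sigmoid(W.T @ X)                                       # (h, n)
    # Normal equations: U = (H H^T + ridge*I)^{-1} H T^T
    U = np.linalg.solve(H @ H.T + ridge * np.eye(n_hidden), H @ T.T)
    return W, U

# Columns are examples, matching the text: X is (features, samples), T is (classes, samples).
X = np.random.randn(10, 200)
T = np.eye(3)[:, np.random.randint(0, 3, 200)]
W, U = train_dsn_block(X, T, seed=0)
pred = U.T @ sigmoid(W.T @ X)   # block output; would be concatenated with X for the next block
```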
Regulatory feedback
Regulatory feedback networks started as a model to explain brain phenomena found during recognition, including network-wide bursting and difficulty with similarity found universally in sensory recognition. A mechanism to perform optimization during recognition is created using inhibitory feedback connections back to the same inputs that activate them. This reduces requirements during learning and allows learning and updating to be easier while still being able to perform complex recognition.

Radial basis function (RBF)
Main article: Radial basis function network
Radial basis functions are functions that have a distance criterion with respect to a center. Radial basis functions have been applied as a replacement for the sigmoidal hidden layer transfer characteristic in multi-layer perceptrons. RBF networks have two layers: in the first, input is mapped onto each RBF in the 'hidden' layer. The RBF chosen is usually a Gaussian. In regression problems the output layer is a linear combination of hidden layer values representing mean predicted output. The interpretation of this output layer value is the same as a regression model in statistics. In classification problems the output layer is typically a sigmoid function of a linear combination of hidden layer values, representing a posterior probability. Performance in both cases is often improved by shrinkage techniques, known as ridge regression in classical statistics. This corresponds to a prior belief in small parameter values (and therefore smooth output functions) in a Bayesian framework.

RBF networks have the advantage of not suffering from local minima in the same way as multi-layer perceptrons. This is because the only parameters that are adjusted in the learning process are the linear mapping from hidden layer to output layer. Linearity ensures that the error surface is quadratic and therefore has a single easily found minimum. In regression problems this can be found in one matrix operation. In classification problems the fixed non-linearity introduced by the sigmoid output function is most efficiently dealt with using iteratively reweighted least squares.

RBF networks have the disadvantage of requiring good coverage of the input space by radial basis functions. RBF centres are determined with reference to the distribution of the input data, but without reference to the prediction task. As a result, representational resources may be wasted on areas of the input space that are irrelevant to the task. A common solution is to associate each data point with its own centre, although this can expand the linear system to be solved in the final layer and requires shrinkage techniques to avoid overfitting.

Associating each input datum with an RBF leads naturally to kernel methods such as support vector machines (SVM) and Gaussian processes (the RBF is the kernel function). All three approaches use a non-linear kernel function to project the input data into a space where the learning problem can be solved using a linear model. Like Gaussian processes, and unlike SVMs, RBF networks are typically trained in a maximum likelihood framework by maximizing the probability (minimizing the error). SVMs avoid overfitting by maximizing a margin instead. SVMs outperform RBF networks in most classification applications. In regression applications they can be competitive when the dimensionality of the input space is relatively small.

How RBF networks work
RBF neural networks are conceptually similar to K-nearest neighbor (k-NN) models. The basic idea is that similar inputs produce similar outputs.

Suppose a training set has two predictor variables, x and y, and the target variable has two categories, positive and negative. Given a new case with predictor values x=6, y=5.1, how is the target variable computed?
The nearest neighbor classification performed for this example depends on how many neighboring points are considered. If 1-NN is used and the closest point is negative, then the new point should be classified as negative. Alternatively, if 9-NN classification is used and the closest 9 points are considered, then the effect of the surrounding 8 positive points may outweigh the closest (negative) point.

An RBF network positions neurons in the space described by the predictor variables (x, y in this example). This space has as many dimensions as predictor variables. The Euclidean distance is computed from the new point to the center of each neuron, and a radial basis function (RBF, also called a kernel function) is applied to the distance to compute the weight (influence) for each neuron. The radial basis function is so named because the radius distance is the argument to the function.

Weight = RBF(distance)

Radial Basis Function
The value for the new point is found by summing the output values of the RBF functions multiplied by weights computed for each neuron.

The radial basis function for a neuron has a center and a radius (also called a spread). The radius may be different for each neuron, and, in RBF networks generated by DTREG, the radius may be different in each dimension.

With larger spread, neurons at a distance from a point have a greater influence.

Architecture
RBF networks have three layers:

Input layer: One neuron appears in the input layer for each predictor variable. In the case of categorical variables, N-1 neurons are used where N is the number of categories. The input neurons standardize the value ranges by subtracting the median and dividing by the interquartile range. The input neurons then feed the values to each of the neurons in the hidden layer.

Hidden layer: This layer has a variable number of neurons (determined by the training process). Each neuron consists of a radial basis function centered on a point with as many dimensions as predictor variables. The spread (radius) of the RBF function may be different for each dimension. The centers and spreads are determined by training. When presented with the x vector of input values from the input layer, a hidden neuron computes the Euclidean distance of the test case from the neuron's center point and then applies the RBF kernel function to this distance using the spread values. The resulting value is passed to the summation layer.

Summation layer: The value coming out of a neuron in the hidden layer is multiplied by a weight associated with the neuron and added to the weighted values of other neurons. This sum becomes the output. For classification problems, one output is produced (with a separate set of weights and summation unit) for each target category. The value output for a category is the probability that the case being evaluated has that category.

Training
The following parameters are determined by the training process:

The number of neurons in the hidden layer
The coordinates of the center of each hidden-layer RBF function
The radius (spread) of each RBF function in each dimension
The weights applied to the RBF function outputs as they pass to the summation layer

Various methods have been used to train RBF networks. One approach first uses K-means clustering to find cluster centers which are then used as the centers for the RBF functions. However, K-means clustering is computationally intensive and it often does not generate the optimal number of centers. Another approach is to use a random subset of the training points as the centers.

DTREG uses a training algorithm that uses an evolutionary approach to determine the optimal center points and spreads for each neuron. It determines when to stop adding neurons to the network by monitoring the estimated leave-one-out (LOO) error and terminating when the LOO error begins to increase because of overfitting.

The computation of the optimal weights between the neurons in the hidden layer and the summation layer is done using ridge regression. An iterative procedure computes the optimal regularization lambda parameter that minimizes the generalized cross-validation (GCV) error.
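A rough sketch of the forward pass just described (Gaussian kernels, with centers, spreads and summation-layer weights assumed already chosen; they are illustrative values, not the result of the ridge-regression fit above):

```python
import numpy as np

def rbf_predict(x_new, centers, spreads, out_weights):
    """RBF forward pass: weight_i = exp(-||x - c_i||^2 / (2 s_i^2)),
    output = sum_i out_weights[i] * weight_i."""
    d2 = np.sum((centers - x_new) ** 2, axis=1)          # squared Euclidean distances to each center
    activations = np.exp(-d2 / (2.0 * spreads ** 2))     # Gaussian RBF applied to the distance
    return activations @ out_weights                     # summation layer

centers = np.array([[5.0, 5.0], [7.0, 6.0], [2.0, 1.0]])  # one row per hidden neuron
spreads = np.array([1.0, 1.5, 1.0])
out_weights = np.array([0.8, 0.5, -0.3])                   # summation-layer weights
print(rbf_predict(np.array([6.0, 5.1]), centers, spreads, out_weights))
```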
General regression neural network
Main article: General regression neural network
A GRNN is an associative memory neural network that is similar to the probabilistic neural network but is used for regression and approximation rather than classification.

Deep belief network
Main article: Deep belief network
A restricted Boltzmann machine (RBM) with fully connected visible and hidden units. Note there are no hidden-hidden or visible-visible connections.

A deep belief network (DBN) is a probabilistic, generative model made up of multiple hidden layers. It can be considered a composition of simple learning modules.[41]

A DBN can be used to generatively pre-train a deep neural network (DNN) by using the learned DBN weights as the initial DNN weights. Various discriminative algorithms can then tune these weights. This is particularly helpful when training data are limited, because poorly initialized weights can significantly hinder learning. These pre-trained weights end up in a region of the weight space that is closer to the optimal weights than random choices. This allows for both improved modeling and faster ultimate convergence.[42]

Recurrent neural network
Main article: Recurrent neural network
Recurrent neural networks (RNN) propagate data forward, but also backwards, from later processing stages to earlier stages. RNN can be used as general sequence processors.

Fully recurrent
This architecture was developed in the 1980s. Its network creates a directed connection between every pair of units. Each has a time-varying, real-valued (more than just zero or one) activation (output). Each connection has a modifiable real-valued weight. Some of the nodes are called labeled nodes, some output nodes, the rest hidden nodes.

For supervised learning in discrete time settings, training sequences of real-valued input vectors become sequences of activations of the input nodes, one input vector at a time. At each time step, each non-input unit computes its current activation as a nonlinear function of the weighted sum of the activations of all units from which it receives connections. The system can explicitly activate (independent of incoming signals) some output units at certain time steps. For example, if the input sequence is a speech signal corresponding to a spoken digit, the final target output at the end of the sequence may be a label classifying the digit. For each sequence, its error is the sum of the deviations of all activations computed by the network from the corresponding target signals. For a training set of numerous sequences, the total error is the sum of the errors of all individual sequences.

To minimize total error, gradient descent can be used to change each weight in proportion to its derivative with respect to the error, provided the non-linear activation functions are differentiable. The standard method is called "backpropagation through time" or BPTT, a generalization of back-propagation for feedforward networks.[43][44] A more computationally expensive online variant is called "Real-Time Recurrent Learning" or RTRL.[45][46] Unlike BPTT this algorithm is local in time but not local in space.[47][48] An online hybrid between BPTT and RTRL with intermediate complexity exists,[49][50] with variants for continuous time.[51] A major problem with gradient descent for standard RNN architectures is that error gradients vanish exponentially quickly with the size of the time lag between important events.[52][53] The long short-term memory architecture overcomes these problems.[54]

In reinforcement learning settings, no teacher provides target signals. Instead a fitness function, reward function or utility function is occasionally used to evaluate performance, which influences its input stream through output units connected to actuators that affect the environment. Variants of evolutionary computation are often used to optimize the weight matrix.
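To make the discrete-time update and the per-sequence error concrete, here is a minimal sketch; the tanh units, the single recurrent weight matrix and the random data are simplifying assumptions for illustration, not the exact 1980s formulation:

```python
import numpy as np

def rnn_sequence_error(inputs, targets, W_in, W_rec, W_out):
    """Run a simple recurrent net over one sequence and return the summed
    squared deviation of its outputs from the target signals (the
    per-sequence error described above)."""
    h = np.zeros(W_rec.shape[0])
    error = 0.0
    for x_t, y_t in zip(inputs, targets):
        h = np.tanh(W_in @ x_t + W_rec @ h)   # each unit: nonlinearity of a weighted sum
        y_hat = W_out @ h
        error += np.sum((y_hat - y_t) ** 2)   # deviation from the target signal at this step
    return error

rng = np.random.default_rng(0)
W_in = rng.normal(size=(8, 3))
W_rec = rng.normal(size=(8, 8)) * 0.1          # small recurrent weights keep the dynamics stable
W_out = rng.normal(size=(2, 8))
inputs = rng.normal(size=(5, 3))               # 5 time steps, 3 input nodes
targets = rng.normal(size=(5, 2))              # target signals at each step
print(rnn_sequence_error(inputs, targets, W_in, W_rec, W_out))
```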
Hopfield
Main article: Hopfield network
The Hopfield network (like similar attractor-based networks) is of historic interest although it is not a general RNN, as it is not designed to process sequences of patterns. Instead it requires stationary inputs. It is an RNN in which all connections are symmetric. It guarantees that it will converge. If the connections are trained using Hebbian learning the Hopfield network can perform as robust content-addressable memory, resistant to connection alteration. (A minimal Hebbian storage-and-recall sketch appears after the Simple recurrent subsection below.)

Boltzmann machine
Main article: Boltzmann machine
The Boltzmann machine can be thought of as a noisy Hopfield network. It is one of the first neural networks to demonstrate learning of latent variables (hidden units). Boltzmann machine learning was at first slow to simulate, but the contrastive divergence algorithm speeds up training for Boltzmann machines and Products of Experts.

Self-organizing map
Main article: Self-organizing map
The self-organizing map (SOM) uses unsupervised learning. A set of neurons learn to map points in an input space to coordinates in an output space. The input space can have different dimensions and topology from the output space, and SOM attempts to preserve these.

Learning vector quantization
Main article: Learning vector quantization
Learning vector quantization (LVQ) can be interpreted as a neural network architecture. Prototypical representatives of the classes, together with an appropriate distance measure, parameterize a distance-based classification scheme.

Simple recurrent
Simple recurrent networks have three layers, with the addition of a set of "context units" in the input layer. These units connect from the hidden layer or the output layer with a fixed weight of one.[55] At each time step, the input is propagated in a standard feedforward fashion, and then a backpropagation-like learning rule is applied (not performing gradient descent). The fixed back-connections leave a copy of the previous values of the hidden units in the context units (since they propagate over the connections before the learning rule is applied).
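For the Hopfield network above, Hebbian storage and content-addressable recall can be sketched roughly as follows; the binary ±1 patterns and synchronous updates are simplifying assumptions:

```python
import numpy as np

def hopfield_store(patterns):
    """Hebbian learning: W is the sum of outer products of the +/-1 patterns,
    with the diagonal zeroed (no self-connections); W is symmetric."""
    n = patterns.shape[1]
    W = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(W, 0.0)
    return W / n

def hopfield_recall(W, probe, steps=10):
    """Iterate sign(W s) until it settles; acts as content-addressable memory."""
    s = probe.copy()
    for _ in range(steps):
        s = np.where(W @ s >= 0, 1, -1)
    return s

patterns = np.array([[1, -1, 1, -1, 1, -1, 1, -1],
                     [1, 1, 1, 1, -1, -1, -1, -1]])
W = hopfield_store(patterns)
noisy = patterns[0].copy()
noisy[0] *= -1                       # corrupt one bit
print(hopfield_recall(W, noisy))     # recovers patterns[0]
```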
Reservoir computing
Main article: Reservoir computing
Reservoir computing is a computation framework that may be viewed as an extension of neural networks.[56] Typically an input signal is fed into a fixed (random) dynamical system called a reservoir whose dynamics map the input to a higher dimension. A readout mechanism is trained to map the reservoir to the desired output. Training is performed only at the readout stage. Liquid-state machines[57] are a type of reservoir computing.[58]

Echo state
Main article: Echo state network
The echo state network (ESN) employs a sparsely connected random hidden layer. The weights of the output neurons are the only part of the network that are trained. ESN are good at reproducing certain time series.[59]

Long short-term memory
Main article: Long short-term memory
The long short-term memory (LSTM)[54] avoids the vanishing gradient problem. It works even with long delays between inputs and can handle signals that mix low and high frequency components. LSTM RNN outperformed other RNN and other sequence learning methods such as HMM in applications such as language learning[60] and connected handwriting recognition.[61]

Bi-directional
Bi-directional RNN, or BRNN, use a finite sequence to predict or label each element of a sequence based on both the past and future context of the element.[62] This is done by adding the outputs of two RNNs: one processing the sequence from left to right, the other one from right to left. The combined outputs are the predictions of the teacher-given target signals. This technique proved to be especially useful when combined with LSTM.[63]

Hierarchical
Hierarchical RNN connects elements in various ways to decompose hierarchical behavior into useful subprograms.[64][65]

Stochastic
Main article: Artificial neural network § Stochastic neural network
Distinct from conventional neural networks, a stochastic artificial neural network is used as an approximation to random functions.

Genetic scale
An RNN (often an LSTM) where a series is decomposed into a number of scales where every scale informs the primary length between two consecutive points. A first order scale consists of a normal RNN, a second order consists of all points separated by two indices and so on. The Nth order RNN connects the first and last node. The outputs from all the various scales are treated as a committee of machines and the associated scores are used genetically for the next iteration.

Modular
Main article: Modular neural network
Biological studies have shown that the human brain operates as a collection of small networks. This realization gave birth to the concept of modular neural networks, in which several small networks cooperate or compete to solve problems.

Committee of machines
Main article: Committee machine
A committee of machines (CoM) is a collection of different neural networks that together "vote" on a given example. This generally gives a much better result than individual networks. Because neural networks suffer from local minima, starting with the same architecture and training but using randomly different initial weights often gives vastly different results.[citation needed] A CoM tends to stabilize the result.

The CoM is similar to the general machine learning bagging method, except that the necessary variety of machines in the committee is obtained by training from different starting weights rather than training on different randomly selected subsets of the training data.
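A committee of the kind just described is straightforward to sketch: several identical networks trained from different random initial weights, whose outputs are averaged. The tiny one-hidden-layer regressors below are illustrative assumptions, not a prescribed committee member:

```python
import numpy as np

def train_small_net(X, y, seed, epochs=500, lr=0.1):
    """Train a tiny one-hidden-layer regressor from a seed-specific random init."""
    rng = np.random.default_rng(seed)
    W1, W2 = rng.normal(size=(8, X.shape[1])), rng.normal(size=8)
    for _ in range(epochs):
        H = np.tanh(X @ W1.T)                                     # (n, 8) hidden activations
        err = H @ W2 - y                                          # prediction error
        W2 -= lr * H.T @ err / len(y)                             # gradient step on output weights
        W1 -= lr * ((err[:, None] * W2) * (1 - H ** 2)).T @ X / len(y)
    return W1, W2

def committee_predict(members, X):
    """Average ("vote") over the committee members' individual predictions."""
    return np.mean([np.tanh(X @ W1.T) @ W2 for W1, W2 in members], axis=0)

X = np.random.randn(100, 3)
y = np.sin(X[:, 0]) + 0.1 * np.random.randn(100)
members = [train_small_net(X, y, seed=s) for s in range(5)]   # same data, different initial weights
print(committee_predict(members, X[:5]))
```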
Associative
The associative neural network (ASNN) is an extension of committee of machines that combines multiple feedforward neural networks and the k-nearest neighbor technique. It uses the correlation between ensemble responses as a measure of distance amid the analyzed cases for the kNN. This corrects the bias of the neural network ensemble. An associative neural network has a memory that can coincide with the training set. If new data become available, the network instantly improves its predictive ability and provides data approximation (self-learns) without retraining. Another important feature of ASNN is the possibility to interpret neural network results by analysis of correlations between data cases in the space of models.[66]

Physical
Main article: Physical neural network
A physical neural network includes electrically adjustable resistance material to simulate artificial synapses. Examples include the ADALINE memristor-based neural network.[67] An optical neural network is a physical implementation of an artificial neural network with optical components.

Other types

Instantaneously trained
Instantaneously trained neural networks (ITNN) were inspired by the phenomenon of short-term learning that seems to occur instantaneously. In these networks the weights of the hidden and the output layers are mapped directly from the training vector data. Ordinarily, they work on binary data, but versions for continuous data that require small additional processing exist.

Spiking
Spiking neural networks (SNN) explicitly consider the timing of inputs. The network input and output are usually represented as a series of spikes (delta functions or more complex shapes). SNN can process information in the time domain (signals that vary over time). They are often implemented as recurrent networks. SNN are also a form of pulse computer.[68] (A leaky integrate-and-fire sketch follows the Regulatory feedback subsection below.)

Spiking neural networks with axonal conduction delays exhibit polychronization, and hence could have a very large memory capacity.[69]

SNN and the temporal correlations of neural assemblies in such networks have been used to model figure/ground separation and region linking in the visual system.

Regulatory feedback
A regulatory feedback network makes inferences using negative feedback.[70] The feedback is used to find the optimal activation of units. It is most similar to a non-parametric method but is different from K-nearest neighbor in that it mathematically emulates feedforward networks.
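For the spiking networks described above, the basic unit can be illustrated with a leaky integrate-and-fire neuron; the time constants, threshold and input drive below are arbitrary assumptions for the sketch, not values from the article:

```python
import numpy as np

def lif_neuron(input_current, dt=1.0, tau=10.0, v_thresh=1.0, v_reset=0.0):
    """Leaky integrate-and-fire: the membrane potential leaks toward zero,
    integrates the input current, and emits a spike (then resets) when it
    crosses the threshold. Information is carried by spike timing."""
    v, spikes = 0.0, []
    for current in input_current:
        v += dt * (-v / tau + current)
        if v >= v_thresh:
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
    return np.array(spikes)

rng = np.random.default_rng(1)
drive = 0.15 + 0.05 * rng.standard_normal(100)   # noisy constant input current
print(lif_neuron(drive))                          # the output spike train
```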
Neocognitron
The neocognitron is a hierarchical, multilayered network that was modeled after the visual cortex. It uses multiple types of units (originally two, called simple and complex cells), as a cascading model for use in pattern recognition tasks.[71][72][73] Local features are extracted by S-cells whose deformation is tolerated by C-cells. Local features in the input are integrated gradually and classified at higher layers.[74] Among the various kinds of neocognitron[75] are systems that can detect multiple patterns in the same input by using backpropagation to achieve selective attention.[76] It has been used for pattern recognition tasks and inspired convolutional neural networks.[77]

Compound hierarchical-deep models
Compound hierarchical-deep models compose deep networks with non-parametric Bayesian models. Features can be learned using deep architectures such as DBNs,[78] deep Boltzmann machines (DBM),[79] deep autoencoders,[80] convolutional variants,[81][82] ssRBMs,[83] deep coding networks,[84] DBNs with sparse feature learning,[85] RNNs,[86] conditional DBNs,[87] and de-noising autoencoders.[88] This provides a better representation, allowing faster learning and more accurate classification with high-dimensional data. However, these architectures are poor at learning novel classes with few examples, because all network units are involved in representing the input (a distributed representation) and must be adjusted together (high degree of freedom). Limiting the degree of freedom reduces the number of parameters to learn, facilitating learning of new classes from few examples. Hierarchical Bayesian (HB) models allow learning from few examples, for example[89][90][91][92][93] for computer vision, statistics and cognitive science.

Compound HD architectures aim to integrate characteristics of both HB and deep networks. The compound HDP-DBM architecture uses a hierarchical Dirichlet process (HDP) as a hierarchical model, incorporating DBM architecture. It is a full generative model, generalized from abstract concepts flowing through the model layers, which is able to synthesize new examples in novel classes that look "reasonably" natural. All the levels are learned jointly by maximizing a joint log-probability score.[94]

In a DBM with three hidden layers, the probability of a visible input ν is:

{\displaystyle p({\boldsymbol {\nu }},\psi )={\frac {1}{Z}}\sum _{h}\exp \left(\sum _{ij}W_{ij}^{(1)}\nu _{i}h_{j}^{1}+\sum _{j\ell }W_{j\ell }^{(2)}h_{j}^{1}h_{\ell }^{2}+\sum _{\ell m}W_{\ell m}^{(3)}h_{\ell }^{2}h_{m}^{3}\right),}

where {\displaystyle {\boldsymbol {h}}=\{{\boldsymbol {h}}^{(1)},{\boldsymbol {h}}^{(2)},{\boldsymbol {h}}^{(3)}\}} is the set of hidden units, and {\displaystyle \psi =\{{\boldsymbol {W}}^{(1)},{\boldsymbol {W}}^{(2)},{\boldsymbol {W}}^{(3)}\}} are the model parameters, representing visible-hidden and hidden-hidden symmetric interaction terms.

A learned DBM model is an undirected model that defines the joint distribution {\displaystyle P(\nu ,h^{1},h^{2},h^{3})}. One way to express what has been learned is the conditional model {\displaystyle P(\nu ,h^{1},h^{2}\mid h^{3})} and a prior term {\displaystyle P(h^{3})}.

Here {\displaystyle P(\nu ,h^{1},h^{2}\mid h^{3})} represents a conditional DBM model, which can be viewed as a two-layer DBM but with bias terms given by the states of {\displaystyle h^{3}}:

{\displaystyle P(\nu ,h^{1},h^{2}\mid h^{3})={\frac {1}{Z(\psi ,h^{3})}}\exp \left(\sum _{ij}W_{ij}^{(1)}\nu _{i}h_{j}^{1}+\sum _{j\ell }W_{j\ell }^{(2)}h_{j}^{1}h_{\ell }^{2}+\sum _{\ell m}W_{\ell m}^{(3)}h_{\ell }^{2}h_{m}^{3}\right).}
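The exponent in the formulas above is just a sum of bilinear interaction terms and can be written down directly. A minimal sketch computing the unnormalized log-probability of one configuration (the random binary states and weights are placeholder assumptions; the partition function Z is intentionally omitted, since it requires an intractable sum over hidden states):

```python
import numpy as np

def dbm_unnormalized_logp(v, h1, h2, h3, W1, W2, W3):
    """Exponent of the three-hidden-layer DBM distribution:
    sum_ij W1_ij v_i h1_j + sum_jl W2_jl h1_j h2_l + sum_lm W3_lm h2_l h3_m."""
    return v @ W1 @ h1 + h1 @ W2 @ h2 + h2 @ W3 @ h3

rng = np.random.default_rng(0)
v, h1, h2, h3 = (rng.integers(0, 2, n) for n in (6, 5, 4, 3))     # binary unit states
W1, W2, W3 = rng.normal(size=(6, 5)), rng.normal(size=(5, 4)), rng.normal(size=(4, 3))
print(dbm_unnormalized_logp(v, h1, h2, h3, W1, W2, W3))
```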
Deep predictive coding networks
A deep predictive coding network (DPCN) is a predictive coding scheme that uses top-down information to empirically adjust the priors needed for a bottom-up inference procedure by means of a deep, locally connected, generative model. This works by extracting sparse features from time-varying observations using a linear dynamical model. Then, a pooling strategy is used to learn invariant feature representations. These units compose to form a deep architecture and are trained by greedy layer-wise unsupervised learning. The layers constitute a kind of Markov chain such that the states at any layer depend only on the preceding and succeeding layers.

DPCNs predict the representation of the layer by using a top-down approach using the information in the upper layer and temporal dependencies from previous states.[95]

DPCNs can be extended to form a convolutional network.[95]

Multilayer kernel machine
Multilayer kernel machines (MKM) are a way of learning highly nonlinear functions by iterative application of weakly nonlinear kernels. They use kernel principal component analysis (KPCA)[96] as a method for the unsupervised greedy layer-wise pre-training step of deep learning.[97]

Layer {\displaystyle \ell +1} learns the representation of the previous layer {\displaystyle \ell }, extracting the {\displaystyle n_{\ell }} principal components (PC) of the projection of layer {\displaystyle \ell }'s output in the feature domain induced by the kernel. To reduce the dimensionality of the updated representation in each layer, a supervised strategy selects the best informative features among the features extracted by KPCA. The process is:

rank the {\displaystyle n_{\ell }} features according to their mutual information with the class labels;
for different values of K and {\displaystyle m_{\ell }\in \{1,\ldots ,n_{\ell }\}}, compute the classification error rate of a K-nearest neighbor (K-NN) classifier using only the {\displaystyle m_{\ell }} most informative features on a validation set;
the value of {\displaystyle m_{\ell }} with which the classifier has reached the lowest error rate determines the number of features to retain.

Some drawbacks accompany the KPCA method for MKMs.

A more straightforward way to use kernel machines for deep learning was developed for spoken language understanding.[98] The main idea is to use a kernel machine to approximate a shallow neural net with an infinite number of hidden units, then use stacking to splice the output of the kernel machine and the raw input in building the next, higher level of the kernel machine. The number of levels in the deep convex network is a hyper-parameter of the overall system, to be determined by cross validation.

Dynamic
Dynamic neural networks address nonlinear multivariate behaviour and include (learning of) time-dependent behaviour, such as transient phenomena and delay effects. Techniques to estimate a system process from observed data fall under the general category of system identification.

Cascading
Cascade correlation is an architecture and supervised learning algorithm. Instead of just adjusting the weights in a network of fixed topology,[99] Cascade-Correlation begins with a minimal network, then automatically trains and adds new hidden units one by one, creating a multi-layer structure. Once a new hidden unit has been added to the network, its input-side weights are frozen. This unit then becomes a permanent feature-detector in the network, available for producing outputs or for creating other, more complex feature detectors. The Cascade-Correlation architecture has several advantages: it learns quickly, determines its own size and topology, retains the structures it has built even if the training set changes, and requires no backpropagation.
Neuro-fuzzy
A neuro-fuzzy network is a fuzzy inference system in the body of an artificial neural network. Depending on the FIS type, several layers simulate the processes involved in a fuzzy inference, such as fuzzification, inference, aggregation and defuzzification. Embedding an FIS in a general structure of an ANN has the benefit of using available ANN training methods to find the parameters of a fuzzy system.

Compositional pattern-producing
Main article: Compositional pattern-producing network
Compositional pattern-producing networks (CPPNs) are a variation of artificial neural networks which differ in their set of activation functions and how they are applied. While typical artificial neural networks often contain only sigmoid functions (and sometimes Gaussian functions), CPPNs can include both types of functions and many others. Furthermore, unlike typical artificial neural networks, CPPNs are applied across the entire space of possible inputs so that they can represent a complete image. Since they are compositions of functions, CPPNs in effect encode images at infinite resolution and can be sampled for a particular display at whatever resolution is optimal. (A tiny CPPN-style sketch appears after the Hierarchical temporal memory subsection below.)

Memory networks
Memory networks[100][101] incorporate long-term memory. The long-term memory can be read and written to, with the goal of using it for prediction. These models have been applied in the context of question answering (QA) where the long-term memory effectively acts as a (dynamic) knowledge base and the output is a textual response.[102]

In sparse distributed memory or hierarchical temporal memory, the patterns encoded by neural networks are used as addresses for content-addressable memory, with "neurons" essentially serving as address encoders and decoders. However, the early controllers of such memories were not differentiable.[103]

One-shot associative memory
This type of network can add new patterns without re-training. It is done by creating a specific memory structure, which assigns each new pattern to an orthogonal plane using adjacently connected hierarchical arrays.[104] The network offers real-time pattern recognition and high scalability; this requires parallel processing and is thus best suited for platforms such as wireless sensor networks, grid computing, and GPGPUs.

Hierarchical temporal memory
Main article: Hierarchical temporal memory
Hierarchical temporal memory (HTM) models some of the structural and algorithmic properties of the neocortex. HTM is a biomimetic model based on memory-prediction theory. HTM is a method for discovering and inferring the high-level causes of observed input patterns and sequences, thus building an increasingly complex model of the world.

HTM combines existing ideas to mimic the neocortex with a simple design that provides many capabilities. HTM combines and extends approaches used in Bayesian networks and spatial and temporal clustering algorithms, while using a tree-shaped hierarchy of nodes that is common in neural networks.
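For the CPPNs described above, the idea of composing heterogeneous activation functions over the whole input space can be sketched as follows; the particular mix of functions and the fixed weights are illustrative assumptions, not an evolved network:

```python
import numpy as np

def cppn_image(size=64):
    """Evaluate a fixed composition of sine, Gaussian and sigmoid functions at
    every (x, y) coordinate: the pattern is defined everywhere, so it can be
    sampled at whatever resolution is needed."""
    xs = np.linspace(-1, 1, size)
    x, y = np.meshgrid(xs, xs)
    r = np.sqrt(x ** 2 + y ** 2)                 # radial distance feature
    h1 = np.sin(4.0 * x + 2.0 * y)               # periodic component
    h2 = np.exp(-4.0 * r ** 2)                   # Gaussian blob component
    return 1.0 / (1.0 + np.exp(-(2.0 * h1 + 3.0 * h2 - 1.0)))   # sigmoid output in [0, 1]

print(cppn_image(8).round(2))    # the same function sampled at 8x8; use size=512 for a high-res render
```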
Holographic associative memory
Main article: Holographic associative memory
Holographic Associative Memory (HAM) is an analog, correlation-based, associative, stimulus-response system. Information is mapped onto the phase orientation of complex numbers. The memory is effective for associative memory tasks, generalization and pattern recognition with changeable attention. Dynamic search localization is central to biological memory. In visual perception, humans focus on specific objects in a pattern. Humans can change focus from object to object without learning. HAM can mimic this ability by creating explicit representations for focus. It uses a bi-modal representation of pattern and a hologram-like complex spherical weight state-space. HAMs are useful for optical realization because the underlying hyper-spherical computations can be implemented with optical computation.[105]

LSTM-related differentiable memory structures
Apart from long short-term memory (LSTM), other approaches also added differentiable memory to recurrent functions. For example:

Differentiable push and pop actions for alternative memory networks called neural stack machines[106][107]
Memory networks where the control network's external differentiable storage is in the fast weights of another network[108]
LSTM forget gates[109]
Self-referential RNNs with special output units for addressing and rapidly manipulating the RNN's own weights in differentiable fashion (internal storage)[110][111]
Learning to transduce with unbounded memory[112]

Neural Turing machines
Main article: Neural Turing machine
Neural Turing machines[113] couple LSTM networks to external memory resources, with which they can interact by attentional processes. The combined system is analogous to a Turing machine but is differentiable end-to-end, allowing it to be efficiently trained by gradient descent. Preliminary results demonstrate that neural Turing machines can infer simple algorithms such as copying, sorting and associative recall from input and output examples.

Differentiable neural computers (DNC) are an NTM extension. They out-performed neural Turing machines, long short-term memory systems and memory networks on sequence-processing tasks.[114][115][116][117][118]

Semantic hashing
Approaches that represent previous experiences directly and use a similar experience to form a local model are often called nearest neighbour or k-nearest neighbors methods.[119] Deep learning is useful in semantic hashing,[120] where a deep graphical model models the word-count vectors[121] obtained from a large set of documents. Documents are mapped to memory addresses in such a way that semantically similar documents are located at nearby addresses. Documents similar to a query document can then be found by accessing all the addresses that differ by only a few bits from the address of the query document. Unlike sparse distributed memory, which operates on 1000-bit addresses, semantic hashing works on 32- or 64-bit addresses found in a conventional computer architecture. (A small Hamming-distance lookup sketch follows the Pointer networks subsection below.)

Pointer networks
Deep neural networks can be potentially improved by deepening and parameter reduction, while maintaining trainability. While training extremely deep (e.g., 1 million layers) neural networks might not be practical, CPU-like architectures such as pointer networks[122] and neural random-access machines[123] overcome this limitation by using external random-access memory and other components that typically belong to a computer architecture such as registers, ALU and pointers. Such systems operate on probability distribution vectors stored in memory cells and registers. Thus, the model is fully differentiable and trains end-to-end. The key characteristic of these models is that their depth, the size of their short-term memory, and the number of parameters can be altered independently.
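The address-based retrieval described under Semantic hashing above reduces to flipping a few bits of a short binary code. A rough sketch of that lookup; the 32-bit codes here are random placeholders rather than codes learned by a deep graphical model:

```python
import numpy as np
from itertools import combinations

def nearby_addresses(code, max_flips=2):
    """All binary codes within Hamming distance max_flips of `code`."""
    out = [tuple(code)]
    for k in range(1, max_flips + 1):
        for idx in combinations(range(len(code)), k):
            c = list(code)
            for i in idx:
                c[i] ^= 1                 # flip one bit
            out.append(tuple(c))
    return out

rng = np.random.default_rng(0)
index = {tuple(int(b) for b in rng.integers(0, 2, 32)): f"doc{i}" for i in range(1000)}
query = next(iter(index))                 # pretend this is the query document's code
hits = [index[a] for a in nearby_addresses(list(query)) if a in index]
print(hits)                               # documents whose codes differ by at most 2 bits
```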
Hybrids

Encoder–decoder networks
Encoder–decoder frameworks are based on neural networks that map highly structured input to highly structured output. The approach arose in the context of machine translation,[124][125][126] where the input and output are written sentences in two natural languages. In that work, an LSTM RNN or CNN was used as an encoder to summarize a source sentence, and the summary was decoded using a conditional RNN language model to produce the translation.[127] These systems share building blocks: gated RNNs and CNNs and trained attention mechanisms.

See also
Adaptive resonance theory
Artificial life
Autoassociative memory
Autoencoder
Biologically inspired computing
Blue brain
Connectionist expert system
Counterpropagation network
Decision tree
Expert system
Genetic algorithm
In Situ Adaptive Tabulation
Large memory storage and retrieval neural networks
Linear discriminant analysis
Logistic regression
Multilayer perceptron
Neural gas
Neuroevolution, NeuroEvolution of Augmented Topologies (NEAT)
Ni1000 chip
Optical neural network
Particle swarm optimization
Predictive analytics
Principal components analysis
Simulated annealing
Systolic array
Time delay neural network (TDNN)

References
^ University of Southern California (2004, June 16). Gray Matters: New Clues Into How Neurons Process Information. ScienceDaily. Quote: "... 'It's amazing that after a hundred years of modern neuroscience research, we still don't know the basic information processing functions of a neuron,' said Bartlett Mel ..."
^ Weizmann Institute of Science (2007, April 2). It's Only A Game Of Chance: Leading Theory Of Perception Called Into Question. ScienceDaily. Quote: "... 'Since the 1980s, many neuroscientists believed they possessed the key for finally beginning to understand the workings of the brain. But we have provided strong evidence to suggest that the brain may not encode information using precise patterns of activity.' ..."
^ University of California – Los Angeles (2004, December 14). UCLA Neuroscientist Gains Insights Into Human Brain From Study Of Marine Snail. ScienceDaily. Quote: "... 'Our work implies that the brain mechanisms for forming these kinds of associations might be extremely similar in snails and higher organisms ... We don't fully understand even very simple kinds of learning in these animals.' ..."
^ Yale University (2006, April 13). Brain Communicates In Analog And Digital Modes Simultaneously. ScienceDaily. Quote: "... McCormick said future investigations and models of neuronal operation in the brain will need to take into account the mixed analog-digital nature of communication. Only with a thorough understanding of this mixed mode of signal transmission will a truly in depth understanding of the brain and its disorders be achieved, he said ..."
^ Ivakhnenko, Alexey Grigorevich (1968). "The group method of data handling – a rival of the method of stochastic approximation". Soviet Automatic Control. 13 (3): 43–55.
^ Ivakhnenko, A. G. (1971). "Polynomial Theory of Complex Systems". IEEE Transactions on Systems, Man, and Cybernetics. 1 (4): 364–378. doi:10.1109/TSMC.1971.4308320. S2CID 17606980.
^ Kondo, T.; Ueno, J. (2008). "Multi-layered GMDH-type neural network self-selecting optimum neural network architecture and its application to 3-dimensional medical image recognition of blood vessels". International Journal of Innovative Computing, Information and Control. 4 (1): 175–187.
^ Bengio, Y. (2009). "Learning Deep Architectures for AI" (PDF). Foundations and Trends in Machine Learning. 2: 1–127. CiteSeerX 10.1.1.701.9550. doi:10.1561/2200000006.
^ Liou, Cheng-Yuan (2008). "Modeling word perception using the Elman network" (PDF). Neurocomputing. 71 (16–18): 3150–3157. doi:10.1016/j.neucom.2008.04.030.
^ Liou, Cheng-Yuan (2014). "Autoencoder for words". Neurocomputing. 139: 84–96. doi:10.1016/j.neucom.2013.09.055.
^ Diederik P. Kingma; Welling, Max (2013). "Auto-Encoding Variational Bayes". arXiv:1312.6114 [stat.ML].
^ Generating Faces with Torch, Boesen A., Larsen L. and Sonderby S. K., 2015. torch.ch/blog/2015/11/13/gan.html
^ "Competitive probabilistic neural network (PDF Download Available)". ResearchGate. Retrieved 2017-03-16.
^ "Archived copy". Archived from the original on 2010-12-18. Retrieved 2012-03-22.
^ "Archived copy" (PDF). Archived from the original (PDF) on 2012-01-31. Retrieved 2012-03-22.
^ TDNN Fundamentals, Kapitel aus dem Online Handbuch des SNNS.
^ Zhang, Wei (1990). "Parallel distributed processing model with local space-invariant interconnections and its optical architecture". Applied Optics. 29 (32): 4790–7. Bibcode:1990ApOpt..29.4790Z. doi:10.1364/ao.29.004790. PMID 20577468.
^ Zhang, Wei (1988). "Shift-invariant pattern recognition neural network and its optical architecture". Proceedings of the Annual Conference of the Japan Society of Applied Physics.
^ J. Weng, N. Ahuja and T. S. Huang, "Learning recognition and segmentation of 3-D objects from 2-D images," Proc. 4th International Conf. Computer Vision, Berlin, Germany, pp. 121–128, May 1993.
^ Fukushima, K. (1980). "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position". Biol. Cybern. 36 (4): 193–202. doi:10.1007/bf00344251. PMID 7370364. S2CID 206775608.
^ LeCun, Yann. "LeNet-5, convolutional neural networks". Retrieved 16 November 2013.
^ "Convolutional Neural Networks (LeNet) – DeepLearning 0.1 documentation". DeepLearning 0.1. LISA Lab. Retrieved 31 August 2013.
^ LeCun et al., "Backpropagation Applied to Handwritten Zip Code Recognition," Neural Computation, 1, pp. 541–551, 1989.
^ Yann LeCun (2016). Slides on Deep Learning Online.
^ "Unsupervised Feature Learning and Deep Learning Tutorial". ufldl.stanford.edu.
^ Hinton, Geoffrey E.; Krizhevsky, Alex; Wang, Sida D. (2011), "Transforming Auto-Encoders", Lecture Notes in Computer Science, Springer, pp. 44–51, CiteSeerX 10.1.1.220.5099, doi:10.1007/978-3-642-21735-7_6, ISBN 9783642217340.
^ Szegedy, Christian; Liu, Wei; Jia, Yangqing; Sermanet, Pierre; Reed, Scott; Anguelov, Dragomir; Erhan, Dumitru; Vanhoucke, Vincent; Rabinovich, Andrew (2014). Going Deeper with Convolutions. Computing Research Repository. p. 1. arXiv:1409.4842. doi:10.1109/CVPR.2015.7298594. ISBN 978-1-4673-6964-0. S2CID 206592484.
^ Ran, Lingyan; Zhang, Yanning; Zhang, Qilin; Yang, Tao (2017-06-12). "Convolutional Neural Network-Based Robot Navigation Using Uncalibrated Spherical Images" (PDF). Sensors. 17 (6): 1341. Bibcode:2017Senso..17.1341R. doi:10.3390/s17061341. ISSN 1424-8220. PMC 5492478. PMID 28604624.
^ van den Oord, Aaron; Dieleman, Sander; Schrauwen, Benjamin (2013-01-01). Burges, C. J. C.; Bottou, L.; Welling, M.; Ghahramani, Z.; Weinberger, K. Q. (eds.). Deep content-based music recommendation (PDF). Curran Associates. pp. 2643–2651.
^ Collobert, Ronan; Weston, Jason (2008-01-01). A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning. Proceedings of the 25th International Conference on Machine Learning. New York, NY, USA: ACM. pp. 160–167. doi:10.1145/1390156.1390177. ISBN 978-1-60558-205-4. S2CID 2617020.
^ a b Deng, Li; Yu, Dong; Platt, John (2012). "Scalable stacking and learning for building deep architectures" (PDF). 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): 2133–2136. doi:10.1109/ICASSP.2012.6288333. ISBN 978-1-4673-0046-9. S2CID 16171497.
^ Deng, Li; Yu, Dong (2011). "Deep Convex Net: A Scalable Architecture for Speech Pattern Classification" (PDF). Proceedings of the Interspeech: 2285–2288. doi:10.21437/Interspeech.2011-607.
^ David, Wolpert (1992). "Stacked generalization". Neural Networks. 5 (2): 241–259. CiteSeerX 10.1.1.133.8090. doi:10.1016/S0893-6080(05)80023-1.
^ Bengio, Y. (2009-11-15). "Learning Deep Architectures for AI". Foundations and Trends in Machine Learning. 2 (1): 1–127. CiteSeerX 10.1.1.701.9550. doi:10.1561/2200000006. ISSN 1935-8237.
^ Hutchinson, Brian; Deng, Li; Yu, Dong (2012). "Tensor deep stacking networks". IEEE Transactions on Pattern Analysis and Machine Intelligence. 1–15 (8): 1944–1957. doi:10.1109/tpami.2012.268. PMID 23267198. S2CID 344385.
^ Hinton, Geoffrey; Salakhutdinov, Ruslan (2006). "Reducing the Dimensionality of Data with Neural Networks". Science. 313 (5786): 504–507. Bibcode:2006Sci...313..504H. doi:10.1126/science.1127647. PMID 16873662. S2CID 1658773.
^ Dahl, G.; Yu, D.; Deng, L.; Acero, A. (2012). "Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition". IEEE Transactions on Audio, Speech, and Language Processing. 20 (1): 30–42. CiteSeerX 10.1.1.227.8990. doi:10.1109/tasl.2011.2134090. S2CID 14862572.
^ Mohamed, Abdel-rahman; Dahl, George; Hinton, Geoffrey (2012). "Acoustic Modeling Using Deep Belief Networks". IEEE Transactions on Audio, Speech, and Language Processing. 20 (1): 14–22. CiteSeerX 10.1.1.338.2670. doi:10.1109/tasl.2011.2109382. S2CID 9530137.
^ Deng, Li; Yu, Dong (2011). "Deep Convex Net: A Scalable Architecture for Speech Pattern Classification" (PDF). Proceedings of the Interspeech: 2285–2288. doi:10.21437/Interspeech.2011-607.
^ Deng, Li; Yu, Dong; Platt, John (2012). "Scalable stacking and learning for building deep architectures" (PDF). 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): 2133–2136. doi:10.1109/ICASSP.2012.6288333. ISBN 978-1-4673-0046-9. S2CID 16171497.
^ Hinton, G. E. (2009). "Deep belief networks". Scholarpedia. 4 (5): 5947. Bibcode:2009SchpJ...4.5947H. doi:10.4249/scholarpedia.5947.
^ Larochelle, Hugo; Erhan, Dumitru; Courville, Aaron; Bergstra, James; Bengio, Yoshua (2007). An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation. Proceedings of the 24th International Conference on Machine Learning. ICML '07. New York, NY, USA: ACM. pp. 473–480. CiteSeerX 10.1.1.77.3242. doi:10.1145/1273496.1273556. ISBN 9781595937933. S2CID 14805281.
^ Werbos, P. J. (1988). "Generalization of backpropagation with application to a recurrent gas market model". Neural Networks. 1 (4): 339–356. doi:10.1016/0893-6080(88)90007-x.
^ David E. Rumelhart; Geoffrey E. Hinton; Ronald J. Williams. Learning Internal Representations by Error Propagation.
^ A. J. Robinson and F. Fallside. The utility driven dynamic error propagation network. Technical Report CUED/F-INFENG/TR.1, Cambridge University Engineering Department, 1987.
^ R. J. Williams and D. Zipser. Gradient-based learning algorithms for recurrent networks and their computational complexity. In Back-propagation: Theory, Architectures and Applications. Hillsdale, NJ: Erlbaum, 1994.
^ Schmidhuber, J. (1989). "A local learning algorithm for dynamic feedforward and recurrent networks". Connection Science. 1 (4): 403–412. doi:10.1080/09540098908915650. S2CID 18721007.
^ Neural and Adaptive Systems: Fundamentals through Simulation. J. C. Principe, N. R. Euliano, W. C. Lefebvre.
^ Schmidhuber, J. (1992). "A fixed size storage O(n^3) time complexity learning algorithm for fully recurrent continually running networks". Neural Computation. 4 (2): 243–248. doi:10.1162/neco.1992.4.2.243. S2CID 11761172.
^ R. J. Williams. Complexity of exact gradient computation algorithms for recurrent neural networks. Technical Report NU-CCS-89-27, Boston: Northeastern University, College of Computer Science, 1989.
^ Pearlmutter, B. A. (1989). "Learning state space trajectories in recurrent neural networks" (PDF). Neural Computation. 1 (2): 263–269. doi:10.1162/neco.1989.1.2.263. S2CID 16813485.
^ S. Hochreiter. Untersuchungen zu dynamischen neuronalen Netzen. Diploma thesis, Institut f. Informatik, Technische Univ. Munich, 1991.
^ S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber. Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. In S. C. Kremer and J. F. Kolen, editors, A Field Guide to Dynamical Recurrent Neural Networks. IEEE Press, 2001.
^ a b Hochreiter, S.; Schmidhuber, J. (1997). "Long short-term memory". Neural Computation. 9 (8): 1735–1780. doi:10.1162/neco.1997.9.8.1735. PMID 9377276. S2CID 1915014.
^ Neural Networks as Cybernetic Systems, 2nd and revised edition, Holk Cruse. [1]
^ Schrauwen, Benjamin; Verstraeten, David; Van Campenhout, Jan. "An overview of reservoir computing: theory, applications, and implementations." Proceedings of the European Symposium on Artificial Neural Networks ESANN 2007, pp. 471–482.
^ Maass, Wolfgang; Natschläger, T.; Markram, H. (2002). "Real-time computing without stable states: A new framework for neural computation based on perturbations". Neural Computation. 14 (11): 2531–2560. doi:10.1162/089976602760407955. PMID 12433288. S2CID 1045112.
^ Echo state network, Scholarpedia.
^ Jaeger, H.; Haas, H. (2004). "Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication". Science. 304 (5667): 78–80. Bibcode:2004Sci...304...78J. CiteSeerX 10.1.1.719.2301. doi:10.1126/science.1091277. PMID 15064413. S2CID 2184251.
^ F. A. Gers and J. Schmidhuber. LSTM recurrent networks learn simple context free and context sensitive languages. IEEE Transactions on Neural Networks 12 (6): 1333–1340, 2001.
^ A. Graves, J. Schmidhuber. Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks. Advances in Neural Information Processing Systems 22, NIPS'22, pp. 545–552, Vancouver, MIT Press, 2009.
^ Schuster, Mike; Paliwal, Kuldip K. (1997). "Bidirectional recurrent neural networks". IEEE Transactions on Signal Processing. 45 (11): 2673–2681. Bibcode:1997ITSP...45.2673S. CiteSeerX 10.1.1.331.9441. doi:10.1109/78.650093.
^ Graves, A.; Schmidhuber, J. (2005). "Framewise phoneme classification with bidirectional LSTM and other neural network architectures". Neural Networks. 18 (5–6): 602–610. CiteSeerX 10.1.1.331.5800. doi:10.1016/j.neunet.2005.06.042. PMID 16112549.
^ Schmidhuber, J. (1992). "Learning complex, extended sequences using the principle of history compression". Neural Computation. 4 (2): 234–242. doi:10.1162/neco.1992.4.2.234. S2CID 18271205.
^ Dynamic Representation of Movement Primitives in an Evolved Recurrent Neural Network.
^ "Associative Neural Network". www.vcclab.org. Retrieved 2017-06-17.
^ Anderson, James A.; Rosenfeld, Edward (2000). Talking Nets: An Oral History of Neural Networks. ISBN 9780262511117.
^ Gerstner; Kistler. "Spiking Neuron Models: Single Neurons, Populations, Plasticity". icwww.epfl.ch. Retrieved 2017-06-18. Freely available online textbook.
^ Izhikevich, E. M. (February 2006). "Polychronization: computation with spikes". Neural Computation. 18 (2): 245–82. doi:10.1162/089976606775093882. PMID 16378515. S2CID 14253998.
^ Achler T., Omar C., Amir E., "Shedding Weights: More With Less", IEEE Proc. International Joint Conference on Neural Networks, 2008.
^ David H. Hubel and Torsten N. Wiesel (2005). Brain and visual perception: the story of a 25-year collaboration. Oxford University Press. p. 106. ISBN 978-0-19-517618-6.
^ Hubel, DH; Wiesel, TN (October 1959). "Receptive fields of single neurones in the cat's striate cortex". J. Physiol. 148 (3): 574–91. doi:10.1113/jphysiol.1959.sp006308. PMC 1363130. PMID 14403679.
^ Fukushima 1987, p. 83.
^ Fukushima 1987, p. 84.
^ Fukushima 2007.
^ Fukushima 1987, pp. 81, 85.
^ LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey (2015). "Deep learning". Nature. 521 (7553): 436–444. Bibcode:2015Natur.521..436L. doi:10.1038/nature14539. PMID 26017442. S2CID 3074096.
^ Hinton, G. E.; Osindero, S.; Teh, Y. (2006). "A fast learning algorithm for deep belief nets" (PDF). Neural Computation. 18 (7): 1527–1554. CiteSeerX 10.1.1.76.1541. doi:10.1162/neco.2006.18.7.1527. PMID 16764513. S2CID 2309950.
^ Hinton, Geoffrey; Salakhutdinov, Ruslan (2009). "Efficient Learning of Deep Boltzmann Machines" (PDF). 3: 448–455.
^ Larochelle, Hugo; Bengio, Yoshua; Louradour, Jérôme; Lamblin, Pascal (2009). "Exploring Strategies for Training Deep Neural Networks". The Journal of Machine Learning Research. 10: 1–40.
^ Coates, Adam; Carpenter, Blake (2011). "Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning" (PDF): 440–445.
^ Lee, Honglak; Grosse, Roger (2009). Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. Proceedings of the 26th Annual International Conference on Machine Learning. pp. 1–8. CiteSeerX 10.1.1.149.6800. doi:10.1145/1553374.1553453. ISBN 9781605585161. S2CID 12008458.
^ Courville, Aaron; Bergstra, James; Bengio, Yoshua (2011). "Unsupervised Models of Images by Spike-and-Slab RBMs" (PDF). Proceedings of the 28th International Conference on Machine Learning. Vol. 10. pp. 1–8.
^ Lin, Yuanqing; Zhang, Tong; Zhu, Shenghuo; Yu, Kai (2010). "Deep Coding Network". Advances in Neural Information Processing Systems 23 (NIPS 2010). Vol. 23. pp. 1–9.
^ Ranzato, Marc Aurelio; Boureau, Y-Lan (2007). "Sparse Feature Learning for Deep Belief Networks" (PDF). Advances in Neural Information Processing Systems. 23: 1–8.
^ Socher, Richard; Lin, Clif (2011). "Parsing Natural Scenes and Natural Language with Recursive Neural Networks" (PDF). Proceedings of the 26th International Conference on Machine Learning.
^ Taylor, Graham; Hinton, Geoffrey (2006). "Modeling Human Motion Using Binary Latent Variables" (PDF). Advances in Neural Information Processing Systems.
^ Vincent, Pascal; Larochelle, Hugo (2008). Extracting and composing robust features with denoising autoencoders. Proceedings of the 25th International Conference on Machine Learning – ICML '08. pp. 1096–1103. CiteSeerX 10.1.1.298.4083. doi:10.1145/1390156.1390294. ISBN 9781605582054. S2CID 207168299.
^ Kemp, Charles; Perfors, Amy; Tenenbaum, Joshua (2007). "Learning overhypotheses with hierarchical Bayesian models". Developmental Science. 10 (3): 307–21. CiteSeerX 10.1.1.141.5560. doi:10.1111/j.1467-7687.2007.00585.x. PMID 17444972.
^ Xu, Fei; Tenenbaum, Joshua (2007). "Word learning as Bayesian inference". Psychol. Rev. 114 (2): 245–72. CiteSeerX 10.1.1.57.9649. doi:10.1037/0033-295X.114.2.245. PMID 17500627.
^ Chen, Bo; Polatkan, Gungor (2011). "The Hierarchical Beta Process for Convolutional Factor Analysis and Deep Learning" (PDF). Proceedings of the 28th International Conference on International Conference on Machine Learning. Omnipress. pp. 361–368. ISBN 978-1-4503-0619-5.
^ Fei-Fei, Li; Fergus, Rob (2006). "One-shot learning of object categories". IEEE Transactions on Pattern Analysis and Machine Intelligence. 28 (4): 594–611. CiteSeerX 10.1.1.110.9024. doi:10.1109/TPAMI.2006.79. PMID 16566508. S2CID 6953475.
^ Rodriguez, Abel; Dunson, David (2008). "The Nested Dirichlet Process". Journal of the American Statistical Association. 103 (483): 1131–1154. CiteSeerX 10.1.1.70.9873. doi:10.1198/016214508000000553. S2CID 13462201.
^ Salakhutdinov, Ruslan; Tenenbaum, Joshua (2012). "Learning with Hierarchical-Deep Models". IEEE Transactions on Pattern Analysis and Machine Intelligence. 35 (8): 1958–71. CiteSeerX 10.1.1.372.909. doi:10.1109/TPAMI.2012.269. PMID 23787346. S2CID 4508400.
^ a b Chalasani, Rakesh; Principe, Jose (2013). "Deep Predictive Coding Networks". arXiv:1301.3541 [cs.LG].
^ Scholkopf, B.; Smola, Alexander (1998). "Nonlinear component analysis as a kernel eigenvalue problem". Neural Computation. 44 (5): 1299–1319. CiteSeerX 10.1.1.53.8911. doi:10.1162/089976698300017467. S2CID 6674407.
^ Cho, Youngmin (2012). "Kernel Methods for Deep Learning" (PDF): 1–9.
^ Deng, Li; Tur, Gokhan; He, Xiaodong; Hakkani-Tür, Dilek (2012-12-01). "Use of Kernel Deep Convex Networks and End-To-End Learning for Spoken Language Understanding". Microsoft Research.
^ Fahlman, Scott E.; Lebiere, Christian (August 29, 1991). "The Cascade-Correlation Learning Architecture" (PDF). Carnegie Mellon University. Retrieved 4 October 2014.
^ Weston, Jason; Chopra, Sumit; Bordes, Antoine (2014). "Memory Networks". arXiv:1410.3916 [cs.AI].
^ Sukhbaatar, Sainbayar; Szlam, Arthur; Weston, Jason; Fergus, Rob (2015). "End-To-End Memory Networks". arXiv:1503.08895 [cs.NE].
^ Bordes, Antoine; Usunier, Nicolas; Chopra, Sumit; Weston, Jason (2015). "Large-scale Simple Question Answering with Memory Networks". arXiv:1506.02075 [cs.LG].
^ Hinton, Geoffrey E. (1984). "Distributed representations". Archived from the original on 2016-05-02.
^ Nasution, B. B.; Khan, A. I. (February 2008). "A Hierarchical Graph Neuron Scheme for Real-Time Pattern Recognition". IEEE Transactions on Neural Networks. 19 (2): 212–229. doi:10.1109/TNN.2007.905857. PMID 18269954. S2CID 17573325.
^ Sutherland, John G. (1 January 1990). "A holographic model of memory, learning and expression". International Journal of Neural Systems. 01 (3): 259–267. doi:10.1142/S0129065790000163.
^ S. Das, C. L. Giles, G. Z. Sun, "Learning Context Free Grammars: Limitations of a Recurrent Neural Network with an External Stack Memory," Proc. 14th Annual Conf. of the Cog. Sci. Soc., p. 79, 1992.
^ Mozer, M. C.; Das, S. (1993). "A connectionist symbol manipulator that discovers the structure of context-free languages". Advances in Neural Information Processing Systems. 5: 863–870.
^ Schmidhuber, J. (1992). "Learning to control fast-weight memories: An alternative to recurrent nets". Neural Computation. 4 (1): 131–139. doi:10.1162/neco.1992.4.1.131. S2CID 16683347.
^ Gers, F.; Schraudolph, N.; Schmidhuber, J. (2002). "Learning precise timing with LSTM recurrent networks" (PDF). JMLR. 3: 115–143.
^ Jürgen Schmidhuber (1993). "An introspective network that can learn to run its own weight change algorithm". Proceedings of the International Conference on Artificial Neural Networks, Brighton. IEE. pp. 191–195.
^ Hochreiter, Sepp; Younger, A. Steven; Conwell, Peter R. (2001). "Learning to Learn Using Gradient Descent". ICANN. 2130: 87–94. CiteSeerX 10.1.1.5.323.
^ Grefenstette, Edward; Hermann, Karl Moritz; Suleyman, Mustafa; Blunsom, Phil (2015). "Learning to Transduce with Unbounded Memory". arXiv:1506.02516 [cs.NE].
^ Graves, Alex; Wayne, Greg; Danihelka, Ivo (2014). "Neural Turing Machines". arXiv:1410.5401 [cs.NE].
^ Burgess, Matt. "DeepMind's AI learned to ride the London Underground using human-like reason and memory". WIRED UK. Retrieved 2016-10-19.
^ "DeepMind AI 'Learns' to Navigate London Tube". PCMAG. Retrieved 2016-10-19.
^ Mannes, John. "DeepMind's differentiable neural computer helps you navigate the subway with its memory". TechCrunch. Retrieved 2016-10-19.
^ Graves, Alex; Wayne, Greg; Reynolds, Malcolm; Harley, Tim; Danihelka, Ivo; Grabska-Barwińska, Agnieszka; Colmenarejo, Sergio Gómez; Grefenstette, Edward; Ramalho, Tiago (2016-10-12). "Hybrid computing using a neural network with dynamic external memory". Nature. 538 (7626): 471–476. Bibcode:2016Natur.538..471G. doi:10.1038/nature20101. ISSN 1476-4687. PMID 27732574. S2CID 205251479.
^ "Differentiable neural computers | DeepMind". DeepMind. Retrieved 2016-10-19.
^ Atkeson, Christopher G.; Schaal, Stefan (1995). "Memory-based neural networks for robot learning". Neurocomputing. 9 (3): 243–269. doi:10.1016/0925-2312(95)00033-6.
^ Salakhutdinov, Ruslan; Hinton, Geoffrey (2009). "Semantic hashing". International Journal of Approximate Reasoning. 50 (7): 969–978.
^ Le, Quoc V.; Mikolov, Tomas (2014). "Distributed representations of sentences and documents". arXiv:1405.4053 [cs.CL].
^ Vinyals, Oriol; Fortunato, Meire; Jaitly, Navdeep (2015). "Pointer Networks". arXiv:1506.03134 [stat.ML].
^ Kurach, Karol; Andrychowicz, Marcin; Sutskever, Ilya (2015). "Neural Random-Access Machines". arXiv:1511.06392 [cs.LG].
^ Kalchbrenner, N.; Blunsom, P. (2013). Recurrent continuous translation models. EMNLP 2013. pp. 1700–1709.
^ Sutskever, I.; Vinyals, O.; Le, Q. V. (2014). "Sequence to sequence learning with neural networks" (PDF). Twenty-eighth Conference on Neural Information Processing Systems. arXiv:1409.3215.
^ Cho, Kyunghyun; van Merriënboer, Bart; Gulcehre, Caglar; Bahdanau, Dzmitry; Bougares, Fethi; Schwenk, Holger; Bengio, Yoshua (2014). "Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation". arXiv:1406.1078 [cs.CL].
^ Cho, Kyunghyun; Courville, Aaron; Bengio, Yoshua (2015). "Describing Multimedia Content using Attention-based Encoder–Decoder Networks". IEEE Transactions on Multimedia. 17 (11): 1875–1886. arXiv:1507.01053. Bibcode:2015arXiv150701053C. doi:10.1109/TMM.2015.2477044. S2CID 1179542.