Introduction
Language is the essential characteristic that distinguishes humans from other animals. Among all living things, only humans have the ability to speak. A variety of human intelligences are closely related to language: human logical thinking takes the form of language, and most human knowledge is recorded and passed down in language. Language is therefore an important, even core, part of artificial intelligence.
Using natural language to communicate with computers is something people have long pursued. It has obvious practical significance: people could use computers in the language they are most accustomed to, without spending a great deal of time and energy learning computer languages that are neither natural nor familiar. It also has important theoretical significance: through it, people can further understand human language ability and the mechanisms of intelligence.
Natural language processing refers to the technology that lets humans and machines communicate using the natural languages humans use with each other. By processing natural language, the computer is enabled to read and understand it. Research on natural language processing began with the exploration of machine translation. Although natural language processing involves operations at multiple levels such as speech, grammar, semantics, and pragmatics, in simple terms its basic task is to segment the corpus being processed, based on ontology dictionaries, word-frequency statistics, contextual semantic analysis, and so on, into lexical units that are the smallest parts of speech and rich in semantics.
Natural Language Processing (NLP) takes language as its object and uses computer technology to analyze, understand, and process natural language. That is, with the computer as a powerful tool for language research, it conducts quantitative research on language information and provides language descriptions that humans and computers can use jointly. It includes two parts: Natural Language Understanding (NLU) and Natural Language Generation (NLG). It is a typical interdisciplinary subject, involving language science, computer science, mathematics, cognition, logic, and more, focusing on the interaction between computers and human (natural) languages. Depending on the period or the focus, the process of using computers to process natural language has also been called Natural Language Understanding (NLU), Human Language Technology (HLT), Computational Linguistics, Quantitative Linguistics, or Mathematical Linguistics.
Realizing natural language communication between humans and computers means that computers must not only understand the meaning of natural language texts, but also express given intentions and thoughts in natural language texts. The former is called natural language understanding, and the latter natural language generation; natural language processing therefore generally includes both parts. Historically, there has been more research on natural language understanding than on natural language generation, but this situation has changed.
Neither natural language understanding nor natural language generation is as simple as people originally imagined; both are very difficult. Judging from the current state of theory and technology, a universal, high-quality natural language processing system remains a long-term goal. For certain applications, however, practical systems with considerable natural language processing capability have emerged, and some have been commercialized or even industrialized. Typical examples are natural language interfaces to multilingual databases and expert systems, machine translation systems, full-text information retrieval systems, and automatic abstracting systems.
Natural language processing, that is, realizing natural language communication between humans and computers through natural language understanding and generation, is very difficult. The root cause of the difficulty is the wide variety of ambiguity and polysemy at every level of natural language texts and dialogues.
There is a many-to-many relationship between the form (string) of natural language and its meaning; in fact, this is exactly the charm of natural language. From the perspective of computer processing, however, ambiguity must be eliminated, and some people regard this as the central problem of natural language understanding: converting natural language input with potential ambiguity into some unambiguous internal computer representation.
The widespread existence of ambiguity means that eliminating it requires a great deal of knowledge and reasoning, which brings great difficulties to linguistics-based and knowledge-based methods, long the mainstream of the field. Over the past few decades, research along these lines has produced many achievements in theory and methods, but no significant breakthroughs in systems that can handle large-scale real text; most systems developed have been small-scale research demonstrations.
The current problems have two aspects. On the one hand, grammar so far is limited to analyzing an isolated sentence, and there is still a lack of systematic research on how context and the conversational environment constrain and influence that sentence; ambiguity in analysis, omitted words, the referents of pronouns, and the different meanings of the same sentence on different occasions or from different speakers follow no clear rules, and can only be addressed gradually by strengthening research on pragmatics. On the other hand, people understand a sentence not only through grammar but also by drawing on a great deal of relevant knowledge, including everyday knowledge and specialized knowledge, which cannot all be stored in a computer. A written comprehension system can therefore only be established within a limited range of vocabulary, sentence patterns, and specific topics; only after the storage capacity and speed of computers improve greatly can the scope be expanded appropriately.
This problem has become the main problem of natural language understanding in machine translation applications, and is one reason why the translation quality of today's machine translation systems is still far from the ideal goal; translation quality is the key to the success or failure of a machine translation system. The Chinese mathematician and linguist Professor Zhou Haizhong once pointed out in the classic paper "Fifty Years of Machine Translation" that to improve the quality of machine translation, the first thing to solve is the problem of language itself rather than of program design; building a machine translation system on program design alone will definitely not improve translation quality. Moreover, as long as humans do not yet understand how the brain performs fuzzy recognition and logical judgment of language, it is impossible for machine translation to achieve "faithfulness, expressiveness, and elegance".
History of development
The earliest research work on natural language understanding was machine translation. In 1949, the American scientist Warren Weaver first proposed a design plan for machine translation. The development of the field is mainly divided into three stages.
Early natural language processing
The first stage (1960s to 1980s): rules were used to build systems for vocabulary, syntactic and semantic analysis, question answering, chat, and machine translation. The advantage is that rules can make use of human introspective knowledge, do not rely on data, and allow a quick start; the problems are insufficient coverage, so that the result is like a toy system, and unresolved rule management and scalability.
Statistical natural language processing
The second stage (beginning in the 1990s): statistics-based machine learning (ML) became popular, and many NLP systems began to use statistical methods. The main idea is to use labeled data to build a machine learning system over manually defined features, determining the system's parameters from the data through learning. At runtime, these learned parameters are used to decode the input data and produce the output. Machine translation and search engines both succeeded using statistical methods.
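The statistical recipe described above, labeled data plus manually defined features plus parameters estimated from the data, can be illustrated with a minimal sketch. The following is not any particular system from the text, just a tiny naive Bayes sentiment classifier whose features are the words of each sentence; the four training examples are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Estimate per-class word counts and class priors from labeled data."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, class_counts

def classify(text, word_counts, class_counts):
    """Pick the class maximizing log P(class) + sum of log P(word | class)."""
    vocab = {w for counter in word_counts.values() for w in counter}
    best_label, best_score = None, float("-inf")
    for label, prior in class_counts.items():
        score = math.log(prior / sum(class_counts.values()))
        total = sum(word_counts[label].values())
        for word in text.lower().split():
            # Laplace smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

examples = [
    ("the film was wonderful", "pos"),
    ("a wonderful moving story", "pos"),
    ("the film was terrible", "neg"),
    ("a boring terrible plot", "neg"),
]
wc, cc = train(examples)
print(classify("a wonderful film", wc, cc))  # prints: pos
```

The "learning" step here is nothing more than counting; the learned counts then decode a new input into a label, which is the runtime decoding the text describes.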
Natural language processing with neural networks
The third stage (after 2008): deep learning began to show its power in speech and images, and NLP researchers then turned their attention to it. At first, deep learning was used to compute features or create new features whose effect was then evaluated within the existing statistical learning framework; for example, search engines added deep-learning-based similarity computations between search terms and documents to improve search relevance. Since 2014, people have tried to model tasks directly with deep learning for end-to-end training. Progress has been made in machine translation, question answering, and reading comprehension, and a deep learning boom has emerged.
Key concepts and technologies
Information extraction (IE)
Information extraction is the process of extracting unstructured information embedded in text and converting it into structured data. Extracting the relationships between named entities from a natural language corpus is deeper research built on named entity recognition. The main process of information extraction has three steps: first, automatic processing of the unstructured data; second, targeted extraction of the text information; and finally, structured representation of the extracted information. The most basic work of information extraction is named entity recognition, and the core lies in the extraction of entity relationships.
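The three steps above can be sketched in miniature. The toy extractor below substitutes a single hand-written pattern for a trained named-entity recognizer; the sentence and the "X was founded by Y" relation are invented purely for illustration of how unstructured text becomes structured records.

```python
import re

# One hypothetical relation pattern standing in for a real NER + relation model.
PATTERN = re.compile(r"(?P<org>[A-Z]\w+) was founded by (?P<person>[A-Z]\w+(?: [A-Z]\w+)*)")

def extract(text):
    """Step 1: scan the raw text; step 2: match the target relation;
    step 3: emit each match as a structured record."""
    records = []
    for match in PATTERN.finditer(text):
        records.append({
            "entity1": match.group("org"),      # named entity (organization)
            "relation": "founded_by",           # the extracted relation
            "entity2": match.group("person"),   # named entity (person)
        })
    return records

text = "Acme was founded by Jane Doe. The company later moved to Berlin."
print(extract(text))
# prints: [{'entity1': 'Acme', 'relation': 'founded_by', 'entity2': 'Jane Doe'}]
```

Real systems replace the regular expression with learned recognizers, but the pipeline shape, from raw text to entities to relations to structured output, is the same.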
Automatic summarization
Automatic summarization is an information compression technology that uses a computer to automatically extract text information according to certain rules and assemble it into a short summary. It aims at two goals: first, keep the language short; second, keep the important information.
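A minimal sketch of the extractive flavor of this idea: score each sentence by the frequency of its content words and keep only the top-scoring sentences. The rule, the stopword list, and the three-sentence text are all invented for illustration; real summarizers use far richer rules.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in", "it"}

def summarize(text, n_sentences=1):
    """Keep the n highest-scoring sentences, in their original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = [w for w in re.findall(r"\w+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    # Rule: a sentence's score is the summed corpus frequency of its words.
    def score(sentence):
        return sum(freq[w] for w in re.findall(r"\w+", sentence.lower()))

    kept = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return " ".join(s for s in sentences if s in kept)

text = ("Language models process language data. "
        "Many systems need summaries. "
        "Summaries compress language data into short language descriptions.")
print(summarize(text))
```

Both goals from the text appear here: `n_sentences` keeps the output short, and the frequency score is a crude proxy for importance.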
Speech recognition technology
Speech recognition technology allows a machine to convert a speech signal into the corresponding text or command through a process of recognition and understanding; that is, it makes the machine understand the human voice. Its goal is to convert the vocabulary content of human speech into computer-readable data. To do this, continuous speech must first be broken down into units such as words and phonemes, and a set of rules for understanding semantics must be established. In terms of process, speech recognition includes front-end noise reduction, speech cutting and framing, feature extraction, and state matching. The framework can be divided into three parts: the acoustic model, the language model, and decoding.
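The "cutting and framing" step mentioned above can be shown concretely: the continuous signal is split into short overlapping frames, each of which a real system would then turn into acoustic features. The frame length and hop below (25 ms and 10 ms at 16 kHz) are typical choices assumed for this sketch, not values from the text.

```python
def frame_signal(samples, frame_len=400, hop=160):
    """Split a 1-D list of samples into overlapping fixed-length frames,
    zero-padding the final frame if it runs past the end of the signal."""
    frames = []
    for start in range(0, max(len(samples) - frame_len, 0) + 1, hop):
        frame = samples[start:start + frame_len]
        frame += [0] * (frame_len - len(frame))  # pad a short tail frame
        frames.append(frame)
    return frames

# One second of a fake 16 kHz signal (16000 samples of dummy values).
signal = [i % 7 for i in range(16000)]
frames = frame_signal(signal)
print(len(frames), len(frames[0]))  # prints: 98 400
```

Overlap matters because speech is continuous: a phoneme boundary that falls on a frame edge in one framing falls inside a frame in the next, so no acoustic event is lost between frames.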
Transformer model
In 2017, the Transformer model was first proposed by a Google team. The Transformer is a model based on the attention mechanism that accelerates deep learning algorithms. The model consists of a set of encoders and a set of decoders: the encoder processes input of any length and generates a representation of it, and the decoder converts that representation into the target output. The Transformer uses the attention mechanism to obtain the relationships between all other words and generate a new representation of each word. Its advantage is that attention can directly capture the relationships among all the words in a sentence without regard to their positions. It abandons the traditional encoder-decoder pattern, which had to be combined with the inherent structure of an RNN or CNN (Convolutional Neural Network), and uses a full attention structure in place of the LSTM, which reduces the amount of computation and improves parallel efficiency without compromising the final experimental results. But the model also has flaws: first, it is computationally expensive; second, position information is not used prominently, and it is unable to capture long-distance information.
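The core operation that lets every word relate directly to every other word is scaled dot-product attention. The sketch below implements it with plain nested lists for a single attention head; the three 2-D token "embeddings" are invented, and queries, keys, and values are all set to the input, as in simple self-attention.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """softmax(Q K^T / sqrt(d)) V: every position attends to every position."""
    d = len(keys[0])
    k_t = [list(col) for col in zip(*keys)]               # transpose K
    scores = matmul(queries, k_t)                         # Q K^T
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, values)

# Three toy token vectors; Q = K = V = X for self-attention.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = attention(x, x, x)
print([[round(v, 3) for v in row] for row in out])
```

Each output row is a weighted mixture of all value rows, which is exactly the position-independent, all-pairs interaction the paragraph describes; distance between tokens never enters the computation.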
Natural language processing technology based on traditional machine learning
Natural language processing can be divided into multiple subtasks. Traditional machine learning methods such as the SVM (support vector machine), the Markov model, and the CRF (conditional random field) can be used to process these subtasks and further improve the accuracy of the results. In terms of practical effect, however, the following shortcomings remain: (1) The performance of a traditionally trained model depends heavily on the quality of the training set, and the training set must be labeled manually, which reduces training efficiency. (2) A training set built for one field can perform very differently in another, which weakens the applicability of the training and exposes the disadvantage of a single learning method; making a training set applicable to many different fields consumes a great deal of human labeling effort. (3) When processing higher-level, more abstract natural language, the relevant features cannot be labeled manually, so traditional machine learning can only learn pre-established rules and cannot learn complex language features outside those rules.
Natural language processing technology based on deep learning
Deep learning is a major branch of machine learning. In natural language processing, deep learning models such as convolutional neural networks and recurrent neural networks are applied; by learning from the generated word vectors, they complete the classification and understanding of natural language. Compared with traditional machine learning, deep-learning-based natural language processing has the following advantages: (1) Deep learning can continually learn language features from the vectorization of words or sentences, mastering higher-level, more abstract language features and meeting the natural language processing requirements of a large amount of feature engineering. (2) Deep learning does not require experts to define features manually; it can learn high-level features automatically through the neural network.
Technical difficulties
Effective definition of content
In everyday language, the words of a sentence do not exist in isolation; all the words in a discourse must be correlated before the intended meaning can be expressed. Once a specific sentence is formed, defining relationships arise among its words, and without an effective definition the content becomes ambiguous and cannot be effectively understood. For example, take the sentence "He quietly went out to play behind his mother and sister's backs". If the scope of the phrase is not defined, it could mean either that neither the mother nor the sister knows he is going out to play, or that the mother does not know he is going out to play with his sister.
Disambiguation and ambiguity
Words and sentences used in different situations often carry multiple meanings, easily producing vague concepts or divergent interpretations. For example, the phrase "high mountains and flowing water" has multiple meanings: it can describe the natural environment, express the relationship between two close friends, or even describe the beauty of music. Natural language processing therefore needs to resolve meaning from the surrounding content, eliminating vagueness and ambiguity to express the true meaning.
Defective or irregular input
For example, foreign or regional accents encountered in speech processing, or mistakes in spelling, grammar, or optical character recognition (OCR).
Language behavior and planning
Sentences are often not just literal. For example, a good answer to "Can you pass the salt?" is to pass the salt; in most contexts "Yes" would be a bad answer, although "No" or "It's too far, I can't reach it" would be acceptable. Similarly, if a course was not offered last year, it is better to answer the question "How many students failed this course last year?" with "The course was not offered last year" than with "Nobody failed."
Associated technology
Computer science
The original purpose of natural language processing was to realize natural language dialogue between humans and computers, with the computer as one party to the dialogue; this is the premise from which the concept of natural language processing arose. For a long time, people have hoped that robots would enter daily life and become an important productive force promoting social development, especially robots possessing "human intelligence". Natural language processing, as an important part of artificial intelligence, plays an iconic role in making robots truly intelligent. In recent years, computer performance has improved greatly in data storage capacity, processing speed, and so on, making it possible to process massive data and use probability and statistics to discover the laws of language and their internal connections.
Internet technology
The emergence of the Internet has made the dissemination of information more convenient. The various new media built on Internet technology have become the main channels of information dissemination, and online chat software has multiplied the ways people communicate. This textual information, accumulated over time, has brought explosive data growth and provides massive resources for statistics-based natural language processing. Open-source platforms built on Internet technology are also an important way for researchers to obtain research resources.
Machine learning methods
Machine learning is a multi-field interdisciplinary subject that uses data and experience to improve computer algorithms and optimize computer performance. Its roots can be traced back to early statistical methods such as the least squares method and the Markov chain, but its real development dates from the 1950s. It has passed through the stages of "learning with or without knowledge", system description based on graph structures and logical structures, and extension to the learning of multiple concepts in combination with various applications; since the 1980s it has entered a fourth, newer stage that can truly make computers intelligent.
The use of semi-supervised or unsupervised machine learning methods to process massive amounts of natural language has followed the development of machine learning itself, which can be roughly divided into two stages: traditional machine learning with linear models over discrete representations, and deep learning with non-linear models over continuous representations.
Deep learning is an automatic learning algorithm comprising an input layer, hidden layers, and an output layer. The input layer receives the large amounts of data provided by researchers and is the processing object of the algorithm. The number of hidden layers is determined by the experimenter; in the hidden layers, the algorithm marks the data, discovers regularities, and establishes relationships between feature points. The output layer delivers the result to the researcher. Generally speaking, the more data the input layer receives and the greater the number of hidden layers, the better the data can be differentiated, at the cost of a larger and more difficult computation; fortunately, computer hardware has made great strides in recent years. As the latest impetus for natural language processing, machine learning has shown unprecedented advantages: (1) overcoming the sparseness of manually defined language features, since deep learning can use distributed vectors for word classification and effectively mark part-of-speech tags, word-sense tags, dependencies, and so on; (2) overcoming the incompleteness of manual marking of language features, since manual marking misses items under a heavy workload and having high-efficiency computers perform this work can greatly reduce such errors; (3) overcoming the large amount and long duration of computation in traditional machine learning algorithms, since deep learning uses matrix computation, which greatly reduces the amount of calculation.
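The input-hidden-output structure described above can be made concrete with a tiny forward pass. The sketch below has one hidden layer, each layer written as the matrix-style dot products the text mentions; the weights are random and nothing is trained, so this only illustrates the shape of the computation.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def layer(inputs, weights, biases):
    """One fully connected layer followed by a sigmoid non-linearity."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b  # dot product
        outputs.append(1.0 / (1.0 + math.exp(-z)))         # sigmoid
    return outputs

n_in, n_hidden, n_out = 4, 3, 2
w1 = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
w2 = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [0.0] * n_out

x = [0.5, -0.2, 0.1, 0.9]          # one input example (input layer)
hidden = layer(x, w1, b1)          # hidden-layer representation
output = layer(hidden, w2, b2)     # output-layer result
print([round(v, 3) for v in output])
```

Adding hidden layers means chaining more `layer` calls, which is exactly the trade-off the paragraph notes: more expressive differentiation of the data in exchange for more computation.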
Tools and platforms
NLTK: a comprehensive Python-based NLP library.
Stanford NLP: a library of NLP algorithms commonly used in academia.
Chinese NLP tools: THULAC, Harbin Institute of Technology's LTP, and jieba word segmentation.
Research hotspots
Pre-training technology
The essence of the pre-training idea is that model parameters are no longer randomly initialized but are first trained with a language model; the current paradigm for NLP tasks is pre-training followed by fine-tuning. Pre-training is of great help to NLP tasks, and there are more and more pre-trained language models, from the original Word2vec and GloVe to the universal language model ULMFiT for text classification and ELMo. The current best pre-trained language models are based on the Transformer, proposed by Vaswani et al. It is based entirely on self-attention and is currently the best feature extractor in the NLP field: it can perform parallel computation and also capture long-distance feature dependencies.
Currently the most influential pre-trained language model is BERT, a bidirectional deep language model based on the Transformer. BERT is composed of multiple layers of bidirectional Transformer encoders and comes mainly in two sizes: the base version has 12 Transformer layers, 12 attention heads per layer, and a hidden size of 768; the large version has 24 Transformer layers, 16 attention heads per layer, and a hidden size of 1,024. It can be seen that a deep and narrow model performs better than a shallow and wide one. At present, BERT performs excellently on multiple tasks such as machine translation, text classification, text similarity, and reading comprehension. The BERT model is trained in two ways: (1) masking words and predicting them; (2) predicting whether one sentence follows another.
A general language model is obtained through these two training methods, and the fine-tuning method is then used for downstream tasks such as text classification and machine translation. Compared with previous pre-trained models, BERT can capture true bidirectional contextual semantics. But BERT also has shortcomings. During training, the heavy use of the [MASK] token affects the model's effectiveness, and only 15% of the tokens in each batch are predicted, so BERT converges slowly. In addition, because the pre-training process is inconsistent with the generation process, performance on natural language generation tasks is poor; and BERT cannot complete document-level NLP tasks, being suitable only for sentence- and paragraph-level tasks.
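The first training objective, masking about 15% of the tokens and predicting them, can be sketched as a data-preparation step. The selection details below are simplified compared with the real BERT recipe (which sometimes keeps or randomly replaces a selected token instead of masking it); the sample sentence is invented.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def mask_tokens(tokens, mask_rate=0.15):
    """Replace about mask_rate of the tokens with [MASK];
    return the masked sequence and a dict of position -> original token."""
    n_to_mask = max(1, round(len(tokens) * mask_rate))
    positions = random.sample(range(len(tokens)), n_to_mask)
    masked, targets = list(tokens), {}
    for pos in positions:
        targets[pos] = tokens[pos]   # what the model must learn to predict
        masked[pos] = "[MASK]"
    return masked, targets

tokens = "natural language processing makes machines understand text".split()
masked, targets = mask_tokens(tokens)
print(masked)
print(targets)
```

The slow convergence noted above follows directly from this setup: the loss is computed only at the masked positions, so most tokens in each batch contribute no training signal.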
XLNet is a generalized autoregressive language model based on Transformer-XL. The Transformer has two disadvantages: (1) the maximum dependency distance between characters is limited by the input length; (2) when the input text exceeds 512 characters, each segment is trained separately from scratch, which reduces training efficiency and hurts model performance. In response to these two shortcomings, Transformer-XL introduced two solutions: a segment-level recurrence mechanism and relative positional encoding. Transformer-XL is faster at test time and can capture longer context.
Unsupervised representation learning has achieved great success in NLP. Under this idea, many researchers have explored different unsupervised pre-training objectives, of which autoregressive language modeling and autoencoding are the two most successful. XLNet is a generalized autoregressive method that integrates the autoregressive and autoencoding approaches. Instead of the fixed forward or backward factorization order of the traditional autoregressive model, XLNet uses random permutations of the natural language sequence to predict the word that may appear at a given position. This method not only lets every position in the sentence learn contextual information from all positions, but also constructs bidirectional semantics, better capturing contextual meaning. Since XLNet uses Transformer-XL, its performance is better, especially on tasks involving long text sequences.
Both BERT and XLNet perform very well on English corpora, but their effect on Chinese corpora is mediocre. ERNIE trains a language model on Chinese corpora. ERNIE is a knowledge-enhanced semantic representation model with excellent performance on Chinese NLP tasks such as language inference, semantic similarity, named entity recognition, and text classification. When processing Chinese corpora, ERNIE can learn the complete semantic representation of larger semantic units by modeling the predicted Chinese characters. The core of the ERNIE model is composed of Transformers, and its structure contains two main modules: the lower text encoder (T-Encoder) is responsible for capturing basic lexical and syntactic information from the input tokens, and the upper knowledge encoder (K-Encoder) is responsible for integrating knowledge information obtained from the lower layer into the text information, so that the heterogeneous information of tokens and entities can be represented in a unified feature space.
Graph neural network technology
Research on the Graph Neural Network (GNN) mainly focuses on the propagation and aggregation of information from adjacent nodes. From the moment the concept was proposed, the graph neural network drew inspiration from the convolutional neural networks of deep learning. Graph neural networks occupy a very important position in applying deep learning to non-Euclidean data; in particular, using graph structure, as the traditional Bayesian causal network does, to explain characteristics, to reason over relations defined in deep neural networks, and to address causal explainability is of great research significance. How to use deep learning methods to analyze and reason about graph-structured data has attracted a great deal of research attention.
The general inference process of a graph neural network can be represented by the sub-processes of graph node pre-representation, graph node sampling, subgraph extraction, subgraph feature fusion, and graph neural network generation and training. The specific steps are as follows:
STEP 1, graph node pre-representation: embed each node in the graph through a graph embedding method;
STEP 2, graph node sampling: sample positive and negative examples for each node, or for existing node pairs, in the graph;
STEP 3, subgraph extraction: extract the neighboring nodes of each node in the graph to construct an n-th-order subgraph, where n denotes the neighbors at the n-th layer, thus forming a general subgraph structure;
STEP 4, subgraph feature fusion: perform local or global feature extraction on each subgraph input to the neural network;
STEP 5, graph neural network generation and training: define the number of layers and the input and output parameters of the network, and train the network on the graph data.
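The neighbor aggregation that these steps build toward can be sketched in a few lines: each node's new representation is the mean of its own vector and its neighbors' vectors, one round of message passing. The 4-node graph and 2-D embeddings below are invented for illustration; real GNNs add learned weights and non-linearities around this aggregation.

```python
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0], 3: [0.0, 0.0]}

# Adjacency sets including each node itself (a self-loop),
# so a node's own features survive the aggregation.
neighbors = {n: {n} for n in features}
for a, b in edges:
    neighbors[a].add(b)
    neighbors[b].add(a)

def propagate(features, neighbors):
    """One step of mean aggregation over each node's neighborhood."""
    new_features = {}
    for node, nbrs in neighbors.items():
        dim = len(features[node])
        summed = [sum(features[m][i] for m in nbrs) for i in range(dim)]
        new_features[node] = [v / len(nbrs) for v in summed]
    return new_features

updated = propagate(features, neighbors)
print({n: [round(v, 3) for v in vec] for n, vec in sorted(updated.items())})
```

Stacking k rounds of `propagate` lets information travel k hops, which is the role of the n-th-order subgraphs in STEP 3: a node's representation comes to summarize its k-hop neighborhood.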
1. Graph convolutional neural networks
The popularity of deep learning is inseparable from the wide applicability of convolutional neural networks, and graph neural network research is no different: the longest-studied area with the most research results is the graph convolutional neural network. From the perspective of feature space, graph convolutional neural networks can be divided into two types: frequency-domain and spatial-domain.
The frequency-domain graph convolutional neural network is based on graph signal processing, defining the convolutional layer of the graph neural network as a filter; that is, the filter removes the noise signal to obtain the classification result of the input signal. In practical problems it can only handle undirected graph structures with no information on the edges. The graph of the input signal is defined by its eigendecomposable normalized Laplacian matrix L = I − D^(−1/2) A D^(−1/2), whose eigendecomposition can be expressed in the general form L = U Λ U^T, where the diagonal matrix Λ is composed of the eigenvalues λ_i arranged in order.
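The normalized Laplacian just described, L = I − D^(−1/2) A D^(−1/2) for adjacency matrix A and degree matrix D, can be constructed directly. The 3-node path graph below is invented for illustration; this sketch stops at building L, the matrix whose eigendecomposition frequency-domain graph convolutions rely on.

```python
import math

# Adjacency matrix of an undirected 3-node path graph: 0 - 1 - 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]

n = len(A)
degrees = [sum(row) for row in A]               # diagonal of D
d_inv_sqrt = [1.0 / math.sqrt(d) for d in degrees]

# L[i][j] = (1 if i == j else 0) - A[i][j] / sqrt(d_i * d_j)
L = [[(1.0 if i == j else 0.0) - d_inv_sqrt[i] * A[i][j] * d_inv_sqrt[j]
      for j in range(n)] for i in range(n)]

for row in L:
    print([round(v, 3) for v in row])
```

Because A is symmetric for an undirected graph, L is symmetric too, which is what makes the eigendecomposition L = U Λ U^T with real eigenvalues possible; this is also why, as noted above, the method is restricted to undirected graphs.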
2. Space-based graph convolutional neural networks
Similar to the convolution that convolutional neural networks perform over image pixels in deep learning, the space-based graph convolutional neural network expresses the transfer and aggregation of information between neighboring nodes by computing the convolution between a central node and its neighbors, which becomes the node's new representation in the feature domain.
Future prospects
The field of natural language processing has long been dominated by two research methods, one based on rules and one on statistics. Both have encountered bottlenecks: after reaching a certain stage, rule-based and traditional machine learning methods found it difficult to make larger breakthroughs, until improvements in computing power and data storage greatly promoted the development of natural language processing. The breakthrough in speech recognition made deep learning technology very popular, and machine translation has also made considerable progress. Google Translate now uses deep neural network technology, raising machine translation to a new level that, even if it does not meet the standard of human translation, is sufficient for most needs. Information extraction has also become more intelligent, better understanding complex sentence structures and the relationships between entities and extracting correct facts. Deep learning promotes progress in natural language processing tasks, and those tasks in turn provide broad application prospects for deep learning, leading people to invest more in algorithm design. The advancement of artificial intelligence will continue to drive natural language processing forward and confront it with the following challenges: 1) Better algorithms. Among the three elements of artificial intelligence development (data, computing power, and algorithms), the one most relevant to natural language processing researchers is algorithm design. Deep learning has shown strong advantages in many tasks, but the rationality of the backpropagation method has recently been questioned. Deep learning completes small tasks with big data; its focus is on induction, and its learning efficiency is relatively low. Whether it is possible to start from small data, analyze the underlying principles, and complete multiple tasks from a deductive perspective is worth studying in the future.
2) In-depth analysis of language. Although deep learning has greatly improved the effect of natural language processing, the field is about the science of language technology, not about finding the best machine learning method; the core is still linguistic. In the future, problems of language will also require attention to semantic understanding. From large-scale network data, through in-depth semantic analysis combined with linguistic theory, we can discover the laws of semantic generation and understanding, study the hidden patterns behind the data, and expand and improve existing knowledge models so that semantic representation becomes more accurate. Language understanding requires a combination of reason and experience: reason is a priori, while experience can expand knowledge, so it is necessary to make full use of world knowledge and linguistic theory to guide advanced technology toward understanding semantics. Part of the semantic information is implicit in distributed word vectors, and richer semantics can be expressed through different combinations of word vectors; however, the semantic capacity of the word vector is still not fully exploited. Exploring the modes of semantic representation in language, and expressing semantics completely and accurately in a formal language the computer can understand, is the key task of future research.
3) Interdisciplinarity. To understand semantics, a suitable model must be found. In exploring models, it is necessary to draw fully on research results from the philosophy of language, cognitive science, and brain science, and to approach the generation and understanding of semantics from a cognitive perspective, which may yield better models of language understanding. In today's technological innovation, multi-disciplinary intersection can better promote the development of natural language processing.
Deep learning has brought major technological breakthroughs to natural language processing, and its widespread application has greatly changed people's daily lives. When deep learning is combined with other cognitive sciences and with linguistics, it may be able to exert still greater power, solving problems of semantic understanding and bringing true "intelligence".
Although deep learning has achieved great success across the tasks of NLP, many research difficulties still need to be overcome before it can be put into large-scale use. The larger the deep neural network model, the longer the training time; how to reduce model size while keeping performance unchanged is one direction for future research. In addition, deep neural network models have poor interpretability, and there has been little progress in research on natural language generation tasks. However, with continued in-depth study of deep learning, more research results and development will appear in the NLP field in the near future.