Introduction
Language is the essential characteristic that distinguishes humans from other animals. Among all living things, only humans have the ability to speak. A variety of human intelligences are closely related to language: human logical thinking takes the form of language, and most human knowledge is recorded and passed down in language. Language is therefore an important, even core, part of artificial intelligence.
Using natural language to communicate with computers is something people have long pursued. It has obvious practical significance: people could use computers in the language they are most accustomed to, without spending a great deal of time and energy learning computer languages that are neither natural nor familiar. It also has important theoretical significance: through it, people can further understand human language ability and the mechanisms of intelligence.
Natural language processing refers to the technology that lets humans and machines communicate using the natural languages humans use with each other. By processing natural language, the computer is enabled to read and understand it. Research on natural language processing began with the exploration of machine translation. Although natural language processing involves operations at multiple levels such as speech, grammar, semantics, and pragmatics, in simple terms its basic task is to segment the corpus being processed, based on ontology dictionaries, word-frequency statistics, contextual semantic analysis, and so on, into lexical units that are the smallest parts of speech and rich in semantics.
Natural Language Processing (NLP) takes language as its object and uses computer technology to analyze, understand, and process natural language. That is, with the computer as a powerful tool for language research, it conducts quantitative research on language information and provides language descriptions that humans and computers can use jointly. It includes two parts: Natural Language Understanding (NLU) and Natural Language Generation (NLG). It is a typical interdisciplinary subject, involving language science, computer science, mathematics, cognition, logic, and more, focusing on the interaction between computers and human (natural) languages. Depending on the period or the focus, the process of using computers to process natural language has also been called Natural Language Understanding (NLU), Human Language Technology (HLT), Computational Linguistics, Quantitative Linguistics, or Mathematical Linguistics.
Realizing natural language communication between humans and computers means that computers must not only understand the meaning of natural language texts, but also express given intentions and thoughts in natural language texts. The former is called natural language understanding, and the latter natural language generation; natural language processing therefore generally includes both parts. Historically, there has been more research on natural language understanding than on natural language generation, but this situation has changed.
Neither natural language understanding nor natural language generation is as simple as people originally imagined; both are very difficult. Judging from the current state of theory and technology, a universal, high-quality natural language processing system remains a long-term goal. For certain applications, however, practical systems with considerable natural language processing capability have emerged, and some have been commercialized or even industrialized. Typical examples are natural language interfaces to multilingual databases and expert systems, machine translation systems, full-text information retrieval systems, and automatic abstracting systems.
Natural language processing, that is, realizing natural language communication between humans and computers through natural language understanding and generation, is very difficult. The root cause of the difficulty is the wide variety of ambiguity and polysemy at every level of natural language texts and dialogues.
There is a many-to-many relationship between the form (string) of natural language and its meaning; in fact, this is exactly the charm of natural language. From the perspective of computer processing, however, ambiguity must be eliminated, and some people regard this as the central problem of natural language understanding: converting natural language input with potential ambiguity into some unambiguous internal computer representation.
The widespread existence of ambiguity means that eliminating it requires a great deal of knowledge and reasoning, which brings great difficulties to linguistics-based and knowledge-based methods, long the mainstream of the field. Over the past few decades, research along these lines has produced many achievements in theory and methods, but no significant breakthroughs in systems that can handle large-scale real text; most systems developed have been small-scale research demonstrations.
The current problems have two aspects. On the one hand, grammar so far is limited to analyzing an isolated sentence, and there is still a lack of systematic research on how context and the conversational environment constrain and influence that sentence; ambiguity in analysis, omitted words, the referents of pronouns, and the different meanings of the same sentence on different occasions or from different speakers follow no clear rules, and can only be addressed gradually by strengthening research on pragmatics. On the other hand, people understand a sentence not only through grammar but also by drawing on a great deal of relevant knowledge, including everyday knowledge and specialized knowledge, which cannot all be stored in a computer. A written comprehension system can therefore only be established within a limited range of vocabulary, sentence patterns, and specific topics; only after the storage capacity and speed of computers improve greatly can the scope be expanded appropriately.
This problem has become the main problem of natural language understanding in machine translation applications, and is one reason why the translation quality of today's machine translation systems is still far from the ideal goal; translation quality is the key to the success or failure of a machine translation system. The Chinese mathematician and linguist Professor Zhou Haizhong once pointed out in the classic paper "Fifty Years of Machine Translation" that to improve the quality of machine translation, the first thing to solve is the problem of language itself rather than of program design; building a machine translation system on program design alone will definitely not improve translation quality. Moreover, as long as humans do not yet understand how the brain performs fuzzy recognition and logical judgment of language, it is impossible for machine translation to achieve "faithfulness, expressiveness, and elegance".
History of development
The earliest research work on natural language understanding was machine translation. In 1949, the American scientist Warren Weaver first proposed a design plan for machine translation. The development of the field is mainly divided into three stages.
Early natural language processing
The first stage (1960s to 1980s): rules were used to build systems for vocabulary, syntactic and semantic analysis, question answering, chat, and machine translation. The advantage is that rules can make use of human introspective knowledge, do not rely on data, and allow a quick start; the problems are insufficient coverage, so that the result is like a toy system, and unresolved rule management and scalability.
Statistical natural language processing
The second stage (beginning in the 1990s): statistics-based machine learning (ML) became popular, and many NLP systems began to use statistical methods. The main idea is to use labeled data to build a machine learning system over manually defined features, determining the system's parameters from the data through learning. At runtime, these learned parameters are used to decode the input data and produce the output. Machine translation and search engines both succeeded using statistical methods.
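The statistical recipe described above, labeled data plus manually defined features plus parameters estimated from the data, can be illustrated with a minimal sketch. The following is not any particular system from the text, just a tiny naive Bayes sentiment classifier whose features are the words of each sentence; the four training examples are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Estimate per-class word counts and class priors from labeled data."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, class_counts

def classify(text, word_counts, class_counts):
    """Pick the class maximizing log P(class) + sum of log P(word | class)."""
    vocab = {w for counter in word_counts.values() for w in counter}
    best_label, best_score = None, float("-inf")
    for label, prior in class_counts.items():
        score = math.log(prior / sum(class_counts.values()))
        total = sum(word_counts[label].values())
        for word in text.lower().split():
            # Laplace smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

examples = [
    ("the film was wonderful", "pos"),
    ("a wonderful moving story", "pos"),
    ("the film was terrible", "neg"),
    ("a boring terrible plot", "neg"),
]
wc, cc = train(examples)
print(classify("a wonderful film", wc, cc))  # prints: pos
```

The "learning" step here is nothing more than counting; the learned counts then decode a new input into a label, which is the runtime decoding the text describes.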
Natural language processing with neural networks
The third stage (after 2008): deep learning began to show its power in speech and images, and NLP researchers then turned their attention to it. At first, deep learning was used to compute features or create new features whose effect was then evaluated within the existing statistical learning framework; for example, search engines added deep-learning-based similarity computations between search terms and documents to improve search relevance. Since 2014, people have tried to model tasks directly with deep learning for end-to-end training. Progress has been made in machine translation, question answering, and reading comprehension, and a deep learning boom has emerged.
Key concepts and technologies
Information extraction (IE)
Information extraction is the process of extracting unstructured information embedded in text and converting it into structured data. Extracting the relationships between named entities from a natural language corpus is deeper research built on named entity recognition. The main process of information extraction has three steps: first, automatic processing of the unstructured data; second, targeted extraction of the text information; and finally, structured representation of the extracted information. The most basic work of information extraction is named entity recognition, and the core lies in the extraction of entity relationships.
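The three steps above can be sketched in miniature. The toy extractor below substitutes a single hand-written pattern for a trained named-entity recognizer; the sentence and the "X was founded by Y" relation are invented purely for illustration of how unstructured text becomes structured records.

```python
import re

# One hypothetical relation pattern standing in for a real NER + relation model.
PATTERN = re.compile(r"(?P<org>[A-Z]\w+) was founded by (?P<person>[A-Z]\w+(?: [A-Z]\w+)*)")

def extract(text):
    """Step 1: scan the raw text; step 2: match the target relation;
    step 3: emit each match as a structured record."""
    records = []
    for match in PATTERN.finditer(text):
        records.append({
            "entity1": match.group("org"),      # named entity (organization)
            "relation": "founded_by",           # the extracted relation
            "entity2": match.group("person"),   # named entity (person)
        })
    return records

text = "Acme was founded by Jane Doe. The company later moved to Berlin."
print(extract(text))
# prints: [{'entity1': 'Acme', 'relation': 'founded_by', 'entity2': 'Jane Doe'}]
```

Real systems replace the regular expression with learned recognizers, but the pipeline shape, from raw text to entities to relations to structured output, is the same.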
Automatic summarization
Automatic summarization is an information compression technology that uses a computer to automatically extract text information according to certain rules and assemble it into a short summary. It aims at two goals: first, keep the language short; second, keep the important information.
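A minimal sketch of the extractive flavor of this idea: score each sentence by the frequency of its content words and keep only the top-scoring sentences. The rule, the stopword list, and the three-sentence text are all invented for illustration; real summarizers use far richer rules.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in", "it"}

def summarize(text, n_sentences=1):
    """Keep the n highest-scoring sentences, in their original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = [w for w in re.findall(r"\w+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    # Rule: a sentence's score is the summed corpus frequency of its words.
    def score(sentence):
        return sum(freq[w] for w in re.findall(r"\w+", sentence.lower()))

    kept = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return " ".join(s for s in sentences if s in kept)

text = ("Language models process language data. "
        "Many systems need summaries. "
        "Summaries compress language data into short language descriptions.")
print(summarize(text))
```

Both goals from the text appear here: `n_sentences` keeps the output short, and the frequency score is a crude proxy for importance.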
Speech recognition technology
Speech recognition technology allows a machine to convert a speech signal into the corresponding text or command through a process of recognition and understanding; that is, it makes the machine understand the human voice. Its goal is to convert the vocabulary content of human speech into computer-readable data. To do this, continuous speech must first be broken down into units such as words and phonemes, and a set of rules for understanding semantics must be established. In terms of process, speech recognition includes front-end noise reduction, speech cutting and framing, feature extraction, and state matching. The framework can be divided into three parts: the acoustic model, the language model, and decoding.
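The "cutting and framing" step mentioned above can be shown concretely: the continuous signal is split into short overlapping frames, each of which a real system would then turn into acoustic features. The frame length and hop below (25 ms and 10 ms at 16 kHz) are typical choices assumed for this sketch, not values from the text.

```python
def frame_signal(samples, frame_len=400, hop=160):
    """Split a 1-D list of samples into overlapping fixed-length frames,
    zero-padding the final frame if it runs past the end of the signal."""
    frames = []
    for start in range(0, max(len(samples) - frame_len, 0) + 1, hop):
        frame = samples[start:start + frame_len]
        frame += [0] * (frame_len - len(frame))  # pad a short tail frame
        frames.append(frame)
    return frames

# One second of a fake 16 kHz signal (16000 samples of dummy values).
signal = [i % 7 for i in range(16000)]
frames = frame_signal(signal)
print(len(frames), len(frames[0]))  # prints: 98 400
```

Overlap matters because speech is continuous: a phoneme boundary that falls on a frame edge in one framing falls inside a frame in the next, so no acoustic event is lost between frames.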
Transformer model
In 2017, the Transformer model was first proposed by a Google team. The Transformer is a model based on the attention mechanism that accelerates deep learning algorithms. The model consists of a set of encoders and a set of decoders: the encoder processes input of any length and generates a representation of it, and the decoder converts that representation into the target output. The Transformer uses the attention mechanism to obtain the relationships between all other words and generate a new representation of each word. Its advantage is that attention can directly capture the relationships among all the words in a sentence without regard to their positions. It abandons the traditional encoder-decoder pattern, which had to be combined with the inherent structure of an RNN or CNN (Convolutional Neural Network), and uses a full attention structure in place of the LSTM, which reduces the amount of computation and improves parallel efficiency without compromising the final experimental results. But the model also has flaws: first, it is computationally expensive; second, position information is not used prominently, and it is unable to capture long-distance information.
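The core operation that lets every word relate directly to every other word is scaled dot-product attention. The sketch below implements it with plain nested lists for a single attention head; the three 2-D token "embeddings" are invented, and queries, keys, and values are all set to the input, as in simple self-attention.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """softmax(Q K^T / sqrt(d)) V: every position attends to every position."""
    d = len(keys[0])
    k_t = [list(col) for col in zip(*keys)]               # transpose K
    scores = matmul(queries, k_t)                         # Q K^T
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, values)

# Three toy token vectors; Q = K = V = X for self-attention.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = attention(x, x, x)
print([[round(v, 3) for v in row] for row in out])
```

Each output row is a weighted mixture of all value rows, which is exactly the position-independent, all-pairs interaction the paragraph describes; distance between tokens never enters the computation.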
Natural language processing technology based on traditional machine learning
Natural language processing can be divided into multiple subtasks. Traditional machine learning methods such as the SVM (support vector machine), the Markov model, and the CRF (conditional random field) can be used to process these subtasks and further improve the accuracy of the results. In terms of practical effect, however, the following shortcomings remain: (1) The performance of a traditionally trained model depends heavily on the quality of the training set, and the training set must be labeled manually, which reduces training efficiency. (2) A training set built for one field can perform very differently in another, which weakens the applicability of the training and exposes the disadvantage of a single learning method; making a training set applicable to many different fields consumes a great deal of human labeling effort. (3) When processing higher-level, more abstract natural language, the relevant features cannot be labeled manually, so traditional machine learning can only learn pre-established rules and cannot learn complex language features outside those rules.
Natural language processing technology based on deep learning
Deep learning is a major branch of machine learning. In natural language processing, deep learning models such as convolutional neural networks and recurrent neural networks are applied; by learning from the generated word vectors, they complete the classification and understanding of natural language. Compared with traditional machine learning, deep-learning-based natural language processing has the following advantages: (1) Deep learning can continually learn language features from the vectorization of words or sentences, mastering higher-level, more abstract language features and meeting the natural language processing requirements of a large amount of feature engineering. (2) Deep learning does not require experts to define features manually; it can learn high-level features automatically through the neural network.
Technical difficulties
Effective definition of content
In everyday language, the words of a sentence do not exist in isolation; all the words in a discourse must be correlated before the intended meaning can be expressed. Once a specific sentence is formed, defining relationships arise among its words, and without an effective definition the content becomes ambiguous and cannot be effectively understood. For example, take the sentence "He quietly went out to play behind his mother and sister's backs". If the scope of the phrase is not defined, it could mean either that neither the mother nor the sister knows he is going out to play, or that the mother does not know he is going out to play with his sister.
Disambiguation and ambiguity
Words and sentences used in different situations often carry multiple meanings, easily producing vague concepts or divergent interpretations. For example, the phrase "high mountains and flowing water" has multiple meanings: it can describe the natural environment, express the relationship between two close friends, or even describe the beauty of music. Natural language processing therefore needs to resolve meaning from the surrounding content, eliminating vagueness and ambiguity to express the true meaning.
Defective or irregular input
For example, foreign or regional accents encountered in speech processing, or mistakes in spelling, grammar, or optical character recognition (OCR).
Language behavior and planning
Sentences are often not just literal. For example, a good answer to "Can you pass the salt?" is to pass the salt; in most contexts "Yes" would be a bad answer, although "No" or "It's too far, I can't reach it" would be acceptable. Similarly, if a course was not offered last year, it is better to answer the question "How many students failed this course last year?" with "The course was not offered last year" than with "Nobody failed."
Associated technology
Computer science
The original purpose of natural language processing was to realize natural language dialogue between humans and computers, with the computer as one party to the dialogue; this is the premise from which the concept of natural language processing arose. For a long time, people have hoped that robots would enter daily life and become an important productive force promoting social development, especially robots possessing "human intelligence". Natural language processing, as an important part of artificial intelligence, plays an iconic role in making robots truly intelligent. In recent years, computer performance has improved greatly in data storage capacity, processing speed, and so on, making it possible to process massive data and use probability and statistics to discover the laws of language and their internal connections.
Internet technology
The emergence of the Internet has made the dissemination of information more convenient. The various new media built on Internet technology have become the main channels of information dissemination, and online chat software has multiplied the ways people communicate. This textual information, accumulated over time, has brought explosive data growth and provides massive resources for statistics-based natural language processing. Open-source platforms built on Internet technology are also an important way for researchers to obtain research resources.
Machine learning methods
Machine learning is a multi-field interdisciplinary subject that uses data and experience to improve computer algorithms and optimize computer performance. Its roots can be traced back to early statistical methods such as the least squares method and the Markov chain, but its real development dates from the 1950s. It has passed through the stages of "learning with or without knowledge", system description based on graph structures and logical structures, and extension to the learning of multiple concepts in combination with various applications; since the 1980s it has entered a fourth, newer stage that can truly make computers intelligent.
The use of semi-supervised or unsupervised machine learning methods to process massive amounts of natural language has followed the development of machine learning itself, which can be roughly divided into two stages: traditional machine learning with linear models over discrete representations, and deep learning with non-linear models over continuous representations.
Deep learning is an automatic learning algorithm comprising an input layer, hidden layers, and an output layer. The input layer receives the large amounts of data provided by researchers and is the processing object of the algorithm. The number of hidden layers is determined by the experimenter; in the hidden layers, the algorithm marks the data, discovers regularities, and establishes relationships between feature points. The output layer delivers the result to the researcher. Generally speaking, the more data the input layer receives and the greater the number of hidden layers, the better the data can be differentiated, at the cost of a larger and more difficult computation; fortunately, computer hardware has made great strides in recent years. As the latest impetus for natural language processing, machine learning has shown unprecedented advantages: (1) overcoming the sparseness of manually defined language features, since deep learning can use distributed vectors for word classification and effectively mark part-of-speech tags, word-sense tags, dependencies, and so on; (2) overcoming the incompleteness of manual marking of language features, since manual marking misses items under a heavy workload and having high-efficiency computers perform this work can greatly reduce such errors; (3) overcoming the large amount and long duration of computation in traditional machine learning algorithms, since deep learning uses matrix computation, which greatly reduces the amount of calculation.
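The input-hidden-output structure described above can be made concrete with a tiny forward pass. The sketch below has one hidden layer, each layer written as the matrix-style dot products the text mentions; the weights are random and nothing is trained, so this only illustrates the shape of the computation.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def layer(inputs, weights, biases):
    """One fully connected layer followed by a sigmoid non-linearity."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b  # dot product
        outputs.append(1.0 / (1.0 + math.exp(-z)))         # sigmoid
    return outputs

n_in, n_hidden, n_out = 4, 3, 2
w1 = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
w2 = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [0.0] * n_out

x = [0.5, -0.2, 0.1, 0.9]          # one input example (input layer)
hidden = layer(x, w1, b1)          # hidden-layer representation
output = layer(hidden, w2, b2)     # output-layer result
print([round(v, 3) for v in output])
```

Adding hidden layers means chaining more `layer` calls, which is exactly the trade-off the paragraph notes: more expressive differentiation of the data in exchange for more computation.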
Tools and platforms
NLTK: a comprehensive Python-based NLP library.
Stanford NLP: a library of NLP algorithms commonly used in academia.
Chinese NLP tools: THULAC, Harbin Institute of Technology's LTP, and jieba word segmentation.
Research hotspots
Pre-training technology
The essence of the pre-training idea is that model parameters are no longer randomly initialized but are first trained with a language model; the current paradigm for NLP tasks is pre-training followed by fine-tuning. Pre-training is of great help to NLP tasks, and there are more and more pre-trained language models, from the original Word2vec and GloVe to the universal language model ULMFiT for text classification and ELMo. The current best pre-trained language models are based on the Transformer, proposed by Vaswani et al. It is based entirely on self-attention and is currently the best feature extractor in the NLP field: it can perform parallel computation and also capture long-distance feature dependencies.
Currently the most influential pre-trained language model is BERT, a bidirectional deep language model based on the Transformer. BERT is composed of multiple layers of bidirectional Transformer encoders and comes mainly in two sizes: the base version has 12 Transformer layers, 12 attention heads per layer, and a hidden size of 768; the large version has 24 Transformer layers, 16 attention heads per layer, and a hidden size of 1,024. It can be seen that a deep and narrow model performs better than a shallow and wide one. At present, BERT performs excellently on multiple tasks such as machine translation, text classification, text similarity, and reading comprehension. The BERT model is trained in two ways: (1) masking words and predicting them; (2) predicting whether one sentence follows another.
A general language model is obtained through these two training methods, and the fine-tuning method is then used for downstream tasks such as text classification and machine translation. Compared with previous pre-trained models, BERT can capture true bidirectional contextual semantics. But BERT also has shortcomings. During training, the heavy use of the [MASK] token affects the model's effectiveness, and only 15% of the tokens in each batch are predicted, so BERT converges slowly. In addition, because the pre-training process is inconsistent with the generation process, performance on natural language generation tasks is poor; and BERT cannot complete document-level NLP tasks, being suitable only for sentence- and paragraph-level tasks.
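The first training objective, masking about 15% of the tokens and predicting them, can be sketched as a data-preparation step. The selection details below are simplified compared with the real BERT recipe (which sometimes keeps or randomly replaces a selected token instead of masking it); the sample sentence is invented.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def mask_tokens(tokens, mask_rate=0.15):
    """Replace about mask_rate of the tokens with [MASK];
    return the masked sequence and a dict of position -> original token."""
    n_to_mask = max(1, round(len(tokens) * mask_rate))
    positions = random.sample(range(len(tokens)), n_to_mask)
    masked, targets = list(tokens), {}
    for pos in positions:
        targets[pos] = tokens[pos]   # what the model must learn to predict
        masked[pos] = "[MASK]"
    return masked, targets

tokens = "natural language processing makes machines understand text".split()
masked, targets = mask_tokens(tokens)
print(masked)
print(targets)
```

The slow convergence noted above follows directly from this setup: the loss is computed only at the masked positions, so most tokens in each batch contribute no training signal.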
XLNet is a generalized autoregressive language model based on Transformer-XL. The Transformer has two disadvantages: (1) the maximum dependency distance between characters is limited by the input length; (2) when the input text exceeds 512 characters, each segment is trained separately from scratch, which reduces training efficiency and hurts model performance. In response to these two shortcomings, Transformer-XL introduced two solutions: a segment-level recurrence mechanism and relative positional encoding. Transformer-XL is faster at test time and can capture longer context.
Unsupervised representation learning has achieved great success in NLP. Under this idea, many researchers have explored different unsupervised pre-training objectives, of which autoregressive language modeling and autoencoding are the two most successful. XLNet is a generalized autoregressive method that integrates the autoregressive and autoencoding approaches. Instead of the fixed forward or backward factorization order of the traditional autoregressive model, XLNet uses random permutations of the natural language sequence to predict the word that may appear at a given position. This method not only lets every position in the sentence learn contextual information from all positions, but also constructs bidirectional semantics, better capturing contextual meaning. Since XLNet uses Transformer-XL, its performance is better, especially on tasks involving long text sequences.
Both BERT and XLNet perform very well on English corpora, but their effect on Chinese corpora is mediocre. ERNIE trains a language model on Chinese corpora. ERNIE is a knowledge-enhanced semantic representation model with excellent performance on Chinese NLP tasks such as language inference, semantic similarity, named entity recognition, and text classification. When processing Chinese corpora, ERNIE can learn the complete semantic representation of larger semantic units by modeling the predicted Chinese characters. The core of the ERNIE model is composed of Transformers, and its structure contains two main modules: the lower text encoder (T-Encoder) is responsible for capturing basic lexical and syntactic information from the input tokens, and the upper knowledge encoder (K-Encoder) is responsible for integrating knowledge information obtained from the lower layer into the text information, so that the heterogeneous information of tokens and entities can be represented in a unified feature space.
Graph neural network technology
Research on the Graph Neural Network (GNN) mainly focuses on the propagation and aggregation of information from adjacent nodes. From the moment the concept was proposed, the graph neural network drew inspiration from the convolutional neural networks of deep learning. Graph neural networks occupy a very important position in applying deep learning to non-Euclidean data; in particular, using graph structure, as the traditional Bayesian causal network does, to explain characteristics, to reason over relations defined in deep neural networks, and to address causal explainability is of great research significance. How to use deep learning methods to analyze and reason about graph-structured data has attracted a great deal of research attention.
The general inference process of a graph neural network can be represented by the sub-processes of graph node pre-representation, graph node sampling, subgraph extraction, subgraph feature fusion, and graph neural network generation and training. The specific steps are as follows:
STEP 1, graph node pre-representation: embed each node in the graph through a graph embedding method;
STEP 2, graph node sampling: sample positive and negative examples for each node, or for existing node pairs, in the graph;
STEP 3, subgraph extraction: extract the neighboring nodes of each node in the graph to construct an n-th-order subgraph, where n denotes the neighbors at the n-th layer, thus forming a general subgraph structure;
STEP 4, subgraph feature fusion: perform local or global feature extraction on each subgraph input to the neural network;
STEP 5, graph neural network generation and training: define the number of layers and the input and output parameters of the network, and train the network on the graph data.
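The neighbor aggregation that these steps build toward can be sketched in a few lines: each node's new representation is the mean of its own vector and its neighbors' vectors, one round of message passing. The 4-node graph and 2-D embeddings below are invented for illustration; real GNNs add learned weights and non-linearities around this aggregation.

```python
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0], 3: [0.0, 0.0]}

# Adjacency sets including each node itself (a self-loop),
# so a node's own features survive the aggregation.
neighbors = {n: {n} for n in features}
for a, b in edges:
    neighbors[a].add(b)
    neighbors[b].add(a)

def propagate(features, neighbors):
    """One step of mean aggregation over each node's neighborhood."""
    new_features = {}
    for node, nbrs in neighbors.items():
        dim = len(features[node])
        summed = [sum(features[m][i] for m in nbrs) for i in range(dim)]
        new_features[node] = [v / len(nbrs) for v in summed]
    return new_features

updated = propagate(features, neighbors)
print({n: [round(v, 3) for v in vec] for n, vec in sorted(updated.items())})
```

Stacking k rounds of `propagate` lets information travel k hops, which is the role of the n-th-order subgraphs in STEP 3: a node's representation comes to summarize its k-hop neighborhood.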
1. Graph convolutional neural networks
The popularity of deep learning is inseparable from the wide applicability of convolutional neural networks, and graph neural network research is no different: the longest-studied area with the most research results is the graph convolutional neural network. From the perspective of feature space, graph convolutional neural networks can be divided into two types: frequency-domain and spatial-domain.
The frequency-domain graph convolutional neural network is based on graph signal processing, defining the convolutional layer of the graph neural network as a filter; that is, the filter removes the noise signal to obtain the classification result of the input signal. In practical problems it can only handle undirected graph structures with no information on the edges. The graph of the input signal is defined by its eigendecomposable normalized Laplacian matrix L = I − D^(−1/2) A D^(−1/2), whose eigendecomposition can be expressed in the general form L = U Λ U^T, where the diagonal matrix Λ is composed of the eigenvalues λ_i arranged in order.
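The normalized Laplacian just described, L = I − D^(−1/2) A D^(−1/2) for adjacency matrix A and degree matrix D, can be constructed directly. The 3-node path graph below is invented for illustration; this sketch stops at building L, the matrix whose eigendecomposition frequency-domain graph convolutions rely on.

```python
import math

# Adjacency matrix of an undirected 3-node path graph: 0 - 1 - 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]

n = len(A)
degrees = [sum(row) for row in A]               # diagonal of D
d_inv_sqrt = [1.0 / math.sqrt(d) for d in degrees]

# L[i][j] = (1 if i == j else 0) - A[i][j] / sqrt(d_i * d_j)
L = [[(1.0 if i == j else 0.0) - d_inv_sqrt[i] * A[i][j] * d_inv_sqrt[j]
      for j in range(n)] for i in range(n)]

for row in L:
    print([round(v, 3) for v in row])
```

Because A is symmetric for an undirected graph, L is symmetric too, which is what makes the eigendecomposition L = U Λ U^T with real eigenvalues possible; this is also why, as noted above, the method is restricted to undirected graphs.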
2. Space-based graph convolutional neural networks
Similar to the convolution that convolutional neural networks perform over image pixels in deep learning, the space-based graph convolutional neural network expresses the transfer and aggregation of information between neighboring nodes by computing the convolution between a central node and its neighbors, which becomes the node's new representation in the feature domain.
Future prospects
The field of natural language processing has long been dominated by two research methods, one based on rules and one on statistics. Both have encountered bottlenecks: after reaching a certain stage, rule-based and traditional machine learning methods found it difficult to make larger breakthroughs, until improvements in computing power and data storage greatly promoted the development of natural language processing. The breakthrough in speech recognition made deep learning technology very popular, and machine translation has also made considerable progress. Google Translate now uses deep neural network technology, raising machine translation to a new level that, even if it does not meet the standard of human translation, is sufficient for most needs. Information extraction has also become more intelligent, better understanding complex sentence structures and the relationships between entities and extracting correct facts. Deep learning promotes progress in natural language processing tasks, and those tasks in turn provide broad application prospects for deep learning, leading people to invest more in algorithm design. The advancement of artificial intelligence will continue to drive natural language processing forward and confront it with the following challenges: 1) Better algorithms. Among the three elements of artificial intelligence development (data, computing power, and algorithms), the one most relevant to natural language processing researchers is algorithm design. Deep learning has shown strong advantages in many tasks, but the rationality of the backpropagation method has recently been questioned. Deep learning completes small tasks with big data; its focus is on induction, and its learning efficiency is relatively low. Whether it is possible to start from small data, analyze the underlying principles, and complete multiple tasks from a deductive perspective is worth studying in the future.
2) In-depth analysis of language. Although deep learning has greatly improved the effect of natural language processing, the field is about the science of language technology, not about finding the best machine learning method; the core is still linguistic. In the future, problems of language will also require attention to semantic understanding. From large-scale network data, through in-depth semantic analysis combined with linguistic theory, we can discover the laws of semantic generation and understanding, study the hidden patterns behind the data, and expand and improve existing knowledge models so that semantic representation becomes more accurate. Language understanding requires a combination of reason and experience: reason is a priori, while experience can expand knowledge, so it is necessary to make full use of world knowledge and linguistic theory to guide advanced technology toward understanding semantics. Part of the semantic information is implicit in distributed word vectors, and richer semantics can be expressed through different combinations of word vectors; however, the semantic capacity of the word vector is still not fully exploited. Exploring the modes of semantic representation in language, and expressing semantics completely and accurately in a formal language the computer can understand, is the key task of future research.
3) Interdisciplinarity. To understand semantics, a suitable model must be found. In exploring models, it is necessary to draw fully on research results from the philosophy of language, cognitive science, and brain science, and to approach the generation and understanding of semantics from a cognitive perspective, which may yield better models of language understanding. In today's technological innovation, multi-disciplinary intersection can better promote the development of natural language processing.
Deep learning has brought major technological breakthroughs to natural language processing, and its widespread application has greatly changed people's daily lives. When deep learning is combined with other cognitive sciences and with linguistics, it may be able to exert still greater power, solving problems of semantic understanding and bringing true "intelligence".
Although deep learning has achieved great success across the tasks of NLP, many research difficulties still need to be overcome before it can be put into large-scale use. The larger the deep neural network model, the longer the training time; how to reduce model size while keeping performance unchanged is one direction for future research. In addition, deep neural network models have poor interpretability, and there has been little progress in research on natural language generation tasks. However, with continued in-depth study of deep learning, more research results and development will appear in the NLP field in the near future.