... suggests the importance of allowing abbreviations (and even automatic word-completions), providing rapid on-line spelling correction, dictionary maintenance (including facilities for defining new macro-expansions based on function keys and special keyboard aids) as well as helpful on-line syntax checking, ambiguity reduction and other help facilities. The resistance to the use of keyboards also emphasizes the importance of exploring other possible modes of input, including speech and pointing devices. In addition, as we have already suggested, development of the sort of natural language system that would be truly useful raises a host of deep problems that are currently under investigation ... without building any other representation of what the query means. Although the SODA query that results from the analysis of an English query represents, at least in some sense, the intended meaning of the latter, it does so in a way that directly reflects the structure of the database being queried. Consequently, if two databases encode the same information in different structures, the result will be two different database queries for the same English sentence. For example, if a user asks “How many Swiss mountains are there?” the database queries generated in response to his query can look very different, depending on whether the tuples representing Swiss peaks are distinguished from those representing other peaks by their membership in a different relation, or by the presence of the word “SWZ” in a COUNTRY field.

The problem this creates is not just an aesthetic one: to acquire the semantic and pragmatic rules necessary for generating a database query directly from an English query, TEAM would have to ask the DBE about far more than the structure and contents of the database. Answering the essential questions for such an acquisition would require the kind of expertise in natural-language processing that TEAM is intended to render unnecessary. Thus, the demands of transportability preclude use of the SODA language as the primary representation of the meaning of queries.[6]

2.2 Logical Form

Logical form plays a central role in TEAM: it mediates between the way an end user thinks about the information in a database, as revealed in his queries to the system, and
the way information can be retrieved through queries in a formal database-query language. The predicates and terms in the logical form for a particular query are derived from information in the lexicon and conceptual schema;[7] hence, the choice of logical form indirectly affects the design of those components of the system and determines, in part, the information the DBE must supply.

The logical form employed by TEAM is first-order logic extended by certain intensional and higher-order operators and augmented with special quantifiers for definite determiners and interrogative determiners. Much research has been done to devise appropriate logical forms for many kinds of sentences [Moor81], but that investigation lies beyond the scope of this article.

[6] In addition, DIALOGIC was designed to be a general language understanding system that can be applied to tasks other than database querying. Therefore, it was undesirable to restrict its application by choosing an unsuitable semantic representation.

2.3 What Information Is Acquired

2.3.1 The Lexicon

The lexicon is a repository of the information about each word that is necessary for morphological, syntactic, and semantic analysis. There are two classes of lexical items: closed and open. Closed classes (e.g., pronouns, conjunctions, and determiners) contain only a finite, usually small number of lexical items. Typically, these words have complex and specialized grammatical functions, along with at least some fixed meanings that are independent of the domain. They are likely to occur with high frequency in queries to almost any database. Open classes (e.g., nouns, verbs, adjectives) are much larger and the meanings of their members tend to vary, depending on the particular database and domain. Therefore, most closed-class words are built into the initial TEAM lexicon, while open-class words are acquired for each domain separately. However, there are a number of open-class words, such as those corresponding to concepts in the initial conceptual schema (see Section 2.3.2) and
words for common units of measure (e.g., “meter”, “pound”), that are so broadly applicable to so many database domains that they are included in the initial lexicon as well.

Lexical entries include those for the names of file subjects (i.e., the entities about which some relation contains information; e.g., peaks for PEAK, and countries for WORLDC in the sample database illustrated in Figure 1.3), field names, and field values. In addition, the DBE can supply adjectives and verbs, as well as synonyms for words already acquired (see Section 2.4).

Associated with every lexical entry is syntactic and semantic information for each of its senses. Syntactic information consists of its primary category (e.g., noun, verb, or adjective), subcategory (e.g., count, unit, or mass for nouns; object types for verbs), and morphology. Semantic information depends on the syntactic category. The entry for each noun includes the sort(s) or individual(s) in the conceptual schema (Section 2.3.2) to which that noun can refer. Entries for adjectives and verbs include the conceptual predicate to which they refer, plus information about how the various syntactic constituents of a sentence map onto arguments of the predicate. Scalar adjectives (e.g., “high”) also include an indication of direction on the scale (plus or minus).

2.3.2 Conceptual Schema

The conceptual schema contains information about the objects, properties, and relations in the domain of the database. It includes sets of individuals, predicates, constraints on the arguments of predicates, and the information needed for certain pragmatic processing. The informational content is similar to that commonly encoded in semantic networks, but the apparatus used is more eclectic.

The conceptual schema consists of a sort hierarchy and descriptions of various properties of nonsort predicates. The sort hierarchy relates certain monadic predicates that play a primary role in categorizing individuals. These are called sort predicates (represented here in italics as in PERSON). TEAM was designed with a considerable amount of this conceptual information built in. Figure 3 illustrates a portion of this hierarchy. Each line connecting levels of the hierarchy signifies a set-subset relationship between two categories of individuals. The sorts connected by the small arcs directly below the nodes are disjoint; that is, no individual can be in two sorts joined in this manner.

[7] As noted previously, the specific form depends also on general syntactic, semantic, and pragmatic rules for English that are encoded in the various components of DIALOGIC.

Figure 3: A Fragment of TEAM’s Sort Hierarchy

The sort hierarchy grows as information about a database is acquired. The DBE is required to position some of the newly acquired concepts in their appropriate places in the hierarchy. Each field in the database is associated with the sort of objects that can appear in that field. Several additional properties are associated with the sorts derived from symbolic fields and from certain kinds of arithmetic fields.

With each sort obtained from a symbolic field, TEAM associates a predicate that encodes the relationship between that sort and the sort of the file subject. For example, for the relation WORLDC in Section 1.3, which includes information about capitals and continents, the system would link the sort WORLDC-CAPITAL with the predicate WORLDC-CAPITAL-OF (in this article, predicates are shown in boldface), which takes two arguments: the first of sort WORLDC-CAPITAL, the second of sort COUNTRY. This link is used in handling queries like “What is the capital of each country in Europe?” In particular, it is used to determine what it means for a capital to be “of” a country, or for a country to be “in” Europe. Additional properties of the sort indicate whether individual instances of it can modify or stand for instances of the sort of the file subject (e.g., “European countries,” but not “Europeans” can be used to refer to the countries c satisfying the predication (CONTINENT-OF c EUROPE)).

Sorts that correspond to arithmetic fields containing measures (e.g., length, age) also include information about both the implicit unit of measurement (e.g., feet, years), and
the kind of thing being measured (e.g., linear extent, temporal extent).

Several other kinds of information are associated with nonsort predicates. A delineation specifies the constraints on the sorts for each of a predicate’s arguments; multiple delineations are supported but cannot be described in this brief format. Predicates corresponding to comparative-forming adjectives (e.g., “tall”) have two additional properties: a link to the predicate that specifies the degree (e.g., PEAK-HEIGHT in our example), and an indication of polarity along the scale being measured (e.g., plus for TALL, minus for SHORT).

SEPTEMBER 1985 VOL. 8 NO. 3
a quarterly bulletin of the IEEE computer society technical committee
Database Engineering

Contents
Letter from the Editor, 1
Databases and Natural Language Processing, Z. W. Pylyshyn and R. I. Kittredge, 2
TEAM: An Experimental Transportable Natural Language Interface, P. Martin, D. E. Appelt, B. J. Grosz, and F. Pereira, 10
A Multilingual Interface to Databases, H. Lehmann, N. Ott, and M. Zoeppritz, 23
Evaluation and Assessment of a Domain-Independent Natural Language Query ...

... add information as he gains experience with TEAM and the types of questions that are asked by the end users. In an attempt to satisfy all these constraints, the menu-oriented system depicted in Figure 4 was developed. The acquisition system consists of a menu of general commands at the very top, three menus associated with relations, fields, and lexical items respectively, and, at the bottom, a window for questions and answers. When the DBE uses the mouse to select one of the items from the three menus, a set of questions appears in the question-answering area at the bottom of the display, to which he can then respond.

Figure 5: Acquiring the Virtual Relations PKCONT and HEMIC

One of the general principles of acquisition is evident from this display, namely, that the acquisition is centered upon the relations and fields in the database, because this is the information most familiar to the DBE. The answers to each question can affect the lexicon, the conceptual schema, and
the database schema. The DBE need not be aware of exactly why TEAM poses the questions it does; all he has to do is answer them correctly. Even the entries displayed in the word menu owe their presence to questions about the database. The DBE volunteers entries to this menu only in the case of verb acquisition, to supply an adjective corresponding to some noun already in TEAM’s lexicon, or to enter a synonym for some lexicon-resident word.

The DBE is assumed not to have any knowledge of formal linguistics or of natural-language processing methods. He is assumed, however, to know some general facts about English (for example, what proper nouns, verbs, plurals, and tense are), but nothing more detailed than that. If more sophisticated linguistic information is required, as in the case of verb acquisition, TEAM proceeds by asking questions about sample sentences, allowing the DBE to rely on his intuition as a native speaker, and extracting the information it needs from his responses.

Virtual relations are specified iconically. The left side of Figure 5 shows the acquisition of a virtual relation that identifies the continent (PKCONT-CONTINENT, derived from WORLDC-CONTINENT) of a peak (PKCONT-NAME, from PEAK-NAME) by performing a database join on the PEAK-COUNTRY and WORLDC-CONTINENT fields. Similarly, the right side of Figure 5 shows the acquisition of the virtual relation that encodes the hemisphere (HEMIC-HEMI) of a country (HEMIC-NAME) by joining on the WORLDC-CONTINENT and
CONT-NAME fields.

If he wishes, the DBE can change previous answers. Incremental updates are possible because most of the methods for updating the various TEAM structures (lexicon, schemata) were devised to undo the effects of previous answers before the effects of new answers could be asserted.

Help information is always available to assist the DBE when he is unsure how to answer a question. Selecting the question text with the mouse produces a more elaborate description of the information TEAM is trying to elicit, usually accompanied by pertinent examples.

Finally, the acquisition component keeps track of what information remains to be supplied before TEAM has the minimum it needs to handle queries. The DBE does not have to determine himself how much information is sufficient; all he has to do is to perceive that no acquisition window indicates remaining unanswered questions. Of course, the DBE can always provide information beyond the minimum, for example, by supplying additional verbs, derived adjectives, or synonyms.

3 Conclusions

TEAM has been tested in a variety of multifile database domains by a fairly large number of people in addition to its original implementation team. While the testing has been much less rigorous than would be required for an actual product, enough has been learned to conclude that the basic ideas “work”, namely, that it is possible to build a natural-language interface that is general enough to allow its adaptation to new domains by users who are familiar with these domains, but are themselves neither experts on the system itself nor specialists in AI or linguistics.

TEAM handles a wide range of verbs, a capability that is absolutely essential for fluent natural-language communication. As it embodies no discourse model, its handling of pronoun resolution and determiner scoping is correspondingly limited. While its grammar coverage is quite extensive, the formalism used to represent it and the processes used to implement it are yielding to newer and more perspicuous designs [Shie84]. We are now investigating ways to provide transportability in natural-language systems that can interact with a variety of software services beyond database access and in which more extensive discourse capabilities will be embodied.

Acknowledgments

Jerry R. Hobbs, Robert C. Moore, Jane J. Robinson,
and Daniel Sagalowicz played important roles in the design of TEAM. Armar Archbold, Norman Haas, Gary Hendrix, Lorna Shinkle, Mark Stickel and David H. Warren also contributed to the project.[9]

References

[Gros85] Barbara Grosz, Douglas E. Appelt, Paul Martin, and Fernando Pereira. TEAM: An Experiment in the Design of Transportable Natural Language Interfaces. Technical Note, Artificial Intelligence Center, SRI International, Menlo Park, California, 1985.

[Gros82] Barbara Grosz, Norman Haas, Gary G. Hendrix, Jerry Hobbs, Paul Martin, Robert Moore, Jane Robinson, and Stan Rosenschein. DIALOGIC: A Core Natural-Language Processing System. Technical Note 270, Artificial Intelligence Center, SRI International, Menlo Park, California, November 1982.

[Hend77] Gary G. Hendrix. Human engineering for applied natural language processing. In Proc. of the Fifth International Joint Conference on Artificial Intelligence, pages 183–191, International Joint Conferences on Artificial Intelligence, Cambridge, Massachusetts, August 1977.

[Mart83] Paul Martin, Douglas Appelt, and Fernando Pereira. Transportability and generality in a natural-language interface system. In Alan Bundy, editor, Proc. of the Eighth International Joint Conference on Artificial Intelligence, pages 573–581, International Joint Conferences on Artificial Intelligence, August 1983.

[Moor79] Robert C. Moore. Handling Complex Queries in a Distributed Database.
Technical Note 470, Artificial Intelligence Center, SRI International, Menlo Park, California, October 1979.

[Moor81] Robert C. Moore. Problems in logical form. In Proc. of the 19th Annual Meeting of the Association for Computational Linguistics, Stanford, California, 1981.

[Robi82] Jane J. Robinson. Diagram: a grammar for dialogues. Communications of the ACM, 25(1):27–47, 1982.

[Shie84] Stuart M. Shieber. The design of a computer language for linguistic information. In Proc. of Coling84, pages 362–366, Association for Computational Linguistics, June 1984.

[Walt75] David Waltz. Natural-language access to a large database: an engineering approach. In Proc. of the Fourth International Joint Conference on Artificial Intelligence, pages 868–872, International Conferences on Artificial Intelligence, September 1975.

[9] The development of TEAM was supported by DARPA contracts N00039-80-C-0645, N00039-83-C-0109, and N00039-80-C-0575; the National Library of Medicine NIH grant LM03611; and NSF grant IST-8209346.

A MULTILINGUAL INTERFACE TO DATABASES

Hubert Lehmann, Nikolaus Ott, Magdalena Zoeppritz
IBM Germany, Heidelberg Scientific Center

Abstract

The User Specialty Languages (USL) System, a portable interface to relational databases in restricted English, French, German, Italian, and Spanish is described. We briefly discuss our design objectives, theoretical and practical problems we encountered during system realization, and the consequences we have drawn for a successor project. The German and English versions of the USL System have been extensively evaluated with real users and real applications, which not only showed us where we could improve our system but also provided valuable insights for the methods of software ergonomics.

Introduction

When we talk about interaction with databases we must clarify two things:
1. who are the groups of people who want to obtain information, and
2. what are the operations to be performed on the database to yield the information desired?
Then we can think about how these operations are to be specified by a given user.

A number of query languages have been developed during the 70’s and
efforts to show their “user-friendliness”, their appropriateness for “non-DP experts” have been made with greater or lesser success (cf. e.g. [LEHN79] for a survey). A different approach is to regard human question-answering dialog as a model for the interaction with a database, as presumably it is best to talk to the computer in one’s own language. The problem then is to relate natural language expressions to data in the database and to the operations to be performed on them.

In the USL project we showed that
• fragments of natural language can be implemented that are large enough to be usable for database access,
• the syntax and semantics of such fragments can be described in such a way that the system becomes independent of the particular domain of discourse (this property has become known as (trans)portability),
• adaptation to a new domain can be achieved without training in linguistics,
• natural language interfaces can be built which operate on standard databases (i.e. neither require special representation nor manipulation of data).

Design principles

The USL System was designed with the objectives to be usable in realistic applications, to be portable, to enable adaptation to new domains by non-linguists, and to provide an interface to standard databases. A later goal was the adaptation to a variety of different languages, which brought in a few new aspects, but was on the whole a relatively straightforward task. These objectives had a number of consequences for the design of the USL system which we discuss in the following sections.

Consequences of portability

A system is portable if...