*****************************************
*                                       *
*  ATTRIBUTE SET BIB-1 (Z39.50-1995):   *
*               SEMANTICS               *
*                                       *
*****************************************

September 1995

Modified October 1997 to change the definition of Use attribute
Document identifier (1032).

- Old Definition: A persistent identifier, or Doc-ID, assigned by a
  server, that uniquely identifies a document on that server.

- New Definition: An identifier or Doc-ID, assigned by a server, that
  uniquely identifies a document on that server. May or may not be
  persistent. May be, for example, a URL.


1. ABOUT THIS DOCUMENT

This document provides suggested interpretations for the semantics of
the bib-1 Attribute Set. This document represents consensus among the
members of the Z39.50 Implementors Group. It will be maintained as an
official document of the Z39.50 Maintenance Agency, and will be revised
periodically to reflect the most pragmatic guidelines for
interoperability agreed upon by the Implementors Group.

This document contains references to certain definitions and behaviors
that are specific to the target. These can be problem areas for
interoperability. The specific definitions and behaviors may be
described in a "Profile" document. In the absence of a profile, one
must contact the service provider and ask. The behavior may be UNIQUE
to that target. The expectation is that, over time, more and more will
be documented explicitly in the standard and in profiles.


2. ATTRIBUTES

The attributes of Attribute Set bib-1 are used to indicate the
characteristics of a search term in a Type-1 query when the query is of
the form AttributeList+term (as described in section 3.7.1 of
Z39.50-1995). The descriptions in this document apply when all
attributes within 'AttributeList' are from the bib-1 attribute set. It
does not define semantics when bib-1 is mixed with other attribute
sets.

There are six types of attributes: Use, Relation, Position, Structure,
Truncation, and Completeness.
The Use attribute, if provided, identifies a set of access points
against which the term is to be matched. The Relation, Completeness,
Truncation and Position attributes, if provided, specify additional
match criteria. The Structure attribute, if provided, identifies the
form in which the term has been supplied.

Within an attribute list, each attribute type is optional. However, if
a particular attribute type is not supplied, this document does not
address target behavior -- a given target might supply a default
attribute, dynamically select an appropriate attribute based on the
other attributes supplied, or fail the search because it requires that
the attribute type be supplied.

While Attribute Set bib-1 was originally established for use in the
retrieval of records that are representable using the MARC formats for
information interchange, it can also be used for the retrieval of
records or documents representable in other formats.

Within an attribute list, multiple instances of a given type of
attribute element are undefined and discouraged. Use of version 3
semantic actions is encouraged.

The remainder of this section describes each of the six attribute
types, in order by the type number:

     Use attributes (type = 1)
     Relation attributes (type = 2)
     Position attributes (type = 3)
     Structure attributes (type = 4)
     Truncation attributes (type = 5)
     Completeness attributes (type = 6)


2.1 USE ATTRIBUTES (TYPE = 1)

A Use attribute specifies an access point (e.g., corporate name,
personal name, title, subject).

The Use attributes are given below in two separate tables. Table 1 is
similar to the listing in Z39.50-1995, Appendix 3, ATR: Attribute Sets,
in that the attributes are in order by their values and the same names
that appear in the Appendix appear in the left column of Table 1. The
right column of Table 1 contains a reference to the name of the
attribute that is used in Table 2.
Table 2 rearranges the Use attributes alphabetically by group name in
an attempt to bring similar Use attributes together. The groups are
somewhat arbitrary; no rigorous classification of the attributes has
been attempted.

In Table 2, all attribute names are followed by their values, a brief
definition or description, and tag values of representative USMARC
bibliographic format fields that would contain data that could be
described in the search by using the attribute. Whenever possible,
definitions are taken from the Anglo-American Cataloguing Rules or the
USMARC Format for Bibliographic Data, as these are the guidelines that
are used by a significant number of libraries for formulating data.

In Table 2, the notation '$' following a USMARC tag refers to a
subfield of the named field. The notation 'i' plus a digit following a
USMARC tag (for example, 'i2' in '650i2') refers to a value of the
second indicator in the named field; when the second indicator of the
field has that value, the data in the field is associated with that
Use attribute.
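
The tag shorthand described above can be read mechanically. The sketch
below is illustrative only: the names TagRef and parse_tag_ref and the
regular expression are hypothetical, not part of Z39.50 or USMARC. It
assumes the simple forms "NNN", "NNN$x" and "NNNid" and deliberately
rejects the other forms that Table 2 also uses (character positions
such as "008/07-10", ranges such as "21X-24X", and "Leader/06").

```python
import re
from typing import NamedTuple, Optional


class TagRef(NamedTuple):
    """One Table 2 tag reference (hypothetical helper type)."""
    tag: str                    # e.g. "650", or a pattern such as "6XX"
    subfield: Optional[str]     # "c" for "260$c", otherwise None
    indicator2: Optional[str]   # "2" for "650i2", otherwise None


# Assumed shorthand: three tag characters, then an optional "$" plus a
# subfield code, or an optional "i" plus a second-indicator value.
_TAG_REF = re.compile(r"^([0-9X]{3})(?:\$([a-z0-9])|i([0-9]))?$")


def parse_tag_ref(text: str) -> TagRef:
    """Split a reference such as '650i2' or '260$c' into its parts."""
    match = _TAG_REF.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized tag reference: {text!r}")
    return TagRef(match.group(1), match.group(2), match.group(3))
```

For instance, parse_tag_ref("650i2") yields tag "650" with second
indicator "2", and parse_tag_ref("260$c") yields tag "260" with
subfield "c".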

TABLE 1: USE ATTRIBUTES FROM Z39.50-1995 APPENDIX 3, ATR: ATTRIBUTE SETS

Use                          Value  Reference to Group Name Used in Table 2
---------------------------  -----  ---------------------------------------
Personal name                1      Name-personal
Corporate name               2      Name-corporate
Conference name              3      Name-conference
Title                        4      Title
Title series                 5      Title-series
Title uniform                6      Title-uniform
ISBN                         7      Identifier-ISBN
ISSN                         8      Identifier-ISSN
LC card number               9      Control number-LC
BNB card number              10     Control number-BNB
BGF(sic) number              11     Control number-BNF
Local number                 12     Control number-local
Dewey classification         13     Classification-Dewey
UDC classification           14     Classification-UDC
Bliss classification         15     Classification-Bliss
LC call number               16     Classification-LC
NLM call number              17     Classification-NLM
NAL call number              18     Classification-NAL
MOS call number              19     Classification-MOS
Local classification         20     Classification-local
Subject heading              21     Subject
Subject Rameau               22     Subject-RAMEAU
BDI index subject            23     Subject-BDI
INSPEC subject               24     Subject-INSPEC
MESH subject                 25     Subject-MESH
PA subject                   26     Subject-PA
LC subject heading           27     Subject-LC
RVM subject heading          28     Subject-RVM
Local subject index          29     Subject-local
Date                         30     Date
Date of publication          31     Date-publication
Date of acquisition          32     Date-acquisition
Title-key                    33     Title-key
Title collective             34     Title-collective
Title parallel               35     Title-parallel
Title cover                  36     Title-cover
Title added-title-page       37     Title-added-title-page
Title caption                38     Title-caption
Title running                39     Title-running
Title spine                  40     Title-spine
Title other variant          41     Title-other-variant
Title former                 42     Title-former
Title abbreviated            43     Title-abbreviated
Title expanded               44     Title-expanded
Subject PRECIS               45     Subject-PRECIS
Subject RSWK                 46     Subject-RSWK
Subject subdivision          47     Subject-subdivision
Number natl bibliography     48     Identifier-national-bibliography
Number legal deposit         49     Identifier-legal-deposit
Number govt publication      50     Classification-government-publication
Number publisher for music   51     Identifier-publisher-for-music
Number DB                    52     Control number-DB
Number local call            53     Identifier-local-call
Code--language               54     Code-language
Code--geographic area        55     Code-geographic-area
Code--institution            56     Code-institution
Name and title               57     Name-and-title
Name geographic              58     Name-geographic
Place publication            59     Name-geographic-place-publication
CODEN                        60     Identifier-CODEN
Microform generation         61     Code-microform-generation
Abstract                     62     Abstract
Note                         63     Note
Author-title                 1000   Author-name-and-title
Record type                  1001   Code-record-type
Name                         1002   Name
Author                       1003   Author-name
Author-name personal         1004   Author-name-personal
Author-name corporate        1005   Author-name-corporate
Author-name conference       1006   Author-name-conference
Identifier--standard         1007   Identifier-standard
Subject--LC children's       1008   Subject-LC-children's
Subject name--personal       1009   Subject-name-personal
Body of text                 1010   Body of text
Date/time added to database  1011   Date/time added to database
Date/time last modified      1012   Date/time last modified
Authority/format identifier  1013   Identifier-authority/format
Concept-text                 1014   Concept-text
Concept-reference            1015   Concept-reference
Any                          1016   Any
Server choice                1017   Server-choice
Publisher                    1018   Name-publisher
Record source                1019   Record-source
Editor                       1020   Name-editor
Bib-level                    1021   Code-bib-level
Geographic class             1022   Code-geographic-class
Indexed by                   1023   Indexed-by
Map scale                    1024   Code-map-scale
Music key                    1025   Music-key
Related periodical           1026   Title-related-periodical
Report number                1027   Identifier-report
Stock number                 1028   Identifier-stock
Thematic number              1030   Identifier-thematic
Material type                1031   Material-type
Doc ID                       1032   Identifier-document
Host item                    1033   Title-host-item
Content type                 1034   Content-type
Anywhere                     1035   Anywhere
Author-Title-Subject         1036   Author-Title-Subject


TABLE 2: USE ATTRIBUTES (CLASSIFIED AND DEFINED)

Use                     Value Definition                       USMARC tag(s)
----------------------  ----- -------------------------------  ------------------
Abstract                62    An abbreviated, accurate         520
                              representation of a work,
                              usually without added
                              interpretation or criticism.

Any                     1016  The record is selected if there exists a Use
                              attribute that the target supports (and
                              considers appropriate - see note 1) such that
                              the record would be selected if the target
                              were to substitute that attribute.

                              Notes:
                              (1) When the origin uses 'Any' the intent is
                                  that the target locate records via
                                  commonly used access points. The target
                                  may define 'Any' to refer to a selected
                                  set of Use attributes corresponding to
                                  its commonly used access points.
                              (2) In set terminology: when Any is the Use
                                  attribute, the set of records selected is
                                  the union of the sets of records selected
                                  by each of the (appropriate) Use
                                  attributes that the target supports.

Anywhere                1035  The record is selected if the term value (as
                              qualified by the other attributes) occurs
                              anywhere in the record.

                              Note: A target might choose to support
                              'Anywhere' only in combination with specific
                              (non-Use) attributes. For example, a target
                              might support 'Anywhere' only in combination
                              with the Relation attribute 'AlwaysMatches'
                              (see below), to locate all records in a
                              database.

                              Notes on relationship of Any and Anywhere:
                              (1) A target may support Any but not
                                  Anywhere, or vice versa, or both.
                                  However, if a target supports both, then
                                  it should exclude 'Anywhere' from the
                                  list of Use attributes corresponding to
                                  'Any' (if it does not do so, then the
                                  set of records located by 'Any' will be
                                  a superset of those located by
                                  'Anywhere').
                              (2) A distinction between the two attributes
                                  may be informally expressed as follows:
                                  'Anywhere' might result in more expensive
                                  searching than 'Any'. If the target (and
                                  origin) support both 'Any' and
                                  'Anywhere', and the origin uses 'Any'
                                  (rather than 'Anywhere'), it is asking
                                  the target to locate the term only if it
                                  can do so relatively inexpensively.

Author-name             1003  A personal or corporate author,  100, 110, 111, 400,
                              or a conference or meeting       410, 411, 700, 710,
                              name. (No subject name           711, 800, 810, 811
                              headings are included.)

Author-name-and-        1000  A personal or corporate author,  100/2XX, 110/2XX,
title                         or a conference or meeting       111/2XX, subfields
                              name, and the title of the       $a & $t in
                              item. (No subject name           following: 400,410,
                              headings are included.) The      411, 700, 710, 711,
                              syntax of the name-title         800, 810, 811
                              combination is up to the target,
                              unless used with the Structure
                              attribute Key (see below).

Author-name-            1005  An organization or a group       110, 410, 710, 810
corporate                     of persons that is identified
                              by a particular name. (Differs
                              from attribute "name-corporate
                              (2)" in that corporate name
                              subject headings are not
                              included.)

Author-name-            1006  A meeting of individuals or      111, 411, 711, 811
conference                    representatives of various
                              bodies for the purpose of
                              discussing topics of common
                              interest. (Differs from
                              attribute "name-conference (3)"
                              in that conference name subject
                              headings are not included.)

Author-name-personal    1004  A person's real name,            100, 400, 700, 800
                              pseudonym, title of nobility,
                              nickname, or initials. (Differs
                              from attribute "name-personal
                              (1)" in that personal name
                              subject headings are not
                              included.)

Author-Title-Subject    1036  An author or a title or a        1XX, 2XX, 4XX,
                              subject.                         6XX, 7XX, 8XX

                              Note: When the Use attribute is
                              Author-name-and-title (1000) the term
                              contains both an author name and a title.
                              When the Use attribute is
                              Author-Title-Subject (1036), the term
                              contains an author name or a title or a
                              subject.

Body of text            1010  Used in full-text searching to indicate that
                              the term is to be searched only in that
                              portion of the record that the target
                              considers the body of the text, as opposed to
                              some other discriminated part such as a
                              headline, title, or abstract.

Classification-Bliss    15    A classification number from the Bliss
                              Classification, developed by Henry Evelyn
                              Bliss.

Classification-Dewey    13    A classification number from     082
                              the Dewey Decimal
                              Classification, developed by
                              Melvil Dewey.

Classification-         50    A classification number          086
government-publication        assigned to a government
                              document by a government agency
                              at any level (e.g., state,
                              national, international).

Classification-LC       16    A classification number from     050
                              the US Library of Congress
                              Classification.

Classification-local    20    A local classification number from a system
                              not specified elsewhere in this list of
                              attributes.

Classification-NAL      18    A classification number from     070
                              the US National Agricultural
                              Library Classification.

Classification-NLM      17    A classification number from     060
                              the US National Library of
                              Medicine Classification.

Classification-MOS      19    A classification number from Mathematics
                              Subject Classification, compiled in the
                              Editorial Offices of Mathematical Reviews and
                              Zentralblatt fur Mathematik.

Classification-UDC      14    A classification number from     080
                              Universal Decimal
                              Classification, a system based
                              on the Dewey Decimal
                              Classification.

Code-bib-level          1021  A one-character alphabetic       Leader/07
                              code indicating the
                              bibliographic level of the
                              record, such as monograph,
                              serial or collection.

Code-geographic-area    55    A code that indicates the        043
                              geographic area(s) that appear
                              or are implied in the headings
                              assigned to the item during
                              cataloging.

Code-geographic-        1022  A code that represents the       052
class                         geographic area and if
                              applicable the geographic
                              subarea covered by an item. The
                              codes are derived from the LC
                              Classification-Class G and the
                              expanded Cutter number list.

Code-institution        56    An authoritative-agency          040, 852$a
                              symbol for an institution that
                              is the source of the record or
                              the holding location. The code
                              space is defined by the target.

Code-language           54    A code that indicates the        008/35-37, 041
                              language of the item. The codes
                              are defined by the target.

Code-map-scale          1024  Coded form of cartographic       034
                              mathematical data, including
                              scale, projection and/or
                              coordinates related to the
                              item.

Code-microform-         61    The code specifying the          007/11
generation                    generation of a microform.

Code-record-type        1001  A code that specifies the        Leader/06
                              characteristics and defines the
                              components of the record. The
                              codes are target-specific.

Concept-reference       1015  Used within Z39.50-1988; included here for
                              historical reasons but its use is deprecated.

Concept-text            1014  Used within Z39.50-1988; included here for
                              historical reasons but its use is deprecated.

Content-type            1034  The type of materials            derived value
                              contained in the item or         from 008/24-27
                              publication. For example:
                              review, catalog, encyclopedia,
                              directory.

Control number-BNB      10    Character string that uniquely   015
                              identifies a record in the
                              British National Bibliography.

Control number-BNF      11    Character string that uniquely   015
                              identifies a record in the
                              Bibliotheque Nationale
                              Francais.

Control number-DB       52    Character string that uniquely   015
                              identifies a record in the
                              Deutsche Bibliothek.

Control number-LC       9     Character string that uniquely   010, 011
                              identifies a record in the
                              Library of Congress database.

Control number-local    12    Character string that uniquely   001, 035
                              identifies a record in a local
                              system (i.e., any system that
                              is not one of the four listed
                              above).

Date                    30    The point of time at which       005, 008/00-05,
                              a transaction or event           008/07-10, 260$c,
                              takes place.                     008/11-14, 033, etc.

Date-publication        31    The date (usually year) in       008/07-10, 260$c,
                              which a document is published.   046, 533$d

Date-acquisition        32    The date when a document was     541$d
                              acquired.

Date/time added to      1011  The date and time that a         008/00-05
database                      record was added to the
                              database.

Date/time last          1012  The date and time a record       005
modified                      was last updated.

Identifier--            1013  Used in full-text searching to indicate to
authority/format              the target system the format of the document
                              that should be returned to the originating
                              system. The attribute carries not only the
                              format code, but also the authority (e.g.,
                              system) that assigned that code.

Identifier-CODEN        60    A six-character, unique,         030
                              alphanumeric code assigned to
                              serial and monographic
                              publications by the CODEN
                              section of the Chemical
                              Abstracts Service.

Identifier-document     1032  An identifier or Doc-ID, assigned by a
                              server, that uniquely identifies a document
                              on that server. May or may not be
                              persistent. May be, for example, a URL.
                              Note: this definition was modified October
                              1997.

Identifier-ISBN         7     International Standard Book      020
                              Number -- internationally
                              agreed upon number that
                              identifies a book uniquely.
                              Cf. ANSI/NISO Z39.21 and
                              ISO 2108.

Identifier-ISSN         8     International Standard Serial    022, 4XX$x,
                              Number -- internationally        7XX$x
                              agreed upon number that
                              identifies a serial uniquely.
                              Cf. ANSI/NISO Z39.9 and
                              ISO 3297.

Identifier-legal-       49    The copyright registration       017
deposit                       number that is assigned to an
                              item when the item is deposited
                              for copyright.

Identifier-local-call   53    Call number (e.g., shelf location) assigned
                              by a local system (not a classification
                              number).

Identifier-national-    48    Character string that uniquely   015
bibliography                  identifies a record in a
                              national bibliography.

Identifier-publisher-   51    A formatted number assigned      028
for-music                     by a publisher to a sound
                              recording or to printed music.

Identifier-report       1027  A report number assigned to      027, 088
                              the item. This number could be
                              the STRN (Standard Technical
                              Report Number) or another
                              report number. Cf. ANSI/NISO
                              Z39.23 and ISO 10444.

Identifier-standard     1007  Standard numbers such as ISBN,   010, 011, 015, 017,
                              ISSN, music publishers'          018, 020, 022, 023,
                              numbers, CODEN, etc., that       024, 025, 027, 028,
                              are indexed together in many     030, 035, 037
                              online public-access catalogs.

Identifier-stock        1028  A stock number that could be     037
                              used for ordering the item.

Identifier-thematic     1030  The numeric designation for a    $n in the following:
                              part/section of a work such as   130, 240, 243, 630,
                              the serial, opus or thematic     700, 730
                              index number.

Indexed-by              1023  For serials, a publication       510
                              in which the serial has been
                              indexed and/or abstracted.

Material-type           1031  A free-form string, more         derived value from
                              specific than the one-letter     Leader/06-07, 007,
                              code in Leader/06, that          008, and 502
                              describes the material type of
                              the item, e.g., cassette, kit,
                              computer database, computer
                              file.

Music-key               1025  A statement of the key in        $r in the following:
                              which the music is written.      130, 240, 243, 630,
                                                               700, 730

Name                    1002  The name of a person, corporate  100, 110, 111, 400,
                              body, conference, or meeting.    410, 411, 600, 610,
                              (Subject name headings are       611, 700, 710, 711,
                              included.)                       800, 810, 811

Name-and-title          57    The name of a person, corporate  100/2XX, 110/2XX,
                              body, conference, or meeting,    111/2XX, subfields
                              and the title of an item.        $a & $t in
                              (Subject name headings are       following: 400,410,
                              included.) The syntax of the     411, 600, 610, 611,
                              name-title combination is up     700, 710, 711, 800,
                              to the target, unless used       810, 811
                              with the Structure attribute
                              Key (see below).

Name-corporate          2     An organization or a group       110, 410, 610, 710,
                              of persons that is identified    810
                              by a particular name. (Subject
                              name headings are included.)

Name-conference         3     A meeting of individuals or      111, 411, 611, 711,
                              representatives of various       811
                              bodies for the purpose of
                              discussing topics of common
                              interest. (Subject name
                              headings are included.)

Name-editor             1020  A person who prepared for        100 $a or 700 $a when
                              publication an item that is      the corresponding $e
                              not his or her own.              contains value 'ed.'

Name-geographic         58    Name of a country,               651
                              jurisdiction, region, or
                              geographic feature.

Name-geographic-place-  59    City or town where an item       008/15-17, 260$a
publication                   was published.

Name-personal           1     A person's real name,            100, 400, 600, 700,
                              pseudonym, title of nobility,    800
                              nickname, or initials.

Name-publisher          1018  The organization responsible     260$b
                              for the publication of the
                              item.

Note                    63    A concise statement in which     5XX
                              such information as extended
                              physical description,
                              relationship to other works, or
                              contents may be recorded.

Record-source           1019  The USMARC code or name of the   008/39, 040
                              organization(s) that created
                              the original record, assigned
                              the USMARC content designation
                              and transcribed the record into
                              machine-readable form, or
                              modified the existing USMARC
                              record; the cataloging source.

Server-choice           1017  The target substitutes one or more access
                              points. The origin leaves the choice to the
                              target.

                              Notes on relationship of Any and
                              Server-choice:
                              (1) When the origin uses 'Server-choice' it
                                  is asking the target to select one or
                                  more access points, and to use its best
                                  judgment in making that selection; the
                                  origin is asking the target to make a
                                  choice of access points. When 'Any' is
                                  used, there is no selection process
                                  involved; the target is to apply all of
                                  the (appropriate) supported Use
                                  attributes.
                              (2) The target might support 'Any' and not
                                  'Server-choice', or vice versa, or both.
                                  If the target supports both, when the
                                  origin uses 'Server-choice', the target
                                  might choose 'Any'; however, it might
                                  choose any other Use attribute.

Subject                 21    The primary topic on which a     600, 610, 611, 630,
                              work is focused.                 650, 651, 653, 654,
                                                               655, 656, 657, 69X

Subject-BDI             23    Subject headings from Bibliotek Dokumentasjon
                              Informasjon -- a controlled subject
                              vocabulary used and maintained by the five
                              Nordic countries (Denmark, Finland, Iceland,
                              Norway, and Sweden).

Subject-INSPEC          24    Subject headings from            600i2, 610i2,
                              Information Services for the     611i2, 630i2,
                              Physics and Engineering          650i2, 651i2
                              Communities -- the Information
                              Services Division of the
                              Institution of Electrical
                              Engineers.

Subject-LC              27    Subject headings from            600i0, 610i0,
                              US Library of Congress           611i0, 630i0,
                              Subject Headings.                650i0, 651i0

Subject-LC-             1008  Subject headings, for use        600i1, 610i1,
children's                    with children's literature,      611i1, 630i1,
                              that conform to the              650i1, 651i1
                              formulation guidelines in the
                              "AC Subject Headings" section
                              of the Library of Congress
                              Subject Headings.

Subject-local           29    Subject headings defined locally.

Subject-MESH            25    Subject headings from            600i2, 610i2,
                              Medical Subject Headings --      611i2, 630i2,
                              maintained by the US National    650i2, 651i2
                              Library of Medicine.

Subject-name-           1009  A person's real name,            600
personal                      pseudonym, title of nobility,
                              nickname, or initials that
                              appears in a subject heading.

Subject-PA              26    Subject headings from            600i2, 610i2,
                              Thesaurus of Psychological       611i2, 630i2,
                              Index Terms -- maintained        650i2, 651i2
                              by the Retrieval Services Unit
                              of the American Psychological
                              Association.

Subject-PRECIS          45    Subject headings from PREserved Context Index
                              System -- a string of indexing terms set down
                              in a prescribed order, each term being
                              preceded by a manipulation code which governs
                              the production of pre-coordinated subject
                              index entries under selected terms --
                              maintained by the British Library.

Subject-RAMEAU          22    Subject headings from Repertoire d'authorite
                              de matieres encyclopedique unifie --
                              maintained by the Bibliotheque Nationale
                              (France).

Subject-RSWK            46    Subject headings from Regeln fur den
                              Schlagwortkatalog -- maintained by the
                              Deutsches Bibliotheksinstitut.

Subject-RVM             28    Subject headings from            600i6, 610i6,
                              Repertoire des vedettes-         611i6, 630i6,
                              matiere -- maintained by the     650i6, 651i6
                              Bibliotheque de l'Universite de
                              Laval.

Subject-subdivision     47    An extension to a subject        6XX$x, 6XX$y,
                              heading indicating the form,     6XX$z
                              place, period of time treated,
                              or aspect of the subject
                              treated.

Title                   4     A word, phrase, character,       130, 21X-24X, 440,
                              or group of characters,          490, 730, 740, 830,
                              normally appearing in an item,   840, subfield $t
                              that names the item or the       in the following:
                              work contained in it.            400, 410, 411, 600,
                                                               610, 611, 700, 710,
                                                               711, 800, 810, 811

Title-abbreviated       43    Shortened form of the title;     210, 211 (obs.),
                              either assigned by national      246
                              centers under the auspices
                              of the International Serials
                              Data System, or a title (such
                              as an acronym) that is
                              popularly associated with the
                              item.

Title-added-title-page  37    A title on a title page          246i5
                              preceding or following the
                              title page chosen as the basis
                              for the description of the
                              item. It may be more general
                              (e.g., a series title page), or
                              equally general (e.g., a title
                              page in another language).

Title-caption           38    A title given at the beginning   246i6
                              of the first page of the text.

Title-collective        34    A title proper that is an        243
                              inclusive title for an item
                              containing several works.

Title-cover             36    The title printed on the         246i4
                              cover of an item as issued.

Title-expanded          44    An expanded (or augmented)       214 (obs.), 246
                              title has been enlarged with
                              descriptive words by the
                              cataloger to provide additional
                              indexing and searching
                              capabilities.

Title-former            42    A former title or title          247, 780
                              variation when one
                              bibliographic record represents
                              all issues of a serial that has
                              changed title.

Title-host-item         1033  The title of the item            773$t
                              containing the part described
                              in the record, for example, a
                              journal title when the record
                              describes an article in the
                              journal.

Title-key               33    The unique name assigned to      222
                              a serial by the International
                              Serials Data System (ISDS).

Title-other-variant     41    A variation from the title       212 (obs.), 246i3,
                              page title appearing elsewhere   247, 740
                              in the item (e.g., a variant
                              cover title, caption title,
                              running title, or title from
                              another volume) or in another
                              issue.

Title-parallel          35    The title proper in another      246i1
                              language and/or script.

Title-related-          1026  Serial titles related to this    247, 780, 785
periodical                    item, either the immediate
                              predecessor or the immediate
                              successor.

Title-running           39    A title, or abbreviated title,   246i7
                              that is repeated at the head or
                              foot of each page or leaf.

Title-series            5     Collective title applying to     440, 490, 830, 840,
                              a group of separate, but         subfield $t in the
                              related, items.                  following: 400,410,
                                                               411, 800, 810, 811

Title-spine             40    A title appearing on the         246i8
                              spine of an item.

Title-uniform           6     The particular title by which    130, 240, 730,
                              a work is to be identified       subfield $t in the
                              for cataloging purposes.         following: 700,710,
                                                               711


2.2 RELATION ATTRIBUTES (TYPE = 2)

Relation attributes describe the relationship of the access point (left
side of the relation) to the search term as qualified by the attributes
(right side of the relation), e.g., Date-publication <= 1975.

The Relation attributes are the following:

Relation               Value
-------------------    ------
Less than              1
Less than or equal     2
Equal                  3
Greater or equal       4
Greater than           5
Not equal              6
Phonetic               100
Stem                   101
Relevance              102
AlwaysMatches          103

Relation attribute Equal -- specifies an exact match (subject to
possible qualification by the truncation or structure attributes).

Relation attributes Less than, Less than or equal, Greater than or
equal, and Greater than -- meaningful only when both the term value as
qualified by the attributes and the access point can be realized as
elements of a set that has an inherent implied order.

Relation attributes Stem and Phonetic -- Stem refers to a lexical or
linguistic match; the term is to be compared with words in a record to
find those with the same stem. Phonetic refers to a match based on
aural similarity such as Soundex. In both cases, the match algorithms
are defined by the target.

Relation attribute Relevance -- used to select records that are
relevant to the term. When used, the Use attribute determines what
portion of a record is to be evaluated for relevance. The relevance
algorithm is defined by the target.

Relation attribute AlwaysMatches -- when the Relation attribute
AlwaysMatches occurs:

- The target ignores the supplied term.

- If the Use attribute is Any or Anywhere, then all records are to be
  selected.

- If a Use attribute other than Any or Anywhere is supplied, all
  records are selected for which the access point corresponding to the
  supplied Use attribute is meaningful. For example: if the Use
  attribute is Title, all records that have a title field are selected.


2.3 POSITION ATTRIBUTES (TYPE = 3)

The Position attribute specifies the location of the search term within
the field or subfield in which it appears.

For the purpose of describing the Position attributes, when the
expressions "field" or "subfield" do not have another understood
meaning (as prescribed, for example, by the schema in use), these two
expressions are used as follows:

- "subfield" has no meaning, and the Position attribute "first in any
  subfield" is not to be used.

- "field" refers to the portion of the record to which the access point
  refers.

The Position attributes are the following:

Position                Value  Definition
----------------------  -----  --------------------------------------
First in field          1      Search term must be the first data in
                               the field.

First in subfield       2      Search term may appear in any subfield
                               but must be the first data in the
                               subfield in which it appears.

Any position in field   3      Search term may appear any place in
                               the field.


2.4 STRUCTURE ATTRIBUTES (TYPE = 4)

The Structure attribute specifies the type of search term (e.g., a
single word, a phrase, several words to be treated as multiple single
terms, etc.).

The Structure attributes are the following:

Structure        Value  Definition
---------------  -----  -------------------------------------------
Phrase           1      A phrase consists of one or more groups of
                        characters separated by blanks (for example,
                        ASCII hex "20"). The value to be searched is
                        exactly as it appears in the search term with
                        respect to order and adjacency. Word(s) in the
                        phrase may be explicitly truncated. (See
                        "Truncation" -- section 2.5 below.) To
                        indicate that additional words may appear in
                        the access point, use the completeness
                        attribute.

Word             2      A word consists of a group of non-blank
                        characters. It specifies the exact text of the
                        value to be searched, unless the word is
                        explicitly truncated. (See "Truncation" --
                        section 2.5 below.) A word search term
                        contains no blanks.

Key              3      A key specifies a sequence of characters
                        extracted from those characters contained in
                        an indexed word but not necessarily
                        representing complete words. In the term, key
                        segments should be separated by a blank (ASCII
                        hex "20"). Each key segment should be the
                        length of a key segment in the origin system
                        or the length of the word, to a maximum of 6
                        characters. (For example, a name/title derived
                        key search term for "Copland, Aaron, 1900-
                        Rodeo" could be "coplan rodeo".) A segment may
                        be adjusted by the target to the length
                        required for the target's indexes.

                        For example, the following derived key
                        searches are in use at LC and at OCLC (in
                        Online System):

                        Site   Index       Letters taken  Source Data
                        -----  ----------  -------------  --------------
                        OCLC   TITLE       3,2,2,1        title keywords
                               NAME/TITLE  4,4            name, title
                               NAME        4,3,1          personal name
                               CNAME       4,3,1          corporate name
                        LC     PTK         3,1,1,1        title keywords
                               PATK        3,3            name, title
                               PPNK        5,1 or 6       personal name

Year             4      A year search term is numeric and contains
                        four digits.

Date             5      The day, month, year and time when a
(normalized)            transaction or event takes place. The date
                        search term structure is as defined for
                        Generalized Time in ASN.1 (ISO 8824) except
                        that the only mandatory portion of the string
                        is the four-digit representation of the year.

Word list        6      A word list consists of one or more words
                        separated by blanks (for example, ASCII hex
                        "20"). No order of the words is implied. The
                        attributes (other than structure) that are
                        associated with the search term apply to each
                        word in the word list. Any words in a word
                        list may be explicitly truncated. (See
                        "Truncation" -- section 2.5 below.) The
                        relationship between the words in a word list
                        is target-specific.
Date            100   The day, month, and year when a transaction
(un-normalized)       or event takes place. The un-normalized
                      search term is unstructured.
Name            101   A name search term that is structured in a
(normalized)          particular order (e.g., last_name,
                      first_name). The resulting term is subject
                      to special matching rules on the target
                      system that differ from those applied to
                      names structured as phrases or unstructured
                      names.
Name            102   A name search term that is unstructured
(un-normalized)       (e.g., first_name last_name), however, the
                      resulting term is subject to matching rules
                      on the target system that differ from those
                      applied to phrases or structured names
                      (e.g., the term "john smith" might be
                      searched by the target as "smith, j#").
Structure       103   The term has a structure that is either
                      implied by the Use attribute or defined by
                      the target.
Urx             104   The term is a document identifier, for
                      example, an identifier extracted from a
                      Z39.50 URL.
Free-form-text  105   The term is text, input by the end user.
                      May be used, for example, for relevance
                      feedback.
Document-text   106   The term is text, extracted from a
                      document. May be used, for example, for
                      relevance feedback.
Local-number    107   A number significant to the target.
String          108   The entire term is to be treated as a
                      string, rather than a sequence or set of
                      individual words.
Numeric string  109   The term is a character string that
                      represents a number.

2.5 TRUNCATION ATTRIBUTES (TYPE = 5)

The Truncation attribute specifies whether one or more characters may
be omitted in matching the search term in the target system at the
position specified by the Truncation attribute.
For example, a word in a search term may be 1) right truncated, in
which case the word is treated both as a complete word and as the
beginning of a longer word; 2) left truncated, in which case the word
is treated both as a complete word and as the ending of a longer
word; 3) left and right truncated, in which case the word is treated
as a complete word and as the beginning or ending of a longer word;
or 4) truncated at an embedded position, in which case the word is
treated as a complete word and as a longer word with additional
characters at the point where the truncation symbol, "#", appears in
the search term.

For Right truncation, Left truncation, and Left and right truncation,
the characters affected by the truncation are determined by the value
of the Structure attribute.

The Truncation attributes are the following:

                             Structure
Truncation             Value Attribute Definition
---------------------- ----- --------- ----------------------
Right truncation       1     Word or   Last word of term is
                             Phrase    right truncated.
                             String    Entire term is right
                                       truncated.
                             Word list Each word is right
                                       truncated.
Left truncation        2     Word or   First word of term is
                             Phrase    left truncated.
                             String    Entire term is left
                                       truncated.
                             Word list Each word is left
                                       truncated.
Left and right         3     Word or   First word of term is
truncation                   Phrase    left truncated and
                                       last word of term is
                                       right truncated.
                             String    Entire term is left
                                       and right truncated.
                             Word list Each word is left and
                                       right truncated.
Do not truncate        100             No truncation is to be
                                       applied.
Process # in           101             The search term
search term                            contains the symbol
                                       "#" (ASCII hex "23")
                                       to show where
                                       truncation will take
                                       place (e.g., "National
                                       H# Institute", or
                                       "d#on").
RegExpr-1              102             The term is in the
                                       form of a regular
                                       expression as
                                       prescribed by IEEE
                                       1003.2 Volume 1,
                                       Section 2.8 "Regular
                                       Expression Notation".
RegExpr-2              103             The term is in the
                                       form of a regular
                                       expression whose
                                       format is
                                       target-defined.
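
The word-level behavior described above can be illustrated with a
short sketch. It is not part of the bib-1 definition and is only one
possible reading: it covers a single word (Structure = Word) and
Truncation values 1, 2, 3, 100 and 101. The function names, the
case-insensitive comparison, and the treatment of omitted characters
as "any run of non-blank characters" (which also lets value 3 match
in the middle of a longer word) are assumptions made for
illustration; an actual target may match differently.

```python
import re

# Truncation attribute values from the table above (bib-1, type = 5).
RIGHT, LEFT, LEFT_AND_RIGHT = 1, 2, 3
DO_NOT_TRUNCATE, PROCESS_HASH = 100, 101


def word_pattern(word: str, truncation: int) -> re.Pattern:
    """Build a regex for one word under one Truncation attribute value."""
    wild = r"\S*"  # assumption: omitted characters are any non-blank run
    body = re.escape(word)
    if truncation == RIGHT:
        body = body + wild
    elif truncation == LEFT:
        body = wild + body
    elif truncation == LEFT_AND_RIGHT:
        body = wild + body + wild
    elif truncation == PROCESS_HASH:
        # Each "#" marks a point where extra characters may appear.
        body = wild.join(re.escape(part) for part in word.split("#"))
    elif truncation != DO_NOT_TRUNCATE:
        raise ValueError(f"unsupported truncation value: {truncation}")
    # Assumption: matching ignores case; the text above does not say so.
    return re.compile(body, re.IGNORECASE)


def word_matches(word: str, truncation: int, candidate: str) -> bool:
    """True if the indexed word `candidate` satisfies the truncated term."""
    return word_pattern(word, truncation).fullmatch(candidate) is not None
```

Under these assumptions, word_matches("d#on", PROCESS_HASH, "dragon")
is true because the "#" stands for the omitted characters "rag",
while word_matches("cat", DO_NOT_TRUNCATE, "cats") is false.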

2.6 COMPLETENESS ATTRIBUTES (TYPE = 6)

The Completeness attribute specifies that the contents of the search
term represent a complete or incomplete subfield or a complete field.
Completeness indicates whether additional words should appear in a
field or subfield with the search term. Note the difference from
Truncation (Section 2.5 above), which handles characters added to
words, phrases, or strings.

For the purpose of describing the Completeness attributes, when the
expressions "field" or "subfield" do not have another understood
meaning (as prescribed, for example, by the schema in use) these two
expressions are used as follows:

- "subfield" has no meaning, and the Completeness attribute
  incomplete subfield is used to mean "incomplete field".

- "field" refers to the portion of the record to which the access
  point refers.

The Completeness attributes are the following:

Completeness           Value Definition
---------------------- ----- ---------------------------------
Incomplete subfield    1     Words other than those in the
                             search term may appear in the
                             subfield or field in which the
                             term appears.
Complete subfield      2     No words other than those in the
                             search term should appear in the
                             entire subfield in which the term
                             appears, but additional words may
                             appear in other subfields in the
                             field.
Complete field         3     No words other than those in the
                             search term should appear in the
                             entire field in which the term
                             appears.

3. RULES AND GUIDELINES

3.1 DERIVED KEY SEARCHES (STRUCTURE ATTRIBUTE = KEY)

If supplied, the following attribute values would be used for a
derived key search.

- Position should always be 'first in field', even for author/title
  or name/title use attributes.

- Completeness should always be 'incomplete'.

- Truncation should always be 'right truncation'.

- Relation is always 'equal'.

3.2 NUMBER SEARCHES (e.g., LCCN, ISBN, ISSN, Control Numbers)

- Structure is 'word' or 'phrase' depending on whether the number
  contains internal blanks.

- Position and Completeness attributes are determined for number
  arguments as they are for textual arguments.

- All naturally occurring blanks, hyphens, slashes, etc., should be
  in the number search term if possible because different systems
  handle numbers in different ways in their indexes. The target
  system should apply normalization to the number according to its
  requirements, or return appropriate messages to allow the user to
  reformat the number.

3.3 MISCELLANEOUS

- Search arguments generally should not be normalized by the origin
  system. They should be normalized by the target system.

- Position attribute 'any position in field' is compatible only with
  the 'incomplete subfield' Completeness attribute.

4. NEW ATTRIBUTES

The Z39.50 Maintenance Agency manages the addition of attributes to
the bib-1 attribute set. Generally, suggestions for new attributes
are posted to the Z39.50 Implementors Group list and discussed at a
subsequent ZIG meeting before being included in the attribute set.
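
As an illustration of the guidelines in sections 3.1 and 3.3, the
following sketch checks an attribute list against them. It is an
informal aid, not part of the attribute set definition. The function
name and the representation of an attribute list as a dictionary
keyed by attribute type are assumptions made for the example; the
attribute values come from the tables in section 2, with 'equal'
being Relation value 3.

```python
# bib-1 attribute types (type numbers as listed in section 2).
RELATION, POSITION, STRUCTURE, TRUNCATION, COMPLETENESS = 2, 3, 4, 5, 6

# Section 3.1: values expected for a derived key search, if supplied.
DERIVED_KEY_VALUES = {
    RELATION: 3,      # equal
    POSITION: 1,      # first in field
    TRUNCATION: 1,    # right truncation
    COMPLETENESS: 1,  # incomplete subfield
}


def check_guidelines(attrs: dict) -> list:
    """Return a list of guideline problems; empty means none were found.

    `attrs` maps attribute type -> value (a hypothetical representation).
    Attribute types that are not supplied are not checked, because the
    guidelines only apply to values that are actually present.
    """
    problems = []
    if attrs.get(STRUCTURE) == 3:  # Structure = Key
        for attr_type, expected in DERIVED_KEY_VALUES.items():
            supplied = attrs.get(attr_type)
            if supplied is not None and supplied != expected:
                problems.append(
                    f"derived key search: type {attr_type} is {supplied}, "
                    f"expected {expected}"
                )
    # Section 3.3: 'any position in field' goes only with
    # 'incomplete subfield'.
    if attrs.get(POSITION) == 3 and attrs.get(COMPLETENESS) in (2, 3):
        problems.append(
            "Position 3 (any position in field) requires Completeness 1"
        )
    return problems
```

For example, check_guidelines({POSITION: 3, COMPLETENESS: 3}) reports
one problem, while an attribute list that supplies only Position 3
reports none, since this document does not define target behavior for
attribute types that are omitted.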