MINUTES OF THE 6TH MEETING OF THE Z39.50 IMPLEMENTORS' GROUP

Held at OCLC, Inc., Dublin, Ohio, 8-10 September 1991.

Attended by:

Acadia University: Glenn Davidson
Ameritech Information Systems: Martha Fivecoat
Apple Computer: Janet Vratny-Watts, Steve Wolfe
Carnegie-Mellon University: Joseph Rafail
Chemical Abstracts Service: Bill Farel, Les Wibberley, James Lahm
Dartmouth College: Eric Bivona
Dialog Information Systems: David Loy
Dow Jones: Greg Baber
DRA: Jim Michaels, Sean Donelan
Duke University: Juraj Horacek
ESL Inc: Denis Lynch
FCLA: Mark Hinnebusch (chair), Terry Sullivan
Gaylord Information Systems: Mark E. Duck
LC: Larry Dixson, Ray Denenberg
Mead Data Central: Peter Ryall
MIT: Bill Cattey, Tom Owens
NeXT Computer, Inc.: Jack Greenfield
NOTIS Systems: Sara Randall
OCLC: Ralph LeVan
Penn State University: Tun Chin, Ellen Newman
Princeton University: Tom True
RLG: Rich Fuchs, Jay Field, Lennie Stovel, Wayne Davidson
Software Kinetics (for National Library of Canada): Joe Zeeman (minutes)
Thinking Machines: Ottavia Bassetti, Brewster Kahle, Harry Morris
UC Berkeley: John Kunze, Cecilia Preston
UC-DLA: Margery Tibbetts, Mark Needleman, Clifford Lynch, Michael Thwaites
VTLS: V. Chachra, Cathy Winfrey

NOTE: Because a minute taker had not been appointed at that stage, introductions and status reports are not given here.

1. Draft 4 of Z39.50 version 2

Ray Denenberg distributed draft 4 of the standard (ZIG 91-17). He explained that this should be seen as "the published version minus 3": comments on this version would be incorporated into the version to be distributed for NISO ballot; after ballot comments had been resolved, a prepublication version would be issued prior to the published version.

2. Z39.50 Enhancements Tracking Mechanism

Ray Denenberg distributed a draft list of known proposed enhancements to the standard (ZIG 91-24). Each had been assigned an editor, and where possible an existing ZIG document enunciating the most recent proposal had been identified. Finally, an attempt had been made to assign each enhancement to a future version of the standard. It was hoped that version 3 would be a fully interoperable version of Z39.50 and SR; this included the enhancements that TC46 were working on. Other enhancements had been assigned to a putative version 4. It was hoped that TC46 would begin work on version 3 early in 1992.

There was some discussion of the relationship of versions of the standard to implementations. Ray acknowledged that expediting successive versions of Z39.50 through NISO might prove awkward, and that implementors might want to agree on temporarily out-of-standard enhancements.

3. Maintenance Agency Registration of Implementors

Ray Denenberg distributed the current version of the register of implementors (ZIG 91-21) and a set of instructions to implementors (ZIG 91-22), together with a registration form (ZIG 91-23). The formal register would be implemented in January 1992, and all implementors were required to register formally by then. The requirements were not onerous: although submission of a PICS is required, this need be little more than a blank PICS with a signature.

4. Relationship of ZIG to NIST OSI Workshop

Ray Denenberg introduced this item by explaining that he had been attending NIST OSIW meetings for a number of years and had recently been encouraged by Pat Harris of NISO and Brian ??? of the OSIW to aim for closer alignment of this group with the OSIW. He had given a talk on Z39.50 and ILL to the last meeting of the OSIW, which had been received with interest.
It would be possible for the ZIG to be adopted as an OSIW special interest group, although some NIST rules would have to be waived. He invited the group to consider the potential benefits, among which was increased exposure of Z39.50. Clifford Lynch was concerned that the group should not rush into this. Ray suggested that if the next meeting were to be held in Washington, Brian ??? could be invited to address the group. The NIST OSIW Procedures Manual was distributed for information the next day (ZIG 91-30).

5. Profile Documents

Ray Denenberg distributed two related profile documents: a Z39.50 PICS proforma (ZIG 91-18) and a ZIG PRL (ZIG 91-19). The PICS proforma was closely related to a draft prepared for ISO TC46, the SR PICS proforma. The PRL was a return to an earlier attempt to establish a ZIG profile. Ray led a walkthrough of the document and pointed out a number of areas which were still undecided.

6. Identifying Implementation Types

John Kunze introduced this item. He wanted the group to informally identify a number of variations of support for Z39.50 over TCP/IP, and distributed a discussion document (ZIG 91-25). Mark Hinnebusch was concerned that a formal set of transmission versions might be misconstrued by non-technical observers as an acknowledgement that the standard lacked stability. After some discussion it was generally agreed that this was in fact a profiling issue. Ray Denenberg recommended that the model of ISO TR 10000 be used for describing profiles of protocol stacks for supporting Z39.50.

7. Z39.50 over TCP/IP

Lennie Stovel introduced this item by inviting Clifford Lynch to discuss the message posted to the ZIG list. Clifford described how the ASN.1 encoding rules might be used to encode APDUs for transmission directly over TCP, without use of other presentation layer services or the session layer. It involved encoding the Z39.50 EXTERNAL types using the ASN.1 direct encoding method, whereby an object identifier was associated with each value of an EXTERNAL type. This would appear to remove the necessity for a prior agreement by implementations running over TCP.

There was considerable discussion of this item, which also involved discussion of profiling issues. There was general agreement that TCP port 210 would be used for the exchange of ASN.1-encoded APDUs directly over TCP. Implementations using the ISO upper layers would use the RFC 1006 method (using TCP port 105). A profile involving use of a variant of RFC 1085 (the lightweight presentation service) might also be required.
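For illustration only, a minimal sketch (in Python) of the direct-over-TCP arrangement agreed above: the origin opens an ordinary TCP connection to port 210 and writes a BER-encoded APDU, with no session or presentation layer underneath. The host name and APDU bytes below are invented placeholders, not real encodings.

    import socket

    # Placeholder standing in for a BER-encoded Z39.50 initRequest APDU; a real
    # implementation would produce these bytes with an ASN.1/BER encoder.
    HYPOTHETICAL_APDU = b"\x00\x00"

    def exchange_apdu(host: str, apdu: bytes, port: int = 210) -> bytes:
        """Write one BER-encoded APDU straight onto a TCP connection (port 210,
        per the agreement above) and read back whatever the target returns.
        No OSI session or presentation services are involved."""
        with socket.create_connection((host, port)) as sock:
            sock.sendall(apdu)
            return sock.recv(65536)

    # Example call; "target.example.org" is an invented host name.
    # response = exchange_apdu("target.example.org", HYPOTHETICAL_APDU)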
This item was revisited the next day. Wayne Davidson described how Z39.50 had been conceived, "from the ground up", as an OSI protocol, requiring Association Control and Presentation Layer services. He suggested that if the protocol were to be run over a different protocol stack, then this should be handled as a profiling issue: any such profile would describe the services required from the supporting stack. Ray Denenberg was concerned that use of a non-OSI stack should not lead to pressure to change the standard to accommodate the different requirements imposed by that environment. He described his recollection of the January 1991 meeting as being that all implementors had agreed to implement at least a minimal stack of ACSE, Presentation and Session Layers. Clifford Lynch had a different recollection: that the agreement was to send the Z39.50 APDUs directly over TCP in the short term. Ralph LeVan and Eric Bivona added their recollection that there had been an agreement to move to full OSI stacks in an evolutionary manner.

Wayne Davidson said that the lack of presentation services such as transfer syntax negotiation could be handled by implicit a priori agreements, as long as the range of databases being accessed was limited. When there was a lot of diversity in the range of databases available, the need for a negotiated environment would become important. Sara Randall was concerned that many implementors would find it difficult in the short term to implement OSI lower layers because they lacked the in-house expertise required. Mark Hinnebusch agreed that application developers should not be required to implement lower layer protocols; this should be a vendor responsibility. Brewster Kahle argued for the implementation of a stripped-down version of the supporting layers, which had also been discussed at the Berkeley meeting. All the commercial implementors wanted to get an implementation going as soon as possible; CAS, as well as Mead, would operate over a variety of protocols: SNA, APPC, X.25, TCP/IP, etc. Clifford Lynch suggested that one approach might be to issue an RFC specifying how Z39.50 over TCP/IP should be implemented. Ray Denenberg and Wayne Davidson were actioned to draw up a list of what features would be lost if the OSI upper layers were not used by implementations of Z39.50.

8. Profile Restrictions of Data Element Lengths

Terry Sullivan introduced this item. The IBM OSI/CS ASN.1 compiler required prior assignment of maximum lengths to all data types. Of particular concern was the requirement to assign a maximum length (and therefore a maximum size) to integers and to assign a maximum number of nodes to object identifiers. It was agreed that a maximum integer value of (2**32)-1 would be specified in the profile; this is the maximum unsigned integer value that can be accommodated in a length of 4 octets. VisibleStrings would be limited to a length of 1024 characters. There was considerable discussion of the need to specify maximum lengths for Term and/or Query. Bill Farel pointed out that terms describing chemical structures could be very long (over 8 KB). It was agreed that no maximum size would be established for Term or Query. Neither would maxima be established for sequences of database name, RPNQuery or RecordOrDiagnosticSurrogate.
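As a quick check of the arithmetic behind the agreed limit, the snippet below (illustrative only, in Python) confirms that (2**32)-1 is exactly the largest unsigned value representable in four octets.

    MAX_INTEGER = 2**32 - 1        # profile maximum agreed above: 4,294,967,295
    MAX_VISIBLE_STRING = 1024      # agreed VisibleString limit, in characters

    # Four octets hold unsigned values 0 .. 2**32 - 1, so MAX_INTEGER fits in
    # exactly four octets and any larger value would need a fifth.
    assert MAX_INTEGER == 0xFFFFFFFF
    assert (MAX_INTEGER.bit_length() + 7) // 8 == 4
    assert ((MAX_INTEGER + 1).bit_length() + 7) // 8 == 5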
9. Test Bed

Jack Greenfield introduced this item. He reiterated his earlier E-mail message:

---------------------------------------------------------------------------
Date: Tue, 3 Sep 1991 10:53:53 PDT
From: Jack_Greenfield%NEXT.COM@VM1.MCGILL.CA
To: Multiple recipients of list Z3950IW
Status: OR

Greetings, all. Here is another item for the agenda...

Sometime after this upcoming ZIG meeting, but before the next one, we will be in a position to start interoperability testing in earnest. At present, however, the only vehicle for interoperability testing afforded by the ZIG is the opportunity to make ad hoc arrangements with other implementors. While this is certainly valuable, I would like to propose a second vehicle: a formal interoperability test bed sponsored by the ZIG.

Several years ago, I had the opportunity to observe the test bed that supported the PDES protocols developed under the auspices of the CALS initiative. This project is a compelling precedent for interoperability test beds. In practice, it proved to be more than a vehicle for testing, and effectively promoted the rapid and widespread acceptance and deployment of the PDES protocols by the target community. I think this project could serve as a model for a Z39.50 test bed. To make this happen, we would need to:

(1) recruit a management agency; (For now, since we are relatively few in number, this would not be an overwhelmingly large job, so hopefully someone will be both willing and able to volunteer.)

(2) compile a directory describing the participants; (For each entry, the directory would describe, at a minimum, protocol stack(s), addressing, connectivity, and databases available.) and

(3) define a test suite to be used in demonstrating interoperability. (This could be a massive undertaking, if carried to an extreme, but I believe that something far less ambitious than an exhaustive test suite could prove quite effective.)

The management agency would be responsible for collecting and disseminating the results of tests, and for maintaining the directory of participants. Unfortunately, the management agency cannot be NeXT. Lest this be construed as laziness on the part of the author, let me point out that I am currently the only person at NeXT working on Z39.50 and WAIS, and that I am concurrently responsible for several other projects, including the native retrieval machinery (all the way from the query language and document model down to the transaction manager and access methods), a distributed platform retrieval architecture, and the integration of that architecture with the rest of the system software.

J.
---------------------------------------------------------------------------

There was discussion of the problem of implementing a test bed origin which could interrogate a range of targets. It was suggested that targets might implement a standard test database to enable a client to exercise the protocol in a standard way. In discussion it emerged that most implementors felt it would be too difficult to agree on a standard database. The consensus was that the early target implementations would in practice serve as reference implementations for clients under development.
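To illustrate point (2) of the message above, a hypothetical directory entry might record the minimum fields Jack listed (protocol stacks, addressing, connectivity, and databases); every value in the sketch below is invented.

    # A hypothetical test-bed directory entry covering the minimum fields named
    # in point (2) of the message above. All values are invented placeholders.
    hypothetical_directory_entry = {
        "implementor": "Example University Library",
        "protocol_stacks": ["Z39.50 APDUs directly over TCP (port 210)",
                            "OSI upper layers via the RFC 1006 method"],
        "addressing": "z3950.example.edu, TCP port 210",
        "connectivity": "Internet",
        "databases_available": ["online catalogue (test subset)"],
    }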
10. TC46 Work on Explain and Browse

Ray Denenberg distributed the draft addenda to ISO 10162 and 10163 to support the browse service (ZIG 91-27, 91-28). Discussion was postponed to allow attendees to read the documents. The discussion of Explain was merged with the next item.

11. Explain

Clifford Lynch explained that he was supposed to be drafting a description of the explain service for both the ZIG and TC46. The text which had been distributed via E-mail was an initial attempt to do this; it was distributed again at the meeting (ZIG 91-26).

There were first of all a number of omissions in the draft which would need to be corrected:

o  add a field to the database description for "sorting order";
o  list primary access points for a database;
o  provide an estimate of the number of distinct values of a given use attribute;
o  provide a list of unsearchable values for a use attribute (i.e. the "stop list");
o  provide a list of "most used" values for a use attribute, with approximate numbers of records retrieved for each.

Joseph Rafail said that CMU had an explain service for its CWIS, and that it had been found necessary to add most of the features described in Cliff's paper and in the above list. There was a discussion of the need to support eye-readable formats for much of the explain information. It was agreed that, initially at least, only ASCII formatted text would be supported, with the possibility of supporting PostScript-formatted text as well for users like NeXT which support Display PostScript.

It was agreed that the model for the explain function would be a separate database, searchable by means of the standard Search APDU. Record syntaxes and ElementSetNames would need to be defined and agreed. A number of further issues were raised:

o  A transfer syntax was required for icons. It was agreed that this would be a particular TIFF class.
o  Character sets would be transferred as an OctetString with a parameter identifying the character set.
o  Cost information units would be: connect time [N.B. needs to be more concrete, i.e. connect-second, connect-minute, etc.], per search, per record, or free.
o  Currency would be handled along the model of the ILL protocol: an amount encoded as a string plus a parameter indicating a currency.
o  Copyright notices would be provided by means of a flag indicating the nature of the notice plus a text field containing the notice itself.
o  All information would need an expiry date (a "best-before" date).
o  A machine-processable flag to indicate "access restrictions apply" would be needed. CMU's explain facility already provided this.
o  A new "explain" attribute set would need to be defined and registered.

It was suggested that if an address to which to send new documents were provided, then Z39.50-based servers could be used to replace bulletin boards.

During discussion of how implementors could indicate that they support the explain database, it was suggested that a new "options" bit could be assigned to indicate support for explain. Some implementors, NOTIS among them, would have preferred to implement the facility as a new application service, with separate APDUs. There was some discussion as to the best way to proceed, since the database approach involved providing access to a "database" that was actually a piece of software; this raised issues involving the mapping of attributes and records to the explain software. Most implementors, however, agreed to support the "well-known database" approach. One alternative suggested was that of using a different query type to access an explain database.

There was some discussion of using the explain facility to implement a "directory of servers", along the lines of that provided for WAIS. Sara Randall indicated additional attributes and parameters which would need to be part of explain: maximum number of result sets, a timeout value, and a free-text form of hours of availability.
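As a sketch only of the "well-known database" approach, a single explain record might carry the fields agreed in this item. No record syntax or attribute set had yet been defined, so every field name and value below is invented for illustration.

    # A hypothetical explain record for the "well-known database" approach.
    # Field names and values are invented; they simply gather the items agreed
    # above (sorting order, access points, cost units, currency, copyright,
    # expiry date, access restrictions, hours of availability, icon as TIFF).
    hypothetical_explain_record = {
        "database_name": "EXAMPLE-DB",
        "sorting_order": "main entry, ascending",
        "primary_access_points": ["title", "author", "subject"],
        "cost": {"unit": "connect-minute", "amount": "1.50", "currency": "USD"},
        "copyright": {"flag": "notice-applies", "text": "Example copyright notice."},
        "access_restrictions_apply": True,          # machine-processable flag
        "expiry_date": "1992-01-01",                # the "best-before" date
        "hours_of_availability": "weekdays 08:00-22:00 Eastern",   # free text
        "icon": {"transfer_syntax": "TIFF", "data": b""},
    }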
12. Review of Resource Report Format bib-1 replacement

Clifford Lynch introduced this item by referring to his earlier E-mail message with the proposed format (attached as Appendix A). The primary requirement still outstanding was a mechanism for indicating units for the various EstimateType values. In particular, a means for indicating currency would be required if this was to be made acceptable for ISO use. Ray Denenberg suggested that the most appropriate mechanism might be to add a parameter to the APDU [i.e. to the Estimate data type]; adding parameters to this APDU was still possible because it was local to Z39.50 and had no impact on SR. The ISO standard for encoding currencies was suggested as the most appropriate mechanism. There was some discussion of the need to support decimal values in currencies. Joe Zeeman was actioned to investigate the currency units defined in the ISO standard.

13. Resource Report Request

Discussion moved on to agenda item 26, as it was related to the previous item. Clifford Lynch explained that the requirement was to provide a mechanism whereby the origin could instruct the target to send a resource control report; the current protocol did not allow an origin to request one. Ralph LeVan proposed that there be two additional APDUs: a ResourceReportRequest issued by the origin, and a ResourceReportResponse issued by the target in response to a request. Other possibilities were to add a resourceReportRequest parameter to the Init service, and/or to the Search and Present services. The advantage of Ralph's proposal was that it could be incorporated in version 2 of Z39.50 since it didn't impact SR.

There was a lengthy discussion of how to handle the additional complications introduced to the state tables by an origin-requested resource report. There were particular concerns about handling problems caused by cross-over of APDUs and distinguishing solicited ResourceReports from unsolicited ResourceControlReports. There was a proposal to add a flag to the ResourceReportRequest specifying that suspension of the current operation is required. It was agreed to revisit this item the next day.

When discussion resumed, Rich Fuchs suggested that it might be possible to obviate the use of the ResourceReportRequest by allowing the origin to specify at initialization that the target should maintain one or more timers; whenever one of the timers expired, the target would issue a ResourceControlRequest. Ray Denenberg pointed out, however, that there were other circumstances in which an origin might require resource usage information or estimates. For instance, an origin might want a cost or time estimate before a search is performed. This would require the addition of a parameter to the Search [and Present] APDU.

14. Abort Current Operation

It was agreed that this should be called "cancel". It had been suggested that it might be possible to use the ResourceReportRequest with a suspend flag to achieve the functionality of a cancel service, but it was pointed out that the two did not necessarily achieve the same result. Some systems treated a cancel as a "roll-back" command, removing all knowledge that the request had ever been sent; a suspend followed by a "do not continue" might not achieve the same effect (e.g. cancelling resources consumed, deleting the partial result set, etc.).

Wayne Davidson was worried that the addition of these services at this stage might lead to substantial delays in approval of the standard by extending the balloting period and possibly requiring additional ballots; these services were not uncontroversial. Les Wibberley, however, pointed out that the availability of these services would significantly improve the commercial appeal of the standard. It was agreed that Ray Denenberg would draft the required changes to the standard to support both ResourceReportRequest and CancelRequest and send them for review by a limited number of reviewers.
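A sketch of how an origin might keep the proposed solicited reports separate from unsolicited resource control traffic, one of the state-table concerns raised above. Only the APDU names come from the discussion; the modelling and helper below are invented for illustration.

    from enum import Enum, auto

    class APDU(Enum):
        """APDU types relevant to the proposal above; only the names are taken
        from the minutes, the modelling is illustrative."""
        RESOURCE_REPORT_REQUEST = auto()    # origin -> target (proposed)
        RESOURCE_REPORT_RESPONSE = auto()   # target -> origin, solicited (proposed)
        RESOURCE_CONTROL_REQUEST = auto()   # target -> origin, unsolicited (existing)
        CANCEL_REQUEST = auto()             # "abort current operation", to be called cancel

    def classify_incoming(apdu: APDU, awaiting_report: bool) -> str:
        """Distinguish a solicited report (the answer to our ResourceReportRequest)
        from an unsolicited resource control message."""
        if apdu is APDU.RESOURCE_REPORT_RESPONSE and awaiting_report:
            return "solicited resource report"
        if apdu is APDU.RESOURCE_CONTROL_REQUEST:
            return "unsolicited resource control request"
        return "unexpected APDU in this state"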
15. Review of Diagnostic Message Set

Clifford Lynch introduced this item by describing two problems with the current set of diagnostic messages: one involved the completeness of the set; the other concerned the structure of the set. There were a substantial number of additional error conditions that needed to be supported; among these were "Attribute set not supported", "Transfer syntax not supported", and "Malformed query".

The structural problem lay in the requirement to return structured data as part of the diagnostic record. This was not currently possible because the record contained only an Integer and a VisibleString. It was pointed out that it would not be possible to change the structure of the DiagnosticRecord type because this would create an incompatibility with SR.

This led to a vigorous discussion of the necessity or otherwise of maintaining alignment between SR and Z39.50. Among the concerns raised were: the extent to which a desire for compatibility should constrain the implementation of enhancements to Z39.50; whether SR or Z39.50 should be seen as the "leading" standard; the extent to which a quest for stability of a clearly imperfect standard would impede commercial use of the standard; and to what extent the implementors could influence the standards development process, both nationally and internationally. Several pointed out that the group had already had a significant impact on both the national and international standards-making bodies. Jim Michaels pointed out that members of the group could ensure their influence by joining and participating in NISO.

The conclusion for the DiagnosticRecord was that structured data would have to be sent in the addinfo parameter, using out-of-protocol agreements to provide structure. This had already been proposed for the ElementSetNames parameter. Clifford Lynch would make suggestions for conventions to be used in the parameter, and would also provide additional values for the diagnostic set.

16. Proximity Operators

Mark Hinnebusch distributed an updated version of the proximity operator text (ZIG 91-29). He proposed this as a version 2 modification of the standard. Ray Denenberg was certain that this had not been considered for version 2. Others disagreed; this would have to be in version 2 since a number of implementors intended to support it. There was a discussion of the details of the text, and a number of revisions were proposed. Mark would issue a revision of the text (received the next day - ZIG 91-34).

Clifford Lynch raised the question of error messages relating to proximity operations. It was clear that many more diagnostic values would be required. Ralph LeVan offered to collate all target implementors' existing error messages into a single set of error conditions; these would be put into a reserved block of values beginning at integer 1000. All implementors were asked to send him the full text of all their error messages.
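Purely as an illustration of the reserved block (the actual values awaited Ralph LeVan's collation of implementors' error messages), diagnostic codes from 1000 upward might be laid out as in the sketch below; every code and message shown is invented.

    # Hypothetical layout of the reserved diagnostic block described above.
    # The block starts at 1000 per the minutes; the codes and texts here are
    # invented placeholders pending the collated list.
    RESERVED_BLOCK_START = 1000

    hypothetical_collated_diagnostics = {
        1000: "Proximity operator not supported",
        1001: "Proximity distance exceeds database limit",
        1002: "Proximity not supported between these attribute types",
    }

    def in_reserved_block(code: int) -> bool:
        """True if a diagnostic code falls within the block reserved for the
        collated implementor-defined conditions."""
        return code >= RESERVED_BLOCK_START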
17. Proximity vis-a-vis Result-Set Models

This discussion was a continuation of that from the previous meeting. It was generally acknowledged that support for proximity operations could not be fully reconciled with the database and result-set model described in the standard. Further work was needed to reconcile the two.

18. Code-language vs content-language USE attribute

Clifford Lynch introduced this item. A number of use attributes consist of a set of values, e.g. language, which imply a code. Joe Zeeman had proposed that it might be appropriate to separate the encoding of the value from its use, so that an origin could utilize a language use attribute without needing to have prior knowledge of the encoding scheme. There would thus be a USE attribute for, say, language of text, which might be associated in a query with a structure attribute of "encoded value". Ray Denenberg pointed out that this would introduce an incompatibility with SR. One additional problem was that it might be necessary to negotiate, or else bilaterally agree on, what encoding scheme was being used. After some discussion it was agreed that this would be treated as a profiling issue, and that the additional attributes would be added to the local, Z39.50 part of the attribute set. Joe Zeeman was actioned to list the coded use attributes and recommend agreed encoding schemes for each.

A related issue was how to transfer information about the language of the search term itself; a few retrieval systems currently supported multilingual retrieval. Associated with this was the problem of handling character sets in a query. Wayne Davidson described the problem of ASN.1's poor handling of character sets. The interim solution would have to be a profile agreement, possibly using escape sequences to change character sets within the octet string.

19. Persistent Result Sets

Peter Ryall distributed a paper describing the requirements for persistent result sets and proposing a solution (ZIG 91-31). In discussion it emerged that there were a number of apparent inconsistencies between this proposal and the underlying model of Z39.50. One was the proposal to add and replace records in a result set; this required a database update service that was currently explicitly excluded from the standard. One suggestion was to devise a mechanism whereby an origin could specify that a result set should be turned into a database. It was pointed out that this proposal raised major security issues and major modelling issues. The fact remained, however, that the commercial information vendors required such a service. Ralph LeVan indicated that OCLC had a working system and he would be glad to give details of his implementation.

20. Periodic Query Service

Peter Ryall introduced his paper describing protocol extensions required to support a periodic query service (ZIG 91-32). A continued discussion of the modelling problem ensued. There was some disagreement as to whether this was in fact fundamentally different from persistent result sets or not. It certainly required some of the same enhancements to the protocol, such as asynchronous operations, persistent associations and persistent result sets.

21. Financial/news attribute set

Peter Ryall introduced this item and the associated paper (ZIG 91-33). The proposed set of attributes led to considerable discussion of the relative merits of supporting a single attribute set or multiple attribute sets. It was clear that the bib-1 attribute set was rapidly becoming unwieldy. There were, however, a number of difficulties associated with supporting multiple attribute sets. Among these were: the fact that some databases would require support of two or more sets simultaneously; queries of multiple databases might require multiple attribute sets; and the problem of spreading "knowledge" of the attributes supported through the user population. Ralph LeVan suggested that it might be feasible to define a set of "core" attributes, which were common to most applications and would be required to be present in all attribute sets. Jack Greenfield suggested, as an alternative, that it might be desirable to support the use of multiple attribute sets in a query, e.g. a core bibliographic set and a sector-specific set. There was some discussion of the possible mechanisms for accomplishing this.
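As an illustrative sketch of Jack Greenfield's alternative, a query might tag each search attribute with the attribute set it is drawn from, so a target can see which sets it must support. The set names, attribute types and values below are invented; no such sets had been defined at the time of the meeting.

    # Hypothetical mix of a core bibliographic attribute set and a
    # sector-specific set within one query. All names and values are invented.
    hypothetical_query_attributes = [
        {"attribute_set": "core-bib (hypothetical)", "type": "use", "value": "title"},
        {"attribute_set": "financial/news (proposed in ZIG 91-33)", "type": "use",
         "value": "ticker symbol"},
    ]

    def sets_required(attributes):
        """Return the distinct attribute sets a target would need to support
        in order to evaluate the query."""
        return sorted({a["attribute_set"] for a in attributes})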