Notes of the fifth Z39.50 Implementor's Group Meeting, held at the Library of Congress, 20-21 May 1991. Prepared by Joe Zeeman, Software Kinetics Ltd.

Attendees:

Apple Computer: Eric Roth, Janet Vratny-Watts
Central Intelligence Agency: Mark Zimmerman
Chemical Abstracts International: Les Wibberley
CNRI: David Ely
Dartmouth College: Eric Bivona
Data Research Associates: James Michael, Sean Donelan
Duke University: Boris Vychodil, Juraj Horacek
Dynix, Inc.: Steve Jaynes
Florida Center for Library Automation: Mark Hinnebusch
Library of Congress: Ray Denenberg, Larry Dixson, Ralph Orlik, Kaushi Belani
Maxwell Online: Oren Sreebny
Mead Data Central: Peter Ryall
National Library of Medicine: Ed Sequira
NeXT Computers: Jack Greenfield
NOTIS Systems Inc.: Sara Randall
OCLC: Ralph LeVan
PLS Inc.: Larry Fitzpatrick
PSI: Wengyik Yeong
Research Libraries Group: Richard Fuchs, Lennie Stovel, Jay Field
Software Kinetics Ltd.: Joe Zeeman
Sun Microsystems: Andy Bensky
Thinking Machines, Inc.: Brewster Kahle
University of California, Berkeley: John Kunze, Cecilia Preston
University of California, Division of Library Automation: Clifford Lynch, Mark Needleman, Michael Thwaites, Margery Tibbetts
Virginia Tech (VPI&SU): Carol Terry
VTLS, Inc.: Cathy Winfrey

1. Status Reports

Participants introduced themselves and gave status reports on work under way at their organizations.

1. FCLA have received a Title II grant to implement Z39.50 over OSI. The IBM OSI/CS product is being used, and the application is being implemented in C.
2. UC-DLA is implementing Z39.50 over TCP/IP. An early application will provide Penn State with an origin to access the Melvyl target.
3. DRA are currently debugging their OSI protocol stack. They are using ISODE 6.8 and have found it to be a distinct improvement over version 6.0.
4. Software Kinetics reported that the National Library of Canada intends to begin work soon on a project to develop an SR kernel, which will be available for use in the public domain, much as the ISODE is a public domain OSI kernel. The project is intended to encourage Canadian users to adopt the protocol as soon as possible. The National Library also intends to initiate a Canadian implementors' group similar to the ZIG.
5. PSI has a prototype of version 2 working. No further development is taking place at present.
6. Mead Data Central are not yet in the implementation phase. They have commercial requirements to support multiple protocol stacks (TCP/IP, OSI and SNA) running in multiple environments.
7. Thinking Machines have reached a "good crystallization point", with public domain versions of their WAIS server available on 27 Internet hosts, and origin implementations available for NeXT, Macintosh, X Windows and GNU Emacs. There is considerable public interest, and articles on WAIS have recently been published in Byte, MacUser and Release 1.0.
8. NOTIS Systems currently have interface coding in hand and are about to start coding the Z39.50 protocol machine. They hope to speed up the timeline for their full Z39.50 implementation.
9. Dartmouth College hope to use Z39.50 to link their Campus-Wide Information System to resources outside the campus.
10. OCLC have a working implementation of the 1988 version of Z39.50. Further development awaits an actively interested partner.
11. UC Berkeley currently have most of the search and init facilities completed.
12. VTLS Inc. was present as an observer only.
13. The CIA was present as an observer only.
14. Maxwell Online was present as an observer only.
15. NeXT Inc. are working on a transducer-based information architecture, to which the Z39.50 protocol seems closely related. They are using the Thinking Machines WAIS implementation.
16. Chemical Abstracts International are planning an international STM ("Scientific, Technical, Medical") document retrieval project, for which SR is intended to be used.
17. Dynix was present as an observer only.
18. The Research Libraries Group have designed a technical architecture running under their Orville system and are upgrading the testbed originally used for testing the LSP protocols to support Z39.50. They are at present negotiating an agreement with OCLC to offer a bi-directional Z39.50 link.
19. The Library of Congress are replacing their LSP software with a fully OSI-conformant Z39.50 implementation, using the IBM OSI/CS product for the lower layer services. IBM have recently prioritized the identified bugs in OSI/CS as follows: support for recursive definitions in ASN.1 has a high priority; support for the "non-encoded form" of external data types has a high priority; presentation layer context switching has a low priority; and OSI/CS over TCP/IP will not be supported.
20. Duke University is developing an interface to OCLC's Newton database engine using Z39.50 over a TCP/IP connection. The user interface will be based on Microsoft Windows.
21. Virginia Tech was present at this meeting as an observer only.
22. Sun Microsystems have started implementing Z39.50 and are currently formalizing a specification based on the protocol.

2. Z39.50 Version 2, Draft 3

Ray Denenberg distributed the third draft of Version 2 of Z39.50 (ZIG 91/08) and walked through the significant changes. One change which led to discussion was the version numbering. In order to support interworking of Z39.50 and ISO SR implementations, this version was designated version 2 of Z39.50 (199x); there is additionally a notional version 1 (actually SR) which all implementations of version 2 are expected to support. Some suggested that greater consistency would be achieved by calling this version of Z39.50 version 1. Ray undertook to examine the implications of this and to clarify the text of the first paragraph of section 4.

3. Attribute Set bib-1

Larry Dixson distributed a draft of an "informational appendix" describing the nature and use of the bib-1 attribute set (ZIG 91/09). There was some discussion about the desirability of adding more attributes to this set, as well as adding more attribute sets. In the end it was decided to stabilize the attribute set for this version of the protocol. Detailed discussion of the document was postponed until Tuesday. Brewster Kahle suggested the addition of a "best section" use attribute for retrieval from full-text databases; on discussion it was determined that this was more properly an element set name.

4. Future work items for ISO TC 46

Sally McCallum reported on the recent TC46 editing meeting. The final texts for ISO 10162 and 10163 (SR) had been completed and all comments had been satisfied. The texts would go to France by the end of June for translation and should complete the standards process by late autumn. The meeting had also considered future work items, including the development of a test suite and expansions to the protocol. Proposed new work items included resource and access control, and request batching.
Liv Holm of Norway had been working on browse and had identified four distinct kinds: simple browsing of indexes, simple browsing of result sets, navigational browsing of indexes, and hierarchical browsing of databases. It was decided that only the first two would be included in the current work item.

5. Maintenance agency report

Ray Denenberg reported on the activities of the Z39.50 maintenance agency. Most work continued to centre on preparing version 2 of the standard. In addition, a technical report and an implementors' guide had been proposed. Ray would undertake the technical report, which would include a discussion of object identifiers, the application layer structure, etc.; he wanted help from actual implementors in preparing the implementors' guide. During discussion of the various sources of information about Z39.50, Brewster Kahle undertook to put the ZIG archives on Thinking Machines' WAIS server. Ray distributed a new version of the implementors list (ZIG 91/10) and indicated that it would be made publicly available toward the end of the year. In order to be kept on the list, implementors must send a brief description of their implementation to the maintenance agency. The maintenance agency is also charged with coordinating work on the development of testing procedures and test suites for Z39.50 [there was more discussion of this at the end of the meeting]. The maintenance agency was responsible for reflecting the various US positions on SR to ISO TC46 and had coordinated the work on preparing the US position paper on Canada's batch proposal. The current version of Z39.50 would not go to ballot until the late autumn, so there would be at least one more ZIG meeting before the text, attribute sets, etc. would have to be finalized.

6. Shortening top-level ASN.1 identifiers

John Kunze introduced this item and distributed a background paper covering this and the next six agenda items (ZIG 91/11). It was agreed that the ASN.1 module name would be shortened from "ANSIZ39-50-2" to "IR". A number of people felt that this should be a local matter, however, and it was pointed out that this change would still result in compiler-generated variable names longer than 30 characters.

7. Distribution of machine-readable ASN.1 specification

Although, again, this was felt to be properly a local matter, it was recognized that errors would be minimized if implementors with ASN.1 compilers could use the standard module as maintained by the maintenance agency. The difficulty of editing a single ASN.1 text to allow both reasonable printed output and machine processing was pointed out. However, Ray Denenberg undertook to maintain the ASN.1 specification with a line length of no more than 71 characters (required by those with IBM compilers). It was also pointed out that a machine-readable text of the standard would only be available until it became a full NISO-approved standard, since NISO is dependent on standards sales for its operating revenue.

8. Clarification of Association creation and termination

The conclusion of the discussion was that there was agreement on the need for a new "CLOSE" service for Z39.50, which would inform a target that the origin wished to cease any further exchange of Z39.50 APDUs, but that the association should remain in place to allow other ASEs to be invoked. Ray added that this had been raised by the Germans at the TC46 meeting and would be the subject of a new work item. All parties therefore recognized the requirement for such a new service, which would be in addition to the existing Abort and Release services. The wording of section 4.2.1.2 of the draft standard would be re-examined to see if the relationship between the service user, the ACSE and the Z39.50SE could be clarified.

9. Role of Reverse Polish Notation in the RPNQuery

The new text in section 3.2.2.1.1 of the standard clarified much of this. It was pointed out that the protocol neither requires a stack nor specifies an implementation of one; it describes an abstract stack to indicate the way a query is assumed to be processed at the target system.
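To illustrate the abstract model, the following sketch (in C; the ResultSet type and the do_search and combine routines are hypothetical names, not drawn from the standard) evaluates a postfix token sequence against an explicit stack:

    /* Illustrative stack evaluation of a postfix (RPN) query.  The
       ResultSet type and the do_search and combine routines are
       hypothetical; the standard mandates neither a stack nor this
       representation. */
    #include <stddef.h>

    typedef enum { OPERAND, AND, OR, AND_NOT } RpnTag;

    typedef struct {
        RpnTag      tag;
        const char *term;          /* attribute list omitted for brevity */
    } RpnToken;

    typedef struct ResultSet ResultSet;              /* target-side set */
    extern ResultSet *do_search(const char *term);   /* hypothetical    */
    extern ResultSet *combine(RpnTag op, ResultSet *a, ResultSet *b);

    /* Evaluate a postfix token sequence; a well-formed query leaves
       exactly one set on the stack (depth of 32 assumed here). */
    ResultSet *eval_rpn(const RpnToken *q, size_t n)
    {
        ResultSet *stack[32];
        size_t sp = 0, i;

        for (i = 0; i < n; i++) {
            if (q[i].tag == OPERAND) {
                stack[sp++] = do_search(q[i].term);
            } else {                   /* operator pops two, pushes one */
                ResultSet *b = stack[--sp];
                ResultSet *a = stack[--sp];
                stack[sp++] = combine(q[i].tag, a, b);
            }
        }
        return stack[0];
    }

The stack here is purely descriptive: a target that evaluates the query some other way, but produces the same result set, behaves identically as far as the protocol is concerned.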
10. Order of evaluation of non-commutative operators

This problem had been solved by the new text in the standard and the informative appendix on bib-1.

11. Wild card language specified via attribute type/value pair

John Kunze proposed the definition of new relation type attributes to support wild card matching of various kinds, including UNIX-style regular expressions. In discussion it was decided that simply defining additional attributes was not sufficient; additional error states would be required as well, together with an explicit statement of how this wildcarding would work. John agreed to submit a proposal to the list.

12. Effect of ASN.1 style on implementations

This led to a discussion of ASN.1 style. There was some agreement that the use of indirectly defined types has an advantage in terms of ease of maintenance of the protocol: when the definition of ReferenceId is changed, only a single definition need be changed, rather than every occurrence of the type. Some implementors felt, however, that many of these named types were used sufficiently infrequently that the indirection could safely be removed. It was recommended that implementors who felt strongly about this should use hand-optimization. A related problem is that there are a few instances in which named values have not been given to types (see, for instance, p. 27 of draft 3):

    SEQUENCE { [0] IMPLICIT DatabaseName OPTIONAL,

It was agreed that this was probably an error that should be looked at.

13. CNRI Knowbots project with NLM

[This item was brought forward for discussion in David Ely's presence.] David Ely described the relationship of this project to Z39.50. The Knowbots project is a joint project of the National Science Foundation and the National Library of Medicine, and is part of the NSF's Digital Library project. The ultimate goal is the creation of "knowbots", information retrieval agents that actively and autonomously seek information required by individuals or groups. The current intention is to make use of Z39.50 as far as possible, but it is not yet clear how well the protocol fits the knowbot model and canonical query language. Appendix A presents an extract from the Merit LinkLetter, vol. 4, no. 1 (March-April 1991), describing the Knowbots project.

14. New Structure Attribute for Personal Name

Michael Thwaites introduced this item by pointing out the difference between searching for a name as a phrase and as a structured term. Melvyl, for instance, allows a searcher to indicate the boundary of a surname by using a comma (Thomas, John). This form of name is subject to special matching rules and is treated very differently from a name entered as a phrase (John Thomas or Thomas John). There therefore appears to be a requirement to be able to indicate that a search term is structured as a name. As an alternative it was suggested that it might be preferable to define "first-name" and "last-name" search attributes, but this was objected to on the grounds that they would have to be duplicated for each existing name use attribute. In the end it was decided that there should be a new attribute of type "structure", called "formatted name", and that the intersystem format should be that specified for a name in AACR2. This has the advantage of being more general than a format for personal names alone: it allows dates to be included, and corporate names to be transferred as well.

15. New value for all attribute types called Not-applicable

Michael Thwaites introduced this discussion as well. UC-DLA had identified a requirement to be able to inform a target system that a particular attribute type was not applicable in a given query. In discussion it became clear that the real requirement was to be able to tell the target system not to apply its default for that attribute type. This applied particularly to the truncation attribute type, for which there was now a "do not truncate" value. It seemed meaningless to specify that a system should not use its default without specifying what it should use instead, and the conclusion was that this attribute value was not needed.

16. Tighten up position and completeness attribute descriptions

This was deferred until discussion of the attribute set.

17. List of extensions

Les Wibberley asked if there was a list anywhere of proposed extensions to the protocol. Ray Denenberg said that there was no such list, but that there should be and that maintaining it was a maintenance agency function. Individuals who had, or wished to request, extensions were asked to let Ray know. Clifford Lynch mentioned a number of new attributes required to support full-text retrieval, including use attributes of "byte", "line" and "document-id". This led to a lengthy discussion of document ids and types and their relation to the Thinking Machines WAIS implementation. One particular concern centred on whether presentation of a sub-document should be requested by means of a special search or by means of a special element set name. There was similar discussion as to whether a document type was properly a search attribute or a record syntax name. In the end it was agreed that Brewster Kahle would prepare a brief presentation on the requirements for supporting WAIS. One conclusion which emerged was that the elements of the WAIS document id were already present in the Z39.50 protocol, with the exception of a means of identifying the system which "owned" the number. It was agreed that this could most easily be accommodated through the addition of an attribute to indicate the number's owner; this was subsequently generalized to an "authority format indicator".

18. Connectionless Z39.50 service

Peter Ryall distributed a document outlining Mead Data Central's requirements for this item and the next (ZIG 91/12). Ray Denenberg pointed out that this, or some other form of asynchronous operation, was seen by TC46 as a general need. This item was closely connected with the following one.

19. Target driven periodic query invocation

This was required to support the many "clippings", current awareness and SDI (Selective Dissemination of Information) services operated by commercial vendors. There are a number of problems with such a service.
Among them are how to present the result, how to notify the user that new results are available, how to model the service (target driven or origin driven), and whether and how to use external delivery mechanisms. It was clear that there is a requirement to support such a service, but much further work will be required. Ray Denenberg proposed that the problem of persistent result sets be treated as a separate work item, since it was required by a number of new or extended services. There was much discussion as to whether the service should be modelled as target initiated, or whether the origin should be expected to poll the target on a periodic basis. No clear conclusion was reached, and it was evident that much more work is required on this proposal. It was, however, pointed out that an SDI-like mechanism might also meet the Canadian requirement for batch searching.

20. Canadian batch proposal to TC46 and US position

Ray Denenberg distributed the two papers which had been issued at TC46 (ZIG 91/13 and ZIG 91/14). Discussion of this item had been subsumed under the previous item.

21. Redirection of associations for bridging

Mark Hinnebusch introduced this item by saying that FCLA intended to support both OSI and TCP/IP protocol stacks, and suggested that institutions building applications over different stacks may want to act as bridges to enable otherwise incompatible systems to interoperate. The question was what mechanism would be most appropriate for such a bridging service. Several possibilities were described in discussion. One would be for an application to act as an APDU relay between incompatible systems; a difficulty with this approach is that it allows neither an end-to-end presentation context nor an end-to-end application context to be negotiated. Another approach would be to use the transport layer bridge described in RFC 1006. This approach, which provides OSI Transport class 0 over TCP, is expected to be widely implemented, and allows the upper layers to act as if they were in a pure OSI environment. However, many of the early Z39.50 implementations will send raw Z39.50 APDUs directly over TCP, without using the OSI upper layer services; such implementations will not be able to use the RFC 1006 approach. On the other hand, a simple application relay may be relatively easy to kludge together.
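For reference, the RFC 1006 mechanism amounts to carrying each transport PDU over TCP (conventionally on port 102) behind a 4-octet TPKT header: a version octet of 3, a reserved octet, and a 16-bit length that counts the header itself. A minimal sketch of the sending side, with error recovery and the Transport class 0 machinery elided:

    /* RFC 1006 framing: each TPDU travels over TCP inside a 4-octet
       TPKT header (version 3, a reserved octet, and a 16-bit length
       that includes the header itself).  Error recovery elided. */
    #include <stdint.h>
    #include <unistd.h>

    /* Send one TPKT-framed TPDU on a connected TCP socket (port 102). */
    int tpkt_send(int sock, const uint8_t *tpdu, uint16_t len)
    {
        uint8_t  hdr[4];
        uint16_t total = (uint16_t)(len + 4); /* length covers the header */

        hdr[0] = 3;                           /* TPKT version             */
        hdr[1] = 0;                           /* reserved                 */
        hdr[2] = (uint8_t)(total >> 8);       /* length, big-endian       */
        hdr[3] = (uint8_t)(total & 0xff);

        if (write(sock, hdr, 4) != 4)
            return -1;
        return write(sock, tpdu, len) == (ssize_t)len ? 0 : -1;
    }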
22. Proximity review

ZIG 91/06, defining operators for proximity searching, was revisited, and confusions and misunderstandings were ironed out. The definition of "distance" was revised to read "Distance is the difference between the ordinal position values of the two operands." The xor operator was deleted, since its effect can be achieved by a combination of other operators: (A and-not B) or (B and-not A). There was a lengthy discussion of whether the Z39.50 model allows information relating to the position of a search term to be part of the result set. It was suggested that there should be no difference between using a single complex query of the form (dogs and house) or (cats and house), and using multiple queries and operations on result sets:

1. dogs
2. cats
3. 1 or 2
4. house
5. 3 and 4

The intermediate results created during the processing of the single complex query are identical to the multiple result sets. OCLC's result sets can be used in proximity operations because each record id is stored together with a number indicating the ordinal position of the term in the record. This allows a result set to be used as an argument to a proximity operator; it is not, however, the same as keeping the original query with the result set. Ray Denenberg explained the position of the original drafters of the standard, which was that a result set could not include such information. The discussion went on to consider whether the current definition of a result set allowed multiple postings of a single record to be present. There was nothing in the wording of the standard which disallowed this, and indeed the original drafters had recognized that some systems may only be able to operate in such a way. It was pointed out that duplicate postings would be required for retrieval from full-text databases. One problem identified was that there was no model given of the use of a result set in a query, only of its use in presentation. Clifford Lynch pointed out that there were potentially serious problems in allowing duplicates to be either implicitly or explicitly present in a result set; for instance, it could lead to a situation where result sets that present identically would behave differently when used in a query. It was agreed that this problem could not be resolved during the course of the meeting, and that revisiting the result set model would be an agenda item for the next meeting. It was pointed out that the set of operators could be simplified by moving the and-not-prox operator into the sequence defining proximityOperator; that is, the sequence could contain a boolean value indicating a "not" operation. Mark undertook to revise the document and reissue it as ZIG 91/15.
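As a sketch of the OCLC-style result set entry described above (the Posting type and its field names are illustrative, not OCLC's actual format), each posting pairs a record id with the term's ordinal position, which is all a proximity operator needs:

    /* Illustrative result set entry carrying term position, so the
       set can serve as a proximity operand.  Names are hypothetical. */
    #include <stdlib.h>

    typedef struct {
        long record_id;    /* record in which the term occurred */
        long position;     /* ordinal position of the term      */
    } Posting;

    /* True if a and b fall in the same record and their distance,
       the difference of the ordinal position values, is within
       limit, per the revised definition above. */
    int within_proximity(const Posting *a, const Posting *b, long limit)
    {
        return a->record_id == b->record_id &&
               labs(a->position - b->position) <= limit;
    }

Note that such an entry records where a term occurred, but not which query produced it; this is precisely the distinction drawn in the discussion.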
23. WAIS requirements

Brewster Kahle made a presentation on the attribute and query requirements that would allow WAIS to conform to the standard, and distributed a paper describing the WAIS document identifier (ZIG 91/16). His proposal involved the use of query attributes to specify the portion of a document to be returned in response to a query. [Note: the WAIS implementation normally involves two independent queries: the first returns a list of document ids in response to a "normal" search, and the second uses one of those document ids to retrieve all or part of a single document. WAIS searches do not expect or use a result set maintained by the server, and do not use a separate presentation service.] In the proposed scheme, byte or line position would be searchable, and only those portions of documents meeting the criteria would be returned. A query to return a portion of a document might specify, for example, a document id and a byte range: "AFI=WAIS" AND "LocalSystemNo=DowJones-server.A-Database.DocumentNo.12345" AND "Bytes<2000" AND "Bytes>0". Many in the group were uncomfortable with the idea of using a special kind of search to return a subset of a document. It was suggested instead that the ElementSetNames parameter be used to specify the subdocument to be returned. Since this parameter is currently defined to be a VisibleString, it would be impossible to define a structure for its use in version 2 of Z39.50, and WAIS would need to define string values externally to the standard (e.g. "BYTES 0-2000"). The definition of a structured means of specifying segmentation of large records for presentation purposes could in the meantime form a work item for version 3. Another issue raised was that of document type. Brewster's original proposal included an additional search attribute, "document-type". In the course of a very lengthy discussion it was determined that, although document type could legitimately be used as a search attribute (e.g. "restrict the search to only those documents of type 'WordPerfect'"), its use by WAIS was actually to specify a syntax for presentation of the document. It was therefore felt that the PreferredRecordSyntax parameter was more appropriate for this information. A WAIS search also requires additional use attributes, notably an attribute for "items about" and an attribute for relevance feedback. The first was determined to be different from "subject" in that the target is free to apply various tests of relevance, to substitute synonyms, etc.; it was decided to call it "conceptual content". The second identifies a document relevant to the search and asks for other items like it; it is therefore a document id in form, and a subject or contextual attribute in meaning. Another issue raised by WAIS was the need for extensible APDUs, to allow experimental data to be transferred between implementors. It was suggested that there was already an agreement to include the UserInformation field in every APDU; upon examining the minutes it was determined that the issue had been raised but never resolved. It was now too late to include this in version 2 of the standard, but it would be proposed for inclusion in version 3. One problem with this kind of extensibility is the danger that it would lead to multiple, mutually unintelligible dialects of the protocol.
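Under the element-set-name approach, the string values would be a private convention between origin and target, since version 2 gives ElementSetNames no internal structure. A minimal sketch of target-side handling, assuming the illustrative "BYTES <start>-<end>" form mentioned above:

    /* Parse an externally agreed element set name of the illustrative
       form "BYTES <start>-<end>" into a byte range.  Returns 0 on
       success, -1 if the name does not follow the convention. */
    #include <stdio.h>

    int parse_byte_range(const char *esn, long *start, long *end)
    {
        return sscanf(esn, "BYTES %ld-%ld", start, end) == 2 ? 0 : -1;
    }

A present request carrying the element set name "BYTES 0-2000" would then return only the first 2000 bytes of the document, with no special search required.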
24. EXPLAIN service

Clifford Lynch presented the current status of work on the Explain service. Three models have been proposed: Liv Holm has prepared a first draft of Explain as a separate protocol service; Cliff earlier proposed the use of a special database (WAIS currently uses such a database); and Sun Microsystems has recently proposed handling Explain by means of special element set names, which would return information relating to the database being searched, attribute sets, etc. Each of these models has advantages and disadvantages. The separate service as currently defined does not scale very well, since each kind of explanation must be predefined. The special database requires the specification of additional record syntaxes as well as a new search attribute set; it also requires a target to keep multiple result sets, one for the actual search and one for the Explain search. The use of element set names does not handle more global requests for information well, such as provision of a list of all databases. The consensus of the discussion was that the special database approach offered the most flexible way of implementing an Explain service, though neither of the other models was precluded by using a database. There were also a number of problems in implementing a database. For instance, it would not be sufficient for a target simply to give a list of attributes supported for each database; what the origin needs to know is what combinations of attributes are supported for a database. Similarly, an Explain service based on a database could not easily give an end user guidance or assistance in the form of context-sensitive help; this was, however, seen to be a different problem from Explain, and subject to separate investigation. This item led to a renewed discussion as to where the responsibility for "fuzzying" a search lay. The consensus was that this was a responsibility of the origin system.

25. Z39.50 implementation recognition of SR Object Identifiers

Ray Denenberg described the Object Identifier tree structure defined by the ASN.1 standard, and how both SR and Z39.50 would fit into it. Most implementors indicated that they would ensure their implementations recognized both SR and Z39.50 object identifiers.

26. Local record syntaxes

The related question of how and where to register local record syntaxes was discussed at some length. While the OID tree descending from Z39.50 (ISO Member-body USA Z39.50 abstract-syntaxes) could easily be extended to allow locally specified record syntaxes (... local ), there was some question as to how many of these local syntaxes would and should be defined, whether the Z39.50 tree was the most appropriate place to register them, and whether the existence of such local syntaxes would compromise interoperability. An additional problem was that of the transfer syntaxes associated with these abstract syntaxes: where they should be registered, and how their use should be negotiated for any given document retrieval. After much discussion, it was decided to postpone any decision until more thought could be given to the matter, with discussion over the list, etc.

27. Comments on Draft 3 of Version 2

There was an inconsistency in the standard between the uses of "null result set" and "empty result set"; Ray would correct this. The wording of part of section 3.2.2.1.2 still suggested that a query using a result set as an argument would always be unsuccessful: "... prior to processing the query, the existing result set whose name is specified by the parameter Result-set-name will be deleted ...". Ray undertook to find a better wording. The meaning of Maximum-record-size was clarified: it specifies the maximum size of a data record; the message containing a maximum-size record will be larger, and systems must make allowance for this. The definitions in Appendix F still required units; Clifford Lynch undertook to send a proposal to the list.

28. Comments on bib-1 attribute set

Clifford Lynch raised a number of concerns, among them the structure types "year" and "date". The problem with year is that it does not allow for the transmission of BC dates, or of dates using any other calendar. The specification of date uses an ISO standard format; this, however, places all the processing burden on the origin, and it was thought desirable to allow the transmission of an unnormalized date as well, which the target could process. In the course of discussion it became apparent that there is a requirement to specify the encoding standard(s) used in search terms; the only encoding rule currently mentioned is the prohibition on the use of the ASCII space character in the year structure. Failure to specify the encoding standard would lead to interoperability problems if, for example, one system used ASCII encodings while another used EBCDIC. An additional use attribute for "body-of-text" was agreed upon.

29. APDU test suite

In discussion of this item a distinction was made between a test suite, an "exerciser" and a reference implementation. A full test suite would be expensive and complex to develop, and it was generally agreed that what was immediately required was a set of APDUs which could be used to exercise implementations as they were developed and debugged. Jim Michael reported that the question of test suites was currently under study by the Standards Development Committee of NISO. The committee would not be meeting again until after the next ZIG meeting, so comments and requirements could be directed through this group. It was agreed that Michael Thwaites of UC-DLA would act as the focal point for exercise APDUs.
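Since the APDUs on the wire are BER encodings of the ASN.1 definitions, a simple exerciser can be little more than a hand-rolled tag-length-value encoder plus a library of canned values. A sketch of such a helper follows (the function name is assumed; single-octet tags only, and values shorter than 65536 octets):

    /* Hand-rolled BER tag-length-value encoder for building exercise
       APDUs by hand.  Assumes a single-octet tag and a value shorter
       than 65536 octets; returns the number of octets written. */
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    size_t ber_tlv(uint8_t *buf, uint8_t tag, const uint8_t *val, size_t len)
    {
        size_t n = 0;

        buf[n++] = tag;
        if (len < 128) {                    /* short form length       */
            buf[n++] = (uint8_t)len;
        } else {                            /* long form, 2 length octets */
            buf[n++] = 0x82;
            buf[n++] = (uint8_t)(len >> 8);
            buf[n++] = (uint8_t)(len & 0xff);
        }
        memcpy(buf + n, val, len);
        return n + len;
    }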
30. MARBI proposal for MARC record for systems

Mark Hinnebusch raised this item by mentioning that MARBI had recently issued a discussion paper on a MARC format for describing information systems. The paper had been posted to the list.

31. Any other business

Andy Bensky of Sun gave a very brief (the time being 3:00 PM) description of the implementation being planned by Sun.

32. Next meeting

The next meeting was fixed for 9-10 September at OCLC in Dublin, Ohio. The possibility of beginning the meeting on Sunday the 8th and running for three days was suggested.

APPENDIX: KNOWBOTS

The following is an excerpt from the Merit LinkLetter, vol. 4, no. 1 (March-April 1991):

KNOWBOTS(TM) DELIVER THE GOODS

Have you ever wished for a little robot to sit at your workstation and perform the tedious chore of finding and retrieving information from databases distributed around the world? Well, hold onto your keyboards, because the Corporation for National Research Initiatives (CNRI) is working on a project which is the debut of exactly that kind of tool, one which will automate the searching of multiple disparate databases. CNRI has been working with the National Library of Medicine (NLM) and the National Science Foundation (NSF) on a utility for database searches in the Medline databases of the NLM. All the databases are now accessed via public networks, but two of them, ELHILL and TOXNET, will soon be accessible over the Internet, perhaps as soon as June of this year. The work currently being done at CNRI targets the electronic databases at the National Library of Medicine's Lister Hill Center, known as the MEDLARS system. The prototype NLM Multiple Database Access Project was demonstrated recently at the American College of Radiology conference at the Lister Hill Center. The Multiple Database Access Project is part of a larger CNRI project called Digital Library Systems and applies the Knowbot technology of the DLS to the Multiple Database Access effort.

Initial project goals nearly complete

At the outset of the project, a number of goals were defined, which included: 1) providing parallel access to NLM's multiple databases, 2) extending the NLM's form-based user interface for Macs and PCs (Grateful Med) to UNIX workstations, 3) supporting non-text information retrieval, and 4) supporting Internet access to the Medline databases. Three of the four are nearly complete. The fourth, supporting non-text information retrieval, is currently under investigation.

The heart of the project

At the heart of the project is the "Knowbot(TM)" (KNOWledge roBOT), an active, intelligent program which acts on behalf of the user to carry out a search and retrieval task. A Knowbot exchanges messages with other Knowbots and moves from one system to another to carry out the user's wishes. When the Knowbot sets out on an assignment, several processes occur:

- The user interface, called the "user agent", contains functions such as query forms and login menus for the various databases available. The user formulates a query on the user agent and presses the "send" button.
- The user agent places the query inside a Knowbot and encapsulates it with the appropriate "travel instructions" for traversing the Internet.
- The Knowbot is then transmitted across the Internet to the "database server", where it is received and verified. The database server contains software which expedites access to the databases.
- Next the database server runs a series of small programs to process the Knowbot: a) the Knowbot's generic syntax is translated into the appropriate syntax for the database being queried; b) the query is sent, which is equivalent to dialing in to the appropriate database; c) when a response is received, it is translated back into the generic syntax of the Knowbot, and the beginning and end of each record are marked.
- A Knowbot transports the retrieved records back to the user agent, where yet another small Knowbot reformats the response into a friendly syntax, which is then displayed to the user.

Figure 1 provides a graphic representation of this process.

Additional Features

An additional option to forms-based access is to open a "transparent" window to ELHILL, TOXNET, and to two Johns Hopkins Welch Library databases: On-line Mendelian Inheritance in Man (OMIM) and the Genome DataBase (GDB). This is, in effect, a telnet screen. The window offers direct access to the interactive interfaces of the standard ELHILL, TOXNET, OMIM and GDB systems. (See Figure 1.) It is possible to have multiple Knowbot queries running while simultaneously doing manual interactive searches in this transparent window. In addition, since the user agent stores an encrypted form of the login for each database, the user only needs to provide login information once for each database accessed.

Flexible design

The general design of the CNRI system is very flexible, with the user agent and database server separable across the Internet. In the present experimental implementation, the user agent typically runs on a SUN 4/110 workstation at CNRI, the database server on a SUN 3/160 at the National Library of Medicine, the two NLM database systems on Telenet (but ELHILL is soon to be accessible on the Internet), and the OMIM and GDB systems via the Internet. Demonstrations using Network Computing Devices X display stations as well as the SUN 4/110 workstations have been conducted for NLM and for NSF.

Looking to the future - short term

In the next few months Knowbots will be written to perform multiple searches from a single request. For example, the user will complete one Knowbot search form and the single Knowbot will locate and access multiple Medline databases until it finds the information requested. The current Knowbot-based system will be extended to support queries to databases other than ELHILL and TOXNET. The database server will be enhanced to support queries which are not database specific by making use of information about the contents of the various MEDLARS databases.

For the long term

Knowbots are general tools for implementing complex, distributed computations, processes and services. Researchers at CNRI and elsewhere are exploring applications of Knowbots as part of a more general examination of a national information infrastructure. Looking further into the future, two of many possibilities are:

- Resident Knowbot. A Knowbot is instructed to remain resident at a gateway and to query a given database at the time when new citations are posted. The Knowbot is programmed to search for topics of interest to the user; when appropriate citations are located, a message is sent to the user listing the citations and their locations.
- Image Processing.
If a user's personal workstation does not have enough power to process needed calculations quickly, a Knowbot is written to carry the data to a supercomputer, where the calculations are completed, and to return the results to the user.

Additional CNRI Projects

The Corporation for National Research Initiatives is involved in a number of networking research projects in the areas of High Speed Digital Networking ("Gigabits"), Digital Library Systems, Inter-Organizational Messaging, and Internet Research. For more information about Knowbots or other CNRI projects, contact:

Corporation for National Research Initiatives
1895 Preston White Drive, Suite 100
Reston, VA 22091
703/620-8990

- Susan Calcari, Merit/NSFNET