In support of the service API - primarily the ASN module, which provides the programmatic interface to the Z39.50 APDUs - YAZ contains a collection of tools that support the development of applications.
Since the type-1 (RPN) query structure has no direct, useful string
representation, every origin application needs to provide some form of
mapping from a local query notation or representation to a
Z_RPNQuery structure. Some programmers will prefer to
construct the query manually, perhaps using
odr_malloc()
to simplify memory management.
The YAZ distribution includes three separate, query-generating tools
that may be of use to you.
Since RPN or reverse polish notation is really just a fancy way of describing a suffix notation format (operator follows operands), it would seem that the confusion is total when we now introduce a prefix notation for RPN. The reason is one of simple laziness - it's somewhat simpler to interpret a prefix format, and this utility was designed for maximum simplicity, to provide a baseline representation for use in simple test applications and scripting environments (like Tcl). The demonstration client included with YAZ uses the PQF.
The PQF has been adopted by other parties developing Z39.50 software. It is often referred to as Prefix Query Notation - PQN.
The PQF is defined by the pquery module in the YAZ library. There are two sets of functions with similar behavior. The first set operates on a PQF parser handle; the second set does not. The first set is more flexible than the second. The second set is obsolete and is provided only to ensure backwards compatibility.
The functions in the first set all operate on a PQF parser handle:
#include <yaz/pquery.h>

YAZ_PQF_Parser yaz_pqf_create(void);
void yaz_pqf_destroy(YAZ_PQF_Parser p);
Z_RPNQuery *yaz_pqf_parse(YAZ_PQF_Parser p, ODR o, const char *qbuf);
Z_AttributesPlusTerm *yaz_pqf_scan(YAZ_PQF_Parser p, ODR o,
                                   Odr_oid **attributeSetId, const char *qbuf);
int yaz_pqf_error(YAZ_PQF_Parser p, const char **msg, size_t *off);
A PQF parser is created and destroyed by the functions
yaz_pqf_create
and
yaz_pqf_destroy
respectively.
Function yaz_pqf_parse
parses query given
by string qbuf
. If parsing was successful,
a Z39.50 RPN Query is returned which is created using ODR stream
o
. If parsing failed, a NULL pointer is
returned.
Function yaz_pqf_scan
takes a scan query in
qbuf
. If parsing was successful, the function
returns attributes plus term pointer and modifies
attributeSetId
to hold attribute set for the
scan request - both allocated using ODR stream o
.
If parsing failed, yaz_pqf_scan returns a NULL pointer.
Error information for bad queries can be obtained by a call to
yaz_pqf_error
which returns an error code and
modifies *msg
to point to an error description,
and modifies *off
to the offset within the last
query where parsing failed.
The second set of functions are declared as follows:
#include <yaz/pquery.h>

Z_RPNQuery *p_query_rpn(ODR o, oid_proto proto, const char *qbuf);
Z_AttributesPlusTerm *p_query_scan(ODR o, oid_proto proto,
                                   Odr_oid **attributeSetP, const char *qbuf);
int p_query_attset(const char *arg);
The function p_query_rpn()
takes as arguments an
ODR stream (see section The ODR Module)
to provide a memory source (the structure created is released on
the next call to odr_reset()
on the stream), a
protocol identifier (one of the constants PROTO_Z3950 and
PROTO_SR), and
finally a null-terminated string holding the query string.
If the parse went well, p_query_rpn()
returns a
pointer to a Z_RPNQuery
structure which can be
placed directly into a Z_SearchRequest
.
If parsing failed, due to syntax error, a NULL pointer is returned.
The p_query_attset
specifies which attribute set
to use if the query doesn't specify one by the
@attrset
operator.
The p_query_attset
returns 0 if the argument is a
valid attribute set specifier; otherwise the function returns -1.
The grammar of the PQF is as follows:
query ::= top-set query-struct.
top-set ::= [ '@attrset' string ]
query-struct ::= attr-spec | simple | complex | '@term' term-type query
attr-spec ::= '@attr' [ string ] string query-struct
complex ::= operator query-struct query-struct.
operator ::= '@and' | '@or' | '@not' | '@prox' proximity.
simple ::= result-set | term.
result-set ::= '@set' string.
term ::= string.
proximity ::= exclusion distance ordered relation which-code unit-code.
exclusion ::= '1' | '0' | 'void'.
distance ::= integer.
ordered ::= '1' | '0'.
relation ::= integer.
which-code ::= 'known' | 'private' | integer.
unit-code ::= integer.
term-type ::= 'general' | 'numeric' | 'string' | 'oid' | 'datetime' | 'null'.
You will note that the syntax above is a fairly faithful representation of RPN, except for the Attribute, which has been moved a step away from the term, allowing you to associate one or more attributes with an entire query structure. The parser will automatically apply the given attributes to each term as required.
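The remark that a prefix format is simple to interpret can be illustrated with a small recursive routine. The following is only a sketch and is not the code of the pquery module: the function names are hypothetical, the tokens are assumed to be already split, and only the operators @and, @or and @not are handled (no attributes, result sets, proximity or quoting).

```c
#include <string.h>

/* Append src to the NUL-terminated string in out (capacity max bytes).
   Returns 0 on success, -1 if the result would not fit. */
static int append(char *out, size_t max, const char *src)
{
    size_t used = strlen(out), need = strlen(src);

    if (used + need + 1 > max)
        return -1;
    memcpy(out + used, src, need + 1);
    return 0;
}

/* Render the sub-query that starts at tok[*pos] in infix form.
   An operator simply consumes the next two sub-queries. */
static int render(const char **tok, int n, int *pos, char *out, size_t max)
{
    const char *t;

    if (*pos >= n)
        return -1;                      /* an operand is missing */
    t = tok[(*pos)++];
    if (!strcmp(t, "@and") || !strcmp(t, "@or") || !strcmp(t, "@not")) {
        if (append(out, max, "(") || render(tok, n, pos, out, max)
            || append(out, max, " ") || append(out, max, t + 1)
            || append(out, max, " ") || render(tok, n, pos, out, max)
            || append(out, max, ")"))
            return -1;
        return 0;
    }
    return append(out, max, t);         /* anything else is a plain term */
}

/* Hypothetical helper: convert a tokenized prefix query to infix text.
   Returns 0 on success, -1 on a malformed query or a full buffer. */
int prefix_to_infix(const char **tok, int n, char *out, size_t max)
{
    int pos = 0;

    if (max == 0)
        return -1;
    out[0] = '\0';
    if (render(tok, n, &pos, out, max))
        return -1;
    return pos == n ? 0 : -1;           /* left-over tokens are an error */
}
```

Because every operator is seen before its operands, a single left-to-right pass with no lookahead or precedence table is enough.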
The @attr operator is followed by an attribute specification
(attr-spec
above). The specification consists
of an optional attribute set, an attribute type-value pair and
a sub-query. The attribute type-value pair is packed in one string:
an attribute type, an equals sign, and an attribute value, like this:
@attr 1=1003
.
The type is always an integer but the value may be either an
integer or a string (if it doesn't start with a digit character).
A string attribute-value is encoded as a Type-1 ``complex''
attribute with the list of values containing the single string
specified, and including no semantic indicators.
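The packed type=value form can be taken apart as sketched below. This is an illustration under stated assumptions, not the YAZ parser: the struct and function names are hypothetical, and the sketch rejects a value that starts with a digit but is not a pure integer, which may differ from what YAZ does.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical result type for one "type=value" pair. */
struct attr_spec {
    int type;               /* attribute type, always an integer */
    int is_numeric;         /* 1: integer value, 0: string value */
    long num_value;         /* valid when is_numeric is 1 */
    const char *str_value;  /* points into the input when is_numeric is 0 */
};

/* Parse a string such as "1=1003" or "1=/book/title".
   Returns 0 on success, -1 if the string is not of the form type=value. */
int parse_attr_spec(const char *s, struct attr_spec *out)
{
    char *end;

    if (!isdigit((unsigned char) *s))
        return -1;                      /* the type must be an integer */
    out->type = (int) strtol(s, &end, 10);
    if (*end != '=' || end[1] == '\0')
        return -1;                      /* "=" and a non-empty value required */
    s = end + 1;
    out->num_value = 0;
    out->str_value = NULL;
    if (isdigit((unsigned char) *s)) {
        out->num_value = strtol(s, &end, 10);
        if (*end != '\0')
            return -1;                  /* assumption: "12ab" is rejected */
        out->is_numeric = 1;
    } else {
        out->str_value = s;             /* value does not start with a digit */
        out->is_numeric = 0;
    }
    return 0;
}
```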
Version 3 of the Z39.50 specification defines various encodings of terms.
Use @term
type
string
,
where type is one of: general
,
numeric
or string
(for InternationalString).
If no term type has been given, the general
form
is used. This is the only encoding allowed in both versions 2 and 3
of the Z39.50 standard.
This is an advanced topic, describing how to construct queries that make very specific requirements on the relative location of their operands. You may wish to skip this section and go straight to the example PQF queries.
Most Z39.50 servers do not support proximity searching, or support only a small subset of the full functionality that can be expressed using the PQF proximity operator. Be aware that the ability to express a query in PQF is no guarantee that any given server will be able to execute it.
The proximity operator @prox
is a special
and more restrictive version of the conjunction operator
@and
. Its semantics are described in
section 3.7.2 (Proximity) of the Z39.50 standard itself, which
can be read on-line at
http://www.loc.gov/z3950/agency/markup/09.html#3.7.2
In PQF, the proximity operation is represented by a sequence of the form
@prox exclusion distance ordered relation which-code unit-code
in which the meanings of the parameters are as described in the standard, and they can take the following values:
exclusion. 0 = false (i.e. the proximity condition specified by the remaining parameters must be satisfied) or 1 = true (the proximity condition specified by the remaining parameters must not be satisfied).
distance. An integer specifying the difference between the locations of the operands: e.g. two adjacent words would have distance=1 since their locations differ by one unit.
ordered. 1 = ordered (the operands must occur in the order the query specifies them) or 0 = unordered (they may appear in either order).
relation. Recognised values are 1 (lessThan), 2 (lessThanOrEqual), 3 (equal), 4 (greaterThanOrEqual), 5 (greaterThan) and 6 (notEqual).
which-code. known or k (the unit-code parameter is taken from the well-known list of alternatives described below) or private or p (the unit-code parameter has semantics specific to an out-of-band agreement such as a profile).
unit-code. If the which-code parameter is known then the recognised values are 1 (character), 2 (word), 3 (sentence), 4 (paragraph), 5 (section), 6 (chapter), 7 (document), 8 (element), 9 (subelement), 10 (elementType) and 11 (byte). If which-code is private then the acceptable values are determined by the profile.
(The numeric values of the relation and well-known unit-code parameters are taken straight from the ASN.1 of the proximity structure in the standard.)
Example 7.2. PQF boolean operators
@or "dylan" "zimmerman"
@and @or dylan zimmerman when
@and when @or dylan zimmerman
Example 7.4. Attributes for terms
@attr 1=4 computer
@attr 1=4 @attr 4=1 "self portrait"
@attrset exp1 @attr 1=1 CategoryList
@attr gils 1=2008 Copenhagen
@attr 1=/book/title computer
Example 7.5. PQF Proximity queries
@prox 0 3 1 2 k 2 dylan zimmerman
Here the parameters 0, 3, 1, 2, k and 2 represent exclusion, distance, ordered, relation, which-code and unit-code, in that order. So:
exclusion = 0: the proximity condition must hold
distance = 3: the terms must be three units apart
ordered = 1: they must occur in the order they are specified
relation = 2: lessThanOrEqual (to the distance of 3 units)
which-code is ``known'', so the standard unit-codes are used
unit-code = 2: word.
So the whole proximity query means that the words
dylan
and zimmerman
must
both occur in the record, in that order, differing in position
by three or fewer words (i.e. with two or fewer words between
them.) The query would find ``Bob Dylan, aka. Robert
Zimmerman'', but not ``Bob Dylan, born as Robert Zimmerman''
since the distance in this case is four.
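The way the six parameters combine can be checked with a few lines of C. The sketch below only illustrates the semantics described above for two known operand positions; it is not how a server evaluates proximity, and the function name is hypothetical. The which-code and unit-code parameters are left out because they only select the unit in which positions are counted.

```c
#include <stdlib.h>

/* Relation codes as listed for the relation parameter. */
enum { REL_LT = 1, REL_LE, REL_EQ, REL_GE, REL_GT, REL_NE };

/* Decide whether operands at positions pos_a and pos_b (counted in the
   chosen unit, e.g. words) satisfy the proximity condition.
   Returns 1 or 0, or -1 for an unrecognised relation code. */
int prox_match(int exclusion, int distance, int ordered, int relation,
               int pos_a, int pos_b)
{
    int diff = abs(pos_b - pos_a);
    int ok;

    switch (relation) {
    case REL_LT: ok = diff <  distance; break;
    case REL_LE: ok = diff <= distance; break;
    case REL_EQ: ok = diff == distance; break;
    case REL_GE: ok = diff >= distance; break;
    case REL_GT: ok = diff >  distance; break;
    case REL_NE: ok = diff != distance; break;
    default:     return -1;
    }
    if (ordered && pos_b < pos_a)
        ok = 0;                 /* operands occur in the wrong order */
    return exclusion ? !ok : ok;
}
```

With the words numbered from 1, "Bob Dylan, aka. Robert Zimmerman" gives positions 2 and 5, so `prox_match(0, 3, 1, 2, 2, 5)` holds, while "Bob Dylan, born as Robert Zimmerman" gives positions 2 and 6 and does not.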
Example 7.7. PQF mixed queries
@or @and bob dylan @set Result-1
@attr 4=1 @and @attr 1=1 "bob dylan" @attr 1=4 "slow train coming"
@and @attr 2=4 @attr gils 1=2038 -114 @attr 2=2 @attr gils 1=2039 -109
The last of these examples is a spatial search: in the GILS attribute set, access point 2038 indicates West Bounding Coordinate and 2039 indicates East Bounding Coordinate, so the query is for areas extending from -114 degrees to no more than -109 degrees.
Not all users enjoy typing in prefix query structures and numerical attribute values, even in a minimalistic test client. In the library world, the more intuitive Common Command Language - CCL (ISO 8777) has enjoyed some popularity - especially before the widespread availability of graphical interfaces. It is still useful in applications where you for some reason or other need to provide a symbolic language for expressing boolean query structures.
The CCL parser obeys the following grammar for the FIND argument.
The syntax is annotated with comments on the lines prefixed by
--
.
CCL-Find ::= CCL-Find Op Elements
          | Elements.

Op ::= "and" | "or" | "not"

-- The above means that Elements are separated by boolean operators.

Elements ::= '(' CCL-Find ')'
          | Set
          | Terms
          | Qualifiers Relation Terms
          | Qualifiers Relation '(' CCL-Find ')'
          | Qualifiers '=' string '-' string

-- Elements is either a recursive definition, a result set reference, a
-- list of terms, qualifiers followed by terms, qualifiers followed
-- by a recursive definition or qualifiers in a range (lower - upper).

Set ::= 'set' = string

-- Reference to a result set

Terms ::= Terms Prox Term
       | Term

-- Proximity of terms.

Term ::= Term string
      | string

-- This basically means that a term may include a blank

Qualifiers ::= Qualifiers ',' string
            | string

-- Qualifiers is a list of strings separated by comma

Relation ::= '=' | '>=' | '<=' | '<>' | '>' | '<'

-- Relational operators. This really doesn't follow the ISO8777
-- standard.

Prox ::= '%' | '!'

-- Proximity operator
Example 7.8. CCL queries
The following queries are all valid:
dylan
"bob dylan"
dylan or zimmerman
set=1
(dylan and bob) or set=1
righttrunc?
"notrunc?"
singlechar#mask
Assuming that the qualifiers ti
,
au
and date
are defined we may use:
ti=self portrait
au=(bob dylan and slow train coming)
date>1980 and (ti=((self portrait)))
Qualifiers are used to direct the search to a particular searchable index, such as title (ti) and author indexes (au). The CCL standard itself doesn't specify a particular set of qualifiers, but it does suggest a few short-hand notations. You can customize the CCL parser to support a particular set of qualifiers to reflect the current target profile. Traditionally, a qualifier would map to a particular use-attribute within the BIB-1 attribute set. It is also possible to set other attributes, such as the structure attribute.
A CCL profile is a set of predefined CCL qualifiers that may be
read from a file or set in the CCL API.
The YAZ client reads its CCL qualifiers from a file named
default.bib
. There are four types of
lines in a CCL profile: qualifier specification,
qualifier alias, comments and directives.
A qualifier specification is of the form:
qualifier-name [attributeset,]type=val [attributeset,]type=val ...
where qualifier-name
is the name of the
qualifier to be used (eg. ti
),
type
is attribute type in the attribute
set (Bib-1 is used if no attribute set is given) and
val
is attribute value.
The type can be specified as an integer or as a single letter:
u for use, r for relation, p for position,
s for structure, t for truncation
or c for completeness.
The attributes for the special qualifier name term
are used when no CCL qualifier is given in a query.
Table 7.1. Common Bib-1 attributes
Type | Description |
---|---|
u= value | Use attribute (1). Common use attributes are 1 Personal-name, 4 Title, 7 ISBN, 8 ISSN, 30 Date, 62 Subject, 1003 Author, 1016 Any. Specify value as an integer. |
r= value | Relation attribute (2). Common values are 1 <, 2 <=, 3 =, 4 >=, 5 >, 6 <>, 100 phonetic, 101 stem, 102 relevance, 103 always matches. |
p= value | Position attribute (3). Values: 1 first in field, 2 first in any subfield, 3 any position in field. |
s= value | Structure attribute (4). Values: 1 phrase, 2 word, 3 key, 4 year, 5 date, 6 word list, 100 date (un), 101 name (norm), 102 name (un), 103 structure, 104 urx, 105 free-form-text, 106 document-text, 107 local-number, 108 string, 109 numeric string. |
t= value | Truncation attribute (5). Values: 1 right, 2 left, 3 left & right, 100 none, 101 process #, 102 regular-1, 103 regular-2, 104 CCL. |
c= value | Completeness attribute (6). Values: 1 incomplete subfield, 2 complete subfield, 3 complete field. |
Refer to Bib-1 Attribute Set(7) or the complete list of Bib-1 attributes.
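The single-letter type shorthands correspond to the Bib-1 attribute type numbers shown in parentheses in Table 7.1. A minimal sketch of that lookup follows; the function name is hypothetical and this is not the CCL profile reader of YAZ.

```c
#include <ctype.h>
#include <stdlib.h>

/* Map the type part of a qualifier specification ("u", "s", "4", ...)
   to a Bib-1 attribute type number. Returns -1 if it is not recognised. */
int bib1_type_number(const char *type)
{
    if (isdigit((unsigned char) type[0])) {
        char *end;
        long n = strtol(type, &end, 10);

        /* assumption: only small positive integers are meaningful here */
        return (*end == '\0' && n > 0 && n <= 32767) ? (int) n : -1;
    }
    if (type[0] == '\0' || type[1] != '\0')
        return -1;              /* shorthands are exactly one letter */
    switch (type[0]) {
    case 'u': return 1;         /* use */
    case 'r': return 2;         /* relation */
    case 'p': return 3;         /* position */
    case 's': return 4;         /* structure */
    case 't': return 5;         /* truncation */
    case 'c': return 6;         /* completeness */
    }
    return -1;
}
```

So a profile line such as `ti u=4 s=1` carries the same information as the PQF attributes 1=4 and 4=1.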
It is also possible to specify non-numeric attribute values, which are used in combination with certain types. The special combinations are:
Table 7.2. Special attribute combos
Name | Description |
---|---|
s=pw | The structure is set to either word or phrase depending on the number of tokens in a term (phrase-word). |
s=al | Each token in the term is ANDed. (and-list). This does not set the structure at all. |
s=ol | Each token in the term is ORed. (or-list). This does not set the structure at all. |
s=ag | Tokens that appear as phrases (with a blank in them) get structure phrase attached (4=1). Tokens that appear to be words get structure word attached (4=2). Phrases and words are ANDed. This is a variant of s=al and s=pw, with the main difference that words are not split (with operator AND) but instead kept in one RPN token. This facility appeared in YAZ 4.2.38. |
s=sl | Tokens are split into sub-phrases of all combinations - in order. This facility appeared in YAZ 5.14.0. |
r=o | Allows ranges and the operators greater-than, less-than, ... equals. This sets the Bib-1 relation attribute accordingly (relation ordered). A query construct is only treated as a range if a dash is used and it is surrounded by white space. So -1980 is treated as the term "-1980", not <= 1980. If - 1980 is used, however, it is treated as a range. |
r=r | Similar to r=o but assumes that terms are non-negative (not prefixed with -). Thus, a dash will always be treated as a range. The construct 1980-1990 is treated as a range with r=r but as a single term "1980-1990" with r=o. The special attribute r=r is available in YAZ 2.0.24 or later. |
r=omiteq | This will omit relation=equals (@attr 2=3) when r=o / r=r is used. This is useful for servers that somehow break when an explicit relation=equals is used. Omitting the relation is usually safe because "equals" is the default behavior. This tweak was added in YAZ version 5.1.2. |
t=l | Allows term to be left-truncated. If term is of the form ?x, the resulting Type-1 term is x and truncation is left. |
t=r | Allows term to be right-truncated. If term is of the form x?, the resulting Type-1 term is x and truncation is right. |
t=n | If term does not include ?, the truncation attribute is set to none (100). |
t=b | Allows term to be both left & right truncated. If term is of the form ?x?, the resulting term is x and truncation is set to both left & right. |
t=x | Allows masking anywhere in a term, thus fully supporting # (mask one character) and ? (zero or more of any). If masking is used, truncation is set to 102 (regexp-1 in term) and the term is converted accordingly to a regular expression. |
t=z | Allows masking anywhere in a term, thus fully supporting # (mask one character) and ? (zero or more of any). If masking is used, truncation is set to 104 (Z39.58 in term) and the term is converted accordingly to a Z39.58 masking term - actually the same truncation as CCL itself. |
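The effect of the t=l, t=r, t=b and t=n rows can be sketched as follows. This is an illustration only, assuming that all four are enabled for the qualifier and that ? is the truncation character; the function name is hypothetical and the code is not taken from the YAZ CCL parser.

```c
#include <string.h>

/* Strip a leading and/or trailing '?' from term into out (capacity max)
   and return the Bib-1 truncation value implied by the table above:
   1 right, 2 left, 3 left & right, 100 none.
   Returns 0 if '?' only occurs inside the term (a case for t=x or t=z
   masking, not handled here) and -1 if out is too small. */
int split_truncation(const char *term, char *out, size_t max)
{
    size_t len = strlen(term);
    size_t start = 0, end = len;
    int left, right;

    if (len + 1 > max)
        return -1;
    if (strchr(term, '?') == NULL) {
        memcpy(out, term, len + 1);
        return 100;                     /* t=n: no truncation character */
    }
    left = len > 1 && term[0] == '?';
    right = len > 1 && term[len - 1] == '?';
    if (left)
        start = 1;
    if (right)
        end = len - 1;
    if (end < start)
        end = start;
    memcpy(out, term + start, end - start);
    out[end - start] = '\0';
    if (left && right)
        return 3;                       /* t=b */
    if (left)
        return 2;                       /* t=l */
    if (right)
        return 1;                       /* t=r */
    return 0;
}
```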
Example 7.9. CCL profile
Consider the following definition:
ti       u=4 s=1
au       u=1 s=1
term     s=105
ranked   r=102
date     u=30 r=o
ti
and au
both set
structure attribute to phrase (s=1).
ti
sets the use-attribute to 4. au
sets the
use-attribute to 1.
When no qualifiers are used in the query the structure-attribute is
set to free-form-text (105) (rule for term
).
The date
sets the relation attribute to
the relation used in the CCL query and sets the use attribute
to 30 (Bib-1 Date).
You can combine attributes. To search for "ranked title" you can do
ti,ranked=knuth computer
which will set relation=ranked, use=title, structure=phrase.
Query
date > 1980
is a valid query. But
ti > 1980
is invalid.
A qualifier alias is of the form:
q
q1
q2
..
which declares q
to
be an alias for q1
,
q2
... such that the CCL
query q=x
is equivalent to
q1=x or q2=x or ...
.
Lines consisting only of white space and lines that begin with
character #
are treated as comments.
Directive specifications take the form
@directive value
Table 7.3. CCL directives
Name | Description | Default |
---|---|---|
truncation | Truncation character | ? |
mask | Masking character. Requires YAZ 4.2.58 or later | # |
field | Specifies how multiple fields are to be combined. There are two modes: or (multiple qualifier fields are ORed) and merge (attributes for the qualifier fields are merged and assigned to one term). | merge |
case | Specifies if CCL operators and qualifiers should be compared with case sensitivity or not. Specify 1 for case sensitive; 0 for case insensitive. | 1 |
and | Specifies token for CCL operator AND. | and |
or | Specifies token for CCL operator OR. | or |
not | Specifies token for CCL operator NOT. | not |
set | Specifies token for CCL operator SET. | set |
All public definitions can be found in the header file
ccl.h
. A profile identifier is of type
CCL_bibset
. A profile must be created with the call
to the function ccl_qual_mk
which returns a profile
handle of type CCL_bibset
.
To read a file containing qualifier definitions the function
ccl_qual_file
may be convenient. This function
takes an already opened FILE
handle pointer as
argument along with a CCL_bibset
handle.
To parse a simple string with a FIND query use the function
struct ccl_rpn_node *ccl_find_str(CCL_bibset bibset, const char *str, int *error, int *pos);
which takes the CCL profile (bibset
) and query
(str
) as input. Upon successful completion the RPN
tree is returned. If an error occurs, such as a syntax error, the integer
pointed to by error
holds the error code and
pos
holds the offset inside query string in which
the parsing failed.
An English representation of the error may be obtained by calling
the ccl_err_msg
function. The error codes are
listed in ccl.h
.
To convert the CCL RPN tree (type
struct ccl_rpn_node *
)
to the Z_RPNQuery of YAZ the function ccl_rpn_query
must be used. This function which is part of YAZ is implemented in
yaz-ccl.c
.
After calling this function the CCL RPN tree is probably no longer
needed. The ccl_rpn_delete
destroys the CCL RPN tree.
A CCL profile may be destroyed by calling the
ccl_qual_rm
function.
The token names for the CCL operators may be changed by setting the
globals (all type char *
)
ccl_token_and
, ccl_token_or
,
ccl_token_not
and ccl_token_set
.
An operator may have aliases, i.e. there may be more than one name for
the operator. To do this, separate each alias with a space character.
CQL - Common Query Language - was defined for the SRU protocol. In many ways CQL has a similar syntax to CCL. The objective of CQL is different. Where CCL aims to be an end-user language, CQL is the protocol query language for SRU.
If you are new to CQL, read the Gentle Introduction.
The CQL parser in YAZ provides the following:
It parses and validates a CQL query.
It generates a C structure that allows you to convert a CQL query to some other query language, such as SQL.
The parser converts a valid CQL query to PQF, thus providing a way to use CQL for both SRU servers and Z39.50 targets at the same time.
The parser converts CQL to XCQL. XCQL is an XML representation of CQL. XCQL is part of the SRU specification. However, since SRU supports CQL only, we don't expect XCQL to be widely used. Furthermore, CQL has the advantage over XCQL that it is easy to read.
A CQL parser is represented by the CQL_parser
handle. Its contents should be considered YAZ internal (private).
#include <yaz/cql.h>

typedef struct cql_parser *CQL_parser;

CQL_parser cql_parser_create(void);
void cql_parser_destroy(CQL_parser cp);
A parser is created by cql_parser_create
and
is destroyed by cql_parser_destroy
.
To parse a CQL query string, the following function is provided:
int cql_parser_string(CQL_parser cp, const char *str);
A CQL query is parsed by the cql_parser_string
which takes a query str
.
If the query was valid (no syntax errors), then zero is returned;
otherwise -1 is returned to indicate a syntax error.
int cql_parser_stream(CQL_parser cp,
                      int (*getbyte)(void *client_data),
                      void (*ungetbyte)(int b, void *client_data),
                      void *client_data);

int cql_parser_stdio(CQL_parser cp, FILE *f);
The functions cql_parser_stream
and
cql_parser_stdio
parse a CQL query
- just like cql_parser_string
.
The only difference is that the CQL query can be
fed to the parser in different ways.
The cql_parser_stream
uses a generic
byte stream as input. The cql_parser_stdio
uses a FILE
handle which is opened for reading.
If the query string is valid, the CQL parser generates a tree representing the structure of the CQL query.
struct cql_node *cql_parser_result(CQL_parser cp);
cql_parser_result
returns
a pointer to the root node of the resulting tree.
Each node in a CQL tree is represented by a
struct cql_node
.
It is defined as follows:
#define CQL_NODE_ST 1
#define CQL_NODE_BOOL 2
#define CQL_NODE_SORT 3

struct cql_node {
    int which;
    union {
        struct {
            char *index;
            char *index_uri;
            char *term;
            char *relation;
            char *relation_uri;
            struct cql_node *modifiers;
        } st;
        struct {
            char *value;
            struct cql_node *left;
            struct cql_node *right;
            struct cql_node *modifiers;
        } boolean;
        struct {
            char *index;
            struct cql_node *next;
            struct cql_node *modifiers;
            struct cql_node *search;
        } sort;
    } u;
};
There are three node types: search term (ST), boolean (BOOL) and sortby (SORT). A modifier is treated as a search term too.
The search term node has six members:
index
: index for search term.
If an index is unspecified for a search term,
index
will be NULL.
index_uri
: index URI for search term
or NULL if none could be resolved for the index.
term
: the search term itself.
relation
: relation for search term.
relation_uri
: relation URI for search term.
modifiers
: relation modifiers for search
term. The modifiers
list is itself a list of cql_nodes,
each of type ST.
The boolean node represents and, or, not and proximity.
left
and right
: left
- and right operand respectively.
modifiers
: proximity arguments.
The sort node represents the SORTBY clause.
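A tree of these nodes is normally processed with a recursive walk. The sketch below counts the search terms in a tree. So that it compiles on its own, it repeats the structure definition shown above; real code would include yaz/cql.h instead. The function name is hypothetical, and following the search member of a sort node to reach the query being sorted is an assumption about that member's meaning.

```c
#include <stddef.h>

#define CQL_NODE_ST 1
#define CQL_NODE_BOOL 2
#define CQL_NODE_SORT 3

/* Copy of the definition above, repeated only to keep the sketch
   self-contained. */
struct cql_node {
    int which;
    union {
        struct {
            char *index, *index_uri, *term, *relation, *relation_uri;
            struct cql_node *modifiers;
        } st;
        struct {
            char *value;
            struct cql_node *left, *right, *modifiers;
        } boolean;
        struct {
            char *index;
            struct cql_node *next, *modifiers, *search;
        } sort;
    } u;
};

/* Count the search-term nodes reachable from n. Modifier lists are not
   counted, although they are also made of ST nodes. */
int count_search_terms(const struct cql_node *n)
{
    if (n == NULL)
        return 0;
    switch (n->which) {
    case CQL_NODE_ST:
        return 1;
    case CQL_NODE_BOOL:
        return count_search_terms(n->u.boolean.left)
             + count_search_terms(n->u.boolean.right);
    case CQL_NODE_SORT:
        /* assumption: search points at the query the sort applies to */
        return count_search_terms(n->u.sort.search);
    }
    return 0;
}
```

The same switch on which is the natural skeleton for a converter from CQL to another query language.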
Conversion to PQF (and Z39.50 RPN) is complicated by the fact that the resulting RPN depends on the Z39.50 target capabilities (combinations of supported attributes). In addition, CQL and SRU operate on index prefixes (URIs or strings), whereas RPN uses Object Identifiers for attribute sets.
The CQL library of YAZ defines a cql_transform_t
type. It represents a particular mapping between CQL and RPN.
This handle is created and destroyed by the functions:
cql_transform_t cql_transform_open_FILE(FILE *f);
cql_transform_t cql_transform_open_fname(const char *fname);
void cql_transform_close(cql_transform_t ct);
The first two functions create a transformation handle from either an already open FILE or from a filename respectively.
The handle is destroyed by cql_transform_close
in which case no further reference of the handle is allowed.
When a cql_transform_t
handle has been created
you can convert to RPN.
int cql_transform_buf(cql_transform_t ct, struct cql_node *cn, char *out, int max);
This function converts the CQL tree cn
using handle ct
.
For the resulting PQF, you supply a buffer out
which must be able to hold at least max
characters.
If conversion failed, cql_transform_buf
returns a non-zero SRU error code; otherwise zero is returned
(conversion successful). The meanings of the numeric error
codes are listed in the SRU specification.
If conversion fails, more information can be obtained by calling
int cql_transform_error(cql_transform_t ct, char **addinfop);
This function returns the most recently returned numeric
error-code and sets the string-pointer at
*addinfop
to point to a string containing
additional information about the error that occurred: for
example, if the error code is 15 (``Illegal or unsupported context
set''), the additional information is the name of the requested
context set that was not recognised.
The SRU error-codes may be translated into brief human-readable error messages using
const char *cql_strerror(int code);
If you wish to be able to produce a PQF result in a different way, there are two alternatives.
void cql_transform_pr(cql_transform_t ct, struct cql_node *cn,
                      void (*pr)(const char *buf, void *client_data),
                      void *client_data);

int cql_transform_FILE(cql_transform_t ct, struct cql_node *cn, FILE *f);
The former function produces output to a user-defined
output stream. The latter writes the result to an already
open FILE
.
The file supplied to functions
cql_transform_open_FILE
,
cql_transform_open_fname
follows
a structure found in many Unix utilities.
It consists of mapping specifications - one per line.
Lines starting with #
are ignored (comments).
Each line is of the form
CQL pattern = RPN equivalent
An RPN pattern is a simple attribute list. Each attribute pair takes the form:
[set] type=value
The attribute set
is optional.
The type
is the attribute type,
value
the attribute value.
The character *
(asterisk) has special meaning
when used in the RPN pattern.
Each occurrence of *
is substituted with the
CQL matching name (index, relation, qualifier etc).
This facility can be used to copy a CQL name verbatim to the RPN result.
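Because the first = on a line separates the CQL pattern from the RPN equivalent, a line can be split as sketched below. The function names are hypothetical and the code is not the YAZ reader; in particular, skipping blank lines and allowing white space before # are assumptions of this sketch.

```c
#include <ctype.h>
#include <string.h>

/* Copy [begin, end) into out (capacity max) without surrounding blanks.
   Returns 0 on success, -1 if it does not fit. */
static int copy_trimmed(const char *begin, const char *end,
                        char *out, size_t max)
{
    while (begin < end && isspace((unsigned char) *begin))
        begin++;
    while (end > begin && isspace((unsigned char) end[-1]))
        end--;
    if ((size_t) (end - begin) + 1 > max)
        return -1;
    memcpy(out, begin, (size_t) (end - begin));
    out[end - begin] = '\0';
    return 0;
}

/* Split one line of a mapping file at its first '='.
   Returns 1 for a mapping, 0 for a comment or blank line, -1 on error. */
int split_mapping_line(const char *line, char *pattern, size_t pmax,
                       char *rpn, size_t rmax)
{
    const char *p = line, *eq;

    while (isspace((unsigned char) *p))
        p++;
    if (*p == '\0' || *p == '#')
        return 0;
    eq = strchr(p, '=');
    if (eq == NULL)
        return -1;
    if (copy_trimmed(p, eq, pattern, pmax)
        || copy_trimmed(eq + 1, eq + 1 + strlen(eq + 1), rpn, rmax))
        return -1;
    if (pattern[0] == '\0' || rpn[0] == '\0')
        return -1;
    return 1;
}
```

Splitting at the first = also shows why a CQL relation that itself contains = cannot appear literally on the left-hand side.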
The following CQL patterns are recognized:
index.set.name
This pattern is invoked when a CQL index, such as
dc.title is converted. set
and name
are the context set and index
name respectively.
Typically, the RPN specifies an equivalent use attribute.
For terms not bound by an index the pattern
index.cql.serverChoice
is used.
Here, the prefix cql
is defined as
http://www.loc.gov/zing/cql/cql-indexes/v1.0/
.
If this pattern is not defined, the mapping will fail.
The pattern index.set.*
is used when no other index pattern is matched.
qualifier.set.name (DEPRECATED)
For backwards compatibility, this is recognised as a synonym of
index.set.name
relation.relation
This pattern specifies how a CQL relation is mapped to RPN.
relation
is the name of the relation
operator. Since =
is used as
separator between CQL pattern and RPN, CQL relations
including =
cannot be
used directly. To avoid a conflict, the names
ge
,
eq
,
le
,
must be used for CQL operators, greater-than-or-equal,
equal, less-than-or-equal respectively.
The RPN pattern is supposed to include a relation attribute.
For terms not bound by a relation, the pattern
relation.scr
is used. If the pattern
is not defined, the mapping will fail.
The special pattern, relation.*
is used
when no other relation pattern is matched.
relationModifier.mod
This pattern specifies how a CQL relation modifier is mapped to RPN. The RPN pattern is usually a relation attribute.
structure.type
This pattern specifies how a CQL structure is mapped to RPN.
Note that this CQL pattern is somewhat similar to
CQL pattern relation
.
The type
is a CQL relation.
The pattern, structure.*
is used
when no other structure pattern is matched.
Usually, the RPN equivalent specifies a structure attribute.
position.type
This pattern specifies how the anchor (position) of
CQL is mapped to RPN.
The type
is one
of first
, any
,
last
, firstAndLast
.
The pattern, position.*
is used
when no other position pattern is matched.
set.prefix
This specification defines a CQL context set for a given prefix. The value on the right hand side is the URI for the set - not RPN. All prefixes used in index patterns must be defined this way.
set
This specification defines a default CQL context set for index names. The value on the right hand side is the URI for the set.
Example 7.10. CQL to RPN mapping file
This simple file defines two context sets, three indexes and three relations, a position pattern and a default structure.
set.cql = http://www.loc.gov/zing/cql/context-sets/cql/v1.1/
set.dc = http://www.loc.gov/zing/cql/dc-indexes/v1.0/

index.cql.serverChoice = 1=1016
index.dc.title = 1=4
index.dc.subject = 1=21

relation.< = 2=1
relation.eq = 2=3
relation.scr = 2=3

position.any = 3=3 6=1

structure.* = 4=1
With the mappings above, the CQL query
computer
is converted to the PQF:
@attr 1=1016 @attr 2=3 @attr 4=1 @attr 3=3 @attr 6=1 "computer"
by rules index.cql.serverChoice
,
relation.scr
, structure.*
,
position.any
.
CQL query
computer^
is rejected, since position.right
is
undefined.
CQL query
>my = "http://www.loc.gov/zing/cql/dc-indexes/v1.0/" my.title = x
is converted to
@attr 1=4 @attr 2=3 @attr 4=1 @attr 3=3 @attr 6=1 "x"
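How the matched rules turn into the PQF strings above can be sketched in a few lines: each attribute pair on a rule's right-hand side becomes one @attr, and the quoted term follows. This is a hedged illustration rather than the code of cql_transform_buf; the function name is hypothetical, the caller decides the order of the rules, and quotes inside the term are not escaped.

```c
#include <stdio.h>
#include <string.h>

/* Build a PQF string from the RPN equivalents of the matched rules
   (each may hold several blank-separated pairs, e.g. "3=3 6=1").
   Returns 0 on success, -1 if out (capacity max) is too small. */
int build_pqf(const char *const *rules, int n, const char *term,
              char *out, size_t max)
{
    size_t used = 0;
    int i, w;

    if (max == 0)
        return -1;
    out[0] = '\0';
    for (i = 0; i < n; i++) {
        const char *p = rules[i];

        for (;;) {
            size_t len;

            p += strspn(p, " \t");          /* skip blanks between pairs */
            len = strcspn(p, " \t");        /* length of one type=value */
            if (len == 0)
                break;
            w = snprintf(out + used, max - used, "@attr %.*s ",
                         (int) len, p);
            if (w < 0 || (size_t) w >= max - used)
                return -1;
            used += (size_t) w;
            p += len;
        }
    }
    w = snprintf(out + used, max - used, "\"%s\"", term);
    if (w < 0 || (size_t) w >= max - used)
        return -1;
    return 0;
}
```

Called with the right-hand sides of index.cql.serverChoice, relation.scr, structure.* and position.any from Example 7.10 and the term computer, it yields the first PQF string shown in that example.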
Example 7.11. CQL to RPN string attributes
In this example we allow any index to be passed to RPN as a use attribute.
# Identifiers for prefixes used in this file. (index.*)
set.cql = info:srw/cql-context-set/1/cql-v1.1
set.rpn = http://bogus/rpn
set = http://bogus/rpn

# The default index when none is specified by the query
index.cql.serverChoice = 1=any

index.rpn.* = 1=*
relation.eq = 2=3
structure.* = 4=1
position.any = 3=3
The http://bogus/rpn
context set is also the default
so we can make queries such as
title = a
which is converted to
@attr 2=3 @attr 4=1 @attr 3=3 @attr 1=title "a"
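The * substitution used by index.rpn.* = 1=* amounts to a plain text replacement, as in the sketch below. The function name is hypothetical and the code is only an illustration of the rule, not the YAZ implementation.

```c
#include <string.h>

/* Copy rpn to out (capacity max), replacing every '*' with name.
   Returns 0 on success, -1 if the result does not fit. */
int expand_star(const char *rpn, const char *name, char *out, size_t max)
{
    size_t used = 0, nlen = strlen(name);

    for (; *rpn; rpn++) {
        if (*rpn == '*') {
            if (used + nlen + 1 > max)
                return -1;
            memcpy(out + used, name, nlen);     /* CQL name copied verbatim */
            used += nlen;
        } else {
            if (used + 2 > max)
                return -1;
            out[used++] = *rpn;
        }
    }
    if (used + 1 > max)
        return -1;
    out[used] = '\0';
    return 0;
}
```

For the query title = a, the index name title replaces the asterisk in 1=* and gives the string attribute 1=title seen in the result above.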
Example 7.12. CQL to RPN using Bath Profile
The file etc/pqf.properties
has mappings from
the Bath Profile and Dublin Core to RPN.
If YAZ is installed as a package it's usually located
in /usr/share/yaz/etc
and part of the
development package, such as libyaz-dev
.
Conversion from CQL to XCQL is trivial and does not require a mapping to be defined. There are three functions to choose from, depending on the way you wish to store the resulting output (XML buffer containing XCQL).
int cql_to_xml_buf(struct cql_node *cn, char *out, int max);

void cql_to_xml(struct cql_node *cn,
                void (*pr)(const char *buf, void *client_data),
                void *client_data);

void cql_to_xml_stdio(struct cql_node *cn, FILE *f);
Function cql_to_xml_buf
converts
to XCQL and stores result in a user supplied buffer of a given
max size.
cql_to_xml
writes the result in
a user defined output stream.
cql_to_xml_stdio
writes to a
file.
Conversion from PQF to CQL is offered by the two functions shown below. The former uses a generic stream for result. The latter puts result in a WRBUF (string container).
#include <yaz/rpn2cql.h>

int cql_transform_rpn2cql_stream(cql_transform_t ct,
                                 void (*pr)(const char *buf, void *client_data),
                                 void *client_data,
                                 Z_RPNQuery *q);

int cql_transform_rpn2cql_wrbuf(cql_transform_t ct,
                                WRBUF w,
                                Z_RPNQuery *q);
The configuration is the same as used in CQL to PQF conversions.