Search/Retrieve web services

From refbase

Revision as of 23:41, 30 August 2006
Matthias (Talk | contribs)
removed note about masking characters ('*' and '?') which are now supported
Current revision
Matthias (Talk | contribs)
updated URL(s) of the refbase database at ipoe.uni-kiel.de

refbase web service

refbase >v0.8.0 supports the SRU (Search/Retrieve via URL) standard search protocol for Internet search queries. SRU utilizes CQL (Common Query Language), a standard query syntax for representing queries. Both standards are developed by the Library of Congress and provide a generic API for searching a data repository and a mechanism for returning metadata records.

About the SRU/W web services-based protocols

SRU defines a web service for searching databases. SRU is a companion protocol of SRW, which is a SOAP version of the protocol. SRU can be regarded as a RESTful version of SRW, since all requests are simple URLs instead of XML documents sent via some sort of transport layer.

SRU has three basic operations (explain, scan and searchRetrieve), each of which defines the requests and responses of one kind of SRU interaction. Some simple examples are presented at http://www.loc.gov/standards/sru/simple.html.

refbase SRU server

Currently, the refbase SRU server (sru.php) supports explain and searchRetrieve operations (but not scan) and returns records as MODS XML wrapped into SRW XML. See the SRU web site for a description of the elements of a searchRetrieve response.
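
As a rough illustration of the "MODS XML wrapped into SRW XML" structure, the following Python sketch reads the record count and the titles from a response. The sample XML is hand-written for this example and is not actual refbase output; the namespace URIs are those of SRW 1.1 and MODS v3, and the helper name summarize_response is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs of SRW 1.1 and MODS v3.
NS = {
    "srw": "http://www.loc.gov/zing/srw/",
    "mods": "http://www.loc.gov/mods/v3",
}

# Hand-written, heavily shortened response (an assumption for illustration;
# a real response carries many more elements per record).
SAMPLE_RESPONSE = """
<srw:searchRetrieveResponse xmlns:srw="http://www.loc.gov/zing/srw/">
  <srw:version>1.1</srw:version>
  <srw:numberOfRecords>1</srw:numberOfRecords>
  <srw:records>
    <srw:record>
      <srw:recordSchema>mods</srw:recordSchema>
      <srw:recordPacking>xml</srw:recordPacking>
      <srw:recordData>
        <mods xmlns="http://www.loc.gov/mods/v3">
          <titleInfo><title>Example title</title></titleInfo>
        </mods>
      </srw:recordData>
      <srw:recordPosition>1</srw:recordPosition>
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>
"""


def summarize_response(xml_text):
    """Return (number of matching records, list of MODS titles)."""
    root = ET.fromstring(xml_text)
    count = int(root.findtext("srw:numberOfRecords", namespaces=NS))
    titles = [
        title.text
        for title in root.findall(
            ".//srw:recordData/mods:mods/mods:titleInfo/mods:title", NS
        )
    ]
    return count, titles


print(summarize_response(SAMPLE_RESPONSE))
```

Note that numberOfRecords reports the total number of matches, which can be larger than the number of records actually contained in one response.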

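The refbase SRU server can be used as a remote back-end database for the CiteProc processor of the XBib project. CiteProc is a comprehensive solution for bibliographic and citation formatting. See http://netapps.muohio.edu/blogs/darcusb/darcusb/archives/2005/07/30/web-services-and-distributed-citation-processing and http://bibliographic.openoffice.org/citeproc/ for a description and usage example that shows how to integrate refbase with CiteProc.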

Implementation notes

The refbase SRU server lets you query all global refbase fields (from the refs MySQL table). The given index name must either match one of the 'set.index' names listed in the explain response (sru.php?operation=explain) or match a refbase field name directly. If no index name is given, the serial field is searched by default.

Note that for valid queries (i.e., if the version and query parameters are both present), operation=searchRetrieve is assumed if the operation parameter is omitted. Additionally, only recordPacking=xml and recordSchema=mods are supported, and sru.php uses these settings by default if they are not given in the query. Data are returned together with a default stylesheet if the stylesheet parameter is not given. XPath, sort and result sets are currently not supported, and only SRW version 1.1 is recognized. Also note that sru.php currently allows only a limited set of CQL queries; future versions may offer support for the boolean CQL operators 'and/or/not' and parentheses.
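
The defaulting rules can be modelled on the client side as a small Python sketch. It only mirrors the behaviour described on this page and is not taken from the sru.php source; the function name effective_parameters is hypothetical.

```python
def effective_parameters(params):
    """Fill in the defaults that this page describes for sru.php."""
    effective = dict(params)

    if not effective:
        # This page states that sru.php called without any parameters
        # returns the explain response.
        effective["operation"] = "explain"
        return effective

    if "version" in effective and "query" in effective:
        # A valid query implies searchRetrieve if no operation is given.
        effective.setdefault("operation", "searchRetrieve")

    if effective.get("operation") == "searchRetrieve":
        effective.setdefault("recordPacking", "xml")
        effective.setdefault("recordSchema", "mods")
        if (effective["recordPacking"], effective["recordSchema"]) != ("xml", "mods"):
            # Only recordPacking=xml and recordSchema=mods are supported.
            raise ValueError("unsupported recordPacking or recordSchema")

    return effective


print(effective_parameters({"version": "1.1", "query": "dc.creator=schmid"}))
```

How the real server reacts to an unsupported value is not described here, so raising ValueError is purely a choice made for this sketch.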

Usage examples

Below are some working examples for an online refbase SRU server. The relative URLs shown are resolved against https://refbase.ipoe.uni-kiel.de/refs/:

You can call sru.php without any parameters or with the operation=explain parameter to retrieve a standard explain response which tells the requesting client the settings and features provided by the refbase web service:

sru.php
sru.php?
sru.php?operation=explain

Here's a simple query that asks for all database entries where the creator index (i.e., the refbase author field) contains "Schmid". Found data are returned as SRW+MODS XML:

sru.php?version=1.1&query=dc.creator=schmid

Note that the version=1.1 and query parameters are mandatory and that the query string of the query parameter is specified as CQL. Mike Taylor has written a nice introduction to CQL (http://zing.z3950.org/cql/intro.html). The standard explain response (see above example) lists all available indexes and their corresponding refbase fields.
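
For scripted access, a request URL with the two mandatory parameters can be assembled as in the following Python sketch. The base URL is the host named on this page (the sketch sends no request, so whether the host is reachable is not checked), and the helper name sru_url is hypothetical.

```python
from urllib.parse import quote, urlencode

# Host named on this page; adjust to your own refbase installation.
BASE_URL = "https://refbase.ipoe.uni-kiel.de/refs/sru.php"


def sru_url(cql_query, **extra):
    """Build a searchRetrieve URL with the mandatory version and query parameters."""
    params = {"version": "1.1", "query": cql_query}
    params.update(extra)
    # quote_via=quote encodes spaces as %20; safe="=" keeps '=' readable
    # inside the CQL string, as in the URLs shown on this page.
    return BASE_URL + "?" + urlencode(params, safe="=", quote_via=quote)


print(sru_url("dc.creator=schmid"))
print(sru_url("dc.title any ecology diversity"))
```

Extra SRU parameters can be passed as keyword arguments, for example sru_url("dc.creator=schmid", stylesheet="") to request a response without a stylesheet.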

As mentioned above, a simple default stylesheet will be returned with the response if the stylesheet parameter is omitted in the SRU request. However, you can supply your own stylesheet as follows (note that the given stylesheet just serves as an example and is of no real-world use since it simply strips all XML tags):

sru.php?version=1.1&query=dc.creator=schmid&stylesheet=xml2html.xsl

You can suppress any stylesheets by including the stylesheet parameter without a value:

sru.php?version=1.1&query=dc.creator=schmid&stylesheet=

By default, refbase will return as many records as specified by the admin in variable $defaultNumberOfRecords in initialize/ini.inc.php. You can use the startRecord and maximumRecords parameters to explicitly define the first record and the maximum number of records that shall be returned:

sru.php?version=1.1&query=dc.creator=schmid&startRecord=10&maximumRecords=10
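
Because SRU counts records from 1, the request above covers matches 10 to 19 (if that many exist). The following Python sketch computes the two paging parameters for a given page and for the last few records of a result; both function names are hypothetical.

```python
def record_window(page, page_size):
    """Return startRecord/maximumRecords for a 1-based page number."""
    if page < 1 or page_size < 0:
        raise ValueError("page must be >= 1 and page_size >= 0")
    # SRU record positions start at 1, so page 1 starts at record 1.
    return {"startRecord": (page - 1) * page_size + 1, "maximumRecords": page_size}


def last_records(total, count):
    """Return the parameters that select the last `count` of `total` matches."""
    count = min(count, total)
    return {"startRecord": total - count + 1, "maximumRecords": count}


print(record_window(2, 10))
print(last_records(6, 3))
```

The second call corresponds to the example further down this page, where six serial numbers are queried and startRecord=4 with maximumRecords=3 returns the last three records.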

Here are some other SRU queries that should give you some ideas about what's possible (all given queries are valid but the last one in each set is the preferred one since it uses standard 'set.index' names):

  • return record with serial number 1:
sru.php?version=1.1&query=1
sru.php?version=1.1&query=serial=1
sru.php?version=1.1&query=rec.identifier=1
  • find all records where the title field contains either "ecology" OR "diversity":
sru.php?version=1.1&query=title any ecology diversity
sru.php?version=1.1&query=dc.title any ecology diversity
  • find all records where the author field contains both "dieckmann" AND "thomas":
sru.php?version=1.1&query=author all dieckmann thomas
sru.php?version=1.1&query=dc.creator all dieckmann thomas
  • find all records where the publication field equals EXACTLY "Marine Ecology Progress Series":
sru.php?version=1.1&query=publication exact Marine Ecology Progress Series
  • find all records where the year field is greater than or equal to "2005":
sru.php?version=1.1&query=year>=2005
sru.php?version=1.1&query=dc.date>=2005
  • find records with serial numbers 1, 123, 499, 612, 21654 & 23013 but return only the last three records:
sru.php?version=1.1&query=1 123 499 612 21654 23013&startRecord=4&maximumRecords=3
  • same as above, but return just the number of found records (and not the full record data):
sru.php?version=1.1&query=1 123 499 612 21654 23013&maximumRecords=0
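
The query strings in the list above all follow the pattern "index, relation, terms". As a minimal sketch, assuming only the relations that appear on this page, they can be composed and percent-encoded in Python as follows; cql_clause and encode_query are hypothetical helper names.

```python
from urllib.parse import quote

# Word-like relations used on this page; they are separated by spaces.
WORD_RELATIONS = {"any", "all", "exact"}


def cql_clause(index, relation, *terms):
    """Compose a single CQL clause such as 'dc.title any ecology diversity'."""
    value = " ".join(terms)
    if relation in WORD_RELATIONS:
        return f"{index} {relation} {value}"
    # Symbolic relations such as '=' or '>=' are written without spaces here.
    return f"{index}{relation}{value}"


def encode_query(cql):
    """Percent-encode a CQL string for use as the value of the query parameter."""
    return quote(cql, safe="=")


for clause in (
    cql_clause("rec.identifier", "=", "1"),
    cql_clause("dc.title", "any", "ecology", "diversity"),
    cql_clause("dc.creator", "all", "dieckmann", "thomas"),
    cql_clause("publication", "exact", "Marine Ecology Progress Series"),
    cql_clause("dc.date", ">=", "2005"),
):
    print("sru.php?version=1.1&query=" + encode_query(clause))
```

Spaces become %20 and '>' becomes %3E, which is the form a client has to send even though the list above shows the queries unencoded for readability.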