#DOAP Raptor RDF Parser Library Dave Beckett Overview Raptor is a free software / Open Source C library that provides a set of parsers and serializers that generate Resource Description Framework (RDF) triples by parsing syntaxes or serialize the triples into a syntax. The supported parsing syntaxes are RDF/XML, N-Triples, TRiG, Turtle, RSS tag soup including all versions of RSS, Atom 1.0 and 0.3, GRDDL and microformats for HTML, XHTML and XML and RDFa. The serializing syntaxes are RDF/XML (regular, and abbreviated), Atom 1.0, GraphViz, JSON, N-Triples, RSS 1.0 and XMP. Raptor was designed to work closely with the Redland RDF library (RDF Parser Toolkit for Redland) but is entirely separate. It is a portable library that works across many POSIX systems (Unix, GNU/Linux, BSDs, OSX, cygwin, win32). Raptor has no memory leaks and is fast. This is a mature and stable library. A summary of the changes can be found in the NEWS file, detailed API changes in the release notes and file-by-file changes in the ChangeLog. * Designed to integrate well with Redland * Parses content on the web if libcurl, libxml2 or BSD libfetch is available. * Supports all RDF terms including datatyped and XML literals * Optional features including parsers and serialisers can be selected at configure time. * Language bindings to Perl, PHP, Python and Ruby when used via Redland * No memory leaks * Fast * Standalone rapper RDF parser utility program Known bugs and issues are recorded in the Redland issue tracker. Parsers RDF/XML Parser A Parser for the standard RDF/XML syntax. * Fully handles the RDF/XML syntax updates for XML Base, xml:lang, RDF datatyping and Collections. * Handles all RDF vocabularies such as FOAF, RSS 1.0, Dublin Core, OWL, DOAP * Handles rdf:resource / resource attributes * Uses expat and/or (GNOME) libxml XML parsers as available or required N-Triples Parser A parser for the N-Triples syntax as defined by the W3C RDF Core working group for the RDF Test Cases. Turtle Parser A parser for the Turtle Terse RDF Triple Language syntax, designed as a useful subset of Notation 3. TRiG Parser A parser for the TriG - Turtle with Named Graphs syntax. The parser is alpha quality and may not support the entire TRiG specification. RSS "tag soup" parser A parser for the multiple XML RSS formats that use the elements such as channel, item, title, description in different ways. Attempts to turn the input into RSS 1.0 RDF triples. True RSS 1.0, as a full RDF vocabulary, is best parsed by the RDF/XML parser. It also generates triples for RSS enclosures. This parser also provides support for the Atom 1.0 syndication format defined in IETF RFC 4287 GRDDL and microformats parser A parser/processor for Gleaning Resource Descriptions from Dialects of Languages (GRDDL) syntax, W3C Recommendation of 2007-09-11 which allows reading XHTML and XML as RDF triples by using profiles in the document that declare XSLT transforms from the XHTML or XML content into RDF/XML or other RDF syntax which can then be parsed. It uses either an XML or a lax HTML parser to allow HTML tag soup to be read. The parser passes the all the GRDDL tests as of Raptor 1.4.16. The parser also handles hCard and hReview using public XSL sheets. RDFa parser A parser for RDFa (W3C Candidate Recommendation 20 June 2008) implemented via librdfa linked inside Raptor, written by Manu Sporny of Digital Bazaar, licensed with the same license as Raptor. As of Raptor 1.4.18 the RDFa parser passes all of the RDFa test suite except for 4 tests. Serializers RDF/XML Serializer A serializer to the standard RDF/XML syntax as revised by the W3C RDF Core working group in 2004. This writes a plain triple-based RDF/XML serialization with no optimisation or pretty-printing. A second serializer is provided using several of the RDF/XML abbreviations to provide a more compact readable format, at the cost of some pre-processing. This is suitable for small documents. N-Triples Serializer A serializer to the N-Triples syntax as used by the W3C RDF Core working group for the RDF Test Cases. Atom 1.0 Serializer A serializer to the Atom 1.0 syndication format defined in IETF RFC 4287. Beta quality. JSON Serializers Two serializers for to write triples encoded in JSON, one (json) in a resource-centric abbreviated form RDF/JSON like Turtle or RDF/XML-Abbreviated; the other a triple-centric format (json-triples) based on the SPARQL results in JSON format. Beta quality. GraphViz DOT Serializer An serializer to the GraphViz DOT format which aids visualising RDF graphs. RSS 1.0 Serializer A serializer to the RDF Site Summary (RSS) 1.0 format. Turtle Serializer A serializer for the Turtle Terse RDF Triple Language syntax. XMP Serializer An alpha quality serializer to the Adobe XMP profile of RDF/XML suitable for embedding inside an external document. Documentation The public API is described in the libraptor.3 UNIX manual page. It is demonstrated in the rapper utility program which shows how to call the parser and write the triples in a serialization. When Raptor is used inside Redland, the Redland documentation explains how to call the parser and contains several example programs. There are also further examples in the example directory of the distribution. To install Raptor see the Installation document. Sources The packaged sources are available from http://download.librdf.org/source/ (master site) and also from the SourceForge site. The development Subversion sources can also be browsed with ViewCV. License This library is free software / open source software released under the LGPL (GPL) or Apache 2.0 licenses. See LICENSE.html for full details. Mailing Lists The Redland mailing lists discusses the development and use of Raptor and Redland as well as future plans and announcement of releases. __________________________________________________________________ Copyright (C) 2000-2010 Dave Beckett Copyright (C) 2000-2005 University of Bristol