NAME
    XRD::Parser - parse XRD and host-meta files into RDF::Trine models

SYNOPSIS
      use RDF::Query;
      use XRD::Parser;

      my $parser = XRD::Parser->new(undef, "http://example.com/foo.xrd");
      my $results = RDF::Query->new(
        "SELECT * WHERE {?who ?auth.}")
        ->execute($parser->graph);

      while (my $result = $results->next)
      {
        print $result->{'auth'}->uri . "\n";
      }

    or maybe:

      my $data = XRD::Parser->hostmeta('gmail.com')
                              ->graph
                              ->as_hashref;

VERSION
    0.100

DESCRIPTION
    While XRD has a rather different history, it turns out it can mostly be
    thought of as a serialisation format for a limited subset of RDF.

    This package ignores the order of elements, as RDF is a graph format
    with no concept of statements coming in an "order". The XRD spec says
    that grokking the order of elements is only a SHOULD. That said, if
    you're concerned about the order of elements, the callback routines
    allowed by this package may be of use.

    This package aims to be roughly compatible with RDF::RDFa::Parser's
    interface.

  Constructors
    "$p = XRD::Parser->new($content, $uri, \%options, $store)"
        This method creates a new XRD::Parser object and returns it.

        The $content variable may contain an XML string or an
        XML::LibXML::Document. If it is a string, the document is parsed
        using XML::LibXML::Parser, which may throw an exception.
        XRD::Parser does not catch the exception.

        $uri is the supposed URI of the content; it is used to resolve any
        relative URIs found in the XRD document. Also, if $content is
        undef, then XRD::Parser will attempt to retrieve $uri using
        LWP::UserAgent.

        Options [default in brackets]:

          * default_subject - If no <Subject> element. [undef]
          * link_prop       - How to handle <Property> in <Link>? [0]
                              0=skip, 1=reify, 2=subproperty, 3=both.
          * loose_mime      - Accept text/plain & app/octet-stream. [0]
          * tdb_service     - thing-described-by.org when possible. [0]

        $store is an RDF::Trine::Storage object. If undef, then a new
        temporary store is created.

    "$p = XRD::Parser->hostmeta($uri)"
        This method creates a new XRD::Parser object and returns it.
        The parameter may be a URI (from which the hostname will be
        extracted) or just a bare host name (e.g. "example.com"). The
        resource "/.well-known/host-meta" will then be fetched from that
        host using an appropriate HTTP Accept header, and the parser
        object returned.

  Public Methods
    "$p->uri($uri)"
        Returns the base URI of the document being parsed. This will
        usually be the same as the base URI provided to the constructor.

        Optionally it may be passed a parameter - an absolute or relative
        URI - in which case it returns that same URI as an absolute URI,
        resolved relative to the document's base URI.

        This seems like two unrelated functions, but if you consider the
        consequence of passing a relative URI consisting of a zero-length
        string, it in fact makes sense.

    "$p->dom"
        Returns the parsed XML::LibXML::Document.

    "$p->graph"
        This method will return an RDF::Trine::Model object with all
        statements of the full graph.

        This method will automatically call "consume" first, if it has not
        already been called.

    "$p->set_callbacks(\%callbacks)"
        Set callback functions for the parser to call on certain events.
        These are only necessary if you want to do something especially
        unusual.

          $p->set_callbacks({
            'pretriple_resource' => sub { ... } ,
            'pretriple_literal'  => sub { ... } ,
            'ontriple'           => undef ,
            });

        Either of the two pretriple callbacks can be set to the string
        'print' instead of a coderef. This enables built-in callbacks for
        printing Turtle to STDOUT.

        For details of the callback functions, see the section CALLBACKS.
        "set_callbacks" must be used *before* "consume". "set_callbacks"
        returns a reference to the parser object itself.

        *NOTE:* the behaviour of this function was changed in version
        0.05.

    "$p->consume"
        This method processes the input DOM and sends the resulting
        triples to the callback functions (if any).

        If called again, it does nothing.

        Returns the parser object itself.
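    The methods above can be combined as in the following minimal sketch.
    It is an illustration under stated assumptions, not part of the
    module: the sample XRD document, the example.com URIs, the @seen list
    and the "/private/" filter rule are all hypothetical. The parse step
    only runs if XRD::Parser is installed, and whether the sample document
    is recognised depends on the XRD namespace the installed version
    expects.

```perl
use strict;
use warnings;

# Hypothetical sample document; the namespace below is the XRD 1.0 one and
# is an assumption about what the installed parser version accepts.
my $xrd = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">
  <Subject>http://example.com/foo</Subject>
  <Link rel="http://example.com/rel/author" href="http://example.com/alice"/>
  <Link rel="http://example.com/rel/notes" href="http://example.com/private/notes"/>
</XRD>
XML

# Records every resource triple offered to the callback (illustrative only).
my @seen;

my %callbacks = (
    # Receives: parser, element, subject, predicate, object (all per CALLBACKS).
    # Returns 1 to skip the triple, 0 to keep it.
    pretriple_resource => sub {
        my ($parser, $element, $subject, $predicate, $object) = @_;
        push @seen, [ $subject, $predicate, $object ];
        return ($object =~ m{/private/}) ? 1 : 0;
    },
    # Keep every literal triple.
    pretriple_literal => sub { return 0 },
);

if (eval { require XRD::Parser; 1 }) {
    # Content is passed in directly, so nothing is fetched over HTTP;
    # the second argument only serves as the base URI.
    my $parser = XRD::Parser->new($xrd, 'http://example.com/foo.xrd');

    # Callbacks must be registered before consume.
    $parser->set_callbacks(\%callbacks);

    # consume returns the parser, so graph can be chained onto it.
    my $model = $parser->consume->graph;
    printf "%d triples kept, %d resource triples seen\n",
        $model->size, scalar @seen;
}
else {
    print "XRD::Parser is not installed; skipping the parse.\n";
}
```

    Returning 1 from a pretriple callback keeps the triple out of the
    graph, so the filter above should drop any link whose target contains
    "/private/" while still recording it in @seen.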
  Utility Functions
    "$uri = XRD::Parser::host_uri($uri)"
        Returns a URI representing the host. These crop up often in graphs
        gleaned from host-meta files.

        $uri can be an absolute URI like 'http://example.net/foo#bar' or a
        host name like 'example.com'.

    "$uri = XRD::Parser::template_uri($relationship_uri)"
        Returns a URI representing not a normal relationship, but the
        relationship between a host and a template URI literal.

CALLBACKS
    Several callback functions are provided. These may be set using the
    "set_callbacks" function, which takes a hashref of keys pointing to
    coderefs. The keys are named for the event to fire the callback on.

  pretriple_resource
    This is called when a triple has been found, but before preparing the
    triple for adding to the model. It is only called for triples with a
    non-literal object value.

    The parameters passed to the callback function are:

    *   A reference to the "XRD::Parser" object

    *   A reference to the "XML::LibXML::Element" being parsed

    *   Subject URI or bnode (string)

    *   Predicate URI (string)

    *   Object URI or bnode (string)

    The callback should return 1 to tell the parser to skip this triple
    (not add it to the graph); return 0 otherwise.

  pretriple_literal
    This is the equivalent of pretriple_resource, but is only called for
    triples with a literal object value.

    The parameters passed to the callback function are:

    *   A reference to the "XRD::Parser" object

    *   A reference to the "XML::LibXML::Element" being parsed

    *   Subject URI or bnode (string)

    *   Predicate URI (string)

    *   Object literal (string)

    *   Datatype URI (string or undef)

    *   Language (string or undef)

    The callback should return 1 to tell the parser to skip this triple
    (not add it to the graph); return 0 otherwise.

  ontriple
    This is called once a triple is ready to be added to the graph. (After
    the pretriple callbacks.)
    The parameters passed to the callback function are:

    *   A reference to the "XRD::Parser" object

    *   A reference to the "XML::LibXML::Element" being parsed

    *   An RDF::Trine::Statement object.

    The callback should return 1 to tell the parser to skip this triple
    (not add it to the graph); return 0 otherwise. The callback may modify
    the RDF::Trine::Statement object.

SEE ALSO
    RDF::Trine, RDF::Query, RDF::RDFa::Parser.

AUTHOR
    Toby Inkster

COPYRIGHT AND LICENSE
    Copyright (C) 2009-2010 by Toby Inkster

    This library is free software; you can redistribute it and/or modify
    it under the same terms as Perl itself, either Perl version 5.8 or, at
    your option, any later version of Perl 5 you may have available.