Sprinkling POWDER on metaTXT

I've recently been in touch with Mícheál Ó Foghlú of the Waterford Institute of Technology about metaTXT, an initiative being led by Bena Roberts et al. at Visibility Mobile.

The motivation for metaTXT comes from the mobile search companies who want to be better able to identify content that is mobile-friendly as distinct from desktop-friendly. Sounds like a job for POWDER? (mobileOK was a primary use case for POWDER's development).

In brief, metaTXT is similar to robots.txt in design and planned implementation. It offers fields and values in a text file that you stick in the root directory of a Web site, and search engines use this in their algorithms to better index your site. POWDER can be used to do the same thing (and a lot more besides). It is an XML dialect that can transport property-value pairs applicable to any defined group of resources. The whole thing can be transformed (programmatically) into RDF/OWL for processing as part of the Semantic Web.

As I see it, the differences between the two are:

POWDER | metaTXT
Requires that all descriptions are attributed to the entity that creates them, thus allowing consumers of the data to authenticate it with the author (via any of a number of means). | Attributed to the Web site publisher by implication. You'll trust the data if and only if you trust the Web site.
Supports description of any resource by anyone. | Only supports first-party description.
A Description Resource can apply to a whole Web site, part of a Web site, a specific resource, or millions of Web sites. | Applies to 'a Web site' without defining the term.
Can transport any property/value pairs. | Optimised for a specific purpose with an in-built list of terms.
Uses XML, has a validator and schema, is open to transformation into other formats etc. | Uses plain text and is optimised to perform a specific function.
Uses HTML Link, HTTP Link and RDFa to point from a resource to its description, which can be anywhere. | Prescribes a well-known location.
Allows for repositories to be built and maintained that provide metadata about many different Web sites. Such repositories, if of good quality, will be of direct relevance to search engines without them crawling each individual site to find the data. | Doesn't support repositories.
Well-formed and highly expressive format, encompassing RDF, but takes a while to compose. | Simple free-form text, uncomplicated syntax, takes just a minute to create.
Descriptions can be located anywhere, so there needs to be a discovery process, which could be complex. | Data is always located at the root of the site that it describes.
Generic description capability, not intended for just one or a few use cases. | Focussed use case, and includes specific support for mobile entry points into sites.

Items thus highlighted contributed by Rotan Hanrahan.
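
To give a feel for the discovery step mentioned in the comparison, here is a minimal Python sketch of how a crawler might follow an HTML link element from a page to its POWDER file. It relies on the describedby link relation and the application/powder+xml media type that POWDER uses; the class name, the function name and the /powder.xml location are my own assumptions for illustration, and HTTP Link headers and RDFa are not covered.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class DescribedByFinder(HTMLParser):
    """Collect the targets of <link rel="describedby"> elements in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.powder_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel is a space-separated list of link relations
        rels = (attr.get("rel") or "").lower().split()
        if "describedby" in rels and attr.get("href"):
            # Resolve relative references against the page's own URL
            self.powder_links.append(urljoin(self.base_url, attr["href"]))


def find_powder_links(html, base_url):
    """Return absolute URLs of the descriptions a page points to."""
    finder = DescribedByFinder(base_url)
    finder.feed(html)
    return finder.powder_links


# Hypothetical page; /powder.xml is an assumed location, not a prescribed one.
PAGE = """<html><head>
<title>Example</title>
<link rel="stylesheet" href="/style.css">
<link rel="describedby" href="/powder.xml" type="application/powder+xml">
</head><body></body></html>"""

links = find_powder_links(PAGE, "http://example.com/index.html")
print(links)
```

With metaTXT none of this is needed, since the file always sits at the root of the site. That is exactly the trade-off described above: flexibility on one side, trivial discovery on the other.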

metaTXT in POWDER - an Example

The example below captures many of the features of the metaTXT proposal. To do so it uses terms from Dublin Core where appropriate, terms from the GeoURL namespace (I can't find an example of anyone using ICBM terms in RDF, so the namespace quoted is made up) and, for the more specific items, an 'mt' namespace tied to the visibilitymobile.com domain (which, again, I've made up).

This example is also available as an XML file

   <?xml version="1.0"?>
1  <powder xmlns="http://www.w3.org/2007/05/powder#"
2          xmlns:dcterms="http://purl.org/dc/terms/"
3          xmlns:icbm="http://geourl.org/icbm#"
4          xmlns:mt="http://www.visibilitymobile.com/metaTXT#">
  
5    <attribution>
6      <issuedby src="http://authority.example.org/company.rdf#me" />
7      <issued>2008-11-17T14:35:00</issued>
8    </attribution>

9    <dr>
10     <iriset>
11       <includehosts>example.com</includehosts>
12     </iriset>

13     <descriptorset>
14       <mt:name>example.com - The Exemplary Web site</mt:name>
15       <dcterms:description>example.com is a widely used example website</dcterms:description>
16       <dcterms:subject>example, demo, demonstration</dcterms:subject>
17       <mt:pcentrypage>http://www.example.com</mt:pcentrypage>
18       <mt:mobile>http://m.example.com</mt:mobile>
19       <mt:pc-sitemap>http://example.com/sitemap.xml</mt:pc-sitemap>
20       <mt:mobile-sitemap>http://example.com/mobilesitemap.xml</mt:mobile-sitemap>
21       <mt:rss>http://rss.example.com/rss/topstoriesoftheday.xml</mt:rss>
22       <mt:rss>http://rss.example.com/rss/toppoliticalstory.xml</mt:rss>
23       <mt:rss>http://rss.example.com/rss/topsportstory.xml</mt:rss>
24       <mt:podcast>http://rss.example.com/podcasting/news.xml</mt:podcast>
25       <mt:video>http://rss.example.com/rss/tutorial.xml</mt:video>
26       <icbm:longitude>98.7654321</icbm:longitude>
27       <icbm:latitude>12.3456789</icbm:latitude>
28       <icbm:region>MM</icbm:region>
29     </descriptorset>
30   </dr>

31   <dr>
32     <iriset>
33       <includehosts>m.example.com</includehosts>
34     </iriset>

35     <descriptorset>
36       <typeof src="http://www.w3.org/2008/06/mobileOK#Conformant" />
37       <displaytext>The example.com Web site conforms to mobileOK</displaytext>
38       <displayicon src="http://www.w3.org/2005/11/MWI-Icons/mobileOK.png" />
39     </descriptorset>
40   </dr>    


41 </powder>
Line 6
Points to a file that describes the author. This is usually done using FOAF or DC Terms (see the POWDER Primer for an example).
Lines 9 - 30
The first of two Description Resources in this document
Lines 10 - 12
Define the scope of this Description Resource (everything on example.com)
Lines 13 - 29
The actual descriptors. As noted, I've used a variety of namespaces to do this but the terms are all as listed in the metaTXT documentation.
Lines 31 - 40
A second DR, this time one that only applies to the m.example.com domain - which is tagged as being mobileOK conformant.

Processing the first DR reveals that any resource anywhere on example.com or its subdomains has the given name and keywords, is listed in the sitemap, links to the RSS feeds and so on.

Conversely, a search engine can see exactly where each of those features can be found.

Processing the second DR reveals that resources on the m.example.com domain (and any subdomains it may have) conform to W3C mobileOK.

Conversely, a search engine wishing to index pages that were mobileOK knows where to look.
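
The processing described above can be sketched in a few lines of Python. This is only an illustration under some assumptions of mine, not a conformant POWDER Processor: it looks at includehosts alone (treated as a white-space separated list of hosts that also covers their subdomains), ignores the attribution block, and assumes that every DR whose scope matches contributes its descriptors. The function names and the trimmed-down document are made up for the purpose.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

NS = {"p": "http://www.w3.org/2007/05/powder#"}

# A cut-down version of the example, keeping one descriptor per DR.
DOC = """<powder xmlns="http://www.w3.org/2007/05/powder#"
        xmlns:mt="http://www.visibilitymobile.com/metaTXT#">
  <dr>
    <iriset><includehosts>example.com</includehosts></iriset>
    <descriptorset>
      <mt:mobile>http://m.example.com</mt:mobile>
    </descriptorset>
  </dr>
  <dr>
    <iriset><includehosts>m.example.com</includehosts></iriset>
    <descriptorset>
      <typeof src="http://www.w3.org/2008/06/mobileOK#Conformant" />
    </descriptorset>
  </dr>
</powder>"""


def host_in_scope(host, listed_hosts):
    """True if host is one of the listed hosts or a subdomain of one."""
    host = host.lower()
    return any(host == h or host.endswith("." + h) for h in listed_hosts)


def describe(url, powder_xml):
    """Return (qualified tag, value) pairs from every DR whose scope covers url."""
    root = ET.fromstring(powder_xml)
    host = urlsplit(url).hostname or ""
    found = []
    for dr in root.findall("p:dr", NS):
        hosts_el = dr.find("p:iriset/p:includehosts", NS)
        if hosts_el is None or not hosts_el.text:
            continue  # this sketch only understands includehosts
        if not host_in_scope(host, hosts_el.text.lower().split()):
            continue
        descriptors = dr.find("p:descriptorset", NS)
        if descriptors is None:
            continue
        for d in descriptors:
            # Descriptors carry their value either in a src attribute or as text
            found.append((d.tag, d.get("src") or (d.text or "").strip()))
    return found


desktop = describe("http://www.example.com/news", DOC)
mobile = describe("http://m.example.com/", DOC)
print(desktop)
print(mobile)
```

Under these assumptions a page on www.example.com picks up only the descriptors from the first DR, while a page on m.example.com, being a subdomain of example.com, picks up both sets, including the mobileOK conformance claim.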


Please note that this is not in any way an 'official document,' neither is it a statement on behalf of anyone. It's just a personal reflection on metaTXT and POWDER.

Phil Archer
Version 1.0
18 November 2008
I have validated the example but it has proved that there's a way to go yet on the Processor - ouch.