ModWikiDevelopmentHistory


This page archives the discussion that went into the development of the ModWiki RichSiteSummary standard. Ongoing discussion goes onto ModWikiDiscussion.


General discussion


Compatibility with older parsers

More testing is needed.

ModWiki currently recommends (in a couple of places) syntax similar to the following:

    <dc:contributor>
      <rdf:Description rss:link="http://openwiki.com/?MaryMcConnell"
                       wiki:host="192.168.1.10">
        <rdf:value>Mary McConnell</rdf:value>
      </rdf:Description>
    </dc:contributor>

The hope behind this convoluted syntax is that older non-RDF-aware RSS parsers will still manage, at least, to extract Mary McConnell as the value of <dc:contributor>.

Unfortunately, the RSS parser I've been using, Perl's [XML::RSS], completely chokes on this syntax. Since there are cleaner, RDF-equivalent alternatives, there is no point in recommending this syntax if it doesn't help for at least some legacy parsers. Does it work in any of them?


Rename <changes/>

The term changes seems fairly ambiguous to me. Does it refer to the extent of the current change (e.g. number of modified lines) or is it a revision serial number? Given the context, I suspect the latter, in which case I would suggest naming it serial. I wonder whether it's needed at all (see the proposed use of dc:identifier below), and whether there's really a need to restrict the serial number to integral values. -- JeffDairiki

Can't find the proposed use of dc:identifier below. -- LaurensPit

From below:

dc:identifier
A URI which is unique to a particular RecentChanges entry (i.e. unique to a particular page version.)

The motivation for including this is to provide a "fingerprint" of the item for aggregators. If <item/>s from two separate RSS sources have the same dc:identifier, they are the same item. (If the <item/>'s rdf:about attribute were required to be unique to a particular page version, then that would serve an identical purpose, and dc:identifier would not be needed. Further discussion on this below.) --JeffDairiki

While I agree that <changes/> is not a good name, <serial/> is worse and <dc:identifier/> is not appropriate according to DublinCore. I think <version/> would be better, as that does not require a positive integer, or a serial ordering. Some wikis might use a time_t timestamp (seconds since January 1, 1970), for instance. -- SunirShah


<item/>

I think the proposal should include some discussion regarding the generation of item URIs (the values of the rdf:about attributes of the items.) Do/should these URIs refer to a specific revision of a page or to the page as a whole? (Or can it be either?) My feeling is that the URIs should be unique to a specific page revision. This implies that an <item> refers to a specific page revision, and it allows information about different revisions of the same page to be included in a channel.

Nit: the example RSS in the proposal is not valid. The <channel> needs an <item/> property, and the <item> needs an rdf:about attribute. -- JeffDairiki

Nitback: I think you meant <items/> instead of <item/>. hehe ;-) Anyways, I added the rdf:about attribute and the <items/> stuff to make it more complete. -- LaurensPit

We cannot force wiki engines to provide a way to link to specific versions. Many do not have this feature. Many only have the EditCopy type of versioning. Thus, rdf:about may only be able to point to the page in general. We could make a recommendation, however. Since the <link/> tag will also be required, this is a good idea.

It's true that the version 3 is wrong in that <item/> is a child of <channel/>. -- SunirShah

Does an rss:item refer to a wiki page, or does it refer to a particular version of a wiki page?

Can be both. -- LaurensPit

I think the rdf:about may point to a version (or the page in general), but the <link/> must point to the page in general. -- SunirShah

Some wiki engines, however, do keep (and make accessible) a number of older page revisions. For those wikis it would be a shame if one couldn't use RSS to express things like: page histories (a list of page revisions) and RecentChanges listings which include all (not just the most recent) changes to each page. I say the item URIs (the rdf:about attribute of the <item/> elements) must be unique to a particular page revision, while the <link/> property may point to either the page in general, or a specific version of the page. :-) This has the advantage that the URIs are unique identifiers for RecentChanges entries, thus obviating the requirement for <changes/>, <version/>, or <dc:identifier/> (some of which could still be optional properties.) -- JeffDairiki

If rdf:about must point to a specific version you will require wiki engines to have versioning available, since the rdf:about attribute is #REQUIRED in RSS1.0, which would exclude WikiWiki. This is absolutely impossible. At its strongest, we may choose the word should. But that would prevent particular wiki owners from choosing whether they want this behaviour. As it is unwiki to even have versions, I don't think mod_wiki should make this assertion.

It's not illegal to have <link/> and the <item rdf:about/> be equivalent. I don't think we should enforce their difference.

Further, I don't think the <link/> attribute should ever point to a specific version. That would not express the very wiki RecentChanges, the goal of this project. Thus, the <link/> attribute must point to the page in general, not a specific version. -- SunirShah

The only requirement for the value of rdf:about attributes is that each one is described in the form of a URI and is unique with respect to other rdf:about attributes in the same RSS document. It's used for identification purposes only, it does not necessarily have to be a URL to an existing webpage, though it's common to do so as a URL is a URI (but not the other way around as URLs are a subset of URIs).

Whether the URI must be a URL pointing to the general wikipage or to a specific version of the wikipage or something that is not even a URL (e.g. a pointer to a database record) should be left to the RSS author.

When an RSS author decides to put two or more items that link to the same wikipage in one RSS document, then the author must make sure each item's rdf:about value is a unique URI within the document. Usually the author will accomplish this by using URLs pointing to the version specific wikipages.

RSS aggregators should not assume the value of rdf:about is pointing to a wikipage. For that one should look at the value of the <link> element.

Whether the value of the <link> element must be a URL pointing to the general wikipage or to a specific version of the wikipage should also be left to the RSS author. Though I would make it a strong recommendation (i.e. a should) that the <link> element should be pointing to the general wikipage. -- LaurensPit
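The uniqueness requirement on rdf:about described above is easy to check mechanically. The following is a minimal Python sketch, not part of any mod_wiki draft: the helper names (duplicate_abouts, feed) and the skeletal feed layout are assumptions made for illustration, and the version=22 URI is invented alongside the version=23 one used elsewhere on this page.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from xml.sax.saxutils import quoteattr

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ABOUT = f"{{{RDF}}}about"  # ElementTree spelling of the rdf:about attribute


def duplicate_abouts(document):
    """Return the rdf:about values that occur more than once in the document."""
    root = ET.fromstring(document)
    counts = Counter(
        elem.get(ABOUT) for elem in root.iter() if elem.get(ABOUT) is not None
    )
    return sorted(uri for uri, n in counts.items() if n > 1)


def feed(*item_uris):
    """Build a skeletal RSS 1.0 document with one <item> per URI (illustration only)."""
    # quoteattr() escapes "&" so the URIs are safe inside an XML attribute.
    items = "".join(f"<item rdf:about={quoteattr(uri)}/>" for uri in item_uris)
    return (
        f'<rdf:RDF xmlns:rdf="{RDF}" xmlns="http://purl.org/rss/1.0/">'
        f'<channel rdf:about="http://openwiki.com/"/>{items}</rdf:RDF>'
    )


# Two revisions of one page, identified by version-specific URIs.
per_version = feed(
    "http://openwiki.com/ow.asp?p=SandBox&version=22",
    "http://openwiki.com/ow.asp?p=SandBox&version=23",
)
# Two revisions of one page, both identified by the general page URL.
per_page = feed("http://openwiki.com/?SandBox", "http://openwiki.com/?SandBox")
```

With these inputs, duplicate_abouts(per_version) finds nothing to report, while duplicate_abouts(per_page) reports the repeated page URL. That is the situation in which an author listing several changes to one page has to fall back on version-specific URIs.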


The URL to the author's page

I think the RDF way to do this would be something like

  ...
  <dc:contributor>
    <wiki:author about="http://www.usemod.com/cgi-bin/mb.pl?JeffDairiki">
      <rdf:value>Jeff Dairiki</rdf:value>
      <rss:link>http://www.usemod.com/cgi-bin/mb.pl?JeffDairiki</rss:link>
    </wiki:author>
  </dc:contributor>
  ...

After reading the RDF specs, I think it's:

  ...
  <dc:contributor>
    <rdf:Description about="http://www.usemod.com/cgi-bin/mb.pl?JeffDairiki">
      <rdf:value>Jeff Dairiki</rdf:value>
      <rss:link>http://www.usemod.com/cgi-bin/mb.pl?JeffDairiki</rss:link>
    </rdf:Description>
  </dc:contributor>
  ...

This form, at least in theory, allows non-wiki aware RSS parsers to extract Jeff Dairiki as the value for dc:contributor. In practice, I doubt most current (non-RDF based) RSS parsers would handle this correctly. In RSS 0.91 or so, elements are limited to being either containers or values, but not both [can anyone confirm?].

I think in practice parsers will extract Jeff Dairikihttp://www.usemod.com/cgi-bin/mb.pl?JeffDairiki as the value for dc:contributor. -- LaurensPit

Well, a conforming RSS parser would probably just give up. A naive RSS parser might take the entire text between <dc:contributor>...</dc:contributor> and fail at that point.

The RDF way is to use an xlink:href on <dc:contributor/>. The naive parser would probably still die on this, but it's naive, so it gets what it deserves. HaHaOnlySerious. -- SunirShah

Some other options:

  ...
  <dc:contributor>Jeff Dairiki</dc:contributor>
  <wiki:contributorURL>
    http://www.usemod.com/cgi-bin/mb.pl?JeffDairiki
  </wiki:contributorURL>
  ...
This is simple, and doesn't break old RSS parsers, but is not the RDF way. This form does not express any inherent connection between the URL and the name (except that they are both attached to the same rss:item.)

As always, I like Wiki:DoTheSimplestThingThatCouldPossiblyWork. Therefore the last option appeals most to me. -- LaurensPit

We should verify that there are old RSS parsers that don't parse XML correctly first. Or, we could use [vCard] by extending the <dc:contributor/> with the vCard XML. I'm not really in favour of that, though. -- SunirShah

From the RDF specification, a related example (in theory, xmlns:v == vCard):

<rdf:RDF  
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:s="http://description.org/schema/">
  <rdf:Description about="http://www.w3.org/Home/Lassila">    
      <s:Creator>
         <rdf:Description about="http://www.w3.org/staffId/85740">
           <rdf:type resource="http://description.org/schema/Person"/>
           <v:Name>Ora Lassila</v:Name> 
           <v:Email>lassila@w3.org</v:Email>
         </rdf:Description>    
      </s:Creator>  
   </rdf:Description>
</rdf:RDF>


URL to page diffs

Here's a complicated fancy way:

  ...
  <wiki:button>
    <wiki:name>diff</wiki:name>
    <wiki:link>
      http://www.usemod.com/cgi-bin/mb.pl?diff=1&amp;id=RssExtensionModuleForWikis
    </wiki:link>
  </wiki:button>
  ...

A simple, less flexible way would be:
  ...
  <wiki:diffURL>
   http://www.usemod.com/cgi-bin/mb.pl?diff=1&amp;id=RssExtensionModuleForWikis
  </wiki:diffURL>
  ...

I think it's incorrect to use a moniker like button as that indicates how the link information is displayed. What if the data was presented by a text-to-speech browser? I think Laurens' v0.3 <wiki:diff xlink:href="url">major</wiki:diff> solution is very elegant. -- SunirShah


Was this a minor edit?

(And other flag-like meta-data.)

The two choices I can think of are

  ...
  <wiki:isMinorEdit>yes</wiki:isMinorEdit>
  ...
or
  ...
  <wiki:flags>minor_edit</wiki:flags>
  ...

It's bad XML practice to make "big bucket" fields that will accept some random text. That defeats the purpose of using XML in the first place. XML is meant to structure the data by using a uniform grammar. To introduce such a flags field, you require a separate grammar to parse this. Once again, I think the <wiki:diff/> tag in v0.3 is very elegant. -- SunirShah


I think it would be useful to list all wiki words the page contains within the item. How would one go about that? I am planning to add RSS to SushiWiki. -- DougRansom

Syntax arguments

RichSiteSummary is a subset of the ResourceDescriptionFramework, which in turn is an ExtensibleMarkupLanguage (XML) specification. RDF uses the DublinCore. Further, DaveWiner has muddied the waters with the Aggregator namespace. The many sets of requirements described by these interacting standards must all be met. Listed below are a few concerns that have been raised regarding standards compliance.

First, though, please read http://groups.yahoo.com/group/rss-dev/files/Modules/modules.html.

The use of RDF implies much more than just the DublinCore (see http://www.w3.org/RDF/). Most of the arguments made in this section (below) are flawed in that they make insufficient distinction between RDF and XML. That a construction is valid XML does not imply that it is valid RDF. --JeffDairiki

RDF 19990222 is not XML because at the time XML was not solidified. The intention was always to ensure RDF conforms to XML. Consequently, the grammar in the specification is flawed. For instance, "ID" is not an XML "id" (case sensitive), and all the examples that use the Dublin Core are illegal (also case sensitivity issues).

As there is no official standard then for RDF, the question becomes a pragmatic one of de facto standards. It remains to be demonstrated what is fact and what is not. -- SunirShah


mod_aggregation

This is pretty useless. <ag:source/> == <dc:publisher/>, <ag:timestamp/> ~= <dc:date/>, and <ag:sourceURL/> should be an xlink:href on <dc:publisher/>.

The mod_aggregator properties are meta-meta-data, meant to be set by RSS aggregators. They are data about RSS entries, while most RSS properties give information about some other resource (e.g. a wiki page.) (RSS aggregators merge RSS feeds from several sources.) Ag:source describes who produced the original RSS, dc:publisher is about who produced the thing which the RSS item describes. Ag:timestamp has to do with when the aggregator grokked the original RSS, dc:date (once again) describes the thing which the RSS item describes. Finally, I do not believe that the xlink:href you propose is valid RDF. (See below for more on this.) -- JeffDairiki

I think in general we should avoid using proposed extensions to RSS 1.0, like mod_taxonomy, until RSS 1.0 comes out. Perhaps until then we should hold off on adding dependent attributes like the CategoriesAndTopics. (Boo! I wanted that.) -- SunirShah

XML conformance of legacy parsers

It's not clear how close to the XML specification RSS systems are. Some may really choke and die on real XML. So, the above may have to change. -- SunirShah

As the RSS v1.0 spec is "real XML" I don't think it's a problem. If an RSS system doesn't choke on e.g. <channel rdf:about="http://example.com"> then it most probably won't choke on the above. -- LaurensPit

The issue (and title of this sub-section) should be "RDF conformance" not "XML conformance". --JeffDairiki

Agreed -- LaurensPit

No, we assume they are RDF conforming. This does not imply XML conforming as the XML standard has changed, invalidating the RDF standard. If the parsers are not RDF conforming, we just ignore them. Such is the way of standards. -- SunirShah

What???

xlink:href

The XML standards now require external resource references to conform to the UniformResourceIdentifier standard. They usually prefer they be referenced through [xlinks]. This means using xlink:href instead of a special <!ENTITY/> that has either PCDATA or CDATA as its child. For example, <contributorURL/> is bad, whilst an xlink:href on the <dc:contributor/> is better. Legacy RSS parsers may not appreciate this however. But unless this is shown, we should conform to the XML preferred way as much as reasonable, in my opinion. On the other hand, notice that RDF has the <link/> element, and the resource attribute. While xlink isn't religion, it's the usual method. -- SunirShah

Showstopper: I still do not believe that the proposed usage of xlink:href is valid RDF. You can not attach a property (xlink:href) to another property (<dc:contributor/>). Properties must be attached to objects. Referring to the formal grammar for RDF [1], the <dc:contributor/> element is a propertyElt, while the xlink:href attribute would be a propAttr. PropAttrs are not allowed in propertyElts. (There is an exception to this rule when the propertyElt is empty, but that does not apply in this case.)

If I am right (that xlink:hrefs are not allowed on RDF property elements) then many of the suggestions made on this page (and the extension proposal draft) specify illegal syntax. --JeffDairiki

The RDF grammar is invalid. It doesn't apply any more. This is described above. Remember the history of the standards. RDF was developed before XML was completed.

From section 2.2, "This specification of RDF uses the Extensible Markup Language [XML] encoding as its interchange syntax," and "All syntactic flexibilities of XML are also implicitly included."

Yes, okay, but that is a red herring. The RDF grammar has not changed in any major way. See the working draft "Refactoring the RDF/XML syntax" (which describes not particularly major changes), dated 6 September, 2001 [2]. The syntax regarding propertyElts and propAttrs, in particular, remains unchanged.

RDF is not dead. It is not a precursor to XML. It is a data model, which happens to have an XML syntax for its representation defined as part of its specification. The fact that the authors of the specification anticipated that there would be changes in the XML standard in no way justifies the view that "valid XML implies valid RDF" or that "valid XML supersedes valid RDF". --JeffDairiki

Further, extending the existing tags with new attributes is permitted. If you read the English description of the grammar, this is described somewhat obtusely, but it's there. Also examples in the RDF specification demonstrate this ability:

 <RDF
   xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:dc="http://purl.org/metadata/dublin_core#"
   xmlns:l="http://mycorp.com/schemas/my-schema#">
   <Description about="http://www.webnuts.net/Jan97.html"> 
     <dc:Subject
       rdf:value="020 - Library Science"
       l:Classification="Dewey Decimal Code"/>
   </Description>
 </RDF>

Yes, extending RDF with new properties is permitted. This does not mean that you can add new attributes to arbitrary elements in the XML representation of an RDF model --- you can only add the new properties as attributes where propAttrs are permitted. (If you try to parse the XML in terms of the RDF data model (e.g. as a set of (predicate,subject,object) triples) the reasons for the restrictions will become clear.) The above example is legal --- it makes use of the empty propertyElt case which I referred to above. It is a contraction or abbreviated form which is (believe it or not) exactly equivalent to:

 <RDF
   ...
   <Description about="http://www.webnuts.net/Jan97.html"> 
     <dc:Subject>
       <Description>
         <rdf:value>020 - Library Science</rdf:value>
         <l:Classification>Dewey Decimal Code</l:Classification>
       </Description>
     </dc:Subject>
   </Description>
 </RDF>

The following are also legal contractions of the above:

     ... 
     <dc:Subject>
       <Description l:Classification="Dewey Decimal Code">
         <rdf:value>020 - Library Science</rdf:value>
       </Description>
     </dc:Subject>
     ...
and
     ... 
     <dc:Subject>
       <Description l:Classification="Dewey Decimal Code"
                    rdf:value="020 - Library Science"/>
     </dc:Subject>
     ...

This, however, is not legal RDF syntax (though it is legal XML):

   ...
   <Description about="http://www.webnuts.net/Jan97.html"> 
     <dc:Subject l:Classification="Dewey Decimal Code">
       020 - Library Science
     </dc:Subject>
   </Description>
   ...

To reiterate once again, XML-validity of a construct is a necessary but certainly not a sufficient condition for it being a valid expression of an RDF data model. Please recognize that the top-level element in an RSS file is <rdf:RDF> for a reason. If you are unfamiliar with the set-of-triples representation or DirectedGraph representations of an RDF model, you really need to read up on RDF further before proceeding. --JeffDairiki
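The claim that the abbreviated and expanded forms are "exactly equivalent" can be made concrete by reducing each to its set of (predicate, subject, object) triples. What follows is a rough Python sketch under loud assumptions: it handles only the handful of constructs used in the Dewey Decimal example, it is not a conforming RDF parser, and the function names and the "_:b0" blank-node labels are invented for illustration.

```python
import itertools
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/metadata/dublin_core#"
L = "http://mycorp.com/schemas/my-schema#"
PAGE = "http://www.webnuts.net/Jan97.html"
ABOUT_NAMES = ("about", f"{{{RDF}}}about")


def node_triples(elem, counter, out):
    """Treat elem as a Description node, collect its triples, return its subject."""
    about = elem.get("about") or elem.get(f"{{{RDF}}}about")
    subject = about or f"_:b{next(counter)}"  # anonymous node if no 'about'
    # Attributes on a Description are abbreviated properties (propAttrs).
    for name, value in elem.attrib.items():
        if name not in ABOUT_NAMES:
            out.add((name, subject, value))
    # Child elements are property elements (propertyElts).
    for prop in elem:
        children = list(prop)
        if children:
            # The property's value is a nested node.
            obj = node_triples(children[0], counter, out)
        elif prop.attrib:
            # Empty property element with attributes: shorthand for an
            # anonymous node carrying those attributes as properties.
            obj = f"_:b{next(counter)}"
            for name, value in prop.attrib.items():
                out.add((name, obj, value))
        else:
            obj = (prop.text or "").strip()  # plain literal value
        out.add((prop.tag, subject, obj))
    return subject


def triples(document):
    """Return the set of (predicate, subject, object) triples in the document."""
    out = set()
    counter = itertools.count()
    for description in ET.fromstring(document):
        node_triples(description, counter, out)
    return out


HEAD = (
    f'<RDF xmlns="{RDF}" xmlns:rdf="{RDF}" xmlns:dc="{DC}" xmlns:l="{L}">'
    f'<Description about="{PAGE}">'
)
TAIL = "</Description></RDF>"

ABBREVIATED = HEAD + (
    '<dc:Subject rdf:value="020 - Library Science"'
    ' l:Classification="Dewey Decimal Code"/>'
) + TAIL
EXPANDED = HEAD + (
    "<dc:Subject><Description>"
    "<rdf:value>020 - Library Science</rdf:value>"
    "<l:Classification>Dewey Decimal Code</l:Classification>"
    "</Description></dc:Subject>"
) + TAIL
MIXED = HEAD + (
    '<dc:Subject><Description l:Classification="Dewey Decimal Code">'
    "<rdf:value>020 - Library Science</rdf:value>"
    "</Description></dc:Subject>"
) + TAIL
```

Under these assumptions triples(ABBREVIATED), triples(EXPANDED) and triples(MIXED) all yield the same three triples: the page has a dc:Subject whose value is an anonymous node, and that node has an rdf:value and an l:Classification. The illegal form has no such reading, because an attribute on a non-empty property element has no node to attach to.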

I think I finally understand what you are saying. Sorry about that. ;) Let me summarize. While RDF is theoretically XML (although it's less strict than XML), it has its own data model that constrains the flexibility of the XML. This data model uses the triples, and it expects to see them in a particular way. Consequently, we can't just extend the RDF as we would any other XML as the RDF is not using the normal DocumentObjectModel specification.

After looking at v0.3, I think what we should do is replace xlink:href with rdf:about on <wiki:diff/> and <wiki:changes/>; drop the xlink:href from dc:contributor; and instead use the vCard strategy as the RDF examples show. -- SunirShah

That, unfortunately, still doesn't make it comply to RDF specs ;-) dc:contributor can not get an rdf:about attribute. -- LaurensPit

I'm not particularly worried about our ability to randomly add new attributes to things according to the standard. We should do it and then verify that it works with significant aggregators. -- SunirShah

Thanks Jeff for clearing this up. I learned a lot :) I modified the proposal, but created 4 different versions this time ;). They are v0.41, v0.42, v0.43 and v0.44. Plz find all four at http://openwiki.com/mod_wiki.html.

[v0.41] is the logical modification of v0.3 to make the proposal RDF compliant.

Because that looks complex, I went all the way back to basics, hence [v0.42] (which is almost the same as v0.1).

Because the contributor stuff looks odd (and though RDF compliant, not really the RDF way), I tried to mold it into the RDF way: result is [v0.43].

Because most probably RSS readers/aggregators grab the text value of dc:contributor, using v0.43 would result in Mary McConnell192.168.1.10http://openwiki.com/?MaryMcConnell as the value. To work around this I propose to use an abbreviated form: result is [v0.44].

My preference currently is v0.44. -- LaurensPit
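The difference between the two forms for a reader that simply grabs the text under dc:contributor can be sketched in Python. This is an illustration only: the nested form is a guess at what v0.43 looked like, reconstructed from the concatenated string quoted above, the wiki namespace URI is assumed, and naive_text is a stand-in for whatever a real legacy parser does.

```python
import xml.etree.ElementTree as ET

NAMESPACES = (
    'xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/" '
    'xmlns:rss="http://purl.org/rss/1.0/" '
    'xmlns:wiki="http://purl.org/rss/1.0/modules/wiki/"'  # assumed URI
)

# Properties written as child elements (assumed shape of v0.43).
NESTED = f"""<dc:contributor {NAMESPACES}>
  <rdf:Description>
    <rdf:value>Mary McConnell</rdf:value>
    <wiki:host>192.168.1.10</wiki:host>
    <rss:link>http://openwiki.com/?MaryMcConnell</rss:link>
  </rdf:Description>
</dc:contributor>"""

# Properties abbreviated to attributes, as in v0.44.
ABBREVIATED = f"""<dc:contributor {NAMESPACES}>
  <rdf:Description rss:link="http://openwiki.com/?MaryMcConnell"
                   wiki:host="192.168.1.10">
    <rdf:value>Mary McConnell</rdf:value>
  </rdf:Description>
</dc:contributor>"""


def naive_text(fragment):
    """Concatenate all character data under the element, ignoring markup."""
    element = ET.fromstring(fragment)
    return "".join(chunk.strip() for chunk in element.itertext())
```

Here naive_text(NESTED) runs the name, host and link together into one string, while naive_text(ABBREVIATED) yields just the name, because attribute values are not character data. Whether real legacy parsers behave like this sketch still needs the testing called for elsewhere on this page.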

Very good! Yes, I like v0.44. I particularly like the convoluted abbreviated form for <dc:contributor/> which probably does solve most of the legacy-parser problems. ("We" should probably do some testing to verify this.) I still have a few comments and (somewhat vague) concerns about v0.44. I'll list them on my name page (JeffDairiki) for now to avoid further clutterage here. --JeffDairiki


Rhetorics and Style

Eventually, I think the proposal needs a little more verbiage. It should include enough background and motivational material so that people who have not heard of a WikiWikiWeb are not completely snowed by the proposal. Also more specific guidance for implementors, including:

Beginnings of "expanded verbiage" are included below. -- JeffDairiki

Motivation

UnifiedRecentChanges: something about the desire to use RSS to produce UnifiedRecentChanges listings (which look very much like traditional wiki RecentChanges, e.g. Wiki:RecentChanges, Meatball:RecentChanges, except that they include changes on multiple wikis.)

Also, of course, more traditional uses of RSS.

wiki:interwiki

Optional property of rss:channel. Its value is the InterWiki moniker (an abbreviated name) used to refer to the source wiki.

   <rss:channel rdf:about="http://openwiki.com">
     ...
     <wiki:interwiki>OpenWiki</wiki:interwiki>
   </rss:channel>

The rss:link property may be used to indicate the InterWiki prefix. (The InterWiki prefix, when prepended to a wiki page name, gives the URL to the wiki page.) For best compatibility with non-RDF based parsers, the following syntax is recommended:

   <rss:channel rdf:about="http://openwiki.com">
     ...
     <wiki:interwiki>
       <rdf:Description rss:link="http://openwiki.com/?">
         <rdf:value>OpenWiki</rdf:value>
       </rdf:Description>
     </wiki:interwiki>
   </rss:channel>
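How a consumer might read the moniker and prefix out of the recommended form, and then build a page URL, can be sketched in Python as follows. The wiki namespace URI, the wrapping rdf:RDF element and the function names are assumptions made for this illustration, not something fixed by the text above.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "wiki": "http://purl.org/rss/1.0/modules/wiki/",  # assumed URI
}

CHANNEL = f"""<rdf:RDF xmlns:rdf="{NS['rdf']}"
         xmlns:rss="{NS['rss']}"
         xmlns:wiki="{NS['wiki']}">
  <rss:channel rdf:about="http://openwiki.com">
    <wiki:interwiki>
      <rdf:Description rss:link="http://openwiki.com/?">
        <rdf:value>OpenWiki</rdf:value>
      </rdf:Description>
    </wiki:interwiki>
  </rss:channel>
</rdf:RDF>"""


def interwiki_entry(document):
    """Return (moniker, URL prefix) from the channel's wiki:interwiki property."""
    root = ET.fromstring(document)
    description = root.find("rss:channel/wiki:interwiki/rdf:Description", NS)
    moniker = description.findtext("rdf:value", namespaces=NS).strip()
    prefix = description.get(f"{{{NS['rss']}}}link")
    return moniker, prefix


def page_url(prefix, page_name):
    """Prepend the InterWiki prefix to a page name to get the page URL."""
    return prefix + page_name
```

For the channel above, interwiki_entry(CHANNEL) gives the pair ("OpenWiki", "http://openwiki.com/?"), and page_url applied to that prefix and the page name SandBox gives http://openwiki.com/?SandBox.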

wiki:version

wiki:status

wiki:importance

wiki:diff

wiki:history

wiki:host

Optional property of dc:contributor values. The host (HTTP client) from which the wiki page was edited. See notes on dc:contributor for example usage. (Expand on acceptable formats. Are mangled IP's, e.g. "192.168.1.xxx" okay?)

Item URIs

There must (should?) be a one-to-one mapping between the URIs (used in the rdf:about attributes) of the rss:items and the triples, ( wiki, page, revision ). In particular, this means that the URIs must (should?) be unique to a particular page revision.
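One way to picture the one-to-one mapping is a pair of functions that turn a (wiki, page, revision) triple into an item URI and back. The Python sketch below assumes the query-string layout of the openwiki.com example URI used on this page (p= and version= parameters) and identifies the wiki by its script URL; both choices, and the function names, are illustrative only.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def item_uri(script_url, page, revision):
    """Build a revision-specific item URI from a (wiki, page, revision) triple."""
    return f"{script_url}?{urlencode({'p': page, 'version': revision})}"


def parse_item_uri(uri):
    """Recover the (wiki, page, revision) triple from an item URI."""
    parts = urlsplit(uri)
    query = parse_qs(parts.query)
    script_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return script_url, query["p"][0], int(query["version"][0])
```

For example, item_uri("http://openwiki.com/ow.asp", "SandBox", 23) produces http://openwiki.com/ow.asp?p=SandBox&version=23, and parse_item_uri maps that string back to the original triple. A wiki without versioning has no revision to put in the triple, which is the objection raised in the discussion above.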

Use of dc:date within rss:item

The use of the dc:date property for rss:items is optional but highly encouraged (particularly when mod_wiki is used to list RecentChanges.) If used, its value must be the modification time of the corresponding wiki page. It is recommended that dc:date be specified to minute precision or better. See [W3CDTF] for acceptable formats.
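As a small illustration, a wiki that stores modification times as seconds since January 1, 1970 could produce a W3CDTF value with second precision in UTC along these lines. The function name is hypothetical; the output layout follows the "complete date plus hours, minutes and seconds" profile with the Z time zone designator.

```python
from datetime import datetime, timezone


def w3cdtf(epoch_seconds):
    """Format a Unix timestamp as a W3CDTF date-time in UTC."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


print(w3cdtf(1000000000))  # prints 2001-09-09T01:46:40Z
```

Emitting UTC avoids having to know the server's local offset; a wiki that prefers local time would need to append a +hh:mm or -hh:mm designator instead of Z.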

Use of dc:contributor within rss:item

The dc:contributor property for rss:items is used to indicate the author or editor of a wiki page revision. Its value can be the author's name, or for anonymous authors, the HTTP client hostname, IP number, or some mangling of any of those.

   <rss:item rdf:about="http://openwiki.com/ow.asp?p=SandBox&amp;version=23">
     ...
     <dc:contributor>Joe User</dc:contributor>
   </rss:item>

The rss:link property can be used to indicate a link to a document containing information about the author, and the wiki:host property can be used to indicate the HTTP client from which the author committed the edit. If either of these properties is used, the following syntax is recommended, for best compatibility with older, non-RDF-aware parsers:

   <rss:item rdf:about="http://openwiki.com/ow.asp?p=SandBox&amp;version=23">
     ...
     <dc:contributor>
       <rdf:Description rss:link="http://openwiki.com/?MaryMcConnell"
                        wiki:host="192.168.1.10">
         <rdf:value>Mary McConnell</rdf:value>
       </rdf:Description>
     </dc:contributor>
   </rss:item>



To change

Discussion goes above. Actual things to modify are listed here.

I've modified v0.44 into [v0.5] which has a slightly different syntax for the wiki:interwiki element. The link attribute is the InterWiki URL prefix, the rdf:value element is the InterWiki moniker. Both elements are usually found in the InterMaps. For a discussion about these plz see JeffDairiki's thoughts.

If no one has any further comments/thoughts on what needs to be in mod_wiki, I think we're ready to have your votes. ;) If enough votes for v0.5, this will get raised to v1.0, after which we'll submit it to RSS-DEV (fwiw).

Votes please

[mod_wiki v0.5]

Yea:

  1. LaurensPit laurens@openwiki.com
  2. JeffDairiki dairiki@dairiki.org, with the following comments:

  3. SunirShah -- (Good work, guys!)

Nay:

  1. Not in the current form, what is absolutely missing is at least a short sentence to each introduced tag, and more important, whether it is optional or not. Extracting that info from this page we're on is not an easy task. In a nutshell: the tags are ok, their description sucks. We need "more verbiage" now, not eventually. ;) -- JürgenHermann
Well put (particularly your "an example does not a standard make" summary.) Meatball::ModWiki is a superb place for tutorial, FAQ, and addendum-like information (and the standard should link to it); but a (particular version of a) standard should be immutable (precluding wiki-ness), and, on its own, should provide enough detail so that one can, for example, tell whether a particular usage is conforming or not. By eventually, I meant that: I think now is a fine time to get the RSS-DEV group "into the loop", even if the draft standard is not completely fleshed out. --JeffDairiki

I Abstain:


Discussion
