RSS1 as RDF: advantages over RSS2... part one

Over the past few days I decided to get to the bottom of this question: intuitively I know that RSS1 and RSS1.1, both RDF-based, offer better handling of feeds…
And, on top of that, they are fully compatible with the Semantic Web…

The problem is that most people don't get this: partly because of marketing and the naming of the various versions, which I had already mentioned, and partly because of misunderstandings about how usable RDF really is…

The point isn't entirely clear to me either, at least not completely: so I started digging through the archive of a blogger who has written a lot on these matters, and I will offer a summary of it…
-> Danny Ayers, a Semantic Web enthusiast

Let's start from far back: 2003

Yes indeed, I dug through the archive quite a bit and unearthed a post from 2003 that pointed to a couple of interesting articles published on XML.com:
-> Why Choose RSS 1.0?
-> The Social Meaning of RDF

The first link was a revelation: it sounds crazy, but to understand what is happening today it helps to know that somebody in 2003 had already written, with chilling clarity, how RSS1 and the then relatively new world of RDF were a MISUNDERSTOOD milestone for providing a naturally extensible framework…

The fact that Dave Winer and RSS2.0 are so famous today, and that RSS2 is considered superior to the other formats, is the result of a marketing strategy that won out over the better pre-existing technologies, whose backers focused their efforts not on building consensus but on developing the technology.

Unfortunately, in the history of computing and in the market in general, this is a harsh law: whoever takes and dominates the market almost never represents the best technology available, quite the opposite I would say…

The principal suspect is surely RDF which is perceived to be somehow “difficult”.
Although built on a simple triples data model there is no fixed XML serialization since abbreviate XML syntaxes are supported, and it is thus not easy to capture this neatly in an XML schema language. And further, RDF makes liberal use of XML Namespaces.
But if RDF is to manage multiple schemas then it manifestly needs to be able to label elements according to their respective schemas.
And there is the widely held belief that native XML markup is somehow intuitive – and by implication good enough – and that the additional baggage of any common relational data model is just so much further complexity.
The problem with this view is that it doesn’t scale.
At the time of writing it should be noted that there is a further effort to define yet another new RDF-free specification of RSS with the principal aim of syndicating blogs. Formerly known as “Project Echo”, this has now been tentatively renamed “The Atom Project” [3]. [The curse of RSS nomenclature seemingly continues – I just checked out the wiki before sending this text off and see that it’s now being labeled “The (Not)Atom Project”.] The Atom Project is essentially a reworking of RSS 2.0, and as such it remains a much more focused technology than RSS 1.0.
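
To make the "simple triples data model" from the quote a bit more concrete, here is a minimal sketch of my own (not taken from the article): a feed item written as plain (subject, predicate, object) tuples. The item URI and the author name are made up for illustration; the two namespace URIs are the ones used by RSS 1.0 and Dublin Core.

```python
# A feed item expressed as (subject, predicate, object) triples.
# The item URI and its values are hypothetical, chosen only for illustration.
RSS = "http://purl.org/rss/1.0/"
DC = "http://purl.org/dc/elements/1.1/"

item = "http://example.org/post/1"  # hypothetical resource URI

triples = {
    (item, RSS + "title", "Why Choose RSS 1.0?"),
    (item, RSS + "link", item),
    (item, DC + "creator", "Jane Doe"),  # hypothetical author
}


def objects(graph, subject, predicate):
    """Return every object stated for a given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


titles = objects(triples, item, RSS + "title")
creators = objects(triples, item, DC + "creator")
print(titles, creators)
```

Every statement has the same shape whatever vocabulary it comes from, which is why mixing schemas requires labelling each property with its namespace.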

To better understand the role of XML namespaces, it is worth quoting this part:

The use of XML Namespaces can help to resolve element naming conflicts in XML documents but cannot of itself resolve any semantic interpretation that may be placed upon the use of a particular schema.
Inserting arbitrary namespaced elements into an RSS document does not necessarily help either a human or a machine understand the purpose of the element or the meaning of its value. Further, there may or may not be a schema specification located under the XML Namespace URI, but even if there were, it might not help the human or machine to interpret the context within which an RSS element is found.
By adopting a public data model such as RDF all these ambiguities vanish.
In the RDF data model the context supplies the meaning.
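
A small sketch of what the quote is getting at, using Python's standard `xml.etree.ElementTree`. The feed and the `ex:rating` extension element are invented for the example: the namespace gives the element a unique name, but nothing in the document says what a "rating" is or what it applies to.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document with one invented namespaced extension element.
doc = """<rss version="2.0" xmlns:ex="http://example.org/ns#">
  <channel>
    <title>Demo</title>
    <item>
      <title>Hello</title>
      <ex:rating>5</ex:rating>
    </item>
  </channel>
</rss>"""

root = ET.fromstring(doc)

# The parser resolves the prefix to a globally unique name...
rating = root.find("./channel/item/{http://example.org/ns#}rating")
print(rating.tag, rating.text)

# ...but that is all it knows: no naming clash, yet no hint about whether
# the value rates the item, the author or the feed, or on what scale.
```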

I won't dwell on the second link for now: it introduces issues closely tied to the semantics and context present in RDF, which are worth pointing out but are off topic with respect to what I want to understand along this path…

As for the namespaces question, it is interesting to look at:
-> One Namespace to Unite Them

The key technical aspect was on whether it was better to use (namespace-based) modularisation, or put everything in a single (no-namespace) XML vocabulary (There’s a nice snapshot of discussions here).
The pro-namespace approach had the added advantage of being able to use RDF, the no-namespace approach made for a simpler syntax, without “esoteric labels”.
Having everything available in one place was also cited as another advantage for the kitchen sink approach.

To understand how much it is worth being RDF-compatible from the start, this post can be quoted:
-> RSS-Data alternative demo

“What we need is a simple data model that can expand the use of RSS into application arenas, enabling applications to output RSS with object data, and clients and other applications to easily and predictably include that data. In other words, RSS needs a schema, but it’s not XML Schema.
…
RSS-Data would require no changes or revisions to RSS 2.0, though developers wishing to support RSS-Data would obviously need to write RSS parsers that recognized and deserialized RSS-data in the namespace. But, rather than writing custom parsers for every new namespace extension to RSS, developers could confidently work with just one RSS/Data parser that handled 99% of their application meta-data needs.”
The simple data model is RDF, an appropriate schema is RDF Schema, no custom parsers are needed because generic RDF/XML parsers can handle this.
Finally, let me just emphasise that this approach doesn’t need anything new like RSS-Data. It’s all there in RDF and RSS 1.0.
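
To get a feel for the "no custom parsers are needed" claim, here is a deliberately simplified sketch of mine. It is NOT a real RDF/XML parser: it only handles described resources carrying `rdf:about` whose properties are flat literals or `rdf:resource` references. The `ex:rating` extension and the item URI are invented. The point is only that the same generic loop picks up properties from any namespace without extension-specific code.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical RSS 1.0 fragment with one invented extension property.
doc = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:ex="http://example.org/ns#">
  <item rdf:about="http://example.org/post/1">
    <title>Hello</title>
    <ex:rating>5</ex:rating>
  </item>
</rdf:RDF>"""


def flat_triples(xml_text):
    """Very reduced reading of RDF/XML: flat properties of rdf:about nodes."""
    triples = []
    for node in ET.fromstring(xml_text):
        subject = node.get("{%s}about" % RDF)
        if subject is None:
            continue  # this sketch ignores blank nodes
        for prop in node:
            # ElementTree names look like "{namespace}local"; join the two.
            predicate = prop.tag[1:].replace("}", "", 1)
            obj = prop.get("{%s}resource" % RDF) or (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


result = flat_triples(doc)
for triple in result:
    print(triple)
```

A real application would rely on a complete RDF/XML parser instead; this is only meant to show why the extension element needs no special handling.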

As for what was meant by RSS-Data, here is the post in question…
-> RSS-Data

A first wrap-up…

To close this first round of reflection, this post is an excellent reference:
-> “We want BOTH RSS 1.0 & RSS 2.0!”

Here the question under discussion is this:

talking about using data structures via RSS 2.0 + RSS-Data AND
RSS 1.0 + RDF

there needs to be some sort of common understanding to do useful things with the data - a schema is one approach, some receptive code is another.
RDF has a model that can make some sense of arbitrary data in an RSS 1.0 feed (though having an RDF schema magnifies this n-fold), Roger’s parser demonstrates that some sense can be made of XML-RPC data in an RSS 2.0 feed, but it should be noted that there is no defined way of using this information alongside other available information.
Only humans can provide context for the names of the fields, the strings in the arrays or whatever.
In the RDF model the descriptions can be merged and processed alongside any other information that refers to the same resources.
I am beginning to drift into the ‘why RDF is cool’ discussion here, but there is a point.
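
The merging mentioned in the quote can be sketched like this, under the assumption that descriptions are held as sets of triples. All URIs and values below are hypothetical, except the RSS 1.0 and Dublin Core namespaces. Since both sources refer to the same resource URI, combining them is just a set union.

```python
RSS = "http://purl.org/rss/1.0/"
DC = "http://purl.org/dc/elements/1.1/"
EX = "http://example.org/ns#"  # hypothetical vocabulary

post = "http://example.org/post/1"  # hypothetical resource

# Statements coming from the feed itself.
feed_triples = {
    (post, RSS + "title", "Hello"),
    (post, DC + "creator", "Jane Doe"),
}

# Statements about the same resource coming from somewhere else.
other_triples = {
    (post, EX + "rating", "5"),
    (post, DC + "subject", "RDF"),
}

# Merging two descriptions is a plain union of their triples.
merged = feed_triples | other_triples


def describe(graph, subject):
    """Collect everything known about one resource (one value per property)."""
    return {p: o for s, p, o in sorted(graph) if s == subject}


description = describe(merged, post)
for predicate, value in description.items():
    print(predicate, "->", value)
```

With struct-like data embedded in an RSS 2.0 feed there is no shared identifier or model to join on, which is the "no defined way of using this information alongside other available information" part of the quote.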

Beyond this question, as far as the RSS2.0 factor is concerned, this passage is noteworthy:

RSS 2.0 itself is defined essentially as an application for pumping newslike information.
Nothing else.
You can extend it using namespaces, but there is no standard way of doing so, so every new extension module exists as a parallel pipe.
The RSS-Data approach looks to me like an alternate way of creating a single-application-specific pipe.
But I honestly can’t think why you would want to reduce your application model down to structs and so on, when you could define a data model that maps better to the application domain and put it in a namespace.

The last part, in bold, essentially identifies the correct use of namespaces from an RDF-centric point of view…

work in progress


Matteo Brunati