Some more reflections on tags

The question is getting complicated, and there must be a reason for that, right?

As I see it, tags are right now both the symptom of, and the user community's answer to, a need that is by now taken for granted: filtering the insane mass of information that is constantly thrown at us.
On one side there are the experts of the Semantic Web and of the branches of Knowledge Systems and ontologies, who keep insisting on formalism. On the other there is the community, which, with the knowledge it has, builds something intrinsically imperfect but usable, unlike the tools created by the former.

The compromise of imperfection is accepted anyway, because of the benefits the new tools bring, but nobody looks beyond the end of their own nose: that is the real limit.

So the real challenge today is to find a compromise between these two approaches while keeping, in any case, very strong compatibility with the academic side, so to speak, so as not to rule out future potential.

The closest thing to all this, I believe, is the well-known Microformats (in XML, incidentally). The important thing, though, is to keep this premise in mind, which unfortunately does not seem very clear to most people.

Different points of view

I have collected different points of view, covering different themes that are not easy to discuss, but I will try to make a few observations.

Since the topic is very open, it is clearly impossible to cover all the interesting material, but I recommend reading these posts to get an idea, and then, if you like, adding what you think. There is plenty to discuss, I assure you.

Technorati tags and semantics…

The most critical piece, in my opinion, and the one that struck me most, is this:

-> Swoogle and Technorati Tags

The post came out in the first days after the launch of Technorati's new tags and highlights their flaws, with some particularly illuminating notes:

1. Instead of being true meta-data, they are specific to technorati. If another service wanted to provide the same level of functionality, they’d either have to use the technorati tag format, or get the existing users to add/replace their current system.
2. Rather than just parsing existing data, the t.t. system requires users to alter the data itself. This means that in the future, all changes would need to be made to the data to ensure proper tagging. It also means all the data in the past would need to be altered. Imagine if google required every page that wished to be indexed to add a meta-tag saying “index me google” to them. Obviously, few would comply.
3. Along with #2, this system seems like a simple rehash of the meta-tag system, which was abused to death by many. The tag system could just as easily be a meta-tag system.
4. Anchor tags? Anchor tags? The use of anchor tags to introduce meta-data seems to be a bad reading of what meta-data is actually for. The rel attribute, for instance, is supposed to describe what an anchor link is linking to. This doesn’t seem to be doing that.
5. The tags themselves seem to be yet another attempt to alter existing meta-data systems without any hard work.

The criticism that struck me most is the one about the form of the Technorati tag. For those who do not know it, it is this:

<a rel="tag" href="">
Indeed, the semantic meaning of the HTML a tag has been tampered with here, or rather, some things are taken as implicit.
The rel attribute of the anchor (the HTML a tag stands for anchor) is in fact meant to say something about the linked element, namely how it relates to the current document.

In practice, however, the link I attach is in most cases the Technorati tag page, which takes me to an AGGREGATION of resources related to that tag, or else to a resource closely tied to the tag itself, though following the same syntax.

You do not have to link to Technorati. You can link to any URL that ends in something conforming to the tag standard.
This, in fact, is what Technorati states.
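To make the convention concrete, here is a minimal Python sketch of how a consumer might read such links. It assumes, based on the Technorati statement quoted above, that the tag name is simply the last path segment of the href. The class name `RelTagParser`, the function `extract_tags`, and the sample URLs are illustrative inventions, not part of any Technorati API.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, unquote


class RelTagParser(HTMLParser):
    """Collects (tag, href) pairs from <a rel="tag" href="..."> links."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, e.g. rel="tag nofollow"
        rel_values = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href") or ""
        if "tag" not in rel_values or not href:
            return
        # Assumption: the tag name is the last non-empty path segment of the URL
        segments = [s for s in urlparse(href).path.split("/") if s]
        if segments:
            self.tags.append((unquote(segments[-1]), href))


def extract_tags(html):
    """Return every (tag, href) pair found in an HTML fragment."""
    parser = RelTagParser()
    parser.feed(html)
    return parser.tags


# Hypothetical post fragment: one Technorati link, one self-hosted tag page
sample = (
    '<p>Filed under '
    '<a rel="tag" href="http://technorati.com/tag/semanticweb">semanticweb</a> and '
    '<a rel="tag" href="http://example.org/topics/rdf/">RDF notes</a>, '
    'see also <a href="http://example.org/about">about</a>.</p>'
)
print(extract_tags(sample))
```

Note that the parser cannot tell whether the target is an aggregation page or a page describing the tag: only the URL's last segment carries the tag, which is exactly the ambiguity discussed here.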

Incidentally, this is also what emerges from a criticism I posted a few days ago: the fact that I am actually packing several meanings into a single construct.

The result is general confusion about how we are supposed to use it: should it point to an aggregator, or to a resource that explains the tag itself?

So the use of the rel attribute is not entirely wrong, but it is not clear either. I would be less critical on this point than the author.

The second part of the post, instead, explains a sounder direction for development:

The semantic web is not here today.
But systems like technorati tags don’t help matters, because they don’t create machine readable data that is open and not tied to one source or enterprise.
Instead, the tag system just helps people share data with technorati, not with each other. Grr.
On the other hand, a technology that will help build the semantic web has just been enhanced, namely, SPARQL.
SPARQL isn’t tied to one group or body. It lets machines search rdf data. What could be better for the growth of the semantic web than that? For semantic web searching, everyone should go check out swoogle.

Then comes the very debatable point, with which I do not entirely agree:

Obviously, not everyone agrees. Some people seem to think that ontologies that are definited by smaller bodies will “evolve” into a larger more useful ontology.
These user-created “folksonomies” are therefore a stepping stone to a more useful ontology. I just think this is the wrong approach: we should be working on building information that can be evaluated independently of any particular author. If I write a web piece that I claim is about network technology but seems to be a bildungsroman in actuality, I would hope that
1. The human reading it would “get” it.
2. The machine reading it, if programmed properly, would also “get it”.
We’re not even at the first state, where humans can agree on what certain works “mean”. So why don’t we just skip to step #2 instead of doing more arguing about #1? Or, to put it another way, if the current search engine algorithms ignore the meta-tags I insert into my description field because they tend to be easily skewed (by nefarious porn providers, no doubt!), why should I keep inserting that information?
Instead of requiring humans to constantly classify our words into different taxonomies, can’t we just get a machine to automate it? The less humans in the equation of the semantic web, the better, in my mind.

Now, this is his point of view. The catch is that I have not yet tried it in practice, but in theory I do not entirely agree.

We cannot expect to create ontologies that are valid for everyone, because the ambiguity of a great many terms is evident and depends on too many exquisitely human factors that we cannot delegate to machines. IMHO this is a fact.

What we can do, however, is define our own taxonomy according to the rules of the Semantic Web, since the content of our site, or of whatever we are writing, and its contexts are clear to us (or should be :) ).

That way, any external agent that picks up our data can understand its context based not on its own ontology but on the ontology attached to the content itself.

This, I believe, is also the reason the W3C's SKOS group exists: to give everyone a standard, semantic way to define a taxonomy.

That is how I understand it. The catch is that I still have to put it into practice. We shall see.
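As a rough illustration of what "our own taxonomy" could look like, the sketch below serializes a tiny personal tag hierarchy as SKOS concepts in Turtle. The SKOS namespace and the terms skos:Concept, skos:prefLabel and skos:broader come from the SKOS vocabulary; the base URI, the tag names and the `to_turtle` helper are assumptions made up for this example.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/tags/"  # hypothetical base URI for our own tags

# A personal taxonomy: each tag has a label and, optionally, a broader tag
taxonomy = {
    "web": {"label": "Web", "broader": None},
    "semanticweb": {"label": "Semantic Web", "broader": "web"},
    "rdf": {"label": "RDF", "broader": "semanticweb"},
}


def to_turtle(concepts, base=BASE):
    """Serialize the taxonomy as SKOS concepts in Turtle syntax."""
    lines = ["@prefix skos: <" + SKOS_NS + "> .", ""]
    for slug, info in concepts.items():
        label = info["label"]
        broader = info.get("broader")
        statements = ["a skos:Concept", 'skos:prefLabel "' + label + '"@en']
        if broader:
            statements.append("skos:broader <" + base + broader + ">")
        lines.append("<" + base + slug + "> " + " ;\n    ".join(statements) + " .")
    return "\n".join(lines)


turtle = to_turtle(taxonomy)
print(turtle)
```

An external agent that fetches this file learns that, on this site, "rdf" is narrower than "semanticweb", without having to impose an ontology of its own.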

RSS and a meta-feed

This post, on the other hand, offers some interesting ideas on how to bring these worlds together, tags and RSS, which is something I am trying to do myself.

Here are the interesting points:

Tag, You’re Categorized
Realizing the shortcomings of keyword-based searching, many people are embracing the concept of “tagging”.
Tagging is simply adding some basic metadata to an RSS post, usually just a simple keyword “tag”.
For example, the RSS feed on this site effectively “tags” my posts by using the dc:subject property from the RSS standard.
Using such keywords, feed aggregators (such as Technorati, PubSub and Del.icio.us ) can sort posts into different categories and subscribers can then subscribe to these categorized RSS feeds, instead of the “raw” feeds from the sites themselves.
Alternatively, RSS readers can sort the posts into user-created folders based on tags (although mine doesn’t offer this feature yet).

The relevant point is that categories and tags are currently handled by default in RSS through the dc:subject property, and that on this basis feed aggregators can offer views personalized for the user.

If you use your own taxonomy, however, I would lean towards the RSS 1.0 module dedicated specifically to taxonomies, so as to handle the tag itself as well as possible.
That said, it is not clear here whether a dc:subject can be used as the URI for a tag or whether the taxo module is needed.

The author then explains the usual problem with tags, the lack of a common taxonomy, but the interesting part is his vision of how to solve it:
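To show how an aggregator could build such views from dc:subject, here is a small Python sketch that groups the items of an RSS 1.0 feed by subject. The namespaces are the standard RDF, RSS 1.0 and Dublin Core ones; the feed content, the URLs and the `group_by_subject` function are invented for the example, and plain-keyword subjects (rather than URIs or the taxo module) are assumed.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical RSS 1.0 feed whose items are tagged with dc:subject
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="http://example.org/posts/1">
    <title>Notes on SKOS</title>
    <link>http://example.org/posts/1</link>
    <dc:subject>semanticweb</dc:subject>
    <dc:subject>tags</dc:subject>
  </item>
  <item rdf:about="http://example.org/posts/2">
    <title>Feed overload</title>
    <link>http://example.org/posts/2</link>
    <dc:subject>rss</dc:subject>
    <dc:subject>tags</dc:subject>
  </item>
</rdf:RDF>
"""


def group_by_subject(feed_xml):
    """Map each dc:subject value to the titles of the items carrying it."""
    root = ET.fromstring(feed_xml)
    groups = defaultdict(list)
    for item in root.findall("rss:item", NS):
        title = item.findtext("rss:title", default="", namespaces=NS)
        for subject in item.findall("dc:subject", NS):
            if subject.text:
                groups[subject.text.strip()].append(title)
    return dict(groups)


views = group_by_subject(FEED)
print(views)
```

Each key of `views` could then back a per-tag feed or a reader folder, as the quoted post describes.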

Meta-Feeds
While tagging may be doomed to confusion, there are some other potential approaches that promise to bring order to RSS’s increasingly chaotic situation.
The most promising approach involves something called a Meta-feed.
Meta-feeds are RSS feeds comprised solely of metadata about other feeds.
Combining meta-feeds with the original source feeds enables RSS readers to display consistently categorized posts within rich and logically consistent taxonomies.

A remarkable idea, isn't it? But how would it be done?

The process of creating a meta-data feed looks a lot like that needed to create a search index.
First, crawlers must scour RSS feeds for new posts. Once they have located new posts, the posts are categorized and placed into a taxonomy using advanced statistical processes such as Bayesian analysis and natural language processing.
This metadata is then appended to the URL of the original post and put into its own RSS meta-feed.
In addition to the categorization data, the meta-feed can also contain taxonomy information, as well as information about such things as exact/near duplicates and related posts.

So in this case advanced techniques based on natural language processing and Bayesian rules are used to catalogue all posts into this common taxonomy.

I am skeptical about the feasibility of this, for the same reasons I gave about the other post, but the idea of combining it with a meta-feed seems interesting.
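The categorization step can be sketched with a toy naive Bayes classifier. This is only an illustration of the general technique under heavy simplifications (four hand-written training snippets, two made-up categories, no crawling); it is not the pipeline the quoted author has in mind, and all names and URLs below are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> training posts
        self.vocabulary = set()

    def train(self, text, category):
        words = tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, -math.inf
        for category, docs in self.doc_counts.items():
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(docs / total_docs)
            denominator = sum(self.word_counts[category].values()) + len(self.vocabulary)
            for word in words:
                score += math.log((self.word_counts[category][word] + 1) / denominator)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


def build_meta_feed(classifier, posts):
    """Return metadata-only entries: the post URL plus its assigned category."""
    return [{"about": url, "category": classifier.classify(text)} for url, text in posts]


classifier = NaiveBayes()
classifier.train("RDF ontologies and SPARQL queries for the semantic web", "semantic-web")
classifier.train("OWL ontology reasoning over RDF triples", "semantic-web")
classifier.train("RSS feed readers and aggregators for weblogs", "syndication")
classifier.train("subscribe to an Atom or RSS feed in your aggregator", "syndication")

new_posts = [
    ("http://example.org/posts/10", "A SPARQL endpoint for RDF data"),
    ("http://example.org/posts/11", "New RSS aggregator handles every feed"),
]
meta_feed = build_meta_feed(classifier, new_posts)
for entry in meta_feed:
    print(entry)
```

The resulting entries contain no post content, only the URL and its category, which is the sense in which a meta-feed is made solely of metadata about other feeds.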

Users will also be able to create their own custom taxonomies and category names (as long they relate them back to the meta-feed).
Users can even combine meta-feeds from two different feeds so long as one of the meta-feed publishers creates an RDF file that relates the two categories and taxonomies (to the extent practical). Of course the biggest benefit to users will be that information is consistently sorted and grouped into meaningful categories allowing them greatly reduce the amount of “noise” created by duplicate and non-relevant posts.
At a higher level, the existence of multiple meta-feeds, each with its own distinct taxonomy and categories, will in essence create multiple “views” of the web that are not predicated on any single person’s semantic orientation (as is the case with tagging).

Apparently I must have misunderstood the previous part: fortunately this author, too, envisions users creating their own RDF-based taxonomy. Good :)

Indeed, the last part of the piece attacks Technorati's closed vision as opposed to the openness of semantic technologies, and questions its longevity and, above all, the lack of a common standard, which, I would add, is in fact the Semantic Web.
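Here is a minimal sketch of what combining two meta-feeds might involve. A plain Python dictionary stands in for the RDF file that relates the two taxonomies, and the feeds, categories and URLs are all invented. Categories with no mapping are kept under their own name, which is one possible reading of "to the extent practical".

```python
from collections import defaultdict

# Two hypothetical meta-feeds: post URL -> category in that publisher's taxonomy
meta_feed_a = {
    "http://example.org/posts/1": "semantic-web",
    "http://example.org/posts/2": "syndication",
}
meta_feed_b = {
    "http://example.org/posts/3": "rdf-and-ontologies",
    "http://example.org/posts/4": "feeds",
    "http://example.org/posts/5": "photography",
}

# Stand-in for the RDF file relating B's categories to A's
b_to_a = {"rdf-and-ontologies": "semantic-web", "feeds": "syndication"}


def merge_meta_feeds(primary, secondary, mapping):
    """Build one category -> URLs view, translating secondary categories."""
    view = defaultdict(set)
    for url, category in primary.items():
        view[category].add(url)
    for url, category in secondary.items():
        # Unmapped categories keep their original name
        view[mapping.get(category, category)].add(url)
    return {category: sorted(urls) for category, urls in view.items()}


combined = merge_meta_feeds(meta_feed_a, meta_feed_b, b_to_a)
for category, urls in combined.items():
    print(category, urls)
```

Swapping in a different mapping yields a different "view" of the same posts, which matches the point about multiple views not tied to a single person's semantic orientation.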

While it may be possible for services like Pubsub and Technorati to put together their own proprietary end-to-end implementations of meta-feeds, in order for such feeds to become truly accepted, standards will have to be developed that incorporate meta-feeds into readers and allow for interoperability between meta-feeds.
If RSS fails to address “Feed Overload Syndrome”, it will admittedly not be the end of the world.
RSS will still provide a useful, albeit highly limited, “alert” service for new content at a limited number of sites. However for RSS to reach its potential of dramatically expanding the scope, scale, and richness of individuals’ (and computers’) interaction with the web, innovations such as meta-feeds are desperately needed in order to create a truly scaleable foundation.

OK, this post is very long, but I hope I have raised doubts and points of view worth considering, to better understand what can be done and which directions to take. Never before has the Web been in such ferment and at such a crossroads: understanding which compromises to make, ones that will also scale over time, is the key to the second-generation Web.

References:

-> Ancora sui tags e su SKOS…
-> RSS1,taxonomies e tags
-> Tags,Microformats e RDF
-> The future will be tagified
-> Saving RSS: Why Meta-feeds will triumph over Tags
-> Swoogle and Technorati Tags
-> Folksonomies succeed where the Semantic Web fails
-> Technorati tags

Matteo Brunati