[Go-essp-tech] replicas and search

Bryan Lawrence bryan.lawrence at stfc.ac.uk
Wed Dec 2 10:29:21 MST 2009


On Wednesday 02 December 2009 17:05:37 Luca Cinquini wrote:
> Hi Don,
> 	from Bryan's use case, it seemed to me that he already knew which  
> gateway to consult...

No ... it didn't occur to me that the tracking-id wouldn't be distributed with the rdf metadata ...

> But I do agree that we will not be able to perform file-searches  
> across the federation based on the RDF metadata, it has been long  
> known that we don't want to store files inside the triple stores  
> because of performance issues.

what sort of performance issues?

> Obviously, if this use case is really   
> important, each gateway could expose a web service to perform this  
> kind of queries in a distributed fashion.

indeed, and possibly a web service (an atom feed would be easy and appropriate) exposing (tracking-id, metadata) tuples ... allowing someone (us if necessary) to harvest these results ...  (I think distributed search is not the way to go) .... we could build such a feed if we had access to your db schema ...
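
A minimal sketch of what such a feed and its harvester might look like, using only the Python standard library. Everything named here is an assumption for illustration: the function names build_feed and harvest, the choice to carry the tracking-id in each entry's atom:id as a urn:uuid, the dataset identifier in atom:title, the link to the metadata record, and the example.org URLs. None of it reflects the actual gateway db schema.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ID_PREFIX = "urn:uuid:"  # assumption: tracking-ids are UUID-like strings
ET.register_namespace("", ATOM_NS)


def _tag(name):
    """Return the namespace-qualified Atom tag for `name`."""
    return f"{{{ATOM_NS}}}{name}"


def build_feed(gateway_url, updated, records):
    """Serialize (tracking_id, dataset_id, metadata_url) tuples as an Atom feed.

    Hypothetical layout: one atom:entry per file, with the tracking-id in
    atom:id and a link pointing at the metadata record on the gateway.
    """
    feed = ET.Element(_tag("feed"))
    ET.SubElement(feed, _tag("id")).text = gateway_url
    ET.SubElement(feed, _tag("title")).text = "tracking-id to metadata"
    ET.SubElement(feed, _tag("updated")).text = updated
    for tracking_id, dataset_id, metadata_url in records:
        entry = ET.SubElement(feed, _tag("entry"))
        ET.SubElement(entry, _tag("id")).text = ID_PREFIX + tracking_id
        ET.SubElement(entry, _tag("title")).text = dataset_id
        ET.SubElement(entry, _tag("updated")).text = updated
        ET.SubElement(entry, _tag("link"), href=metadata_url, rel="alternate")
    return ET.tostring(feed, encoding="unicode")


def harvest(feed_xml):
    """Parse a feed produced by build_feed into {tracking_id: metadata dict}."""
    index = {}
    root = ET.fromstring(feed_xml)
    for entry in root.findall(_tag("entry")):
        entry_id = entry.findtext(_tag("id"))
        if entry_id.startswith(ID_PREFIX):
            entry_id = entry_id[len(ID_PREFIX):]
        index[entry_id] = {
            "dataset": entry.findtext(_tag("title")),
            "metadata_url": entry.find(_tag("link")).get("href"),
        }
    return index


if __name__ == "__main__":
    # Made-up record purely for illustration.
    xml_text = build_feed(
        "http://gateway.example.org/feed",
        "2009-12-02T17:00:00Z",
        [("0a1b2c3d-0000-4000-8000-000000000001",
          "example.dataset.v1",
          "http://gateway.example.org/metadata/example.dataset.v1")],
    )
    print(harvest(xml_text))
```

A harvester could poll one such feed per gateway and merge the resulting dictionaries into a single local index, so that a tracking-id lookup never needs a distributed query.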

I feel pretty strongly that the tracking-id to metadata search is going to be an incredibly important component of our catalog ecosystem ... folks quickly lose track of what data was what, and need to be able to find out a posteriori.

I should say there are some consequential issues for what we do with provenance (attribute preservation) when using subsetting tools (and indeed any postprocessing).

Cheers
Bryan






> thanks, Luca
> 
> 
> On Dec 2, 2009, at 9:47 AM, Don Middleton wrote:
> 
> > A follow-on to the call, as I had to leave a bit early. We were  
> > discussing being able to use the tracking id for various files. If  
> > the tracking id is not in the federated metadata, how does one know  
> > which gateway to consult for information?
> >
> > don
> >
> >
> > On Dec 1, 2009, at 2:30 PM, Luca Cinquini wrote:
> >
> >> Hi,
> >> 	as I mentioned today at the telecon, I think the RDF query  
> >> services should be able to handle the concept of dataset replicas  
> >> quite nicely. The main concepts are that the RDF records generated  
> >> from replicas must have different RDF identifiers, so they can be  
> >> exchanged independently among gateways, and must reference their  
> >> original RDF record.
> >>
> >> If you are interested, I've documented some of the details here:
> >>
> >> https://wiki.ucar.edu/display/esgcet/Metadata+Search+and+Replicas
> >>
> >> I'm also including a snapshot of an ESG Gateway that shows multiple  
> >> replicas returned as part of a DRS search (note that how the  
> >> replicas are presented in the result page can be easily changed -  
> >> the picture is just to show that the query can handle replicas).
> >>
> >> thanks, Luca
> >>
> >> <DRS with Replicas.tiff>
> >> _______________________________________________
> >> GO-ESSP-TECH mailing list
> >> GO-ESSP-TECH at ucar.edu
> >> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
> >
> 



-- 
Bryan Lawrence
Director of Environmental Archival and Associated Research
(NCAS/British Atmospheric Data Centre and NCEO/NERC NEODC)
STFC, Rutherford Appleton Laboratory
Phone +44 1235 445012; Fax ... 5848; 
Web: home.badc.rl.ac.uk/lawrence
