[Go-essp-tech] [esg-gateway-dev] Metadata API

Nathan Wilhelmi wilhelmi at ucar.edu
Thu May 5 09:53:24 MDT 2011


Hi Estani,

I think you should be able to get this from the hessian API now. Let me 
get back to you after I do a little verification. Personally I would 
like to see the hessian service deprecated, so I wouldn't recommend 
using it.

What we are working on now I think would solve your problem. We are 
changing the dataset view mechanism in the gateway to be REST services. 
We are looking at implementing the following URL patterns:

.../dataset/[thredds-identifier].[content-type]
.../dataset/[thredds-identifier]/version/[version-identifier].[content-type]

.../dataset/[thredds-identifier]/files.[content-type]
.../dataset/[thredds-identifier]/version/[version-identifier]/files.[content-type]

To start we are going to use file extensions rather than content type. 
However if there is a desire to use content types we can go that route 
as well.
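
As a rough illustration of the URL patterns above, here is a minimal Python
sketch of how a client might assemble them. The function name, the base URL
(gateway.example.org) and the choice of the THREDDS dataset ID as the
[thredds-identifier] are assumptions made for the example, not part of the
gateway.

```python
from typing import Optional
from urllib.parse import quote


def dataset_url(base: str, thredds_id: str, ext: str = "xml",
                version: Optional[str] = None, files: bool = False) -> str:
    """Build a URL following the proposed dataset patterns.

    `base` is a hypothetical gateway root; `ext` is the file extension
    that selects the view (html, xml, thredds, ...).
    """
    parts = [base.rstrip("/"), "dataset", quote(thredds_id, safe="")]
    if version is not None:
        # .../dataset/[thredds-identifier]/version/[version-identifier]
        parts += ["version", quote(version, safe="")]
    if files:
        # .../files lists the files of the dataset (or dataset version)
        parts.append("files")
    return "/".join(parts) + "." + ext.lstrip(".")


# Hypothetical host; the dataset ID is the one quoted further down the thread.
print(dataset_url("http://gateway.example.org",
                  "obs4cmip5.NASA-JPL.AQUA.AIRS.mon.v1"))
# http://gateway.example.org/dataset/obs4cmip5.NASA-JPL.AQUA.AIRS.mon.v1.xml
```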

We have been looking at the following views:

.html -> the normal html pages the gateway currently exposes.
.xml -> the XML documents that are currently returned by the Hessian 
metadata service. Likely a few extensions to support finding the top 
level datasets etc.
.thredds -> If applicable we will return the thredds catalog 
representation of the dataset.
.rss/atom -> rss/atom feed views where applicable.
(we anticipate exposing representations in other metadata standards 
as well)
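
If content types were used instead of file extensions, a client would need
some mapping from the views listed above to media types. The Python sketch
below is only an assumption about what such a mapping could look like; in
particular the media type for the .thredds view is a guess, and the function
name is hypothetical.

```python
# Assumed mapping from the proposed view extensions to media types.
VIEW_MEDIA_TYPES = {
    "html": "text/html",
    "xml": "application/xml",
    "thredds": "application/xml",   # assumption: THREDDS catalogs are XML
    "rss": "application/rss+xml",   # commonly used for RSS feeds
    "atom": "application/atom+xml",
}


def media_type_for(ext: str) -> str:
    """Return the media type assumed for a view extension such as '.xml'."""
    key = ext.lstrip(".").lower()
    if key not in VIEW_MEDIA_TYPES:
        raise ValueError(f"no known view for extension {ext!r}")
    return VIEW_MEDIA_TYPES[key]
```

Under these assumptions the .xml and .thredds views share one media type, so
an Accept header alone could not tell them apart, while a file extension can.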

We are open to any suggestions or requests for this approach/structure. 
We will keep the old dataset URLs active for a while and likely 
implement a redirect mechanism so as to not break existing links during 
a deprecation period.

This work is currently underway; go-essp is going to slow development 
work, but we should have test builds available for evaluation shortly 
after go-essp.

Thanks!
-Nate




On 05/05/2011 09:10 AM, Estanislao Gonzalez wrote:
> Hi Luca, Nate,
>
> The thing is that for the DOI we have to know where the catalog is being
> held. The DOI is really given to an experiment, so the mapping works
> like this: experiment ->  [ datasets ] ->  [ urls_of_catalogs ]
>
> The idea is that after following the DOI link, you'll get to a page
> where, among other things, you'll have the possibility to get to the
> catalogs directly. This URL cannot be inferred (because of the
> "page_number" element) and can't even be guaranteed to remain constant
> (e.g. if some catalogs get re-published or deleted, everything held
> afterwards will change too).
> After looking into this more thoroughly, I think that only the Gateway
> where the dataset was published has this information... which could be a
> problem.
>
> I need something like:
> String getCatalogUrl(String dataset_id)
>
> like:
> getCatalogUrl("tamip.output1.MOHC.HadGEM2-A.tamip200904.3hr.atmos.3hrCurt.r10i1p1.v20110325")
> ==
> "http://cmip-dn.badc.rl.ac.uk/thredds/esgcet/1/tamip.output1.MOHC.HadGEM2-A.tamip200904.3hr.atmos.3hrCurt.r10i1p1.v20110325.html"
>
> (or better yet return a list of catalogs from master and replicas, if known)
>
> If this information is not in the rdfs (haven't checked yet), that means
> it's only in the postgres DB of the master/replica GW. In this case I
> could extend the hessian API if required, but this should get into the
> trunk, it's no use if only we have this version in place. For DOI
> publication we need at least PCMDI and BADC to share this configuration
> (not to mention that I'm 100% against such "selectivity" :-)
>
> If the information is passed by the OAI harvester, then it will be
> simpler, as we will have locally all the information required.
>
> If instead of that number we had some subset of the DRS, we
> wouldn't need this service at all, as the catalog URL could be completely
> inferred from the dataset_id. Still, it would be a more robust solution if
> the gateway could provide such an API.
>
> So I guess what I'm asking is: What do we have now? What will we have?
> Would the Gateway add such functionality (I'll help with this if
> required, of course)?
> Or do you see any other way around this?
>
> Thanks,
> Estani
> Am 05.05.2011 16:09, schrieb Cinquini, Luca (3880):
>> Hi Estani,
>> 	are you looking for this information specifically:
>>
>>     <dataset name="AIRS L3 Monthly Data" ID="obs4cmip5.NASA-JPL.AQUA.AIRS.mon.v1" restrictAccess="esg-user">
>>       <property name="dataset_id" value="obs4cmip5.NASA-JPL.AQUA.AIRS.mon" />
>>       <property name="dataset_version" value="1" />
>>
>> If so, it could be obtained directly from the thredds catalog. Additionally, the data node will soon expose the possibility of querying this directly (yes, through REST....).
>>
>> thanks, Luca
>>
>> On May 5, 2011, at 7:30 AM, Nathan Wilhelmi wrote:
>>
>>> Hi Estani,
>>>
>>> Currently the metadata query is rather limited and hessian only. I am
>>> not exactly sure what information you are after, could you explain that
>>> in more detail?
>>>
>>> Providing a more comprehensive REST API for dataset metadata is underway
>>> right now. We are working through a few issues, but the intent is to
>>> provide a more comprehensive REST based API to get dataset metadata. I
>>> don't have an exact ETA but as more details become available we can pass
>>> them along. There are a number of groups and projects that are looking
>>> for this functionality.
>>>
>>> Thanks!
>>> -Nate
>>>
>>>
>>>
>>> On 05/05/2011 03:46 AM, Estanislao Gonzalez wrote:
>>>> Hi,
>>>>
>>>> There are some problems regarding the dataset->catalog mapping for the
>>>> DOI submission. This information cannot be inferred as there's a
>>>> pseudo-random element in the url of the catalog.
>>>>
>>>> It was brought to my attention that there's an API in the gateway that
>>>> provides this info. The information is in the table
>>>> metadata.dataset_version, but I don't see that the remote metadata query
>>>> API provides this. Is there any other API for this? Are these APIs only
>>>> hessian or is there an effort going on to provide some REST endpoints as
>>>> well?
>>>>
>>>> Thanks,
>>>> Estani
>>>>
>>> _______________________________________________
>>> esg-gateway-dev mailing list
>>> esg-gateway-dev at mailman.earthsystemgrid.org
>>> http://mailman.earthsystemgrid.org/mailman/listinfo/esg-gateway-dev
>


