[Go-essp-tech] publishing by realm

Bob Drach drach at llnl.gov
Thu Feb 25 12:27:15 MST 2010


Hi Stephen,

On Feb 25, 2010, at 1:03 AM, <stephen.pascoe at stfc.ac.uk> wrote:

>
> Hi all,
>
> I completely agree that publishing by realm fits much better with  
> the way the Gateway's UI is designed.  We've been discussing this  
> issue over the last few days between IS-ENES and UKMO and I think  
> the consensus is that it would be a pragmatic change but it would  
> have implications.
>
> (I'm going to use the term realm-dataset to mean the unit of  
> publication "at the realm level")
>
> I had thought that publishing by realm would mean that we can only  
> manage versions of realm-datasets rather than atomic-datasets but  
> Bob's comment made me think.  We would have a rather muddled  
> version system:
>
>  1. esgpublish would track versions of realm-datasets and files
>  2. The DRS records versions of atomic datasets
>
> So what are we versioning?  What happens when 1 atomic-dataset from  
> a realm is found to have errors?
>
>  1. The entire realm-dataset would be unpublished
>  2. The realm-dataset could be republished with the faulty atomic- 
> dataset removed (realm-dataset v2)
>  3. A new version of the realm-dataset could be published with the  
> corrected variable (realm-dataset v3, atomic-dataset v2).
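
The three steps above can be sketched as follows. This is only an illustration of the bookkeeping, not how esgpublish or the DRS store versions: the RealmDataset class, both helper functions, and the "tas"/"pr" entries are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class RealmDataset:
    """Hypothetical record: one realm-dataset version and the
    atomic-dataset versions it contains (name -> version)."""
    version: int
    atomic_versions: dict


def remove_atomic(ds, name):
    # Step 2: republish without the faulty atomic-dataset.
    remaining = {k: v for k, v in ds.atomic_versions.items() if k != name}
    return RealmDataset(ds.version + 1, remaining)


def add_corrected(ds, name, atomic_version):
    # Step 3: republish with the corrected atomic-dataset.
    updated = dict(ds.atomic_versions)
    updated[name] = atomic_version
    return RealmDataset(ds.version + 1, updated)


# Illustrative contents only: two atomic-datasets, both at version 1.
v1 = RealmDataset(1, {"tas": 1, "pr": 1})
v2 = remove_atomic(v1, "pr")     # realm-dataset v2, "pr" withdrawn
v3 = add_corrected(v2, "pr", 2)  # realm-dataset v3, "pr" at atomic v2
```

After one correction the realm-dataset is at v3 while the corrected atomic-dataset is at v2, so the two version numbers no longer line up.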

Where would 'atomic dataset version' be stored? In ESG there would  
only be realm-dataset versions and individual file versions.

Bob
>
> At this point we potentially have confusing version information.   
> Alternatively we could not do #2 but then the whole realm-dataset  
> is unpublished whilst problems are fixed.  So we need to decide  
> whether we will continue to represent versions of atomic-datasets  
> or change the definition of the DRS version component.
>
> This gets more complex when we consider replication.  If this  
> realm-dataset has been replicated, do we propagate all these  
> versions to the replicas?
>
> Just throwing some issues into the air.
>
> Cheers,
> Stephen.
>
>
> -----Original Message-----
> From: go-essp-tech-bounces at ucar.edu on behalf of Bob Drach
> Sent: Wed 2/24/2010 7:20 PM
> To: Luca Cinquini
> Cc: go-essp-tech at ucar.edu
> Subject: Re: [Go-essp-tech] publishing by realm
>
> Hi Luca,
>
> I'm happy with the solution as well. The main downside, compared to
> publishing at the variable level, is that if new versions of specific
> files are published, it will be necessary to republish more than
> just the modified files. However, the publisher can be instructed to
> only rescan the files modified, and in the case where multiple
> variables are updated this scheme would actually require less
> republishing. In short, I think it's a workable solution.
>
> Bob
>
> On Feb 24, 2010, at 7:56 AM, Luca Cinquini wrote:
>
>> Hi Bob,
>> 	I looked at the PCMDI site after you published by realm, and it
>> seems to me that this is a MUCH better presentation of the data to
>> the user. The search results are the right granularity (1825 total
>> CMIP3 datasets, 194 for a single model like CCSM) and the number of
>> files for each dataset, once you click on it, is still very
>> manageable - around 10 for a single CCSM/atmosphere dataset
>> (although this is likely to increase for CMIP5 runs I believe).
>>
>> Eric and I are working on harvesting the additional experiment and
>> realm information for the THREDDS catalogs and exposing them as
>> search facets. When this is done (by next week, hopefully), it
>> would be good to re-publish all these data, because I think it
>> would provide a good example of how users can easily find CMIP3
>> data by selecting one or more of model/realm/experiment/variable.
>>
>> In summary, I like it much better now than before...
>>
>> thanks, Luca
>
> _______________________________________________
> GO-ESSP-TECH mailing list
> GO-ESSP-TECH at ucar.edu
> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>


