[Go-essp-tech] ESG Gateway Search Preview
Eric Nienhouse
ejn at ucar.edu
Thu Jul 28 15:12:15 MDT 2011
Hi Karl, All,
Thank you again for raising these search and discovery problems and
keeping our users' concerns at the forefront of our attention.
As Don noted, we're working on this area as our top priority. Good
things are underway and we have an early access community preview ready
for review. We trust this is on track to provide a robust search
experience based on the feedback from initial use of the ESG system.
Please see further below for caveats and other details related to this
early preview site. The URL follows:
http://search-esg.prototype.ucar.edu/home.htm
Please feel free to try out all aspects of search. We've worked to
address the many issues relating to free text strings, variable short
and standard names, multiple variables, missing ensembles, model names
and more. Specific details on improvements can be found further below.
Thank you for all feedback; we truly appreciate it!
Please provide feedback to: esg-gateway-dev at earthsystemgrid.org
And if you like, add a Jira ticket (or we'll add one for you):
https://vets.development.ucar.edu/jira
This preview represents a significant re-work of the ESG gateway search
system. Note that this work is in development and there are a number of
issues to address prior to a release. In particular, the full workflow
to data download and Wget script generation has limitations and you may
run into errors. Note, too, that the data holdings represented include
NCAR and PCMDI CMIP5 and other projects. We also may redeploy the site
at any time to keep it up to date with progress.
It is our intent to deliver a production-ready release candidate as soon
as possible. Your feedback will be useful in achieving this goal. Note
that upgrading existing ESG Gateways to the current 1.3.0 version is a
key step towards doing so.
For further information on the remaining search-related tasks, please see:
https://vets.development.ucar.edu/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+GTWY+AND+fixVersion+%3D+%22SOLR+Production+Ready%22+AND+resolution+%3D+Unresolved+ORDER+BY+due+ASC%2C+priority+DESC%2C+created+ASC&mode=hide
Following is a list of problems raised and how they are being addressed:
* Consistency of search results/dataset counts:
We now have consistency across search category values (e.g. institutes,
models, ensembles) based on the authoritative source (Gateway and
THREDDS catalog). Gateway administrators will no longer need to add
these values by hand to their gateway database. The THREDDS catalogs
ultimately drive this. We have some work to do relating to creation of
new standard variable names.
* Free text searching:
Free text search is now capable of handling the common syntax and
characters used for CMIP5 and other activity names and labels. Further,
words like "the" are generally excluded from the search query, yielding
better and more expected results. Certain fields, such as a variable's
standard name "description", are no longer used for free text searching.
* Specific variable search issues:
Both standard names and "short" names are provided as facets. This
allows either form to be used, enabling faceted search of the variable
space in cases where a standard name may not be available. Standard
name "long" descriptions are no longer indexed because, based on
feedback, this led to confusing results. Multiple variable facets can
now be selected, which is a feature many users have requested.
* Latency in search indexing:
Publication of new datasets results in nearly immediate search
indexing. This means published datasets will be available via search as
soon as they are published to the gateway. Latency still exists when
harvesting "remote" datasets from another gateway; this is currently
as designed.
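
For readers who want a concrete picture of the free text and variable
facet behaviour described in the list above, here is a minimal Python
sketch. It is not the Gateway's implementation. The stop-word list, the
searched fields, the dataset records and the function names are all
invented for illustration, and it assumes that selecting several
variable facets means a dataset must provide every selected variable.

```python
# Illustrative only: not the ESG Gateway code. All names and records
# below are hypothetical.

# Assumed stop-word list; the real list used by the search index is not
# given in this message.
STOP_WORDS = {"the", "a", "an", "of", "and"}

# Assumed set of fields used for free text search. The variable
# "description" is deliberately left out, as described above.
SEARCHED_FIELDS = ("id", "title")

DATASETS = [
    {"id": "example.model-a.historical.r1i1p1",
     "title": "Model A historical run",
     "short_names": ["tas", "pr"],
     "standard_names": ["air_temperature", "precipitation_flux"],
     "description": "near-surface air temperature and precipitation"},
    {"id": "example.model-b.rcp45.r1i1p1",
     "title": "Model B scenario run",
     "short_names": ["tas"],
     "standard_names": ["air_temperature"],
     "description": "for comparison with the historical baseline"},
    {"id": "example.model-b.historical.r2i1p1",
     "title": "Model B historical run",
     "short_names": ["zg500"],
     "standard_names": [],  # no standard name available
     "description": "geopotential height at 500 hPa"},
]


def tokenize(query):
    """Lower-case the query, split on whitespace, drop stop words.

    Splitting only on whitespace keeps characters such as '-', '_'
    and '.' inside a term, so labels like 'model-a' stay intact.
    """
    return [t for t in query.lower().split() if t not in STOP_WORDS]


def search(datasets, text="", variables=()):
    """Return ids of datasets matching the free text and variable facets."""
    terms = tokenize(text)
    wanted = {v.lower() for v in variables}
    hits = []
    for ds in datasets:
        haystack = " ".join(ds[f] for f in SEARCHED_FIELDS).lower()
        # Short names and standard names are both usable as facet values.
        names = {n.lower() for n in ds["short_names"]}
        names |= {n.lower() for n in ds["standard_names"]}
        if all(t in haystack for t in terms) and wanted <= names:
            hits.append(ds["id"])
    return hits
```

In this sketch, search(DATASETS, text="the historical") ignores "the"
and does not match the second record, whose only mention of
"historical" is in its description. A call such as
search(DATASETS, variables=["zg500"]) still finds the record that has
a short name but no standard name.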
Again, thank you all for your feedback regarding the search experience.
We look forward to providing a release candidate of this work for review
as well.
Kind Regards,
The ESG Gateway Team at NCAR.
Don Middleton wrote:
> Hi Karl - Thanks for staying on top of all of these issues. They're
> all logged into Jira and we're working on the search-related ones as a
> top priority right now. We'll follow up shortly with a more detailed
> status update, and a community preview of some fixes.
>
> cheers - don, on behalf of the NCAR team
>
>
> On Jul 27, 2011, at 4:53 PM, Karl Taylor wrote:
>
>> Dear all,
>>
>> Currently (as of a few minutes ago) the ESG sites are showing the
>> following number of CMIP5 datasets:
>>
>> PCMDI 8040
>> BADC 8040
>> NCAR 8040
>> NCI 8041
>> DKRZ 8032
>> ORNL 8031
>> NERSC 7525
>> JPL 7524
>>
>>
>> I think we know about the NCI single extra dataset, which they're
>> presumably working on removing. Also we know NERSC and JPL are not
>> seeing data at the NCI gateway (CSIRO model). Can anyone explain
>>
>> 1) the missing 8 datasets at DKRZ and the 9 missing datasets at ORNL?
>> 2) the difference of 1 between NERSC and JPL?
>>
>> Best regards,
>> Karl
>>
>>
>>
>> _______________________________________________
>> GO-ESSP-TECH mailing list
>> GO-ESSP-TECH at ucar.edu <mailto:GO-ESSP-TECH at ucar.edu>
>> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> GO-ESSP-TECH mailing list
> GO-ESSP-TECH at ucar.edu
> http://mailman.ucar.edu/mailman/listinfo/go-essp-tech
>