[Go-essp-tech] Comments on the demo today.

Gavin M. Bell gavin at llnl.gov
Tue Jan 24 13:16:09 MST 2012


 Hi Stephen,

I am glad you liked the demo.  We did indeed make a point to **listen**
to our scientists and colleagues, like you, to address their issues.

(rest interleaved below)

On 1/24/12 9:54 AM, stephen.pascoe at stfc.ac.uk wrote:
>
> Hi All,
>
>  
>
> Firstly, a great demo of the P2P system today and I'm glad to see
> virtually all of the issues I've been nagging you about are being
> addressed ;-).  Whilst they are on my mind I have some, hopefully
> constructive, comments.
>
>  
>
of course ;-).
>
> Metadata: we really need the CIM metadata in there.  A lot of effort
> has gone into filling out the CIM metadata and the modelling centres
> won't be happy if it isn't visible along with the data.
>
>  
>
This will be taken care of in a couple of weeks or so.  We will probably
schedule a short demo of it on one of the upcoming go-essp calls, if not
the very next one.
>
> Versions & Replicas: As you know these features are critical for the
> European CMIP5 centres who are committed to maintaining a snapshot for
> IPCC.   What Luca showed on the JPL node is very exciting and I agree
> there is lots of UI work that needs to happen to make it more intuitive.
>
>  
>
Yes, the interface will be evolving.  The esgf-sh shell will also be
another interface into the system, one that command line folks may like
better.  The primary point was to show that the capability exists and is
viable.  Point taken.
>
> Data integrity (checksums and tracking_ids): I think search by
> checksum or tracking_id is supposed to work but I can't make it happen
> on the JPL node.  Selecting a BADC dataset that JPL knows about and
> entering a checksum or tracking_id doesn't produce any hits.  I also
> hope we can display the checksums in the interface so that users can
> check them by hand.
>
This was working rather well in the rehearsal before the demo.  It is a
small matter that we will fix.   As for the display, again the front end
is evolving and will provide a more intuitive interface.  We will give
the users what they want.  The esgf-sh shell will also provide options
to view the results in many different ways.
>
>  
>
> Having played with pcmdi9 myself, I actually find the sticky session
> feature rather confusing.  If I return to it after a few days I don't
> expect to see my previous search criteria still active.  If I enter
> text in the "quick search" box I expect it to start a new search, not
> append it to my current search.  Luca showed how flexible the
> Results/DataCart system is but it too can be confusing (maybe I just
> have to get used to it).  I've managed to show odd things like an
> empty Results tab when there shouldn't be.  Some debugging is still
> needed.
>
This can be easily fixed, but some folks I have talked to liked the
stickiness, as it provides continuity to the experience and they don't
have to rebuild their search from scratch when they come back.  We will
find the right balance for this.  Also, there are fixes that have been
made but are not yet deployed on pcmdi9.
>
>  
>
> I noticed that the responsiveness of the pcmdi9 interface was a lot
> better on the screencast than what I see from my desk in the UK.  It
> might be that the AJAX techniques used are more susceptible to latency
> problems than the traditional load-all-at-once model.  I think it will
> be important to test the software in a federation that includes index
> nodes on other continents to make sure we don't recreate slow search
> responses.  I've noticed that viewing the data cart can take a while
> with a few datasets selected.
>
As for speed, there are a few techniques we can use to address that.
We will also engage in this trans-Atlantic / trans-Pacific testing and
see what's what.  Much of this will be made moot as we move data around
under a more resource-aware replication strategy.
>
>  
>
> I stand by my concern about distributed search.  It would be best to
> give direct feedback to the user when nodes are offline making clear
> that this will affect search results.  Users really notice when the
> number of hits from a search changes and they want to know why.
>
>  
>
As I described in my go-essp presentation about elastic networks, this
p2p network will prune nodes that are not online to provide consistent
search results at any given point in time.  When index nodes come up
they are automatically detected and sewn into the esgf p2p dataspace.
To address your concern more directly: users can use the dashboard to
view the nodes that are in the federation when they issue a query.  As
you can see from the demo, you will be able to *see* the nodes and their
state.  The dashboard monitoring will also have a historical component
that lets you step back through time to see the state of the federation
at earlier points.  So for the curious, the *dashboard* will be the way
to answer your question exactly.
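
To make the pruning idea concrete, here is a minimal sketch of how a
query could report which nodes it covered and which were pruned as
offline.  Everything in it is an assumption for illustration: the Node
record, the last_seen timestamp, the 60-second timeout and the function
names are hypothetical and are not the actual esgf p2p implementation.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical record for one index node in the federation."""
    name: str
    last_seen: float  # time of last successful contact, in seconds


def prune_offline(nodes, now, timeout=60.0):
    """Split nodes into (live, pruned) by how recently each was seen.

    The timeout-based liveness rule is an assumption made for this sketch.
    """
    live = [n for n in nodes if now - n.last_seen <= timeout]
    pruned = [n for n in nodes if now - n.last_seen > timeout]
    return live, pruned


def search_scope_report(nodes, now, timeout=60.0):
    """Summarise which nodes a query issued at `now` would cover."""
    live, pruned = prune_offline(nodes, now, timeout)
    return {
        "queried": sorted(n.name for n in live),
        "offline": sorted(n.name for n in pruned),
    }
```

A report like this, attached to each set of search results, is one way
a user could tell that a change in the number of hits comes from a node
being offline rather than from the data itself.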

There will be work moving forward to couple the RSS and the
wget-procured files so that updating can be done easily.  So if you
pulled down files today while an index node was down and there were no
replicas (both have to happen together, which makes an incomplete record
less likely), then you can run your script again at some other time,
when that index node is back up and/or replicas are online, and update
your downloaded files.
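
As a rough illustration of that update step, the sketch below compares
a local download directory against a list of expected checksums and
returns the files that are missing or do not match, so a later run only
needs to fetch those.  The manifest format (file name mapped to a hex
checksum), the choice of SHA-256 and the function names are all
assumptions for this example, not the format the wget scripts or RSS
feeds actually use.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_fetch(manifest, download_dir, algorithm="sha256"):
    """List files that are missing locally or whose checksum differs.

    `manifest` is a hypothetical dict of file name -> expected hex digest.
    """
    needed = []
    for name, expected in sorted(manifest.items()):
        local = Path(download_dir) / name
        if not local.is_file():
            needed.append(name)  # never downloaded
        elif file_checksum(local, algorithm) != expected.lower():
            needed.append(name)  # present but stale or corrupted
    return needed
```

Running the same comparison by hand on a single file is also how a user
could check a displayed checksum against what they downloaded.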

Distributed search is good... it offers inherent consistency that
harvesting can't - by definition.

All of this is just the tip of the iceberg.  There is work to be done
making sure what we have is hardened.  There are lots of exciting
features and capabilities on the horizon that we will be getting done,
that you are welcome to help with, that we will be bringing to the
community! :-)
>
> Cheers,
>
> Stephen.
>
>  
>
> ---
>
> Stephen Pascoe  +44 (0)1235 445980
>
> Centre of Environmental Data Archival
>
> STFC Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0QX, UK
>
>  
>
>
> -- 
> Scanned by iCritical.
>
>

-- 
Gavin M. Bell
Lawrence Livermore National Labs
--

 "Never mistake a clear view for a short distance."
       	       -Paul Saffo

(GPG Key - http://rainbow.llnl.gov/dist/keys/gavin.asc)

 A796 CE39 9C31 68A4 52A7  1F6B 66B7 B250 21D5 6D3E
