<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hi Jennifer,<br>
<br>
I'll check this more carefully and see what can be done with what we
have (or with minimal changes), though handling multiple versions is
something CMIP3 never had to worry about: files just got changed or
deleted. CMIP5 adds a two-figure factor to that, since there are many
more institutions and data... but it might be possible.<br>
<br>
In any case, I just wanted to thank you very much for the detailed
description; it is very useful.<br>
<br>
Regards,<br>
Estani<br>
<br>
On 15.12.2011 14:52, Jennifer Adams wrote:
<blockquote
cite="mid:A6AA79FD-B141-4611-B06D-2FA892512CFC@cola.iges.org"
type="cite">Hi, Estanislao --
<div>Please see my comments inline.</div>
<div><br>
<div>
<div>On Dec 15, 2011, at 5:47 AM, Estanislao Gonzalez wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000"> Hi Jennifer,<br>
<br>
I'm still not sure how Luca's change in the API is going to
help you, Jennifer. But perhaps it would help me to fully
understand your requirement as well as your use of wget
when using the FTP protocol.<br>
<br>
I presume what you want is to crawl the archive and get
files from a specific directory structure?<br>
Maybe it would be better if you just describe briefly the
procedure you've been using for getting the CMIP3 data so
we can see what could be done for CMIP5.<br>
<br>
How did you find out which data was interesting?<br>
</div>
</blockquote>
COLA scientists ask for a specific
scenario/realm/frequency/variable they need for their
research. Our CMIP3 collection is a shared resource of about
4 TB of data. For CMIP5, we are working with an estimate of 4-5
times that data volume to meet our needs. It's hard to say at
this point whether that will be enough. </div>
<div><br>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000"> How did you find out
which files were required to be downloaded?<br>
</div>
</blockquote>
For CMIP3, we often referred to <a moz-do-not-send="true"
href="http://www-pcmdi.llnl.gov/ipcc/data_status_tables.htm">http://www-pcmdi.llnl.gov/ipcc/data_status_tables.htm</a>
to see what was available. </div>
<div><br>
</div>
<div>The new version of this chart for CMIP5, <a
moz-do-not-send="true"
href="http://cmip-pcmdi.llnl.gov/cmip5/esg_tables/transpose_esg_static_table.html">http://cmip-pcmdi.llnl.gov/cmip5/esg_tables/transpose_esg_static_table.html</a>,
is also useful. An improvement I'd like to see on this page:
the numbers inside the blue boxes that show how many runs
there are for a particular experiment/model should be a link
to a list of those runs that has all the necessary components
from the Data Reference Syntax so that I can go directly to
the URL for that data set. For example, </div>
<div>the BCC-CSM1.1 model shows 45 runs for the decadal1960
experiment. I would like to click on that 45 and get a list of
the 45 URLs for those runs, like this:</div>
<div><a moz-do-not-send="true"
href="http://pcmdi3.llnl.gov/esgcet/dataset/cmip5.output1.BCC.bcc-csm1-1.decadal1960.day.land.day.r1i1p1.html">http://pcmdi3.llnl.gov/esgcet/dataset/cmip5.output1.BCC.bcc-csm1-1.decadal1960.day.land.day.r1i1p1.html</a></div>
<div><a moz-do-not-send="true"
href="http://pcmdi3.llnl.gov/esgcet/dataset/cmip5.output1.BCC.bcc-csm1-1.decadal1960.day.land.day.r2i1p1.html">http://pcmdi3.llnl.gov/esgcet/dataset/cmip5.output1.BCC.bcc-csm1-1.decadal1960.day.land.day.r2i1p1.html</a></div>
<div>...</div>
<div><br>
</div>
<div><br>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000"> How did you tell wget
to download those files?<br>
</div>
</blockquote>
<div>For example: wget -nH --retr-symlinks -r -A nc <a
moz-do-not-send="true"
href="ftp://username@ftp-esg.ucllnl.org/picntrl/atm/mo/tas">ftp://username@ftp-esg.ucllnl.org/picntrl/atm/mo/tas</a>
-o log.tas</div>
<div>This would populate a local directory
./picntrl/atm/mo/tas with all the models and ensemble
members in the proper subdirectory. If I wanted to update
with newer versions or models that had been added, I simply
ran the same 1-line wget command again. This is what I refer
to as 'elegant.'</div>
<div> </div>
<div><br>
</div>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000">We might have already
some way of achieving what you want, if we knew exactly
what that is.<br>
</div>
</blockquote>
Wouldn't that be wonderful? I am hopeful that the P2P will
simplify the elaborate and flawed workflow I have cobbled
together to navigate the current system.</div>
<div>I have a list of desired
experiment/realm/frequency/MIP_table/variables for which I
need to grab all available models/ensembles. Is that not
enough to describe my needs? </div>
<div><br>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000"> <br>
I guess my proposal of issuing:<br>
bash &lt;(wget "<a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="http://p2pnode/wgetscript?experiment=decadal1960&realm=atmos&time_frequency=month&variable=clt">http://p2pnode/wget?experiment=decadal1960&realm=atmos&time_frequency=month&variable=clt</a>"
-qO - | grep -v HadCM3)<br>
</div>
</blockquote>
Yes, this would likely achieve the same result as the
'&model=!name' that Luca implemented. However, I believe
the documentation says that there is a limit of 1000 to the
number of wgets that p2pnode will put into a single search
request, so I don't want to populate my precious 1000 results
with wgets that I'm going to grep out afterwards. </div>
<div><br>
</div>
<div>--Jennifer</div>
<div><br>
</div>
<div><br>
<blockquote type="cite">
<div bgcolor="#FFFFFF" text="#000000"> <br>
was not acceptable to you. But I still don't know exactly
why. <br>
It would really help to know what you meant by "elegant
use of wget".<br>
<br>
Thanks,<br>
Estani<br>
<br>
<br>
On 14.12.2011 18:44, Cinquini, Luca (3880) wrote:
<blockquote
cite="mid:003B320D-938C-4269-91D4-BF083FFAA5CF@jpl.nasa.gov"
type="cite">So Jennifer, would having the capability of
doing negative searches (model=!CCSM), and generating the
corresponding wget scripts, help you?
<div>thanks, Luca</div>
<div><br>
<div>
<div>On Dec 14, 2011, at 10:38 AM, Jennifer Adams
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<div style="word-wrap: break-word;
-webkit-nbsp-mode: space; -webkit-line-break:
after-white-space; ">Well, after working from
the client side to get CMIP3 and CMIP5 data, I
can say that wget is a fine tool to rely on at
the core of the workflow. Unfortunately, the
step up in complexity from CMIP3 to CMIP5 and
the switch from FTP to HTTP trashed the elegant
use of wget. No amount of customized wrapper
software, browser interfaces, or pre-packaged
tools like DML fixes that problem.
<div><br>
</div>
<div>At the moment, the burden on the user is
embarrassingly high. It's so easy to suggest
that the user should "filter to remove what is
not required" from a downloaded script, but
the actual practice of doing that in a timely
and automated and distributed way is NOT
simple! And if the solution to my problem of
filling in the gaps in my incomplete
collection is to go back to clicking in my
browser and do the whole thing over again but
make my filters smarter by looking for what's
already been acquired or what has a new
version number … this is unacceptable. The
filtering must be a server-side responsibility
and the interface must be accessible by
automated scripts. Make it so! </div>
<div><br>
</div>
<div>By the way, the version number is a piece
of metadata that is not in the downloaded
files or the gateway's search criteria. It
appears in the wget script as part of the path
in the file's http location, but the path is
not preserved after the wget is complete, so
it is effectively lost after the download is
done. I guess the file's date stamp would be
the only way to know if the version number of
the data file in question has been changed,
but I'm not going to write that check into my
filtering scripts. </div>
<div><br>
</div>
<div>--Jennifer</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>
<div>
<div>
<div apple-content-edited="true"> <span
class="Apple-style-span"
style="font-size: 12px; "> </span>
<div style="font-size: 12px; ">--</div>
<span class="Apple-style-span"
style="font-size: 12px; "> </span>
<div style="font-size: 12px; ">Jennifer
M. Adams</div>
<span class="Apple-style-span"
style="font-size: 12px; "> </span>
<div style="font-size: 12px; ">IGES/COLA</div>
<span class="Apple-style-span"
style="font-size: 12px; "> </span>
<div style="font-size: 12px; ">4041
Powder Mill Road, Suite 302</div>
<span class="Apple-style-span"
style="font-size: 12px; "> </span>
<div style="font-size: 12px; ">Calverton,
MD 20705</div>
<span class="Apple-style-span"
style="font-size: 12px; "> </span>
<div style="font-size: 12px; "><a
moz-do-not-send="true"
href="mailto:jma@cola.iges.org">jma@cola.iges.org</a></div>
<span class="Apple-style-span"
style="font-size: 12px; "> </span>
<div style="font-size: 12px; "><br
class="khtml-block-placeholder">
</div>
<span class="Apple-style-span"
style="font-size: 12px; "> </span><span
class="Apple-style-span"
style="font-size: 12px; "><br
class="Apple-interchange-newline">
</span><span class="Apple-style-span"
style="font-size: 12px; "> </span>
</div>
<br>
</div>
</div>
</div>
</div>
</div>
_______________________________________________<br>
GO-ESSP-TECH mailing list<br>
<a moz-do-not-send="true"
href="mailto:GO-ESSP-TECH@ucar.edu">GO-ESSP-TECH@ucar.edu</a><br>
<a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="http://mailman.ucar.edu/mailman/listinfo/go-essp-tech">http://mailman.ucar.edu/mailman/listinfo/go-essp-tech</a><br>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
<br>
<pre class="moz-signature" cols="72">--
Estanislao Gonzalez
Max-Planck-Institut für Meteorologie (MPI-M)
Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany
Phone: +49 (40) 46 00 94-126
E-Mail: <a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:gonzalez@dkrz.de">gonzalez@dkrz.de</a> </pre>
</div>
</blockquote>
</div>
<br>
<div apple-content-edited="true">
<span class="Apple-style-span" style="border-collapse:
separate; border-spacing: 0px 0px; color: rgb(0, 0, 0);
font-family: Helvetica; font-size: 12px; font-style: normal;
font-variant: normal; font-weight: normal; letter-spacing:
normal; line-height: normal; text-align: auto;
-khtml-text-decorations-in-effect: none; text-indent: 0px;
-apple-text-size-adjust: auto; text-transform: none;
orphans: 2; white-space: normal; widows: 2; word-spacing:
0px; "><span class="Apple-style-span"
style="border-collapse: separate; border-spacing: 0px 0px;
color: rgb(0, 0, 0); font-family: Helvetica; font-size:
12px; font-style: normal; font-variant: normal;
font-weight: normal; letter-spacing: normal; line-height:
normal; text-align: auto;
-khtml-text-decorations-in-effect: none; text-indent: 0px;
-apple-text-size-adjust: auto; text-transform: none;
orphans: 2; white-space: normal; widows: 2; word-spacing:
0px; "><span class="Apple-style-span"
style="border-collapse: separate; border-spacing: 0px
0px; color: rgb(0, 0, 0); font-family: Helvetica;
font-size: 12px; font-style: normal; font-variant:
normal; font-weight: normal; letter-spacing: normal;
line-height: normal; text-align: auto;
-khtml-text-decorations-in-effect: none; text-indent:
0px; -apple-text-size-adjust: auto; text-transform:
none; orphans: 2; white-space: normal; widows: 2;
word-spacing: 0px; ">
<div>--</div>
<div>Jennifer M. Adams</div>
<div>IGES/COLA</div>
<div>4041 Powder Mill Road, Suite 302</div>
<div>Calverton, MD 20705</div>
<div><a moz-do-not-send="true"
href="mailto:jma@cola.iges.org">jma@cola.iges.org</a></div>
<div><br class="khtml-block-placeholder">
</div>
<br class="Apple-interchange-newline">
</span></span></span>
</div>
<br>
</div>
<br>
</blockquote>
<br>
<br>
<pre class="moz-signature" cols="72">--
Estanislao Gonzalez
Max-Planck-Institut für Meteorologie (MPI-M)
Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany
Phone: +49 (40) 46 00 94-126
E-Mail: <a class="moz-txt-link-abbreviated" href="mailto:gonzalez@dkrz.de">gonzalez@dkrz.de</a> </pre>
</body>
</html>