[Go-essp-tech] noaa node is not working...

Serguei Nikonov serguei.nikonov at noaa.gov
Wed Feb 29 11:51:16 MST 2012


Hi Luca,

This is a whole bouquet of problems, but as I understand it the most important 
one is the "HTTP Status 403 - Access Denied" error returned when a file is 
downloaded through the THREDDS catalog (e.g. 
http://esgdata.gfdl.noaa.gov/thredds/esgcet/2/cmip5.output1.NOAA-GFDL.GFDL-ESM2G.esmControl.mon.aerosol.aero.r1i1p1.v20110601.html?dataset=cmip5.output1.NOAA-GFDL.GFDL-ESM2G.esmControl.mon.aerosol.aero.r1i1p1.v20110601.abs550aer_aero_GFDL-ESM2G_esmControl_r1i1p1_002101-002512.nc). 


The same error also appears when users download files with the wget script 
built into the gateway GUI (log message: WARN  - esg.node.filters.AccessLoggingFilter - 
UnAuthorized Request for: 
http://esgdata.gfdl.noaa.gov/thredds/fileServer/gfdl_dataroot/NOAA-GFDL/...). 
As a result most requests are unsuccessful, judging by the THREDDS log file 
threddsServlet.log, which shows the message
"INFO  - thredds.servlet.FileServerServlet - Request Completed - -1 - 0 - 0".
The current rate of successful requests is 1%.
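
As a rough illustration of how such a success rate could be estimated from 
threddsServlet.log, here is a minimal Python sketch. It assumes that the three 
numbers after "Request Completed" are the HTTP status, the bytes sent and the 
elapsed time, and it treats any 2xx status as a success. The function name and 
the second sample line are made up for the example; only the first and third 
sample lines follow messages quoted in this email.

```python
import re

# Tail of a "Request Completed" line. ASSUMPTION: the three numbers are
# HTTP status, bytes sent, and elapsed milliseconds, in that order.
COMPLETED = re.compile(r"Request Completed - (-?\d+) - (-?\d+) - (-?\d+)\s*$")


def success_rate(lines):
    """Return (successful, total) counted over 'Request Completed' lines."""
    ok = 0
    total = 0
    for line in lines:
        match = COMPLETED.search(line)
        if match is None:
            continue  # not a request-completion line
        total += 1
        status = int(match.group(1))
        if 200 <= status < 300:
            ok += 1
    return ok, total


sample = [
    "INFO  - thredds.servlet.FileServerServlet - Request Completed - -1 - 0 - 0",
    # Invented line, shown only to illustrate a successful request.
    "INFO  - thredds.servlet.FileServerServlet - Request Completed - 200 - 1048576 - 350",
    "WARN  - esg.node.filters.AccessLoggingFilter - UnAuthorized Request for: ...",
]

ok, total = success_rate(sample)
print(f"{ok} of {total} completed requests succeeded")
```

On a real node the list would be replaced by the lines of the log file itself, 
for example by passing an open file object to success_rate().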

At the same time, downloading individual files directly from the gateway is 
always successful.

We have also accumulated "CLOSE_WAIT" connections (currently ~1000), which 
eventually brings the whole system down with a "too many open files" error.
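
A small Python sketch of how the number of CLOSE_WAIT connections could be 
tracked over time is given below. It assumes Linux-style "netstat -tan" output, 
where the sixth column of each tcp line is the connection state. The function 
name and the addresses in the sample are hypothetical (documentation address 
ranges), not taken from our node.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state in `netstat -tan` style text.

    ASSUMPTION: Linux column layout, i.e.
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a tcp/tcp6 socket line.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# Invented sample output, for illustration only.
sample = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN
tcp        1      0 192.0.2.10:8080         198.51.100.7:40112      CLOSE_WAIT
tcp        1      0 192.0.2.10:8080         198.51.100.7:40113      CLOSE_WAIT
tcp        0      0 192.0.2.10:8080         198.51.100.9:51544      ESTABLISHED
"""

states = count_tcp_states(sample)
print("CLOSE_WAIT connections:", states["CLOSE_WAIT"])
```

Run periodically against the real netstat output, a count like this would show 
whether the CLOSE_WAIT sockets keep growing toward the open files limit.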

Another issue, possibly related to the authorization problem, is the Java errors 
reported in the THREDDS log file (see the bottom of this message).

Regarding the installation, we can only start at about 5pm (Eastern) because I 
don't have root privileges, and all work of this kind can only be done by Hans, 
the GFDL sysadmin, who usually arrives at the office around that time. Is that 
convenient for you?

Thanks,
Sergey


2012-02-29T13:15:25.055 -0500 [   4500276][    4622] WARN  - 
org.apache.commons.httpclient.SimpleHttpConnectionManager - 
SimpleHttpConnectionManager being used incorrectly.  Be sure that 
HttpMethod.releaseConnection() is always called and that only one thread 
and/or method is using this connection manager at a time.
2012-02-29T13:18:14.026 -0500 [   4669247][    1853] WARN  - 
esg.orp.app.SAMLAuthorizationServiceFilterCollaborator - Connection is not open
java.lang.IllegalStateException: Connection is not open
         at 
org.apache.commons.httpclient.HttpConnection.assertOpen(HttpConnection.java:1277)
         at 
org.apache.commons.httpclient.HttpConnection.write(HttpConnection.java:974)
         at 
org.apache.commons.httpclient.HttpConnection.write(HttpConnection.java:943)
         at 
org.apache.commons.httpclient.HttpConnection.print(HttpConnection.java:1033)
         at 
org.apache.commons.httpclient.HttpMethodBase.writeRequestHeaders(HttpMethodBase.java:2187)
         at 
org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2060)
         at 
org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
         at 
org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
         at 
org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
         at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
         at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
         at esg.security.common.SOAPServiceClient.doSoap(SOAPServiceClient.java:56)
         at 
esg.orp.app.SAMLAuthorizationServiceFilterCollaborator.authorize(SAMLAuthorizationServiceFilterCollaborator.java:78)
         at 
esg.orp.app.AuthorizationFilter.attemptValidation(AuthorizationFilter.java:60)
         at 
esg.orp.app.AccessControlFilterTemplate.doFilter(AccessControlFilterTemplate.java:62)
         at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
         at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
         at 
esg.orp.app.AccessControlFilterTemplate.doFilter(AccessControlFilterTemplate.java:66)
         at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
         at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
         at 
eske.web.filters.security.AuthorizationTokenValidationFilter.doFilter(AuthorizationTokenValidationFilter.java:84)
         at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
         at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
         at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
         at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
         at 
org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:470)
         at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
         at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
         at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
         at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
         at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
         at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
         at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
         at java.lang.Thread.run(Thread.java:662)
2012-02-29T13:18:14.026 -0500 [   4669247][       9] WARN  - 
esg.orp.app.SAMLAuthorizationServiceFilterCollaborator - Connection is not open
java.lang.IllegalStateException: Connection is not open
         at 
org.apache.commons.httpclient.HttpConnection.assertOpen(HttpConnection.java:1277)
         at 
org.apache.commons.httpclient.HttpConnection.write(HttpConnection.java:974)
         at 
org.apache.commons.httpclient.HttpConnection.write(HttpConnection.java:943)
         at 
org.apache.commons.httpclient.HttpConnection.print(HttpConnection.java:1033)
         at 
org.apache.commons.httpclient.HttpMethodBase.writeRequestHeaders(HttpMethodBase.java:2187)
         at 
org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2060)
         at 
org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
         at 
org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
         at 
org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
         at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
         at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
         at esg.security.common.SOAPServiceClient.doSoap(SOAPServiceClient.java:56)
         at 
esg.orp.app.SAMLAuthorizationServiceFilterCollaborator.authorize(SAMLAuthorizationServiceFilterCollaborator.java:78)
         at 
esg.orp.app.AuthorizationFilter.attemptValidation(AuthorizationFilter.java:60)
         at 
esg.orp.app.AccessControlFilterTemplate.doFilter(AccessControlFilterTemplate.java:62)
         at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
         at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
         at 
esg.orp.app.AccessControlFilterTemplate.doFilter(AccessControlFilterTemplate.java:66)
         at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
         at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
         at 
eske.web.filters.security.AuthorizationTokenValidationFilter.doFilter(AuthorizationTokenValidationFilter.java:84)
         at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
         at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
         at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
         at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
         at 
org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:470)
         at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
         at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
         at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
         at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
         at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
         at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
         at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
         at java.lang.Thread.run(Thread.java:662)




On 02/29/2012 01:26 PM, Cinquini, Luca (3880) wrote:
> Hi Serguei,
>          I just sent an email to the esgf-devel list asking for a volunteer to install the datanode-only setup. I have re-installed it today and it works fine for me. I'll be happy to work with you on the GFDL installation of a data node, but before we start - can you remind me again what are the problems with your current datanode ? Is it an authorization problem ? Or is it something different ?
> thanks, Luca
>
> On Feb 29, 2012, at 11:22 AM, Serguei Nikonov wrote:
>
>> Hi Luca,
>>
>> is there any good news about the data node release? We are looking forward to
>> it because the GFDL data node is practically nonfunctional, with a rate of
>> successful requests of ~1% (I wrote about it recently on the go-essp-tech
>> list). Keeping in mind that right now is a very hot time for users (everybody
>> wants to get data before the Hawaii meeting), we are very anxious about this
>> problem.
>>
>> Thanks,
>> Sergey
>>
>>
>> On 02/23/2012 05:13 PM, Luca Cinquini wrote:
>>> Hi Serguei,
>>> sorry I haven't replied to this so far... I think we should wait to tackle this
>>> till next week, when you can install the new release of the data node. At that
>>> point, we'll know for sure what software you are running, and there should be
>>> enough debug statements in the logs to figure out what's wrong.
>>> So please be patient, and bug us again by mid week if you haven't heard from us.
>>> thanks, Luca
>>>
>>> On Thu, Feb 23, 2012 at 7:45 AM, Serguei Nikonov<serguei.nikonov at noaa.gov
>>> <mailto:serguei.nikonov at noaa.gov>>  wrote:
>>>
>>>     Hi Estani,
>>>
>>>     we increased memory allocation 2 months ago. Unfortunately the main issue we
>>>     had, 403 error, is still here.
>>>
>>>     Sergey
>>>
>>>
>>>     On 02/23/2012 02:46 AM, Estanislao Gonzalez wrote:
>>>
>>>         Hi Hans,
>>>
>>>         I was about to suggest using /usr/local/java :-)
>>>
>>>         Don't worry about the wms config error... I have that too... As we don't
>>>         use it, there's no real harm, though indeed it would be great to set it
>>>         up properly.
>>>
>>>         From your mail I'd think you are ready to go. I haven't completely
>>>         understood whether you have increased the memory allocation or it was
>>>         already at the values you show.
>>>         If the former is true, then the memory and open file increase should
>>>         solve all the problems you were having.
>>>
>>>         Cheers,
>>>         Estani
>>>
>>>         Am 23.02.2012 03:31, schrieb Hans Vahlenkamp:
>>>
>>>             Some more information... I found why running jmap was failing. We have
>>>             another java version installed; the one provided by Red Hat with
>>>             RHEL 5 which
>>>             perhaps should be removed. Using jmap from the Java version provided
>>>             with
>>>             the data node software works.
>>>
>>>             [root at data2 ~]# sudo /usr/local/java/bin/jmap -heap $(sudo jps |
>>>             grep -iv jps)
>>>             Attaching to process ID 28509, please wait...
>>>             Debugger attached successfully.
>>>             Server compiler detected.
>>>             JVM version is 19.0-b09
>>>
>>>             using thread-local object allocation.
>>>             Parallel GC with 10 thread(s)
>>>
>>>             Heap Configuration:
>>>             MinHeapFreeRatio = 40
>>>             MaxHeapFreeRatio = 70
>>>             MaxHeapSize = 17179869184 (16384.0MB)
>>>             NewSize = 1310720 (1.25MB)
>>>             MaxNewSize = 17592186044415 MB
>>>             OldSize = 5439488 (5.1875MB)
>>>             NewRatio = 2
>>>             SurvivorRatio = 8
>>>             PermSize = 21757952 (20.75MB)
>>>             MaxPermSize = 536870912 (512.0MB)
>>>
>>>             Heap Usage:
>>>             PS Young Generation
>>>             Eden Space:
>>>             capacity = 4295032832 (4096.0625MB)
>>>             used = 3749511000 (3575.812339782715MB)
>>>             free = 545521832 (520.2501602172852MB)
>>>             87.29877387815041% used
>>>              From Space:
>>>             capacity = 715784192 (682.625MB)
>>>             used = 715764432 (682.6061553955078MB)
>>>             free = 19760 (0.0188446044921875MB)
>>>             99.99723939139466% used
>>>             To Space:
>>>             capacity = 715784192 (682.625MB)
>>>             used = 0 (0.0MB)
>>>             free = 715784192 (682.625MB)
>>>             0.0% used
>>>             PS Old Generation
>>>             capacity = 11453267968 (10922.6875MB)
>>>             used = 4231302192 (4035.284225463867MB)
>>>             free = 7221965776 (6887.403274536133MB)
>>>             36.94406001694974% used
>>>             PS Perm Generation
>>>             capacity = 85262336 (81.3125MB)
>>>             used = 85221144 (81.2732162475586MB)
>>>             free = 41192 (0.03928375244140625MB)
>>>             99.95168792935722% used
>>>
>>>             However, we are still getting frequent "HTTP Status 403 - Access
>>>             Denied."
>>>             failures when trying to download files directly from our local TDS.
>>>
>>>             Hans
>>>
>>>
>>>             On 02/22/2012 08:21 PM, Hans Vahlenkamp wrote:
>>>
>>>                 Hello Estani,
>>>
>>>                 After restarting our data node, the ORP address
>>>                 "https://esgdata.gfdl.noaa.gov/OpenidRelyingParty/home.htm"
>>>                 is functioning again. Trying to see the memory map of the Java
>>>                 process is
>>>                 currently failing:
>>>
>>>                 [root at data2 bin]# sudo jmap -heap $(sudo jps | grep -iv jps)
>>>                 Attaching to process ID 28509, please wait...
>>>                 Exception in thread "main"
>>>                 java.lang.reflect.InvocationTargetException
>>>                 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>                 at
>>>                 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>                 at
>>>                 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>
>>>                 at java.lang.reflect.Method.invoke(Method.java:597)
>>>                 at sun.tools.jmap.JMap.runTool(JMap.java:179)
>>>                 at sun.tools.jmap.JMap.main(JMap.java:110)
>>>                 Caused by: sun.jvm.hotspot.runtime.VMVersionMismatchException:
>>>                 Supported
>>>                 versions are 19.1-b02. Target VM is 19.0-b09
>>>                 at sun.jvm.hotspot.runtime.VM.checkVMVersion(VM.java:224)
>>>                 at sun.jvm.hotspot.runtime.VM.<init>(VM.java:287)
>>>                 at sun.jvm.hotspot.runtime.VM.initialize(VM.java:357)
>>>                 at
>>>                 sun.jvm.hotspot.bugspot.BugSpotAgent.setupVM(BugSpotAgent.java:594)
>>>                 at
>>>                 sun.jvm.hotspot.bugspot.BugSpotAgent.go(BugSpotAgent.java:494)
>>>                 at
>>>                 sun.jvm.hotspot.bugspot.BugSpotAgent.attach(BugSpotAgent.java:332)
>>>                 at sun.jvm.hotspot.tools.Tool.start(Tool.java:163)
>>>                 at sun.jvm.hotspot.tools.HeapSummary.main(HeapSummary.java:39)
>>>                 ... 6 more
>>>
>>>                 although I recall that it worked previously.
>>>
>>>                 I'm not sure if this is a related problem, but after a restart
>>>                 we noticed in
>>>                 the "catalina.err" file these entries:
>>>
>>>                 SEVERE: StandardWrapper.Throwable
>>>                 org.springframework.beans.factory.BeanCreationException: Error
>>>                 creating bean
>>>                 with name 'wmsController' defined in ServletContext resource
>>>                 [/WEB-INF/wms-servlet.xml]: Invocation of init method failed; nested
>>>                 exception is thredds.server.wms.config.WmsConfigException:
>>>                 Could not find
>>>                 wmsConfig.xml
>>>                 at
>>>                 org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1336)
>>>
>>>                 at
>>>                 org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:471)
>>>
>>>                 at
>>>                 org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory$1.run(AbstractAutowireCapableBeanFactory.java:409)
>>>
>>>                 at java.security.AccessController.doPrivileged(Native Method)
>>>                 at
>>>                 org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:380)
>>>
>>>                 at
>>>                 org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:264)
>>>
>>>                 at
>>>                 org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:220)
>>>
>>>                 at
>>>                 org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:261)
>>>
>>>                 at
>>>                 org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:185)
>>>
>>>                 at
>>>                 org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:164)
>>>
>>>                 at
>>>                 org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:429)
>>>
>>>                 at
>>>                 org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:729)
>>>
>>>                 at
>>>                 org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:381)
>>>
>>>                 at
>>>                 org.springframework.web.servlet.FrameworkServlet.createWebApplicationContext(FrameworkServlet.java:402)
>>>
>>>                 at
>>>                 org.springframework.web.servlet.FrameworkServlet.initWebApplicationContext(FrameworkServlet.java:316)
>>>
>>>                 at
>>>                 org.springframework.web.servlet.FrameworkServlet.initServletBean(FrameworkServlet.java:282)
>>>
>>>                 at
>>>                 org.springframework.web.servlet.HttpServletBean.init(HttpServletBean.java:126)
>>>                 at javax.servlet.GenericServlet.init(GenericServlet.java:212)
>>>                 at
>>>                 org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1173)
>>>                 at
>>>                 org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:993)
>>>                 at
>>>                 org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4420)
>>>
>>>                 at
>>>                 org.apache.catalina.core.StandardContext.start(StandardContext.java:4733)
>>>                 at
>>>                 org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:799)
>>>                 at
>>>                 org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:779)
>>>                 at
>>>                 org.apache.catalina.core.StandardHost.addChild(StandardHost.java:601)
>>>                 at
>>>                 org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:675)
>>>                 at
>>>                 org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:601)
>>>                 at
>>>                 org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:502)
>>>                 at
>>>                 org.apache.catalina.startup.HostConfig.start(HostConfig.java:1315)
>>>                 at
>>>                 org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:324)
>>>                 at
>>>                 org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:142)
>>>
>>>                 at
>>>                 org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1061)
>>>                 at
>>>                 org.apache.catalina.core.StandardHost.start(StandardHost.java:840)
>>>                 at
>>>                 org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053)
>>>                 at
>>>                 org.apache.catalina.core.StandardEngine.start(StandardEngine.java:463)
>>>                 at
>>>                 org.apache.catalina.core.StandardService.start(StandardService.java:525)
>>>                 at
>>>                 org.apache.catalina.core.StandardServer.start(StandardServer.java:754)
>>>                 at org.apache.catalina.startup.Catalina.start(Catalina.java:595)
>>>                 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>                 at
>>>                 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>                 at
>>>                 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>
>>>                 at java.lang.reflect.Method.invoke(Method.java:597)
>>>                 at
>>>                 org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
>>>                 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>                 at
>>>                 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>                 at
>>>                 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>
>>>                 at java.lang.reflect.Method.invoke(Method.java:597)
>>>                 at
>>>                 org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:219)
>>>                 Caused by: thredds.server.wms.config.WmsConfigException: Could
>>>                 not find
>>>                 wmsConfig.xml
>>>                 at
>>>                 thredds.server.wms.ThreddsWmsController.init(ThreddsWmsController.java:99)
>>>                 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>                 at
>>>                 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>                 at
>>>                 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>
>>>                 at java.lang.reflect.Method.invoke(Method.java:597)
>>>                 at
>>>                 org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeCustomInitMethod(AbstractAutowireCapableBeanFactory.java:1412)
>>>
>>>                 at
>>>                 org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1373)
>>>
>>>                 at
>>>                 org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1333)
>>>
>>>                 ... 47 more
>>>
>>>                 I'm not sure why this is occurring since the
>>>                 "/usr/local/apache-tomcat-6.0.32/content/thredds/wmsConfig.xml"
>>>                 file exists.
>>>
>>>                 We increased the maximum number of open files to 4096 for the
>>>                 tomcat user and
>>>                 have the Java memory options set with "-Xmx16384m -Xms16384m
>>>                 -XX:MaxPermSize=512m".
>>>                 Also, the last software update we did on the data node was in
>>>                 December.
>>>
>>>                 Thanks,
>>>
>>>                 Hans and Sergey
>>>
>>>
>>>                 On 02/22/2012 10:00 AM, Estanislao Gonzalez wrote:
>>>
>>>                     Hi Sergei,
>>>
>>>                     I have a terrible memory so don't ask for the impossible ;-)
>>>
>>>                     I guess there are multiple things happening at the same
>>>                     time, which is
>>>                     pretty standard in CMIP5 context...
>>>
>>>                     You do have a problem that has nothing to do with pcmdi3
>>>                     being overloaded,
>>>                     which it is and causes many other problems.
>>>
>>>                     I've followed the link you sent and got a 500 error, which
>>>                     is not near good...
>>>
>>>                     java.lang.NoClassDefFoundError:
>>>                     org/springframework/web/util/UriUtils
>>>                     org.springframework.web.util.UrlPathHelper.decodeRequestString(UrlPathHelper.java:307)
>>>
>>>                     org.springframework.web.util.UrlPathHelper.getContextPath(UrlPathHelper.java:213)
>>>
>>>                     org.springframework.web.util.UrlPathHelper.getPathWithinApplication(UrlPathHelper.java:163)
>>>
>>>
>>>                     This was thrown by the ORP... (see
>>>                     https://esgdata.gfdl.noaa.gov/OpenidRelyingParty/home.htm)
>>>                     Maybe Luca can help with this... but tell me something, have
>>>                     you updated or
>>>                     changed the node in any particular way?
>>>                     There are multiple possible causes... one being you ran out
>>>                     of memory in the
>>>                     PermGen space, and the class could just not be loaded anymore...
>>>                     Could you check the catalina logs? If they are huge, stop
>>>                     the node, move the logs somewhere (tag them with a
>>>                     timestamp in the name) and let it create new ones; that
>>>                     will make debugging easier.
>>>
>>>                     There should be a message telling why the ORP is not working
>>>                     anymore...
>>>
>>>                     Also could you send me this:
>>>                     #information on tomcat, provided it's the only java server
>>>                     in the node:
>>>                     sudo jmap -heap $(sudo jps | grep -iv jps)
>>>                     # current java parameters
>>>                     grep Xmx /etc/esg.env
>>>
>>>                     This is what I have to compare:
>>>                     $ grep Xmx /etc/esg.env
>>>                     export JAVA_OPTS="-Xmx15G -Xms10G -XX:MaxPermSize=512m
>>>                     -XX:NewRatio=9"
>>>                     $ sudo jmap -heap $(sudo jps | grep -iv jps)
>>>                     Attaching to process ID 15222, please wait...
>>>                     Debugger attached successfully.
>>>                     Server compiler detected.
>>>                     JVM version is 19.0-b09
>>>
>>>                     using thread-local object allocation.
>>>                     Parallel GC with 8 thread(s)
>>>
>>>                     Heap Configuration:
>>>                     MinHeapFreeRatio = 40
>>>                     MaxHeapFreeRatio = 70
>>>                     MaxHeapSize = 16106127360 (15360.0MB)
>>>                     NewSize = 1310720 (1.25MB)
>>>                     MaxNewSize = 17592186044415 MB
>>>                     OldSize = 5439488 (5.1875MB)
>>>                     NewRatio = 9
>>>                     SurvivorRatio = 8
>>>                     PermSize = 21757952 (20.75MB)
>>>                     MaxPermSize = 536870912 (512.0MB)
>>>
>>>                     Heap Usage:
>>>                     PS Young Generation
>>>                     Eden Space:
>>>                     capacity = 757792768 (722.6875MB)
>>>                     used = 332487616 (317.08489990234375MB)
>>>                     free = 425305152 (405.60260009765625MB)
>>>                     43.875796924997836% used
>>>                      From Space:
>>>                     capacity = 152043520 (145.0MB)
>>>                     used = 82383968 (78.56747436523438MB)
>>>                     free = 69659552 (66.43252563476562MB)
>>>                     54.184465079471984% used
>>>                     To Space:
>>>                     capacity = 157483008 (150.1875MB)
>>>                     used = 0 (0.0MB)
>>>                     free = 157483008 (150.1875MB)
>>>                     0.0% used
>>>                     PS Old Generation
>>>                     capacity = 9663676416 (9216.0MB)
>>>                     used = 7712867544 (7355.563682556152MB)
>>>                     free = 1950808872 (1860.4363174438477MB)
>>>                     79.81297398606937% used
>>>                     PS Perm Generation
>>>                     capacity = 79757312 (76.0625MB)
>>>                     used = 79734640 (76.04087829589844MB)
>>>                     free = 22672 (0.0216217041015625MB)
>>>                     99.97157376617707% used
>>>
>>>                     Thanks,
>>>                     Estani
>>>
>>>                     Am 22.02.2012 15:35, schrieb Serguei Nikonov:
>>>
>>>                         Hi Estani,
>>>
>>>                         Indeed, over the last two days we experienced the "Too
>>>                         many open files" error in tomcat. That made us clear out
>>>                         the catalina log file, which was so big that it
>>>                         overloaded the hard drive. Thanks for your advice on how
>>>                         to prevent this issue. Currently, TDS is up but has
>>>                         limited functionality. Very often (maybe most of the
>>>                         time) files are not accessible: TDS gives a "403 - Access
>>>                         Denied" error when trying to download them. We had this
>>>                         issue 3 months ago. You must remember it, because you
>>>                         were the first to point out this problem. It's
>>>                         interesting that at the same time the files are
>>>                         downloadable from the pcmdi gateway but not directly
>>>                         from the data node TDS. Bob fixed it that time on the
>>>                         pcmdi side. Now we have very similar symptoms, but he is
>>>                         sure that this is because the gateway is too busy.
>>>
>>>                         For example, the file in this dataset
>>>                         http://esgdata.gfdl.noaa.gov/thredds/esgcet/1/cmip5.output1.NOAA-GFDL.GFDL-CM3.1pctCO2.mon.atmos.Amon.r1i1p1.v20110601.html?dataset=cmip5.output1.NOAA-GFDL.GFDL-CM3.1pctCO2.mon.atmos.Amon.r1i1p1.v20110601.ccb_Amon_GFDL-CM3_1pctCO2_r1i1p1_002601-003012.nc
>>>
>>>                         was accessible when I tried it before writing this email
>>>                         and now it is not.
>>>
>>>                         At the same time it is downloadable from the pcmdi
>>>                         gateway all the time.
>>>
>>>                         regards,
>>>                         Sergey
>>>
>>>
>>>                         On 02/22/2012 05:41 AM, Estanislao Gonzalez wrote:
>>>
>>>                             Hi Sergei,
>>>
>>>                             Your node is down because of too many files:
>>>                             http://esgdata.gfdl.noaa.gov/esgf-node-manager/
>>>
>>>                             java.io.FileNotFoundException:
>>>                             /usr/local/apache-tomcat-6.0.32/webapps/esgf-node-manager/index.html
>>>                             (Too many open files)
>>>                             java.io.FileInputStream.open(Native Method)
>>>                             java.io.FileInputStream.<init>(FileInputStream.java:106)
>>>                             org.apache.naming.resources.FileDirContext$FileResource.streamContent(FileDirContext.java:927)
>>>
>>>
>>>                             org.apache.catalina.servlets.DefaultServlet.copy(DefaultServlet.java:1832)
>>>                             org.apache.catalina.servlets.DefaultServlet.serveResource(DefaultServlet.java:919)
>>>
>>>                             org.apache.catalina.servlets.DefaultServlet.doGet(DefaultServlet.java:398)
>>>                             javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
>>>                             javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
>>>
>>>                             The catalina logs should also have messages
>>>                             regarding this.
>>>                             This is how you can prevent it from happening again:
>>>                             http://esgf.org/wiki/ESGFNode/FAQ#Tomcat_is_complaining_about_too_many_open_files
>>>
>>>                             Now, you'll have to restart the node.
>>>
>>>                             Thanks,
>>>                             Estani
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>


