[jira] [Commented] (APEXCORE-807) In secure mode containers are failing after one day and the application is failing after seven days

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/APEXCORE-807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16442548#comment-16442548 ]

Pramod Immaneni commented on APEXCORE-807:
------------------------------------------

This happens because YARN is not renewing the tokens on a daily basis. As a workaround, the application needs to renew the tokens itself before the daily renewal period expires, just as it already refreshes them before the seven-day lifetime expires.
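
The decision described above can be sketched roughly as follows. This is only an illustration of the timing logic, not the actual Apex change: the class name, the method name, and the 70% safety margin are hypothetical, and the one-day and seven-day periods are assumed to match the cluster's configured token renewal interval and maximum lifetime. A real implementation would call the Hadoop token renew API where this sketch returns RENEW, and obtain new tokens where it returns REFRESH.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: decides whether a delegation token should be
// renewed (daily period) or replaced (seven-day lifetime) at a given time.
class TokenRenewalPlan {
    // Assumed periods, taken from the one-day and seven-day figures in this issue.
    static final long RENEW_INTERVAL_MS = TimeUnit.DAYS.toMillis(1);
    static final long MAX_LIFETIME_MS = TimeUnit.DAYS.toMillis(7);

    enum Action { NONE, RENEW, REFRESH }

    // marginPercent is the fraction of each period (in percent) after which
    // the application acts, so that it acts before the period actually expires.
    static Action nextAction(long nowMs, long issuedMs, long lastRenewedMs, int marginPercent) {
        long refreshAt = issuedMs + MAX_LIFETIME_MS * marginPercent / 100;
        long renewAt = lastRenewedMs + RENEW_INTERVAL_MS * marginPercent / 100;
        if (nowMs >= refreshAt) {
            return Action.REFRESH; // close to the lifetime limit: get new tokens
        }
        if (nowMs >= renewAt) {
            return Action.RENEW;   // close to the daily limit: renew existing tokens
        }
        return Action.NONE;
    }
}
```

With a margin of 70, a token issued at time 0 and never renewed would be renewed once roughly 16.8 hours have passed and replaced once roughly 4.9 days have passed. The refresh check comes first because renewing cannot extend a token past its maximum lifetime.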

> In secure mode containers are failing after one day and the application is failing after seven days
> ---------------------------------------------------------------------------------------------------
>
>                 Key: APEXCORE-807
>                 URL: https://issues.apache.org/jira/browse/APEXCORE-807
>             Project: Apache Apex Core
>          Issue Type: Task
>            Reporter: Pramod Immaneni
>            Assignee: Pramod Immaneni
>            Priority: Major
>
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token nnnnnn for xxxxxx) can't be found in cache
>  at com.google.common.base.Throwables.propagate(Throwables.java:156)
>  at com.datatorrent.stram.engine.Node.reportStats(Node.java:489)
>  at com.datatorrent.stram.engine.GenericNode.reportStats(GenericNode.java:825)
>  at com.datatorrent.stram.engine.GenericNode.processEndWindow(GenericNode.java:184)
>  at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:397)
>  at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1465)
> Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token nnnnnn for xxxxxx) can't be found in cache
>  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>  at com.datatorrent.stram.engine.Node.reportStats(Node.java:482)
>  ... 4 more
> Caused by: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token nnnnnn for xxxxxx) can't be found in cache
>  at com.google.common.base.Throwables.propagate(Throwables.java:156)
>  at com.datatorrent.common.util.AsyncFSStorageAgent.copyToHDFS(AsyncFSStorageAgent.java:131)
>  at com.datatorrent.common.util.AsyncFSStorageAgent.flush(AsyncFSStorageAgent.java:156)
>  at com.datatorrent.stram.engine.Node$CheckpointHandler.call(Node.java:706)
>  at com.datatorrent.stram.engine.Node$CheckpointHandler.call(Node.java:696)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token nnnnnn for xxxxxx) can't be found in cache
>  at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
>  at org.apache.hadoop.ipc.Client.call(Client.java:1498)
>  at org.apache.hadoop.ipc.Client.call(Client.java:1398)
>  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
>  at com.sun.proxy.$Proxy10.create(Unknown Source)
>  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:313)
>  at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
>  at com.sun.proxy.$Proxy11.create(Unknown Source)
>  at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1822)
>  at org.apache.hadoop.hdfs.DFSClient.primitiveCreate(DFSClient.java:1762)
>  at org.apache.hadoop.fs.Hdfs.createInternal(Hdfs.java:104)
>  at org.apache.hadoop.fs.Hdfs.createInternal(Hdfs.java:60)
>  at org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:585)
>  at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:688)
>  at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:684)
>  at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>  at org.apache.hadoop.fs.FileContext.create(FileContext.java:684)
>  at com.datatorrent.common.util.AsyncFSStorageAgent.copyToHDFS(AsyncFSStorageAgent.java:119)
>  ... 9 more


