Problems encountered when upgrading Atlassian Bamboo

This is for a self-hosted Atlassian Bamboo environment.

The first rule of thumb for any Atlassian upgrade is to back up setenv.sh and compare the old and new versions so that you bring any customisations across. The installer won't do this for you.
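A minimal sketch of that backup-and-compare step. The function name and its three arguments are hypothetical, and it assumes setenv.sh sits under bin/ in each install directory; adjust to your layout.

```shell
# Keep a copy of the old setenv.sh and print a unified diff against the new one.
# Neither install directory is modified.
compare_setenv() {
    local old_install="$1"   # previous install directory (assumed layout: bin/setenv.sh)
    local new_install="$2"   # freshly unpacked install directory
    local backup_dir="$3"    # where the copy of the old file is kept

    mkdir -p "$backup_dir" || return 1
    cp -p "$old_install/bin/setenv.sh" "$backup_dir/setenv.sh.old" || return 1

    # diff exits 1 when the files differ, which is the expected case here,
    # so only exit codes above 1 (real errors) count as failure.
    diff -u "$backup_dir/setenv.sh.old" "$new_install/bin/setenv.sh" || [ "$?" -eq 1 ]
}
```

Lines prefixed with - in the output exist only in the old file and are the candidates to carry across by hand.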

It is best to deploy each version into a new directory and symlink the active application path to it,
e.g. run Bamboo from /data/bamboo/bamboo-install/...
which is a symlink to the current version:

[root@bamboo logs]# ls -ld /data/bamboo/bamboo-install
lrwxrwxrwx 1 _bamboo _bamboo 49 Oct 18 03:40 /data/bamboo/bamboo-install -> /data/bamboo/installations/atlassian-bamboo-6.5.1
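
Repointing the symlink at a new version could look something like this. The function name is hypothetical; stop the Bamboo service before switching and start it again afterwards.

```shell
# Point the stable application path at a specific versioned install directory.
switch_bamboo_version() {
    local target="$1"   # versioned install directory to activate
    local link="$2"     # stable path the service runs from

    [ -d "$target" ] || { echo "no such directory: $target" >&2; return 1; }

    # -s: symbolic link, -f: replace an existing link,
    # -n: treat an existing link to a directory as a file, so the new link
    #     replaces it instead of being created inside the old target.
    ln -sfn "$target" "$link"
}
```

For the layout above this would be called as: switch_bamboo_version /data/bamboo/installations/atlassian-bamboo-6.5.1 /data/bamboo/bamboo-install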

Now for the errors encountered this time around:

Error - Java Out Of Memory errors (in Apache Tomcat logs – catalina.out)

If you see entries such as this:

18-Oct-2018 08:44:14.422 SEVERE [http-nio-127.0.0.1-8000-exec-103] org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun
java.lang.OutOfMemoryError: GC overhead limit exceeded

A Java OOM can hit any of the JVM's threads, so the specific impact is unpredictable. Today we first lost the ability for AWS build nodes to establish tunnels back into Bamboo; after restarting the service the nodes worked, but the GUI slowed down instead.

You may need to tune the JVM memory settings, which live in setenv.sh on the Bamboo server:

# The following 2 settings control the minimum and maximum given to the Bamboo Java virtual machine.  In larger Bamboo instances, the maximum amount will need to be increased.
#
JVM_MINIMUM_MEMORY=8092m
JVM_MAXIMUM_MEMORY=28672m
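
After changing these values and restarting, it can be worth checking whether the errors keep appearing. A small sketch, with a hypothetical function name, that counts OutOfMemoryError lines in a given log file:

```shell
# Print how many lines of a log file mention java.lang.OutOfMemoryError.
count_oom_errors() {
    local log_file="$1"

    # -c prints the number of matching lines, -F matches the string literally.
    # grep exits 1 when nothing matches (it still prints 0), so that case is
    # treated as success; only real errors such as a missing file fail.
    grep -cF 'java.lang.OutOfMemoryError' "$log_file" || [ "$?" -eq 1 ]
}
```

Run it against catalina.out in the Tomcat logs directory of your install, e.g. count_oom_errors /data/bamboo/bamboo-install/logs/catalina.out (that exact path is an assumption based on the layout above).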

Error - Builds unable to store artifacts

I observed these errors in the build logs:

18-Oct-2018 11:27:57  com.atlassian.bamboo.build.artifact.BambooRemoteArtifactHandler: java.net.SocketException: Broken pipe
18-Oct-2018 11:27:57  Unable to publish artifact [Rendered files]: Unable to publish artifact Non required shared artifact: [Rendered files], pattern: [generated_files.tar.gz] anchored at: [.] for BUILD-PLAN1 via com.atlassian.bamboo.build.artifact.BambooRemoteArtifactHandler@389dd430
18-Oct-2018 11:27:57  The artifact is required for down stream Stages and Jobs, build will now fail.

Digging further, the agent logs show errors such as:

2018-10-18 11:58:25,709 INFO [0-BAM::Elastic Agent on i-007b561bb2cb49cca::Agent:pool-3-thread-1] [RetryExec] Retrying request to {}->httpst://127.0.0.1:46593
2018-10-18 11:58:26,116 WARN [0-BAM::Elastic Agent on i-007b561bb2cb49cca::Agent:pool-3-thread-1] [ArtifactStreams] Error during artifact transfer, total bytes written: 114688, total requested: 246751, latest request: 246751
2018-10-18 11:58:26,116 INFO [0-BAM::Elastic Agent on i-007b561bb2cb49cca::Agent:pool-3-thread-1] [RetryExec] I/O exception (java.net.SocketException) caught when processing request to {}->httpst://127.0.0.1:46593: Connection reset
2018-10-18 11:58:26,116 INFO [0-BAM::Elastic Agent on i-007b561bb2cb49cca::Agent:pool-3-thread-1] [RetryExec] Retrying request to {}->httpst://127.0.0.1:46593
2018-10-18 11:58:26,461 WARN [0-BAM::Elastic Agent on i-007b561bb2cb49cca::Agent:pool-3-thread-1] [ArtifactStreams] Error during artifact transfer, total bytes written: 98304, total requested: 246751, latest request: 246751
2018-10-18 11:58:26,461 INFO [0-BAM::Elastic Agent on i-007b561bb2cb49cca::Agent:pool-3-thread-1] [RetryExec] I/O exception (java.net.SocketException) caught when processing request to {}->httpst://127.0.0.1:46593: Connection reset
2018-10-18 11:58:26,461 INFO [0-BAM::Elastic Agent on i-007b561bb2cb49cca::Agent:pool-3-thread-1] [RetryExec] Retrying request to {}->httpst://127.0.0.1:46593

This setup uses nginx to present the web service over HTTPS and forward requests to the Java application.

Because of this, nginx may need to buffer large request bodies (such as artifact uploads) in temporary files on disk.

We have set this directive in the nginx configuration:

client_body_temp_path /data/nginx/client_temp 1 2;

Nginx logs on the Bamboo server show:

2018/10/18 12:14:02 [crit] 100146#0: *6458677 mkdir() "/data/nginx/client_temp/8" failed (2: No such file or directory), client: 192.168.10.20, server: _, request: "POST /agentServer/message?planResultKey=BUILD-PLAN1-4567 HTTP/1.1", host: "bamboo"
2018/10/18 12:14:02 [crit] 100146#0: *6458702 mkdir() "/data/nginx/client_temp/9" failed (2: No such file or directory), client: 192.168.10.20, server: _, request: "POST /agentServer/message?planResultKey=BUILD-PLAN1-4567 HTTP/1.1", host: "bamboo"
2018/10/18 12:14:03 [crit] 100146#0: *6458709 mkdir() "/data/nginx/client_temp/0" failed (2: No such file or directory), client: 192.168.10.20, server: _, request: "POST /agentServer/message?planResultKey=BUILD-PLAN1-4567 HTTP/1.1", host: "bamboo"

It turned out the client_temp directory didn't exist.

After I created it and set ownership and permissions for the nginx user, artifact publishing worked as expected.
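
A sketch of that fix. The function name is hypothetical, and the owner you pass in is an assumption: it must be whatever user and group your nginx worker processes run as, which varies by distribution.

```shell
# Create the nginx request-body temp directory and restrict it to its owner.
ensure_client_temp() {
    local dir="$1"     # must match the client_body_temp_path directive
    local owner="$2"   # user:group of the nginx workers (site-specific assumption)

    mkdir -p "$dir" || return 1
    chown "$owner" "$dir" || return 1
    # Only the owner needs access; uploads buffered here can contain build artifacts.
    chmod 700 "$dir"
}
```

For the directive above, run as root, this would be something like: ensure_client_temp /data/nginx/client_temp nginx:nginx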