02:37:00  * joaocgreis quit (*.net *.split)
02:46:44  * joaocgreis joined
06:48:22  * rmg quit (Read error: Connection reset by peer)
06:48:58  * rmg joined
08:17:30  * dawsonm quit (Remote host closed the connection)
09:19:59  <jbergstroem>rvagg: any updates re binaries?
10:50:14  <rvagg>jbergstroem: shaving yaks
10:50:31  <rvagg>joaocgreis: node-msft-win10-5 is full, can you log in and clean it up? it's been knocked out of service by jenkins
10:51:12  <joaocgreis>rvagg: knocked by me, I'm working on it
10:51:42  <rvagg>joaocgreis: ah, ok sorry, also I just cleaned up the two 2008 machines on rackspace, one was full(ish)
10:52:29  <rvagg>jbergstroem: I've removed all of the iojs-* labels
10:55:17  <rvagg>jbergstroem: which smartos machines should I use for builds? the ones labelled smartos13-32 and smartos13-64?
10:55:28  <rvagg>or are there specific release machines there?
10:57:08  <rvagg>oh, you have iojs-joyent-smartos13.3.1-release, did you want to use this single machine for both binary types?
11:13:32  <rvagg>had to add a `mkdir -p ${HOME}/node-icu` to the jenkins job, I don't know the ip addresses of the new centos or smartos machines to do it manually, but that's ok
11:13:38  <rvagg>https://ci.nodejs.org/job/iojs+release/263/
11:14:08  <rvagg>https://nodejs.org/download/nightly/v5.0.1-nightly2015111347f3735e88/
11:24:20  <rvagg>oh for goodness sake ... centos5-64
11:24:25  <rvagg>Slave went offline during the build
11:24:25  <rvagg>ERROR: Connection was broken: java.io.EOFException
11:24:26  <jbergstroem>rvagg: for test, yep -- use smartos-32/64. i will reprovision them soon. the smartos13.3.1 release machine has gcc 4.7 and is to be used for 0.12.
11:24:42  <jbergstroem>sigh
11:24:52  <jbergstroem>that's a jenkins bug for sure
11:28:56  <rvagg>joaocgreis: can you have a look at https://ci.nodejs.org/job/node-test-commit/1107/ please? both fanned jobs are broken, or are you working on these?
11:29:35  <rvagg>hm, also, only windows and arm are being tested in node-test-commit, what's with that?
11:30:05  <rvagg>our jenkins config is getting way too complicated
11:33:54  <rvagg>jbergstroem: https://ci.nodejs.org/job/iojs+release/263/nodes=smartos13-release/console ok so I can't use this for > 0.12, should I still be using the previous smartos slaves for that?
11:34:12  <joaocgreis>rvagg: looking at it
11:34:14  <rvagg>can you update me on what the plan is for smartos slaves and how they are supposed to be used? I'm a little confused
11:36:14  <jbergstroem>rvagg: i created smartos13.3.1-release since we didn't have a reproducible way to create 0.12 releases, especially now that the machine old nodejs used to bake theirs with is gone. after that i created a new smartos release machine called nodejs-release-joyent-smartos153-64-1, intended for newer (gcc 4.8+) releases. i'm happy to try that for 0.12 too but not sure what will break. i will soon retire our oldest smartos test slaves and replace them with newer ones based on gcc 4.8
11:36:57  <joaocgreis>Job status: [node-compile-windows] subjob has no changes since last build.
11:36:59  <rvagg>okok, so single release machine for each of <=0.12 and >0.12?
11:37:26  <jbergstroem>that's how we were doing it before
11:37:49  <jbergstroem>at the moment i'm just trying to replicate what we have. i reckon we've got enough flak for this week's wiggle :)
11:38:10  <rvagg>joaocgreis: I just pushed a totally new commit to https://github.com/nodejs/node/pull/3399, it hasn't been through jenkins yet
11:38:47  <rvagg>jbergstroem: prior we were using two machines for smartos releases, one for 64 and one for 32, we do this with other architectures, doubling up is something we haven't had to do yet
11:38:51  <jbergstroem>rvagg: regarding complexity: i agree it gets overwhelming when you start digging, but after you have a few hours under your belt it starts to make sense.
11:39:02  <jbergstroem>rvagg: thing is, both were 64-bit
11:39:04  <rvagg>I can do it, it'll slow down releases, but arm builds are the slowest part so it's not a big deal
11:39:16  <rvagg>jbergstroem: yeah, that's fine but having two means we could do them in parallel
11:39:42  <jbergstroem>rvagg: happy to add a real -32 going forward
11:39:56  <rvagg>jbergstroem: do we have guidance from joyent re the amount of resources we can use?
11:40:18  <rvagg>lemme try doubling up, smartos builds have been pretty quick in the past
11:40:23  <jbergstroem>rvagg: i asked jgi about spawning a few, no restrictions besides "plx don't delete old stuff".
11:40:32  <rvagg>cool
11:40:39  <jbergstroem>rvagg: skip the auth key stuff (we're using that shared key nowadays)
11:41:05  <jbergstroem>rvagg: also, i'm a bit more careful about how much resources we allocate per vm. most of the stuff I add is 2vcpu 2g ram
11:41:09  <jbergstroem>which is enough for our needs
11:41:34  <jbergstroem>rvagg: want me to create the machine at joyent? i'm already logged in
11:42:01  <rvagg>jbergstroem: not yet, release machines are not used much so I'm fine with doubling up unless it ends up taking too long
11:42:20  <jbergstroem>rvagg: ok
11:42:40  <jbergstroem>rvagg: i don't think it will -- we don't run the test suite on release, right?
11:42:45  <rvagg>this complexity is making me grumpy, nothing I'm trying fully works, tests, release builds
11:42:50  <rvagg>no we don't
11:42:57  <jbergstroem>rvagg: then it'll be super quick, ccache and all
11:43:12  <rvagg>ccache isn't as impressive for release builds, full of misses
11:43:53  <jbergstroem>rvagg: sure, but if we cut weeklies from 5.x etc it'll start counting. we should bump the ccache size on the release bots (default 2g)
11:44:18  <rvagg>there's something about the compile defs that cause more misses on release builds
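(A minimal sketch of the ccache bump being discussed, assuming the standard ccache CLI is available on the release bots; the 10G figure is illustrative, not an agreed size:)

    # raise the cache limit from the ~2G default and confirm it took
    ccache --max-size=10G   # older ccache versions: ccache -M 10G
    ccache -s               # prints stats, including the new max cache size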
11:46:41  <rvagg>jbergstroem: I've changed the new smartos 14 labels to "smartos14-release post-1-release" and the smartos 13 ones to "smartos13-release pre-1-release". the "-release" bit is important in these labels, otherwise the machines get used for test runs and can't be singled out for release builds; the "pre-1-release" and "post-1-release" labels differentiate <= 0.12 from > 0.12 builds. a bunch of machines have both labels but some have only one
11:46:59  <jbergstroem>new ones are 15.3.0
11:47:07  <jbergstroem>ah you're talking about the old test ones
11:47:18  <jbergstroem>i'm not touching labels for now :)
11:48:03  <jbergstroem>i usually duplicate an old one to get the correct label when i create a new one
11:49:15  <rvagg>oh, 15.3.0 .. that's what "smartos153-64-1" means?
11:50:30  <jbergstroem>well no other os had a dot prefix
11:50:39  <rvagg>ok, changed it to smartos15-release
11:50:42  <jbergstroem>for version
11:51:04  <jbergstroem>careful if you change node names (feel free), since you have to update id in init scripts
11:51:13  <jbergstroem>(and ansible)
11:52:22  <rvagg>just labels, names are unimportant
11:57:16  * orangemocha joined
11:58:20  <rvagg>joaocgreis: do node-test-pull-request & node-test-commit look at commit shas? that run was on a PR with a commit I amended, although I don't think I ever ran CI on it previously anyway. the fact that only windows & arm ran (and both crapped themselves) means something's wrong
11:59:02  <jbergstroem>names are unimportant you say! bah. ask phil karlton
11:59:11  <jbergstroem>rvagg: could it be broken because the binary upload stuff isn't working? (guessing)
11:59:38  <rvagg>it didn't even run node-test-commit-linux et al., just node-test-commit-arm and node-test-commit-windows, the rest were ignored
11:59:53  <rvagg>and yes, binary upload or cross-compile / fanned / whatever seems to be broken
12:00:28  <joaocgreis>rvagg: those checks should be all disabled, looking into it now
12:00:29  <rvagg>i.e. it's all broken, everything is terrible, all software is full of fail, I'm feeling stabby
12:00:40  <jbergstroem>i've seen node-test-pull-request spawn node-test-commit multiple times
12:00:48  <jbergstroem>(inc freebsd, linux, et al)
12:01:47  <rvagg>has fanned windows actually bought us anything in terms of speed? I'm concerned that the complexity of cross-compile + fanned is far outweighing any gains, even on arm
12:01:59  <jbergstroem>it has
12:02:11  <jbergstroem>but not even close to arm
12:02:33  <rvagg>and the $^*@^$*#ing Windows linker locking up machines so consecutive builds fail unless you wait long enough is driving me mad
12:02:51  <jbergstroem>i think - as with other parts of build - it's a bit bottlenecked when it comes to housekeeping or putting out fires
12:03:31  <jbergstroem>bus factor is real
12:03:48  <rvagg>and wth is with the response times @ https://ci.nodejs.org/computer/? they're all around ~20000ms
12:03:57  <jbergstroem>that's always been bullshit
12:04:22  <jbergstroem>jenkins is really showing its ugly side
12:04:50  <joaocgreis>damned thing, this must be a bug in jenkins! the "Build only if SCM changes" option is there in test-commit, it is disabled for every sub-job, but that run of test-commit behaved as if it were enabled!
12:05:31  <joaocgreis>we need Jenkins LTS..
12:05:32  <jbergstroem>i'm going to attempt a night's sleep for once. will be back in ~7h. but before i leave -- do you need me for anything?
12:05:48  <jbergstroem>joaocgreis: it doesn't exist
12:06:09  <jbergstroem>btw have a look at the irq load at our CI once jobs get going =D
12:06:29  <rvagg>Caused by: java.lang.NoClassDefFoundError: Could not initialize class com.sun.proxy.$Proxy10
12:06:41  <rvagg>it's as if Jenkins is attempting to throw every possible error at us in one day
12:06:45  <joaocgreis>rvagg: about the complexity of fanned: the w10 bots took 1h to build, that's why I made the windows-fanned
12:06:46  <rvagg>piece of garbage
12:07:09  <joaocgreis>jbergstroem: under advanced
12:07:14  <joaocgreis>for each job
12:08:06  <jbergstroem>i'll do the smartos and centos 6,7 test bots tomorrow then. let me know if you need anything else done.
12:08:50  <joaocgreis>jbergstroem: nothing from me, have a good night's sleep
12:09:26  <jbergstroem>buildbot would probably slow us down for a bit
12:09:29  <jbergstroem>but i wouldn't rule it out.
12:16:00  <rvagg>now even more slaves are failing with the stupid NoClassDefFoundError exception, it's contagious
12:17:41  <jbergstroem>haven't seen that before
12:17:48  <jbergstroem>im off, sorry
12:18:41  <rvagg>thanks jbergstroem, sleep well
12:41:25  <rvagg>jbergstroem, joaocgreis: Jenkins LTS _does_ exist https://wiki.jenkins-ci.org/display/JENKINS/LTS+Release+Line
12:50:18  <rvagg>jbergstroem: jenkins slave on nodejs-release-digitalocean-centos5-64-1 keeps on dying during build and not restarting .. there's zilch in the log file to indicate a problem
12:54:05  <rvagg>and jbergstroem, shouldn't we be using the oldest possible smartos for building >0.12? or are we able to solve libc compatibility problems with the machine you've provisioned?
12:54:29  <rvagg>perhaps you could test the binaries @ https://nodejs.org/download/nightly/v5.0.1-nightly2015111384bb74547d/ on an older smartos machine
12:57:14  <rvagg>yeah, something sus about that centos5-64 release machine, can't get through a build and can't even connect to it now https://ci.nodejs.org/job/iojs+release/272/nodes=centos5-release-64/console
13:17:23  <orangemocha>something is seriously messed up
13:17:32  <orangemocha>I wonder if this could be the cause: https://ci.nodejs.org/administrativeMonitor/OldData/manage
13:23:31  <orangemocha>note that it shouldn't be polling SCM in the first place
13:30:52  <rvagg>more progress on 0.12 builds https://ci.nodejs.org/job/iojs+release/274/ -> https://nodejs.org/download/nightly/v0.12.8-nightly201511139d12c3b135/
13:31:36  <rvagg>that centos5-64 machine tho
19:30:35  <jbergstroem>rvagg: java was ded
19:32:50  <jbergstroem>i'll try and limit the amount of ram java consumes, see if it changes anything
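(A sketch of the kind of cap being talked about here, assuming a JNLP-launched slave; the path, node name, secret and 512m figure are placeholders, not the real values on these hosts:)

    # hypothetical slave launch with an explicit heap ceiling so a runaway
    # JVM can't take the whole box down
    java -Xmx512m -jar /home/iojs/slave.jar \
      -jnlpUrl https://ci.nodejs.org/computer/EXAMPLE-NODE/slave-agent.jnlp \
      -secret <agent-secret>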
19:42:07  <joaocgreis>jbergstroem, rvagg: can you delete the node-test-binary-arm and node-test-binary-windows directories in the master workspace? The new problem seems to be related to history
19:42:56  <jbergstroem>the entire folder? wouldn't that kill the jobs as well?
19:46:16  <joaocgreis>jbergstroem: in the workspace, the git checkout folder
19:46:26  <joaocgreis>that'll be recreated next run
19:46:53  <jbergstroem>there are 3 workspace folders in -windows
19:46:57  <jbergstroem>kill all?
19:47:37  <joaocgreis>no
19:47:48  <joaocgreis>just node-test-binary-arm and node-test-binary-windows
19:49:26  <jbergstroem>yes, in jobs/node-test-binary-windows there are three workspace folders, workspace workspace@2 workspace@3
19:51:30  <jbergstroem>just want to make sure
19:51:31  <joaocgreis>i don't know. If they have a .git inside, move them elsewhere
19:52:01  <jbergstroem>they are all git checkouts
19:52:09  <jbergstroem>and the first one is gone now?!
19:52:17  <joaocgreis>then I say move them
19:52:39  <joaocgreis>we'll see if they get recreated
19:52:52  <joaocgreis>in the slaves I've done this often, no problems
19:52:52  <jbergstroem>done
19:53:10  <jbergstroem>arm has 6 subfolders
19:53:21  <jbergstroem>done&done
19:53:31  <joaocgreis>https://ci.nodejs.org/job/node-test-commit/1115/ building, let's see what happens when the test-binaries start
19:56:12  <joaocgreis>no luck. I'll clone them
19:56:56  <joaocgreis>no point in restoring the folders, they seem to work
19:57:05  <joaocgreis>I'd leave them for a day just in case
19:57:41  <jbergstroem>ok
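(Roughly what the clean-up above amounts to on the master, assuming the default Jenkins home layout; the paths and destinations are illustrative:)

    # park the stale git checkouts outside the job dirs instead of deleting
    # them, so they can be restored if Jenkins turns out to miss them
    cd "$JENKINS_HOME/jobs"
    mkdir -p /tmp/stale-workspaces/{windows,arm}
    mv node-test-binary-windows/workspace* /tmp/stale-workspaces/windows/
    mv node-test-binary-arm/workspace*     /tmp/stale-workspaces/arm/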
20:07:08  <joaocgreis>jbergstroem: did you limit the memory usage on the master? jenkins is way slower than it has been in the last hours
20:07:18  <jbergstroem>joaocgreis: not at all
20:07:32  <jbergstroem>just tried two slaves to see if that would help any
20:07:46  <joaocgreis>ok, unrelated then
20:08:25  <joaocgreis>I really have to run, arm is building, going to assume it also fixed windows
20:10:20  <joaocgreis>http://stackoverflow.com/questions/23347077/connection-issue-with-jenkins-slave-on-windows-azure I used the second answer from here, but the slaves failed again. Can you follow the first one to change jenkins.xml on master? Add -Dhudson.slaves.ChannelPinger.pingInterval=2 to the arguments
20:15:53  <jbergstroem>sure
20:16:05  <jbergstroem>i'll wait for completion
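(For reference, a sketch of where that property ends up: it is a JVM argument on the master process, e.g. in the <arguments> line of jenkins.xml on a Windows-service install or in JAVA_ARGS on a Linux master; exact file locations are assumptions:)

    # master launch line with the shorter channel ping interval added
    java -Dhudson.slaves.ChannelPinger.pingInterval=2 -jar jenkins.war --httpPort=8080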
20:29:02  <jbergstroem>here's something else I haven't seen before; "write error": https://ci.nodejs.org/job/node-test-commit-linux/1189/nodes=fedora23/console
20:29:12  <jbergstroem>not sure if related, but dmesg: [233862.809425] traps: node[8412] trap invalid opcode ip:e18769 sp:7ffe182a7a58 error:0 in node[400000+de5000]
21:04:38  <jbergstroem>joaocgreis: changed to 2 now.
21:06:54  <jbergstroem>joaocgreis: just restarted and now see these errors; checked that both plugins are loaded: https://ci.nodejs.org/administrativeMonitor/OldData/manage
21:25:20  <jbergstroem>rvagg: should we really use the smartos15 for 0.12 nightlies?
21:39:19  * orangemocha quit (Ping timeout: 240 seconds)
21:46:26  * orangemocha joined
23:04:40  <rvagg>jbergstroem: it's the same build job; if you look at the jenkins setup, it checks the node's labels, and if the node doesn't have "pre-1-release" it doesn't do anything to contribute to the build. the reverse happens for >0.12, where it checks for "post-1-release"
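(A sketch of the sort of label gate described here, as it might look in a shell build step; the real job config may implement it differently, and the echo text is made up:)

    # only nodes carrying the pre-1-release label contribute to <= 0.12 builds;
    # everything else bails out early ($NODE_LABELS is provided by Jenkins)
    case " $NODE_LABELS " in
      *" pre-1-release "*)
        echo "building <= 0.12 release binaries on $NODE_NAME"
        # ... real build/upload steps go here ...
        ;;
      *)
        echo "no pre-1-release label, nothing to do"
        exit 0
        ;;
    esac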
23:05:30  <rvagg>jbergstroem: how are we giving access to the jenkins server? can we put joaocgreis on there too?