00:00:01  <evanlucas>true
00:00:03  <rvagg>we are effectively colo now
00:00:45  <evanlucas>rvagg so what would be the ideal outcome of this situation in your opinion? dedicated os x machines?
00:01:10  <rvagg>ideal is that, yea, but ideal is also that we have lots of combinations of osx versions
00:01:24  <jbergstroem>yeah its all about getting access to osx versions
00:01:27  <evanlucas>10.7 - 10.11?
00:02:15  <jbergstroem>pretty much. we've actually had 10.5 set as target for 4.x but i think we've gone with 10.{6,7} as semi-official. 10.6 was the libc landing no?
00:02:25  <evanlucas>would "we" host them or someone else?
00:02:36  <jbergstroem>evanlucas: prefer someone else
00:02:42  <evanlucas>ok
00:03:19  <jbergstroem>evanlucas: bit more on requirements here: https://github.com/nodejs/build/issues/367
00:03:43  <evanlucas>ahhh couldn't find the issue
00:03:46  <evanlucas>thanks
00:03:57  <jbergstroem>👌🏻
00:06:41  <thealphanerd>travis does osx 9 -> 11 https://docs.travis-ci.com/user/osx-ci-environment/#OS-X-Version
00:08:41  <jbergstroem>NIH sorry! :-D have you tested it? what kind of horsepower are we talking about
00:14:55  <thealphanerd>Well I bet we could get in touch with the people from travis
00:18:15  <jbergstroem>i like the idea of a homogeneous build farm. less mixing and matching, etc -- but if we are out of options i'd consider it
00:18:32  <thealphanerd>who knows what they might let us do
00:18:43  <thealphanerd>we could check out sauce labs too
00:19:23  <thealphanerd>they do iOS testing so they probably have osx machines
00:33:33  <rvagg>we discussed both travis and sauce at the last TSC meeting, not sure if we made notes about it but someone was going to reach out to at least travis
00:33:44  <rvagg>not sure sauce have good infra for letting anyone else in to use their stuff
01:02:45  * Fishrock123 joined
01:04:59  * Fishrock123 quit (Client Quit)
01:14:25  * chorrell joined
01:30:11  * chorrell quit (Quit: Textual IRC Client: www.textualapp.com)
05:54:41  <phillipj>we've just got the first inline jenkins build status --> https://github.com/nodejs/node/pull/6674 😎
05:57:54  * node-gh joined
05:57:55  * node-gh part
06:03:16  <jbergstroem>phillipj: what build was it
06:03:45  <jbergstroem>perhaps the bot should post a comment about a submitted build [if no one else has]
06:04:22  <jbergstroem>ah you didn't invoke test-pr, just test-commit?
06:04:33  <jbergstroem>or just linter?
06:05:08  <jbergstroem>found it.
06:12:06  <phillipj>yupp, triggered the linter directly
06:12:56  <jbergstroem>i'll add a few more. optimally we should test from test-pr and not a subjob
06:14:24  <phillipj>alright
06:17:05  <phillipj>if we want it to comment in the PR about builds, we would have to include PR # in the curl payload
06:19:01  <jbergstroem>we already collect that from test-pr
06:40:15  <jbergstroem>ok; added POST_STATUS_TO_PR check for all jobs
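A minimal sketch of what such a status-push step could look like, assuming a direct call to GitHub's statuses API (the bot endpoint the jobs actually curl isn't shown in this log, and $TOKEN is a placeholder credential; POST_STATUS_TO_PR and BUILD_URL are the job checkbox and Jenkins' built-in variable):

    # skip entirely unless the POST_STATUS_TO_PR checkbox was ticked
    if [ -n "$POST_STATUS_TO_PR" ]; then
      SHA=$(git rev-parse HEAD)
      curl -s -H "Authorization: token $TOKEN" \
           -d "{\"state\": \"pending\", \"context\": \"Jenkins\", \"target_url\": \"$BUILD_URL\"}" \
           "https://api.github.com/repos/nodejs/node/statuses/$SHA"
    fi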
06:48:36  <jbergstroem>https://github.com/nodejs/node/pulls
06:48:43  <jbergstroem>seems to be a few pr's with pending checks?
07:10:28  <phillipj>yupp.. by looking at the logs, it seems like https://github.com/nodejs/node/pull/6674 is the only one who got "success" pushed from jenkins
07:12:09  <phillipj>could it be that linter only pushes "success" if it was triggered directly?
07:12:27  <jbergstroem>no it pushes on finish
07:12:28  <phillipj>not only success though... the completion status would be more correct
07:12:31  <jbergstroem>either fail or success
07:12:35  <phillipj>hm
07:14:48  <phillipj>ahh, there's a couple of errors while pushing github statuses here
07:14:59  <jbergstroem>on unstable (flaky) or success i post success
07:15:10  <jbergstroem>on fail, not built or aborted i post failure
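That mapping as a small shell sketch; the BUILD_RESULT variable name is an assumption, not necessarily what the job script uses:

    case "$BUILD_RESULT" in                       # hypothetical variable name
      SUCCESS|UNSTABLE)          STATE=success ;; # unstable = only flaky tests failed
      FAILURE|NOT_BUILT|ABORTED) STATE=failure ;;
    esac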
07:15:17  <phillipj>e.g: https://github.com/nodejs/node/pull/6683
07:15:23  <phillipj>* nodejs/node/8ce7305 Jenkins / Github PR status updated to 'pending'
07:15:32  <phillipj>! nodejs/node/4a92d22 Error while updating Jenkins / GitHub PR status { [Error: {"message":"No commit found for SHA: 4a92d2213739cf46df1801fc5f8a5a1ed3474de0" .....
07:15:43  <phillipj>those might be related?
07:16:17  <phillipj>any chance the sha is different from start to completion?
07:16:18  <jbergstroem>should we really let this loose on the main repo? I've defaulted the tests so you have to set the checkbox for it to run
07:16:41  <jbergstroem>it might be if using rebase_onto
07:16:45  <jbergstroem>i can move the step post that
07:16:53  <jbergstroem>just did.
07:17:20  <phillipj>goodie
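A sketch of why rebase_onto breaks sha reporting: the rebase rewrites every commit, so only the pre-rebase head of the fetched ref is guaranteed to exist on GitHub. GIT_REMOTE_REF and REBASE_ONTO are the job parameters that appear later in this log; the exact step jbergstroem moved isn't shown, so treat this as illustrative:

    git checkout -f "$GIT_REMOTE_REF"   # the PR ref as fetched from GitHub
    SHA=$(git rev-parse HEAD)           # a sha GitHub can still resolve
    if [ -n "$REBASE_ONTO" ]; then
      git rebase "$REBASE_ONTO"         # HEAD now points at rewritten commits
    fi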
07:19:19  <phillipj>afaik there are two PRs that have gotten stale at "pending", maybe we should trigger a new pr-test on both of them, to see if your fix is working?
07:19:38  * sgimeno joined
07:22:25  <jbergstroem>sure
07:22:44  <jbergstroem>i think this is a good example of why - in time - the gh bot kind of needs to keep state of pending runs
07:22:51  <jbergstroem>and why we'll ultimately have to poll jenkins as well
07:26:41  <phillipj>yeah I'm guessing jobs might suddenly get stuck as well? meaning we cannot rely on jobs *always* pushing completion status
07:29:23  <jbergstroem>most often they enter failed, be it cancelled etc
07:29:37  <jbergstroem>if i restart master during a job it can be lost, yes.
07:29:54  <jbergstroem>after we've sorted this out i'll add other runners
07:30:00  <jbergstroem>just wanna make sure we don't create pending stuff
07:31:50  <jbergstroem>re-running the other pending
07:32:01  <jbergstroem>https://github.com/nodejs/node/pull/6683
07:32:59  <jbergstroem>ok worked
07:36:47  <phillipj>nice
07:37:43  <phillipj>got this after your fix though: ! nodejs/node/79b0000 Error while updating Jenkins / GitHub PR status { [Error: {"message":"No commit found for SHA: 79b000029eb8a7f18fea12cd6edcad0f42749873
07:37:50  <phillipj>not sure what PR # that belongs to
07:43:57  <jbergstroem>fatal: bad object 79b000029eb8a7f18fea12cd6edcad0f42749873
07:44:00  <jbergstroem>can't find it in my repo
07:44:27  <jbergstroem>you should be getting a very small amount of incoming; if not the conditional isn't working.
07:46:41  <jbergstroem>all i do is git rev-parse HEAD
07:47:03  <jbergstroem>i don't see any changes between the curls
07:48:22  <jbergstroem>found another pending: https://github.com/nodejs/node/pull/6669
07:48:25  <jbergstroem>will run it
07:50:06  <phillipj>absolutely, it's not many
07:52:16  <phillipj>4 updates from 2 different jobs the last 20 minutes
08:16:51  <jbergstroem>phillipj: ok
08:16:54  <jbergstroem>so let me enable freebsd
08:17:25  <jbergstroem>(done)
08:20:26  <phillipj>nice
08:29:35  <jbergstroem>added fbsd, arm, arm-fanned so far
09:13:23  <jbergstroem>hm
09:13:28  <jbergstroem>ccache isn't working on ubuntu16
09:15:38  <jbergstroem>path seems changed for some reason
09:19:21  <jbergstroem>i think they've changed how environment= is parsed ..
09:24:22  <jbergstroem>fixed
09:29:14  * node-gh joined
09:29:14  * node-gh part
10:05:41  <jbergstroem>ok we're down to <9min for linux now
10:05:48  <jbergstroem>fixed one host missing JOBS
10:05:53  <jbergstroem>reckon we'll end up around 6mins
10:13:19  * node-gh joined
10:13:19  * node-gh part
10:24:40  * node-gh joined
10:24:41  * node-gh part
10:27:24  * thealphanerd quit (Quit: farewell for now)
10:27:45  * thealphanerd joined
11:02:23  * node-gh joined
11:02:23  * node-gh part
11:58:13  * node-gh joined
11:58:13  * node-gh part
12:13:15  * node-gh joined
12:13:16  * node-gh part
12:21:18  * node-gh joined
12:21:19  * node-gh part
12:22:18  <jbergstroem>phillipj: https://github.com/nodejs/node/pull/6674
12:22:22  <jbergstroem>should have a few more testers now
12:22:31  <jbergstroem>i'm not quite done; still $real_lyf in the way
12:23:26  <jbergstroem>doesn't look like its working: + [ -n '' ]
12:24:14  * node-gh joined
12:24:15  * node-gh part
12:24:33  * node-gh joined
12:24:33  * node-gh part
12:25:26  <jbergstroem>i think i have to read up how we parameter pass again
12:25:31  <jbergstroem>*on how
12:26:48  * node-gh joined
12:26:48  * node-gh part
12:26:59  * node-gh joined
12:27:00  * node-gh part
12:27:43  * node-gh joined
12:27:43  * node-gh part
12:40:36  <jbergstroem>note to self: if you can't find how something works in jenkins, there's likely an 'advanced' button you've forgot to click
12:41:11  <joaocgreis>rvagg: git should make it optimal by downloading only the diffs, but I think the jenkins git plugin sometimes downloads all the remote refs even if I explicitly told it not to do so
12:41:46  <joaocgreis>rvagg: are all the arms in your office? can we use your shared disk and cross compile in the armv8s?
12:42:30  <joaocgreis>rvagg: if you can create a shared folder with the same path in all machines, I can play with it
12:42:40  <jbergstroem>joaocgreis: would we be able to support JOBS in windows for the test suite?
12:45:31  <joaocgreis>jbergstroem: if I'm understanding it well, it should work, I've used -J locally
12:45:56  <jbergstroem>joaocgreis: perfect. so what we need to do is assign JOBS through environment to all windows hosts
12:46:02  <jbergstroem>then i might have to PR the vcbuild.bat stuff
12:47:39  <joaocgreis>jbergstroem: note that we have some 8 core machines and also some single core ones. The single core snails are very useful to detect some flaky tests, we've seen a lot of things only failing there
12:48:02  <joaocgreis>so JOBS will have to be different for different machines
12:48:23  <joaocgreis>are you planning to set it in the jenkins slave config?
12:49:05  <jbergstroem>joaocgreis: it is set in each jenkins environment, yeah
12:49:08  <jbergstroem>joaocgreis: expanding like so:
12:49:34  <jbergstroem>https://github.com/jbergstroem/build/blob/f91fbd4d9daee74b48cc41701f9a3e21156b2bd1/setup/ubuntu16.04/resources/jenkins.service.j2#L14
12:49:49  <jbergstroem>joaocgreis: systemd for instance doesn't support forking out processes in its init scripts so we can't use getconf
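In other words, systemd's Environment= takes literal key=value pairs and performs no command substitution, so the per-host core count has to be baked into the unit template (the jenkins.service.j2 linked above, presumably rendered per host by the configuration tooling); roughly:

    # inside a systemd unit this line would NOT be expanded:
    #   Environment="JOBS=$(getconf _NPROCESSORS_ONLN)"
    # under a plain shell init the equivalent computation works fine:
    JOBS=$(getconf _NPROCESSORS_ONLN)   # online CPU count
    export JOBS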
12:52:29  <rvagg>joaocgreis: /home/iojs/.ccache is shared across all of the machines, perhaps try and make a new subdirectory under there to experiment, I'm a bit afraid of what would happen with concurrent updates though
12:53:17  <rvagg>joaocgreis: we do have git mirrors in /home/iojs/git/, I could add one for the binary repo and add a cron job to update it every day, seems excessive though when we're simply trying to transfer big binary blobs
12:53:46  <rvagg>joaocgreis: note that .ccache dir is shared across _all_ machines, so if you do anything arch specific then be careful
12:55:07  <joaocgreis>rvagg: I'll try .ccache/jenkins or similar. I'll use the job name, and make sure there's only one writer
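A minimal sketch of that scoping, assuming ccache's standard CCACHE_DIR override and Jenkins' built-in JOB_NAME variable:

    # one cache (and thus one writer) per job on the shared mount
    export CCACHE_DIR="/home/iojs/.ccache/${JOB_NAME}"
    mkdir -p "$CCACHE_DIR"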
13:16:08  * node-gh joined
13:16:08  * node-gh part
13:17:19  <jbergstroem>ok; hopefully we'll have a working jenkins <> github thing going tomorrow. everything is hidden behind that checkbox "post_to_pr" something; but i suspect i'm not getting the right commit sha's at all stages, so just leave it for now.
13:30:18  * node-gh joined
13:30:19  * node-gh part
13:36:31  * rmg quit (Remote host closed the connection)
13:37:07  * rmg joined
13:38:16  * node-gh joined
13:38:16  * node-gh part
13:40:46  * node-gh joined
13:40:46  * node-gh part
13:41:40  * rmg quit (Ping timeout: 252 seconds)
13:41:51  <jbergstroem>joaocgreis: do you know a reliable way to get the git sha from a script execution as part of a jenkins job? I've tried using git rev-parse HEAD below the part where we check out the repo to no avail.
13:48:46  <joaocgreis>jbergstroem: won't rev-parse fail only if there is no commit, meaning that the clone failed?
14:00:49  * Fishrock123 joined
14:14:33  <jbergstroem>joaocgreis: it will fail if there's no commit otherwise just show latest from $branch
14:15:33  <joaocgreis>jbergstroem: so why does it fail?
14:15:44  <jbergstroem>joaocgreis: i just get inconsistent commits (i think)
14:15:46  <jbergstroem>or null
14:15:50  <jbergstroem>anyway, too late over here
14:15:51  <jbergstroem>ttyl
14:16:12  <joaocgreis>there are a few in the rebase scrips, haven't seen them fail
14:16:18  <joaocgreis>ok, good night!
14:37:37  * rmg joined
14:42:03  * rmg quit (Ping timeout: 240 seconds)
15:20:28  * node-gh joined
15:20:29  * node-gh part
15:29:57  * chorrell joined
15:50:14  * node-gh joined
15:50:14  * node-gh part
16:17:35  * rmg joined
16:18:30  * chorrell quit (Quit: My Mac has gone to sleep. ZZZzzz…)
16:56:56  <Fishrock123>is the windows fanned ci stuck?
16:57:17  <Fishrock123>2h 6m ago still running https://ci.nodejs.org/job/node-test-binary-windows/2077/
17:02:43  <joaocgreis>Fishrock123: it is, thanks for the warning. Investigating, I'll unblock it asap
17:21:46  * node-gh joined
17:21:47  * node-gh part
20:26:12  * chorrell joined
20:33:00  * chorrell quit (Quit: Textual IRC Client: www.textualapp.com)
21:46:29  <jbergstroem>great!
21:46:39  <jbergstroem>kids: don't restart jenkins, ever. even if there's a security update.
22:01:02  <Fishrock123>OH
22:01:13  <Fishrock123>so that's what happened
22:15:22  * Fishrock123 quit (Remote host closed the connection)
22:30:32  * evanlucas quit (Remote host closed the connection)
22:52:28  <Trott>Heh. No idea if this ever worked before because I never had a reason to use it, but I just tried to stress test on all platforms and most failed instantly. https://ci.nodejs.org/job/node-stress-single-test/714/
22:52:48  <Trott>Raspberry Pis look like they might run. /shrug
22:55:35  <Trott>Doing a more typical stress test on just one Windows platform fails instantly too. I'm going to go away for a while and let jbergstroem finish having Jenkins reboot/upgrade ruin his day before typing more in here and making it worse. :-/
22:57:14  * Fishrock123 joined
22:57:29  <rvagg>DON'T ROCK THE BOAT!
22:58:17  <rvagg>Trott: looks like config problems, you need to pass GIT_ORIGIN_SCHEME, GITHUB_ORG and REPO_NAME
22:58:18  <jbergstroem>hm
22:58:20  <rvagg>at least
22:58:41  <jbergstroem>yeah its not inheriting/getting lookups
22:59:44  <Trott>Those three things are filled in on the form...
23:00:01  <jbergstroem>https://ci.nodejs.org/job/node-stress-single-test/714/parameters/
23:00:12  <jbergstroem>hangon
23:00:15  <jbergstroem>i wonder
23:00:23  <jbergstroem>ok i think i know whats going on
23:00:24  <Trott>Ah, so no parameters at all are being passed, nothing specific to those three.
23:00:26  <Trott>Cool.
23:00:44  <jbergstroem>https://wiki.jenkins-ci.org/display/SECURITY/Jenkins+Security+Advisory+2016-05-11
23:00:49  <jbergstroem>read first
23:01:09  <Trott>Oh, it's a feature, not a bug. :-P
23:01:10  <jbergstroem>that list is pretty floating for us
23:01:25  <jbergstroem>rvagg: you ok with enabling all passed env vars?
23:01:28  <jbergstroem>(i am)
23:01:41  <jbergstroem>hm, we should probably document that decision somewhere
23:01:50  <rvagg>eh? what is that?
23:02:00  <jbergstroem>so they deemed passing env vars unsafe
23:02:06  <jbergstroem>so we either define safe vars
23:02:10  <jbergstroem>or pass hudson.model.ParametersAction.keepUndefinedParameters=1
23:02:14  <jbergstroem>(which was previous behaviour)
23:04:01  <jbergstroem>I guess I can start collecting a list
23:04:16  <jbergstroem>there'll be a lot of "wtfs" from us jenkins admins down the road though :)
23:04:40  * Fishrock123 quit (Remote host closed the connection)
23:05:55  <rvagg>hum, I really don't know the full implications of this, is this only for env vars that we define in job and slave configs?
23:07:17  <jbergstroem>well if plugins are adding stuff we rely on in a similar fashion that'd be included.
23:07:25  <jbergstroem>but i don't know what every job we have expects
23:07:30  <jbergstroem>here's the shortlist so far: CERTIFY_SAFE,TARGET_GITHUB_ORG,TARGET_REPO_NAME,PR_ID,POST_STATUS_TO_PR,REBASE_ONTO,GIT_REMOTE_REF,NODES_SUBSET,IGNORE_FLAKY_TESTS,TEMP_REPO
23:07:41  <jbergstroem>how about we restart with that list and see if we get a test-pr going?
23:07:55  <jbergstroem>likely breaking ci-release too..
23:09:40  <jbergstroem>testing.
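The linked advisory leaves the two escape hatches discussed above, both Java system properties set at Jenkins startup; a sketch assuming a plain java -jar launch (the log quotes the keep-undefined property with =1, the advisory documents =true):

    # restore the old pass-everything behaviour:
    java -Dhudson.model.ParametersAction.keepUndefinedParameters=true -jar jenkins.war
    # or whitelist only the parameters the jobs are known to use:
    java -Dhudson.model.ParametersAction.safeParameters=CERTIFY_SAFE,TARGET_GITHUB_ORG,TARGET_REPO_NAME,PR_ID,POST_STATUS_TO_PR,REBASE_ONTO,GIT_REMOTE_REF,NODES_SUBSET,IGNORE_FLAKY_TESTS,TEMP_REPO \
         -jar jenkins.war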
23:10:01  * jenkins-monitor1 quit (Remote host closed the connection)
23:10:01  * jenkins-monitor quit (Remote host closed the connection)
23:10:10  * jenkins-monitor1 joined
23:10:11  * jenkins-monitor joined
23:10:28  <jbergstroem>i reckon we should just update to 2.0 while at it. i mean, can't be any more breaking changes right?
23:10:40  <jbergstroem>new ui, etc.
23:13:11  <jbergstroem>https://ci.nodejs.org/job/node-test-commit/3290/
23:13:56  <jbergstroem>:'(
23:14:21  <jbergstroem>this is new: https://gist.github.com/jbergstroem/369f02d10ba8643989f89c0c0ea90143
23:15:22  <jbergstroem>looks like we're passing variables at least
23:15:45  <rvagg>eeek
23:16:27  <jbergstroem>ok
23:16:48  <jbergstroem>so;
23:16:49  <jbergstroem>1. api was updated somewhere somehow; we update github* plugins and close our eyes
23:16:49  <jbergstroem>2. they're relying on ENV and we're in for a treat
23:16:51  <jbergstroem>suggesting 1
23:17:17  <jbergstroem>i guess 2. could be tested with just disabling the env stuff
23:19:38  <thealphanerd>jbergstroem citgm is using a bunch of env vars based on the pattern observed in the release job
23:19:43  <thealphanerd>fwiw
23:19:59  <jbergstroem>thealphanerd: i managed to grep out warnings in the jenkins logs about these vars so I reckon I'll be picking them out as we go
23:20:20  <thealphanerd>cool. Let me know what I can do or what I can get for you that can help
23:20:58  <jbergstroem>just think if me singing along to https://www.youtube.com/watch?v=rIE2GAqnFGw would likely help
23:21:17  <jbergstroem>yees it is!
23:22:41  <jbergstroem>so passing variables doesn't seem to work so well. great.
23:25:40  <jbergstroem>Do I just shut this guy down access-wise for a bit?
23:27:32  <rvagg>wasn't me
23:28:42  <jbergstroem>ok
23:28:45  <jbergstroem>looks like we have progress now
23:30:19  <jbergstroem>but we're still running into the git auth issue at the end of a run
23:31:57  <jbergstroem>attempting to update plugins related to this
23:36:25  <thealphanerd>gl
23:37:50  <jbergstroem>same
23:37:50  <jbergstroem>INFO: node-test-commit-windows-fanned #2496 main build action completed: FAILURE
23:42:21  <jbergstroem>last bet is disabling env behaviour to at least see if we're running into the same.
23:42:31  <jbergstroem>asked in #jenkins on irc if anyone else is seeing the same.
23:42:54  <jbergstroem>a few jobs seems to be passing: INFO: node-test-commit-linux/nodes=centos6-64 #3336 main build action completed: SUCCESS
23:43:45  <jbergstroem>ah it passes but fails with the same error, never mind.
23:53:39  <jbergstroem>ok. established that the github auth thing is not related to env juggling
23:59:10  <jbergstroem>https://github.com/jenkinsci/jenkins/compare/jenkins-1.651.1...jenkinsci:jenkins-1.651.2