00:03:33  * saxby quit (Remote host closed the connection)
00:10:28  * bixu joined
00:12:21  * fredk1 quit (Quit: Leaving.)
00:15:31  * bixu quit (Ping timeout: 272 seconds)
00:15:58  * saxby joined
00:28:40  * mindlace quit (Quit: Textual IRC Client: www.textualapp.com)
00:30:50  * mindlace joined
00:55:48  * saxby quit (Remote host closed the connection)
01:20:00  * ed209 quit (Remote host closed the connection)
01:20:35  * ed209 joined
01:41:35  * marsell quit (Quit: marsell)
02:21:20  * dap_ quit (Quit: Leaving.)
03:00:35  * mindlace quit (Quit: My MacBook Pro has gone to sleep. ZZZzzz…)
03:27:11  * saxby joined
05:42:51  * saxby quit (Remote host closed the connection)
06:35:44  * bsdguru joined
06:43:25  * saxby joined
06:48:07  * saxby quit (Ping timeout: 245 seconds)
06:49:10  * bsdguru quit (Quit: bsdguru)
06:51:37  * ringzero joined
07:01:11  * bsdguru joined
07:09:15  * bsdguru quit (Quit: bsdguru)
07:09:23  * marsell joined
07:45:26  * bixu joined
07:46:00  * bixu quit (Remote host closed the connection)
07:54:18  * ringzero quit
08:17:18  * bixu joined
08:17:30  * bixu quit (Read error: Connection reset by peer)
08:20:00  * bixu joined
08:24:27  * bixu quit (Ping timeout: 255 seconds)
08:26:31  * bixu joined
08:27:54  * bsdguru joined
08:35:22  * bixu quit (Ping timeout: 264 seconds)
08:37:03  * bsdguru quit (Ping timeout: 240 seconds)
08:37:52  * bixu joined
08:37:58  * bixu quit (Remote host closed the connection)
08:45:31  * bsdguru joined
10:20:00  * ed209 quit (Remote host closed the connection)
10:20:36  * ed209 joined
10:22:47  * yruss972 joined
10:45:04  * bixu joined
10:49:48  * bixu quit (Ping timeout: 255 seconds)
10:58:53  * saxby joined
11:03:18  * saxby quit (Ping timeout: 255 seconds)
11:59:37  * saxby joined
12:04:01  * saxby quit (Ping timeout: 256 seconds)
12:34:31  * bixu joined
12:39:03  * bixu quit (Ping timeout: 272 seconds)
12:44:02  * nicholaswyoung joined
13:19:40  <nicholaswyoung>Is there a way to set up a job without the input, and add it later?
13:37:54  <nahamu>I think so
13:38:11  <nahamu>but you need to add the input within a minute or two (I think...)
14:58:03  * ryancnelson joined
15:06:25  * chorrell joined
15:07:00  * nfitch joined
15:07:16  * chorrell quit (Client Quit)
15:11:33  * nfitch quit (Client Quit)
15:12:00  * nfitch joined
15:24:41  * jschmidt quit (Remote host closed the connection)
15:38:03  * fredk1 joined
15:43:52  * jschmidt joined
16:00:02  * chorrell joined
16:28:26  * dap_ joined
16:38:14  * dap_ quit (Quit: Leaving.)
16:40:18  * chorrell quit (Quit: Textual IRC Client: www.textualapp.com)
17:03:47  * nicholaswyoung quit (Quit: Computer has gone to sleep.)
17:09:08  * yruss972 quit (Ping timeout: 252 seconds)
17:10:08  * dap_ joined
17:18:58  * dap_ quit (Quit: Leaving.)
17:29:58  * mindlace joined
17:32:50  * Marc__ joined
17:33:17  <Marc__>can i also ask zfs questions here?
17:33:34  * nicholaswyoung joined
17:35:33  <Marc__>test
17:35:51  <rmustacc>Marc__: You can, but probably something like #illumos or #smartos may be a better venue. But what's up?
17:37:02  <Marc__>i had a long 1-on-1 discussion with Cantrill in SF on June 6th wrt "derived datasets" and ZFS
17:37:26  <Marc__>but in the context of SDC, I'm confused on something
17:38:15  <rmustacc>delegated datasets perhaps?
17:38:16  <Marc__>what i effectively want to do is to be able to share a zfs snapshot or clone across all of the compute nodes under control of the SDC headnode
17:38:34  <Marc__>maybe delegated is the term
17:38:45  <rmustacc>Share how? NFS, iSCSI, something else?
17:38:46  <Marc__>ie
17:38:57  <Marc__>i want to be able to clone a zfs snapshot
17:39:21  <Marc__>and then create zones (VMs) on other servers
17:39:27  <Marc__>in the compute resource pool
17:39:37  <Marc__>and then use the zfs clone
17:39:53  <Marc__>from whatever zpool it was created on
17:40:14  <Marc__>and use that clone on a new zone/vm
17:40:34  <Marc__>i would prefer not to use nfs
17:40:42  <Marc__>as the file sharing mechanism
17:40:44  <Marc__>although
17:41:03  <Marc__>that seems to be an obvious solution
17:41:12  <Marc__>i think that zfs clones
17:41:45  <rmustacc>Well, what are the semantics of the data, is it read-only? Does it need to be consistent in every zone?
17:41:50  <Marc__>should be able to be used seamlessly across the SDC compute resource pools or the aggregation of all zones across all servers
17:41:57  <Marc__>yes read only
17:42:16  <Marc__>for example
17:42:25  <Marc__>we create a chip build
17:42:33  <Marc__>then we want to kick off
17:42:42  <Marc__>a bunch of simulation runs
17:42:56  <rmustacc>And I guess you don't want to use something like manta to just schedule the compute for you?
17:43:00  <Marc__>against that chip build that was created on a zfs dataset
17:43:09  <Marc__>i would consider manta
17:43:27  <Marc__>but i'm not sure how it would work in this case
17:43:54  <rmustacc>Well, you have a large blob of data right? Maybe you frame an entire chip dataset as a tar file, for example.
17:43:56  <Marc__>the chip build is a directory tree of files associated with the chip build
17:44:24  <Marc__>the chip build may consist of 100 files
17:44:25  <rmustacc>Then your manta job just executes by first untarring that and running the compute job against the object and you create one job for each simulation or other checks you have.
17:45:04  <Marc__>this sounds like what i want to do
17:45:10  <Marc__>however
17:45:42  <Marc__>i really want these jobs to be running against an existing file system
17:45:43  <Marc__>ala
17:45:47  <Marc__>zfs dataset
17:45:50  <ryancnelson>a zfs clone is a writable snapshot of a zfs filesystem. writability is what makes it a "clone". read-only is just "a snapshot" … and both reference the zfs filesystem they were cloned/snapped from for data that's not changed since snap-time. The term for "serialized out full-copy of that thing, so you can use it elsewhere or share it" is (basically) "dataset"
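Ryan's terminology, expressed as plain zfs commands (pool and dataset names are made up for illustration):

```shell
zfs snapshot tank/chipbuild@r42              # read-only point-in-time snapshot
zfs clone tank/chipbuild@r42 tank/sim1       # writable clone; unchanged blocks are shared
zfs send tank/chipbuild@r42 > build.zstream  # serialized stream you can move elsewhere
```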
17:46:21  <Marc__>elsewhere is the operative word
17:46:23  <rmustacc>Sure, it's a question of how much you want to build your own orchestration, right?
17:47:04  <rmustacc>And by that I mean, if say, you create an image for a dataset that has your tools and the data, then you have to manually create that on all of the hosts and cause jobs to run on it.
17:47:05  <Marc__>share the zfs clone across zones between servers
17:47:39  <Marc__>so
17:47:44  <Marc__>with 100 servers
17:48:09  <ryancnelson>you are, for the most part, describing how we offer new vm images to our customers… "provision a new vm/zone of this version with this software installed in it" == "deliver a new instance of this frozen-and-prepared-at-this-date dataset to customer X"
17:48:27  <Marc__>yes
17:48:29  <Marc__>but
17:48:37  <ryancnelson>that's how it does it in a few seconds, vs "install the os every time"
17:48:37  <Marc__>it is not an OS image
17:48:46  <ryancnelson>right, but that's just a fine point.
17:48:52  <Marc__>it is a file system
17:49:04  <Marc__>that i want to share
17:49:10  <Marc__>among many zones
17:49:16  <rmustacc>Right, you'll get that.
17:49:21  <Marc__>some zones which will be on different servers
17:49:30  <rmustacc>Yup.
17:49:36  <rmustacc>So, let's talk about what the image actually has.
17:49:43  <rmustacc>So the core of the OS doesn't come from the image at all.
17:49:48  <Marc__>no
17:49:50  <rmustacc>It just comes from the hypervisor.
17:49:58  <Marc__>?
17:50:00  <rmustacc>No, you don't want me to explain it?
17:50:09  <Marc__>yes
17:50:17  <Marc__>the data
17:50:24  <Marc__>on the zfs dataset
17:50:42  <Marc__>is needed by the user to run their application
17:50:45  * saxby joined
17:50:53  <Marc__>like the chip build
17:50:57  <Marc__>that i described
17:51:23  <Marc__>let me ask it this way
17:51:48  <Marc__>how would you share a zfs clone to another zone on a different server?
17:52:23  <ryancnelson>you need to stop using the term "clone". that means a thing that you don't mean here.
17:52:43  * mindlace quit (Quit: My MacBook Pro has gone to sleep. ZZZzzz…)
17:52:45  <Marc__>just use the term zfs dataset?
17:52:49  <ryancnelson>yes.
17:52:54  <Marc__>ok
17:52:57  <ryancnelson>by way of example:
17:53:06  <ryancnelson>in the joyent public manta:
17:53:08  <rmustacc>Well, given the constraints that you don't care about the space it takes up and do care about the latency.
17:53:29  <rmustacc>I would do one of two things.
17:53:54  <rmustacc>I would either, first, do a zfs send of that zfs dataset to every host that would have zones I want to run this.
17:54:23  <rmustacc>Next, I would then use vmadm to do a lofs mount of that data read-only into the zone.
17:54:38  <rmustacc>Such that the zone then has read-only access at some defined point.
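A sketch of this first approach, as it might look on SmartOS. The hostname, dataset names, and image UUID are placeholders, and the vmadm payload is trimmed to the relevant bits:

```shell
# Replicate the read-only build to a compute node.
zfs snapshot tank/chipbuild@r42
zfs send tank/chipbuild@r42 | ssh cn02 zfs recv zones/chipbuild

# On cn02: create the zone with a read-only lofs mount of the build,
# so the zone sees the data at /data/chipbuild without owning it.
echo '{
  "alias": "sim01",
  "brand": "joyent",
  "image_uuid": "<image-uuid>",
  "filesystems": [{
    "source": "/zones/chipbuild",
    "target": "/data/chipbuild",
    "type": "lofs",
    "options": ["ro"]
  }]
}' | vmadm create
```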
17:55:19  <Marc__>ok
17:55:23  <Marc__>2nd?
17:55:33  <Marc__>way?
17:55:47  <rmustacc>I would give every zone a delegated dataset.
17:55:57  <rmustacc>And then do the zfs send | zfs recv directly to that zone's delegated dataset.
17:55:59  <Marc__>ok...that was my next question
17:56:08  <rmustacc>The advantage is that the user has their own copy that's unique to that zone.
17:56:13  <Marc__>what is a delegated dataset?
17:56:15  <rmustacc>They can modify it to their heart's content.
17:56:35  <rmustacc>So, are you familiar with the idea of how zfs is broken into datasets that are logically grouped?
17:56:53  <Marc__>i think that i do
17:57:02  <Marc__>the datasets
17:57:07  <Marc__>are essentially file systems
17:57:15  <Marc__>individually
17:57:25  <rmustacc>Yes, but they are grouped in a hierarchy.
17:57:25  <Marc__>the grouping is the zpool
17:57:28  * dap_ joined
17:57:28  <Marc__>right?
17:57:37  <rmustacc>A zpool will contain many datasets.
17:57:42  <Marc__>yes
17:57:51  <rmustacc>For example, each zone will have their own dataset zones/<uuid>
17:57:56  <ryancnelson>if i mlogin to 10 manta compute zones, i see a *ton* of stuff installed in /opt/local/bin … recently, we updated that list of stuff to add a lot more, and removed a couple buggy things.
17:57:56  <ryancnelson>all that software is just data that's delivered in the image. You could have a terabyte of chip-data in there if you wanted. that's the dataset we provision with.
17:58:13  <Marc__>right
17:58:25  <Marc__>what is the hierarchy?
17:58:26  <rmustacc>A delegated dataset allows a zone to have a namespace under zones/<uuid>/data that they can then do whatever they want with, subject to the constraints.
17:58:44  <rmustacc>So a zone is given permission to administer anything in zones/<uuid>/data, but nothing else.
17:58:52  <rmustacc>So they can create additional datasets, etc.
18:00:00  <Marc__>how does this relate to other zones?
18:00:09  <rmustacc>They're all independent.
18:00:23  <Marc__>that may wish to use another zone's delegated dataset?
18:00:23  <rmustacc>But it means you could take a zfs dataset from one zone and send it to another.
18:00:28  <rmustacc>They cannot.
18:01:12  <Marc__>send the delegated dataset to another zone
18:01:22  <nahamu>stupid question: is the chip build huge? why not just create a new zone image that includes the chip build in it?
18:01:41  <rmustacc>That's what I was originally trying to suggest.
18:01:47  <nahamu>oh, I missed that.
18:02:00  <Marc__>these chip builds and data can be 100's of GB
18:02:13  <Marc__>and take a long time to build
18:02:16  <nahamu>okay, so that will be annoying to hold in the image server
18:02:33  <nahamu>but otherwise you still need to drop a copy of that zfs filesystem onto every compute node.
18:02:42  <Marc__>we have one engineer perform these builds
18:02:48  <Marc__>then another engineer
18:02:50  <rmustacc>Well, given that you probably care about your storage capacity, I wouldn't go the delegated dataset route because you'll duplicate that for every zone.
18:02:51  <Marc__>who will
18:02:59  <Marc__>set up the batch regressions
18:03:06  <Marc__>against this build
18:03:12  <Marc__>we want to save builds
18:03:25  <Marc__>to be able to compare regressions and re-run
18:03:51  <Marc__>we want to make sure that the build is stable, untouched and know how it was created
18:04:07  <Marc__>i'm actually not that concerned with the storage
18:04:15  <nahamu>certainly the path of least resistance in SDC would be to do the build in a zone, create an image out of that zone, and then use that new image for all the testing
18:04:26  <Marc__>a typical chip project will allocate 25TB of storage
18:04:43  <Marc__>hmmm?
18:04:44  <nahamu>but the build is only 100s of GB?
18:04:50  <nahamu>now I'm confused.
18:05:07  <Marc__>the chip build
18:05:08  <rmustacc>Marc__: How many of these do you want to have active on a single compute node?
18:05:11  <Marc__>is just a seed
18:05:22  <rmustacc>eg. what's your capacity on an individual node and how much do you want to manage that?
18:05:29  <Marc__>input to the execution of many downstream application runs
18:05:49  <Marc__>for example
18:05:56  <Marc__>Qualcomm
18:06:06  <Marc__>the Snapdragon ARM processor
18:06:28  <Marc__>there are approximately 350,000 regression tests against that processor core
18:06:33  * chorrell joined
18:06:45  <rmustacc>I understand, I did chip work in a past life.
18:06:46  <Marc__>that are almost under continuous regression
18:07:09  <Marc__>the similar activity will happen for synthesis, place & route, and timing
18:07:21  <Marc__>so you see where i am going with this
18:07:26  <Marc__>the key problem
18:07:28  <Marc__>here
18:07:29  <rmustacc>Right, so that's why I asked how much data you want to have active on a single CN at a time and how much do you manually want to manage it?
18:07:56  <Marc__>this depends
18:08:10  <Marc__>on exactly what i am doing in the full chip workflow
18:08:14  <Marc__>keep in mind that
18:08:17  <rmustacc>Well, based on the sizing requirements, I wouldn't use a delegated dataset.
18:08:34  <Marc__>we are dealing with a complex workflow
18:08:35  <rmustacc>I would lofs mount the data read-only from the GZ and send each dataset to the node that it needs to exist on.
18:08:44  <Marc__>that has many EDA software applications
18:08:52  <Marc__>that comprise the workflow
18:09:11  <Marc__>i call these types of workflows Loosely Coupled Tightly Constrained
18:09:14  <Marc__>(LCTC
18:09:16  <Marc__>)
18:09:30  <Marc__>the loosely coupled component is
18:09:41  <Marc__>the batch regressions that are run iteratively
18:09:45  <nahamu>when you say 25TB, are you referring to 250 copies of a 100GB build floating around, or are you saying that there are 25TB of non-replicated data?
18:09:49  <Marc__>the tightly constrained portion
18:10:05  <nahamu>(ignore me until you're done answering rmustacc...)
18:10:09  <rmustacc>Right, so I think what I'm suggesting will work for you.
18:10:23  <rmustacc>But it will be a bit more manual on your part.
18:10:45  <Marc__>25TB of non-replicated data
18:10:50  <Marc__>over the course of the project
18:10:52  <rmustacc>If I wanted to take myself out of managing the individual datasets and their presence or lack thereof on compute nodes, I'd probably try to find a way to make it work on manta.
18:10:59  <rmustacc>But either of those two routes should work for you, make sense?
18:11:23  <Marc__>i'm more interested in understanding if i can use Manta to do this
18:12:26  <Marc__>so you would suggest making this work on Manta, right?
18:12:26  <rmustacc>Well, if all the files must be present and can't be done through an object pipeline, then today, you'd have to have that whole blob of data be a single object.
18:12:36  <rmustacc>It all depends on the level of management you want to do.
18:12:59  <rmustacc>Manta will make managing the data to host mapping easier, but it may be trickier to transform a file system of objects into a series of jobs to best take advantage of manta.
18:13:33  <Marc__>so i do think i understand what manta does
18:13:34  <rmustacc>Having the ability to just use a delegated dataset and import that data in manta might be interesting, but isn't something that exists today or is probably going to magically appear any time soon.
18:13:40  <Marc__>after you said that
18:13:57  <Marc__>yes
18:14:05  <Marc__>that is what we want in this scenario
18:14:18  <Marc__>a delegated dataset imported into Manta
18:14:47  <rmustacc>Unfortunately, that doesn't exist today, and would not necessarily be faster than the tar approach.
18:14:51  <Marc__>so where can i read about delegated datasets?
18:16:18  <Marc__>last question
18:16:21  <Marc__>what about nfs
18:16:33  <rmustacc>Well, there are two things you could do here, actually.
18:16:40  <rmustacc>You could have a real nfs server and just have the data in one place.
18:16:43  <rmustacc>You could also use manta-nfs.
18:16:53  <rmustacc>So you could put all the files in manta, and expose it to a zone through nfs.
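The two NFS options rmustacc mentions, sketched as commands (server names, network ranges, and paths are assumptions for illustration):

```shell
# (a) A conventional NFS server backed by ZFS: export the build read-only
#     to the compute network, and zones mount it like any NFS share.
zfs set sharenfs='ro=@10.0.0.0/24' tank/chipbuild

# (b) manta-nfs: run the userland NFS server somewhere reachable, then
#     mount a Manta directory into a zone over NFS.
mount -F nfs manta-nfs-host:/mydir/stor/builds /mnt/builds
```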
18:17:08  <Marc__>what about performance
18:17:19  <Marc__>against 10,000 zones
18:17:24  <rmustacc>As for reading about delegated datasets, I'd suggest the zfs manual page http://illumos.org/man/zfs
18:17:26  <Marc__>spinning up and down
18:17:37  <Marc__>with mounts and unmounts
18:17:47  <nahamu>The SmartOS specifics are described a bit in the man page: http://us-east.manta.joyent.com/smartosman/public/man1M/vmadm.1M.html
18:17:56  <rmustacc>I haven't personally done nfs scalability or manta-nfs scalability.
18:18:06  <rmustacc>But manta itself is designed to handle those kinds of requests.
18:18:38  <Marc__>manta deals with objects
18:18:45  <Marc__>not file systems
18:18:48  <nahamu>this may have been asked already, but how many chip builds are active in a single cluster at a time?
18:19:04  <Marc__>in our cloud
18:19:13  <Marc__>there would be dozens
18:19:28  <Marc__>being done by many different organizations
18:19:39  <nahamu>oh, so this is a service for chip companies...
18:20:00  <Marc__>yes
18:20:11  <Marc__>a design infrastructure built upon SDC
18:20:23  <Marc__>SmartOS and ZFS
18:20:39  <Marc__>an inter-organizational WorkFlow-as-a-Service
18:20:54  <Marc__>and Digital-Rights-Management-as-a-Service (DRMaaS)
18:22:04  <Marc__>many thanks for this dialogue
18:22:21  <Marc__>this transcript will give me something to re-read and ponder
18:23:12  <nahamu>If it were me and I had some serious money backing the project, I'd probably get a couple of new features added to Manta to smooth this out.
18:23:36  <Marc__>that is what we are talking to Joyent about
18:23:50  * nicholaswyoung quit (Quit: Computer has gone to sleep.)
18:23:57  <Marc__>we're standing up the first pilot cloud now
18:24:00  <nahamu>Because ultimately I do think that ZFS is the best way to reliably store and move around your bits.
18:24:19  <Marc__>with Joyent's assistance
18:24:23  <Marc__>with SDC 7
18:24:28  <nahamu>And Manta's computation model has a lot going for it.
18:24:39  <Marc__>i'm getting pushback to allow us to install Manta
18:24:42  <Marc__>at this point
18:25:04  <nahamu>But the units of computation you're dealing with seem to be bigger than the normal style of objects that Manta generally deals in
18:25:15  <Marc__>precisely
18:25:29  <nahamu>I mean, here's a completely unfinished idea:
18:25:48  <nahamu>if you could have an object that was in fact a UFS filesystem that the compute zone could mount up...
18:26:03  <Marc__>plus i want combinations of ZFS datasets
18:26:03  <Marc__>to complete the execution picture for the user
18:26:05  <nahamu>then you don't need to waste time untarring a tarball.
18:26:20  <Marc__>right
18:27:16  <Marc__>we may seek to have zfs-1a + zfs-2b + zfs-3.3, etc to comprise a complete filesystem needed for a specific chip regression
18:27:46  <nahamu>each of which contains thousands of files
18:27:50  <nahamu>(if not millions)
18:28:05  <Marc__>maybe not millions, but certainly 1000's
18:28:37  <Marc__>remember i would also like for the EDA software tree to be shared and moved around
18:28:43  <ryancnelson>mounting a filesystem via lofi (filesystem in a file) is a thing that's not considered safe for multi-tenant infrastructures, so we don't allow it in the JPC, or the public manta. You could do that in your on-prem manta, though.
18:29:53  <nahamu>can hyperlofs mount in a whole subtree?
18:30:55  <nahamu>I don't actually know the internals of Manta, but I know enough about SmartOS to be dangerous.
18:30:56  * chorrell quit (Read error: Connection reset by peer)
18:31:20  <nahamu>I'm pretty sure that Manta is using hyperlofs to mount in the object to an already running zone.
18:31:30  <ryancnelson>yep
18:31:39  <nahamu>it would be cool if you could have an "object" that was actually a whole filesystem tree
18:31:44  <ryancnelson>also pretty sure it's just objects
18:31:50  <ryancnelson>actually, it has to be
18:31:56  <ryancnelson>in this implementation, anyway
18:32:08  <nahamu>the normal tools for uploading such an object wouldn't really make sense.
18:32:08  <ryancnelson>because:
18:32:19  <ryancnelson>if you pass in 200 objects to a marlin job
18:32:27  <ryancnelson>… we find compute near those objects
18:32:34  <ryancnelson>sorry, more ellipses
18:32:36  <nahamu>:)
18:33:07  <ryancnelson>and we will re-use your marlin zone to do the next ones, if it's in the same place as more of the objects you pass in
18:33:14  <ryancnelson>but they're spread all over
18:33:35  <ryancnelson>which is to say, the "directory tree" of manta storage is just an abstraction.
18:33:54  <ryancnelson>all your objects under /nahamu/stor/foo/bar/ aren't really all on the same spinning disk.
18:34:00  <nahamu>right, I get that.
18:34:04  <ryancnelson>disks
18:34:18  <ryancnelson>so, there's no "directory" to map in, via hyperlofs
18:34:29  <nahamu>they're each under /zones/<blah blah blah>/<unique identifier>
18:34:34  <ryancnelson>right
18:34:42  <nahamu>and manatee knows what "path" it's under
18:35:14  <nahamu>what I'm saying is I'd love to have /nahamu/stor/thing1 as an object, but that object is actually a whole tree of files itself.
18:35:23  <ryancnelson>so, pick a compute zone you're in. that guy might be nearby the disks that hold *some* of the files in that dir, but almost certainly not all of them
18:35:27  <ryancnelson>right
18:35:33  <ryancnelson>that's like a boot_archive :)
18:35:37  <nahamu>because then I could set the number of copies for that object equal to the number of machines in the cluster
18:35:40  <ryancnelson>… which has a ufs filesystem in it
18:35:52  <nahamu>and then I can run jobs on it that can run on any machine since it's on every machine.
18:35:53  <ryancnelson>in VERY EARLY manta,
18:36:15  <ryancnelson>i used to be able to mount a hsfs (cdrom iso) file in a manta zone
18:36:25  <ryancnelson>doing compute jobs on iso images was very handy
18:36:28  <ryancnelson>but it's not safe
18:36:49  <tjfontaine>fwiw there are plenty of ways you could do that in userspace
18:36:55  <ryancnelson>arbitrary filesystem data mounted by kernel filesystem drivers
18:37:19  <ryancnelson>yeah, userland is safe, and the recommended alternative
18:37:22  <nahamu>right, if illumos had fuse it might be safer, but you could abuse the node-nfs and do it locally in the zone too
18:37:29  <nahamu>oh!
18:37:33  <nahamu>that's it!
18:37:48  <ryancnelson>marlin zones can't nfs-mount stuff, either
18:37:52  <nahamu>rats
18:38:01  <tjfontaine>I'm not sure if I would define it as 'safe' to have fuse -- but you really don't need much kernel support to actually do file system parsers
18:38:04  <ryancnelson>they could, and (fairly) safely, too
18:38:25  <tjfontaine>the kernel support you need is basically "this is how you load a file" which of course you're always going to have :)
18:38:33  <ryancnelson>it's just a bit crazy to think that normally someone would nfs-mount their netapp out the JPC manta nat gateway
18:39:07  <ryancnelson>nfs-mount localhost as a filesystem interface isn't un-reasonable, though
18:39:08  <nahamu>but you could allow nfs in an on-prem manta
18:40:09  <nahamu>but yes, if there's money to be spent, kernel work could be on the table
18:40:37  <nahamu>but still, without messing with the kernel...
18:40:45  <nahamu>oh!
18:40:50  <nahamu>okay, next unfinished idea:
18:41:52  <nahamu>manta is normally expecting a "normal" object
18:42:09  <nahamu>stream of bits that get landed into a file on n nodes in the cluster
18:42:44  <nahamu>if you could set a header that said "this is a zfs send stream" instead of dropping it into a file you receive it into an appropriately named spot
18:42:52  <nahamu>(on n nodes in the cluster)
18:43:10  <nahamu>then you could hyperlofs that directory into a marlin zone
18:43:19  <nahamu>(this would never fly in the JPC Manta...)
18:43:50  <ryancnelson>then you still have a big file in your marlin zone.
18:44:13  <nahamu>huh? no, it's a send stream containing a snapshot of a filesystem
18:44:19  * chorrell joined
18:44:34  <nahamu>which you receive into a filesystem rather than just dropping onto the disk.
18:44:46  <ryancnelson>oh. you zfs receive at provision time?
18:44:54  <nahamu>I meant at upload time
18:44:56  <ryancnelson>that's not fast, like zfs clone
18:45:02  * chorrell quit (Client Quit)
18:45:11  <ryancnelson>oh, i get it
18:45:41  <ryancnelson>at store-time, instead of writing files, you import datasets
18:45:47  <nahamu>right
18:46:06  <ryancnelson>you don't want to have millions of datasets, though. millions of files is trivial, though
18:46:34  <nahamu>right, but if each of these datasets is 100GB you'll run out of disk before you get to millions of them.
18:46:51  <nahamu>again, this wouldn't work for JPC
18:48:13  <nahamu>and in theory, if a compute job could be made to delegate a dataset, then you could dump your outputs into that delegated dataset filesystem and that could be similarly uploaded into manta for later reuse
18:48:26  <nahamu>with the number of copies set appropriately depending on what kind of output it is.
18:49:39  <Marc__>a delegated dataset filesystem uploaded into manta?
18:49:56  <Marc__>as individual file objects?
18:50:03  <nahamu>this is a hypothetical, not an existing feature, but yes.
18:50:15  <nahamu>as individual "Manta objects"
18:50:22  <nahamu>I'm redefining a Manta object.
18:50:41  <Marc__>as a zfs dataset
18:50:55  <nahamu>Currently a Manta object is an immutable file.
18:51:26  <nahamu>I'm saying "wouldn't it be cool if Manta objects could either be a) an immutable file or b) an immutable snapshot of a single zfs filesystem"
18:51:38  <Marc__>yes
18:51:43  <Marc__>i would agree
18:51:52  <Marc__>that would be awesome
18:52:34  <nahamu>I don't think it makes sense for the Joyent cloud
18:53:09  <Marc__>a zpool is confined to a single server, right?
18:54:02  <ryancnelson>absolutely it is
18:59:08  <ryancnelson>having your filesystem-in-a-file as a manta object would need to be basically on every node, to be useful for computing "against"
18:59:17  <ryancnelson>currently, we top out at (i think?) 8 copies
18:59:25  <nahamu>oh, interesting
18:59:41  <nahamu>I was kind of assuming I could set copies to "everywhere"
18:59:44  <ryancnelson>a different, special type of object could maybe be distributed widely
19:00:12  <nahamu>like an alternate compute image as rmustacc suggested earlier
19:00:13  <ryancnelson>if we put that kind of filesystem delegation into it
19:00:30  <ryancnelson>yeah, this is re-inventing the image system, really
19:00:33  <nahamu>but if you want to mix and match them it gets a little trickier.
19:01:01  * Marc__ quit (Quit: Page closed)
19:02:01  * Marc__ joined
19:09:34  <nahamu>Is it known if these phases are CPU or I/O or memory bound?
19:09:55  * mindlace joined
19:26:27  * jschmidt quit (Ping timeout: 245 seconds)
19:31:27  * jschmidt joined
19:44:59  * saxby quit (Remote host closed the connection)
19:56:13  * nicholaswyoung joined
20:00:57  <nahamu>At any rate, whether or not Manta fits the use case, I'd certainly pick SDC over anything else to build out the compute side of that system.
20:14:12  * nicholaswyoung quit (Quit: Computer has gone to sleep.)
20:20:00  * ed209 quit (Remote host closed the connection)
20:20:36  * ed209 joined
20:26:55  * ryancnelson quit (Quit: Leaving.)
20:28:53  * saxby joined
21:10:15  * nicholaswyoung joined
21:41:39  * ryancnelson joined
21:42:11  * nicholaswyoung quit (Quit: Computer has gone to sleep.)
21:51:00  * preilly quit (Quit: ZNC - http://znc.in)
22:32:56  * jperkin quit (*.net *.split)
22:32:59  * tjfontaine quit (*.net *.split)
22:33:18  * jperkin joined
22:36:02  * tjfontaine joined
22:36:19  * echelog-1 quit (Read error: Connection reset by peer)
22:39:39  * echelog-1 joined
22:44:05  * nicholaswyoung joined
22:50:23  <nicholaswyoung>Can anyone report on this issue? https://github.com/joyent/node-manta/issues/172
22:50:38  <nicholaswyoung>Because that helper would be super-awesome to have.
22:54:01  * utlemming quit (Remote host closed the connection)
22:57:50  * dap_ quit (Quit: Leaving.)
22:58:22  * dap_ joined
23:05:21  * nicholaswyoung quit (Quit: Computer has gone to sleep.)