00:00:01
| * ircretary | quit (Remote host closed the connection) |
00:00:09
| * ircretary | joined |
00:31:38
| * shama | quit (Remote host closed the connection) |
00:36:34
| * shama | joined |
00:42:36
| * pfraze | joined |
00:52:46
| <jjjohnny> | SIMD seems useful |
01:13:53
| <mikolalysenko> | BigNums would be good too, but they need to be carefully considered |
01:15:32
| * knownasilya | quit (Quit: Connection closed for inactivity) |
01:20:31
| * domanic | quit (Ping timeout: 240 seconds) |
01:24:15
| * phated | quit (Remote host closed the connection) |
01:30:31
| * ec\ | quit (Ping timeout: 240 seconds) |
01:31:00
| * ec\ | joined |
01:32:19
| <jjjohnny> | wish you could mine bitcoin in GLSL |
01:33:12
| <jjjohnny> | am i the only one who wishes you could opt for types and enjoy the bennies? |
01:33:30
| <jjjohnny> | why isnt that on the list |
01:33:32
| <jjjohnny> | signed typed |
01:33:34
| <jjjohnny> | just do it |
01:33:48
| <jjjohnny> | unsigned typos (not verified) |
01:34:30
| <jjjohnny> | no |
01:34:38
| <jjjohnny> | typed variables |
01:34:42
| <jjjohnny> | thats what i mean |
01:35:00
| <jjjohnny> | communicating is difficult |
01:35:42
| <jjjohnny> | typediness |
01:35:49
| <jjjohnny> | just doit |
01:36:24
| <jjjohnny> | i wrote a test to see if using typed arrays instead of var would be a win |
01:36:26
| <jjjohnny> | it was a loss |
01:37:16
| <jjjohnny> | on large data it was probably a different story |
01:38:34
| <jjjohnny> | const is stupid |
01:38:37
| <jjjohnny> | stop using const |
01:38:51
| <jjjohnny> | im talking to you yoshuawuyts |
01:40:01
| * mrgodfrey | quit (Ping timeout: 264 seconds) |
01:40:24
| * mrgodfre2 | quit (Ping timeout: 246 seconds) |
01:41:09
| <jjjohnny> | only use const if your immutable is being referenced at greater than 41 kHz |
01:55:24
| * pfraze | quit (Remote host closed the connection) |
02:16:03
| * mrgodfrey | joined |
02:16:49
| * mrgodfre1 | joined |
02:17:13
| * mrgodfre1 | part |
02:19:44
| * mrgodfre3 | joined |
02:20:06
| * mrgodfre3 | part |
02:20:28
| <mrgodfrey> | :quit |
02:20:31
| <mrgodfrey> | :quit |
02:20:38
| <mrgodfrey> | :quit |
02:20:44
| * mrgodfrey | quit (Client Quit) |
02:21:06
| * mrgodfrey | joined |
02:44:08
| * phated | joined |
03:26:37
| * owen1_ | changed nick to owen1 |
03:58:23
| * phated | quit (Remote host closed the connection) |
04:04:51
| * sethvincent | quit (Ping timeout: 240 seconds) |
04:26:54
| * fotoverite | quit (Quit: fotoverite) |
04:51:36
| * domanic | joined |
04:56:18
| * pfraze | joined |
05:00:40
| * pfraze | quit (Ping timeout: 256 seconds) |
05:06:53
| * shama | quit (Quit: (╯°□°)╯︵ɐɯɐɥs) |
05:15:56
| <ogd> | substack: where is the bittorrent dht put demo code you showed me? |
05:28:35
| * domanic | quit (Ping timeout: 260 seconds) |
05:41:08
| * domanic | joined |
05:42:41
| <substack> | ogd: it's in the test suite of bittorrent-dht |
05:43:09
| <substack> | and https://github.com/substack/bittorrent-dht-store-keypair |
05:55:11
| * domanic | quit (Ping timeout: 260 seconds) |
06:20:13
| * farnsworth | joined |
06:20:13
| * cubert | joined |
06:20:49
| * cubert | quit (Remote host closed the connection) |
06:20:49
| * farnsworth | quit (Remote host closed the connection) |
06:21:26
| * farnsworth | joined |
06:21:26
| * cubert | joined |
07:35:02
| * stagas | joined |
07:35:28
| * peutetre | joined |
07:37:31
| * mrgodfrey | quit (Ping timeout: 240 seconds) |
07:37:58
| * dlmanning | quit (Ping timeout: 244 seconds) |
07:39:30
| * domanic | joined |
07:40:55
| * dlmanning | joined |
07:47:38
| * peutetre | quit (Quit: ...) |
07:59:50
| <domanic> | ahdinosaur, hey |
08:02:25
| * peutetre | joined |
08:17:29
| * stagas | quit (Ping timeout: 246 seconds) |
08:21:54
| * stagas | joined |
08:26:47
| * contraha_ | quit (Quit: Sleeping) |
08:30:36
| * mrgodfrey | joined |
08:45:20
| * stagas | quit (Ping timeout: 240 seconds) |
09:04:34
| * stagas | joined |
09:23:38
| * mrgodfrey | quit (Ping timeout: 246 seconds) |
09:48:09
| * stagas | quit (Read error: No route to host) |
09:48:20
| * stagas | joined |
10:12:34
| * mrgodfrey | joined |
10:13:09
| * stagas | quit (Ping timeout: 265 seconds) |
10:13:09
| * domanic | quit (Ping timeout: 246 seconds) |
10:23:40
| * peutetre | quit (Quit: ...) |
10:37:56
| * stagas | joined |
10:45:40
| * peutetre | joined |
10:49:39
| * mrgodfrey | quit (Ping timeout: 260 seconds) |
11:04:30
| * stagas | quit (Ping timeout: 250 seconds) |
11:08:50
| * peutetre | quit (Quit: ...) |
11:22:36
| <tixz> | Does anyone know of an algorithm to divide a set of N geometric points (x, y), into groups of varying size? Preferably the groups should be geometrically as close together as possible? |
11:23:39
| * peutetre | joined |
11:50:36
| * peutetre | quit (Quit: ...) |
11:54:06
| * peutetre | joined |
12:36:42
| * contrahax | joined |
12:43:54
| * trodrigues | part |
12:58:54
| * peutetre | quit (Quit: ...) |
12:59:59
| * fotoverite | joined |
13:16:43
| * contrahax | quit (Quit: Sleeping) |
13:37:17
| <mafintosh> | tixz: i'm sure mikolalysenko knows |
13:37:44
| <mikolalysenko> | tixz: lots of ways to do that, look up clustering |
13:37:51
| <mikolalysenko> | kmeans is the most common solution |
13:38:00
| <mikolalysenko> | https://en.wikipedia.org/wiki/K-means_clustering |
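The k-means suggestion above boils down to Lloyd's algorithm: alternate between assigning each point to its nearest center and moving each center to the mean of its points. A bare-bones sketch (the function name `kmeans` and the naive seeding from the first `k` points are illustrative; real use wants better seeding such as k-means++):

```javascript
// Lloyd iteration for 2d points: assign to nearest center, then
// recompute each center as the mean of its assigned points.
function kmeans(points, k, iters = 20) {
  let centers = points.slice(0, k).map((p) => p.slice()); // naive seeding
  let labels = new Array(points.length).fill(0);
  for (let t = 0; t < iters; t++) {
    // assignment step: nearest center by squared distance
    labels = points.map((p) => {
      let best = 0, bestD = Infinity;
      centers.forEach((c, i) => {
        const d = (c[0] - p[0]) ** 2 + (c[1] - p[1]) ** 2;
        if (d < bestD) { bestD = d; best = i; }
      });
      return best;
    });
    // update step: move each center to the mean of its points
    centers = centers.map((c, i) => {
      const mine = points.filter((_, j) => labels[j] === i);
      if (!mine.length) return c; // keep empty clusters where they are
      return [
        mine.reduce((s, p) => s + p[0], 0) / mine.length,
        mine.reduce((s, p) => s + p[1], 0) / mine.length,
      ];
    });
  }
  return { centers, labels };
}
```

As tixz notes below, nothing here bounds the cluster sizes — that is exactly the gap the rest of the discussion tries to fill.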
13:38:07
| <tixz> | I'm currently trying to do it with a voronoi tessellation |
13:38:33
| <mikolalysenko> | hold on, brb need to take dog out |
13:38:34
| <tixz> | My issue with k-means is that I have no control over how many points go in each cluster, unless I do a custom implementation |
13:39:40
| <tixz> | mikolalysenko: I was thinking it is pretty similar to knn, however I have to remove points that have already been assigned to a group, and I'm not sure of the consequences when I do that |
13:40:13
| <tixz> | Like, will I get some groups where points will be really far apart because they just didn't get chosen for one of the initial groups |
13:46:31
| * simcop2387 | quit (Ping timeout: 240 seconds) |
13:52:20
| * simcop2387 | joined |
13:55:24
| <mikolalysenko> | tixz: ok, back |
13:55:47
| <mikolalysenko> | ok, so you want k-means, but with the number of points per cluster bounded? |
13:56:34
| <tixz> | mikolalysenko: Yeah pretty much |
13:56:50
| <tixz> | I want the clusters to look nice |
13:56:56
| <tixz> | I can show you what it's for: |
13:57:12
| <tixz> | http://bl.ocks.org/emilbayes/82b321dd788bc09d254b |
13:58:24
| <tixz> | I generate the points using poisson disc sampling, but all the dots represent refugees. I want to colour the points in distinct colours, but in groups so it perceptually looks "nice" |
13:59:17
| <tixz> | Any pointers as to what to do? |
13:59:26
| <mikolalysenko> | what if you just cut it into a grid? |
13:59:34
| <mikolalysenko> | or do something like this: |
14:00:06
| <mikolalysenko> | first split into slabs along x, with at most say sqrt(n/k) points per slab |
14:00:23
| <mikolalysenko> | then within each slab, split along y doing the same thing so you have n/k points per bucket |
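The two-pass slab idea above can be sketched directly (the helper name `slabClusters` is assumed; one consistent sizing choice is sqrt(n·m) points per slab, where m is the target bucket size, so each slab yields roughly sqrt(n/m) buckets):

```javascript
// Split points into vertical slabs by x, then cut each slab into
// buckets by y, giving roughly equal-sized rectangular clusters.
function slabClusters(points, perBucket) {
  const perSlab = Math.ceil(Math.sqrt(points.length * perBucket));
  const byX = points.slice().sort((a, b) => a[0] - b[0]);
  const clusters = [];
  for (let i = 0; i < byX.length; i += perSlab) {
    // within each slab, sort by y and cut into buckets
    const slab = byX.slice(i, i + perSlab).sort((a, b) => a[1] - b[1]);
    for (let j = 0; j < slab.length; j += perBucket) {
      clusters.push(slab.slice(j, j + perBucket));
    }
  }
  return clusters;
}
```

Every bucket has at most `perBucket` points, at the cost of the "square" look tixz objects to further down.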
14:01:30
| <tixz> | Yeah okay, so throw it into something like a quad tree or range tree and progressively make the ranges smaller until I get groups of the sizes I want? |
14:01:44
| <mikolalysenko> | yeah, or you could just do it in one shot |
14:01:49
| <mikolalysenko> | but a quadtree might be ok too |
14:01:56
| <tixz> | Only "issue" is that I visually will get very "square" colourings |
14:02:07
| <tixz> | would be nice with something more bloby, if possible |
14:02:08
| <mikolalysenko> | well, instead of a quadtree, you could use a kdtree |
14:02:20
| <tixz> | Know I'm stretching it a bit here :p |
14:02:24
| <mikolalysenko> | or maybe a fair-split tree |
14:02:28
| <mikolalysenko> | but here is the general idea |
14:02:32
| <tixz> | Yeah, was thinking of kdtree |
14:02:42
| <mikolalysenko> | stick all the points into some tree, quadtree or fair split tree are probably good candidates |
14:02:58
| <mikolalysenko> | then for each subtree with at most n/k points, make that a cluster |
14:03:25
| <mikolalysenko> | fair split tree might be a good option I think |
14:03:32
| <mikolalysenko> | or you could modify it a bit |
14:03:38
| <tixz> | Yeah, do you know of any implementations that will tell me the size of the subtree? Was looking at your static-kdtree :p |
14:03:51
| <tixz> | Otherwise I'll just have to go at it myself |
14:03:53
| <mikolalysenko> | size of the subtree is easy to compute in linear time |
14:04:02
| <mikolalysenko> | you just do that as a preprocessing step |
14:04:26
| <mikolalysenko> | working bottom up, you can store the size for each node |
14:04:40
| <tixz> | Preprocessing was considered, but the user will be able to decide the sizes of clusters dynamically |
14:04:42
| <tixz> | arr okay |
14:04:53
| <tixz> | yeah, was thinking that too, just being heaps lazy here |
14:05:01
| <mikolalysenko> | it is a linear time step |
14:05:11
| <mikolalysenko> | so preprocessing wise it is no worse than just drawing all the points anyway |
14:05:23
| <mikolalysenko> | and if you are visualizing this, that cost should be no big deal |
14:06:21
| <tixz> | Alright, I'll try building this. But as far as I know, won't something like a kdtree get very unbalanced when I insert points at random? |
14:06:33
| <mikolalysenko> | don't build it like that |
14:06:43
| <mikolalysenko> | build it the normal way by sorting |
14:07:20
| <tixz> | Okay, so generate all the points, insert them by order into an array and then add them to a tree of some sort? |
14:07:57
| <mikolalysenko> | skip the insert step and just build the tree from the points |
14:08:07
| <mikolalysenko> | think of the tree construction as a divide and conquer algorithm |
14:08:09
| <mikolalysenko> | like merge sort |
14:08:31
| <mikolalysenko> | 1. generate points, 2. build tree, 3. extract clusters from tree |
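Steps 2 and 3 above can be fused into one divide-and-conquer pass, merge-sort style, as mikolalysenko describes: partition on the median of a splitting axis and recurse until groups are small enough. A minimal kd-style sketch (the name `kdClusters` is assumed):

```javascript
// Build the tree implicitly by recursive median splits on
// alternating axes; each leaf of at most maxSize points is a cluster.
function kdClusters(points, maxSize, axis = 0) {
  if (points.length <= maxSize) return [points];
  const sorted = points.slice().sort((a, b) => a[axis] - b[axis]);
  const mid = sorted.length >> 1; // median split => balanced tree
  return kdClusters(sorted.slice(0, mid), maxSize, axis ^ 1)
    .concat(kdClusters(sorted.slice(mid), maxSize, axis ^ 1));
}
```

Because the split is always at the median, the tree stays balanced regardless of the insertion-order worries raised above, and subtree sizes never need to be stored — each recursive call knows its own size.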
14:09:06
| <tixz> | Hmm okay, maybe i'm lacking imagination, but the points are generated by a disc sampler that will choose the "best" available spot, so they might be all over the place |
14:09:14
| <mikolalysenko> | sure |
14:09:34
| <tixz> | Maybe I should just try it out and see what happens |
14:09:39
| <tixz> | thanks for the advice! |
14:09:53
| <mikolalysenko> | so your tree construction looks something like this: 1. partition points into 2 groups which are separated by some halfspace, 2. for each group recursively build a tree |
14:10:26
| <mikolalysenko> | so maybe we can make this more interesting |
14:10:49
| <mikolalysenko> | what if instead of just splitting the points into k equally sized groups, we want those groups to form voronoi clusters |
14:11:21
| <mikolalysenko> | one way to do that is to ensure that each of these partitions is a circle |
14:12:40
| <mikolalysenko> | and then for each tree, you could take the center point of its bounding circle as the voronoi site generating its cluster |
14:13:34
| <tixz> | I think I'm a bit lost now, I'll just have to take a minute to read all this through again |
14:19:35
| <mikolalysenko> | I guess part of the problem here is that this is a bit underconstrained |
14:19:52
| <mikolalysenko> | generating k equal sized partitions is trivial by itself: just sort and then split by quantiles |
14:20:05
| <mikolalysenko> | but that gives you a bunch of slabs which is maybe not what you want |
14:20:24
| <mikolalysenko> | so the question then is what properties should a "nicer" partition have? |
14:21:40
| <tixz> | I guess a nicer partitioning is each group having a central point and members of the group being as close to this as possible, while the groups partition the available data completely |
14:23:28
| <tixz> | mikolalysenko: Does what I want even make sense? |
14:24:10
| <tixz> | That's why I started working in the knn direction, but then it hit me that I might end up with clusters that were just comprised of "leftover" points |
14:31:04
| <mikolalysenko> | tixz: ok, so here is maybe another way to do this |
14:31:24
| <mikolalysenko> | tixz: your input points are from a poisson distribution and so they are packed pretty densely and uniformly |
14:31:39
| <tixz> | mikolalysenko: Yep |
14:31:40
| <mikolalysenko> | so what you could do is just overlay a triangle grid and then within each cell of that grid select a random point |
14:31:52
| <mikolalysenko> | from those, you then build a voronoi diagram |
14:31:55
| <mikolalysenko> | and those are your clusters |
14:32:06
| <mikolalysenko> | it isn't exactly going to be equal, but it should be pretty close |
14:32:06
| <tixz> | Arr yeah |
14:32:20
| <tixz> | close is good enough |
14:32:27
| <mikolalysenko> | that is really cheap and easy |
14:33:01
| <tixz> | Thank you, I'll try it out :) |
14:33:29
| <mikolalysenko> | you can use barycentric coordinates to get those triangle indices too |
14:34:03
| <mikolalysenko> | basically you can just round each point to the closest point on the triangle lattice |
14:34:59
| <mikolalysenko> | or: (x,y) -> (Math.round(x), Math.round(x*1/2 + y*Math.sqrt(3)/2)) |
14:35:08
| <mikolalysenko> | so use that second tuple as the key for each bucket |
14:35:13
| <mikolalysenko> | then in each bucket draw a random point |
14:35:22
| <mikolalysenko> | use those for generating voronoi sites |
14:35:30
| <tixz> | Arr yeah |
14:36:42
| <mikolalysenko> | rescale x and y before hashing to adjust the fineness of the triangulation |
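Putting the lattice-bucket recipe above together (the name `latticeClusters` is assumed, and nearest-site assignment is done by brute force, which is equivalent to reading cluster membership off the Voronoi diagram of the sites):

```javascript
// Hash each point to a cell of a sheared (triangle) lattice, pick one
// random representative per cell as a Voronoi site, then label every
// point with its nearest site. scale controls lattice fineness.
function latticeClusters(points, scale) {
  const buckets = new Map();
  for (const [x, y] of points) {
    const u = x / scale, v = y / scale; // rescale before hashing
    const key = Math.round(u) + ',' + Math.round(u / 2 + v * Math.sqrt(3) / 2);
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push([x, y]);
  }
  // one random site per bucket
  const sites = [...buckets.values()].map(
    (b) => b[Math.floor(Math.random() * b.length)]
  );
  // nearest-site assignment = Voronoi cluster membership
  return points.map(([x, y]) => {
    let best = 0, bestD = Infinity;
    sites.forEach(([sx, sy], i) => {
      const d = (sx - x) ** 2 + (sy - y) ** 2;
      if (d < bestD) { bestD = d; best = i; }
    });
    return best;
  });
}
```

Since the input points come from Poisson disc sampling, each lattice cell holds roughly the same number of points, so the resulting clusters are close to equal-sized and blobby rather than square.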
15:20:09
| * pfraze | joined |
15:39:26
| * fotoverite | quit (Quit: fotoverite) |
15:39:38
| * peutetre | joined |
15:50:36
| * peutetre | quit (Quit: ...) |
15:52:26
| * peutetre | joined |
16:25:20
| * shama | joined |
16:34:57
| <ogd> | mikolalysenko: im at an event with basically all the ndarray implementers in python, julia and R and was curious if you have any specific links you want me to share on your behalf (i have a lightning talk tomorrow but can do a shout out at the beginning) |
16:50:13
| <mikolalysenko> | what sort of stuff do you want? |
16:50:26
| <mikolalysenko> | ogd: I could give you some links from some blog posts I wrote long ago |
16:51:00
| <mikolalysenko> | this might help: http://scijs.net/packages/ |
16:51:17
| <ogd> | mikolalysenko: i think most people here arent aware of the ndarray js ecosystem so i was just wondering the best links to share with a room full of implementers |
16:51:23
| <ogd> | mikolalysenko: ah yea its part of scijs now, cool |
16:51:26
| <mikolalysenko> | this might help: http://0fps.net/2013/05/28/cache-oblivious-array-operations/ |
16:51:31
| <mikolalysenko> | yeah, trying to roll it all into scijs |
16:51:47
| <mikolalysenko> | but have been stretched thin lately so haven't gotten as much done on that as I would like |
16:52:01
| <mikolalysenko> | I want to try getting some visualization tools done |
16:52:12
| <mikolalysenko> | since that is sort of blocking for getting demos, etc. working on scijs |
16:52:34
| <mikolalysenko> | here is a prototype for the 2d datavis stuff: http://gl-vis.github.io/gl-plot2d/scatter-demo.html |
16:53:05
| <ogd> | oh yea hehe i saw that one, pretty awesome |
16:53:10
| <mikolalysenko> | my current thinking is that in order for all this stuff to really make sense in js, you need to have some good visualization tools for it to output to |
16:53:38
| <ogd> | mikolalysenko: ah yea since thats what everyone uses js for from other languages |
16:53:41
| <mikolalysenko> | some general themes for scijs: 1. radical modularity, 2. embrace runtime code generation, 3. open contributorship |
16:53:51
| <ogd> | cool |
16:54:12
| <mikolalysenko> | modularity means you get reproducibility for free |
16:54:25
| <mikolalysenko> | since that is a consequence of having a good package management solution |
16:54:39
| <mikolalysenko> | there are also other approaches out there too |
16:54:49
| <mikolalysenko> | kgryte's stuff is worth taking a look at |
16:55:08
| <ogd> | mikolalysenko: got a link to o/ ? |
16:55:19
| <mikolalysenko> | https://github.com/compute-io/compute.io |
16:55:40
| <mikolalysenko> | I'm not as huge a fan of this super-pack type approach, but he is doing a lot of stuff |
16:57:26
| <mikolalysenko> | my sort of dream though is to not have a single library or even org that has all the scientific computing stuff |
16:57:28
| <ogd> | wow yea megarepo |
16:57:40
| <mikolalysenko> | but rather to just have all the modules live on npm as separate things |
16:57:52
| <mikolalysenko> | still there is some bootstrapping though that has to happen |
16:58:09
| <mikolalysenko> | and some of this becomes organizational, there are lots of foundational things that take time to write |
16:58:21
| <ogd> | kgryte has a lot of .io github organizations lol |
16:58:39
| <ogd> | mikolalysenko: right... its an ecosystem, not a top-down organization |
16:59:45
| <mikolalysenko> | he also has his own ndarray-compatible thing, let me see if I can find a link... |
17:00:01
| <mikolalysenko> | https://github.com/dstructs/ndarray |
17:01:08
| <mikolalysenko> | at a high level though, I think that if you can do something without using ndarrays it is usually better |
17:01:19
| <mikolalysenko> | it is almost always a good idea to avoid using custom data types if possible |
17:01:30
| <mikolalysenko> | still there are places where ndarrays make sense and you should use them |
17:02:12
| <mikolalysenko> | in the pursuit of modularity, the worst thing to do is create complex frameworks |
17:02:27
| <mikolalysenko> | a slightly less bad option is to create objects with methods |
17:02:28
| <ogd> | mikolalysenko: is that from a perf standpoint? what about from a maintenance/compatibility standpoint e.g. you dont want to rewrite your analysis code when you hit perf issues |
17:02:40
| <mikolalysenko> | no |
17:02:50
| <mikolalysenko> | the issue is really interoperability |
17:02:59
| <mikolalysenko> | you would want everything to live as independent modules |
17:03:06
| <mikolalysenko> | so things like objects and interfaces are problematic |
17:03:21
| <ogd> | ah yea, so the event im at now is actually about ndarray interop lol |
17:03:23
| <mikolalysenko> | the best thing is to just have procedures |
17:03:28
| <mikolalysenko> | ok |
17:03:32
| <mikolalysenko> | so here is my thesis about ndarrays |
17:03:38
| <ogd> | someone is talking about trying to duck type ndarrays in python |
17:03:45
| <mikolalysenko> | yeah, it makes sense |
17:03:56
| <mikolalysenko> | you just need shape, stride, offset and a 1d data store |
17:04:02
| <mikolalysenko> | for classic ndarrays |
17:04:27
| <mikolalysenko> | a perspective that makes a lot of sense is that they are just affine maps from nd grids onto 1d arrays |
17:04:39
| <mikolalysenko> | that is sufficient to cover pretty much all uses you see in numerical computing |
17:04:44
| <mikolalysenko> | it also works for sparse data too: |
17:04:55
| <mikolalysenko> | a csr matrix is just a run length encoded array |
17:05:05
| <mikolalysenko> | same for csc and all other commonly used sparse formats |
17:05:27
| <mikolalysenko> | basically sparsity can be handled by replacing your 1d data store with something fancier |
17:05:52
| <mikolalysenko> | for example a hash map, or an rle array |
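The "affine map onto a 1d store" view above fits in a few lines: the index arithmetic is offset + dot(stride, index), and since the store only needs a `get`, swapping a plain array for a hash map gives sparse storage for free. A sketch (all names here are illustrative, not any particular library's API):

```javascript
// An ndarray view is just index math over a pluggable 1d store.
function view(data, shape, stride, offset = 0) {
  return {
    shape,
    // flat index = offset + sum(idx[a] * stride[a])
    get: (...idx) =>
      data.get(idx.reduce((p, ix, a) => p + ix * stride[a], offset)),
  };
}
const denseStore = (arr) => ({ get: (i) => arr[i] });        // plain array
const sparseStore = (map) => ({ get: (i) => map.get(i) || 0 }); // hash map
```

A 2x3 row-major view over `[1,2,3,4,5,6]` is `view(denseStore(data), [2,3], [3,1])`; the same shape and stride over a `Map` holding only the nonzero entries behaves identically, which is the point being made about csr/csc and other sparse formats.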
17:12:35
| <mikolalysenko> | though one thing I have been thinking about more and more lately is maybe moving away from ndarrays altogether and using lower level interfaces |
17:12:57
| <mikolalysenko> | since it is easier to implement, and generally not a big deal to wrap it up later on in something high level |
17:13:24
| <mikolalysenko> | for example, I think for some operations like matrix-vector multiplication it might be better to just take raw typedarrays as input |
17:14:06
| <mikolalysenko> | also, the coming of simd to js will change how things work substantially |
17:14:17
| <ogd> | mikolalysenko: did you see this http://lars-t-hansen.github.io/ecmascript_sharedmem/shmem.html |
17:14:21
| <mikolalysenko> | yeah |
17:14:33
| <mikolalysenko> | the new changes will require some old practices to be reevaluated |
17:14:40
| <mikolalysenko> | and a lot of new opportunities will open up |
17:15:32
| <mikolalysenko> | taking advantage of shared memory, simd and value types would require some really different architectural approaches |
17:16:24
| <mikolalysenko> | but it doesn't make sense to do it yet, at least until things get more solid |
17:16:33
| <mikolalysenko> | and the stuff that is already there right now is somewhat useful |
17:16:49
| <mikolalysenko> | also, the biggest problem right now with doing all this stuff in js is the heap size limit |
17:17:01
| <mikolalysenko> | you don't get enough address space in js to do some of the really big problems |
17:17:26
| <ogd> | ah yea |
17:19:22
| <ogd> | https://github.com/libdynd presentation is on now |
17:21:33
| <ogd> | mikolalysenko: also i was curious if there were perf benefits of value types as it relates to things like numeric computing or if they were just a convenience thing |
17:21:49
| <mikolalysenko> | allocating ndarrays is a big cost in js |
17:21:57
| <mikolalysenko> | it is sort of a dumb thing that this is even a problem |
17:22:10
| <mikolalysenko> | but you pay a higher overhead for accessing random parts of an ndarray in js than you would in c++ |
17:22:33
| <mikolalysenko> | because you have to dereference the object pointer -> dereference stride pointer -> do look up in typedarray etc. |
17:22:55
| <mikolalysenko> | so there are multiple levels of indirection to just get the index of an element in an array, and multiple type/bounds checks |
17:23:06
| <mikolalysenko> | this is all dumb, a smart enough programming language could optimize it away |
17:23:07
| <ogd> | mikolalysenko: ah so maybe if it was builtin to the language it would fix that? |
17:23:11
| <mikolalysenko> | sure |
17:23:19
| <mikolalysenko> | or if we had value types we could implement it in the language |
17:23:24
| <mikolalysenko> | in c++ for example, it would be no problem |
17:24:09
| <mikolalysenko> | in c++ you could flatten all the ndarray data into a struct which you can pass around in the register file |
17:24:14
| * Kraln | joined |
17:24:18
| <mikolalysenko> | so your array look up is fast and cheap |
17:24:32
| <mikolalysenko> | in js though you have all these other checks and layers of indirection going on |
17:24:44
| <mikolalysenko> | so doing things like ndarray.set/.get is much slower than it should be |
17:25:00
| <mikolalysenko> | you can avoid this overhead using cwise, but it only works for bulk operations |
17:25:23
| <mikolalysenko> | and for small things cwise is not great perfwise since there is overhead in the dispatch |
17:25:31
| <ogd> | ahh i see |
17:26:05
| <mikolalysenko> | with value types, you could get pretty much optimal performance |
17:26:33
| <mikolalysenko> | also it isn't even possible for the interpreter to optimize this stuff |
17:26:48
| <mikolalysenko> | since there is nothing in js that stops you from swapping out the stride array reference for example |
17:27:15
| <ogd> | hah |
17:27:27
| <mikolalysenko> | and if you do things like chaining array operations, for example, .hi.lo, etc., then you have to allocate ndarrays at each intermediate step |
17:27:46
| <mikolalysenko> | which creates gc pressure and kills performance |
17:28:07
| <kumavis> | go-go-gadget webasm? |
17:28:18
| <mikolalysenko> | sure, if you like writing in c++ |
17:28:27
| <mikolalysenko> | the real solution is value types |
17:29:07
| <ogd> | mikolalysenko: so the benefit of value types is they avoid allocating extra js objects? |
17:29:11
| <mikolalysenko> | yes |
17:29:16
| <mikolalysenko> | and layers of indirection/validation |
17:29:20
| <ogd> | gotcha |
17:29:26
| <mikolalysenko> | also they can be allocated contiguously |
17:29:35
| <mikolalysenko> | so for compound objects, arrays of value types would be much much faster |
17:29:40
| <ogd> | ah really |
17:29:50
| <mikolalysenko> | scanning array of values is way way waaaay faster than scanning an array of objects |
17:30:00
| <mikolalysenko> | since the values are packed contiguously |
17:30:16
| <mikolalysenko> | if you read out a cache line of values, you can basically prefetch ahead |
17:30:26
| <mikolalysenko> | so say your block size for a cache line is B |
17:30:42
| <mikolalysenko> | if you scan an array of contiguous values, you do O(n/B) fetches |
17:30:54
| <mikolalysenko> | (in practice, might even be better with prefetching) |
17:30:58
| <ogd> | yea |
17:31:12
| <mikolalysenko> | if you have an array of objects, for each array entry you have to do an indirect look up on the object's pointer to get its data |
17:31:20
| <mikolalysenko> | so you do at least O(n) operations to read it |
17:31:29
| <mikolalysenko> | and you block the cpu when you are waiting for that fetch to complete |
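The layout difference being described — O(n/B) cache-line fetches for contiguous values versus O(n) pointer chases for an array of objects — can be illustrated with the same records stored both ways (a toy comparison, not a benchmark; variable names are illustrative):

```javascript
// Same (x, y) records in two layouts: array of heap objects vs
// packed contiguously into one Float64Array (the cache-friendly way).
const n = 1000;
const objects = [];
const packed = new Float64Array(2 * n);
for (let i = 0; i < n; i++) {
  objects.push({ x: i, y: 2 * i });
  packed[2 * i] = i;
  packed[2 * i + 1] = 2 * i;
}
function sumObjects(arr) {
  let s = 0;
  for (const o of arr) s += o.x + o.y; // one pointer dereference per element
  return s;
}
function sumPacked(buf) {
  let s = 0;
  for (let i = 0; i < buf.length; i++) s += buf[i]; // sequential reads
  return s;
}
```

Both compute the same answer; only `sumPacked` scans memory in order, so each fetched cache line serves many elements. This manual packing is exactly the tedious workaround mentioned further down that value types would make unnecessary.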
17:31:49
| <mikolalysenko> | moreover, that factor of B is pretty huge on modern processors, especially once you go outside L3 cache |
17:32:05
| <mikolalysenko> | in fact it is so big that it completely dwarfs speedups due to parallelism |
17:32:10
| <mikolalysenko> | say you have an 8 core machine |
17:32:16
| <mikolalysenko> | then the best speed up you get is a factor of 8 |
17:32:26
| <mikolalysenko> | but B is usually on the order of 1000 |
17:32:36
| <ogd> | ah wow |
17:32:43
| <mikolalysenko> | if you optimize cache performance, it is like the speedup you would get of running on a 1000 core machine |
17:33:04
| <mikolalysenko> | the economics of memory hierarchy are such that it is always the first and most important priority when optimizing |
17:33:21
| <mikolalysenko> | in fact, one perspective on threads is that they are really just a trick to optimize prefetching |
17:33:35
| <mikolalysenko> | even in numerical computing, very few tasks are compute limited |
17:33:38
| <mikolalysenko> | it is all memory bandwidth |
17:34:02
| <mikolalysenko> | so bringing it back to js: lack of value types is a real problem here |
17:34:18
| <mikolalysenko> | because we can't easily control the order of data in memory, we can't take advantage of these economies |
17:34:31
| <mikolalysenko> | now that said, it isn't totally true |
17:34:46
| <mikolalysenko> | you can kind of get these advantages by manually packing things into typedarrays and doing indexing yourself |
17:34:53
| <mikolalysenko> | but that is really tedious and error prone |
17:35:16
| <mikolalysenko> | if we had value types, you could do this stuff way more easily and create sane interfaces |
17:35:20
| * sethvincent | joined |
17:37:35
| <mikolalysenko> | basically value types would be great for avoiding lots of small temporary allocations |
17:37:55
| <mikolalysenko> | so places where people would want to return multiple values, group tiny amounts of data, etc. |
17:58:39
| <ogd> | mikolalysenko: they made a gitter chat for it https://gitter.im/BIDS/ds4ds |
18:00:02
| * contrahax | joined |
18:40:17
| * akenn | joined |
19:27:30
| * phated | joined |
19:34:22
| * peutetre | quit (Quit: ...) |
19:48:51
| * reqshark | joined |
19:55:01
| * contrahax | quit (Quit: Sleeping) |
20:06:36
| * domanic | joined |
20:31:49
| * peutetre | joined |
20:40:22
| * peutetre | quit (Quit: ...) |
21:19:02
| * mrgodfrey | joined |
21:46:43
| * contrahax | joined |
21:59:37
| <pfraze> | (https://github.com/ssbc/docs#setup-up-a-pub) |
21:59:40
| <pfraze> | ag |
21:59:42
| <pfraze> | https://medium.com/@iefserge/runtime-js-javascript-library-os-823ada1cc3c |
22:01:44
| <pfraze> | "We’ve got a very small and standalone web server OS image we can run locally or push somewhere into the cloud. It’s also immutable and does not require any installation nor configuration. The resulting server does not use disk and is completely stateless between reboots. And it boots in much less than a second under the KVM!" |
22:03:40
| * akenn | quit (Quit: Textual IRC Client: www.textualapp.com) |
22:03:51
| <pfraze> | domanic: ^ |
22:03:53
| * akenn | joined |
22:07:06
| * rhiaro | changed nick to rhiaro__ |
22:08:24
| <domanic> | pfraze, yes this is awesome |
22:31:03
| * contrahax | quit (Read error: Connection reset by peer) |
22:31:26
| * contrahax | joined |
23:34:33
| * contrahax | quit (Quit: Sleeping) |
23:44:42
| * fotoverite | joined |