00:16:45  * timoxley joined
00:18:01  * timoxley quit (Read error: Connection reset by peer)
00:18:27  * timoxley joined
00:22:12  * nnnnathann quit (Ping timeout: 252 seconds)
00:23:16  * timoxley quit (Ping timeout: 264 seconds)
00:33:06  * dguttman quit (Quit: dguttman)
00:34:49  * dguttman joined
00:38:16  * eugeneware joined
00:50:42  * nnnnathann joined
01:12:02  * nnnnathann quit (Remote host closed the connection)
01:19:10  * timoxley joined
01:24:14  * timoxley quit (Ping timeout: 264 seconds)
01:39:21  * thlorenz joined
01:45:51  * rescrv quit (Read error: Connection reset by peer)
01:49:05  * jmartins joined
02:02:52  * timoxley joined
02:05:35  * dguttman quit (Quit: dguttman)
02:15:31  * eugeneware quit (Remote host closed the connection)
02:19:08  * eugeneware joined
02:19:42  * eugenewa_ joined
02:23:26  * eugeneware quit (Ping timeout: 240 seconds)
02:30:58  * bradleymeck joined
02:32:00  <bradleymeck> anyone know of an algorithm to create an index in leveldb that does not fit into memory? presumably split into multiple documents?
02:33:33  <bradleymeck> I am working on an n-ary chunking algorithm and need to make index insertion and removal not terribly slow
03:01:27  <mikeal> bradleymeck: the index key is too big or the value is?
03:01:40  <bradleymeck> the value
03:01:48  <bradleymeck> which is an odd problem I am facing
03:01:49  <mikeal> yeah
03:01:51  <mikeal> just do
03:02:28  <bradleymeck> basically I can't list all the rows of a spreadsheet that match a value because I have millions as a stress test
03:02:28  <mikeal> key:[_key, [startByte, endByte]], value:chunk
03:03:08  <bradleymeck> well the value is ids, not the chunk itself
03:04:03  <bradleymeck> a la indexName=>{rows:[1,2,3,4,…]} where rows is too big
03:04:04  <mikeal> right, whatever, just break up the value into byte ranges
03:04:22  <bradleymeck> mmm
03:04:23  <mikeal> why don't you store a cell for each key/value
03:04:39  <mikeal> er, a key/value for each cell
03:04:40  <mikeal> :)
03:05:38  <bradleymeck> I do have a chunked 1024x1024 set of location-value pairs, need a moment to think about all this
03:06:13  <mikeal> what max is working on is storing each cell as a key rather than each row
03:06:30  <mikeal> and he's doing it to improve performance
03:07:18  <bradleymeck> each cell as a key, like bigtable?
03:09:52  <bradleymeck> I'm just trying to repro the google spreadsheets api but for n-dimensional data sets, but am facing interesting perf problems due to too many ops
03:11:05  <bradleymeck> anyways I'll try and think how to split the chunk up like you said, then min/max iter on the key
03:11:22  <bradleymeck> love the min/max streaming based on name
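[editor's note: a minimal sketch of the chunked-index layout discussed above (bounded value slices plus min/max key iteration), assuming a levelup-style API circa level@7; the key scheme, chunk size, and helper names are illustrative, not code from the channel.]

```js
const level = require('level'); // assumes the level@7-era callback/stream API
const db = level('./index-db', { valueEncoding: 'json' });

// Zero-padded chunk numbers keep lexicographic key order equal to numeric order.
const chunkKey = (indexName, n) =>
  `idx!${indexName}!${String(n).padStart(6, '0')}`;

// Store one bounded slice of the id list per key, so no single value
// has to hold the whole multi-million-entry list.
const putChunk = (indexName, n, ids, cb) =>
  db.put(chunkKey(indexName, n), { ids }, cb);

// "min/max iter on the key": stream every chunk of one index, in order.
const streamIndex = (indexName) =>
  db.createReadStream({ gte: `idx!${indexName}!`, lte: `idx!${indexName}!\xff` });

putChunk('commentsWithRatingAbove3', 2, ['uuid1', 'uuid2'], (err) => {
  if (err) throw err;
  streamIndex('commentsWithRatingAbove3')
    .on('data', ({ key, value }) => console.log(key, value.ids.length));
});
```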
03:29:27  * dguttman joined
03:30:27  * eugenewa_ quit (Remote host closed the connection)
03:34:42  * julianduque joined
03:35:56  <levelbot> [npm] level-geo@0.0.1 <http://npm.im/level-geo>: A spatial extension to LevelDB (@julianduque)
03:38:37  * dguttman quit (Quit: dguttman)
03:39:07  * rescrv joined
03:41:16  * dguttman joined
03:45:47  * jmartins quit (Quit: Konversation terminated!)
03:51:03  * dguttman quit (Quit: dguttman)
03:54:08  * thlorenz quit (Remote host closed the connection)
04:19:42  * mikeal quit (Quit: Leaving.)
04:25:48  * mikeal joined
04:27:54  <levelbot> [npm] level-geo@0.0.2 <http://npm.im/level-geo>: A spatial extension to LevelDB (@julianduque)
04:32:49  * mikeal quit (Quit: Leaving.)
04:35:02  * eugeneware joined
04:45:01  * dominictarr joined
04:54:27  * eugeneware quit (Remote host closed the connection)
04:58:38  * julianduque quit (Quit: leaving)
04:59:37  * eugeneware joined
05:23:20  * bradleymeck quit (Quit: bradleymeck)
05:29:29  * wilmoore-db quit (Remote host closed the connection)
05:37:04  * wilmoore-db joined
06:27:26  * enos joined
06:30:10  * dominictarr quit (Quit: dominictarr)
06:33:42  * timoxley quit (Ping timeout: 264 seconds)
06:37:08  * werle joined
06:44:15  * werle quit (Ping timeout: 260 seconds)
07:02:52  * timoxley joined
07:04:45  * jcrugzz changed nick to jcrugzz|zzz
07:53:51  * eugeneware quit (Remote host closed the connection)
07:54:35  * eugeneware joined
08:11:26  * dominictarr joined
08:20:51  * dominictarr quit (Quit: dominictarr)
08:43:17  * dominictarr joined
08:52:53  * eugeneware quit (Remote host closed the connection)
08:57:48  * timoxley quit (Remote host closed the connection)
08:58:24  * timoxley joined
09:02:52  * timoxley quit (Ping timeout: 264 seconds)
09:29:14  * timoxley joined
09:35:31  * wilmoore-db quit (Remote host closed the connection)
09:39:54  * timoxley quit (Ping timeout: 252 seconds)
09:48:52  * eugeneware joined
09:48:56  * eugeneware quit (Remote host closed the connection)
10:19:18  * eugeneware joined
10:19:38  * timoxley joined
11:31:50  * timoxley quit (Remote host closed the connection)
11:32:24  * timoxley joined
11:37:18  * timoxley quit (Ping timeout: 264 seconds)
11:47:45  * rud quit (Quit: rud)
11:51:58  * julianduque joined
12:01:18  * julianduque quit (Quit: leaving)
12:11:21  * timoxley joined
12:13:20  * werle joined
12:16:14  * werle quit (Client Quit)
12:51:46  * thlorenz_ joined
13:12:56  * tmcw joined
13:13:48  * tmcw quit (Remote host closed the connection)
13:18:10  * thlorenz_ quit (Remote host closed the connection)
13:19:59  * rud joined
13:19:59  * rud quit (Changing host)
13:19:59  * rud joined
13:22:30  * werle joined
13:38:11  * werle quit (Quit: Leaving.)
13:54:23  * nnnnathann joined
14:04:38  * thlorenz joined
14:05:09  * thlorenz quit (Remote host closed the connection)
14:05:41  * thlorenz joined
14:12:40  * jjmalina joined
14:14:16  * tmcw joined
14:30:32  * Acconut joined
14:34:51  * ednapiranha joined
14:36:20  * dominictarr quit (Quit: dominictarr)
14:37:28  * timoxley quit (Remote host closed the connection)
14:38:05  * timoxley joined
14:39:50  * Acconut quit (Quit: Acconut)
14:42:27  * timoxley quit (Ping timeout: 260 seconds)
14:50:00  * wilmoore-db joined
14:57:27  * dguttman joined
14:59:47  * julianduque joined
15:02:18  * jcrugzz|zzz changed nick to jcrugzz
15:06:34  <thlorenz> juliangruber: having trouble with the multilevel client example
15:07:05  <thlorenz> juliangruber: https://github.com/juliangruber/multilevel/pull/37#issuecomment-24926721
15:08:43  * timoxley joined
15:09:42  * eugeneware quit (Ping timeout: 264 seconds)
15:11:13  * Acconut joined
15:13:09  * timoxley quit (Ping timeout: 248 seconds)
15:19:25  * rud quit (Quit: rud)
15:19:53  <levelbot> [npm] level-geo@0.0.3 <http://npm.im/level-geo>: A spatial extension to LevelDB (@julianduque)
15:22:53  * Acconut quit (Quit: Acconut)
15:31:30  * dominictarr joined
15:35:36  * eugeneware joined
15:37:29  * julianduque quit (Quit: leaving)
15:44:53  * dguttman quit (Quit: dguttman)
15:47:59  * mikeal joined
15:59:35  * nnnnathann quit (Ping timeout: 260 seconds)
16:00:51  * nnnnathann joined
16:01:00  * jjmalina quit (Quit: Leaving.)
16:01:16  * jjmalina joined
16:01:41  * wilmoore-db quit (Read error: No route to host)
16:02:42  * wilmoore-db joined
16:08:27  * mikeal quit (Quit: Leaving.)
16:09:26  * timoxley joined
16:09:27  * jcrugzz quit (Ping timeout: 260 seconds)
16:13:40  * timoxley quit (Ping timeout: 246 seconds)
16:16:54  * nnnnathann quit (Remote host closed the connection)
16:26:38  * dguttman joined
16:27:05  * rud joined
16:27:06  * rud quit (Changing host)
16:27:06  * rud joined
16:31:43  * ednapiranha quit (Remote host closed the connection)
16:32:45  * thlorenz changed nick to thlorenz_zz
16:33:04  * ednapiranha joined
16:34:33  * bradleymeck joined
16:35:45  <bradleymeck> dominictarr: I know you know a lot about append-only data structs, what about ones that don't fit into memory? is there a good way to compress their state that you know of or could point me to?
16:36:44  <dominictarr> well, you basically have to throw away old data that you know you won't need any more
16:36:54  <dominictarr> what sort of data do you have?
16:37:05  <dominictarr> bradleymeck: ^
16:37:16  <bradleymeck> a very large list of uuids, I had to chunk it
16:37:39  <bradleymeck> I figured out an algo that's O(n*m)
16:38:45  <bradleymeck> bbiab, work just fired up
16:40:58  * wilmoore-db quit (Remote host closed the connection)
16:41:28  <dominictarr> bradleymeck: aha, a set. what are you doing with the uuids?
16:43:04  <dominictarr> also, if you can find a way to use a bloom filter, all your problems will be solved
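[editor's note: since a bloom filter comes up here, a toy illustration of the idea: constant-space membership tests with false positives but no false negatives. Simplified (bit positions derived from overlapping windows of one sha1 digest); not a production filter and not code from the channel.]

```js
const crypto = require('crypto');

class Bloom {
  constructor(bits = 1 << 20, hashes = 7) { // hashes must be <= 8 for this digest scheme
    this.bits = bits;
    this.hashes = hashes;
    this.bitfield = Buffer.alloc(bits >> 3); // 128 KiB for 2^20 bits
  }
  // Derive k bit positions from overlapping 32-bit windows of one digest.
  positions(key) {
    const d = crypto.createHash('sha1').update(key).digest();
    const out = [];
    for (let i = 0; i < this.hashes; i++) out.push(d.readUInt32BE(i * 2) % this.bits);
    return out;
  }
  add(key) {
    for (const p of this.positions(key)) this.bitfield[p >> 3] |= 1 << (p & 7);
  }
  // false => definitely absent; true => probably present
  has(key) {
    return this.positions(key).every(p => this.bitfield[p >> 3] & (1 << (p & 7)));
  }
}

const seen = new Bloom();
seen.add('uuid1');
console.log(seen.has('uuid1'), seen.has('uuid2')); // true false
```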
16:50:37  * bradleymeck quit (Quit: bradleymeck)
16:59:22  * chrisdickinson quit (Quit: ZNC - http://znc.in)
16:59:59  * chrisdickinson joined
17:00:11  * thlorenz_zz changed nick to thlorenz
17:10:06  * timoxley joined
17:14:45  * timoxley quit (Ping timeout: 248 seconds)
17:24:30  * Acconut joined
17:24:59  * fergusmcdowall joined
17:26:01  * mikeal joined
17:27:09  * Acconut quit (Client Quit)
17:27:43  * wilmoore-db joined
17:34:14  * fallsemo joined
17:34:39  * tmcw quit (Remote host closed the connection)
17:35:14  * tmcw joined
17:36:15  * fergusmcdowall quit (Quit: fergusmcdowall)
17:37:52  * tmcw_ joined
17:37:56  * tmcw quit (Read error: Connection reset by peer)
17:43:43  * jxson joined
17:53:16  * vincentmac joined
17:53:18  * fergusmcdowall joined
18:03:49  * bradleymeck joined
18:04:39  <bradleymeck> dominictarr: it's an index that lists the keys matching it, so probably not able to use a bloom filter.
18:05:50  * fergusmcdowall quit (Quit: fergusmcdowall)
18:10:55  * timoxley joined
18:12:59  * fergusmcdowall joined
18:13:00  * wilmoore-db quit (Remote host closed the connection)
18:15:36  * timoxley quit (Ping timeout: 256 seconds)
18:23:49  * coryfields changed nick to cfields
18:28:59  * Acconut joined
18:37:11  * astrolin joined
18:39:44  * davidstrauss quit (Read error: Connection reset by peer)
18:41:11  <dominictarr> bradleymeck: is it a simple set, or is it key: value? I'm a little confused here
18:41:50  <dominictarr> but knowing how it works / what it's used for makes a massive difference
18:42:44  * fergusmcdowall quit (Quit: fergusmcdowall)
18:43:07  <bradleymeck> example index: key("commentsWithRatingAbove3-Chunk2") => value({ids:['uuid1','uuid2',…]})
18:44:23  <levelbot> [npm] level-indico@0.1.3 <http://npm.im/level-indico>: Simple indexing and querying for leveldb (@mariocasciaro)
18:50:41  * davidstrauss joined
18:52:27  <bradleymeck> basically just trying to make an index that stores the keys, don't need to put the values in there
19:02:15  * jxson quit (Remote host closed the connection)
19:03:27  * Acconut quit (Quit: Acconut)
19:03:53  <bradleymeck> well I guess that was the old model; now it is closer to: value({deltas:['-uuid1','+uuid2']})
19:07:17  * jxson joined
19:08:11  * vincentmac quit (Ping timeout: 248 seconds)
19:09:36  * jmartins joined
19:10:50  * jxson quit (Read error: Connection reset by peer)
19:10:59  * jxson joined
19:11:40  * timoxley joined
19:15:58  * timoxley quit (Ping timeout: 245 seconds)
19:24:46  <dominictarr> bradleymeck: delta? it's a set of changes that apply over a common set?
19:25:22  <dominictarr> I assume you want append only, so you can distribute this data to a bunch of servers?
19:25:52  <bradleymeck> well I'm using deltas for the O(1) insert/delete, reads are a little less common, but the data is split amongst servers due to size
19:26:27  <bradleymeck> so indexname-chunk1, indexname-chunk2, etc. can be on diff servers
19:26:42  <bradleymeck> but that is not for duplication/replication
19:27:47  <gkatsev> anyone do any tests with leveldb up on travis?
19:30:13  * tmcw_ quit (Remote host closed the connection)
19:30:51  * tmcw joined
19:31:20  * jcrugzz joined
19:32:02  <dominictarr> bradleymeck: interesting, how many data points do you have?
19:32:22  * tmcw quit (Read error: Connection reset by peer)
19:32:35  * tmcw joined
19:32:54  <bradleymeck> we have around 20 million docs being indexed across around 20-30 indices (growing)
19:32:57  <dominictarr> also, one idea that I've had for a while but never had the opportunity to pursue is to make an in-memory db with buffers (etc)
19:33:11  <dominictarr> to get out of the v8 heap
19:33:35  <dominictarr> you could also take the skiplist mem table out of leveldb
19:33:57  * jmartins_ joined
19:34:01  <bradleymeck> there are a couple of in-memory dbs I've looked at
19:34:42  <dominictarr> part of the idea here would be to do it in the same process as node.
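[editor's note: a toy sketch of the "buffers outside the v8 heap" idea dominictarr floats here. Node Buffer contents are allocated outside the JS object heap, so a Buffer-backed value store keeps millions of values from weighing on the GC; keys remain ordinary heap strings. Hypothetical code, not an existing module.]

```js
// Keys stay on the JS heap; value bytes live in Buffer memory outside it.
class OffHeapValues {
  constructor() { this.index = new Map(); } // key -> Buffer
  put(key, value) { this.index.set(key, Buffer.from(value, 'utf8')); }
  get(key) {
    const buf = this.index.get(key);
    return buf === undefined ? undefined : buf.toString('utf8');
  }
}

const store = new OffHeapValues();
store.put('uuid1', JSON.stringify({ rating: 4 }));
console.log(store.get('uuid1')); // {"rating":4}
```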
19:35:20  <dominictarr> bradleymeck: how important are deletes?
19:35:27  <dominictarr> can you just do add only?
19:35:54  <bradleymeck> for some of the indices we can, but moderation means we need to be able to remove uuids from lists
19:36:22  <dominictarr> right. often with systems like this, you need to handle deletes differently
19:36:35  <dominictarr> simplest is {add: …, delete: …}
19:36:47  <dominictarr> and then add - delete is your set
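[editor's note: one way to read the "{add, delete} and add - delete is your set" idea as a leveldb layout; the key prefixes and helper names are illustrative, assuming a levelup-style callback API with a `db` instance.]

```js
// Both writes are O(1) appends; membership is computed as add minus delete.
const addKey = (index, id) => `set!${index}!add!${id}`;
const delKey = (index, id) => `set!${index}!del!${id}`;

const addId = (db, index, id, cb) => db.put(addKey(index, id), '1', cb);
const removeId = (db, index, id, cb) => db.put(delKey(index, id), '1', cb);

// Set-difference semantics: an id is a member iff it was added and never
// deleted, so a delete record always wins.
function hasId(db, index, id, cb) {
  db.get(delKey(index, id), (delErr) => {
    if (!delErr) return cb(null, false); // a delete record exists
    // treating any get error as not-found keeps the sketch short
    db.get(addKey(index, id), (addErr) => cb(null, !addErr));
  });
}
```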
19:37:12  * jmartins quit (Ping timeout: 252 seconds)
19:37:36  <dominictarr> what does the request to this thing look like?
19:39:01  <dominictarr> it sounds like you are using it to put a badge on a listing?… are you looking up the listing as part of an http response?
19:39:24  <bradleymeck> well it's all done for validation, so like: db.index(x).getStream()
19:39:35  <bradleymeck> right now it is not http but it could be
19:39:57  <bradleymeck> but the index is too big to put all of them in a single doc
19:41:50  * jcrugzz quit (Ping timeout: 240 seconds)
19:45:10  <bradleymeck> I'll try and think on this; may need to move to sharding rather than a single chunking algo
19:50:01  * timoxley joined
19:54:44  * Acconut joined
19:54:51  * Acconut quit (Client Quit)
19:59:46  <dominictarr> bradleymeck: I need more information to understand exactly what you are trying to do.
20:02:11  <bradleymeck> well the original goal was to make an n-ary chunking system so we could store multi-dimensional arrays. this would then be used as a uniform way for us to export data about reviews/comments vs items that could be reviewed/commented on
20:02:33  * jcrugzz joined
20:02:44  <bradleymeck> we need a way to generate these indices for various audits, fraud prevention, moderation, metrics, etc.
20:03:24  <bradleymeck> the data set is large and even working on a subset is becoming brutal to process in memory, so we moved to a chunked index
20:04:43  <bradleymeck> the chunked index has O(n*m) state compression for removing useless data and O(1) insert/delete; we mainly just needed a way for the O(n*m) part to get a lot smaller
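[editor's note: the "state compression" pass described here, sketched as a pure function over the value({deltas:['-uuid1','+uuid2']}) shape mentioned earlier: fold a chunk's delta log into its surviving ids so stale +/- pairs stop accumulating. Run once per chunk, this is the O(n*m) sweep across m chunks.]

```js
// Fold an ordered delta log into the set of ids still present.
function compact(deltas) {
  const live = new Set();
  for (const d of deltas) {
    const op = d[0];
    const id = d.slice(1);
    if (op === '+') live.add(id);
    else if (op === '-') live.delete(id);
  }
  return [...live]; // becomes the chunk's new base state; the log can be truncated
}

console.log(compact(['+uuid1', '+uuid2', '-uuid1'])); // [ 'uuid2' ]
```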
20:15:39  <levelbot> [npm] typewise@0.6.0 <http://npm.im/typewise>: Typewise structured sorting for arbitrarily complex data structures (@deanlandolt)
20:25:51  * jmartins_ quit (Quit: Konversation terminated!)
20:36:37  * bradleymeck quit (Quit: bradleymeck)
20:37:06  * bradleymeck joined
20:42:42  * jxson quit (Remote host closed the connection)
20:49:44  * timoxley quit (Remote host closed the connection)
20:50:20  * timoxley joined
20:54:51  * timoxley quit (Ping timeout: 248 seconds)
20:54:57  * timoxley joined
21:20:12  * davidstrauss quit (Ping timeout: 260 seconds)
21:21:27  * davidstrauss joined
21:27:35  * bradleymeck quit (Quit: bradleymeck)
21:30:26  * julianduque joined
21:48:44  * jxson joined
21:55:41  * davidstrauss quit (Ping timeout: 260 seconds)
21:59:35  * davidstrauss joined
22:05:36  <julianduque> my first leveldb module, feedback welcome https://github.com/julianduque/level-geo, especially on the R-tree approach in leveldb (cc mbalho)
22:06:23  * davidstrauss quit (Ping timeout: 260 seconds)
22:08:56  <brycebaril> julianduque: that's cool, does the bounding box search return in order of proximity?
22:09:56  <julianduque> brycebaril: not sure, proximity to the centroid?
22:10:37  <brycebaril> That's what I was thinking, though I suppose there are definitely other ways people might consider the results to be ordered correctly
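[editor's note: a sketch of the proximity ordering brycebaril asks about: sort bounding-box hits by distance to the box's centroid. Plain planar math over hypothetical {lon, lat} result objects; not level-geo's actual API.]

```js
// Order bbox results nearest-first relative to the box centroid.
function sortByCentroid(results, bbox) {
  const cx = (bbox.minLon + bbox.maxLon) / 2;
  const cy = (bbox.minLat + bbox.maxLat) / 2;
  const d2 = (p) => (p.lon - cx) ** 2 + (p.lat - cy) ** 2; // squared planar distance
  return results.slice().sort((a, b) => d2(a) - d2(b));
}

const hits = [{ lon: 0.9, lat: 0.9 }, { lon: 0.45, lat: 0.55 }];
console.log(sortByCentroid(hits, { minLon: 0, maxLon: 1, minLat: 0, maxLat: 1 }));
// nearest to the centroid (0.5, 0.5) comes first
```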
22:20:00  * thlorenz quit (Remote host closed the connection)
22:20:10  * tmcw quit (Remote host closed the connection)
22:20:43  * tmcw joined
22:25:00  * tmcw quit (Ping timeout: 245 seconds)
22:25:06  * ryan_ramage joined
22:36:53  * timoxley quit (Remote host closed the connection)
22:37:31  * timoxley joined
22:41:49  * timoxley quit (Ping timeout: 246 seconds)
22:50:11  * jjmalina quit (Quit: Leaving.)
22:50:13  * thlorenz joined
22:57:39  * jcrugzz quit (Read error: Connection reset by peer)
22:58:00  * jcrugzz joined
22:58:24  * thlorenz quit (Ping timeout: 240 seconds)
22:59:21  * ednapiranha quit (Remote host closed the connection)
23:04:14  * ryan_ramage quit (Quit: ryan_ramage)
23:07:59  * timoxley joined
23:12:38  * timoxley quit (Ping timeout: 245 seconds)
23:12:50  * dominictarr quit (Quit: dominictarr)
23:41:16  * fallsemo quit (Quit: Leaving.)