00:05:30  * dominictarr quit (Quit: dominictarr)
00:16:04  * timoxley quit (Quit: Computer has gone to sleep.)
00:20:52  * dominictarr joined
00:21:57  * ralphtheninja quit (Ping timeout: 248 seconds)
00:27:25  * thl0 joined
00:35:26  * dominictarr quit (Quit: dominictarr)
01:14:40  * timoxley joined
01:35:35  * levelbot quit (Remote host closed the connection)
01:37:02  <thl0>juliangruber: rvagg: so there is one problem I see with the .pre add indexing approach
01:38:02  <thl0>it forces me to JSON.parse the value if I wanna index on one of its properties
01:38:09  <thl0>juliangruber: rvagg: i.e. https://github.com/thlorenz/level1/blob/master/samples/indexing/with-sublevel-hooks.js#L29
01:39:00  <thl0>although this only makes inserts slow, so it depends on what you are trying to do
01:39:58  <thl0>still though, there should be a better way, i.e. don't just give me the value, but instead the entire thing, so I could attach a property right next to key and value and wouldn't have to parse
01:42:20  <thl0>somehow everything but key and value gets stripped before pre is called
01:42:35  <thl0>maybe it could do that right after pre instead, so I can use that info to index?
01:50:50  <thl0>juliangruber: rvagg: submitted an issue, maybe dominictarr will have a chance to look at it: https://github.com/dominictarr/level-sublevel/issues/12
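A minimal sketch of the pattern thl0 is describing, assuming the level-sublevel pre-hook API of the time (the hook receives {type, key, value} plus an add() callback that accepts an optional prefix); the sublevel names and the author field are made up for illustration:

```js
var levelup  = require('levelup');
var sublevel = require('level-sublevel');

var db       = sublevel(levelup('/tmp/indexing-example'));
var posts    = db.sublevel('posts');
var byAuthor = db.sublevel('byAuthor');

posts.pre(function (ch, add) {
  if (ch.type !== 'put') return;
  // the hook only hands us the raw value, so we have to parse it here
  // just to get at the property we want to index on
  var doc = JSON.parse(ch.value);
  add({
    type:   'put',
    key:    doc.author + '\xff' + ch.key,
    value:  ch.key,
    prefix: byAuthor
  });
});

posts.put('post-1', JSON.stringify({ author: 'thl0', title: 'hello level' }), function (err) {
  if (err) throw err;
});
```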
02:25:33  * eugeneware joined
02:41:50  * Pwnna quit (Remote host closed the connection)
02:58:10  * Pwnna joined
03:01:36  * levelbot joined
03:02:16  <thl0>rvagg: seems like mapped-index is broken due to patches of dependencies?
03:02:50  <thl0>crypto.js:132
03:02:50  <thl0> return new Hash(hash);
03:02:50  <thl0> ^
03:02:51  <thl0>RangeError: Maximum call stack size exceeded
03:03:34  <thl0>is the error I get when running the tests
03:03:57  <rvagg>eeek
03:04:10  <rvagg>thl0: thanks, will get on to it when I find a spare moment
03:04:21  * rvagg is fixing raid arrays with busted disks
03:05:07  <thl0>cool - I'll stick with just sublevel for now then - also mapped-index still depends on levelup 0.7
03:05:24  <rvagg>it probably shouldn't depend on anything
03:05:37  <rvagg>I need to update the main place where I'm using it, that'll force me to fix stuff
03:06:31  <thl0>np - whenever you have a chance - I'm good for now - sublevel is fine, all these other things are just very nice wrappers
03:06:38  <thl0>I can rewrite stuff later
03:30:16  * thl0 quit (Remote host closed the connection)
03:40:12  * eugeneware_ joined
03:41:22  * eugeneware_ part
04:15:27  <levelbot>[npm] jsonquery@0.0.0 <http://npm.im/jsonquery>: MongoDB query language implemented as a Streaming filter (@nharbour)
04:18:39  <rvagg>eugeneware: going hot with all the leveldb packages, would be interested to hear what you're doing with it
05:19:06  * ramitos quit (Ping timeout: 252 seconds)
05:21:28  <levelbot>[npm] jsonquery@0.0.1 <http://npm.im/jsonquery>: MongoDB query language implemented as a Streaming filter (@nharbour)
05:36:38  * timoxley quit (Quit: Computer has gone to sleep.)
06:25:23  * nemequ part ("Ex-Chat")
06:34:54  * timoxley joined
06:35:29  * Pwnna quit (Ping timeout: 277 seconds)
06:57:07  * ralphtheninja joined
07:12:03  * Pwnna joined
07:25:27  * Pwnna quit (Remote host closed the connection)
07:30:08  * Pwnna joined
07:48:22  * dominictarr joined
07:53:03  * no9 joined
08:11:41  * num9 joined
08:19:18  * eugeneware quit (Quit: Leaving.)
08:22:12  * levelbot quit (Remote host closed the connection)
08:36:26  * ChrisPartridge quit (Ping timeout: 276 seconds)
08:55:54  * hij1nx quit (Ping timeout: 264 seconds)
08:56:10  * Pwnna quit (Ping timeout: 256 seconds)
08:56:55  * hij1nx joined
10:30:46  * timoxley quit (Quit: Computer has gone to sleep.)
10:39:14  * werle joined
10:43:42  * timoxley joined
10:59:33  * thl0 joined
12:10:53  <dominictarr>hij1nx: ping?
12:33:39  * thl0 quit (Remote host closed the connection)
13:13:29  * werle quit (Quit: Leaving.)
13:17:00  * thl0 joined
13:26:02  <thl0>dominictarr: I created somewhat of a plan on how to mine the npm registry and github to get the necessary data in order to analyze a package and properly rate it
13:26:08  <thl0>https://github.com/thlorenz/valuepack/blob/master/data/mine.md
13:26:26  <thl0>probably will first just store all this info and then run an analyzer over the data
13:27:39  <thl0>will prob. use levelup and sublevel only (don't need more abstraction) and mapped-index didn't really work for me
13:40:09  <dominictarr>mapped-index?
13:42:06  <dominictarr>so, I should probably tell you about my ideas for npmd, so that we can work together where they overlap
13:45:04  <dominictarr>thl0: which is basically about making npm install really fast
13:50:37  <thl0>dominictarr: ok, do you have anything written up?
13:50:56  <thl0>level-mapped-index by rvagg didn't work for me
13:51:28  <thl0>so yes, the first part of what I'm doing may overlap, since I'm pulling down all repos' metadata
13:51:54  <thl0>but after that I'm not sure (I'm not pulling tars or readmes)
13:52:03  * thl0 takes a look at npmd
13:54:24  <thl0>dominictarr: so looks like for npmd dealing with deltas makes sense, but maybe not for valuepack
13:54:52  <thl0>in my case it will be hard to reason about what data will have to be updated due to a change in the npm registry
13:55:07  <thl0>also I need to look at github info, which has no update feed
13:55:32  <thl0>so for now I was just gonna run a daily thing that pulls down everything and runs an analyzer over the data
13:55:56  <dominictarr>you could get an updated feed from github by creating a user, and following everyone you want to follow
13:56:01  <dominictarr>or every project.
13:56:04  <mbalho>ooh good idea
13:56:13  <thl0>this may be the easiest and actually be fast enough - tonight I'm gonna do some experimentation on how long everything will take
13:56:26  <thl0>dominictarr: ah that's an idea
13:56:47  <dominictarr>plus, then you get near realtime updates
13:57:01  <thl0>but to get a start I'd need to pull down everything anyways
13:57:14  <dominictarr>thl0: I also implemented a simpler index thing for level
13:57:19  <thl0>so I'll see how long that takes and then build it out from there
13:57:20  <dominictarr>level-index
13:57:49  <thl0>I saw that, but sublevel seems fine
13:57:51  <dominictarr>thl0: good point. you need both of those parts combined for data replication
13:58:17  <dominictarr>have a bit that finds where it was last up to, and pull down everything since then
13:58:26  <dominictarr>and then a bit that keeps it up to date.
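A rough sketch of the "find where it was last up to, pull down everything since then" idea on top of levelup; fetchChangesSince() is a hypothetical helper standing in for whatever feed gets polled (registry changes, github events), and the key names are invented - the cursor-stored-in-the-same-batch pattern is the only point here:

```js
var levelup = require('levelup');
var db = levelup('/tmp/mine-npm');

function sync(cb) {
  db.get('meta\xfflast-seq', function (err, lastSeq) {
    if (err && !err.notFound) return cb(err);
    fetchChangesSince(lastSeq || 0, function (err, changes, newSeq) {
      if (err) return cb(err);
      var batch = changes.map(function (c) {
        return { type: 'put', key: 'packages\xff' + c.id, value: JSON.stringify(c.doc) };
      });
      // write the new cursor in the same batch, so data and cursor never drift apart
      batch.push({ type: 'put', key: 'meta\xfflast-seq', value: String(newSeq) });
      db.batch(batch, cb);
    });
  });
}
```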
13:58:39  <thl0>dominictarr: as far as I understand from reading level-hooks, the pre hook first goes thru the entire batch and just tags the indexing operations onto it?
13:59:01  <thl0>so puts and indexing are done in one batch right?
13:59:01  <dominictarr>yes
13:59:06  <thl0>nice
13:59:24  <dominictarr>exactly. so your indexes and your data will always be consistent.
13:59:43  <thl0>dominictarr: yeah I thought about doing it that way, but it's way more complex to handle deltas
13:59:43  <dominictarr>(well, you need to handle deletes differently)
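To illustrate the point about deletes, a sketch reusing the same hypothetical posts/byAuthor sublevels from the earlier example: on a put the index entry rides in the same batch, but on a del the old value would have to be read first to know which index key to remove.

```js
posts.pre(function (ch, add) {
  if (ch.type === 'put') {
    var doc = JSON.parse(ch.value);
    // the index write is appended to the same underlying batch as the put,
    // which is what keeps data and index consistent
    add({ type: 'put', key: doc.author + '\xff' + ch.key, value: ch.key, prefix: byAuthor });
  }
  // for ch.type === 'del' the hook only sees the key, so removing the matching
  // index entry needs a prior get() of the old value (or a separate cleanup pass)
});
```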
13:59:58  <thl0>dominictarr: I'll have no deletes to handle if I don't deal with deltas ;)
14:00:37  <thl0>so if I can get away with just reevaluating everything i.e. daily I'll probably go with that for now
14:00:56  <thl0>I can always get smart and more efficient later ;)
14:00:56  <dominictarr>sure, that is probably the simple way to start.
14:01:13  <thl0>but the follow all users on github is a grand idea :)
14:01:56  <thl0>dominictarr: for the pre-analysis data, would you store the npm data (users and packages) in a separate db from the github user info?
14:02:05  <thl0>or is it better in one?
14:02:24  <dominictarr>well, if you use sublevel it doesn't make much difference
14:02:36  <dominictarr>it's easy to decouple the github part, and the npm part
14:02:57  <thl0>except I could run two processes in parallel if I use two dbs
14:03:10  <thl0>yeah decoupling is automatic
14:03:34  <dominictarr>might be simpler to use multilevel
14:03:44  <dominictarr>that could let you fan out to many workers
14:03:47  <dominictarr>if necessary
14:04:20  <thl0>working on replicated dbs?
14:04:37  <thl0>or is this on same machine?
14:05:33  <thl0>dominictarr: ah, I think I got it now - this uses a server so multiple processes can connect to the same db - how about contention, juliangruber?
14:06:05  <thl0>i.e. if two process try to write at same time?
14:06:13  <thl0>maybe even to same key?
14:07:16  * no9 quit (Ping timeout: 256 seconds)
14:07:17  * num9 quit (Ping timeout: 252 seconds)
14:07:52  <dominictarr>so, that isn't a part of multilevel, you'd want to handle that another way
14:08:22  <dominictarr>the simplest would be to arrange your workers so that they work with different data, so they never collide.
14:08:53  <dominictarr>they can write to different keys just fine, but for the same key the last one will win
14:08:56  <dominictarr>like in mongo.
14:09:17  <dominictarr>we have been experimenting with ways to do a conditional update, though.
14:09:23  <thl0>got it
14:09:48  <thl0>also if I give each worker a subset (i.e. of github users) to work with, the keys should never collide
14:10:04  <thl0>knowing of that option it makes sense to just use one db
14:10:34  <thl0>can't wait to get home and give this a try ;)
14:11:49  <dominictarr>but I'm gonna work on the conditional update stuff soon, because that will give us master-master replication
14:12:09  <thl0>that will be nice to have
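A small sketch of the partitioning idea discussed above (give each worker a disjoint subset of keys so last-write-wins never comes into play); the worker count and hashing scheme are made up for illustration and have nothing to do with multilevel itself:

```js
var crypto = require('crypto');

var NUM_WORKERS = 4; // made-up worker count

// deterministically map a key (e.g. a github username) to one worker
function workerFor(key) {
  var hash = crypto.createHash('md5').update(key).digest();
  return hash.readUInt32BE(0) % NUM_WORKERS;
}

// each worker only touches the keys assigned to it, so writes never collide
function shouldHandle(workerId, key) {
  return workerFor(key) === workerId;
}
```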
14:12:38  <thl0>dominictarr: btw I saw that all these modules basically mix functionality into the level db instance
14:12:51  <thl0>aren't you worried about possible name collisions?
14:13:07  <thl0>i.e. if more people write mixins like that?
14:14:04  <thl0>also some mixins depend on mixins from other modules (i.e. sublevel), but don't require it
14:14:08  <dominictarr>well, only level-sublevel does that
14:14:59  <dominictarr>and then sublevel gives you a new instance that you are free to extend.
14:15:04  <thl0>mapped-index does as well
14:15:43  <thl0>you have to do db = sublevel(db); db = mappedIndex(db)
14:15:50  <thl0>each call mixes more stuff in
14:16:01  <dominictarr>hmm
14:16:07  <dominictarr>you should do it like this:
14:16:17  <dominictarr>db = sublevel(level(path))
14:16:25  <dominictarr>mapDb = MapReduce(db, ...)
14:16:30  <dominictarr>mapDb !== db
14:16:39  <thl0>just wondering if there is maybe a better way, i.e. just exposing functions into which you pass your db instead of mutating the db itself by adding functions to it
14:16:45  <dominictarr>then add map-reduce related features to mapDb
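A sketch of the wrap-don't-mutate shape being recommended here: the plugin returns a new object around the db instead of mixing methods into it, so two plugins can never clobber each other's names. createIndexer and its key layout are invented for illustration:

```js
// hypothetical plugin: everything it adds lives on the wrapper object
function createIndexer(db) {
  return {
    put: function (key, value, indexKeys, cb) {
      var ops = [{ type: 'put', key: 'data\xff' + key, value: value }];
      indexKeys.forEach(function (ik) {
        ops.push({ type: 'put', key: 'idx\xff' + ik + '\xff' + key, value: key });
      });
      db.batch(ops, cb);
    },
    db: db // the original levelup instance is left untouched
  };
}

// var indexed = createIndexer(db);  // indexed !== db, no methods mixed in
```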
14:16:58  <thl0>yeah, but different modules may try to add fns with the same name
14:17:07  <thl0>then you are screwed
14:17:40  <thl0>and get weird behaviors depending on what module mixed itself in last
14:17:56  <dominictarr>okay, mapped-index is a bad pattern
14:18:26  <thl0>dominictarr: just pointing this out, since it's still fairly new the API can still be fixed now, but that may be harder to do later
14:18:35  <dominictarr>yes.
14:18:47  <dominictarr>this is a huge improvement over how things were before sublevel
14:19:18  <dominictarr>but if you only extend sublevels, and not the root db (except for sublevel, of course)
14:19:40  <dominictarr>I mean, sublevel is the only thing that changes stuff.
14:19:48  <thl0>well hooks does too
14:19:48  <dominictarr>and then everything else can be clean.
14:19:56  <thl0>https://github.com/dominictarr/level-hooks/blob/master/index.js#L134-L146
14:20:04  <dominictarr>oh, right. consider that a part of sublevel.
14:20:19  <dominictarr>we are looking at combining hooks into levelup, anyway.
14:20:49  <dominictarr>the blocker is just deciding how to handle async prehooks,
14:20:59  <dominictarr>(for example, for conditional updates)
14:22:22  <thl0>ok, that is better then
14:23:31  <thl0>I also saw another one that added a bunch of readStream functions (with two names each) - can't remember which one
14:24:00  <thl0>dominictarr: but if you consolidate this stuff and keep this under control it should be ok
14:26:40  <dominictarr>the other reason for this is that it makes multilevel able to expose sublevels (and features added in them) to client connections.
14:27:30  <thl0>ah, that makes sense
14:28:03  <thl0>it would just suck if way further down the line you realize that this mixin model breaks down and it's too late to fix it without breaking lots of things
14:33:20  * werle joined
14:41:29  * levelbot joined
14:58:10  * no9 joined
14:58:28  * num9 joined
15:06:01  * levelbot quit (Remote host closed the connection)
15:07:45  * levelbot joined
15:26:16  * brianloveswords quit (Excess Flood)
15:27:48  * brianloveswords joined
15:38:24  * thl0 quit (Ping timeout: 245 seconds)
15:39:14  * thl0 joined
15:44:47  * ramitos joined
15:48:50  <thl0>dominictarr: ping
15:48:56  <dominictarr>yo
15:49:05  <thl0>am I missing something in that issue?
15:49:19  <thl0>i.e. I don't wanna store the values of my indexes as json
15:49:29  <thl0>unnec. overhead
16:01:43  <levelbot>[npm] level-window@1.0.0 <http://npm.im/level-window>: levelup plugin for creating views on realtime time series data. (@dominictarr)
16:04:00  <dominictarr>thl0: you are using json as the example, though
16:04:24  <dominictarr>you want to add an index per keyword, right?
16:04:33  <dominictarr>so that
16:04:44  <dominictarr>drive -> car?
16:05:26  <thl0>so the value is json
16:05:36  <thl0>but I wanna index by keywords like drive
16:05:48  <dominictarr>if you set encoding: json, then level will stringify it for you,
16:05:57  <dominictarr>so you don't have to parse it twice
16:06:09  <thl0>but I'm storing indexes in same db
16:06:14  <dominictarr>oh
16:06:19  <thl0>whose values are strings
16:06:25  <thl0>can't mix
16:06:31  <dominictarr>right - so you want to store them as strings without ""
16:06:50  <thl0>i.e. 'drive' : value 'car'
16:07:09  <thl0>this automatic json thing breaks down once you store any indexes that have string values
16:07:24  <thl0>also this happens pre-store, right?
16:07:33  <dominictarr>What if you could set different encodings on different sublevels?
16:07:53  <thl0>so why can't I include some extra info on the object that I use for indexing before it gets stored?
16:08:00  <thl0>yes that would be great
16:08:35  <dominictarr>cool. I have been meaning to add that feature!
16:09:09  <thl0>i.e. what I'd like to do is { key: 'car', value: '{ some stringified json }', keywords: [ array of keywords I only need before the entry is stored, in order to index the car ] }
16:09:14  <dominictarr>If you are keen you can make a pr for that?
16:09:35  <thl0>dominictarr: when I have some time and if it is simple enough ;)
16:09:49  <dominictarr>should be pretty simple I think.
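Roughly what mixed encodings in one db could look like using levelup's per-operation encoding option (dominictarr notes further down that levelup accepts an encoding per put/get/stream); whether sublevel forwards a default encoding per sublevel is exactly the feature being proposed here, so treat this as a sketch of the idea, not existing sublevel behaviour:

```js
var levelup = require('levelup');
var db = levelup('/tmp/mixed-encodings');

// JSON-encoded document: levelup stringifies and parses it for us
db.put('posts\xffcar', { keywords: ['drive', 'vehicle'] }, { encoding: 'json' }, function (err) {
  if (err) throw err;

  // plain utf8 index entry - no JSON quoting around the value
  db.put('keyword\xffdrive\xffcar', 'car', { encoding: 'utf8' }, function (err) {
    if (err) throw err;

    db.get('posts\xffcar', { encoding: 'json' }, function (err, doc) {
      if (err) throw err;
      console.log(doc.keywords); // [ 'drive', 'vehicle' ]
    });
  });
});
```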
16:10:15  <thl0>so how about preserving my object past the .pre step, so I can attach extra info?
16:10:28  <dominictarr>I don't like that idea.
16:10:37  <dominictarr>it's not a clean api.
16:10:54  <thl0>hm
16:11:03  <dominictarr>that isn't how arguments should get to the prehook
16:11:16  <thl0>but sometimes the stuff to index by may not even be part of the value
16:11:24  <thl0>so how do I pass those on?
16:11:32  <thl0>to use them in pre?
16:11:41  <dominictarr>in that case, don't use a hook, use a batch.
16:11:55  <thl0>ok, makes sense
16:12:30  <dominictarr>the hook should be independent.
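A sketch of the batch alternative: when the things to index by are not part of the stored value, write the document and its index entries in one explicit batch instead of deriving them inside a hook. The key layout here is invented:

```js
// index inputs (keywords) are passed in explicitly; they never touch the stored value
function putWithKeywords(db, key, value, keywords, cb) {
  var ops = [{ type: 'put', key: 'posts\xff' + key, value: value }];
  keywords.forEach(function (kw) {
    ops.push({ type: 'put', key: 'keyword\xff' + kw + '\xff' + key, value: key });
  });
  db.batch(ops, cb); // one atomic batch: document and index entries land together
}

// putWithKeywords(db, 'car', '{"title":"car"}', ['drive', 'vehicle'], function (err) {});
```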
16:15:16  <thl0>but just to be clear, even if I use json encoding, whenever it hits the pre hook would it have to parse the json?
16:15:45  <thl0>dominictarr: or is the auto stringify happening after it went thru pre?
16:16:24  <chapel>thl0: the way I am doing indexing is using batches - effectively I have a separate function I call that handles the indexing
16:16:40  <chapel>thl0: so it doesn't overwrite the existing put function
16:17:19  <thl0>chapel: I did indexing manually as well https://github.com/thlorenz/level1/blob/master/samples/indexing/with-levelup.js
16:17:38  <thl0>however I feel like doing it with sublevel is a bit cleaner
16:17:50  <dominictarr>thl0: no, if you used json encoding, the pre hook will get it before it's been stringified.
16:18:15  <thl0>dominictarr: awesome, so I'm gonna have to get that mixed encoding going and the problem will be solved
16:18:22  <chapel>thl0: as for the indexes, I keep them simple - effectively they are structured like <index level>:<collection>:<key>:<value> - mind that : is just a placeholder for the actual separator
16:18:56  <chapel>so you can do 1:items:name:foo
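A sketch of reading such an index back with a range scan, following the <index level>:<collection>:<key>:<value> layout chapel describes (with '\xff' standing in for the separator placeholder); what each entry's value holds (here assumed to be a pointer back to the item) is up to the application:

```js
var SEP = '\xff'; // stand-in for chapel's ':' placeholder

function findByName(db, name, cb) {
  var prefix  = ['1', 'items', 'name', name].join(SEP);
  var results = [];
  db.createReadStream({ start: prefix, end: prefix + SEP + SEP })
    .on('data',  function (entry) { results.push(entry.value); })
    .on('error', cb)
    .on('end',   function () { cb(null, results); });
}
```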
16:19:14  <thl0>chapel: nice, I just wanted to have sublevel help me a bit there, i.e. I'd just have to worry about setting things up right and then keep entering values
16:19:27  <chapel>yeah
16:19:27  <thl0>I feel like the sublevel abstraction is quite nice
16:19:36  <chapel>since I'm using binary keys I can't use sublevel directly
16:19:42  <thl0>ah
16:20:21  <chapel>so I hand rolled the sublevel-like functionality, but my code isn't really meant to extend base levelup
16:20:22  <chapel>it's written on top of it
16:20:47  <chapel>but I agree, I like sublevel and how it works
16:20:57  <thl0>that's kinda what I did here as well https://github.com/thlorenz/level1/blob/master/samples/indexing/with-levelup.js#L22
16:20:58  <chapel>the other thing I'm working with is bytewise
16:21:30  <thl0>I think I saw that, but currently have no need (just dealing with json at the moment)
16:21:33  <chapel>which, if you haven't seen it, does some nice standardized binary key formation
16:21:47  <dominictarr>chapel: can you post a sublevel issue describing how you use binary keys?
16:21:48  <chapel>so my keys are actually arrays
16:21:57  <chapel>dominictarr: well it's not an issue with sublevel
16:22:04  <chapel>so much as I need to write the keys directly
16:22:10  <chapel>and pass them through bytewise
16:22:12  <dominictarr>that you can't use binary keys is an issue
16:22:42  <chapel>so I would have to overload all write commands to rewrite the keys to use bytewise
16:23:12  <dominictarr>we have been discussing custom encodings for levelup
16:23:46  <chapel>sublevel would work fine if I didn't want to control the key structure entirely
16:24:10  <chapel>my keys are actually arrays [0, collectionname, key, value] if you will
16:24:29  <thl0>dominictarr: allowing different encodings should be fairly simple - all I'd have to do is pass a value thru here https://github.com/dominictarr/level-sublevel/blob/master/sub.js
16:24:42  <thl0>so that the puts, etc. can pass that in each time
16:24:50  <chapel>and bytewise encodes them in a way that makes it easy to use to do partial key look ups
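A sketch of the array-keys-through-bytewise approach chapel describes, using bytewise's encode/decode on a [level, collection, key, value] array; the path, option names and data are illustrative. Because the encoding preserves component ordering, prefix/range lookups over partial keys become possible:

```js
var bytewise = require('bytewise');
var levelup  = require('levelup');

// keyEncoding: 'binary' (or the plain 'encoding' option, depending on the
// levelup version) keeps the encoded Buffers intact
var db = levelup('/tmp/bytewise-keys', { keyEncoding: 'binary' });

var key = bytewise.encode([0, 'items', 'name', 'foo']);
db.put(key, 'item-42', function (err) {
  if (err) throw err;
  db.get(key, function (err, value) {
    if (err) throw err;
    console.log(bytewise.decode(key), '->', value); // [ 0, 'items', 'name', 'foo' ] -> 'item-42'
  });
});
```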
16:25:15  <dominictarr>yeah, levelup already allows you to provide an encoding per request (put, get, and streams)
16:25:23  <thl0>yep
16:25:34  <thl0>no performance issues with that?
16:25:43  <thl0>guess not - it's just strings in the end anyways, right?
16:26:07  <thl0>so cool I should be able to get this in tonight or tomorrow
16:26:11  <dominictarr>well, it brings it into better parity with levelup, which is important.
16:26:31  <chapel>dominictarr: mikeal's pushdb is where I got the idea to use bytewise how I am https://github.com/mikeal/pushdb
16:31:10  <dominictarr>hmm, I'll look into that.
16:31:13  <dominictarr>thanks
16:37:14  <dominictarr>no9: basic windowing for leveldb https://github.com/dominictarr/level-window
16:38:48  <no9>dominictarr cool I am making dinner here so I will look at it later
16:39:01  <dominictarr>good idea
16:39:16  <no9>Yes food FTW!
16:39:24  <no9>or FML
16:45:46  * thl0 quit (Remote host closed the connection)
16:48:19  * timoxley quit (Quit: Computer has gone to sleep.)
17:12:02  * ramitos quit (Quit: Textual IRC Client: www.textualapp.com)
17:49:16  * thl0 joined
18:28:02  * Pwnna joined
18:31:11  * Pwnna quit (Remote host closed the connection)
18:32:41  <levelbot>[npm] pouchdb@0.0.13 <http://npm.im/pouchdb>: PouchDB is a pocket-sized database. (@chesles, @eckoit, @daleharvey)
18:40:21  * Pwnna joined
19:37:41  * num9 quit (Ping timeout: 248 seconds)
19:37:41  * no9 quit (Ping timeout: 248 seconds)
19:43:51  * Pwnna quit (Remote host closed the connection)
19:46:57  * no9 joined
19:47:01  * Pwnna joined
20:03:30  * num9 joined
20:18:57  * thl0 quit (Ping timeout: 240 seconds)
20:20:29  * thl0 joined
20:46:11  <levelbot>[npm] level-sublevel@4.6.6 <http://npm.im/level-sublevel>: Separate sections of levelup, with hooks! (@dominictarr)
20:52:43  * thl0 quit (Ping timeout: 256 seconds)
20:53:01  * thl0 joined
21:25:42  <levelbot>[npm] level-sublevel@4.6.7 <http://npm.im/level-sublevel>: Separate sections of levelup, with hooks! (@dominictarr)
21:40:26  * dominictarr quit (Quit: dominictarr)
21:42:05  * dominictarr joined
22:04:19  * thl0 quit (Ping timeout: 260 seconds)
22:05:50  * thl0 joined
22:07:21  * timoxley joined
22:09:12  * thl0 quit (Remote host closed the connection)
22:10:50  * werle quit (Quit: Leaving.)
22:12:19  * ChrisPartridge joined
22:17:30  * ChrisPartridge quit
22:36:33  * jcrugzz joined
23:07:07  * jcrugzz quit (Ping timeout: 264 seconds)
23:07:40  <levelbot>[npm] level-sublevel@4.6.8 <http://npm.im/level-sublevel>: Separate sections of levelup, with hooks! (@dominictarr)
23:44:59  <no9>rvagg ZFS won on the backup stuff
23:50:30  <rvagg>no9: fair enough, zfs has a habit of winning
23:51:19  <no9>started a blog post on it https://gist.github.com/No9/25db556b8567125b04bf but it still needs another pass at the technical detail of snapshots etc
23:51:39  <no9>technical detail == impact
23:52:27  <rvagg>yeah, well I think that the other implementations of block-level snapshots are all just copies of zfs - it has maturity on its side
23:52:45  <rvagg>what OS are you doing this on? are you deploying to joyent or something?
23:52:50  * dominictarr quit (Quit: dominictarr)
23:52:57  <no9>Ubuntu for the blog post
23:53:46  <no9>Uses files instead of partitions for easy introduction
23:57:11  <rvagg>ah, nice