00:13:42  * stagas quit (Ping timeout: 255 seconds)
01:09:13  * _pid quit (Quit: cu)
01:14:39  * aaronlidman joined
02:27:20  * ednapiranha joined
02:29:51  * ednapiranha quit (Client Quit)
04:07:15  * aaronlidman quit (Remote host closed the connection)
07:46:33  * stagas joined
09:44:27  * stagas quit (Ping timeout: 272 seconds)
10:20:03  * _pid joined
10:56:27  * stagas joined
13:50:17  * insertcoffee quit (Ping timeout: 244 seconds)
14:15:54  * stagas quit (Ping timeout: 244 seconds)
14:54:30  * aaronlidman joined
15:57:05  * lca joined
15:59:16  <lca> I am getting a "too many open files" error.
16:00:21  <lca> any hints?
16:00:24  <rescrv> lca: increase the maximum number of open files
16:00:31  <rescrv> first using ulimit -n
16:00:43  <rescrv> you may need to edit /etc/security/limits.conf or equivalent to set a higher number
16:00:52  <rescrv> then look at the leveldb options for the same parameter
16:01:17  <rescrv> I recommend setting the nofile parameter to somewhere between 65536 and 262144
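rescrv's steps above can be sketched as the following shell/config fragment; the limits.conf entries and the user name `leveldbuser` are illustrative assumptions, not from the log:

```shell
# Check the current per-process open-file limit
ulimit -n

# Raise the soft limit for the current shell session
ulimit -n 65536

# To make it persistent, add lines like these to /etc/security/limits.conf
# (or a drop-in under /etc/security/limits.d/); "leveldbuser" is a placeholder:
#   leveldbuser  soft  nofile  65536
#   leveldbuser  hard  nofile  262144
```

The process running leveldb must be restarted (and, for limits.conf, re-logged-in) before the new limit takes effect.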
16:01:33  <lca> rescrv, how many files should I have for each db?
16:03:10  <lca> I see. that is a lot. would it be enough even if I have many databases?
16:04:14  <rescrv> figure one file for every 2.5MB of data
16:04:23  <rescrv> fewer files if you are using level-hyper
16:04:37  <rescrv> err.. one file per 1.5MB of data
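A back-of-the-envelope check of rescrv's estimate, as a sketch: the database size, the ~2MB-per-file figure (between the 1.5MB and 2.5MB quoted above), and the per-DB overhead are assumed illustration values:

```python
def estimate_open_files(total_db_bytes, bytes_per_file, num_dbs, overhead_per_db=10):
    """Rough upper bound on file descriptors needed: one SSTable per
    ~bytes_per_file of data, plus a few extra files per database
    (log, MANIFEST, LOCK, etc. -- the overhead figure is a guess)."""
    files_per_db = total_db_bytes // bytes_per_file + overhead_per_db
    return files_per_db * num_dbs

# Example: 100 databases of 1 GiB each, one file per ~2 MiB of data
per_db = 1 * 1024**3
print(estimate_open_files(per_db, 2 * 1024**2, 100))
```

With these assumed sizes the bound comes out to 52,200 descriptors, which lands inside the 65536-262144 nofile range rescrv recommends above.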
16:05:11  <lca> what is level-hyper?
16:05:48  <rescrv> it's a fork of leveldb that I made for use in HyperDex (http://hyperdex.org). It has some performance optimizations that kick in on larger databases.
16:06:19  <rescrv> how many databases do you anticipate having?
16:06:52  <lca> ~100
16:07:37  <rescrv> if they don't have to reside in distinct areas on disk, you might consider multiplexing one database, using a common prefix for each logical db. I believe sublevel is the JS way to do that
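The prefix-multiplexing idea can be sketched with a plain sorted key space standing in for the single physical DB; this mimics what sublevel does in JS, but the `!` separator, the class name, and the data are assumptions for illustration:

```python
import bisect

class MultiplexedDB:
    """Many logical DBs in one sorted key space, each under a key prefix.
    A sorted list of keys stands in for leveldb's ordered storage."""
    def __init__(self):
        self._keys = []     # sorted full keys: b"<logical-db>!<user-key>"
        self._vals = {}

    def put(self, db, key, value):
        full = db + b"!" + key
        if full not in self._vals:
            bisect.insort(self._keys, full)
        self._vals[full] = value

    def iterate(self, db):
        """Forward-iterate one logical DB via a prefix range scan."""
        lo = db + b"!"
        hi = db + b'"'      # '"' is the byte right after '!', so this
                            # bounds exactly the keys under this prefix
        i = bisect.bisect_left(self._keys, lo)
        while i < len(self._keys) and self._keys[i] < hi:
            yield self._keys[i][len(lo):], self._vals[self._keys[i]]
            i += 1

mux = MultiplexedDB()
mux.put(b"users", b"alice", b"1")
mux.put(b"events", b"e1", b"x")
mux.put(b"users", b"bob", b"2")
print(list(mux.iterate(b"users")))   # scans only the "users" logical DB
```

The scan visits only the contiguous key range belonging to one logical DB, which is why forward iteration over one prefix does not have to touch the other prefixes at all.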
16:10:10  <lca> would multiplexing not be a performance problem when I do forward iteration over one logical db?
16:11:04  <rescrv> too many variables to answer that, as it depends on how often you insert, how often you iterate, the size of your data, and how long it's been since you loaded the majority of it.
16:11:29  <rescrv> it also depends on your hardware
16:18:56  <lca> I insert and iterate very often, but the iterations will (most of the time) be over the latest items inserted in the db. item size will be ~0.5 KB
16:27:26  <rescrv> then you'll need to benchmark. I suspect you'll do better with multiple DBs if you're inserting in sorted order, and your write buffer size divided by your insert rate is greater than the time between insertion and iteration of an item.
16:31:04  <rescrv> (that's the insert rate across all DBs)
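rescrv's rule of thumb, that an item should still be sitting in the in-memory write buffer when you iterate to it, comes down to simple arithmetic; the 0.5 KB item size is lca's figure from above, but the insert rate, buffer sizes, and iteration gap are made-up illustration values:

```python
def still_in_write_buffer(write_buffer_bytes, insert_rate_bytes_per_s,
                          seconds_until_iteration):
    """An item is likely still in the in-memory write buffer (memtable)
    when iterated if the buffer takes longer to fill than the gap
    between inserting an item and reading it back."""
    seconds_to_fill_buffer = write_buffer_bytes / insert_rate_bytes_per_s
    return seconds_until_iteration < seconds_to_fill_buffer

# Example: ~0.5 KB items at 1000 inserts/s aggregate across all DBs
rate = 500 * 1000                                       # 500 KB/s
print(still_in_write_buffer(64 * 1024**2, rate, 10))    # 64 MB buffer -> True
print(still_in_write_buffer(4 * 1024**2, rate, 10))     # small buffer -> False
```

At this assumed rate the 64 MB buffer takes over two minutes to fill, so items read within ten seconds of insertion are still in memory; the small buffer fills in about eight seconds, so they usually are not.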
16:36:35  <lca> yeah, I will need to do some tests here.
16:36:53  <lca> rescrv, thank you very much for all that information
16:38:17  <rescrv> you'll probably do best if you're inserting in (roughly) sorted order, and set a big write buffer. The default write buffer is too small.
16:40:09  <lca> I am indeed inserting in roughly sorted order
16:40:37  <lca> the write buffer helps with the number of open files?
16:40:50  <lca> what about cache?
16:46:41  <rescrv> the write buffer shouldn't impact the number of open files in the limit. What it will do is keep the data you're iterating over in memory, so that you are traversing a linked list instead of reading from files on disk.
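A toy simulation of what rescrv describes: a bigger write buffer flushes to disk less often, so the most recent items stay in memory and fewer files get written. This is a deliberate simplification of leveldb's actual memtable/compaction behavior, and all numbers are illustrative:

```python
def count_flushes(num_items, item_bytes, write_buffer_bytes):
    """Simulate a write buffer: accumulate inserts, flush to a new
    on-disk file whenever the buffer fills. Returns (files_written,
    items still in memory at the end)."""
    files, buffered = 0, 0
    for _ in range(num_items):
        buffered += item_bytes
        if buffered >= write_buffer_bytes:
            files += 1          # buffer full: flush to a new file
            buffered = 0
    return files, buffered // item_bytes

# 100k inserts of ~0.5 KB each (lca's stated item size)
small = count_flushes(100_000, 512, 4 * 1024**2)    # small 4 MB buffer
big   = count_flushes(100_000, 512, 64 * 1024**2)   # large 64 MB buffer
print(small, big)
```

With the small buffer the run flushes a dozen times and only the tail of the data is still in memory; with the large buffer nothing is flushed at all, which matches rescrv's point that iteration over recent items then never touches disk.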
16:48:53  * brianloveswords joined
16:51:23  <lca> very good information, rescrv! thanks!
16:55:55  * _pid_ joined
16:56:20  * _pid quit (Read error: Connection reset by peer)
17:16:33  * jarib quit (*.net *.split)
17:19:40  * jarib joined
17:30:22  * stagas joined
17:32:52  * insertcoffee joined
17:34:37  * brianloveswords quit (Quit: Computer has gone to sleep.)
17:45:21  * lca quit (Ping timeout: 255 seconds)
17:48:16  * brianloveswords joined
18:09:57  * brianloveswords quit (Quit: Computer has gone to sleep.)
19:30:36  * lca joined
19:50:00  * dguttman joined
20:35:53  * lca quit (Ping timeout: 264 seconds)
20:42:33  * aaronlidman quit (Remote host closed the connection)
22:13:48  * insertcoffee quit (Ping timeout: 250 seconds)
23:29:09  * aaronlidman joined