02:53:02  <Ralith_>Dirkson: I strongly suspect your program will have undesirable behavior even in the unlikely event that you do get all the mutex wrangling exactly correct
02:53:21  <Ralith_>you need to fundamentally redesign it so that you don't have several threads trying to munge the same data simultaneously.
02:54:10  <Ralith_>because indeed if they're blocking on each other all the time then they *aren't* munging data simultaneously, so it might as well be a single thread.
02:56:18  <Ralith_>I don't know exactly what problem you did this in an attempt to solve, but you should strongly consider moving all operations on this data into a single thread and then, if and only if necessary, dividing up non-overlapping segments of work using a thread pool
02:56:19  <Dirkson>Ralith_: Mostly I have a bunch of read threads, with the rare (and usually rather heavy) need to write. As a result, I've designed a system of rwlocks. Mostly it works fine - There are only two outstanding threading errors I'm aware of currently. And based on the section of code I'm fuddling around in, I'm starting to suspect that they're actually the same single error popping out at a couple separate seams.
02:56:48  <Ralith_>that type of architecture is almost never the correct solution
02:57:11  <Ralith_>why are you doing that?
02:57:38  <Dirkson>Ralith_: Which bit did you mean to indicate with 'that'?
02:57:53  <Ralith_>using large numbers of threads which simultaneously access a large amount of shared state
02:59:12  <Dirkson>Because I have large data structures that numerous threads need to be able to access random sections of quite often.
02:59:25  <Ralith_>what makes you think you need that?
03:00:09  <Dirkson>I am unsure what question you are asking. I ran out of CPU to do the math I wanted to do?
03:00:50  <Ralith_>distributing math across multiple cores requires neither large numbers of threads nor significant amounts of shared state
03:01:06  <Dirkson>Yeah, then I don't understand the question you've asked.
03:02:04  <Ralith_>I am trying to understand what led you to this design so I can explain a better way in terms you can directly relate to
03:02:33  <Dirkson>Ralith_: Yes! I kinda get that, and I'm trying to be helpful as a result :D
03:03:46  <Ralith_>would I be right to say that you started with a design that could have been single threaded, but it wasn't fast enough (at least in theory) so you divided up work into a bunch of different threads?
03:06:13  <Dirkson>Ralith_: Ok, so I'm making a space game. Voxel space ships. Each ship is a -huge- amount of data. Even with compression techniques, a large ship could eat up a gigabyte of ram. I can't copy the entire ship. I have lots of operations that need to be done to ships - Preparing them for rendering is a huge one. There's also a good deal of simulation math. Taking user input, and seeing if that causes any changes to
03:06:15  <Dirkson>the ships. Taking network information, and applying it to the ships. I could probably aggregate all changes into a thread that 'owns' the ships, but aggregating all the reads would rapidly turn into a nightmare, and destroy performance.
03:08:26  <Ralith_>have you thought about what happens if a read operation (say, rendering) sees, in a single pass, part of a ship before a write operation has taken place, and another part after?
03:09:50  <Dirkson>Ralith_: Generally speaking, some minor math inconsistencies, or 1/60th of a second of delayed rendering of the affected object. Assuming I don't bork up the mutexes :D
03:11:37  <Ralith_>what if the write operation involves some material moving from an area controlled by one mutex to another? a read operation to e.g. compute total mass, or enable the user to pick up an object, might see the material doubled, or missing entirely
03:12:07  <Ralith_>(to say nothing of inertia tensors and so forth)
03:18:20  <Dirkson>Ralith_: Heat springs to mind. That'd move from block to block, potentially over current chunk-mutex bounds. It's difficult to imagine how I'd lock the mutexes without introducing deadlocks, or allowing for occasional undefined behavior between unlocking one mutex and locking another. I probably haven't been seeing these effects because I have few things that travel from one chunk to another.
03:19:52  <Ralith_>avoiding this type of error is indeed not really feasible with your current massively-shared-state design, and is one of several reasons why such a design should be avoided
03:20:32  <Ralith_>instead, you can transpose the problem
03:21:25  <Ralith_>allocate sufficient memory for two copies of the time-varying aspects of your world state
03:22:23  <Ralith_>construct a single simulation thread, which will proceed in discrete timesteps
03:22:56  <Ralith_>construct a worker thread pool, having a number of threads roughly equal to the number of CPUs available on the system
03:24:44  <Ralith_>each time step, the simulation thread computes a new world state based on the old world state, by reading the old world state and writing the updated state elsewhere
03:25:22  <Ralith_>note that rendering of the "old" world state can take place simultaneous to this, if desired, because you are not writing to it
03:25:36  <Dirkson>This is roughly the approach I was taking with my heat simulation, writ-large across the sky.
03:25:58  <Dirkson>With some tweaks, I suspect it could work very well indeed.
03:26:18  <Ralith_>to take full advantage of available CPU cores, you can divide up the "compute a new world state" operation into discrete units, which you submit to a job queue for execution by the thread pool
03:26:51  <Ralith_>each discrete unit will be responsible for writing to a specific part of the new world state, such that no two distinct units will ever write to the same location
03:27:03  <Dirkson>Yup.
03:27:16  <Ralith_>because the previous world state is separate and not being modified, any work unit can read any location in it freely
03:27:50  <Ralith_>when the new world state has been computed, and any rendering or other work accessing the old world state has finished, you swap new/old labels and start again
03:28:46  <Ralith_>this allows you to take full advantage of SMP hardware without requiring any shared writable state or mutexes except the trivial amount necessary to manage the work queue for the thread pool
03:29:37  <Ralith_>and it ensures that all portions of the system always see a single internally-consistent view of the world
03:30:05  <Ralith_>not only is this drastically less fragile and better-behaved than continuously munging a single shared state, it's also got a lot less overhead, assuming the size of your work units isn't completely off (and you can tune that adaptively if you like)
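A minimal C++ sketch of the double-buffered scheme Ralith_ describes above. All names here (`WorldState`, `ThreadPool`, `step`, the heat array, the decay factor) are illustrative, not from the conversation: a fixed pool of worker threads, a mutex-guarded job queue, and a step function whose work units may read anywhere in the old state but each write only their own disjoint slice of the new state.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical world state: only the time-varying data needs two copies.
struct WorldState {
    std::vector<float> heat;  // stand-in for per-voxel simulation data
};

// Minimal fixed-size thread pool with a mutex-guarded job queue; this
// queue is the only shared writable state the whole scheme needs.
class ThreadPool {
public:
    explicit ThreadPool(unsigned n) {
        for (unsigned i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }
    ~ThreadPool() {
        { std::lock_guard<std::mutex> lk(m_); done_ = true; }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }
    void submit(std::function<void()> job) {
        { std::lock_guard<std::mutex> lk(m_); jobs_.push(std::move(job)); ++pending_; }
        cv_.notify_one();
    }
    void wait_idle() {  // block until every submitted job has finished
        std::unique_lock<std::mutex> lk(m_);
        idle_cv_.wait(lk, [this] { return pending_ == 0; });
    }
private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return done_ || !jobs_.empty(); });
                if (done_ && jobs_.empty()) return;
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();
            {
                std::lock_guard<std::mutex> lk(m_);
                if (--pending_ == 0) idle_cv_.notify_all();
            }
        }
    }
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> jobs_;
    std::mutex m_;
    std::condition_variable cv_, idle_cv_;
    unsigned pending_ = 0;
    bool done_ = false;
};

// One timestep: each work unit reads freely from `prev` (nobody writes
// it) and writes only its own [lo, hi) slice of `next`, so the units
// never block each other and need no locks of their own.
void step(const WorldState& prev, WorldState& next, ThreadPool& pool,
          std::size_t units) {
    std::size_t n = prev.heat.size();
    for (std::size_t u = 0; u < units; ++u) {
        std::size_t lo = n * u / units, hi = n * (u + 1) / units;
        pool.submit([&prev, &next, lo, hi] {
            for (std::size_t i = lo; i < hi; ++i)
                next.heat[i] = prev.heat[i] * 0.99f;  // stand-in for real physics
        });
    }
    pool.wait_idle();
}
```

After `step` returns, and any rendering still reading the old buffer has finished, the caller swaps which copy is "old" and which is "new" and starts the next timestep, exactly as described in the log.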
03:31:33  <Dirkson>Ralith_: I associate this style of parallelism with openmp. Should I?
03:31:48  * Ralith_ shrug
03:32:09  <Ralith_>I would never use openmp for gamedev, but it's not a totally ridiculous analogy
03:33:07  <Ralith_>this approach will give you an especially visible performance improvement in cases where you have a lot of work to be done that requires accessing a single region, e.g. a ship, since units of simulation are never blocked by each other
03:33:41  <Ralith_>you'd manage user input by queueing it up as received and consuming the whole queue at the beginning of each timestep
03:33:56  <Dirkson>Aye. It also forces me to parallelize other sections of the game that could probably use it, such as preparing ships for rendering.
03:34:07  <Ralith_>it allows you to, it certainly doesn't force you to
03:34:32  <Ralith_>anything that isn't factored into a work unit for execution by the thread pool can just be done directly in the simulation thread
03:34:40  <Dirkson>Currently preparing ships for rendering eats up one core of whatever cpu it gets. And on most CPUs, could use a bit extra :D
03:42:07  <Dirkson>Ralith_: Yeah, I like this idea. I'd never seriously considered this style of parallelization, but it has significant advantages. And whatever cycles it drops dealing with single threaded stuff is almost certainly made up for with more even load balance, and not having to deal with slow mutexes.
03:43:26  <Ralith_>at the limit, the only "single-threaded stuff" that takes place in such a system is the issuing of work units, which is computationally trivial
03:57:09  <Dirkson>Ralith_: Right. Thanks for talking through this with me. You really forced me to think about the benefits offered by this style of parallelism.
03:57:38  <Ralith_>np
04:02:35  <Dirkson>Opengl does complicate things a bit. It -needs- its own thread, because it will absolutely peg whatever I hand to it so that it can sit there and, as near as I can tell, do absolutely nothing. (opengl render calls are blocking, which seems odd, since all the work is done on the gpu, isn't it?) And there is a -little- shared data that the render thread currently utilizes mutexes to access. But almost all of
04:02:37  <Dirkson>that is small enough that I can just send copies back and forth without issue.
04:06:42  <Dirkson>And this applies in reverse to input, as the window system I'm using has to run its input on the main thread. I think. I'll verify that, just to be sure.
09:21:21  <Ralith_>Dirkson: the only system I'm aware of that cares in the slightest about what thread you use to interact with the windowing system is OSX
09:21:34  <Ralith_>on linux, you can quite easily route all input through a libuv event loop
09:21:59  <Ralith_>on windows it's theoretically possible but kind of a bitch, probably easier to just have a dedicated thread feeding events to a queue
09:23:53  <Ralith_>OpenGL does need a dedicated thread, but it certainly shouldn't peg it if you're doing things right; alternatively, you could also switch to vulkan, and not have to deal with any of that unpredictable wackiness at all
15:55:08  <Dirkson>Ralith_: Yeah, vulkan is on the list of future upgrades. I suspect it'd open up a lot of cool stuff... But I also suspect it'd be a really substantial redesign that'd take weeks. Opengl's calls are blocking, so they basically have to peg the cpu if you're not rendering above your target fps, which quite frequently we're not. Currently I use glfw for windowing and input, and it requires its input callbacks to
15:55:10  <Dirkson>operate on the same thread as the opengl rendering. My current plan is to simplify those callbacks down to just recording events into my own queue structure, then sending it to the other thread for -actual- processing.
15:55:45  <Dirkson>Which, upon re-reading, sounds a lot like one of your suggestions :D
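The callback-to-queue plan Dirkson outlines (callbacks only record events; another thread does the actual processing) might look like the following sketch. `KeyEvent`, `EventQueue`, and both function names are hypothetical; with real GLFW, the body of `on_key` would live inside a callback registered via `glfwSetKeyCallback`, reaching the queue through `glfwSetWindowUserPointer`/`glfwGetWindowUserPointer`.

```cpp
#include <mutex>
#include <vector>

// Minimal event record; a real one would mirror GLFW's key/action/mods.
struct KeyEvent { int key; int action; };

struct EventQueue {
    std::mutex m;
    std::vector<KeyEvent> events;
};

// What the GLFW key callback reduces to: lock, record, return.
// No game logic runs on the input/render thread.
void on_key(EventQueue& q, int key, int action) {
    std::lock_guard<std::mutex> lk(q.m);
    q.events.push_back({key, action});
}

// The simulation side takes the whole accumulated batch once per
// timestep, so each step sees one consistent snapshot of user input.
std::vector<KeyEvent> take_events(EventQueue& q) {
    std::lock_guard<std::mutex> lk(q.m);
    std::vector<KeyEvent> out;
    out.swap(q.events);
    return out;
}
```

This matches Ralith_'s earlier suggestion of queueing input as received and consuming the whole queue at the beginning of each timestep: the only contention is the brief lock around the push/swap, and the swap hands the batch over without copying individual events.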
17:17:51  <Ralith_>Dirkson: very few opengl calls should block under any circumstances
17:25:06  <Dirkson>Ralith_: Well.. They do :D Like crazy amounts. Near as I can tell, they block for exactly as long as they're rendering.
17:26:56  <Ralith_>very few opengl calls actually render anything, either
17:27:10  <Dirkson>Well, that's true.
17:27:46  <Dirkson>I don't remember offhand which call(s) it is that actually introduces the delay. I want to say the swapbuffers command.