Small optimization?

  • Quasialias
    31st May 2013 Member 0 Permalink
    If development for TPT is still going on, I think a code re-ordering is sort of necessary. A solid, large circle of wood brings my fps down to 30 from a solid 60, and I run a quadcore with parking disabled. Maybe you could completely skip evaluating a particle if it's velocity is zero? It would still interact if another particle called upon it, but it brings the complexity in these sorts of situations from n^2 to n at most. I have a few more ideas, but... What do you guys think?
  • jacob1
    31st May 2013 Developer 0 Permalink
    @Quasialias (View Post)
    we already do this actually. It isn't going to try and move a particle that isn't moving.

    The problem is that it doesn't matter at all that you use a quadcore. That isn't going to speed up the game at all (except maybe newtonian gravity) because tpt can only use 1 core. Attempting to make the simulation code multicore would be a nightmare ... it's just not designed for it and would not work.

    We've already made many improvements recently, so it does keep getting a little faster. But I think it's just that it has to loop through every single particle a few times in the code, and that there is the problem.
  • Quasialias
    31st May 2013 Member 0 Permalink
    My other idea unrelated to multi-core or gpu (neither of which I understand enough to recommend) is a sort of 'particle procedure stack', which lists only the particles that need simulation every tick. So for a static solid that isn't going to do anything unless anything does something to it, the game needs not even know it exists until another particle calls upon it.

    Also, I looked at the source code. It is a nightmare.
  • savask
    31st May 2013 Developer 0 Permalink

    @Quasialias (View Post)

    Your "particle stack" will bring more overhead so that speed benefits (if any) won't be even noticeable, or even get worser. There is another way to make particles rest if nothing happens near them: at the stage of temperature simulation, game keeps a particle neighbors counter. You can pass it to element code and skip it if there are no particles nearby (counter = 0). As I remember that's what TPT does already xD

  • plypencil
    31st May 2013 Member 0 Permalink

    I suggested a while back splitting the particle logic between the CPU cores. Will only be beneficial when you have lots of particles, plus you would still get the slow down from the drawing. But it will be a lot better than what it is now.

  • boxmein
    31st May 2013 Former Staff 0 Permalink
    A lot of the elements actually do stuff without moving.
    And a lot of elements don't have an update function (which means they won't literally cause any speed decrease, see DMND).
    Wood updates are necessary for it to burn.

    There's not much threading can do to help since threads need communication between each other and waiting and stuff like that, bringing in lots of speed decreases. Due to the particle nature of the game one simply can't separate updating from drawing with two threads, or even move drawing to graphics cards (because it won't help).

    Best you can do, live with it.

    Also, if you feel like you could give it a whirl, the source is here: https://github.com/FacialTurd/The-Powder-Toy/
    Lots of contributions from good modders are accepted via pull requests so your giving a whirl might actually benefit everyone.
    :)
  • cyberdragon
    31st May 2013 Member 0 Permalink

    I have just the opposite problem. I have a mod element that updates so slowly that instead of causing lag it literally skips half the funtions in it's loop. And, no, I already checked for stray returns or breaks. Any way to increase the overhead. I need to force it to update more often.

  • Pilihp64
    31st May 2013 Developer 0 Permalink

    @plypencil (View Post)

    Someone made a version that did exactly this.  It does increase speed, but it causes undefined behavior because of particles updating in different orders, pretty much breaking every single save (if split processing).  The particle list is meant to be updated one at a time, and each update can change a lot.

     

    @cyberdragon (View Post)

    This is unrelated to the topic, your own mod issues are because of your code, if you have a question or need help, use your mod thread or start another topic.

  • Quasialias
    1st Jun 2013 Member 0 Permalink

    Well, I suggest any of you chaps create a huge static circle of wood and see what happens to your fps. A human would look at this situation, and stop checking for fire and mayhaps stop processing altogether until some fire is actually added. That's what I think heuristics refer to in an application like this: Small rules of thumb you can follow to not waste computation. Update() itself does not cause significant overhead in the long run, nor would any dynamic programming - It is all purely based on the Big-O complexity of which code is being run, i.e. 99% of the processing time is wasted in n^2 loops. Minimize when you call those and you've got it.

    Some simple possible heuristics:
    If the particle has no 'active' neighbors, add it to a list of particles which you can skip, including this. O(n): It would iterate through every particle once every tick.
    Leave particles that indirectly sense other particles over small distances active. - It is action-at-a-large-distance that causes n^2 complexity.
    Implement heat in a separate handler

    A bit trickier ones:
    As I said before, a stack which includes only the particles that can possibly catalyze an interaction, i.e. the ones that move. For heat, have the mechanisms through which convention occurs add the particles they are interacting on to the active list. For ambient heat, have the separated class that handles this do so.

    A temporal analysis to find local loops in time. For repeating things, much of the calculation after the first few frames is useless. Though memory inefficient, each particle saving the two previous states of themselves might be better in the long-run. See Poincare Recurrence Time, in conventional physics.

    Finally, whoever said diamond is efficient quite plainly lied. Anyone, open up powder, spawn a large circle of diamond, and watch your fps half itself. This is not because it's doing furiously complex, necessary calculations in the background. If you could guess what's about to happen, it does not take half your fps. And what I guess is going to happen to that diamond ball is nothing until I add more elements. Despite what a lot of people think, common sense is programmable.

    plypencil:

    I suggested a while back splitting the particle logic between the CPU cores. Will only be beneficial when you have lots of particles, plus you would still get the slow down from the drawing. But it will be a lot better than what it is now.


    I don't think multi-core is even necessary, but if you think it's only effective for large amounts of particles, only enable it when there are! I hope nobody overestimates the computational price of that: O(1) if the game already keeps track of the number of particles.

  • Pilihp64
    1st Jun 2013 Developer 0 Permalink

    @Quasialias (View Post)

    This would be nice, sure, but still has some issues, nearly *every* particle type does loops around itself to check for things.  An exception list for elements that don't might be nice, but really, a save with a huge block of diamond is useless.  It will always be possible to make pointless laggy saves, functioning saves are most likely going to have very few particles where there is little wasted calculations.

     

    Checking previous state for patterns would be neat, but what would the limit be?  What about a blob of an element with a very low chance of exploding, and what happens when the blob doesn't explode for a few seconds(hundreds of frames)?  What about huge circle of diamond with a single gas particle inside, it always changes, wouldn't pattern calc make it even worse?  Now nothing usually affects diamond, but what if we add a gas that does?

     

    It really isn't possible to determine what is a catalyst for every element dynamically, considering you can change entire update functions/properties from Lua.