Fractal Softworks Forum

Author Topic: Massive performance drops when docked on Linux  (Read 2900 times)

TimeMasterNix

Massive performance drops when docked on Linux
« on: October 30, 2020, 10:52:15 PM »

Whenever I dock somewhere, my fps drops to 4 or 5 when the menu options are fading in, on shop screens, etc. I'm on a good computer and the game runs smoothly everywhere else. The performance percentage in the top left seems to stay around 10-15%, and this doesn't happen running it in Windows on the same machine. Has anyone else had this issue?

My specs are:
Intel i5 4590S
AMD RX 5600XT
8GB RAM (4G allocated to the game)
Manjaro Linux (tried a manual install and an AUR install)

UPDATE: Installing amdgpu-pro-libgl seems to have fixed it.
« Last Edit: October 30, 2020, 11:11:45 PM by TimeMasterNix »

Swanny

Re: Massive performance drops when docked on Linux
« Reply #1 on: October 31, 2020, 09:10:29 AM »

This is already a well-known, reported issue with Starsector and the mesa amdgpu driver on Linux - there's an existing upstream bug (1265), but no progress so far. (Someone should update the bug with a new API trace, since the original reporter appears to have gone quiet.)

It looks to be unoptimised legacy OpenGL calls, since Starsector only uses OpenGL 1.5 IIRC. Probably synchronous copies from GPU to CPU are hurting performance - something that could perhaps be improved within the game's code too (without needing a newer OpenGL version).
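
For illustration, the classic offender is a readback that forces the driver to drain the pipeline - a contrived LWJGL sketch, not actual Starsector code:

Code
import java.nio.ByteBuffer;
import org.lwjgl.BufferUtils;
import static org.lwjgl.opengl.GL11.*;

public class SyncStallExample {
    // Contrived example of a GPU-to-CPU round trip; not Starsector code.
    public static ByteBuffer grabPixels(int width, int height) {
        ByteBuffer pixels = BufferUtils.createByteBuffer(width * height * 4);
        // glReadPixels can't return until every queued draw call has finished
        // on the GPU, so the CPU stalls for however long that takes.
        glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
        return pixels;
    }
}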

Anyway, there is a recent upstream merge request to improve display lists (MR 7078) that might help, and maybe MR 6946 as well.

Alex

Re: Massive performance drops when docked on Linux
« Reply #2 on: October 31, 2020, 10:31:05 AM »

Of note: you can mitigate this by opening up data/config/settings.json, and changing this:

"uiFadeSpeedMult":1,

To this:

"uiFadeSpeedMult":1000,

This'll functionally remove the fade transitions from the game.

... that could perhaps be improved within the game's code too (without needing a newer OpenGL version).

Thank you for all the good info! Any thoughts on how? The game is not doing much special there, just rendering the same things with different transparency, and a black quad over the entire screen. But it does the "quad over whole screen" thing elsewhere, too, so this is just... odd.
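
Roughly speaking - this is a paraphrased sketch, not the actual code - the fade amounts to:

Code
import static org.lwjgl.opengl.GL11.*;

public class FadeOverlay {
    // Paraphrased sketch of a full-screen fade quad in immediate-mode OpenGL.
    public static void drawFade(float alpha, float screenW, float screenH) {
        glEnable(GL_BLEND);
        glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA); // before glBegin!
        glDisable(GL_TEXTURE_2D);
        glColor4f(0f, 0f, 0f, alpha); // black quad, faded in/out via alpha
        glBegin(GL_QUADS);
        glVertex2f(0f, 0f);
        glVertex2f(screenW, 0f);
        glVertex2f(screenW, screenH);
        glVertex2f(0f, screenH);
        glEnd();
    }
}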

Swanny

Re: Massive performance drops when docked on Linux
« Reply #3 on: October 31, 2020, 12:41:05 PM »

The fades have never seemed like something special that should cause a slowdown (and the only other slowdown happens occasionally, but not always, when a mass swarm of fighters attacks).

I'd only suggest the usual modern OpenGL performance recommendations, which you probably already know. I had a good article bookmarked that I've since lost, but from a quick search: maybe the Intel Game Development document "OpenGL Performance Tips: Avoid OpenGL Calls that Synchronize CPU and GPU", or "Buffer Object Streaming" from the Khronos wiki. Some of those suggestions might preclude keeping OpenGL 1.5 as the minimum supported version without extensions, though.
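
For example, the "orphaning" idiom from that Khronos page does stay within plain OpenGL 1.5 - a rough sketch with a made-up helper, not anything from the game:

Code
import java.nio.FloatBuffer;
import static org.lwjgl.opengl.GL15.*;

public class StreamingVbo {
    // Sketch of buffer "orphaning" from the Khronos Buffer Object Streaming
    // page; hypothetical helper, not Starsector code.
    public static void upload(int vboId, FloatBuffer vertices, long sizeBytes) {
        glBindBuffer(GL_ARRAY_BUFFER, vboId);
        // Re-specifying the data store with no data "orphans" the old buffer,
        // letting the driver hand back fresh memory instead of waiting for
        // the GPU to finish reading the previous frame's contents.
        glBufferData(GL_ARRAY_BUFFER, sizeBytes, GL_STREAM_DRAW);
        glBufferSubData(GL_ARRAY_BUFFER, 0, vertices);
    }
}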

I'll re-add an apitrace or RenderDoc capture to the mesa bug so the developers can track the calls during the slowdown.

Alex

Re: Massive performance drops when docked on Linux
« Reply #4 on: October 31, 2020, 12:53:51 PM »

Thanks! Yeah, broadly aware of the "don't make round trips to the GPU" thing. A lot of the other generally-good advice isn't super applicable to 2D stuff, unfortunately...

Swanny

Re: Massive performance drops when docked on Linux
« Reply #5 on: November 04, 2020, 06:18:43 AM »

Had an initial response from one of the mesa developers on the apitrace:

Quote
The game has a few errors: glGetError and glBlendFunc are both called between a glBegin and a glEnd which causes GL_INVALID_OPERATION errors

So I assume the operations between the affected glBegin/glEnd are discarded, hence the FPS crash (rather than my earlier guess of a GPU-to-CPU sync). Mesa is a more strictly correct OpenGL implementation, as opposed to the proprietary NVIDIA and ATI/AMD drivers, which might bodge around such errors for games.
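
In other words, something shaped like this would trigger it (a contrived sketch, not code from the game):

Code
import static org.lwjgl.opengl.GL11.*;

public class InvalidBlendCall {
    // Contrived illustration of the error the mesa developer describes.
    public static void badQuad() {
        glBegin(GL_QUADS);
        glVertex2f(0f, 0f);
        glVertex2f(1f, 0f);
        // Illegal: glBlendFunc is not allowed between glBegin and glEnd.
        // It sets GL_INVALID_OPERATION and the blend change is ignored;
        // a strict driver may discard the primitive entirely.
        glBlendFunc(GL_SRC_ALPHA, GL_ONE);
        glVertex2f(1f, 1f);
        glVertex2f(0f, 1f);
        glEnd();
    }
}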

Alex

Re: Massive performance drops when docked on Linux
« Reply #6 on: November 04, 2020, 10:53:28 AM »

Thank you for the added info!

Could you point me to how one would go about replaying/examining the api trace file you've posted there? Some preliminary poking around with apitrace didn't produce any results. I'm not sure if .zst is an archive that needs to be extracted somehow, or...?

That aside - hmm. I've looked at all the occurrences of glBlendFunc in the game (all 390 of them, sigh) and - aside from one errant and unnecessary call when drawing the hyperspace storm terrain - they all appear to be outside glBegin()/glEnd() calls. It's possible I've missed something, or that there's an unexpected combination of method calls that somehow leads to this, but at least I'm not seeing anything obviously wrong.

I also looked at, more specifically, the "showing dialog" transition; everything there seems to be in the right order; that is, the things it's doing in addition to the "regular" things do not have glBlendFunc() inside glBegin()/glEnd() calls. Which, I mean, the previous check of all glBlendFunc calls in the game would cover, but, just paying more careful attention here. Rendering stuff with a partial transparency also seems like it'd be unlikely to change the order of calls here...

The game also doesn't call glGetError() directly. AFAIK it's called from LWJGL's Display.update() method, but it's hard to see how that could ever end up inside glBegin()/glEnd() without causing major problems, and also how it might be related to the dialog transition.

So, yeah, if I can see more detail about the glBlendFunc call that somehow ends up inside glBegin/End, that would be very helpful! The glGetError call being there, though, it's hard to imagine how that could happen at all.

(Edit: just want to say that I really appreciate you taking the time to deep-dive this issue.)
« Last Edit: November 04, 2020, 11:05:38 AM by Alex »

Swanny

Re: Massive performance drops when docked on Linux
« Reply #7 on: November 05, 2020, 03:06:21 AM »

Zstandard is a newer compression method - much faster decompression for the same or better compression ratios. It's much easier to use different compression methods on Linux than on Windows! I've uploaded an alternative zipped trace for you: https://www.improbability.net/apitrace/starsector-mesa-git20201031-issue1265.trace.zip

Not really used apitrace much myself, but to view errors in a trace with qapitrace:
  • Open the trace.
  • Run a replay.
  • Errors from the replay are then shown at the bottom.
  • Double-click an error to jump to that frame and event/call.
  • Search/find up or down from that event/call.
  • Double-click an event/call to see the GL state, if desired.

I see the major API "error 2: GL_INVALID_OPERATION in Inside glBegin/glEnd" errors corresponding to the frames with fades where the FPS plummets, but I'm struggling to find the glBlendFunc calls between glBegin/glEnd that the mesa developer reported. It doesn't seem to show what's causing the GL_INVALID_OPERATION, so I think I'm missing a step here. qapitrace then reports "too many identical messages" and stops reporting these errors.

We then see some glGetError(glEnd) = GL_INVALID_OPERATION errors at a lower frequency, but it again hits "too many identical messages" and stops reporting those errors too. Some glGetError(glEndList) = GL_INVALID_OPERATION as well. However, the glGetError calls aren't listed in the events for some reason...
EDIT: Presumably the glGetError calls are made by qapitrace when replaying with error checking (the default), hence why they're not in the trace.

And for a profile:
  • Run a profile (which will run a replay and then open a new window with the profile).
  • Double-click on a call (the ones where CPU duration is long - at least when replayed with the mesa drivers for this bug) to jump to that event.

RenderDoc is newer, with a more extensive debugger interface, but unfortunately it only supports OpenGL 3.2 and newer.

EDIT: Updated the mesa bug with my findings.
« Last Edit: November 05, 2020, 06:49:12 AM by Swanny »

Alex

Re: Massive performance drops when docked on Linux
« Reply #8 on: November 05, 2020, 09:40:56 AM »

Thank you! Hmm, getting an "error: unsupported trace format version 6" error trying to run "apitrace replay". But then it looks like you already did what I was going to do with this, anyway!

Regarding this:
glGetError(glEndList)

I'm assuming this means that a glEndList() call resulted in an error being set? I don't imagine the glEndList() method calls glGetError(), and, right, the game itself doesn't, so this is the only interpretation I can come up with.

The two ways this could happen, according to the doc, are:

1) if glEndList is called without a preceding glNewList, or if glNewList is called while a display list is being defined.

2) if glNewList or glEndList is executed between the execution of glBegin and the corresponding execution of glEnd.

Since the game uses a utility class to manage GL lists, it's pretty easy to instrument this a bit; so I've made sure that the first case definitely isn't happening. The second, again, looking at the code, I'm not seeing anything obvious as far as how that might happen...
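
(The instrumentation is along these lines - a hypothetical sketch of such a utility, not the actual class:)

Code
import static org.lwjgl.opengl.GL11.*;

public class GLListUtil {
    // Hypothetical instrumented display-list utility, sketching the check
    // described above; not the game's actual utility class.
    private static boolean defining = false;

    public static int beginList() {
        if (defining) throw new IllegalStateException("glNewList while a list is being defined");
        defining = true;
        int id = glGenLists(1);
        glNewList(id, GL_COMPILE);
        return id;
    }

    public static void endList() {
        if (!defining) throw new IllegalStateException("glEndList without a preceding glNewList");
        glEndList();
        defining = false;
    }
}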

Swanny

Re: Massive performance drops when docked on Linux
« Reply #9 on: November 05, 2020, 10:19:39 AM »

Looks like the Windows binary build of apitrace is too old (version 7 versus 9, perhaps), unless you build from source. Darn.

I agree with your deductions and can't see why the GL_INVALID_OPERATION error is being set for the glBegin/glEnd and glNewList/glEndList pairs during these fade wipes - I can't see any invalid calls within each pair, can't see them being intermixed with other pairs, and can't see any glEnd/glEndList without a preceding glBegin/glNewList. Very odd.

Had already updated the mesa bug with these observations.

Alex

Re: Massive performance drops when docked on Linux
« Reply #10 on: November 05, 2020, 10:27:25 AM »

*thumbs up*

Swanny

Re: Massive performance drops when docked on Linux
« Reply #11 on: November 28, 2020, 01:16:41 PM »

So, unfortunately, it does look like the game's fault - there are indeed glBlendFunc calls within glBegin/glEnd.

On the mesa bug, after resolving the issue of qapitrace inserting glGetError calls in Bad™ locations, someone with better eyesight than me spotted a few examples of the problem. After getting advice on how to dump the apitrace as text for a few example frames - when exiting hyperspace, for the two frames available below - I scripted a simple indenter for glBegin/glEnd that marks glBlendFunc calls inside those pairs with "ERROR". Makes them much easier to detect.
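
The script boiled down to this sort of thing (rewritten here as a rough Java sketch of the idea, reading "apitrace dump" text output):

Code
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class IndentTrace {
    // Rough sketch of the indent-and-flag idea: indent by glBegin/glEnd
    // nesting and mark glBlendFunc calls that fall inside a pair.
    public static void main(String[] args) throws IOException {
        int depth = 0;
        try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.contains("glEnd(")) depth = Math.max(0, depth - 1);
                String flag = (depth > 0 && line.contains("glBlendFunc(")) ? "ERROR " : "";
                System.out.println("  ".repeat(depth) + flag + line);
                if (line.contains("glBegin(")) depth++;
            }
        }
    }
}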

https://www.improbability.net/apitrace/starsector-frame848-frame849-indent-glblendfunc-errors.zip

For frame 848 it's only 23 out of 1117 glBlendFunc calls.
For frame 849 it's again only 23 out of 959 glBlendFunc calls.

So somewhere in those 390 calls is the problem. Sorry!  :'(

Alex

Re: Massive performance drops when docked on Linux
« Reply #12 on: November 28, 2020, 03:59:10 PM »

Ah - as I think I was saying earlier, there was an errant glBlendFunc() call in the hyperspace terrain renderer - so if this only happens when hyperspace is involved, it's probably that, right? And since the docking issue happens when entering/exiting markets etc., where no hyperspace is involved, that errant call doesn't sound like it's the problem there.

Edit: having a closer look at the traces you've provided, this does indeed look like the hyperspace terrain rendering code - the right blend parameters, and the right call structure - so I can definitely confirm that this would *not* be happening during, say, a normal market opening/closing transition.
« Last Edit: November 28, 2020, 04:01:25 PM by Alex »

Swanny

Re: Massive performance drops when docked on Linux
« Reply #13 on: November 30, 2020, 06:21:24 AM »

Yes, looks like two separate issues:

1) The FPS crash when entering/exiting hyperspace, which should hopefully be fixed by removing the errant glBlendFunc call - thanks.

2) The slow menu transitions - I'm not sure yet.
2A) Going from space directly to, for example, the fleet management menu is slow, but doesn't trip any breakpoints when running under gdb with a breakpoint on _mesa_error. Might just be slow display lists in mesa.
2B) But going from space to docking at a planet, and then switching between menus, does produce glEnd GL_INVALID_OPERATION errors. Only about 10 per frame, so probably another errant call - not necessarily another glBlendFunc.

Will look further later this week but the hyperspace bug was the most annoying!

Alex

Re: Massive performance drops when docked on Linux
« Reply #14 on: November 30, 2020, 09:32:32 AM »

1) The FPS crash when entering/exiting hyperspace, which should hopefully be fixed by removing the errant glBlendFunc call - thanks.

Ah, hmm - that would happen all the time (well, any time a storming cloud is visible), not just during the transition, so it doesn't sound like the glBlendFunc call is the issue here. I suspect that the glBlendFunc thing is a red herring.