CUDA freezes Max
#1
Hi!

I have gotten myself a new workstation with a new RTX 3080, hoping to benefit from the CUDA collision solver.
Unfortunately, when I open the example scenes, simulation is extremely slow (compared to turning CUDA off), and even worse: some scenes freeze Max. There is no crash, but a single CPU core stays at 100% (please see screenshot) while Max 2021 is completely frozen.
For example, "tyFlow_cloth_CCCS_twister_001.max" from the sample scenes freezes after 4 or 5 frames of simulation.
I have the latest tyFlow (tyFlow_016111), the latest CUDA files and the latest Nvidia driver. The tyFlow main settings list my RTX 3080 as the CUDA GPU. Also, "?" says it finds all the DLL files and that my system appears compatible. I also tested with "compatibility mode" checked, and I installed another Max version (2020): still the same.
I also placed the DLL files in the Max root folder, as I had read here in the forum: still the same freeze.

It's a freshly installed workstation with lots of RAM, a good CPU and the mentioned RTX 3080.

Does anybody know what's going on?


Attached Files Thumbnail(s)
   
  Reply
#2
You don't need to put the DLLs in the max root anymore...as long as they're in the same location as the tyFlow DLO it'll find them.

Anyways, this is a tough one due to the limited availability of 30XX GPUs right now....I can't even buy one at the moment to test, so all debugging has to happen by proxy with users who send me info.

Can you open the tyProfiler from the editor right click menu (utilities), and then simulate a few frames (as many as you can before it freezes) and then expand the nodes in the tyProfiler to show where the slowdown is? (slower operations will be highlighted, allowing you to quickly determine which operations are the slowest). Then take a screenshot of which function is reported to be the slowest and post it here.

Next, can you open the Debugging rollout of the tyFlow and enable CCCS printouts. Then open the MAXScript listener and re-run the sim. It will run a lot slower, but you should see the MAXScript listener updating at least with CCCS info. Now wait until a frame where it normally would have "frozen"...is the MAXScript listener still updating with CCCS info? Or is everything completely frozen? If you don't mind, give it at least 5 minutes of non-updating MAXScript listener before concluding that it's totally frozen. Doing this part will help me determine if tyFlow is actually frozen, or if there's just an issue causing impact zones to take a really long time to solve.

Anyways, doing these things will hopefully allow me to track down closer to where the issue is.
  Reply
#3
Many thanks for getting back!

Please have a look at the attached screenshots. I let the sim run for about 10 minutes, but nothing changed in the Listener, so I had to end the task again.
While investigating further, I found online that gamers have had issues with games freezing on some RTX 3080/3090 cards. So I am not sure what is going on here, or whether the issue is even tyFlow related.

I hope there is a solution.

thanks again, Tim


Attached Files Thumbnail(s)
       
  Reply
#4
So, all GPU benchmark tools work great (V-Ray, FurMark). I'm not sure whether these benchmarks also test that CUDA is working, though; probably not. So something seems to be off with tyFlow's access to CUDA on the 3080?
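
For anyone wanting a quick check outside of any benchmark: below is a minimal Python sketch that asks the Nvidia driver directly which CUDA devices it sees. It loads the driver library via ctypes and calls the CUDA driver API entry points cuInit, cuDeviceGetCount, cuDeviceGet, cuDeviceGetName and cuDeviceGetAttribute. The helper names load_cuda_driver and list_cuda_devices are made up for this example. Note the limits: this only confirms that the driver can initialise and enumerate the card; it does not launch a kernel and says nothing about how tyFlow itself uses CUDA.

```python
import ctypes
import sys

# Attribute IDs from the CUDA driver API (CUdevice_attribute enum).
CU_ATTR_CC_MAJOR = 75  # CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR
CU_ATTR_CC_MINOR = 76  # CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR


def load_cuda_driver():
    """Try to load the CUDA driver library; return None if it is not installed."""
    if sys.platform == "win32":
        loader, names = ctypes.WinDLL, ["nvcuda.dll"]
    else:
        loader, names = ctypes.CDLL, ["libcuda.so.1", "libcuda.so"]
    for name in names:
        try:
            return loader(name)
        except OSError:
            continue
    return None


def list_cuda_devices():
    """Return [(name, (major, minor)), ...] or None if CUDA cannot be initialised."""
    lib = load_cuda_driver()
    if lib is None:
        return None
    if lib.cuInit(0) != 0:  # 0 == CUDA_SUCCESS
        return None
    count = ctypes.c_int(0)
    if lib.cuDeviceGetCount(ctypes.byref(count)) != 0:
        return None

    devices = []
    for ordinal in range(count.value):
        dev = ctypes.c_int(0)
        if lib.cuDeviceGet(ctypes.byref(dev), ordinal) != 0:
            continue  # skip devices the driver refuses to hand out
        name = ctypes.create_string_buffer(256)
        lib.cuDeviceGetName(name, 256, dev)
        major, minor = ctypes.c_int(0), ctypes.c_int(0)
        lib.cuDeviceGetAttribute(ctypes.byref(major), CU_ATTR_CC_MAJOR, dev)
        lib.cuDeviceGetAttribute(ctypes.byref(minor), CU_ATTR_CC_MINOR, dev)
        devices.append((name.value.decode(errors="replace"),
                        (major.value, minor.value)))
    return devices
```

Calling list_cuda_devices() returns None when no usable driver is found, otherwise one entry per GPU with its name and compute capability (an RTX 3080 should report 8.6).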
  Reply
#5
Thanks for the screenshot, TimJ...nothing looks out of the ordinary in the tyProfiler screenshot. However, the MAXScript screenshot shows the crash happening during repulse calculations, which is bizarre.

Can you turn on file logging in your tyFlow settings (debugging rollout), and then send me c:\tyFlow.log after it freezes? That will give me more information regarding which function is freezing...
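
To take the guesswork out of "is it frozen or just slow", a small watchdog on the log file can help. This is only a sketch, and it assumes that the log file keeps being written to while the sim is still making progress (I have not verified how often tyFlow flushes it). The function names and the 300 second default are choices made for this example, matching the "give it at least 5 minutes" rule of thumb from earlier in the thread.

```python
import os
import time


def seconds_since_update(path, now=None):
    """Seconds since `path` was last modified, or None if the file is missing."""
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return None
    if now is None:
        now = time.time()
    return max(0.0, now - mtime)


def looks_frozen(path, stall_seconds=300, now=None):
    """True if the file exists and has not been modified for `stall_seconds`."""
    age = seconds_since_update(path, now)
    return age is not None and age >= stall_seconds


def watch(path, stall_seconds=300, poll_seconds=10):
    """Poll `path` until it looks stalled, then report and return.

    Keeps polling forever if the file never appears or never stalls.
    """
    while not looks_frozen(path, stall_seconds):
        time.sleep(poll_seconds)
    print(f"{path} has not changed for at least {stall_seconds} s")


# Example (run in a separate Python process while the sim is running):
# watch(r"c:\tyFlow.log")
```

If the watchdog fires while one CPU core is still pinned at 100%, that points at a real hang rather than a very slow solve.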
  Reply
#6
Thanks Tyson.

Please find the log attached. 

Tim


Attached Files
.zip   tyFlow.zip (Size: 193.06 KB / Downloads: 22)
  Reply
#7
Hello Tyson,

Could you manage to investigate the CUDA issue a bit more? I'm still getting the same freeze (after the latest tyFlow update) when activating CUDA. I am also wondering why one thread of my CPU stays at 100% (all others at 0%) in Task Manager while this happens (see the screenshot in the first post).

Any update would be great.
Much appreciated!

Tim

  Reply
#8
Hi Tim,

Unfortunately due to the shortage of 30XX cards there's nothing more that I can debug. I looked at your log but sadly the place it's freezing/crashing is related to CUDA, as opposed to somewhere else that I might have been able to look further into. I've thoroughly tested the CCCS on a wide range of non-30XX cards and none of them have presented issues (960ti, 1080ti, 2080ti, Quadro 8000, etc). So until I am able to get a 30XX card (which may be a long time at this point), my hands are tied.

If you need the CCCS, I would recommend downgrading to v0.16109, since that was the last build that used CUDA 10.2 which appears to be natively compatible with 30XX cards. The only issue is your 3ds Max startup time may be longer (1-2 minutes) since CUDA 10.2 requires JIT compilation during runtime (that's why I originally moved up to 11.2). More info about downgrading here:

http://docs.tyflow.com/faq/beta
  Reply
#9
Thanks for your effort Tyson!

Unfortunately, downgrading to CUDA 10.2 and tyFlow 16090 still causes the same freeze! (log file attached)
This is so frustrating...


Attached Files
.zip   tyFlow_02.zip (Size: 17.15 KB / Downloads: 19)
  Reply
#10
Hmm that's strange, could you attach the scene file? Or send to support@tyflow.com if you don't want it to be public...

Forgive me if you've already sent it...I can't recall if you have or not....
  Reply
#11
I tested the "Cloth_CUDA" scenes in "tyFlow_scenes_016".
The freeze happens pretty quickly with "tyFlow_cloth_CCCS_twister_001.max", after just 5-10 frames.
"tyFlow_cloth_CCCS_blobbyFill_001.max" is better: I can simulate about 60-100 frames before it freezes. All the other examples in the folder are very slow (compared to turning CUDA off) but stable.

It's exactly the same with the new tyFlow version and CUDA 11: the same files freeze Max, and all the others in the folder are slow but don't freeze.
  Reply
#12
I see, well in the case of the sample files not working, it's still a mystery then...because the machines I listed where I tested things were machines where I ran through the sample files without any issues. So the only variable left in the equation once again is the GPU itself....

One last thing you could try, is to load up those same problem scenes and before simming, set the IZ/CG threshold values in the CCCS to be very, very high. Like 20000. That will prevent certain chunks of the sim from offloading to the GPU, and could potentially help further narrow down the problem.
  Reply
#13
Hmm, not really... I even set the values to 999999, but the result is the same. Changing the time step to "frame" also gives the same result. Are there any other settings I could try, like repulse steps, greedy VRAM usage or something? You can also log on to my machine via TeamViewer/AnyDesk, for example, if you want to and if you think it could help. Thanks so much for your support, Tyson!
  Reply
#14
I've sent you a PM, TimJ :)
  Reply

