CUDA synchronize function fails during long-running kernel

I'm using PyCUDA to run a kernel that is expected to take at least two hours to complete, but it fails after about one hour with only this error:

pycuda._driver.Error: cuCtxSynchronize failed: unknown error

I'm on Windows, and I added the TdrDelay registry value and set it to 120000000 so that Windows Timeout Detection and Recovery (TDR) does not time out my kernel.
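
To confirm that the registry change is actually in place, the TDR values can be read back. This is only a minimal sketch: it assumes the value lives under the usual HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers key (the exact location is not stated above), and `read_tdr_settings` is a hypothetical helper name.

```python
import sys

# Assumed location of the TDR values; adjust if the value was added elsewhere.
TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
TDR_VALUES = ("TdrDelay", "TdrDdiDelay", "TdrLevel")


def read_tdr_settings():
    """Return {name: value or None} for each TDR value; empty dict off Windows."""
    if not sys.platform.startswith("win"):
        return {}
    import winreg  # only available on Windows

    settings = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY) as key:
        for name in TDR_VALUES:
            try:
                value, _value_type = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                value = None  # value not set, so the driver default applies
            settings[name] = value
    return settings
```

Calling `print(read_tdr_settings())` shows whether TdrDelay holds the number I set; a `None` entry means that value does not exist under this key.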

The error doesn't happen when I adjust the kernel's parameters so that it is expected to complete in about 30 minutes. Why could the synchronize call be failing after the kernel has run for a long time?

Could my graphics card be overheating and preemptively terminating the kernel? Could there be a CUDA setting that terminates a kernel if it runs for too long? Could running the kernel in the NVIDIA Visual Profiler help me figure out what the problem might be?
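
One way to test the overheating theory would be to log the GPU temperature from a second process while the kernel runs and look at the last readings before the failure. The sketch below assumes `nvidia-smi` is on the PATH and that the card reports `temperature.gpu`; the names `parse_sample` and `log_gpu`, the log file name, and the 10-second interval are all made up for illustration.

```python
import subprocess
import time

# nvidia-smi query that prints one CSV line per GPU: timestamp, temperature, utilization.
QUERY = [
    "nvidia-smi",
    "--query-gpu=timestamp,temperature.gpu,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def _to_int(field):
    """Convert a field to int, or None when the GPU reports it as unsupported."""
    try:
        return int(field)
    except ValueError:
        return None


def parse_sample(line):
    """Parse one 'timestamp, temp, util' CSV line into a dict."""
    timestamp, temp, util = [field.strip() for field in line.split(",")]
    return {"timestamp": timestamp, "temp_c": _to_int(temp), "util_pct": _to_int(util)}


def log_gpu(path="gpu_log.csv", interval_s=10.0, samples=None):
    """Append nvidia-smi readings to path; samples=None runs until interrupted."""
    count = 0
    with open(path, "a") as log:
        while samples is None or count < samples:
            out = subprocess.run(
                QUERY, capture_output=True, text=True, check=True
            ).stdout
            for line in out.strip().splitlines():
                log.write(line + "\n")
            log.flush()  # keep the latest reading on disk if the driver resets
            count += 1
            time.sleep(interval_s)
```

Running `log_gpu()` in a separate console before launching the kernel, then passing the last lines of the log through `parse_sample`, would show whether the temperature was climbing right before cuCtxSynchronize failed.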