Several issues with CLion building and CUDA
1. How can I see every command that CMake produces?
a) Under "Build options", the UI says: "Arguments after '--' are passed on to the build, other arguments are CMake command line parameters. Default options depend on the toolchain's environment."
This is very unclear to me. When I put in --trace, I just get an error. That does not work.
b) In the "CMake options" field - default values are displayed which disappear if you type anything into the field. How can I update this field?
2. Where is the actual CMake/make log file? I see the following files
.ninja_log
CMakeFiles/CMakeOutput.log
CMakeFiles/clion-log.txt
but I am looking for a list of all of the commands that get executed to compile and link my code.
3. Editing CMakeLists.txt in the top-level directory to try to change compiler options seems to have no effect on the CMakeCache files that get put in the cmake-build-release and cmake-build-debug directories. How can I affect the compiler options? For example, I want to use "-Ofast" on the gcc command line. How?
I have partially answered my own question. If "--verbose" is passed to cmake, then most of the information I was looking for is output. Also, the .ninja_log file shows you what options are being passed to cmake.
Although CLion is a great tool, it makes heavy use of CMake. Since CMake and I have never seen eye-to-eye, that is part of my frustration. The option I needed to pass to nvcc is "-Xcompiler -Ofast". It is important to note that if both -O3 and -Ofast are passed to gcc, only the last -O option on the command line takes effect, so a later -O3 overrides the -Ofast.
Finally, I am surprised that the current gcc optimizes rather poorly with -O3. The -Ofast option produces much better code. -Ofast enables everything in -O3 and also turns on -ffast-math, a collection of math-related optimizations that relax strict floating-point conformance. I will do more testing to see which ones are actually making a difference in my code, which boils down to testing the speed of multiplications using std::complex<float> versus writing out the real and imaginary parts separately. Without -Ofast, there is a factor of 6-7 difference in speed, which is pretty alarming. With -Ofast, the two are the same speed, which is what I was looking for. A couple of possible causes are explicit NaN processing and stack utilization...