NVIDIA graphics cards and CUDA architecture pairs
Matching SM architectures (CUDA arch and CUDA gencode) for various NVIDIA cards

I’ve seen some confusion regarding NVIDIA’s nvcc sm flags and what they’re used for:

When compiling with NVCC, the arch flag (‘-arch’) specifies the name of the NVIDIA GPU architecture that the CUDA files will be compiled for. The gencode flag (‘-gencode’) generates code for additional architectures, and can be repeated many times for different architectures.

When should different ‘gencodes’ or ‘cuda arch’ be used?

When you compile CUDA code, you should always pass exactly one ‘-arch’ flag, matching the GPU cards you use most. This avoids compilation overhead at run time, because code generation for that architecture happens during compilation. If the binary ends up holding only PTX for the GPU it runs on (for example, when a ‘-gencode’ flag’s ‘code=’ names a virtual ‘compute_XX’ architecture rather than a real ‘sm_XX’ one), GPU code generation is done at run time by the JIT compiler in the CUDA driver.

To speed up CUDA compilation, reduce the number of irrelevant ‘-gencode’ flags. However, sometimes you may wish to have better CUDA back...
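To make the repeated ‘-gencode’ flags concrete, here is a minimal sketch in Python that builds an nvcc flag list from a set of compute capabilities. The helper name `gencode_flags`, its `embed_ptx` parameter, the file name `app.cu`, and the example capabilities 61 and 75 are my own illustrative assumptions; only the ‘-gencode arch=compute_XX,code=sm_XX’ syntax comes from nvcc itself. It also assumes purely numeric capability strings.

```python
def gencode_flags(compute_capabilities, embed_ptx=True):
    """Build nvcc -gencode arguments for capabilities such as ["61", "75"].

    Hypothetical helper for illustration; assumes numeric strings only.
    """
    # De-duplicate and sort numerically so the last entry is the newest target.
    caps = sorted(set(compute_capabilities), key=int)

    flags = []
    for cc in caps:
        # One -gencode per target: compile the compute_XX virtual architecture
        # into sm_XX machine code at build time, so no JIT is needed on that GPU.
        flags += ["-gencode", f"arch=compute_{cc},code=sm_{cc}"]

    if embed_ptx and caps:
        # Also embed PTX for the newest target, which the driver can
        # JIT-compile on GPUs newer than any sm_XX listed above.
        newest = caps[-1]
        flags += ["-gencode", f"arch=compute_{newest},code=compute_{newest}"]

    return flags


if __name__ == "__main__":
    # app.cu is a placeholder source file name.
    print("nvcc " + " ".join(gencode_flags(["61", "75"])) + " -o app app.cu")
```

Each extra target adds compile time and binary size, so the list is best kept to the cards you actually deploy on.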