[Question] Problem combining OpenCV's CUDA module with a custom CUDA kernel

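Original poster: su27 (su27)   2023-08-17 11:52:54
Development platform (Platform): (Ex: Win10, Linux, ...)
Win10
Compiler (Ex: GCC, clang, VC++...) + target environment (list it if it differs from the development platform)
VC++
Additional libraries used (Library Used): (Ex: OpenGL, ...)
OpenCV
CUDA
Question:
I want to process data with OpenCV's CUDA module
and then hand it to a kernel I wrote myself for further processing.
With 10*10 data, the kernel only sees values in the first row;
everything else is 0.
Why does this happen?
What did I get wrong?
Thanks.
Input data (Input):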
[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Expected correct result (Expected Output):
[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9;
0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Wrong result (Wrong Output):
0 ,1 ,2 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ,
0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,
0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,
0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,
0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,
0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,
0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,
0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,
0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,
0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,
Code: (please use the paste site from the pinned post, format your code, and do not post images)
__global__ void test1_Kernel(char* src_image, int length)
{
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int index = 10 * y + x;
    if (index != 0)
    {
        return;
    }
    for (int i = 0; i < 10; i++)
    {
        printf("\n");
        for (int j = 0; j < 10; j++)
        {
            printf("%d ,", (int)src_image[i * 10 + j]);
        }
    }
}

void test1_withCuda()
{
    cv::Mat src_image = Mat::zeros(Size(10, 10), CV_8UC1);
    for (int i = 0; i < 10; i++)
    {
        for (int j = 0; j < 10; j++)
        {
            src_image.data[i * 10 + j] = j;
        }
    }
    //capture >> src_image;
    cv::Mat dst_image;
    cv::cuda::GpuMat d_src_img;
    cv::cuda::GpuMat d_dst_img;
    cout << src_image << endl;
    d_src_img.upload(src_image);
    int size_temp = d_src_img.rows * d_src_img.cols;
    cout << size_temp << endl;
    test1_Kernel<<<10, 10>>>((char*)d_src_img.data, size_temp);
    cudaDeviceSynchronize();
    waitKey(0);
}

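Supplement:
Author: ilove88th (Denpa-Girl)   2023-11-14 22:08:00
Where do the int i and j that show up later in your kernel come from? They have nothing to do with x and y at all ....
Author: GTX9487 (Volta)   2023-11-14 23:48:00
The correct approach is to copy from host to device, then copy from device back to host, and print there.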
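On the indexing question above, one likely explanation that the thread itself does not confirm: cv::cuda::GpuMat normally allocates 2D data with a row pitch, so the number of bytes per row (the `step` member) can be larger than `cols`. The posted kernel reads `src_image[i*10+j]`, which assumes rows are packed back to back. The sketch below is host-only C++ and only simulates that layout. `PitchedImage`, `make_pattern`, `read_packed`, `read_pitched` and the step of 512 bytes are all hypothetical names and values chosen for illustration; the real pitch depends on the device, and real padding bytes are not guaranteed to be zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for a pitched device allocation:
// each row occupies `step` bytes, of which only the first `cols` hold data.
struct PitchedImage {
    int rows;
    int cols;
    std::size_t step;                 // bytes per row, step >= cols
    std::vector<unsigned char> bytes; // rows * step bytes; padding is zero in this sketch
};

// Builds the test pattern from the post: element (i, j) holds the value j.
PitchedImage make_pattern(int rows, int cols, std::size_t step) {
    assert(step >= static_cast<std::size_t>(cols));
    PitchedImage img{rows, cols, step,
                     std::vector<unsigned char>(static_cast<std::size_t>(rows) * step)};
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            img.bytes[static_cast<std::size_t>(i) * step + j] = static_cast<unsigned char>(j);
    return img;
}

// Indexing as in the posted kernel: assumes rows are packed back to back.
int read_packed(const PitchedImage& img, int i, int j) {
    return img.bytes[static_cast<std::size_t>(i) * img.cols + j];
}

// Indexing that honors the row stride.
int read_pitched(const PitchedImage& img, int i, int j) {
    return img.bytes[static_cast<std::size_t>(i) * img.step + j];
}
```

With the assumed step of 512, `read_packed(img, 1, 7)` lands in the padding of row 0 and returns 0, while `read_pitched(img, 1, 7)` returns 7, which matches the "first row correct, rest zero" output in the post. In the posted kernel the equivalent change would be to pass `d_src_img.step` as an extra argument and read `src_image[i * step + j]`.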
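A note on why the round trip suggested above gives the expected matrix: as far as I know, `GpuMat::download` performs a 2D copy that moves `cols` bytes per row and skips any row padding on the device side, so the host `cv::Mat` ends up tightly packed and prints correctly. The sketch below is a host-only C++ stand-in for that row-by-row copy, not OpenCV's actual implementation. `repack_rows` is a hypothetical helper, and the pitched source buffer is just ordinary host memory here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Copies a rows x cols byte image out of a pitched buffer (src_step bytes per
// row) into a tightly packed buffer (cols bytes per row), one row at a time.
// Host-only illustration of what a pitched device-to-host copy has to do.
std::vector<unsigned char> repack_rows(const unsigned char* src,
                                       std::size_t src_step,
                                       int rows, int cols) {
    assert(rows > 0 && cols > 0);
    assert(src_step >= static_cast<std::size_t>(cols));
    std::vector<unsigned char> packed(static_cast<std::size_t>(rows) * cols);
    for (int i = 0; i < rows; ++i) {
        // Source rows start every src_step bytes; destination rows every cols bytes.
        std::memcpy(packed.data() + static_cast<std::size_t>(i) * cols,
                    src + static_cast<std::size_t>(i) * src_step,
                    static_cast<std::size_t>(cols));
    }
    return packed;
}
```

After repacking, element (i, j) is at `packed[i * cols + j]`, which is the indexing the original host loop in the post already uses. With the real types this would correspond to calling `d_src_img.download(host_mat)` and printing `host_mat` on the host.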
