
CUDA Study Notes

February 16, 2014

In CUDA, the CPU and system memory are referred to as the host; the GPU and its memory are referred to as the device.

 

The __global__ qualifier tells the compiler that the function should be compiled to run on the device rather than on the host; CUDA C needed a linguistic method for marking a function as device code.
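A minimal sketch of this marking, in the spirit of the "hello world" example these notes are drawn from (untested here; assumes an NVIDIA toolchain with nvcc):

```cuda
#include <cstdio>

// __global__ marks kernel() as device code: it is compiled for the GPU
// and may only be invoked from host code via the <<< >>> launch syntax.
__global__ void kernel(void) {
}

int main(void) {
    kernel<<<1, 1>>>();        // launch one block of one thread on the device
    cudaDeviceSynchronize();   // wait for the device to finish
    printf("Hello, World!\n");
    return 0;
}
```

The host compiler never sees the body of kernel(); nvcc hands it to the device compiler because of the __global__ qualifier.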


cudaMalloc() is similar to malloc() in standard C, but it tells the CUDA runtime to allocate the memory on the device.
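A short sketch of the allocate/launch/copy-back pattern (untested here; the add() kernel is just an illustrative example):

```cuda
#include <cstdio>

__global__ void add(int a, int b, int *c) {
    *c = a + b;   // runs on the device; c points to device memory
}

int main(void) {
    int c;
    int *dev_c;

    // cudaMalloc() allocates on the device, not the host; dev_c must
    // therefore never be dereferenced in host code.
    cudaMalloc((void **)&dev_c, sizeof(int));

    add<<<1, 1>>>(2, 7, dev_c);

    // Copy the result from device memory back into host memory.
    cudaMemcpy(&c, dev_c, sizeof(int), cudaMemcpyDeviceToHost);
    printf("2 + 7 = %d\n", c);

    cudaFree(dev_c);   // device allocations are released with cudaFree()
    return 0;
}
```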


blockIdx

This variable is of type uint3 (see Section B.3.1) and contains the block index within the grid. It contains the value of the block index for whichever block is currently running the device code.

 

Each block within the grid can be identified by a one-dimensional, two-dimensional, or three-dimensional index, accessible within the kernel through the built-in blockIdx variable.
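A sketch showing each block reading its own index (untested here; a one-dimensional launch, so only blockIdx.x is meaningful):

```cuda
#include <cstdio>

#define N 8

// Each of the N single-thread blocks fills in one element, using its
// built-in blockIdx.x as the index into the output array.
__global__ void fill(int *out) {
    out[blockIdx.x] = blockIdx.x;
}

int main(void) {
    int host_out[N];
    int *dev_out;

    cudaMalloc((void **)&dev_out, N * sizeof(int));
    fill<<<N, 1>>>(dev_out);   // a grid of N one-thread blocks
    cudaMemcpy(host_out, dev_out, N * sizeof(int), cudaMemcpyDeviceToHost);

    for (int i = 0; i < N; i++)
        printf("out[%d] = %d\n", i, host_out[i]);

    cudaFree(dev_out);
    return 0;
}
```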

 

For example, if we launch with kernel<<<2,1>>>(), you can think of the runtime creating two copies of the kernel and running them in parallel.

 

When we launched the kernel, we specified N as the number of parallel blocks. We call the collection of parallel blocks a grid.
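The block-parallel vector addition that this discussion refers to can be sketched as follows (untested here; one block per element):

```cuda
#include <cstdio>

#define N 10

// One block per element: the grid is a collection of N parallel blocks,
// and each block uses blockIdx.x to pick the element it works on.
__global__ void add(int *a, int *b, int *c) {
    int tid = blockIdx.x;          // this block's index within the grid
    if (tid < N)                   // guard against out-of-range indices
        c[tid] = a[tid] + b[tid];
}

int main(void) {
    int a[N], b[N], c[N];
    int *dev_a, *dev_b, *dev_c;

    cudaMalloc((void **)&dev_a, N * sizeof(int));
    cudaMalloc((void **)&dev_b, N * sizeof(int));
    cudaMalloc((void **)&dev_c, N * sizeof(int));

    for (int i = 0; i < N; i++) { a[i] = i; b[i] = i * i; }

    cudaMemcpy(dev_a, a, N * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dev_b, b, N * sizeof(int), cudaMemcpyHostToDevice);

    add<<<N, 1>>>(dev_a, dev_b, dev_c);   // launch a grid of N blocks

    cudaMemcpy(c, dev_c, N * sizeof(int), cudaMemcpyDeviceToHost);
    for (int i = 0; i < N; i++)
        printf("%d + %d = %d\n", a[i], b[i], c[i]);

    cudaFree(dev_a); cudaFree(dev_b); cudaFree(dev_c);
    return 0;
}
```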

 

The execution configuration is specified by inserting an expression of the form <<< Dg, Db, Ns, S >>> between the function name and the parenthesized argument list, where:

- Dg is of type dim3 (see Section B.3.2) and specifies the dimension and size of the grid, such that Dg.x * Dg.y * Dg.z equals the number of blocks being launched; Dg.z must be equal to 1 for devices of compute capability 1.x;

- Db is of type dim3 (see Section B.3.2) and specifies the dimension and size of each block, such that Db.x * Db.y * Db.z equals the number of threads per block;

- Ns is of type size_t and specifies the number of bytes in shared memory that is dynamically allocated per block for this call in addition to the statically allocated memory; this dynamically allocated memory is used by any of the variables declared as an external array as mentioned in Section B.2.3; Ns is an optional argument which defaults to 0.

 

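A sketch of the Ns parameter in use (untested here; the in-place reverse is just an illustrative example of an extern __shared__ array sized at launch time):

```cuda
#include <cstdio>

// The dynamically allocated shared memory appears inside the kernel as
// an unsized extern array (Section B.2.3); its actual size in bytes is
// the Ns argument of the execution configuration.
__global__ void reverse(int *data, int n) {
    extern __shared__ int tmp[];   // sized by Ns at launch time
    int t = threadIdx.x;
    tmp[t] = data[t];
    __syncthreads();               // wait until tmp[] is fully written
    data[t] = tmp[n - 1 - t];
}

int main(void) {
    const int n = 8;
    int host[n] = {0, 1, 2, 3, 4, 5, 6, 7};
    int *dev;

    cudaMalloc((void **)&dev, n * sizeof(int));
    cudaMemcpy(dev, host, n * sizeof(int), cudaMemcpyHostToDevice);

    // <<<Dg, Db, Ns>>>: 1 block, n threads, and n * sizeof(int) bytes
    // of dynamically allocated shared memory per block.
    reverse<<<1, n, n * sizeof(int)>>>(dev, n);

    cudaMemcpy(host, dev, n * sizeof(int), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; i++)
        printf("%d ", host[i]);    // the array comes back reversed
    printf("\n");

    cudaFree(dev);
    return 0;
}
```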
