Loading...
Multiply every element of a 1D array by a runtime scalar: out = x * scale.
Scalars are passed as ordinary kernel arguments — no device pointer needed — and used directly inside the kernel.
Implement mul_kernel and run(n=1024, scale=3.0) returning whether the output equals x * scale.
Your solution must define a top-level function run(...) that allocates the inputs, copies them to the GPU, launches your @cuda.jit kernel, and returns a Python bool from np.allclose(gpu_result, reference). The grader prints run(...); the expected output is True.
n = 1024, scale = 3.0
True
n = 1024 and scale = 3.0, representing the size of the 1D array and the scalar multiplier, respectively.x of size n is created, and its elements are multiplied by the scale factor using the mul_kernel function, resulting in an output array out where each element is calculated as outi=xi⋅scale.out is compared to the reference array, which is also calculated as x⋅scale, using np.allclose to check for equality within a tolerance.mul_kernel function correctly multiplies each element of the array by the scale factor, the comparison returns True, indicating that the output array matches the reference array.