Example code producing memory access fault for array size 1024*8, works for 1023*8

Open chopikus opened this issue 1 year ago • 5 comments

Describe the bug

Running the example program below fails with a GPU memory access fault once the array is large enough (1024 * 8 elements in my case).

Program:

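import uk.ac.manchester.tornado.api.TaskGraph;
import uk.ac.manchester.tornado.api.TornadoExecutionPlan;
import uk.ac.manchester.tornado.api.annotations.Parallel;
import uk.ac.manchester.tornado.api.common.TornadoDevice;
import uk.ac.manchester.tornado.api.enums.DataTransferMode;
import uk.ac.manchester.tornado.api.types.collections.VectorFloat8;
import uk.ac.manchester.tornado.api.types.vectors.Float8;

public class App
{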
    public static void parallelInitialization(VectorFloat8 data) {
        for (@Parallel int i = 0; i < data.size(); i++) {
            int j = i * 8;
            data.set(i, new Float8(j, j + 1, j + 2, j + 3, j + 4, j + 5, j + 6, j + 7));
        }
    }

    public static void computeSquare(VectorFloat8 data) {
        for (@Parallel int i = 0; i < data.size(); i++) {
            Float8 item = data.get(i);
            Float8 result = Float8.mult(item, item);
            data.set(i, result);
        }
    }

    public static void main(String[] args) {
        VectorFloat8 array = new VectorFloat8(1024 * 8);
        TaskGraph taskGraph = new TaskGraph("s0")
                .transferToDevice(DataTransferMode.EVERY_EXECUTION, array)
                .task("t0", App::parallelInitialization, array)
                .task("t1", App::computeSquare, array)
                .transferToHost(DataTransferMode.EVERY_EXECUTION, array);

        TornadoExecutionPlan executionPlan = new TornadoExecutionPlan(taskGraph.snapshot());

        // Obtain a device from the list
        TornadoDevice device = TornadoExecutionPlan.getDevice(0, 0);
        executionPlan.withDevice(device);

        // Put in a loop to analyze hotspots with Intel VTune (as a demo)
        for (int i = 0; i < 1000; i++) {
            // Execute the application
            executionPlan.execute();
        }
    }
}

Running mvn package and tornado -jar [JARFILE] produces an error:

Memory access fault by GPU node-1 (Agent handle: 0x7fe80076f230) on address 0x7fe614c00000. Reason: Page not present or supervisor privilege.

However, if I change the array size to 1023*8 instead of 1024*8, the error goes away.
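For reference, here is a small plain-Java sketch of the raw data sizes behind the two configurations. It assumes each Float8 element takes 8 lanes of 4 bytes and ignores any header or padding the runtime may add to the buffer; the class and method names are made up for illustration. It is only arithmetic and does not by itself explain the fault.

```java
// Rough payload sizes for the failing (1024*8) and working (1023*8) cases.
// Assumption: 8 float lanes per element, 4 bytes per lane, no header or padding.
class Float8Footprint {
    static final int LANES = 8;

    // Raw payload in bytes for a vector holding `elements` Float8 values.
    static long payloadBytes(int elements) {
        return (long) elements * LANES * Float.BYTES;
    }

    public static void main(String[] args) {
        for (int elements : new int[] { 1023 * 8, 1024 * 8 }) {
            long bytes = payloadBytes(elements);
            System.out.printf("%d elements: %d bytes (%.2f KiB)%n",
                    elements, bytes, bytes / 1024.0);
        }
    }
}
```

Under these assumptions the failing case is exactly 256 KiB of payload (262144 bytes), while the working case is 256 bytes short of that (261888 bytes).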

How To Reproduce

I put my code into a repository: https://github.com/chopikus/my-tornado-app.

Steps:

  1. git clone https://github.com/chopikus/my-tornado-app.git
  2. cd my-tornado-app
  3. ./run.sh

Expected behavior

The program should run to completion without errors.
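To make "expected" concrete, the following is a sequential plain-Java sketch of the values the two tasks should leave in the array, flattened into a float[]. It assumes Float8.mult multiplies lane by lane, so element i, lane k ends up holding (i * 8 + k) squared. The class and method names are hypothetical, and nothing here uses the TornadoVM API.

```java
import java.util.Arrays;

// Sequential reference for parallelInitialization followed by computeSquare.
// Assumption: Float8.mult is a lane-wise multiplication.
class ExpectedSquares {
    static final int LANES = 8;

    // Returns the flattened expected contents for `elements` Float8 values.
    static float[] expected(int elements) {
        float[] out = new float[elements * LANES];
        for (int i = 0; i < elements; i++) {
            int j = i * LANES;              // same base value as in parallelInitialization
            for (int k = 0; k < LANES; k++) {
                float v = j + k;            // value written by t0
                out[j + k] = v * v;         // value after t1 squares it
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Print the first two elements (16 lanes) as a quick sanity check.
        System.out.println(Arrays.toString(expected(2)));
    }
}
```

Comparing the device output against such a reference after a single execute() call would show whether the smaller sizes produce correct data or merely avoid the fault.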

Computing system setup (please complete the following information):

  • Fedora 40
  • ROCm runtime version 1.13
  • Radeon 680M GPU on Ryzen 7 PRO 6850U
  • tornado --version: version=1.0.7-dev, branch=master, commit=96b3040; Backends installed: opencl
  • tornado -version: java version "21.0.4" 2024-07-16 LTS; Java(TM) SE Runtime Environment (build 21.0.4+8-LTS-274); Java HotSpot(TM) 64-Bit Server VM (build 21.0.4+8-LTS-274, mixed mode)

Additional context

tornado --devices:

WARNING: Using incubator modules: jdk.incubator.vector

Number of Tornado drivers: 1
Driver: OpenCL
  Total number of OpenCL devices  : 1
  Tornado device=0:0  (DEFAULT)
	OPENCL --  [AMD Accelerated Parallel Processing] -- gfx1035
		Global Memory Size: 4.0 GB
		Local Memory Size: 64.0 KB
		Workgroup Dimensions: 3
		Total Number of Block Threads: [256]
		Max WorkGroup Configuration: [1024, 1024, 1024]
		Device OpenCL C version: OpenCL C 2.0

chopikus, Jul 25 '24 23:07