pennylane-lightning SX gate on lightning

Before submitting

Please complete the following checklist when submitting a PR:

[x] All new features must include a unit test. If you've fixed a bug or added code that should be tested, add a test to the tests directory!
[ ] All new functions and code must be clearly commented and documented. If you do make documentation changes, make sure that the docs build and render correctly by running make docs.
[x] Ensure that the test suite passes, by running make test.
[ ] Add a new entry to the .github/CHANGELOG.md file, summarizing the change, and including a link back to the PR.
[x] Ensure that code is properly formatted by running make format.

When all the above are checked, delete everything above the dashed line and fill in the pull request template.

Context: Internal assignment.

Description of the Change: SX gate implementation in lightning.qubit C++ backend

Benefits: Improves the computation speed of the SX gate by running it natively in the lightning.qubit C++ backend

Possible Drawbacks:

Related GitHub Issues: Close #710
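
For readers unfamiliar with the gate: SX is the square root of Pauli-X, with matrix 1/2 * [[1+i, 1-i], [1-i, 1+i]]. The snippet below is a pure-Python sketch of how such a single-qubit kernel acts on a state vector. It is not the C++ kernel added in this PR; the name `apply_sx`, the loop structure, and the choice of wire 0 as the most significant index bit are assumptions made for illustration.

```python
# SX = sqrt(X) = 1/2 * [[1+i, 1-i], [1-i, 1+i]]
A = 0.5 * (1 + 1j)  # diagonal entries
B = 0.5 * (1 - 1j)  # off-diagonal entries


def apply_sx(state, num_wires, wire):
    """Return a new state vector with SX applied to `wire`.

    Assumption: wire 0 is the most significant bit of the basis-state index.
    """
    if len(state) != 2 ** num_wires:
        raise ValueError("state length must be 2**num_wires")
    stride = 1 << (num_wires - 1 - wire)
    out = list(state)
    for i0 in range(len(state)):
        if i0 & stride:
            continue  # handle each (target=0, target=1) pair exactly once
        i1 = i0 | stride
        v0, v1 = state[i0], state[i1]
        out[i0] = A * v0 + B * v1
        out[i1] = B * v0 + A * v1
    return out


# Applying SX twice to wire 0 of |00> acts like Pauli-X on that wire.
state = apply_sx(apply_sx([1, 0, 0, 0], 2, 0), 2, 0)
```

Because A*A + B*B = 0 and 2*A*B = 1, two applications reproduce Pauli-X, which is a convenient sanity check for any SX kernel.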

May 16 '24 15:05 LuisAlfredoNu

Codecov Report

Attention: Patch coverage is 99.52153% with 1 line in your changes missing coverage. Please review.

Project coverage is 96.80%. Comparing base (e1572f5) to head (e353646). Report is 57 commits behind head on master.

Files with missing lines	Patch %	Lines
...tning_qubit/gates/tests/Test_OpToMemberFuncPtr.cpp	0.00%	1 Missing :warning:

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #731      +/-   ##
==========================================
- Coverage   97.76%   96.80%   -0.97%     
==========================================
  Files         233      267      +34     
  Lines       39186    44685    +5499     
==========================================
+ Hits        38312    43257    +4945     
- Misses        874     1428     +554

May 16 '24 16:05 codecov[bot]

Thank you @AmintorDusko and @maliasadi for your comments. I will work on your requests and try to answer your questions. 😄

May 23 '24 18:05 LuisAlfredoNu

Hi @AmintorDusko @maliasadi, I have made the requested modifications. However, I made all the changes in my local repo and then pushed the commits, and I do not know whether this was the correct procedure or whether I should have applied the suggested commits directly from the conversation. 😅

May 25 '24 00:05 LuisAlfredoNu

No worries. Both ways are acceptable.

May 27 '24 11:05 AmintorDusko

The plot compares the performance of the current implementation (0.44-dev) with the previous version (0.43-dev):

As we can see, all the devices improved except LQ (lightning.qubit). This issue could be related to memory bounds; however, further research is needed to find the source of the performance decrease.

The test code was the following:

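import pennylane as qml

# device_name is assumed to be set by the surrounding benchmark script,
# e.g. "lightning.qubit".

def test_SX(wires):
    """Test the SX gate performance."""

    dev = qml.device(device_name, wires=wires)

    def circuit():
        # Keep the total gate count near 5000 regardless of the wire count.
        for _ in range(5000 // wires):
            for i in range(wires):
                qml.SX(i)

        return qml.expval(qml.PauliZ(0))

    qnode = qml.QNode(circuit, dev, grad_on_execution=False, diff_method=None)
    result = qnode()

    return result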
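
The comment does not show how the timings behind the plot were collected. A minimal harness, assuming median wall-clock time over a few repeats, might look like the sketch below. The helper names `time_callable` and `layers_for` and the repeat count are hypothetical and not part of the PR.

```python
import statistics
import time


def time_callable(fn, *args, repeats=5):
    """Return the median wall-clock time in seconds of fn(*args) over `repeats` runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def layers_for(wires, total_gates=5000):
    """Layer count used by the benchmark so the gate count stays near total_gates."""
    return total_gates // wires
```

With the benchmark function above in scope, `time_callable(test_SX, wires)` could be called for each wire count and device of interest. Note that the measured time then includes device and QNode construction, not only gate application.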

Jan 08 '25 22:01 LuisAlfredoNu

@maliasadi After diving into the code, I found the explanation for the performance regression on LQ:

LQ has an implementation of AVX kernels that aligns the gate computation for vectorization. These kernels are enabled by default, so measuring this implementation against the master branch built with the standard compilation process results in an unfair comparison. The correct way to compare is to compile the master branch with the CMake flag ENABLE_GATE_DISPATCHER=OFF. The following plot shows a fair comparison between master and the SX implementation.

As we can see, native support for the gate improves the performance of LQ. A follow-up action will be to add an AVX kernel for the SX gate.

Jan 27 '25 00:01 LuisAlfredoNu

Awesome! Happy to see you got it @LuisAlfredoNu! That's what we needed :1st_place_medal:

Jan 28 '25 03:01 maliasadi
