
Create `How to develop PennyLane for 4 different ML frameworks` documentation page in the developer hub

antalszava opened this issue on Aug 12, 2022

Feature details

There are many subtleties that come up when developing PennyLane support for the different machine learning (ML) frameworks. It would be great to add a new page to the developer documentation that notes these details.

  1. Trainability: how does each framework mark trainability? (See the sketch at the end of this section.)
  • Autograd: there is no way to mark tensors directly; an argnum argument can be passed to autograd.grad/autograd.jacobian. PennyLane patches Autograd's NumPy so that tensors can be marked as trainable via requires_grad.
  • Torch: tensors can be marked as trainable via requires_grad.
  • TensorFlow: trainability is only tracked while a tensor is being watched by a gradient tape. Although tf.Variable accepts a trainable argument (True by default), such tensors still require a gradient tape for differentiation. Non-trainable tensors can be created via tf.constant.
  • JAX: there is no way to mark tensors; an argnums argument can be passed to jax.grad/jax.jacobian. Under the hood, JAX creates Tracer objects for the arrays selected via argnums.

  2. Marking trainability is tied to each framework's user interface for the forward and backward passes (also covered in the sketch at the end of this section):
  • Functional: Autograd and JAX provide a functional user interface, where the function to be differentiated is passed to autograd.grad/autograd.jacobian or jax.grad/jax.jacobian and the resulting gradient function is then called with the input arguments.
  • Object-oriented: TensorFlow and PyTorch provide an object-oriented user interface, where parameters are first marked as trainable, the function is then called to compute the forward pass, and eventually the backward pass is computed to obtain gradients.

  3. Writing tests:

  • Thinking naively about each framework implies creating four or more different test cases, one per framework. Although one could leverage pytest's parametrization, separate test cases may still arguably be the cleanest solution because of how different each framework's UI is (see the test sketch at the end of this section).
  • When computing gradients with TensorFlow, Jacobians should only be computed if absolutely necessary for the use case, due to potentially degraded performance with, for example, diff_method="parameter-shift" (see the Gotchas section).
  4. Gotchas:
  • Due to JAX's trainability characteristics, certain PennyLane features may require an additional argnum/argnums argument. Functions that extract trainability information from input parameters will require such an argument for JAX support. See the example of qml.metric_tensor, which would require an argnums keyword to work without having to call a JAX transform that creates arrays perceived as trainable by PennyLane (https://github.com/PennyLaneAI/pennylane/issues/1880).
  • TensorFlow's tape.jacobian was found to yield poor performance with its default experimental_use_pfor=True when using diff_method="parameter-shift" (see https://github.com/PennyLaneAI/pennylane/pull/1869).
  • For higher-order derivatives with TensorFlow, experimental_use_pfor=False has to be set for each "internal" tapeN.jacobian call, while the "outermost" call allows either experimental_use_pfor=False or experimental_use_pfor=True.

E.g., assuming circuit is a QNode using the TensorFlow interface and x is a tf.Variable:

        with tf.GradientTape() as tape1:
            with tf.GradientTape(persistent=True) as tape2:
                res = circuit(x)
            g = tape2.jacobian(res, x, experimental_use_pfor=False) # <--- Needs experimental_use_pfor=False

        hess = tape1.jacobian(g, x) # <--- Can use experimental_use_pfor=True
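
To make items 1 and 2 above concrete, here is a minimal sketch (not part of the original issue) comparing how trainability is marked and how gradients are requested in the four interfaces. The toy circuit and the make_circuit helper are illustrative only and may differ from what the documentation page ends up recommending.

        import pennylane as qml
        from pennylane import numpy as pnp  # Autograd-backed NumPy patched by PennyLane
        import torch
        import tensorflow as tf
        import jax
        import jax.numpy as jnp

        dev = qml.device("default.qubit", wires=1)

        def make_circuit(interface):
            """Return the same toy QNode for a given ML interface (illustrative helper)."""
            @qml.qnode(dev, interface=interface)
            def circuit(x):
                qml.RX(x, wires=0)
                return qml.expval(qml.PauliZ(0))
            return circuit

        # Autograd: functional UI; trainability marked via requires_grad on PennyLane's NumPy tensors
        x = pnp.array(0.4, requires_grad=True)
        g_autograd = qml.grad(make_circuit("autograd"))(x)

        # Torch: object-oriented UI; tensors marked trainable via requires_grad
        x = torch.tensor(0.4, requires_grad=True)
        res = make_circuit("torch")(x)
        res.backward()
        g_torch = x.grad

        # TensorFlow: object-oriented UI; differentiation requires a gradient tape
        x = tf.Variable(0.4)
        with tf.GradientTape() as tape:
            res = make_circuit("tf")(x)
        g_tf = tape.gradient(res, x)

        # JAX: functional UI; trainable arguments selected via argnums (Tracers created under the hood)
        x = jnp.array(0.4)
        g_jax = jax.grad(make_circuit("jax"), argnums=0)(x)

Note how the functional interfaces (Autograd, JAX) select trainable arguments at differentiation time, while the object-oriented ones (Torch, TensorFlow) attach gradient state to the tensors themselves.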
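
For item 3, here is a minimal sketch (again not part of the original issue) of the "one test case per interface" pattern, reusing the illustrative make_circuit helper from the sketch above; the test names and the analytic expected gradient apply to this toy circuit only.

        import numpy as np

        x0 = 0.4
        expected_grad = -np.sin(x0)  # d/dx <Z> after RX(x) on |0> is -sin(x)

        def test_gradient_torch():
            import torch
            x = torch.tensor(x0, requires_grad=True, dtype=torch.float64)
            make_circuit("torch")(x).backward()
            assert np.allclose(x.grad.detach().numpy(), expected_grad)

        def test_gradient_tf():
            import tensorflow as tf
            x = tf.Variable(x0, dtype=tf.float64)
            with tf.GradientTape() as tape:
                res = make_circuit("tf")(x)
            # Prefer tape.gradient over tape.jacobian for scalar outputs (see the Gotchas above)
            assert np.allclose(tape.gradient(res, x).numpy(), expected_grad)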

Implementation

No response

How important would you say this feature is?

2: Somewhat important. Needed this quarter.

Additional information

No response
