photutils icon indicating copy to clipboard operation
photutils copied to clipboard

Fast build ellipse model

Open HuanLe02 opened this issue 3 years ago • 6 comments

Hello! This is a pull request in response to #1168 for a faster build_ellipse_model() function, using C and OpenMP. Changes are

  • model.py: (1) changed data structure for result and weight from 2D numpy array to a 1D C-native array (2) converted all auxiliary numpy arrays (intens_array, eps_array, etc) into C arrays (3) ported the for loop into C code
  • tests/test_model.py: (1) Add new test for a multithreaded version of the function (2) Add new test for processing a large array (5000x5000), shows that result is returned very fast
  • worker.so: compiled file from C code

Here is the C code for the loop (worker.c):

#include <omp.h>
#include <math.h>
#include <stdlib.h>

#define PI 3.1415926535897932384626433

// MAX FUNCTION
static double max(double a, double b)
{
  if (a > b) return a;
  else return b;
}

// radius function, obtained from code of EllipseGeomtry
// https://photutils.readthedocs.io/en/stable/_modules/photutils/isophote/geometry.html#EllipseGeometry
static double radius(double angle, double sma, double eps)
{
  double val = (sma * (1. - eps) /
                sqrt(pow(((1. - eps) * cos(angle)),2) +
                        pow((sin(angle)),2)));
  return val;
}

// WORKER FUNCTION
void worker(double* result, double* weight, int n_rows, int n_cols, int N, int high_harmonics,
		double* fss_arr, double* intens_arr, double* eps_arr, double* pa_arr,
		double* x0_arr, double* y0_arr, double* a3_arr, double* b3_arr, double* a4_arr, double* b4_arr,
		int nthreads)
{
  int i;
  int n_entries = n_rows * n_cols;

  omp_set_num_threads(nthreads);

  #pragma omp parallel default(shared) private(i)
  {
    // private arrays for each thread to write, calloc auto initializes to 0.0
    double* result_priv = calloc(n_entries, sizeof(double));
    double* weight_priv = calloc(n_entries, sizeof(double));

    #pragma omp for schedule(static) nowait
    for (i = 1; i < N; i++) {
      // variables, obtained at array[index]
      double sma0 = fss_arr[i];
      double intens = intens_arr[i];
      double eps = eps_arr[i];
      double pa = pa_arr[i];
      double x0 = x0_arr[i];
      double y0 = y0_arr[i];

      // Ellipse geometry, _phi_min (defined in original code as 0.05)
      // https://photutils.readthedocs.io/en/stable/_modules/photutils/isophote/geometry.html#EllipseGeometry
      double _phi_min = 0.05;

      // scan angle
      double r = sma0;
      double phi = 0.0;
      while (phi <= 2*PI + _phi_min) {
        double harm = 0.0;
        if (high_harmonics) {
          harm = (a3_arr[i] * sin(3.0*phi) +
          	b3_arr[i] * cos(3.0*phi) +
          	a4_arr[i] * sin(4.0*phi) +
          	b4_arr[i] * cos(4.0*phi)) / 4.0;
        }
        double x = r * cos(phi + pa) + x0;
        double y = r * sin(phi + pa) + y0;
        int i = (int) x;
        int j = (int) y;
        double fx, fy;

        if (i > 0 && i < n_cols-1 && j > 0 && j < n_rows-1) {
          // fractional deviations
          fx = x - i;
          fy = y - j;

          // transform 2D index to 1D index on a flattened array
          // ex: j_i = [j,i], j1_i = [j+1,i]
          int j_i = i + j*n_cols;
          int j_i1 = (i+1) + j*n_cols;
          int j1_i = i + (j+1)*n_cols;
          int j1_i1 = (i+1) + (j+1)*n_cols;

          // isophote contribution
          result_priv[j_i] += (intens + harm) * (1.0 - fy) * (1.0 - fx);
          result_priv[j_i1] += (intens + harm) * (1.0 - fy) * fx;
          result_priv[j1_i] += (intens + harm) * fy * (1.0 - fx);
          result_priv[j1_i1] += (intens + harm) * fy * fx;

          // fractional area contribution
          weight_priv[j_i] += (1.0 - fy) * (1.0 - fx);
          weight_priv[j_i1] += (1.0 - fy) * fx;
          weight_priv[j1_i] += fy * (1.0 - fx);
          weight_priv[j1_i1] += fy * fx;

          // step next pixel on ellipse
          phi = max((phi + 0.75/r), _phi_min);
          r = max(radius(phi, sma0, eps), 0.5);
        }
        else {
          break;
        }
      }
    }

    // write to main array, one thread at a time
    #pragma omp critical
    {
      for (int ct = 0; ct < n_entries; ct++) {
        result[ct] += result_priv[ct];
	weight[ct] += weight_priv[ct];
      }
    }

    // free memory
    free(result_priv);
    free(weight_priv);
  }  
}

HuanLe02 avatar Jan 31 '22 21:01 HuanLe02

Hello @huancho9802! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 35:5: W191 indentation contains tabs Line 35:5: E101 indentation contains mixed spaces and tabs Line 91:1: W293 blank line contains whitespace Line 109:1: W293 blank line contains whitespace Line 124:1: W293 blank line contains whitespace Line 125:6: E114 indentation is not a multiple of four (comment) Line 125:6: E116 unexpected indentation (comment) Line 136:1: W293 blank line contains whitespace Line 137:18: W291 trailing whitespace

Line 47:1: W293 blank line contains whitespace Line 48:1: E302 expected 2 blank lines, found 1 Line 65:1: W293 blank line contains whitespace Line 94:1: E302 expected 2 blank lines, found 1 Line 109:1: W293 blank line contains whitespace

pep8speaks avatar Jan 31 '22 21:01 pep8speaks

It seems like the ReadTheDocs build failed because of my worker.so file (OSError: shared object file not found). Can you help me with packaging the .so file? This is my first time doing this so I really appreciate any help. Thanks!

HuanLe02 avatar Feb 01 '22 15:02 HuanLe02

Thanks, @huancho9802. @ibusko, can you help with this PR?

larrybradley avatar Feb 07 '22 17:02 larrybradley

@larrybradley Unfortunately, I don't have much experience with CI, especially with packaging C libraries for ReadTheDocs. It would take me a while to get into that, and I am currently booked solid with JETC work. Next week I go back to work with the Indigo team, maybe we can carve out some time them?

ibusko avatar Feb 07 '22 19:02 ibusko

It seems like the ReadTheDocs build failed because of my worker.so file (OSError: shared object file not found). Can you help me with packaging the .so file? This is my first time doing this so I really appreciate any help. Thanks!

My memory from a couple of years ago dealing with a (possibly) similar issue is that ReadTheDocs simply won't install Python extension modules (i.e., modules based on compiled C/C++ code).

The problem I had was that I had Sphinx generating API-based HTML documentation using Sphinx autodoc commands; this would fail on ReadTheDocs for any module that tried to import the extension module, because the latter wasn't present in the ReadTheDocs installation.

(I eventually figured out a kludge where I would generate documentation normally on my laptop, and then repackage the API-based HTML output into "raw HTML" that could be used by Sphinx on ReadTheDocs in place of the autodoc-related commands.)

perwin avatar Feb 08 '22 10:02 perwin

Hello all! Thank you so much for taking a look at my PR. I'm aware that the auto build tools fail to build C code and recognize the binary *.so file. I notice that a lot of the C-extensions in the package are written in Cython, so should I migrate my function to Cython instead of pure C?

If yes then I will definitely take a look at Cython in the future, though I'm in school right now so it may take a while. For the mean time, you can take a look at the demo at https://github.com/huancho9802/fast-build-ellipse-demo and to see the speedup compared to pure Python code.

Thanks!

HuanLe02 avatar Feb 16 '22 16:02 HuanLe02