OpticSim.jl
Raytrace Forking Strategy
This issue has come up in the discussion about adding polarization, and I have been reading some of the other issues relating to it with regard to r & t. For now, I am not advocating any specific changes or improvements, but rather want to have a single place for this discussion.
My main concern is that the current stochastic branching between a reflection and a transmission, and its extension to the e & o rays of polarization or to the orders of a grating, is not the most effective way to address these problems. This is not to criticize the choice; rather, I want to ask whether it positions the code for solving the largest number of problems and how we can guide the implementation to be most usable.
My main points are:
- It would be just as easy to run two "experiments", each tracing one branch, and then combine the results.
- The quantity needed to determine the branching must be calculated anyway. It could just as easily be carried through to modify the final result of a single fork (for example, carrying the reflectivity along with the ray rather than counting rays at the end).
- Comparison to a random number is not really free, because it causes decisions to be made during execution that might impact the optimizations that could be done, especially with regard to the data structures that hold the results.
- Most importantly, it removes control from the user on the statistics of the data from each branch. Why not just let the user pick how many rays in each branch?
To me, it would seem that if I had two branches, one primary and one secondary, it would be better to calculate enough rays for the main branch to get the accuracy I needed and then set a smaller number for the secondary branch, since it might cover only unwanted light or system monitoring and not need the accuracy of the first. Or maybe the branches are orders of a grating and I need the same number of rays for each to determine the resolution even though the intensities are vastly different.
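As a rough sketch of what I mean, with entirely hypothetical names (this is not OpticSim API): each branch gets a user-chosen ray count, and each ray carries its branch's power fraction as a weight instead of being selected at random.

```julia
# Hypothetical weight-carrying split: the user picks the ray count per branch,
# and each ray carries its branch's share of the source power as a weight.
struct WeightedRay
    direction::Vector{Float64}
    weight::Float64   # fraction of the source power carried by this ray
end

function split_deterministic(dir_r, dir_t, R, T, n_r, n_t)
    reflected   = [WeightedRay(dir_r, R / n_r) for _ in 1:n_r]
    transmitted = [WeightedRay(dir_t, T / n_t) for _ in 1:n_t]
    return reflected, transmitted
end

# e.g. 100 rays for the secondary (reflected) branch, 1000 for the primary:
refl, trans = split_deterministic([0.0, 0.0, -1.0], [0.0, 0.0, 1.0], 0.2, 0.8, 100, 1000)
sum(r.weight for r in refl) + sum(t.weight for t in trans)  # ≈ 1.0, so power is conserved
```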
I also see a need at some point for a raytrace that doesn't just branch but forks, with both parts traced. This has specific application to imaging and interferometry, where the two paths are combined coherently, including the phase information, to generate quantities like the PSF or fringe intensity. The implementation of this kind of raytrace should not replace the raytrace for deterministic paths, but rather be an enhancement to enable modeling of specific imaging (coherent ghosting, for example), interferometry, and birefringent systems. Such a raytrace might also be used for scattered light, although that probably doesn't need the phase information required by the other cases.
Thoughts?
Dear @mattderstine
I also see a need at some point for a raytrace that doesn't just branch but forks, with both parts traced. This has specific application to imaging and interferometry, where the two paths are combined coherently, including the phase information [..]
Let me repeat the question from the former thread: why do you think such a combination is impossible with the current stochastic raytrace (after adding phase information etc., indeed)? In fact, this is exactly why I wish to use OpticSim in most of my use cases, and I do not see the problem.
- Most importantly, it removes control from the user on the statistics of the data from each branch. Why not just let the user pick how many rays in each branch?
I believe this is a good point; the only necessary enhancement is to add a means for the user to control this, perhaps via some additional interface properties, but why not. You know from my #276 reasoning that I wish this to be, in line with pre-#276 OpticSim, decided automatically by default based on pow. But changing it to a user-overridden fraction should be a minor change.
I would like the focus of this discussion to be on point 3. The others are really just explanations of the problems with a stochastic approach. These have implications for issues #266, #267, and #141 as well as the advanced topics of #276. I'd like to understand what would help provide the underpinning for progress on those problems.
With point 3 I fully agree. However, a seeded PRNG should solve that perfectly, right?
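For example, reproducibility alone is a one-liner in Julia:

```julia
using Random

Random.seed!(1234)          # makes every stochastic branch decision reproducible
rng = MersenneTwister(1234) # or pass an explicit RNG instead of using global state
rand(rng) < 0.2             # a reproducible reflect/transmit draw for R = 0.2
```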
@hrobotron No. Item 3 is not about the random number. It is about the uncertainty and complexity of the data structure needed to capture the raytrace data. I'd like to hear from the software experts on the costs associated with this. If the raytrace is along a fixed path, the data structures can be static and preallocated. This is not the case for raytraces that fork dynamically. The costs of a dynamic fork are real, but are they large? For my implementations they are, but I'm not a world-class coder. Perhaps there is a way. This is what item 3 and this issue are about.
OK, I misunderstood, I thought you were talking about optimization of optical assembly, not the raytracer itself. But your reference to data structures makes this clear. I am of no use for this point at the moment, first I need to familiarize with OpticSim design to see how actually the rays and results are allocated and stored.
There are several issues packaged together here.
- should the code be refactored so users can choose deterministic or stochastic ray tracing in a convenient way?
- how does stochastic branching interact with polarized computations in particular?
Currently, the stochastic branching decision logic is buried in the processintersection functions for the different types of interfaces. This is inconvenient from a user perspective, and maybe from a software engineering one as well. Changing from stochastic to deterministic tracing, or allowing both reflection and refraction paths to be taken simultaneously, requires rewriting each processintersection function. There are currently 4 processintersection functions that matter for this topic.
@cdhf submitted PR #101 which allowed the user to deterministically specify whether to take the reflected or refracted path at a FresnelInterface. But this only works for FresnelInterface. Not sure if this would even apply to the grating and hologram interface types.
Regardless, this is clearly a stopgap and not satisfying as a long term general solution. There are other unintuitive/hacky aspects to the FresnelInterface pointed out by @cdhf in issue #174. But many thanks to @cdhf for doing the code spelunking necessary to make this temporary deterministic path fix work.
We could abstract the code which makes the stochastic/deterministic decision into a function independent of the processintersection functions; let's call it choosepath for now. Then we could extend the interface specification for OpticalInterface to include a flag for choosing stochastic/deterministic/forking ray tracing on a per-surface basis, as is currently possible only for FresnelInterface due to PR #101. Each processintersection function would pass the flag, along with the power in the reflected and refracted paths, to the choosepath function, which would return the appropriate rays. If the stochastic flag is set, the path is chosen randomly according to power (any compelling reason to allow a user-defined function to make the choice?). If the reflected/refracted flag is set, the specified path is always taken. And if the all flag is set, both paths are taken.
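A minimal sketch of what choosepath could look like; choosepath, TracingMode, and the returned symbols are all placeholders, not existing OpticSim code:

```julia
# Placeholder sketch of the proposed abstraction, not existing OpticSim API.
@enum TracingMode STOCHASTIC REFLECTONLY REFRACTONLY ALLPATHS

function choosepath(mode::TracingMode, reflectedpower, refractedpower)
    if mode == STOCHASTIC
        # pick one branch with probability proportional to its power
        p = reflectedpower / (reflectedpower + refractedpower)
        return rand() < p ? (:reflected,) : (:refracted,)
    elseif mode == REFLECTONLY
        return (:reflected,)
    elseif mode == REFRACTONLY
        return (:refracted,)
    else                    # ALLPATHS: fork and trace both branches
        return (:reflected, :refracted)
    end
end

choosepath(ALLPATHS, 0.2, 0.8)   # (:reflected, :refracted)
```

Each processintersection function would then map the returned symbols to actual rays.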
However, allowing this choice to be made on a per-surface basis makes it difficult, if not impossible, to properly compute power if different choices are made at different surfaces. For Monte Carlo sampling, the power along a path is proportional to the source power times probability(path).
For all-path and deterministic sampling, the power takes into account attenuation at each surface and is stored in the power for each ray. The system currently computes this and stores it in the propagating ray even though we don't use this value.
But if you mix stochastic and deterministic/all-path, with some surfaces having one type and some having another, it's not at all clear how you unscramble the values to get a realistic power estimate. Perhaps there should be a warning in the documentation that it's up to the user to make sense of this because the system won't.
This would also require rewriting the trace function in LensAssembly.jl because it currently assumes a single ray path is returned by processintersection. Power would have to be accounted for differently in the trace function for CSGOpticalSystem, which records ray power per pixel. Currently it uses sourcepower because stochastic tracing is assumed.
I don't have time to make these changes now but I would be willing to advise and answer questions. Would either of you be willing to take this on?
On to the polarization issue. I initially thought that we could compose Chipman's polarization matrices as the ray traverses the system and then store a list of P matrices at each pixel, one for each input ray. Then, after the fact, one could change the electric field vector of the input ray and compute the polarized output simply by multiplying by the appropriate P matrix. This should be much faster than repeating the entire ray trace.
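A sketch of the idea, with random stand-in matrices where the real per-surface P matrices would go:

```julia
using LinearAlgebra

# Stand-ins for the per-surface 3x3 P matrices accumulated along one ray path;
# the real ones would come from the Fresnel coefficients at each interface.
path_pmatrices = [rand(3, 3) for _ in 1:5]

# Compose in traversal order: P_total = P5 * P4 * P3 * P2 * P1
P_total = reduce(*, reverse(path_pmatrices))

# After the trace, any input field can be evaluated without retracing:
E_in  = [1.0, 0.0, 0.0]
E_out = P_total * E_in
```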
However, this doesn't work with the stochastic sampling we currently use because the sampling is based on power. For polarized light this isn't known until the input electric field is known. Figuring out if this can somehow be salvaged would take some thought.
Once the sampling is abstracted into a choosepath function we could use an alternative stochastic choice mechanism for polarization, using something other than power to make the branch choice, that wouldn't require the electric field to be specified at ray trace time.
Or we could bite the bullet and not save the P matrices at each pixel in the detector and instead propagate the electric field vector during the ray trace. Which shouldn't be much slower than the non-polarized tracing we are doing now.
Any thoughts on how to proceed?
Dear @BrianGun
I don't have time to make these changes now but I would be willing to advise and answer questions. Would either of you be willing to take this on?
From my side: I'd prefer first to participate in writing and testing the polarization code.
Then maybe we will all (or at least I) learn the internals, and perhaps also work out an example of the impact of pow on stochastic forking; see below. Then maybe I could feel ready to give some opinion on the deterministic approach and then perhaps contribute.
On to the polarization issue. I initially thought that we could compose Chipman's polarization matrices as the ray traverses the system and then store a list of P matrices at each pixel, one for each input ray. Then, after the fact, one could change the electric field vector of the input ray and compute the polarized output simply by multiplying by the appropriate P matrix. This should be much faster than repeating the entire ray trace.
However, this doesn't work with the stochastic sampling we currently use because the sampling is based on power. For polarized light this isn't known until the input electric field is known. Figuring out if this can somehow be salvaged would take some thought.
Yes, exactly that was my point. Thank you for listening.
Or we could bite the bullet and not save the P matrices at each pixel in the detector and instead propagate the electric field vector during the ray trace. Which shouldn't be much slower than the non-polarized tracing we are doing now.
Yes, that's what I propose in #276. Propagate the E vector, plus the scalar value DoP (0 <= DoP <= 1) to allow for unpolarized light (with an unpolarized source as a good default, which ensures that no polarization is completely neglected, although it does not reflect the proper power ratios for a stochastic mixture). And indeed record the P matrices to be able to check against Chipman's well-proven method.
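A minimal sketch of the ray payload I have in mind (types and names illustrative, not the actual #276 code):

```julia
# Illustrative types, not the actual #276 code.
struct PolarizationState{T<:Real}
    evector::Vector{Complex{T}}  # E field vector (complex, so phase can ride along)
    dop::T                       # degree of polarization, 0 <= dop <= 1
end

# unpolarized source as the default: DoP = 0 and the E direction is then arbitrary
unpolarized = PolarizationState([1.0 + 0im, 0.0 + 0im, 0.0 + 0im], 0.0)
```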
@BrianGun and @hrobotron I'm going to put off stating an opinion on a way forward until I look at OpticSim more. I've invested too much time in discussing issues without being able to make helpful suggestions based upon knowledge of what code is really in place.
You have, however, confirmed my concerns about the code, and the issues hint at larger difficulties ahead if this issue is not addressed with care. Taking the easy way at this point will certainly cripple the functionality of OpticSim. I see this also as a question about the main function of OpticSim. If it is primarily for illumination applications (like LightTools, for example), then stochastic analysis should be the basis of how it works, with little attention to functions like wavefront analysis and polarization's impact on image performance. On the other hand, if the objective is to model complicated systems that approach the diffraction limit, the choices are different. In these cases, just propagating the electric field is insufficient.
I'll be back with more after I can knowledgeably write about the choices @BrianGun describes.
OpticSim is not biased toward illumination applications or wavefront analysis. Ideally it will do both well.
It's important to keep in mind the magnitude of the code changes we're talking about. The trace function in LensAssembly.jl is 38 lines long. The trace function in OpticalSystem.jl is 73 lines long. processintersection in Fresnel.jl is 54 lines. It is almost exclusively these functions which need to be changed with very minor changes elsewhere. And even the changes to these functions should be small, at least to make them handle polarization (given my current understanding of polarization).
The code that generates rays and tracks the ray tracing process is highly modularized and independent of the vast majority of the rest of OpticSim. For example, if sequential ray tracing was something everyone wanted we could write a sequential trace function in 50-100 lines of code. We didn't write it because we didn't need it for our application.
Even if we threw out these functions and started from scratch we're not talking about much code. Once we understand what we want to do and how to do it this should be straightforward. A few days of programming and testing, maybe a week. Most of the work will be in figuring out what and how, not in the programming itself.
We don't have to write code with giant switch statements that handle every possibility: polarization, wavefront analysis, illumination, etc. Instead we can dispatch the trace and processintersection functions on the type of the ray and use different rays for different analyses. The current type for optical rays is OpticalRay{T,N}. We could change this to OpticalRay{T,N,P} where P encodes the type of analysis to be done: OpticalRay{Float64,3,Polarization}, OpticalRay{Float64,3,Illumination}, etc.
Then the user decides what analysis to do by creating the appropriate type of ray to begin the ray tracing process.
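A sketch of that scheme; none of this is the current OpticSim code, and the field layout is only illustrative:

```julia
# Illustrative sketch of analysis-by-ray-type dispatch, not the current OpticSim code.
abstract type AbstractAnalysis end
struct Illumination <: AbstractAnalysis end
struct Polarization <: AbstractAnalysis end

struct OpticalRay{T<:Real,N,P<:AbstractAnalysis}
    origin::NTuple{N,T}
    direction::NTuple{N,T}
    power::T
end

# one method per analysis; the compiler selects it from the ray's type
processintersection(::OpticalRay{T,N,Illumination}) where {T,N} = :power_only
processintersection(::OpticalRay{T,N,Polarization}) where {T,N} = :power_plus_pmatrix

ray = OpticalRay{Float64,3,Polarization}((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1.0)
processintersection(ray)   # :power_plus_pmatrix
```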
Hi @BrianGun, I'm about to put your goal of being all things to all users to the test. :-)
I'd like to address some approaches to your question about how to manage the apparent inconsistency between the stochastic raytrace and the P matrix method of addressing polarization.
The basic question is: what properties of the light are attached to the ray? Is the ray carrying a power-like quantity (intensity, radiance, and specific intensity are all power-like; I'll just reference them as "power") or an amplitude or E-field (which I'll reference as "field")? For most ray tracing, the ray can be used for either, depending upon what is convenient. So convenient, in fact, that we forget they are not identical.
The approach for stochastic raytracing in OpticSim, as I understand it, is to make decisions based upon the power as to which rays to trace. This breaks the assumptions on the ray that allow the field to be treated identically to the power. I believe this was part of the discussion with @cdhf on how to manage measuring power between stochastic and deterministic raytraces; specifically, I seem to remember some discussion about addressing apodization properly. (For some reason I can no longer see those issues, or I would have read them carefully and referenced them.)
This is at the heart of the problem with using the P matrix with the stochastic raytrace. The P matrix is a field-related quantity. One solution to this issue is to carry more information along with the stochastic raytrace so that the field behavior can be recovered. I consider this easy to implement but ultimately confusing. It also has the potential to lead to a raytrace that is not great for power raytraces and not great for amplitude raytraces, because extra data is carried along that is only relevant to one or the other. There are other choices, which I will outline.
First, let me describe examples of analysis based on power and field and show how they are different. I'll then describe some of the choices and advantages and disadvantages.
Consider a very simple system: a partially reflecting/transmitting surface. Let's assume there is a 20/80 power-reflection to power-transmission split. Now trace 10 rays with identical starting conditions, as if this were a stochastic sampling of the emitter distribution (a delta-function spatial distribution with a delta-function angular distribution, with the rays carrying power). There will be about 2 rays reflected and 8 rays transmitted. This means that the reflected path has 20% of the power and the transmitted path has 80%. This is pretty trivial so far.
However, what has happened to the field? In this case, the reflected field is two rays from an identical direction, location, and source in the reflected path. How should these be combined? Adding and squaring is wrong because they come from the same source: it would give a power 2 times higher than interpreting the rays as power. However, if the rays came from different sources (say they were combined by another beamsplitter before our example system) and their phase and wavelength were the same, then adding and squaring gives just the right answer, and the power-based interpretation is incorrect. This is equivalent to the situation you identify where the P matrix will be inconsistent in the stochastic raytrace.
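A toy calculation makes the factor-of-two discrepancy explicit:

```julia
# Two coincident reflected rays from the SAME source, each carrying power p.
p = 1.0
power_sum = 2p                       # power interpretation: 2.0, correct here
field_sum = abs2(sqrt(p) + sqrt(p))  # add amplitudes, then square: 4.0
field_sum / power_sum                # 2.0, the factor-of-two error described above
```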
I can think of several solutions to this dilemma.
- Track the information for both power and field quantities in this same raytrace.
- Get rid of the stochastic raytrace and replace it with alternatives that better preserve the ability of the ray to represent power and field.
- Write multiple raytraces that are focused on the problems being investigated. For example a power centric raytrace could have stochastic sampling and Mueller matrix polarization tracing while a field centric raytrace would forgo stochastic sampling and use the P matrix. The power and field raytraces could then be optimized for their main application area, scattering and illumination in the former and image quality and detailed investigation of interference in the latter. Naturally, the application area of the two (or more) raytraces would overlap, for example it appears @hrobotron uses power raytraces to model interferometers. Users are always going to find the best way to solve their problem.
Some ideas on how to implement each approach:
1. Stochastic branching.
a. In this case, the power-splitting ratio for each stochastic fork needs to be calculated and stored for each fork in the raytrace so that the field-centric analysis can reconstruct the fields (assuming the phase information is stored already). My recommendation would be to write analysis functions that show how to use this information to perform accurate field analysis, because of the confusion as to how to extract the physical quantities from the model output.
b. Remove the power-based test and gain a statistical advantage. Make all branching with fixed ratios, chosen by the user, so that the number of rays in each branch matches the desired statistical requirement. This information would then be combined with the actual R or T at the interface to determine the powers or fields. It would not need to be stored on a per-ray basis because all rays are the same. This probably runs faster than 1a but is less friendly to the user who is only interested in power and doesn't care to address the statistics of their runs a priori.
2. Eliminate stochastic branching.
a. In this case, create analysis functions that enable a fixed number of rays to go one direction and a fixed number the other. This ends up with the same analysis functions as 1b above. It is likely faster than 1b, but it would not model systems with an indeterminate number of possible branch opportunities well (for example, light leaking from a hollow cylinder). This raytrace is becoming more field-centric and less power-centric.
b. Create a raytrace that forks and follows all paths; a sketch follows this list. This would work well for field systems (especially for walkoff from birefringent materials and multiple orders from gratings) but, like 2a, is a problem for some illumination and scattering problems.
3. Multiple raytraces. I'll add a sequential raytrace to the options, as it would make field-centric analysis for many systems much simpler and more efficient.
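To make 2b concrete, here is a hedged sketch (all names hypothetical) of a trace that forks at every interface and returns the whole path tree. The exponential growth in paths is exactly why this suits field-centric systems with few branching surfaces but not general illumination or scattering problems:

```julia
# Hypothetical forking trace over a chain of partially reflecting surfaces.
struct PathNode
    surface::Int
    amplitude::Float64          # field amplitude carried into this node
    children::Vector{PathNode}
end

# amplitude reflection/transmission coefficients; here a fixed 20/80 power split
splitcoefficients(surface) = (sqrt(0.2), sqrt(0.8))

function forktrace(surface::Int, amplitude::Float64, nsurfaces::Int)
    surface > nsurfaces && return PathNode(surface, amplitude, PathNode[])
    r, t = splitcoefficients(surface)
    children = [forktrace(surface + 1, amplitude * r, nsurfaces),
                forktrace(surface + 1, amplitude * t, nsurfaces)]
    return PathNode(surface, amplitude, children)
end

tree = forktrace(1, 1.0, 3)   # 2^3 leaf paths, each with its own amplitude
```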
I fear that this is still pretty confusing. As you can surmise, I have been focusing on field-centric problems and my desire to have solutions that work for them. I believe @hrobotron has been most interested in power-centric solutions and so finds my recommendations needlessly complicated. And it is true that one could write raytraces that are field-centric and then derive all of the power-centric metrics from the underlying field parameters, as I was originally advocating. The question is whether this is really the best way to solve the widest range of problems. I have become convinced while thinking about these problems that there are illumination and scattering problems for which this is decidedly not the case. Or, to be succinct: some illumination problems should be solved with stochastic raytraces, but stochastic raytraces are very inefficient for most field-centric analyses.
Ultimately, you'll have to decide where your and your colleagues' interest lie and focus on creating the infrastructure to solve those problems. The rest of us can only hope that it will be close enough to solve ours or be willing to take the time to create what we need.
@mattderstine
Because of the relatively small amount of code that needs to be modified to change the ray selection policy, or to add polarization support, let's think of what we are doing now as a series of experiments. We'll talk for a bit then code for a bit and use the code for a while, noting deficiencies. Repeat a few times till we are satisfied.
We shouldn't obsess too much right now about ray spawning, stochastic vs. deterministic. We can easily do both because the amount of code that needs to change between the two is small, perhaps a hundred lines, probably less.
By properly using generic types and function dispatch we should be able to make the overhead zero, i.e., there will be one function you dispatch on for deterministic tracing and a different one for stochastic, and neither one will do unnecessary work. By selecting the right generic type for the ray argument to trace, the appropriate functions are chosen automatically.
To make this simpler the current code needs to be refactored to better abstract ray spawning out of the various processintersection functions where it is buried. But this doesn't look hard to me.
Initially we shouldn't worry too much about efficiency until we have a correct and complete(ish) implementation of Chipman and have tried to use it solve problems. Then we can optimize.
Short responses to some of your questions:
The basic question is: what properties of the light are attached to the ray? Is the ray carrying a power-like quantity (intensity, radiance, and specific intensity are all power-like; I'll just reference them as "power") or an amplitude or E-field (which I'll reference as "field")? For most ray tracing, the ray can be used for either, depending upon what is convenient. So convenient, in fact, that we forget they are not identical.
The way I have written the current experimental code, the polarization/E-vector information is carried in the OpticalRay struct by extending the type signature from OpticalRay{T,N} to OpticalRay{T,N,P<:AbstractPolarization}. There are currently only two subtypes of AbstractPolarization: Chipman and NoPolarization. Chipman has two fields: pmatrix and electricfieldvector. I think this is all we need to propagate all the electric field information, but we will find out as we get further into using the code.
The various experimental processintersection functions dispatch on the precise type of the ray argument. There is a processintersection for a ray with type OpticalRay{T,N,NoPolarization} and another for OpticalRay{T,N,Chipman}.
Each function does only the work necessary for the desired type of analysis. The user pays the computational cost for polarization only when they specify they want a polarization trace. If they are doing conventional power based tracing the additional cost is zero compared to current OpticSim.
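For the record, my reading of that structure as a sketch (the field types are my guesses, not the actual experimental code):

```julia
# Guessed field types; the actual experimental code may differ.
abstract type AbstractPolarization end

struct NoPolarization <: AbstractPolarization end

struct Chipman{T<:Real} <: AbstractPolarization
    pmatrix::Matrix{Complex{T}}             # P matrix accumulated along the path
    electricfieldvector::Vector{Complex{T}}
end

# The user opts in through the ray type, e.g. OpticalRay{Float64,3,Chipman{Float64}}
# vs. OpticalRay{Float64,3,NoPolarization}; processintersection dispatches on
# that last type parameter, so the power-only path does no polarization work.
```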
This decision about OpticalRay carrying field information this way is not written in stone. It can easily be changed but for now we should leave it unless it is obviously logically incorrect. It will be more productive for us to begin writing and using code than arguing in the abstract. If something isn't working it will become immediately apparent when we try to simulate a system.
1. Stochastic branching.
a. In this case, the power-splitting ratio for each stochastic fork needs to be calculated and stored for each fork in the raytrace so that the field-centric analysis can reconstruct the fields (assuming the phase information is stored already). My recommendation would be to write analysis functions that show how to use this information to perform accurate field analysis, because of the confusion as to how to extract the physical quantities from the model output.
b. Remove the power-based test and gain a statistical advantage. Make all branching with fixed ratios, chosen by the user, so that the number of rays in each branch matches the desired statistical requirement. This information would then be combined with the actual R or T at the interface to determine the powers or fields. It would not need to be stored on a per-ray basis because all rays are the same. This probably runs faster than 1a but is less friendly to the user who is only interested in power and doesn't care to address the statistics of their runs a priori.
As mentioned above this is something we will be able to do more easily when the code has been refactored to abstract out the ray spawning algorithm. It does not seem hard to me to allow for many different types of ray spawning either conceptually or in terms of the amount of code we'll have to write.
@mattderstine @hrobotron
I just pushed a version that has Chipman example 9.4 in the Polarization.jl testset in the test directory. You can use this to see how to set up your own tests. This does not test the trace function. It should return correct polarization information, but that is not verified yet.
This code does not correctly handle normal incidence yet. That's a special case that requires a different construction for the local s-p coordinate frame. Not hard, just not coded.
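For reference, one common construction of the local s-p frame that handles the degenerate case; this is my sketch, not the code in the repo:

```julia
using LinearAlgebra

# s is perpendicular to the plane of incidence; p completes the frame.
function sp_frame(direction::AbstractVector, normal::AbstractVector)
    s = cross(direction, normal)
    if norm(s) < 1e-12                  # normal incidence: direction parallel to normal
        # the plane of incidence is undefined, so pick any vector perpendicular to normal
        seed = abs(normal[1]) < 0.9 ? [1.0, 0.0, 0.0] : [0.0, 1.0, 0.0]
        s = cross(seed, normal)
    end
    s = normalize(s)
    p = normalize(cross(s, direction))  # perpendicular to propagation, in the plane of incidence
    return s, p
end

sp_frame([0.0, 0.0, 1.0], [0.0, 0.0, -1.0])  # the normal-incidence case now works
```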
Let me know if you can make sense out of the code and implement other examples from Chipman.