Wishlist for Ciw 3.0

Open geraintpalmer opened this issue 2 years ago • 10 comments

A 3.0 release opens up some possibilities: big changes that are not backwards-compatible, as well as smaller internal changes. Below is a list of possible features or changes that might appear. Please use the comments below to discuss.

  1. Remove ability to read in from yaml (reasoning: pyYaml does this for us and better)
  2. Remove ability to write results to file (reasoning: pandas does this for us and better. Ciw records are already lists of NamedTuples, which are easily compatible with pandas)
  3. Baulking and rejection losses should be recorded as DataRecords (exactly the same as how reneging and interrupted services are, for consistency)
  4. Classes should simply be indices (no need for the string "Class 1", just index by the number 1)?
  5. Add option for 'class_names' and 'node_names' in the Network object, so that the data records show these. This might make Ciw records more readable.
  6. Ensure consistent use of current_time / next_event_time internally (I think current_time is better)
  7. Rethink exact arithmetic (Decimal(3.2) + 4.1 = Decimal(7.3), so maybe no need for the increment_time thing)
  8. (If possible) simplify import_params.py (especially in terms of routing functions)
  9. Replace ciw.dists.NoArrivals() with None (this is how reneging and dynamic classes handle no distributions)
  10. Make it easier to choose Individual attributes to include in the DataRecords, e.g. Q = ciw.Simulation(N, attributes_to_record=['successful_service', 'age']) (see https://github.com/CiwPython/Ciw/issues/205#issuecomment-1175267882 for how this is currently handled)
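
Item 7 can be checked against Python's standard `decimal` module. The following is a minimal sketch of plain-Python behaviour only; it says nothing about how Ciw handles exact arithmetic internally, and the variable names are illustrative.

```python
from decimal import Decimal

# Floats accumulate binary rounding error.
float_sum = 0.1 + 0.2

# Decimals built from strings add exactly.
exact_sum = Decimal("3.2") + Decimal("4.1")

# A Decimal built from a float inherits that float's rounding error.
from_float = Decimal(3.2)

# The decimal module rejects adding a float to a Decimal.
try:
    Decimal("3.2") + 4.1
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(float_sum == 0.3)             # False
print(exact_sum == Decimal("7.3"))  # True
print(from_float == Decimal("3.2")) # False
print(mixed_ok)                     # False
```

In plain Python, then, the expression in item 7 only holds as written if both operands are Decimals built from strings, so any simplification would presumably need to convert sampled floats before adding them.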

geraintpalmer avatar Jul 05 '22 16:07 geraintpalmer

@drvinceknight @11michalis11 Let me know your thoughts / suggestions.

geraintpalmer avatar Jul 05 '22 16:07 geraintpalmer

May I add a couple of ideas?

  1. Allow state-dependent reneging distributions
  2. Allow general distributions for baulking
  3. Allow "batching" to be set at any node, that is, a mechanism that waits for a certain number of customers before they are accepted/released, and then accepts/releases all of them at the same time
  4. A "wait-to-be-pushed" mechanism, that is, a customer stays in service indefinitely until the queue is full and one more customer wants to join it; the new customer then "pushes" all the others along the queue, and the one who was in service is released
  5. Parallel computations (which however is always tricky)

Hope it helps

lec00q avatar Jul 22 '22 12:07 lec00q

Hi @lec00q, thank you so much for the suggestions!

I believe points 2) and 5) can be done already in some way:

  • Baulking distributions are always user-defined, and they take in the number of customers already present. This means that users can define some hard-and-fast rules, or probabilistic distributions, or something more complex, to decide whether a customer will baulk. However, I think this can be improved further by passing the simulation object itself to the baulking function, allowing the rules/distributions to use the full state of the network, rather than just the number of customers present at the current node.
  • I don't think parallel processing a single run of the simulation would be possible, as the logic is highly sequential. However, parallelising the trials can be done, for example using the `multiprocessing` library, or there are other solutions. I think a page in the documentation on how to do this might be beneficial.
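
The first point, a user-defined baulking rule that takes the number of customers already present, could look roughly like this. It is a sketch only: the thresholds are made up, `customer_baulks` is a hypothetical helper for illustration, and how the function is registered with Ciw is not shown.

```python
import random

def baulking_probability(n):
    """Probability that an arriving customer baulks, given n customers present."""
    if n < 3:
        return 0.0   # short queue: always join
    if n < 7:
        return 0.5   # medium queue: join half the time
    return 1.0       # long queue: always baulk

def customer_baulks(n, rng):
    """Sample one baulking decision (hypothetical helper, not Ciw's API)."""
    return rng.random() < baulking_probability(n)

rng = random.Random(0)
decisions = [customer_baulks(n, rng) for n in (0, 5, 10)]
print(decisions[0], decisions[2])  # False True
```

A rule that used the full network state, as suggested above, would need a different signature from this one.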

I love the idea of 3) and 4). Do you have any examples of this so that I can further understand what is meant here?

I think 1) is quite difficult. When implementing reneging we found it difficult to clearly define a state-dependent reneging mechanism without wrongly sampling multiple times, and so changing the probability distributions. I would welcome a further discussion on this though.

geraintpalmer avatar Jul 25 '22 15:07 geraintpalmer

  • I don't think parallel processing a single run of the simulation would be possible, as the logic is highly sequential. However, parallelising the trials can be done, for example using the `multiprocessing` library, or there are other solutions. I think a page in the documentation on how to do this might be beneficial.

I'm happy to PR this if you'd like me to @geraintpalmer

drvinceknight avatar Jul 26 '22 09:07 drvinceknight

@drvinceknight that would be fantastic, thank you

geraintpalmer avatar Jul 26 '22 12:07 geraintpalmer

I've opened https://github.com/CiwPython/Ciw/pull/209 with https://ciw--209.org.readthedocs.build/en/209/Guides/parallel_process.html
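
For readers who do not follow the link, the general pattern of parallelising independent trials discussed earlier in this thread can be sketched with the standard library. This is not taken from the linked guide: `run_trial` is a hypothetical stand-in (a single-server queue simulated with the Lindley recursion) for whatever one seeded simulation run would be, and all parameter values are assumptions.

```python
import random
from multiprocessing import Pool

def run_trial(seed, n_customers=1000, arrival_rate=1.0, service_rate=2.0):
    """Hypothetical stand-in for one trial: mean wait in a single-server queue."""
    rng = random.Random(seed)  # each trial gets its own seeded generator
    wait = 0.0
    total_wait = 0.0
    for _ in range(n_customers):
        total_wait += wait
        service = rng.expovariate(service_rate)
        gap = rng.expovariate(arrival_rate)
        # Lindley recursion: the next customer's wait, never below zero.
        wait = max(0.0, wait + service - gap)
    return total_wait / n_customers

if __name__ == "__main__":
    seeds = range(8)
    # Trials are independent, so they can be spread across worker processes.
    with Pool(processes=4) as pool:
        mean_waits = pool.map(run_trial, seeds)
    print(sum(mean_waits) / len(mean_waits))
```

Seeding each trial separately keeps the results reproducible regardless of which worker process runs which trial.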

drvinceknight avatar Jul 29 '22 09:07 drvinceknight

For item 4, we can actually use any immutable type that has an ordering. See "Can Ciw Use Tuples For Class IDs?". Personally, I like this flexibility.
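
As a plain-Python illustration of why tuples fit that description (this does not touch Ciw itself, and the class IDs below are made up):

```python
from collections import Counter

# Hypothetical class IDs: (priority band, customer type).
class_ids = [
    ("urgent", "adult"),
    ("routine", "child"),
    ("urgent", "child"),
    ("routine", "child"),
]

# Tuples are hashable, so they can be dictionary keys.
counts = Counter(class_ids)

# Tuples are ordered lexicographically, so they can be sorted.
ordered = sorted(counts)

# Each component can be split back out into its own column.
bands, types = zip(*class_ids)

print(counts[("routine", "child")])  # 2
print(ordered[0])                    # ('routine', 'child')
print(bands)                         # ('urgent', 'routine', 'urgent', 'routine')
```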

galenseilis avatar Oct 23 '23 22:10 galenseilis

Thanks @galenseilis, this is a nice idea. I initially thought to keep customer classes as strings or integers, to make the data records easier to read with pandas and to read from or write to file. Do you see any issue with this?

geraintpalmer avatar Oct 26 '23 14:10 geraintpalmer

I think that not making any further changes to customer classes is desirable for my use cases. Using strings or integers works in many cases, and it is also compatible with using tuples, depending on the project. A column of tuples can be "exploded" into multiple columns using pandas, so I am not particularly concerned about that. Overall, I like the current state.

galenseilis avatar Oct 30 '23 03:10 galenseilis