Ciw
Wishlist for Ciw 3.0
The possibility of a 3.0 release opens the door to some big changes that are not backwards compatible, as well as some smaller internal changes. Below is a list of possible features or changes that might appear. Please use the comments below to discuss.

- Remove the ability to read in from yaml (reasoning: `pyYaml` does this for us and better)
- Remove the ability to write results to file (reasoning: `pandas` does this for us and better. Ciw records are already lists of NamedTuples, which are easily compatible with pandas)
- Baulking and rejection losses should be recorded as `DataRecords` (exactly the same as how reneging and interrupted services are, for consistency)
- Classes should simply be indices (no need for the string `"Class 1"`, just index by the number `1`)?
- Add option for `'class_names'` and `'node_names'` in the Network object, so that the data records show these. This might make Ciw records more readable.
- Ensure consistent use of `current_time` / `next_event_time` internally (I think `current_time` is better)
- Rethink exact arithmetic (`Decimal(3.2) + 4.1 = Decimal(7.3)`, so maybe no need for the `increment_time` thing)
- (If possible) simplify `import_params.py` (especially in terms of routing functions)
- Replace `ciw.dists.NoArrivals()` with `None` (this is how reneging and dynamic classes handle no distributions)
- Make it easier to choose Individual attributes to include in the DataRecords, e.g. `Q = ciw.Simulation(N, attributes_to_record=['successful_service', 'age'])` (see https://github.com/CiwPython/Ciw/issues/205#issuecomment-1175267882 for how this is currently handled)

@drvinceknight @11michalis11 Let me know your thoughts / suggestions.
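
On the exact-arithmetic item in the wishlist, a quick check with the standard-library `decimal` module may help frame the discussion. This is only a sketch of how plain Python behaves, not of Ciw's internals: a `Decimal` built from a float keeps the float's binary rounding error, and adding a float to a `Decimal` raises `TypeError`, so the sum is only exact when both operands are already `Decimal` values built from strings. The helper name `add_float_to_decimal` is made up for this illustration.

```python
from decimal import Decimal

# Built from a float, the Decimal inherits the float's binary rounding error.
from_float = Decimal(3.2)
# Built from a string, the value is exactly 3.2.
from_string = Decimal("3.2")

# Exact addition when both operands are Decimals built from strings.
exact_sum = Decimal("3.2") + Decimal("4.1")


def add_float_to_decimal():
    """Return the exception type raised by Decimal + float, or None if it succeeds."""
    try:
        Decimal("3.2") + 4.1
    except TypeError as error:
        return type(error)
    return None


print(from_float == from_string)    # False
print(exact_sum == Decimal("7.3"))  # True
print(add_float_to_decimal())       # <class 'TypeError'>
```

So any simplification of the time-increment logic would presumably still need every time value converted to `Decimal` before it is added.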
May I add a couple of ideas?
1. Allow state-dependent reneging distributions
2. Allow general distributions for baulking
3. Allow "batching" to be set at any node, that is, a mechanism that waits for a certain number of customers before they are accepted/released, and then accepts/releases all of them at the same time
4. A "wait-to-be-pushed" mechanism, that is, a customer stays in service indefinitely until the queue is full and one more customer wants to join; the newcomer "pushes" everyone in the queue forward, and the customer who was in service is released
5. Parallel computations (which, however, is always tricky)
Hope it helps
Hi @lec00q, thank you so much for the suggestions!
I believe points 2) and 5) can be done already in some way:
- Baulking distributions are always user-defined, and they take in the number of customers already present. This means that users can define some hard-and-fast rules, or probabilistic distributions, or something more complex, to decide whether a customer will baulk. However, I think this can be improved further by passing the simulation object itself to the baulking function, allowing the rules/distributions to use the full state of the network, rather than just the number of customers present at the current node.
- I don't think parallel processing a single run of the simulation would be possible, as the logic is highly sequential. However, parallelising the trials can be done, for example using the `multiprocessing` library, or there are other solutions. I think a page in the documentation on how to do this might be beneficial.
I love the idea of 3) and 4). Do you have any examples of this so that I can further understand what is meant here?
I think 1) is quite difficult. When implementing reneging we found it hard to define a state-dependent reneging mechanism cleanly without erroneously sampling multiple times, which would change the probability distributions. I would welcome further discussion on this though.
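
To make the point about user-defined baulking concrete, here is a minimal sketch of a function that maps the number of customers present to a probability of baulking. The thresholds are invented for illustration, and `customer_baulks` is a hypothetical helper showing how such a probability might be sampled; it is not Ciw's internal code.

```python
import random


def probability_of_baulking(n):
    """Probability that an arriving customer baulks when n customers are present."""
    if n < 3:
        return 0.0   # short queue: nobody baulks
    if n < 7:
        return 0.5   # medium queue: half of arrivals baulk
    return 1.0       # long queue: everybody baulks


def customer_baulks(n, rng):
    """Sample one baulking decision (hypothetical helper, not part of Ciw)."""
    # rng.random() lies in [0, 1), so probability 0.0 never baulks
    # and probability 1.0 always baulks.
    return rng.random() < probability_of_baulking(n)


rng = random.Random(0)
decisions = [customer_baulks(5, rng) for _ in range(10)]
print(decisions)
```

A hard-and-fast rule is just the special case where the function only ever returns `0.0` or `1.0`.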
> I don't think parallel processing a single run of the simulation would be possible, as the logic is highly sequential. However, parallelising the trials can be done, for example using the `multiprocessing` library, or there are other solutions. I think a page in the documentation on how to do this might be beneficial.

I'm happy to PR this if you'd like me to @geraintpalmer
@drvinceknight that would be fantastic, thank you!
I've opened https://github.com/CiwPython/Ciw/pull/209 with https://ciw--209.org.readthedocs.build/en/209/Guides/parallel_process.html
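
For readers following the thread, parallelising over trials could look roughly like the sketch below. It is not taken from the linked guide. `run_trial` is a hypothetical stand-in for one full simulation run (a small single-server queue simulated with the Lindley recursion, using only the standard library), and the rates and customer count are arbitrary. Each trial gets its own seed, so the trials are independent and reproducible.

```python
import random
from multiprocessing import Pool


def run_trial(seed, n_customers=1000, arrival_rate=1.0, service_rate=2.0):
    """Stand-in for one simulation run: mean waiting time in a single-server queue."""
    rng = random.Random(seed)  # per-trial generator, so trials do not share state
    wait = 0.0                 # the first customer never waits
    total_wait = 0.0
    for _ in range(n_customers):
        total_wait += wait
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        # Lindley recursion: the next customer's wait, floored at zero.
        wait = max(0.0, wait + service - interarrival)
    return total_wait / n_customers


def run_trials_in_parallel(seeds, processes=2):
    """Run one trial per seed across worker processes; results keep the seed order."""
    with Pool(processes=processes) as pool:
        return pool.map(run_trial, list(seeds))
```

Calling `run_trials_in_parallel(range(8))` from under an `if __name__ == "__main__":` guard would return eight mean waiting times; the guard matters on platforms where worker processes are started by re-importing the main module.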
For point 4, we can actually use immutable types that have an ordering. See "Can Ciw Use Tuples For Class IDs?". Personally, I like this flexibility.
Thanks @galenseilis, this is a nice idea. I initially thought to keep customer classes as strings or integers, in order to make the data records easier to read with pandas, and to make reading/writing from file easier. Do you see any issue with this?
I think that not making any further changes to customer classes is desirable for my use cases. Using strings or integers works in many cases, and it is also compatible with using tuples, depending on the project. A column of tuples can be "exploded" into multiple columns using pandas, so I am not particularly concerned about that. Overall, I like the current state.
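
As a small illustration of why tuples work as class identifiers, the sketch below uses plain Python rather than pandas. `Record` is a hypothetical, two-field stand-in for a Ciw data record, and the class labels and waiting times are made up. It shows that tuples can key a dictionary, sort without extra code, and flatten into one column per element.

```python
from collections import namedtuple

# Hypothetical stand-in for a Ciw data record, reduced to two fields.
Record = namedtuple("Record", ["customer_class", "waiting_time"])

records = [
    Record(("urgent", 2), 0.5),
    Record(("routine", 1), 3.0),
    Record(("urgent", 1), 1.2),
]

# Tuples are hashable, so they can key a dictionary of per-class totals.
total_wait = {}
for record in records:
    key = record.customer_class
    total_wait[key] = total_wait.get(key, 0.0) + record.waiting_time

# Tuples compare lexicographically, so class IDs sort without extra code.
sorted_classes = sorted(total_wait)

# "Exploding" the tuple: one flat row per record, one column per tuple element.
flat_rows = [(*record.customer_class, record.waiting_time) for record in records]

print(sorted_classes)
print(flat_rows)
```

A list of flat rows like this can then be handed to whatever tabular tool the project uses.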