[2.0] Proposal: A standard policy for vector dimension mismatches
The goal: A consistent policy for vector operations
With the introduction of $n$-dimensional vectors in p5.js 2.0, we have an exciting opportunity to align p5.js with modern math and machine-learning libraries. This can provide an accessible, creative onramp for users to learn fundamental data-science concepts. In fact, this was the original motivation for introducing $n$-dimensional vectors to p5.
A core concept in these libraries is broadcasting, a standard set of rules for handling operations between vectors, matrices, or tensors of different dimensions. These rules are used everywhere from math libraries like math.js, to machine-learning libraries like TensorFlow.js.
This issue proposes that p5.js adopt the standard broadcasting rules to ensure our math library is consistent, predictable, and extensible for future features like p5.Matrix. Let's explore what that means.
How broadcasting works
For now, it will help to consider how broadcasting works in the special case of vectors. Here, broadcasting tells us that two vectors can be operated on if they have matching dimensions, or if one of the vectors is 1D. That's it. Let's look at some examples of why this rule is so useful.
Addition and subtraction:
createVector(10, 10, 10).add(2) produces components [12, 12, 12]
The 2 in the 1D vector [2] is broadcast to higher dimensions, so that we're really adding [10, 10, 10] and [2, 2, 2]. This is a useful operation in data processing, statistics, etc. For example, given a list of exam grades like [92, 83, 61, 97, 72, 75, 64, 95, 100, 82], we can center it around zero by subtracting the average (82.1) from every number in the list. This makes it clear which scores are below average and which are above average.
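To make that concrete, here's a minimal sketch of the centering step, assuming an $n$-dimensional `createVector` and the proposed broadcasting rules:

```js
// Centering exam grades around zero: the scalar average is broadcast,
// i.e., subtracted from every component.
let grades = createVector(92, 83, 61, 97, 72, 75, 64, 95, 100, 82);
grades.sub(82.1); // 82.1 is the mean of the ten grades
// components are now:
// [9.9, 0.9, -21.1, 14.9, -10.1, -7.1, -18.1, 12.9, 17.9, -0.1]
```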
Multiplication and division:
createVector(10, 10, 10).mult(2) produces components [20, 20, 20]
As before, the 1D vector [2] becomes [2, 2, 2], and the operation is applied elementwise. This isn't just predictable. It's also very useful. This is scalar multiplication: the vector is scaled by 2, making it twice as long.
[Edit] Performance clarification:
To clarify, the explanation above describes the conceptual model that users can rely on to predict the results of broadcasting. Efficient broadcasting implementations would never actually expand a 1D vector like [2] to a 3D vector [2, 2, 2].
The problem: Current behavior is inconsistent
Currently, p5.js does not follow standard broadcasting rules, which creates several problems:
- Internal inconsistency: p5.js is inconsistent with itself: `mult(2)` applies to all components (correct), but `add(2)` applies only to the first component. It offers no such shortcut for the second component.
- External inconsistency: The custom behavior in p5 is different from every major math and ML library, making p5.js less of an onramp, and forcing users to unlearn p5's rules to advance. It's also inconsistent with creative-coding libraries. For example, openFrameworks uses standard broadcasting for operations among vectors.
- Confusing padding rules: For multi-element multipliers like in `createVector(1, 1, 1).mult(2, 2)`, p5.js pads the missing component with 1 (resulting in `[2, 2, 1]`), which is an unpredictable special case. Users might guess `[2, 2, 2]`.
[Update] Added point about openFrameworks.
How could p5 work? The options.
As we stabilize the p5.Vector feature set for p5.js 2.0, we have the opportunity to reassess the rules that p5 should follow. Several options have been described by @limzykenneth:
- Refuse to operate on incompatible vectors (i.e. throwing an error when this is tried)
- Perform broadcasting where possible and refuse to operate thereafter
- Automatically convert all vectors to the highest common dimension with 0 padding before operating
- Some combination of the above
Weighing the options: Options 2 and 4 seem most viable
The first option is likely not viable, as it would disallow common operations like scalar multiplication. That leaves Options 2, 3, and 4. Option 3 introduces additional forms of complexity, as noted previously, and it goes against the original reason for introducing $n$-dimensional vectors to p5, since advanced math and machine-learning libraries do not work this way. That leaves Options 2 and 4. Perhaps, in a creative-coding context, Option 4 might be useful?
The trouble with Option 4
For Option 4, it seems sensible to at least follow standard broadcasting rules when one of the vectors is 1D. Then the question is, how do we handle mismatches where neither of the vectors is 1D? If we look at a concrete example, we start to see how confusing it might be. In the example below, there is no obvious way to proceed, and users are left guessing.
Example: createVector(2, 3).mult(4, 5, 6)
- Do we extend [2, 3] to [2, 3, 0], since that's the most natural way to extend a 2D vector to a 3D vector?
- Do we extend [2, 3] to [2, 3, 3], extending the broadcasting approach by repeating the last entry?
- Do we extend [2, 3] to [2, 3, 1], since 1 is the multiplicative identity?
- Does the user want the vector [2, 3] to turn into a 3D vector at all?
This is just one simple vector example. If we consider matrices or tensors, the situation may become more complicated.
Proposal: Adopt Standard Broadcasting (Option 2)
Based on the analysis above, I propose that p5.js adopt the standard, widely-used broadcasting rules:
- Operations are allowed if vector dimensions match.
- Operations are allowed if one operand is a scalar (a 1D vector or a single number).
- All other dimension mismatches will throw an error.
This approach is simple, consistent, and avoids the ambiguity of custom padding rules (as shown in the "trouble with Option 4" example). It also aligns with the original motivation for $n$-dimensional vectors, by preparing users for advanced math and machine learning libraries. And it ensures our API will be extensible to p5.Matrix and even p5.Tensor.
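For illustration, here's how these rules would play out in code (hypothetical results under this proposal, not current behavior):

```js
// Hypothetical outcomes under the proposed policy.
createVector(1, 2, 3).add(createVector(4, 5, 6)); // OK: dimensions match -> [5, 7, 9]
createVector(1, 2, 3).mult(2);                    // OK: scalar operand -> [2, 4, 6]
createVector(1, 2, 3).mult(createVector(2));      // OK: 1D operand -> [2, 4, 6]
createVector(2, 3).mult(4, 5, 6);                 // Error: 2D vs. 3D mismatch
```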
[Update] Additional benefits: The proposed policy would also resolve existing issues beyond vector algebra. An example is outlined in #8189.
Abundance of evidence indicates zero disruption
This would be a breaking change, but major releases are the appropriate time to fix confusing or inconsistent APIs. The key cost to consider is user disruption. To assess this, I collected two forms of data, which together provide a robust body of evidence that indicates zero disruption.
- Empirical data: I did a Google search for `site:https://editor.p5js.org/ "createVector" "add"` and manually reviewed the first 50 results that called the `add()` method. Of these, precisely zero used dimension mismatches. [^1] I repeated the same methodology for the `mult()` method, and similarly found that zero of 50 sketches using `mult()` called it with a nonstandard dimension mismatch (the only dimension mismatch detected was multiplication by a scalar, a case which is unchanged by the current proposal).
- Domain knowledge: This 0% finding is what we'd expect. The non-standard behavior is not documented by any reference examples, it has a more writable and readable alternative (e.g. `v.x += 2`), it has no clear use cases, and it is inconsistent with every single major library, across languages and domains; even Processing's `PVector` does not behave this way.
A more rigorous, Bayesian analysis (for the curious)
For those interested in statistical rigor, this is a textbook case for a beta-binomial model. Given the $0$ observed uses in our $n=50$ sample for `add()`, and a very generous prior belief that maybe 1 in 1,000 sketches using `add()` rely on the nonstandard behavior (0.1%), we can be 97.5% confident that the true usage rate in the wild is, at most, 0.351% (or about 1 in 284 sketches that use `add()`). Also, if there are any sketches that use the non-standard behavior, that code would only break if it's migrated to 2.x; many sketches will be left as is and won't switch to an upgraded version. So it's likely that the true proportion of sketches that would be broken is extremely small, and may well be exactly zero.
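For anyone who wants to check the arithmetic, here's a small script reproducing that bound, assuming a Beta(1, 999) prior (mean 0.1%); with zero observed successes, the posterior quantile has a closed form:

```js
// Beta-binomial update: a Beta(a, b) prior with k successes in n trials
// yields a Beta(a + k, b + n - k) posterior. Here the prior is Beta(1, 999)
// (mean 1/1000 = 0.1%), and we observed k = 0 mismatches in n = 50 sketches,
// so the posterior is Beta(1, 1049).
const b = 999 + 50 - 0; // posterior second parameter, with a + k = 1

// For Beta(1, b), the CDF is F(x) = 1 - (1 - x)^b, so the 97.5th
// percentile is x = 1 - 0.025^(1/b).
const upper = 1 - Math.pow(0.025, 1 / b);
console.log(`${(upper * 100).toFixed(3)}%`); // prints "0.351%", about 1 in 284
```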
Just in case, we can clearly document the breaking change in the compatibility README, which contains the official list of breaking changes made in the upgrade from 1.x to 2.x.
Updates:
- Added results of the `mult()` analysis.
- Added supporting domain knowledge.
- Added statistical model. (See #8203 for original version.)
Discussion
What do you think, everyone? How would you handle dimension mismatches? Are there any use cases I didn't cover that you think are important?
Invitation for comment
Many other community members have been actively involved in related discussions, and I'd love to hear their thoughts. These include @ksen0, @limzykenneth, @inaridarkfox4231, @sidwellr, @Ahmed-Armaan, @davepagurek, @holomorfo, @nickmcintyre, @RandomGamingDev, and many others. Everyone is welcome to share their ideas!
[^1]: In a technical sense, one sketch used a dimension mismatch due to the way vectors are represented in 1.x, but if this sketch were upgraded to 2.x, there would be no dimension mismatch, so the code would not break. Specifically, the code had the form add(number1, number2) and was adding to a vector that was intended to be 2D. In 1.x, vectors are represented internally as 3D vectors, so this involved a mismatch. However, with true 2D vectors in 2.x, there'd be no dimension mismatch, so the change to standard broadcasting rules wouldn't break this code.
My one thing to add for partial broadcasting is that I'd broadcast with whatever leaves the remaining values untouched -- so if you're multiplying or dividing, it'd mean padding with 1; if you're adding or subtracting, it'd mean padding with 0.
That said, I generally agree that those silent choices have a lot of potential for confusion. e.g. outside of the vector API, drawing a 3D model with scale(n, n) accidentally doesn't scale the z axis at all, which can be hard to notice if you don't orbit, and can be hard to trace down the cause for.
So my preference is to also go with what you suggested, allowing operations with scalars but nothing else.
The only thing that gives me a bit of pause is the fact that we're going to be introducing more errors, which could be as simple as adding a 1 or 0 to a createVector, but sometimes the spot you have to add it is not the spot where the error takes place, because the vector in question was created elsewhere. That's probably fine for now? But let's be open to ideas for how to improve messaging there (or possibly different convenience methods to convert between dimensions, e.g. myVec2.pad(0).toDim(3), so you could offer in the error that a user add that directly at the spot causing the error.)
Thanks for the great discussion on this @davepagurek. It seems we're in agreement, but I'm glad you mentioned the alternative idea of "partial broadcasting," by padding with an identity element (e.g., 0 for addition). It is really appealing because it feels like it could be a helpful shortcut. It also aligns with some of p5's 1.x behavior, which is an important consideration. So, I thought it was worth a closer look. My analysis validates the standard-broadcasting approach and shows that your idea for improving the error experience is the perfect complement to it.
Motivation
To make sure we're making the best long-term decision, I did a deeper dive into the potential trade-offs of the custom approach versus the universal standard for broadcasting. I'll share my findings here for anyone reading along, and also to serve as a more complete public record of the design rationale.
A deeper look at broadcasting: The case for a standard, error-throwing approach
My analysis suggests that while well-intentioned, custom broadcasting adds features with limited utility and high costs in consistency, writability, extensibility, and cognitive load.
Key costs
Here are the key costs as I currently understand them:
1. The "intuitive" trap (consistency & predictability)
A custom rule to pad [2, 3] with a 0 when adding it to [4, 5, 6] seems helpful, but this appears to be a hypothetical problem—my review of 50 sketches that call add() found this usage zero times. This suggests that if this feature is used, there's a decent chance it's unintentional. In a far more common operation, scalar multiplication, padding with an identity is likely to cause confusion.
Internal inconsistency: In p5.js, createVector(x, y, z).mult(2) correctly broadcasts the scalar to all components (i.e., [2, 2, 2]). A user would logically infer that createVector(x, y, z).mult([2, 2]) should behave the same way—by repeating the last component. Another possibility is that they might associate [2, 2] with [2, 2, 0], as this is the usual way to map two-space into three-space. In both of these cases, padding with the multiplicative identity (1) amounts to incorrectly guessing user intent.
External inconsistency: This custom behavior does not appear to exist in any major math or ML library. This undercuts our goal of being a creative onramp to these tools.
These inconsistencies result in an API that violates the principle of least surprise.
2. Hidden bugs (writability)
Incorrect inferences
Partial broadcasting hides the bug. It silently "fixes" the data by guessing what the user meant, and it's likely to guess wrong, as noted above. A system that allows users to write code that isn't explicit can also lead to strange visual glitches (as in the scale(n, n) issue you mentioned). In contrast, standard broadcasting throws an error when it sees an invalid operation like createVector(2, 3).add(4, 5, 6). This error is a feature! It tells the user, "Your data shapes don't match, you likely have a bug."
Corrective tools
Your idea to provide tools to easily fix the reported errors, like `pad()`, is inspired ✨. This way, we get the best of both worlds: a system that prevents hidden bugs by default, with thoughtful tools to help users fix them. It perfectly aligns with the Friendly Error System. If we all agree on this path, I think the next step would be to implement the standard error-throwing behavior. The good news is that the repair kit you imagined is already being designed! The matrix proposals I've been working on provide a cohesive set of common functions like `pad()`.
3. Brittleness (extensibility)
The identity-padding approach is not a scalable system. It fails for rem(), which has no identity element. It also fails for fundamental elementwise operations like min(), max(), and equals().
This creates a two-tier API where users must memorize a list of which functions have p5.js-specific magic and which ones throw errors. Standard broadcasting provides one simple, universal rule for everything.
4. Cognitive load (readability)
Every custom rule adds to the cognitive load on learners, likely without adding significant power, since the most powerful libraries work without additional, custom rules.
Standard broadcasting: Users learn one rule that is universal, powerful, and transferable outside of p5.js.
Partial broadcasting: Users must learn the standard rule, plus a list of p5.js-specific exceptions and their corresponding identity elements.
Assessing the impact of breaking changes: Zero disruption detected
The primary argument for keeping the 1.x behavior appears to be backward compatibility. My analysis of existing usage suggests the disruption from adopting the standard would be zero, or near-zero. (See the methodology and results in the top post.)
Summary of trade-offs
| Criteria | Standard broadcasting (proposal) | Partial broadcasting by identity (alternative) |
|---|---|---|
| Consistency | One universal rule. Consistent internally and with all external libraries. | Many exceptions. Inconsistent with itself and all external libraries. |
| Writability | Throws helpful errors, revealing bugs to the user. | Silently fails, hiding bugs and causing hard-to-diagnose glitches. |
| Extensibility | A robust system. Works for all current and future elementwise operations. | A brittle patch. Fails for many operations beyond basic arithmetic. |
| Readability | Low cognitive load. Easy to learn, teach, and document. | High cognitive load. Users must memorize a list of p5.js-specific special cases. |
To me, a 1D vector is not a scalar. For example, the scalar 2 is different from the vector [2]. Multiplication of a vector by a scalar a is not done by creating a new vector [a, a, ...] of the same dimension and computing the Hadamard product; it is done by simply multiplying each element of the vector by a. Broadcasting is not applicable.
If we want multiplication of two vectors where one is 1D to be treated as scalar multiplication, we could implement that. But p5.js doesn't do that now. This is an important part of what @GregStanton is proposing. It could be implemented using broadcasting, or more simply by treating the 1D vector as a scalar.
I prefer to think of this as option 1, throw an error when vector dimensions don't match, but with a possible exception.
For compatibility with version 1, we need one more exception: If one vector is 2D and the other is 3D with a z component of 0, convert the 3D vector to a 2D vector by removing the z component so the operation can be performed. This is likely to happen when a user creates a sketch with mostly 2D vectors but uses createVector() with no parameters, which currently creates a 3D vector. (@shiffman does this frequently in his Nature of Code book, so I expect it to be a widespread practice.) I do think it appropriate to generate a warning when this happens (a suggestion by @limzykenneth on issue #8117). This may not be needed if we can do some magic with createVector(); see subissue #8156.
Thanks for inviting me, I'd love to contribute :D
Agreements & Disagreements
- Refuse to operate on incompatible vectors (i.e. throwing an error when this is tried)
- Perform broadcasting where possible and refuse to operate thereafter
- Automatically convert all vectors to the highest common dimension with 0 padding before operating
- Some combination of the above
While I absolutely agree with performing broadcasting where possible and refusing to operate otherwise (especially since I come from a background of stricter languages, e.g. those with static typing), I don't see why we can't have multiple options to suit each group, possibly benefiting both in their individual scenarios when needed.
Suiting Multiple Needs
@davepagurek:
That said, I generally agree that those silent choices have a lot of potential for confusion. e.g. outside of the vector API, drawing a 3D model with `scale(n, n)` accidentally doesn't scale the z axis at all, which can be hard to notice if you don't orbit, and can be hard to trace down the cause for.
@GregStanton:
Partial broadcasting hides the bug. It silently "fixes" the data by guessing what the user meant, and it's likely to guess wrong, as noted above.
I absolutely agree with these, but I think an issue being ignored here is not just that people might want their applications to be automatically redundant (as per JavaScript's philosophy/principle with things like type coercion), but also the fact that many features like adding a vec2 to a vec3 can be done out of a concern for performance rather than as a mistake (creating a larger vector when dealing with millions is a great way to grind any intensive program to a halt), especially if we have any future plans to batch operations like these, which can be a pain to implement manually. Plus, throwing a warning (which I think we should do instead of throwing an error by default, to correspond with JavaScript's philosophy of making whatever assumptions are needed to avoid erroring out) requires checking for mismatches, an operation that can be expensive for what's meant to be an optimized math library built to handle graphics (and hopefully far more with p5's 2.0 release :D).
Generalization
Take for example numpy, the most popular math library and the de facto standard we seem to be basing a lot of things on. Even they have modes for being more permissive with errors via numpy.seterr in lieu of other solutions.
While they don't have options for broadcasting via numpy.seterr, that's largely because broadcasting isn't something that's absent there. Assuming that we don't implement methods for broadcasting that rival it (which we probably shouldn't, considering our position within the JS ecosystem and the manpower cost), it seems the clear choice is not to choose one, but to give users everything needed to cover all use cases:
- "Perform broadcasting where possible and refuse to operate thereafter": Improves debugging, helping people to learn and avoid dumb issues at the cost of an increased performance penalty.
- "Automatically [broadcast] all vectors to the highest common dimension": Horrible debugging, bad for those trying to learn, but increased performance and redundancy.
This relates to what @davepagurek said:
The only thing that gives me a bit of pause is the fact that we're going to be introducing more errors, which could be as simple as adding a 1 or 0 to a `createVector`, but sometimes the spot you have to add it is not the spot where the error takes place, because the vector in question was created elsewhere. That's probably fine for now? But let's be open to ideas for how to improve messaging there (or possibly different convenience methods to convert between dimensions, e.g. `myVec2.pad(0).toDim(3)`, so you could offer in the error that a user add that directly at the spot causing the error.)
Of course, if we were to implement options for broadcasting on the scale of numpy we should solely go with the first, but again, it's unrealistic in our current position.
To be honest, I don't think @davepagurek's suggestion:
My one thing to add for partial broadcasting is that I'd broadcast with whatever leaves the remaining values untouched -- so if you're multiplying or dividing, it'd mean padding with 1; if you're adding or subtracting, it'd mean padding with 0.
would be enough.
There's also the aforementioned idea that Javascript and our community within it is far different from that of math libraries like numpy:
"JS is also an example of the 'worse is better' pattern in software engineering" - Eric Lippert
Even if it's a principle that I personally disagree with (I think Javascript should've used strict types and severely disincentivized type coercion from the very start; as Brendan Eich said, "I regret the implicit conversions that make == not an equivalence relation with disparate types on left and right"), I think there's something to be said for continuing Javascript's principle of redundancy, without numpy-level broadcasting, in our Javascript library, while offering a stricter option as the default if unwanted.
P.S. Why choose one? Choose both.
Thanks for raising these points, @sidwellr! This is a great opportunity to clarify some of the subtle but important details for everyone following along.
To me, a 1D vector is not a scalar... Multiplication of a vector by a scalar `a` is not done by creating a new vector `[a, a, ...]`... Broadcasting is not applicable.
This point is crucial. In scientific computing, there is a clear distinction between a scalar (2) and a 1D vector ([2]). The challenge for p5.js is that our own API often makes it impossible for us to know which one the user intends. Throughout most of the p5.Vector API, users can specify vectors with separate number arguments, an array of numbers, or a vector instance. Consequently, a user may reasonably view vec.mult(2) and vec.mult([2]) as two ways of multiplying by a 1D vector.
Fortunately, this is a classic problem that standard broadcasting solves elegantly. The conceptual model is simple: for any operation between a scalar and a vector, the scalar is "promoted" to a 1D vector, and then the broadcasting rules take over. This ensures that whether the user provides 2 or [2], the result is the same. It turns a potential source of ambiguity into a point of consistency.
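As a minimal sketch of that conceptual model (plain arrays and hypothetical helper logic, not p5.js internals):

```js
// Conceptual model only: a bare number is promoted to a 1D vector,
// then the ordinary broadcasting rules decide the result.
function multBroadcast(components, operand) {
  const other = typeof operand === 'number' ? [operand] : operand; // promote scalar
  if (other.length === 1) {
    return components.map((c) => c * other[0]); // broadcast the 1D operand
  }
  if (other.length === components.length) {
    return components.map((c, i) => c * other[i]); // elementwise product
  }
  throw new Error(
    `Cannot multiply a ${components.length}D vector by a ${other.length}D vector.`
  );
}

multBroadcast([10, 10, 10], 2);   // [20, 20, 20]
multBroadcast([10, 10, 10], [2]); // [20, 20, 20], same result
```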
This is also why the implementation details don't matter to the user. While an efficient broadcasting implementation would never actually create a new [a, a, ...] array, that simple conceptual model makes the rule easy to understand and predict.
Adopting this standard also resolves confusing inconsistencies in the current API. In 2.x, vec.mult(2) and vec.mult([2]) produce different results, which is an undocumented behavior that requires users to learn separate rules, without a significant benefit. This also breaks backward compatibility with 1.x, which produces the same result in each case. Standard broadcasting unifies disparate rules and restores backward compatibility in this case, ensuring that no matter how a user specifies the scalar/1D vector, the outcome is predictable and correct.
Hope this clarifies the thinking behind this part of the proposal! It's a great example of how adopting a well-established standard can help us resolve tricky API ambiguities.
Thank you for sharing such detailed thoughts, @RandomGamingDev, and for your offer to contribute! This is shaping up to be a terrifically productive conversation.
It's great that we're aligned on the core preference for standard broadcasting rules. I'll focus my reply on the new points you've surfaced about performance, error handling, and the idea of offering multiple modes.
...many features like adding a vec2 to a vec3 can be done out of a concern for performance rather than as a mistake (creating a larger vector when dealing with millions is a great way to send any intensive program into a halt)
I'm so glad you're watching out for this. Performance is a critical accessibility concern.
The good news is that this performance concern is completely solved by how standard broadcasting is implemented. Efficient broadcasting implementations never actually create larger, temporary arrays. The idea of "expanding" a vector is just a simple conceptual model to help users predict the results. The underlying code works efficiently on the original data, so there is no performance penalty or memory allocation cost. This means we get the benefit of a simple mental model without any of the performance drawbacks.
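For instance, a scalar multiply can simply read the scalar inside the loop (a sketch of the general technique, not the actual p5.js code):

```js
// The scalar is read directly inside the loop; no temporary
// [2, 2, ...] operand array is ever allocated.
function multScalarInPlace(components, scalar) {
  for (let i = 0; i < components.length; i++) {
    components[i] *= scalar;
  }
  return components;
}

multScalarInPlace([10, 10, 10], 2); // [20, 20, 20]
```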
Plus, there's the fact that throwing a warning (which I think that we should do over throwing an error by default in order to correspond with Javascript's philosophy...) requires checking for mismatches, an operation that can be expensive...
This is a great topic for discussion, and it touches on the core philosophy of the library. Here’s how I see it:
- The p5.js precedent: While JavaScript as a general language is permissive, p5's philosophy is tuned to its precise audience and realized through its Friendly Error System (FES). The goal of FES isn't to avoid errors, but to make them as helpful as possible. Your suggestion of a warning, and @davepagurek's idea of a "repair toolkit" in the error message, are perfect extensions of the FES philosophy. A clear error that tells a user how to fix their code is the most user-friendly approach.
- The JS ecosystem precedent: For this specific domain (scientific computing in JS), a library like TensorFlow.js is a very strong precedent. It throws clear errors on shape mismatches.
- The performance consideration: The performance cost of checking for a mismatch is identical for both errors and warnings. In either case, the code must perform an `if (dimensions_do_not_match)` check. Throwing an error is actually faster because the program stops, whereas a warning would require both the check and the overhead of logging a message while the program continues.
...it seems the clear choice is to not choose one, but to give the users everything needed to cover all use cases... "Why choose one? Choose both."
I appreciate this perspective, but I believe offering multiple modes would introduce more problems than it solves. The numpy.seterr example is interesting, but it's for numeric errors, and even NumPy doesn't offer a custom "permissive" mode for broadcasting shape errors. Offering a custom mode would increase the maintenance burden for p5 developers and the cognitive load for users, in exchange for a set of features with limited uses.
Fortunately, the standard, error-throwing approach is consistent, performant, debuggable, and extensible. It provides a solid foundation that we can build upon with the excellent FES and "repair kit" ideas that have come up in this conversation.
Point #1
...many features like adding a vec2 to a vec3 can be done out of a concern for performance rather than as a mistake (creating a larger vector when dealing with millions is a great way to send any intensive program into a halt)

I'm so glad you're watching out for this. Performance is a critical accessibility concern.

The good news is that this performance concern is completely solved by how standard broadcasting is implemented. Efficient broadcasting implementations never actually create larger, temporary arrays. The idea of "expanding" a vector is just a simple conceptual model to help users predict the results. The underlying code works efficiently on the original data, so there is no performance penalty or memory allocation cost. This means we get the benefit of a simple mental model without any of the performance drawbacks.
The issue is that the quote from me you cited,

adding a vec2 to a vec3

isn't handled by standard automatic broadcasting even if we added a dimension of 1, and it has many different cases for how you'd want it handled (which numpy handles without needing to incur an additional cost):
- Add an expanded/projected vec2 as if it were a vec3, filling empty space with 0 (handled with numpy views):
  a. `(vec2.x + vec3.x, vec2.y + vec3.y, vec3.z)` (handled by `vec3[:2] += vec2`)
  b. `(vec3.x, vec2.x + vec3.y, vec2.y + vec3.z)` (handled by `vec3[1:] += vec2`)
  c. `(vec2.x + vec3.x, vec3.y, vec2.y + vec3.z)` (handled by `vec3[0] += vec2[0]; vec3[2] += vec2[1]`)
- Partially repeat the vec2 (also handled with numpy views, this time with a combination of the above):
  a. `(vec2.x + vec3.x, vec2.y + vec3.y, vec2.x + vec3.z)`
  b. `(vec2.y + vec3.x, vec2.x + vec3.y, vec2.y + vec3.z)`
This means that if users want to work within a limited system that doesn't allow for dimension mismatches, and doesn't have the options for broadcasting/resizing like in numpy, they will have to expand the vector or write their own functions, both of which go against the whole ideal.
These are operations that are incredibly useful, whether we're trying to appeal to p5.js's data-science side (e.g. 2 classifications, each with their own base stats, using repetition to add to each sub-classification to get their respective statistics) or its graphics side (e.g. an inexpensive way to project onto a basic slanted plane).
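For readers unfamiliar with numpy, here's a rough JS analogue of the `vec3[:2] += vec2` idiom above, written against plain component arrays (p5.Vector has no such slicing API today):

```js
// Add the components of `source` into `target`, starting at index `start`.
// This mimics numpy's sliced in-place addition on a plain array.
function addIntoSlice(target, start, source) {
  for (let i = 0; i < source.length; i++) {
    target[start + i] += source[i];
  }
  return target;
}

addIntoSlice([1, 2, 3], 0, [10, 20]); // [11, 22, 3]  ~ vec3[:2] += vec2
addIntoSlice([1, 2, 3], 1, [10, 20]); // [1, 12, 23]  ~ vec3[1:] += vec2
```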
Point #2
Plus, there's the fact that throwing a warning (which I think that we should do over throwing an error by default in order to correspond with Javascript's philosophy...) requires checking for mismatches, an operation that can be expensive...
This is a great topic for discussion, and it touches on the core philosophy of the library. Here’s how I see it:
1. The p5.js precedent: While JavaScript as a general language is permissive, p5's philosophy is tuned to its precise audience and realized through its Friendly Error System (FES). The goal of FES isn't to avoid errors, but to make them as helpful as possible. Your suggestion of a warning, and @davepagurek's idea of a "repair toolkit" in the error message, are perfect extensions of the FES philosophy. A clear error that tells a user how to fix their code is the most user-friendly approach.
2. The JS ecosystem precedent: For this specific domain (scientific computing in JS), a library like TensorFlow.js is a very strong precedent. It throws clear errors on shape mismatches.
3. The performance consideration: The performance cost of checking for a mismatch is identical for both errors and warnings. In either case, the code must perform an `if (dimensions_do_not_match)` check. Throwing an error is actually faster because the program stops, whereas a warning would require both the check and the overhead of logging a message while the program continues.
- Yes, p5.js is tuned to its precise audience, which is why it follows JavaScript's standard. It uses type coercion, doesn't use strict types, and in general applies the same principle of trying to run instead of throwing an error. For instance, using a completely invalid type or the wrong number of parameters often just throws a warning when in other languages (and, I personally think, here too) it should throw an error.
- Sure, it's a very strong precedent, and I'd love to follow it completely. The issue is that the math library we're currently implementing, and that you're encouraging, doesn't follow many of its key components, meaning that we have to compensate somehow.
- Saying that completely stopping a program is better performance than throwing a warning is like saying that crashing a game is better performance than showing a warning panel, or that stopping a car drives you farther than running it a bit slower. That simply isn't how performance works. Also, throwing warnings instead of errors was never a matter of performance. The modes were the matter of performance, and in the more performant mode discussed, the warning wouldn't be thrown in the first place.
Point #3
...it seems the clear choice is to not choose one, but to give the users everything needed to cover all use cases... "Why choose one? Choose both."
I appreciate this perspective, but I believe offering multiple modes would introduce more problems than it solves. The `numpy.seterr` example is interesting, but it's for numeric errors, and even NumPy doesn't offer a custom "permissive" mode for broadcasting shape errors. Offering a custom mode would increase the maintenance burden for p5 developers and the cognitive load for users, in exchange for a set of features with limited uses.

Fortunately, the standard, error-throwing approach is consistent, performant, debuggable, and extensible. It provides a solid foundation that we can build upon with the excellent FES and "repair kit" ideas that have come up in this conversation.
3.1
even NumPy doesn't offer a custom "permissive" mode for broadcasting shape errors.
Yes, I explicitly referenced why here:
While they don't have options for broadcasting via `numpy.seterr`, that's largely because broadcasting isn't something that's absent there. Assuming that we don't implement methods for broadcasting that rival it (which we probably shouldn't, considering our position within the JS ecosystem and the manpower cost), it seems the clear choice is not to choose one, but to give users everything needed to cover all use cases:
3.2
Offering a custom mode would increase the maintenance burden for p5 developers and the cognitive load for users, in exchange for a set of features with limited uses.
This is just a matter of disabling checks and writing code that can handle different shapes whether by halting early/skipping, repeating, etc. Handling how things fail is common, and should've been standard long ago considering how inexpensive it is in terms of manpower.
Thanks for your clarifications @RandomGamingDev.
You raise a great point regarding slicing. These sorts of features are planned as part of the matrix proposals I'm working on. An important goal is a unified interface across math classes, so this would likely extend to vectors as well. Since that would be a separate, new feature, I think we could discuss it in a separate issue.
Regarding performance, it may be helpful to clarify that standard broadcasting would not throw errors or warnings in cases where the rules apply. This is consistent with how p5 already behaves. So other approaches wouldn't be more performant on the basis of not throwing errors or warnings. In general, I'm not sure how the alternative mode you described before ("Automatically [broadcast] all vectors to the highest common dimension") could be more performant. It only adds new behavior in cases that are unsupported by standard broadcasting rules. Also, it appears necessary to check dimensions whether we support standard broadcasting or padding by the identity.
I look forward to hearing what others have to say.
~~FWIW I'm of the opinion that $n$-dimensional vectors and matrices are better left to addon libraries. p5.js users and maintainers would benefit from sticking to a lean, performant core with friendly APIs that naturally extend to anything beyond vec4 and mat4.~~
Edit: having caught up with #7754, it turns out that most of the pieces are in place to introduce a didactic p5.Matrix class that supports $m \times n$ matrices with reasonable performance. Cool!
@GregStanton has highlighted some great opportunities for us to improve p5.Vector. Big +1 for Option 2:
Based on the analysis above, I propose that p5.js adopt the standard, widely-used broadcasting rules:
- Operations are allowed if vector dimensions match.
- Operations are allowed if one operand is a scalar (a 1D vector or a single number).
- All other dimension mismatches will throw an error.
Personally, it'd help me to better understand this and related proposals by seeing the API in context. Something like the following example sketches would help to span the space of beginner/intermediate projects:
- Simulations from NOC
- 2D game engine from scratch
- Raytracing from scratch
- Neural network from scratch
- Image compression with PCA
- FFT from scratch
Thanks @nickmcintyre for the helpful list! I also have a list of creative use cases for matrices, in an internal set of proposals that I haven't published yet. Actually, one of the reasons to postpone publishing is that I'm still working on demos to illustrate some of those use cases. (I'm also holding off on it because my plate is full with the vector issues listed in #8149 at the moment.)
My current plan is to include links to the starter demos from the published proposals. Currently, I expect to cover cellular automata and some neural network concepts. I already have a working demo for the new Transform class that illustrates some of its functionality. We may develop other demos as a community, such as the ones you mentioned, to further flesh out specific features.
How does that plan sound to you?
- Partial broadcasting adds a lot of extra rules and special cases, which can make things confusing and harder to debug.
- On the other hand, standard broadcasting is simple, consistent, and easy to understand. It makes sure that when we work with vectors of different sizes, the results are predictable and make sense — especially for creative coding and sketching.
- By using standard broadcasting, we keep things clear and easy to use without adding extra work or forcing users to write their own helper functions.
Aligning with standard broadcasting is the best choice here.
I think we have consensus to use standard broadcasting when performing an operation on a vector with a 1D vector. So, for example, createVector(10,10,10).mult(createVector(2)) would return [20,20,20]. Similarly for other operations. It doesn't do this today, but there are no version 1 compatibility issues since it doesn't have 1D vectors.
The original proposal also includes broadcasting when using a scalar instead of a 1D vector, so createVector(10,10,10).mult(2) would also return [20,20,20]. Similarly for other operations. Currently, only mult() and div() do this. Importantly, add() and sub() treat a single scalar parameter as the "x component of the vector to add/subtract", which is the documented version 1 behavior. This would be a behavior change from version 1, making them work like mult() and arguably improving consistency both internally and with other libraries. This is a subtle change and I want to make sure nobody is surprised by it. It can potentially break version 1 sketches, but I think most users would use code like v.x += 2 rather than the less obvious v.add(2) to add a value to only the first element of a vector.
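To spell out that subtle change, a minimal before/after sketch (the second result is the proposed behavior, not what currently ships):

```js
let v = createVector(10, 10, 10);
v.add(2);
// documented 1.x behavior: [12, 10, 10]  (2 treated as the x component)
// proposed 2.x behavior:   [12, 12, 12]  (2 broadcast to every component)
```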
I think there is also consensus to not do partial broadcasting, which can be ambiguous and confusing. Rather, an operation should throw an error if the vectors are incompatible. There will be a separate effort to define new operations to extend or truncate vectors to make them compatible (which the FES can refer users to once they are implemented).
As described, this will currently break version 1 sketches that are 2D and use createVector() with no parameters (which is quite common) since that currently creates a 3D vector [0,0,0]. The desired behavior is being discussed at issue #8156, and one possibility is for createVector() with no parameters to create a size-deferred vector that takes on a permanent shape as operations are performed with it. That would avoid breaking version 1 sketches, but is tricky to implement and may cause confusion in the future. The current thinking seems to be to deprecate the no-argument usage of createVector() for the future, and for now have it create a 3D vector but give a warning. This would still break version 1 sketches that use it, but at least users would get an idea of how to fix the problem. (It currently can produce vectors with NaN which cause strange behavior but no messages to point to the cause; see #8117.)
I made a suggestion earlier which has been overlooked since the discussion has focused on broadcasting, but I still think it is worthwhile so will repeat it. To avoid breaking 2D sketches that use createVector() with no parameters, we need a simple exception to the dimension mismatch rule: if one vector is 2D and the other is 3D with a z component of 0, convert the 3D vector to a 2D vector by removing the z component so the operation can be performed. Problematic sketches would generate warnings to add arguments to the createVector() calls, but they would still run correctly so users aren't forced to deal with the issue right away.
Thoughts? Are we getting close here?
Thanks, @sidwellr! This is a super helpful summary of the remaining issues. You've raised two distinct points:
- Broadcasting policy: Aligning p5.js 2.x with the standard broadcasting model.
- No-argument usage of `createVector()`: A special exception for 2D/3D vectors to help with legacy `createVector()` usage.
On Point 1 (Broadcasting), I'm glad we're in agreement! This is the subject of the current issue, and adding you to the list of supporters strengthens the consensus even further. Thank you so much for sharing your thoughts.
On Point 2 (createVector()), I really appreciate you thinking of ways to reduce friction! This idea is a specific fix for the createVector() no-argument issue. The PR dedicated to that issue (#8203) is a great place to have that discussion.
My concern with the specific exception you proposed is that it would require adding a new, special-case check to every vector operation, which could harm performance and would add internal complexity. It also complicates the vector model for users by adding a significant side-effect in a confusing special case. For example, users may wonder, why not extend the 2D vector instead of truncating the 3D vector? Why doesn't it apply to 3D and 4D vectors? Subtle explanations based on backward compatibility are not suitable for user-friendly documentation. This adds significant complexity for users across the API, and makes it harder for users to understand their code, in order to allow a deprecated edge case in a single function.
I've posted a detailed analysis of an alternative (README + FES message) over on #8203. I'd love to continue this discussion there! That way we can keep the discussion here focused on broadcasting as a general policy.
The title of this issue seems more general than just broadcasting. In particular, it appears we are adding a new behavior: throwing an error when the dimensions don't match. If the #8156 discussion for createVector() with no parameters resulted in creating a 3D vector [0,0,0] (with a warning), the dimension mismatch error would break a lot of version 1 sketches that inadvertently combined 2D and 3D vectors. My suggestion addressed that, and discussion for it is appropriate here since it is part of the "standard policy for vector dimension mismatches" and would be implemented along with the dimension compatibility checks. But since the PR for that issue, #8203, seems to now call for making createVector() with no parameters use some special magic to determine the intended vector size, no special action for combining 2D and 3D vectors is needed. So we can just ignore that suggestion unless there are issues implementing the magic.
Thanks @sidwellr. I appreciate you digging into all of this! Your comment is really helpful because it allows me to clarify the way I'm using the term "broadcasting."
I've been using it as a shorthand for the standard policy for handling dimension mismatches popularized by NumPy. In this usage, broadcasting has two parts:
- Allowing simple, useful mismatches (one of the operands is a scalar / 1D vector).
- Throwing FES errors for all other mismatches.
Regarding scope, I acknowledge that the createVector() bug is tricky because createVector() itself is independent of broadcasting, but some of the solutions that have been put forth in #8203 do affect broadcasting. Fortunately, there is a non-magic solution that's completely independent of broadcasting. I've been working on it, and hopefully, we'll be able to sort that out independently of the current discussion.
Thanks again for all of your comments!
Hi — I'm Shubham Kahar and I support this proposal.
It improves consistency and clarity for vector users, and I think it will simplify both API usage and documentation.
Happy to help test or document once the final decision is made. 🙂
Hi everyone, thanks so much for all the lively discussion of the p5.js 2.x Vector implementation! Now that 2.1 is released, we wanted to set up a more direct discussion space for p5.js 2.x Vector implementation bugfixes, documentation, and improvements. So, here is a Discord channel: https://discord.gg/gH3VcRKhen
As we discuss/unblock each of the vector issues, I will also follow up on those issues as a comment. So if you prefer to participate only (or primarily) on GitHub, that still also works!
Hey everyone, here is a summary of the current understanding of broadcasting and createVector() (as I understand it).
Broadcasting Policy (Option 2):
- Scalars (e.g. 2) and 1D vectors like [2] will be broadcast to match the dimension of the target vector.
- So vec.mult(2) and vec.mult([2]) will behave the same.
- If dimensions don’t match (e.g. [2, 3] with a 3D vector), an error will be thrown.
- This keeps the API consistent, predictable, and aligned with major math/ML libraries.
Scalar vs [2] Confusion:
- Mathematically, 2 ≠ [2], but for API/UX clarity, they are treated the same.
- This avoids surprises and introduces a consistent UX pattern across operations — though it differs from p5.js 1.x, where scalar broadcasting was limited to multiplication and division.
createVector() with no arguments:
- Still returns [0, 0, 0] (3D vector).
- As of version 2.1, it always shows a Friendly Error System (FES) warning, regardless of context — to help users fix sketches that may break due to the default 3D vector.
- This avoids breaking old sketches while guiding users toward better practices.
Summary:
- Option 2 is the current direction.
- No silent partial broadcasting (like padding with 0 or 1).
- No special-case logic for 2D/3D mixing — just clear rules and helpful warnings.
Since the vector-discussion channel on Discord is currently focused on this issue, I’m posting this summary here to help everyone quickly understand where things stand so far.
If I’ve misunderstood anything above, please feel free to correct me — I’d really appreciate it!
@Ayaan005-sudo Thanks, but I want to clear things up: this is not the verdict or what is fully agreed in terms of what we do. That's what the call being arranged is about, so I don't want to give the impression that this is the agreed approach we are taking.
@Ayaan005-sudo Some minor corrections:
Scalar vs [2] Confusion: Version 1.x does not support 1D vectors like [2], so this is new to version 2. Far from "restoring consistency", scalar operations do not broadcast in any version except for multiplication and division, and the proposal to make them do so is a breaking change. For example, in every version of p5.js so far, createVector(1,2,3).add(4) will return [5,2,3]; it is proposed to change that to return [5,6,7].
createVector() with no arguments: As of version 2.1, this will always show a FES warning, not just if used in a 2D context. It does not avoid breaking old sketches, but will provide helpful direction so users can more easily fix sketches that break when moved to version 2.
As limzykenneth said, this is not a "final Verdict" as you said in your first sentence, but is a helpful summary for further discussion as you said in your penultimate sentence.
Hi all — I'm Shubham Kahar and I support adopting the standard broadcasting policy (allow scalar / 1D broadcasting, throw on other mismatches).
This keeps p5.Vector consistent with other math/ML libs and prevents silent bugs. I also strongly recommend pairing this with clear Friendly Error System (FES) messages and migration docs/examples so users can quickly fix broken sketches.
Happy to help write tests, migration examples, or user-facing docs once a final decision is made.
Thanks so much @limzykenneth and @sidwellr for the thoughtful clarifications — really appreciate it!
I’ve updated the summary based on your feedback. If I’ve still misunderstood anything or missed a detail, please feel free to correct me again — that would be super helpful.
Thanks again for helping me refine this!
Following up on my earlier comment — I want to add my full support for Option 2 (standard broadcasting + errors for incompatible dimensions).
After reading the continued discussion from @limzykenneth, @sidwellr, and others, I believe this approach provides the clearest and most consistent rule set for users. It avoids silent bugs, matches modern math/ML libraries, and keeps vector behavior predictable across all dimensions.
I agree with the direction of: • treating scalars and 1D vectors with standard broadcasting, • throwing clear FES errors for all other mismatches, • avoiding partial/identity-based padding rules.
This direction will make the API easier to teach, easier to document, and safer for beginners. Happy to help test or review once decisions are finalized.
Hi everyone! Here's an update from our discussion on Discord, and thanks for everyone who joined. You can always come by the channel #vector-discussion.
The longer notes from our call, with the main takeaway for dimension-mismatch behavior, are as follows:
- Deprecate/discourage all operations on mismatched vector lengths, but without hard stops, because this is very uncommon in p5.js. The main use case is updating a sketch from 2D to 3D, and there's a consensus that this should take a little bit of work, but it shouldn't completely disrupt.
- In addition to an informative FES warning (non-blocking), there should be a very explainable, consistent behavior for what happens when a user operates on vectors with a size mismatch. For mathematical operations, we take the dimension of one of the vectors and make the other vector "match" it. This logic can use the length of the longer vector, the shorter vector, or the caller. See the complete notes here and feel free to comment.
- Main next step is to add these different prioritizations as proofs of concept to the table for consideration. Thanks to @limzykenneth for the "updated https://p5js-vector.limzykenneth.com/ with more data to compare and something to filter it with":
At this point I think we can compare the smaller vector priority approach with the larger vector priority approach and see if there is anything unexpected. If we feel that, as discussed in the call, smaller vector priority is preferred, I can move forward with it and see if there are any other considerations.
Thanks for your continued interest and input everyone; the table still serves to demonstrate possible outcomes, so it's not decided. The Discord channel is the quickest way to offer your thoughts, but I will check both here and the notes doc, too.