wger icon indicating copy to clipboard operation
wger copied to clipboard

Insufficient Input Validation and Sanitization

Open Kcodess2807 opened this issue 4 months ago • 4 comments

Description: While the codebase has some HTML sanitization (using bleach), there's minimal custom validation beyond Django's built-in validators. The validators.py file only contains one simple language validator, and many models lack comprehensive field validation.

Impact: This creates potential security vulnerabilities and data quality issues. Users could input malformed data that breaks calculations (like negative weights or impossible dates), or potentially exploit input fields.

Value: Comprehensive input validation would prevent 80% of data quality issues and significantly reduce security risks from malicious input.

Kcodess2807 avatar Oct 05 '25 17:10 Kcodess2807

if it sounds good to you, please assign it to me :)

Kcodess2807 avatar Oct 05 '25 17:10 Kcodess2807

what would you suggest?

rolandgeider avatar Oct 06 '25 22:10 rolandgeider

I researched the repo and found these points:

1. Expand Core Validators

  • Add validators for positive weights, reasonable repetition counts, and valid date ranges
  • Create exercise-specific validators for names and descriptions
  • Implement cross-field validation for workout data (e.g., total volume checks)
  • Add file upload validators for size, format, and content validation

2. Implement Multi-Layer Validation Strategy

  • Model-level validation using clean() methods for business logic
  • Form-level validation for user interface inputs
  • API serializer validation for REST endpoints
  • Database constraint validation as the final safety net

3. Enhance Security-Focused Sanitization

  • Extend the existing bleach configuration for different content types
  • Add specific sanitizers for exercise descriptions, user profiles, and comments
  • Implement request size limits and content type validation
  • Create input sanitization middleware for consistent processing

4. Add Domain-Specific Validation Rules

  • Fitness data validation (weight ranges, rep counts, duration limits)
  • Nutrition data validation (calorie ranges, portion sizes, ingredient names)
  • User data validation (age ranges, measurement units, profile completeness)
  • Date validation for workout schedules and planning

5. Create Configurable Validation Settings

  • Make validation limits configurable through Django settings
  • Allow different validation rules for different user types (beginner vs advanced)
  • Enable/disable specific validators based on deployment needs
  • Support internationalization for different measurement systems

Kcodess2807 avatar Oct 07 '25 04:10 Kcodess2807

we already have some of these things, e.g. the weight being a positive number (wger.manager.models.weight_config.WeightConfig) or the upload checks for images (wger.utils.images.validate_image_static_no_animation). As for the bleach html sanitation, there's an issue about moving the description to markdown, and there's no other places where we use html so I think we're fine there.

I would also not make any of the validators (either the ones we have or any new ones we add) configurable, I think this is too complex for little gain.

A problem with adding validators to the workout data is that the "weight" or "repetition" entries can be something else, depending on the unit selected. For example you can set for the weight "kg" or "lb", but also "km/h" or "body weight" and for the repetitions "minutes" or "km". These units are stored in another table and while we could add validators for those we know, users that have their own instances might have added new ones or deleted existing ones, so we can't really know for sure. In the end, it's the responsibility of the user to enter something sensible here.

This being said, there are probably some places were adding some basic validation to the models like min-max values, only positive values, etc. definitely makes sense!

rolandgeider avatar Oct 07 '25 11:10 rolandgeider