In generated field names, an underscore is inserted after a number

Open · YodaEmbedding opened this issue 11 months ago · 1 comment

Summary

Underscore is added after a number in the field name, e.g. c2s becomes c2_s.

Reproduction Steps

syntax = "proto3";

message Example {
  string c2s = 1;
}

protoc --python_betterproto_out=lib example.proto

Expected Results

Expecting c2s.

# Generated by the protocol buffer compiler.  DO NOT EDIT!
# sources: example.proto
# plugin: python-betterproto
# This file has been @generated

from dataclasses import dataclass

import betterproto


@dataclass(eq=False, repr=False)
class Example(betterproto.Message):
    c2s: str = betterproto.string_field(1)

Actual Results

Generates c2_s instead.

# Generated by the protocol buffer compiler.  DO NOT EDIT!
# sources: example.proto
# plugin: python-betterproto
# This file has been @generated

from dataclasses import dataclass

import betterproto


@dataclass(eq=False, repr=False)
class Example(betterproto.Message):
    c2_s: str = betterproto.string_field(1)

System Information

libprotoc 27.1
Python 3.8.20
Name: betterproto
Version: 2.0.0b7
Location: /Users/.../demo/.venv/lib/python3.8/site-packages
Requires: grpclib, python-dateutil, typing-extensions
Required-by: demo

Checklist

  • [x] I have searched the issues for duplicates.
  • [x] I have shown the entire traceback, if possible.
  • [x] I have verified this issue occurs on the latest prerelease of betterproto, which can be installed using pip install -U --pre betterproto, if possible.

YodaEmbedding, Jan 27 '25 16:01

Proposed fix:

diff --git a/src/betterproto/casing.py b/src/betterproto/casing.py
index f7d0832..d09c708 100644
--- a/src/betterproto/casing.py
+++ b/src/betterproto/casing.py
@@ -8,11 +8,11 @@ SYMBOLS = "[^a-zA-Z0-9]*"
 
 # Optionally capitalized word.
 # language=PythonRegExp
-WORD = "[A-Z]*[a-z]*[0-9]*"
+WORD = "[A-Z]*([a-z]|[0-9][A-Z]*)*"
 
 # Uppercase word, not followed by lowercase letters.
 # language=PythonRegExp
-WORD_UPPER = "[A-Z]+(?![a-z])[0-9]*"
+WORD_UPPER = "([A-Z]+(?![a-z])[0-9]*)+"
 
 
 def safe_snake_case(value: str) -> str:
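
To see why the WORD change removes the stray underscore, here is a minimal sketch that tokenizes names with the old and new WORD patterns from the diff. The helper `split_words` is hypothetical and not part of betterproto; it only lists the non-empty regex matches, on the assumption that each match ends up as one underscore-separated segment.

```python
import re

# Patterns copied from the diff above.
OLD_WORD = "[A-Z]*[a-z]*[0-9]*"
NEW_WORD = "[A-Z]*([a-z]|[0-9][A-Z]*)*"


def split_words(value: str, word_pattern: str) -> list:
    """Hypothetical helper: return the non-empty chunks matched by word_pattern."""
    return [m.group(0) for m in re.finditer(word_pattern, value) if m.group(0)]


# The old pattern ends a word right after a run of digits.
print(split_words("c2s", OLD_WORD))  # ['c2', 's']

# The new pattern lets letters continue after a digit.
print(split_words("c2s", NEW_WORD))  # ['c2s']

# A camelCase boundary without digits is still split by both patterns.
print(split_words("clientToServer", OLD_WORD))  # ['client', 'To', 'Server']
print(split_words("clientToServer", NEW_WORD))  # ['client', 'To', 'Server']
```

With the old pattern a word can only end in digits, so any letter after a digit starts a new word. The new pattern treats a digit (plus any uppercase letters after it) as part of the current word.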

Tests:

>>> betterproto.casing.snake_case("C2S2C c2s2c")
'c2s2c_c2s2c'

>>> betterproto.casing.snake_case("Client2Server CLIENT2SERVER client2Server")
'client2server_client2server_client2server'

>>> betterproto.casing.snake_case("clientToServer")
'client_to_server'

Much nicer. :)
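
The test results above can also be approximated without installing a patched library. The sketch below assumes the patterns are combined as `(^)?(SYMBOLS)(WORD_UPPER|WORD)` and that words are joined with a single underscore. That composition and the `snake_case_sketch` helper are assumptions for illustration, not the actual betterproto code.

```python
import re

# SYMBOLS and the four word patterns are copied from the diff above.
SYMBOLS = "[^a-zA-Z0-9]*"
OLD = {"word": "[A-Z]*[a-z]*[0-9]*", "upper": "[A-Z]+(?![a-z])[0-9]*"}
NEW = {"word": "[A-Z]*([a-z]|[0-9][A-Z]*)*", "upper": "([A-Z]+(?![a-z])[0-9]*)+"}


def snake_case_sketch(value: str, patterns: dict) -> str:
    """Hypothetical stand-in for snake_case: lowercase each word, join with '_'."""
    # Assumed composition: optional start anchor, delimiters, then one word.
    pattern = f"(^)?({SYMBOLS})({patterns['upper']}|{patterns['word']})"

    def substitute(match):
        word = match.group(3)
        if not word:
            return ""  # Drop empty matches and trailing delimiters.
        is_start = match.group(1) is not None
        return ("" if is_start else "_") + word.lower()

    return re.sub(pattern, substitute, value)


print(snake_case_sketch("c2s", OLD))  # c2_s
print(snake_case_sketch("c2s", NEW))  # c2s
print(snake_case_sketch("C2S2C c2s2c", NEW))  # c2s2c_c2s2c
print(snake_case_sketch("clientToServer", NEW))  # client_to_server
```

The same function is run with both pattern sets so that the only difference between the two outputs is the regex change itself.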

YodaEmbedding, Mar 16 '25 21:03