[Enhancement] Constant support

Open SichangHe opened this issue 2 years ago • 4 comments

Currently, lazy-regex supports string literals only.

Is there any way that constants can be (maybe partially) supported?

As I used lazy-regex more heavily, my regexes grew more complex, so I now use const_format to modularize them. But that also means I can no longer use lazy-regex.

SichangHe avatar Sep 03 '23 00:09 SichangHe

This is an interesting question.

To my knowledge, there's still no way for a procedural macro to receive a const expression already evaluated.

So it looks like lazy-regex would have to recognize the const_format call and evaluate it itself (by calling const_format).

At first sight, it looks like it could be possible, but with a lot of ugly code very specific to const_format, and not covering all cases.

In particular, it would certainly not be possible to use externally defined const values or const functions inside the const_format!(...) expression. This probably removes most of the value of const_format.
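To illustrate the limitation, here is a minimal sketch using a declarative macro instead of a procedural one; the principle is the same, since both kinds of macro receive tokens rather than evaluated values. The names `PATTERN` and `show_tokens!` are made up for this example and are not part of lazy-regex or const_format.

```rust
// Hypothetical constant, standing in for a regex fragment defined elsewhere.
const PATTERN: &str = "(?:as-[a-z]+)";

// A macro only sees the tokens it is handed.
// `stringify!` turns those tokens into a string without evaluating them.
macro_rules! show_tokens {
    ($e:expr) => {
        stringify!($e)
    };
}

fn main() {
    // A string literal reaches the macro with its contents visible.
    println!("literal tokens:  {}", show_tokens!("(?:as-[a-z]+)"));
    // A constant reaches the macro as a bare identifier, not as its value.
    println!("constant tokens: {}", show_tokens!(PATTERN));
    // The value only exists once the compiler evaluates the constant later on.
    println!("constant value:  {}", PATTERN);
}
```

With a literal, the macro can inspect the pattern text. With a constant, it only gets the name, which is why compile-time checking of the regex is not possible from inside the macro.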

If somebody researches this and finds a satisfying solution, please tell me.

Canop avatar Sep 03 '23 06:09 Canop

To my knowledge, there's still no way for a procedural macro to receive a const expression already evaluated.

That was my guess as to why lazy-regex only supports literals. If so, there is no known way to do it.

Based on the cargo expand result I get, const_format generates a const closure and calls it, so it does not produce any literals.

SichangHe avatar Sep 03 '23 06:09 SichangHe

Based on the cargo expand result I get, const_format generates a const closure and calls it, so it does not produce any literals.

I may have missed recent advances in Rust, but otherwise this looks like the only solution: a procedural macro has no way to look outside its own arguments, so it can't fetch functions or values defined elsewhere. That means const_format can't really evaluate its arguments itself.

Using such a trick in lazy-regex doesn't look interesting: it wouldn't check that the regex compiles, wouldn't count the groups, etc.

Disclaimer: I didn't even look at const_format's internals, so maybe they found a way.

Canop avatar Sep 03 '23 06:09 Canop

Yes, exactly.

As the simplest example from my codebase:

pub const AS_SET_BASE: &str = formatcp!("(?:as-{}|{})", OBJECT_NAME, PEERAS);
is expanded to 😱
pub const AS_SET_BASE: &str = ::const_format::pmr::__AssertStr {
    x: {
        use ::const_format::__cf_osRcTFl4A;
        ({
            #[doc(hidden)]
            #[allow(unused_mut, non_snake_case)]
            const CONCATP_NHPMWYD3NJA: &[__cf_osRcTFl4A::pmr::PArgument] = {
                let mut len = 0usize;
                let const_fmt_local_0 = OBJECT_NAME;
                let const_fmt_local_1 = PEERAS;
                &[
                    __cf_osRcTFl4A::pmr::PConvWrapper("(?:as-")
                        .to_pargument_display(
                            __cf_osRcTFl4A::pmr::FormattingFlags::NEW,
                        ),
                    __cf_osRcTFl4A::pmr::PConvWrapper(const_fmt_local_0)
                        .to_pargument_display(
                            __cf_osRcTFl4A::pmr::FormattingFlags::__REG,
                        ),
                    __cf_osRcTFl4A::pmr::PConvWrapper("|")
                        .to_pargument_display(
                            __cf_osRcTFl4A::pmr::FormattingFlags::NEW,
                        ),
                    __cf_osRcTFl4A::pmr::PConvWrapper(const_fmt_local_1)
                        .to_pargument_display(
                            __cf_osRcTFl4A::pmr::FormattingFlags::__REG,
                        ),
                    __cf_osRcTFl4A::pmr::PConvWrapper(")")
                        .to_pargument_display(
                            __cf_osRcTFl4A::pmr::FormattingFlags::NEW,
                        ),
                ]
            };
            {
                #[doc(hidden)]
                const ARR_LEN: usize = ::const_format::pmr::PArgument::calc_len(
                    CONCATP_NHPMWYD3NJA,
                );
                #[doc(hidden)]
                const CONCAT_ARR: &::const_format::pmr::LenAndArray<[u8; ARR_LEN]> = {
                    use ::const_format::{__write_pvariant, pmr::PVariant};
                    let mut out = ::const_format::pmr::LenAndArray {
                        len: 0,
                        array: [0u8; ARR_LEN],
                    };
                    let input = CONCATP_NHPMWYD3NJA;
                    {
                        let ::const_format::pmr::Range { start: mut outer_i, end } = 0..input
                            .len();
                        while outer_i < end {
                            {
                                let current = &input[outer_i];
                                match current.elem {
                                    PVariant::Str(s) => {
                                        let str = s.as_bytes();
                                        let is_display = current.fmt.is_display();
                                        let mut i = 0;
                                        if is_display {
                                            while i < str.len() {
                                                out.array[out.len] = str[i];
                                                out.len += 1;
                                                i += 1;
                                            }
                                        } else {
                                            out.array[out.len] = b'"';
                                            out.len += 1;
                                            while i < str.len() {
                                                use ::const_format::pmr::{
                                                    hex_as_ascii, ForEscaping, FOR_ESCAPING,
                                                };
                                                let c = str[i];
                                                let mut written_c = c;
                                                if c < 128 {
                                                    let shifted = 1 << c;
                                                    if (FOR_ESCAPING.is_escaped & shifted) != 0 {
                                                        out.array[out.len] = b'\\';
                                                        out.len += 1;
                                                        if (FOR_ESCAPING.is_backslash_escaped & shifted) == 0 {
                                                            out.array[out.len] = b'x';
                                                            out
                                                                .array[out.len
                                                                + 1] = hex_as_ascii(
                                                                c >> 4,
                                                                ::const_format::pmr::HexFormatting::Upper,
                                                            );
                                                            out.len += 2;
                                                            written_c = hex_as_ascii(
                                                                c & 0b1111,
                                                                ::const_format::pmr::HexFormatting::Upper,
                                                            );
                                                        } else {
                                                            written_c = ForEscaping::get_backslash_escape(c);
                                                        };
                                                    }
                                                }
                                                out.array[out.len] = written_c;
                                                out.len += 1;
                                                i += 1;
                                            }
                                            out.array[out.len] = b'"';
                                            out.len += 1;
                                        }
                                    }
                                    PVariant::Int(int) => {
                                        let wrapper = ::const_format::pmr::PWrapper(int);
                                        let debug_display;
                                        let bin;
                                        let hex;
                                        let sa: &::const_format::pmr::StartAndArray<[_]> = match current
                                            .fmt
                                        {
                                            ::const_format::pmr::Formatting::Display => {
                                                debug_display = wrapper.to_start_array_display();
                                                &debug_display
                                            }
                                            ::const_format::pmr::Formatting::Debug => {
                                                match current.fmt_flags.num_fmt() {
                                                    ::const_format::pmr::NumberFormatting::Decimal => {
                                                        debug_display = wrapper.to_start_array_debug();
                                                        &debug_display
                                                    }
                                                    ::const_format::pmr::NumberFormatting::Binary => {
                                                        bin = wrapper.to_start_array_binary(current.fmt_flags);
                                                        &bin
                                                    }
                                                    ::const_format::pmr::NumberFormatting::Hexadecimal => {
                                                        hex = wrapper.to_start_array_hexadecimal(current.fmt_flags);
                                                        &hex
                                                    }
                                                }
                                            }
                                        };
                                        let mut start = sa.start;
                                        while start < sa.array.len() {
                                            out.array[out.len] = sa.array[start];
                                            out.len += 1;
                                            start += 1;
                                        }
                                    }
                                    PVariant::Char(c) => {
                                        let encoded = c.encoded();
                                        let len = c.len();
                                        let mut start = 0;
                                        while start < len {
                                            out.array[out.len] = encoded[start];
                                            out.len += 1;
                                            start += 1;
                                        }
                                    }
                                }
                            }
                            outer_i += 1;
                        }
                    }
                    &{ out }
                };
                #[doc(hidden)]
                #[allow(clippy::transmute_ptr_to_ptr)]
                const CONCAT_STR: &str = unsafe {
                    let slice = ::const_format::pmr::transmute::<
                        &[u8; ARR_LEN],
                        &[u8; CONCAT_ARR.len],
                    >(&CONCAT_ARR.array);
                    {
                        let bytes: &'static [::const_format::pmr::u8] = slice;
                        let string: &'static ::const_format::pmr::str = {
                            ::const_format::__hidden_utils::PtrToRef {
                                ptr: bytes as *const [::const_format::pmr::u8] as *const str,
                            }
                                .reff
                        };
                        string
                    }
                };
                CONCAT_STR
            }
        })
    },
}
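
Stripped of the formatting machinery, the expansion above boils down to: compute the total length in a const context, copy the bytes of each piece into a fixed-size array, then reinterpret the array as a `&str`. The following is a simplified, std-only sketch of that idea, not const_format's actual code. The values of `OBJECT_NAME` and `PEERAS` are placeholders, `total_len` and `concat` are hypothetical helpers, and it assumes a toolchain where `std::str::from_utf8` can be called in const contexts.

```rust
// Placeholder fragments; the real values are not shown in this thread.
const OBJECT_NAME: &str = "[a-z]+";
const PEERAS: &str = "peeras";

// The pieces that formatcp!("(?:as-{}|{})", OBJECT_NAME, PEERAS) would join.
const PARTS: &[&str] = &["(?:as-", OBJECT_NAME, "|", PEERAS, ")"];

// Sum the byte lengths of all parts at compile time.
const fn total_len(parts: &[&str]) -> usize {
    let mut len = 0;
    let mut i = 0;
    while i < parts.len() {
        len += parts[i].len();
        i += 1;
    }
    len
}

// Copy every byte of every part into a fixed-size array.
const fn concat<const N: usize>(parts: &[&str]) -> [u8; N] {
    let mut out = [0u8; N];
    let mut pos = 0;
    let mut i = 0;
    while i < parts.len() {
        let bytes = parts[i].as_bytes();
        let mut j = 0;
        while j < bytes.len() {
            out[pos] = bytes[j];
            pos += 1;
            j += 1;
        }
        i += 1;
    }
    out
}

const LEN: usize = total_len(PARTS);
const BYTES: [u8; LEN] = concat::<LEN>(PARTS);

// Safe conversion back to &str; the expansion above uses a transmute instead.
pub const AS_SET_BASE: &str = match std::str::from_utf8(&BYTES) {
    Ok(s) => s,
    Err(_) => panic!("concatenation produced invalid UTF-8"),
};

fn main() {
    println!("{}", AS_SET_BASE);
}
```

The result is a genuine compile-time constant, but it is produced by the compiler's const evaluation, which runs after macro expansion. A procedural macro that is handed `AS_SET_BASE` still only sees the identifier.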

SichangHe avatar Sep 03 '23 06:09 SichangHe