Bump databricks-labs-blueprint from 0.8.2 to 0.11.4

Open dependabot[bot] opened this issue 2 months ago • 0 comments

Bumps databricks-labs-blueprint from 0.8.2 to 0.11.4.

Release notes

Sourced from databricks-labs-blueprint's releases.

Release v0.11.4

What's Changed

Added Password Prompt to operate with echo off in terminal (databrickslabs/blueprint#265). The command-line interface now includes a secure password prompt feature, allowing users to enter sensitive information without it being visible on the screen. This is achieved through a new method that utilizes the getpass library to hide user input when entering passwords, taking a prompt message and an optional maximum number of attempts as parameters. The method repeatedly prompts the user for a password until a valid input is provided or the maximum number of attempts is reached, at which point it raises a ValueError. This addition enhances the security and usability of the interface, and its functionality is validated through new test methods that cover both successful password entry and the scenario where the maximum number of attempts is exceeded, ensuring the feature behaves as expected in various situations.
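The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the function name `prompt_password`, the injectable `reader` parameter, the default of 10 attempts, and the assumption that "valid input" means a non-empty string are all hypothetical.

```python
import getpass


def prompt_password(message: str, *, max_attempts: int = 10, reader=getpass.getpass) -> str:
    """Ask for a password with terminal echo off, retrying on empty input.

    `reader` defaults to getpass.getpass and is injectable only so the
    sketch can be exercised without a terminal (an assumption of this example).
    """
    for _ in range(max_attempts):
        # getpass.getpass reads one line without echoing it to the terminal.
        value = reader(f"{message}: ")
        if value:
            return value
    # Give up once the attempt budget is exhausted, as the release note describes.
    raise ValueError(f"cannot get a password after {max_attempts} attempts")
```

In interactive use this would be called as `prompt_password("Token")`; passing a stub `reader` keeps the retry logic testable.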

Sniff encoding properly in XML files with a standalone directive (databrickslabs/blueprint#256). The XML file encoding detection has been improved to support a wider range of valid XML declarations. The regular expression used to match XML declarations has been updated to correctly handle cases where both encoding and standalone attributes are present, such as <?xml version="1.x" encoding="xxx" standalone="yes"?>. This change enables more accurate detection of the encoding attribute in XML files, even when a standalone directive is present. Additionally, test functions have been added and modified to verify this functionality, including tests for XML files with a BOM prefix and those with an XML standalone declaration, ensuring that the code can correctly read these files and detect the encoding.
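As an illustration of what such a pattern might look like, the sketch below matches an XML declaration whose encoding attribute may be followed by a standalone attribute. The pattern and the name `sniff_declared_encoding` are assumptions made for this example; the regular expression actually used by the library is not shown in these notes.

```python
import re
from typing import Optional

# Hypothetical pattern: version, then optional encoding, then optional standalone.
XML_DECL = re.compile(
    rb"""^<\?xml\s+version\s*=\s*(?P<q1>["'])1\.\d+(?P=q1)"""
    rb"""(?:\s+encoding\s*=\s*(?P<q2>["'])(?P<encoding>[A-Za-z][A-Za-z0-9._-]*)(?P=q2))?"""
    rb"""(?:\s+standalone\s*=\s*(?P<q3>["'])(?:yes|no)(?P=q3))?"""
    rb"""\s*\?>"""
)


def sniff_declared_encoding(head: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, or None if absent."""
    match = XML_DECL.match(head)
    if match is None or match.group("encoding") is None:
        return None
    return match.group("encoding").decode("ascii")
```

A declaration with only `standalone` and no `encoding` still matches but yields `None`, so the caller can fall back to another detection step.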

Full Changelog: https://github.com/databrickslabs/blueprint/compare/v0.11.3...v0.11.4

v0.11.3

Fixed configuration file unmarshalling of JSON floating-point values (#253). The unmarshalling of primitive types has been improved to ensure accurate conversion and prevent potential errors or data corruption. The updated functionality now correctly handles the conversion of JSON floating-point values to integers by refusing to truncate precision and instead raising a SerdeError when necessary. Additionally, string-to-boolean conversions are now strictly validated to only accept true or false (case-insensitive). Furthermore, configuration file unmarshalling has been enhanced with additional type checks to verify that loaded values match the expected types, such as strings, integers, and floats, thereby preventing incorrect type conversions and ensuring that the loaded data retains its original precision and type.
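A rough sketch of the stricter conversions follows. The `SerdeError` class here is a local stand-in, and `to_int` and `to_bool` are hypothetical helper names. Accepting a float such as 3.0 when it has no fractional part is an assumption based on the phrase "when necessary".

```python
class SerdeError(TypeError):
    """Local stand-in for the library's error type (base class is an assumption)."""


def to_int(raw: object) -> int:
    """Convert a loaded JSON number to int without silently truncating."""
    if isinstance(raw, int):
        return raw
    if isinstance(raw, float):
        if raw.is_integer():
            return int(raw)  # 3.0 -> 3 loses nothing (assumed to be allowed)
        raise SerdeError(f"refusing to truncate {raw!r} to an integer")
    raise SerdeError(f"expected a number, got {type(raw).__name__}")


def to_bool(raw: object) -> bool:
    """Accept real booleans, or the strings 'true'/'false' in any letter case."""
    if isinstance(raw, bool):
        return raw
    if isinstance(raw, str):
        lowered = raw.lower()
        if lowered == "true":
            return True
        if lowered == "false":
            return False
    raise SerdeError(f"expected true or false, got {raw!r}")
```

The point of the design is that a value like 2.5 or "yes" fails loudly at load time instead of being coerced into something the configuration author did not write.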

Fixed unmarshalling of forward references on Python ≥ 3.12.4 (#252). The library's unmarshalling functionality has been updated to support forward references on Python versions 3.12.4 and later, which introduced changes to the _evaluate method of ForwardRef. A new internal utility method has been added to handle these changes, ensuring compatibility with different Python versions by conditionally passing the recursive_guard parameter as a keyword argument and including additional type information. Additionally, a new test function has been introduced to verify the correct handling of forward references in class fields, simulating future annotations and testing the save and load process of an instance with various field types, including strings, integers, and JSON values, to ensure that forward references are correctly resolved.

Support detecting file encoding from XML declaration (#254). The library's file encoding detection capabilities have been enhanced to support XML files, allowing for the extraction of encoding information from the XML declaration at the start of the file. A new detection method has been introduced, which reads the initial bytes of the file to determine the potential encoding, attempts to decode the XML declaration, and returns the specified encoding if successful. This functionality is accessible through the updated decode_with_bom and read_text functions, which now accept an optional detect_xml parameter to enable XML declaration-based encoding detection. If no encoding is detected via the byte order mark (BOM) or XML declaration, the library defaults to the locale's preferred encoding. Additionally, new test cases have been added to verify the correct detection of XML file encodings, including scenarios with encoding declarations, byte order marks, and default UTF-8 encoding.
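The fallback order described above (BOM first, then XML declaration, then the locale default) might look roughly like this. The function name `guess_encoding` and the simplified declaration pattern are assumptions. The sketch only handles declarations stored in an ASCII-compatible encoding, which is narrower than what the note describes.

```python
import codecs
import locale
import re

# UTF-32 must be checked before UTF-16: the UTF-32-LE BOM starts with the UTF-16-LE BOM.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)

# Simplified, hypothetical pattern for the encoding attribute of an XML declaration.
_XML_ENCODING = re.compile(rb"""<\?xml[^>]*?\sencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']""")


def guess_encoding(head: bytes) -> str:
    """Pick an encoding from the first bytes of a file."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    match = _XML_ENCODING.match(head)
    if match:
        return match.group(1).decode("ascii")
    # Nothing detected: fall back to the locale's preferred encoding.
    return locale.getpreferredencoding(False)
```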

Contributors: @asnare, @sundarshankar89

v0.11.2

Allow login URLs as profile host when configuring the workspace/admin client for CLI commands (#250). The handling of the DATABRICKS_HOST environment variable has been modified to ensure consistent normalization of the host URL with the Databricks Go SDK, resolving a host normalization issue that previously arose from differences in SDK implementations. Two new methods, fix_databricks_host and _patch_databricks_host, have been introduced to emulate the Go SDK's host normalization and update the environment variable if necessary. The fix_databricks_host method normalizes the host URL by parsing it and creating a new URL instance with empty path, parameters, query, and fragment if the netloc is empty, while the _patch_databricks_host method checks and updates the DATABRICKS_HOST environment variable accordingly. This change enables the Python SDK to receive a normalized host URL, allowing the labs CLI integration to work correctly, and updates the needs_workspace_client and is_account checks to use the normalized host URL when creating workspace or account clients. Additionally, several unit tests have been added to verify the correctness of the normalization and patching functionality for different host value types and client scenarios.
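One possible reading of that normalization, sketched with the standard library: a login URL is reduced to scheme and host, and a bare hostname gets a scheme. The function name `normalize_host`, the example hostname, and the choice to prefix `https://` when no netloc is found are assumptions; the note does not spell out that branch.

```python
from urllib.parse import urlparse


def normalize_host(host: str) -> str:
    """Reduce a host value to scheme://netloc, dropping path, query and fragment."""
    parsed = urlparse(host)
    if not parsed.netloc:
        # No scheme was given, so urlparse put everything in `path`.
        # Assumption for this sketch: treat the value as a bare hostname.
        return f"https://{host}"
    # Blank out everything after the network location.
    return parsed._replace(path="", params="", query="", fragment="").geturl()
```

For example, a profile host copied from the browser such as `https://example.cloud.databricks.com/login.html?o=1` would come out as `https://example.cloud.databricks.com`.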

Contributors: @asnare

v0.11.1

Expose the number of available CPUs for concurrent processing (#244). The library now provides a method to determine the number of logical CPUs available for the current process, considering factors such as containerized environments where the available CPU quota may differ from the total number of CPUs present. This method checks for the availability of the process_cpu_count attribute, and if not available, attempts to use the sched_getaffinity function on Linux or falls back to the total number of CPUs in the system, defaulting to 1 if unknown. The gather method has been updated to utilize this new method, allowing for more accurate determination of the available CPU count and improved concurrency. Additionally, several test cases have been added to verify the correct behavior of the method, including scenarios where the count is retrieved from different sources, ensuring a reliable way to determine the available CPU count for configuring concurrent processing in downstream applications.
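The lookup order described above can be sketched with the standard library alone. The function name `available_cpus` is hypothetical; `os.process_cpu_count` exists only on Python 3.13 and later, and `os.sched_getaffinity` only on some platforms such as Linux, which is why both are probed before use.

```python
import os


def available_cpus() -> int:
    """Best-effort count of logical CPUs usable by the current process."""
    process_cpu_count = getattr(os, "process_cpu_count", None)
    if process_cpu_count is not None:
        # Python 3.13+: takes the process's CPU affinity into account.
        count = process_cpu_count()
    elif hasattr(os, "sched_getaffinity"):
        # Older Pythons on Linux: size of the affinity mask of this process.
        count = len(os.sched_getaffinity(0))
    else:
        # Last resort: every CPU in the system.
        count = os.cpu_count()
    # Any of the calls may yield None or 0 when the count is unknown.
    return count or 1
```

A downstream application could pass the result as the worker count of a thread pool instead of relying on the total number of CPUs in the machine.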

Improve support for reading text files that contain a Unicode BOM at the start (#243). The library now provides enhanced support for reading text files that contain a Unicode Byte Order Mark (BOM) at the start, allowing for accurate detection and handling of the file's encoding. New methods have been introduced to detect the BOM and decode the file accordingly, including handling of decoding errors and newline characters. The read_text function has been added, enabling the reading of text files with a BOM prefix, and is designed to work with both seekable and non-seekable files, although specifying a read size for non-seekable files will raise an error. Additionally, the existing code for handling Workspace files has been refactored to utilize the same implementation, and improvements have been made to support non-seekable files, ensuring a more robust and reliable reading experience for text files with Unicode BOM markers.

Contributors: @asnare

v0.11.0

Marshalling: allow JSON-like fields (#241). The library has undergone significant changes to improve its marshalling functionality, code readability, and maintainability. A new JsonValue type alias has been introduced to represent the maximum bounds of values that can be saved for an installation, and support for Any and object as type annotations on data classes has been removed. The library now issues a DeprecationWarning when saving raw list and dict fields, and raises a specific error during loading, instructing users to use list[T] or dict[T] instead. Various methods, including _marshal_generic_list, _marshal_raw_list, _marshal_generic_dict, and _marshal_raw_dict, have been updated to handle the serialization of lists and dictionaries, while the _unmarshal method now handles the deserialization of unions, lists, and dictionaries. Additionally, the library has been updated to provide more informative error messages, and several tests have been added to cover various scenarios, including generic dict and list JSON values, bool in union, and raw list and dict deprecation. The Installation class, MockInstallation class, and Paths class have also been updated with new methods, type hints, and custom initialization to improve code flexibility and maintainability.
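To make the idea of a JSON-like upper bound concrete, here is a sketch of what such an alias and a matching runtime check could look like. The exact definition of `JsonValue` in the library may differ, and `is_json_value` is a hypothetical helper written only for this example.

```python
from typing import Union

# Assumed shape of the alias: JSON scalars plus lists and string-keyed dicts of the same.
JsonValue = Union[None, bool, int, float, str, list["JsonValue"], dict[str, "JsonValue"]]


def is_json_value(value: object) -> bool:
    """Return True if `value` uses only JSON-like types, recursively."""
    if value is None or isinstance(value, (bool, int, float, str)):
        return True
    if isinstance(value, list):
        return all(is_json_value(item) for item in value)
    if isinstance(value, dict):
        return all(isinstance(k, str) and is_json_value(v) for k, v in value.items())
    return False
```

Under this reading, a field annotated with the alias can hold nested lists and dicts, while values such as sets or dicts with non-string keys fall outside the bound.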

Contributors: @asnare

v0.10.2

Consistent exception formatting in logs (#237). The logger's exception formatting has been enhanced to provide a consistent and readable log format, adhering to standard Python norms. When an exception occurs, the log message now ensures a newline character separates the error message from the exception details, regardless of whether logs are colorized or not. This update applies to both exception text and stack information, which are now prepended with a newline character if necessary, resulting in a uniform format for all log types. This change resolves previous inconsistencies between colorized and non-colorized logs, aligning the logging functionality with standard Python practices for exception logging, and improving overall log readability.

Ensure that App logger emits DEBUG events if the CLI is invoked with --debug (#238). The get_logger function has been enhanced to provide more flexibility and consistency with standard logging practices. It now accepts an optional manager parameter, allowing for customization of the logging manager, and returns a logging.Logger object. The logger level is automatically set to DEBUG when the application is running in debug mode, as detected by the is_in_debug function, and the level is set using the logging.DEBUG constant for consistency. This change simplifies the code and ensures that the logger emits DEBUG events when the application is run with the debug flag, which is verified through an updated test suite that covers various scenarios, including logger name setting, debug mode behavior, and logger propagation.

Ensure the names of logger levels are consistent (#234). The logger has been updated to use consistent naming conventions for logging levels, aligning with the Python ecosystem's norms. Previously, colorized logs used compact names WARN and FATAL for warning and critical levels, while non-colorized logs used the conventional WARNING and CRITICAL names. To address this inconsistency, two new dictionaries have been introduced to store colorized level names and color codes, and the format method has been modified to utilize these dictionaries, ensuring consistent logging level names and colorized message text. As a result, logging level names have been updated to use the conventional WARNING and CRITICAL instead of WARN and FATAL, and color codes for message text have been added for each logging level, promoting consistency and adherence to Python logging conventions.

Ensure the non-colorized logs also include timestamps with second-lev… (#235). The log formatter has been updated to include second-granular timestamps in non-colorized logs, providing more precise logging information and ensuring consistency with colorized output. Previously, only minute-granular timestamps were logged, which was insufficient for logging purposes. The update changes the timestamp format from %H:%M to %H:%M:%S to include seconds, resulting in more detailed timestamp information. This change resolves the inconsistency between colorized and non-colorized logs, and is verified by updated tests that validate the formatter's behavior with and without colors, confirming that the formatter now correctly starts with a timestamp in both cases.
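The effect of the timestamp change can be reproduced with the standard `logging.Formatter`. Only the two `datefmt` values come from the note above; the layout string and the sample record are assumptions made for the illustration.

```python
import logging

# A hand-built record, so the example does not depend on logger configuration.
record = logging.LogRecord("demo", logging.WARNING, "demo.py", 1, "disk almost full", None, None)

# Hypothetical layout; the library's actual format string is not shown in the notes.
layout = "%(asctime)s %(levelname)s [%(name)s] %(message)s"

before = logging.Formatter(layout, datefmt="%H:%M").format(record)    # minute granularity
after = logging.Formatter(layout, datefmt="%H:%M:%S").format(record)  # second granularity
```

Here `before` starts with a timestamp of the form HH:MM and `after` with HH:MM:SS, which is the difference the release note describes.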

Fixed Blueprint Install (#225). The __version__ variable import statement has been updated to utilize a fully qualified module name, providing a more explicit and absolute reference to the module containing version information. This change ensures that the correct version is imported and used to set the user agent extra in relevant function calls, enhancing the reliability and accuracy of version tracking within the library.

... (truncated)

Changelog

Sourced from databricks-labs-blueprint's changelog.

0.11.4

Added Password Prompt to operate with echo off in terminal (#265). The command-line interface now includes a secure password prompt feature, allowing users to enter sensitive information without it being visible on the screen. This is achieved through a new method that utilizes the getpass library to hide user input when entering passwords, taking a prompt message and an optional maximum number of attempts as parameters. The method repeatedly prompts the user for a password until a valid input is provided or the maximum number of attempts is reached, at which point it raises a ValueError. This addition enhances the security and usability of the interface, and its functionality is validated through new test methods that cover both successful password entry and the scenario where the maximum number of attempts is exceeded, ensuring the feature behaves as expected in various situations.

Sniff encoding properly in XML files with a standalone directive (#256). The XML file encoding detection has been improved to support a wider range of valid XML declarations. The regular expression used to match XML declarations has been updated to correctly handle cases where both encoding and standalone attributes are present, such as <?xml version="1.x" encoding="xxx" standalone="yes"?>. This change enables more accurate detection of the encoding attribute in XML files, even when a standalone directive is present. Additionally, test functions have been added and modified to verify this functionality, including tests for XML files with a BOM prefix and those with an XML standalone declaration, ensuring that the code can correctly read these files and detect the encoding.

0.11.3

Fixed configuration file unmarshalling of JSON floating-point values (#253). The unmarshalling of primitive types has been improved to ensure accurate conversion and prevent potential errors or data corruption. The updated functionality now correctly handles the conversion of JSON floating-point values to integers by refusing to truncate precision and instead raising a SerdeError when necessary. Additionally, string-to-boolean conversions are now strictly validated to only accept true or false (case-insensitive). Furthermore, configuration file unmarshalling has been enhanced with additional type checks to verify that loaded values match the expected types, such as strings, integers, and floats, thereby preventing incorrect type conversions and ensuring that the loaded data retains its original precision and type.

Fixed unmarshalling of forward references on Python ≥ 3.12.4 (#252). The library's unmarshalling functionality has been updated to support forward references on Python versions 3.12.4 and later, which introduced changes to the _evaluate method of ForwardRef. A new internal utility method has been added to handle these changes, ensuring compatibility with different Python versions by conditionally passing the recursive_guard parameter as a keyword argument and including additional type information. Additionally, a new test function has been introduced to verify the correct handling of forward references in class fields, simulating future annotations and testing the save and load process of an instance with various field types, including strings, integers, and JSON values, to ensure that forward references are correctly resolved.

Support detecting file encoding from XML declaration (#254). The library's file encoding detection capabilities have been enhanced to support XML files, allowing for the extraction of encoding information from the XML declaration at the start of the file. A new detection method has been introduced, which reads the initial bytes of the file to determine the potential encoding, attempts to decode the XML declaration, and returns the specified encoding if successful. This functionality is accessible through the updated decode_with_bom and read_text functions, which now accept an optional detect_xml parameter to enable XML declaration-based encoding detection. If no encoding is detected via the byte order mark (BOM) or XML declaration, the library defaults to the locale's preferred encoding. Additionally, new test cases have been added to verify the correct detection of XML file encodings, including scenarios with encoding declarations, byte order marks, and default UTF-8 encoding.

0.11.2

Allow login URLs as profile host when configuring the workspace/admin client for CLI commands (#250). The handling of the DATABRICKS_HOST environment variable has been modified to ensure consistent normalization of the host URL with the Databricks Go SDK, resolving a host normalization issue that previously arose from differences in SDK implementations. Two new methods, fix_databricks_host and _patch_databricks_host, have been introduced to emulate the Go SDK's host normalization and update the environment variable if necessary. The fix_databricks_host method normalizes the host URL by parsing it and creating a new URL instance with empty path, parameters, query, and fragment if the netloc is empty, while the _patch_databricks_host method checks and updates the DATABRICKS_HOST environment variable accordingly. This change enables the Python SDK to receive a normalized host URL, allowing the labs CLI integration to work correctly, and updates the needs_workspace_client and is_account checks to use the normalized host URL when creating workspace or account clients. Additionally, several unit tests have been added to verify the correctness of the normalization and patching functionality for different host value types and client scenarios.

0.11.1

Expose the number of available CPUs for concurrent processing (#244). The library now provides a method to determine the number of logical CPUs available for the current process, considering factors such as containerized environments where the available CPU quota may differ from the total number of CPUs present. This method checks for the availability of the process_cpu_count attribute, and if not available, attempts to use the sched_getaffinity function on Linux or falls back to the total number of CPUs in the system, defaulting to 1 if unknown. The gather method has been updated to utilize this new method, allowing for more accurate determination of the available CPU count and improved concurrency. Additionally, several test cases have been added to verify the correct behavior of the method, including scenarios where the count is retrieved from different sources, ensuring a reliable way to determine the available CPU count for configuring concurrent processing in downstream applications.

Improve support for reading text files that contain a Unicode BOM at the start (#243). The library now provides enhanced support for reading text files that contain a Unicode Byte Order Mark (BOM) at the start, allowing for accurate detection and handling of the file's encoding. New methods have been introduced to detect the BOM and decode the file accordingly, including handling of decoding errors and newline characters. The read_text function has been added, enabling the reading of text files with a BOM prefix, and is designed to work with both seekable and non-seekable files, although specifying a read size for non-seekable files will raise an error. Additionally, the existing code for handling Workspace files has been refactored to utilize the same implementation, and improvements have been made to support non-seekable files, ensuring a more robust and reliable reading experience for text files with Unicode BOM markers.

0.11.0

Marshalling: allow JSON-like fields (#241). The library has undergone significant changes to improve its marshalling functionality, code readability, and maintainability. A new JsonValue type alias has been introduced to represent the maximum bounds of values that can be saved for an installation, and support for Any and object as type annotations on data classes has been removed. The library now issues a DeprecationWarning when saving raw list and dict fields, and raises a specific error during loading, instructing users to use list[T] or dict[T] instead. Various methods, including _marshal_generic_list, _marshal_raw_list, _marshal_generic_dict, and _marshal_raw_dict, have been updated to handle the serialization of lists and dictionaries, while the _unmarshal method now handles the deserialization of unions, lists, and dictionaries. Additionally, the library has been updated to provide more informative error messages, and several tests have been added to cover various scenarios, including generic dict and list JSON values, bool in union, and raw list and dict deprecation. The Installation class, MockInstallation class, and Paths class have also been updated with new methods, type hints, and custom initialization to improve code flexibility and maintainability.

0.10.2

Consistent exception formatting in logs (#237). The logger's exception formatting has been enhanced to provide a consistent and readable log format, adhering to standard Python norms. When an exception occurs, the log message now ensures a newline character separates the error message from the exception details, regardless of whether logs are colorized or not. This update applies to both exception text and stack information, which are now prepended with a newline character if necessary, resulting in a uniform format for all log types. This change resolves previous inconsistencies between colorized and non-colorized logs, aligning the logging functionality with standard Python practices for exception logging, and improving overall log readability.

Ensure that App logger emits DEBUG events if the CLI is invoked with --debug (#238). The get_logger function has been enhanced to provide more flexibility and consistency with standard logging practices. It now accepts an optional manager parameter, allowing for customization of the logging manager, and returns a logging.Logger object. The logger level is automatically set to DEBUG when the application is running in debug mode, as detected by the is_in_debug function, and the level is set using the logging.DEBUG constant for consistency. This change simplifies the code and ensures that the logger emits DEBUG events when the application is run with the debug flag, which is verified through an updated test suite that covers various scenarios, including logger name setting, debug mode behavior, and logger propagation.

Ensure the names of logger levels are consistent (#234). The logger has been updated to use consistent naming conventions for logging levels, aligning with the Python ecosystem's norms. Previously, colorized logs used compact names WARN and FATAL for warning and critical levels, while non-colorized logs used the conventional WARNING and CRITICAL names. To address this inconsistency, two new dictionaries have been introduced to store colorized level names and color codes, and the format method has been modified to utilize these dictionaries, ensuring consistent logging level names and colorized message text. As a result, logging level names have been updated to use the conventional WARNING and CRITICAL instead of WARN and FATAL, and color codes for message text have been added for each logging level, promoting consistency and adherence to Python logging conventions.

Ensure the non-colorized logs also include timestamps with second-lev… (#235). The log formatter has been updated to include second-granular timestamps in non-colorized logs, providing more precise logging information and ensuring consistency with colorized output. Previously, only minute-granular timestamps were logged, which was insufficient for logging purposes. The update changes the timestamp format from %H:%M to %H:%M:%S to include seconds, resulting in more detailed timestamp information. This change resolves the inconsistency between colorized and non-colorized logs, and is verified by updated tests that validate the formatter's behavior with and without colors, confirming that the formatter now correctly starts with a timestamp in both cases.

Fixed Blueprint Install (#225). The __version__ variable import statement has been updated to utilize a fully qualified module name, providing a more explicit and absolute reference to the module containing version information. This change ensures that the correct version is imported and used to set the user agent extra in relevant function calls, enhancing the reliability and accuracy of version tracking within the library.

Fixed argument interpolation in colorised logs (#233). The colorised log formatter has been enhanced to correctly handle log entries containing %-style placeholders with arguments, a common pattern in third-party code, by retrieving the log message using record.getMessage() instead of directly accessing record.msg. This update resolves an issue with improper formatting of logging from third-party components and builds upon previous changes to address the underlying logging problem. Additionally, the corresponding test case has been updated to verify that the formatter correctly handles messages with arguments that require interpolation, both with and without colors enabled, and is no longer expected to fail, indicating that the issue with argument interpolation in the colorized log formatter has been resolved.
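The difference between `record.msg` and `record.getMessage()` is easy to see on a hand-built record. The logger name, file name and message below are made up for the example.

```python
import logging

# A record as third-party code would typically produce it: template plus args.
record = logging.LogRecord(
    "third.party", logging.INFO, "client.py", 10,
    "retrying %s in %d s", ("GET /jobs", 5), None,
)

raw = record.msg                     # the un-interpolated template
interpolated = record.getMessage()   # the template with args applied
```

A formatter that prints `raw` shows the literal placeholders, whereas one that uses `getMessage()` prints "retrying GET /jobs in 5 s".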

Fixed logger name abbreviation fails if the logger name contains .. (#236). The logger's format method has been enhanced to correctly abbreviate logger names containing multiple consecutive dots, which previously led to exceptions. The new logic splits the logger name into components, abbreviating all but the last two, and then reassembles them, ensuring correct abbreviation and formatting even when consecutive dots are present. This improvement also fixes the colorized logger output to handle logger names with consecutive dots without throwing an exception, and the corresponding test case has been updated to reflect this change, now directly testing the logging functionality by formatting the log record and stripping ANSI escape sequences, providing a more straightforward verification of the logging functionality.
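A sketch of the split-abbreviate-reassemble approach follows. The function name `abbreviate` and the choice to shorten each leading component to its first letter are assumptions; the note only says that all but the last two components are abbreviated.

```python
def abbreviate(name: str) -> str:
    """Shorten all but the last two dot-separated components of a logger name."""
    parts = name.split(".")
    if len(parts) <= 2:
        return name
    # Slicing with [:1] is safe for the empty components that ".." produces,
    # whereas indexing with [0] would raise IndexError on an empty string.
    head = [part[:1] for part in parts[:-2]]
    return ".".join(head + parts[-2:])
```

With this logic `databricks.labs.blueprint.installation` becomes `d.l.blueprint.installation`, and a name with consecutive dots such as `foo..bar.baz` becomes `f..bar.baz` instead of raising.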

0.10.1

patch hosted runner (#185). In this release, we have implemented a temporary fix to address issues with publishing artifacts in the release workflow. This fix involves changing the runner used for the job from ubuntu-latest to a protected runner group labeled "linux-ubuntu-latest". This ensures that the job runs on a designated hosted runner with the specified configuration, enhancing the reliability and security of the release process. The permissions section of the job remains unchanged, allowing authentication to PyPI and signing of release artifacts with sigstore-python. It is worth noting that this is a stopgap measure, and further changes to the release workflow may be made in the future.

0.10.0

Fixed incorrect script for no-pylint-disable (#178). In this release, we have updated the script used in the no-cheat GitHub workflow to address false positives in stacked pull requests. The updated script fetches the base reference from the remote repository and generates a diff between the base reference and the current branch, saving it to a file. It then runs the no_cheat.py script against this diff file and saves the results to a separate file. If the count of cheats (instances where linting has been intentionally disabled) is greater than one, the script outputs the contents of the results file and exits with a non-zero status, indicating an error. This change enhances the accuracy of the script and ensures it functions correctly in a stacked pull request scenario. The no_cheat function, which checks for the presence of certain pylint disable tags in a given diff text, has been updated to the latest version from the ucx project to improve accuracy. The function identifies tags by looking for lines starting with - or + followed by the disable tag and a list of codes, and counts the number of times each code is added and removed, reporting any net additions.

Skip dataclasses fields only when None (#180). In this release, we have implemented a change that allows for the skipping of dataclass fields only when the value is None, enabling the inclusion of empty lists, strings, or zeros during marshalling. This modification is in response to issue #179 and involves adding a check for None before marshalling a dataclass field. Specifically, the previous condition if not raw: has been replaced with if raw is None:. This change ensures that empty values such as [], '', or 0 are not skipped during the serialization process, unless they are explicitly set to None. This enhancement provides improved compatibility and flexibility for users working with dataclasses containing empty values, allowing for more fine-grained control during the serialization process.
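The effect of replacing `if not raw:` with `if raw is None:` can be shown with a toy marshaller. The `Settings` class and the `marshal` function are hypothetical and far simpler than the library's own code; only the two conditions come from the note.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class Settings:
    name: str = ""
    retries: int = 0
    tags: list = dataclasses.field(default_factory=list)
    comment: Optional[str] = None


def marshal(obj) -> dict:
    """Turn a dataclass instance into a dict, skipping only None fields."""
    out = {}
    for field in dataclasses.fields(obj):
        raw = getattr(obj, field.name)
        if raw is None:  # previously `if not raw:`, which also dropped "", 0 and []
            continue
        out[field.name] = raw
    return out
```

With the old condition `marshal(Settings())` would have produced an empty dict; with the new one the empty string, the zero and the empty list are kept and only `comment` is omitted.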

... (truncated)

Commits

250fa26 Release v0.11.4 (#313)
8bf51d2 FIX hatch click dependency (#269)
2987ea4 Improve Prompts type hints (#267)
a28cf0e Eliminate Pytest warning during unit tests (#266)
5f8823f Added Password Prompt to operate with echo off in terminal (#265)
bb2541c Sniff encoding properly in XML files with a standalone directive (#256)
f1df7ca Release v0.11.3 (#255)
21e6a15 Fix configuration file unmarshalling of JSON floating-point values (#253)
5bb70b3 Support detecting file encoding from XML declaration (#254)
cb3d9e3 Fix unmarshalling of forward references on Python ≥ 3.12.4 (#252)
Additional commits viewable in compare view

You can trigger a rebase of this PR by commenting @dependabot rebase.

Note Automatic rebases have been disabled on this pull request as it has been open for over 30 days.

Oct 06 '25 15:10 dependabot[bot]