PyAPI: Improving and extending Py_buffer support #114429

Thomas Barlow · 2023-11-02T23:38:01+01:00

Background and rationale

I would like to work on extending and improving the Py_buffer support of Blender's Python API. While improving the performance of the FBX IO addon I made heavy use of the existing buffer support, so these are primarily changes that would have made the Python API easier for me to use or that would make further performance improvements to the FBX IO addon possible. None of the changes are specific to the FBX IO addon and they should be useful to other addon/script developers.

I'm not too experienced with C (and even less so with C++), so I anticipate that progress will be slow and that PR reviews will identify areas to improve. I've already made working builds of a few of these changes to make sure they are things I am capable of working on, though further work is required before they're in a state that could be reviewed.

The foreach_get and foreach_set functions available for bpy_prop_collection and bpy_prop_array types (foreach_getset and pyprop_array_foreach_getset in bpy_rna.cc) provide fast access to props of the collection items or the array data. Given a standard Python sequence such as a list, the C code for these functions iterates the data into/out of the sequence. This can be considerably faster than iterating through the bpy_prop_collection or bpy_prop_array in Python code. However, further performance can be achieved by passing in a Python object that supports the Buffer Protocol (see https://docs.python.org/3/c-api/buffer.html and https://peps.python.org/pep-3118/) and is of a type compatible with the type of the property or array being accessed, because the data can then be copied into/out of the buffer more efficiently with memcpy. Reading data into and out of buffers can be especially useful for IO that imports/exports data as arrays of bytes or for libraries that work with entire arrays of data at a time, such as NumPy or PIL.
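As a rough illustration of why the buffer path is faster, the sketch below contrasts a per-item copy with a single block copy through the buffer protocol. It uses only the standard library; copy_per_item and copy_as_buffer are hypothetical stand-ins rather than Blender functions, and the block copy is only a Python-level analogue of the memcpy done in C.

```python
from array import array


def copy_per_item(dest, src):
    """Element-by-element copy, roughly what a plain sequence requires."""
    for i, value in enumerate(src):
        dest[i] = value


def copy_as_buffer(dest, src):
    """Block copy through the buffer protocol.

    Only valid when both sides share the same format and itemsize.
    """
    src_view = memoryview(src)
    dest_view = memoryview(dest)
    if (src_view.format, src_view.itemsize) != (dest_view.format, dest_view.itemsize):
        raise TypeError("incompatible buffer types")
    # Viewing both sides as raw bytes lets one slice assignment copy everything.
    dest_view.cast("B")[:] = src_view.cast("B")


src = array("f", [0.0, 1.5, 2.5, 4.0])
# Zero-filled destinations with the same type code and length as the source.
dest_a = array("f", bytes(src.itemsize * len(src)))
dest_b = array("f", bytes(src.itemsize * len(src)))

copy_per_item(dest_a, src)
copy_as_buffer(dest_b, src)
print(dest_a == src, dest_b == src)
```

Both paths produce the same result; the difference is that the second one never touches individual Python objects, which is where the time goes for large collections.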

There are currently limitations to the types of buffers that can be used and there's no clear way to tell through the Python API what type a buffer needs to have to be compatible with the collection attribute being accessed.

Additionally, there are functions available through the Python API that take a sequence as an argument but make no use of the fact that the sequence could be a compatible buffer, and there are functions that return a sequence where it may be beneficial to allow a Python addon developer to instead write the result directly into a buffer. These functions could see considerable performance increases from supporting buffers.

A few utility functions for buffers already exist at the end of py_capi_utils.cc. I intend to extend these and update the bpy_prop_collection and bpy_prop_array functions to use them, as the existing buffer code for idprop arrays already does.

Main goals

A shorter overview of the main changes I want to make, described in more detail further below:

Unify existing buffer code used by bpy_prop_array, bpy_prop_collection and idprop arrays.
- Currently they each have their own implementations with varying support for buffers.
- Blender's GPU Python API also uses buffers but I don't think it will be affected much, if at all, due to having more specialised use.
Replace checks for specific buffer types with checking the kind of data (integer/unsigned integer/floating-point/other) and its itemsize.
- int and long are the same size on some systems, but parts of the current Python API accept only int buffers even where a long buffer would be compatible.
Add support for standard-size buffer formats or explicitly native-size buffer formats.
- For example, ctypes arrays use standard-size buffer formats, making them incompatible with the current API.
Support non-C-contiguous buffers by silently converting to/from C-contiguous.
- This is already implemented by Python's C API, so it's only a matter of allocating C-contiguous arrays to convert into/out of and then using the API.
Add buffer support to Python API functions that take sequence arguments.
- Mesh.normals_split_custom_set is the most notable example, though VertexGroup.add may also see some benefit.
Add some property or function to Blender's Python API that will assist in creating compatible buffers OR silently cast incompatible buffers to the correct type.
- Providing the itemsize of the property or a buffer format string would be useful. Notably, PROP_UNSIGNED is already exposed through a property's .subtype.
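The format differences mentioned in these goals can be observed from plain Python. The following is a small sketch using only the standard library (no Blender API involved): it prints the native sizes of int and long, then compares the buffer format reported by an array.array with the one reported by a ctypes array of the same element type.

```python
import ctypes
import struct
import sys
from array import array

# Native sizes are platform dependent: int and long match on some systems only.
int_size = struct.calcsize("i")
long_size = struct.calcsize("l")
print(f"int: {int_size} bytes, long: {long_size} bytes")

# array.array reports a bare native type code.
array_view = memoryview(array("f", [1.0, 2.0, 3.0]))
# ctypes arrays report an explicit byte-order prefix, which implies standard size.
ctypes_view = memoryview((ctypes.c_float * 3)(1.0, 2.0, 3.0))

print(array_view.format, array_view.itemsize)
print(ctypes_view.format, ctypes_view.itemsize)

# Same kind of data and same itemsize, yet the format strings differ.
same_layout = array_view.itemsize == ctypes_view.itemsize
same_format = array_view.format == ctypes_view.format
print(same_layout, same_format)
```

A check that compares only the first format character treats these two buffers differently even though their memory layout is identical on a native-byte-order system.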

Proposed changes

Improvements to existing code

Unify buffer parsing code between id_prop arrays, bpy_prop_array foreach_get/set and bpy_prop_collection foreach_get/set.
- The fairly recent updates to convert C code to C++ should make this simpler now that all the affected code is C++.
Check buffer length/shape matches the length/shape of the collection/array.
- Currently, the only check in place is that the sequence length of the PyObject matches. This is usually the same as the length of a 1D buffer, though it technically might not be.
- For multi-dimensional buffers, we should only require that the total size of the buffer matches the total size of the array/collection, otherwise we would break the current compatibility that allows flat buffers to be used with multi-dimensional collections/arrays.
- This would require modifying the buffer request flags to include at least PyBUF_ND so that shape information is provided by the buffer.
- Checking buffer length is already listed as a TODO in a comment in bpy_rna.cc.
Support buffers of any type that is the correct kind of data (bool/integer/unsigned integer/floating-point) and the correct itemsize.
- Multiple native types can often have the same size, e.g. int and long, and buffers can be 'standard size' (see https://docs.python.org/3/library/struct.html#format-characters), which may not match native sizes. It would therefore be beneficial not to require that a buffer's type code exactly matches the data, so long as it's the same kind of data and the same size.
- There are existing utility functions for doing some of this such as PyC_StructFmt_type_is_float_any, though extra functions for signed/unsigned integers and for working with 'standard size' would need to be added.
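The kind/itemsize/total-size check described above could look roughly like this in Python terms. This is only a sketch: format_kind and is_compatible are hypothetical names, not existing Blender utilities, and it handles bare native type codes only (prefixed formats are a separate item further below).

```python
import math
import struct
from array import array


def format_kind(code):
    """Classify a single struct type code by the kind of data it holds."""
    if code == "?":
        return "bool"
    if code in "bhilqn":
        return "int"
    if code in "BHILQN":
        return "uint"
    if code in "efd":
        return "float"
    return "other"


def is_compatible(buf, kind, itemsize, total_items):
    """True if buf holds total_items items of the given kind and itemsize."""
    view = memoryview(buf)
    code = view.format
    if len(code) != 1:
        return False  # prefixed or multi-element formats are out of scope here
    # Compare the total item count, so flat and multi-dimensional buffers both pass.
    count = math.prod(view.shape)
    return (format_kind(code), view.itemsize, count) == (kind, itemsize, total_items)


int_size = struct.calcsize("i")
flat = array("i", range(6))
# A 2x3 view over the same memory (cast to bytes first, then to a shape).
grid = memoryview(flat).cast("B").cast("i", shape=[2, 3])

print(is_compatible(flat, "int", int_size, 6))
print(is_compatible(grid, "int", int_size, 6))
# A long buffer passes only on systems where long has the same size as int.
print(is_compatible(array("l", range(6)), "int", int_size, 6))
print(is_compatible(array("f", [0.0] * 6), "int", int_size, 6))
```

Comparing the product of the shape rather than the shape itself keeps the current behaviour where a flat buffer can be used with a multi-dimensional array.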

New features

Support non-C-contiguous and PIL-style (suboffsets) buffers in foreach_get/foreach_set.
- Currently, buffers are required to be C-contiguous as per the PyBUF_SIMPLE flag. It's not difficult to support F-contiguity, non-contiguity or even suboffsets by using the PyBuffer_FromContiguous and PyBuffer_ToContiguous functions from Python's C API to convert the buffer data to/from C-contiguous arrays. For any-contiguity and PIL-style support, the buffer request flags would need to include PyBUF_FULL or PyBUF_FULL_RO depending on whether the buffer needs to be writable.
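A Python-level analogue of the round trip through a C-contiguous array is sketched below. It assumes a 1D strided memoryview; to_c_contiguous and from_c_contiguous are hypothetical helper names, with memoryview.tobytes and slice assignment playing a role similar to what PyBuffer_ToContiguous and PyBuffer_FromContiguous would do in the C code.

```python
from array import array


def to_c_contiguous(view):
    """Return a packed C-order copy of the view's data as bytes."""
    return view.tobytes(order="C")


def from_c_contiguous(view, packed):
    """Write packed C-order bytes back into a 1D, possibly strided view."""
    view[:] = memoryview(packed).cast(view.format)


values = array("f", [0.0, 10.0, 1.0, 11.0, 2.0, 12.0])
strided = memoryview(values)[::2]  # every second float, so not contiguous
print(strided.c_contiguous)

# Reading: gather the strided items into one contiguous block.
packed = to_c_contiguous(strided)
print(array("f", packed).tolist())

# Writing: scatter a contiguous block back through the strides.
from_c_contiguous(strided, array("f", [5.0, 6.0, 7.0]).tobytes())
print(values.tolist())
```

The cost is one extra allocation and copy per call, which is why this would only be a fallback for buffers that are not already C-contiguous.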
Add support for buffers that contain byte-order/size/alignment information in their format strings (e.g. ctypes arrays).
- At least to start with, I'm intending to only support buffers in native byte-order.
- The current method of parsing buffer format strings in foreach_get/set checks the first character and nothing else, but there are a number of optional prefix characters that specify byte-order, size (native or 'standard') and alignment (native or unaligned). If any of these are present, the buffer is currently considered incompatible.
- The buffer format follows the Python struct module syntax with extensions from PEP 3118 (see https://peps.python.org/pep-3118/#additions-to-the-struct-string-syntax), which also allow numeric prefixes and multi-element formats. These would be unusual to come across, but formats such as "fd" or "1L" (same as "L", but with an explicit repeat count) are possible, and the current code would either parse them incorrectly ("fd" -> 'f') or reject them as invalid ("1L" -> '1').
- An existing utility function with support for struct module byte-order/size/alignment prefixes is PyC_StructFmt_type_from_str.
- For simplicity, it may be best to only support single character formats with an optional byte-order/size/alignment prefix. This would mean that "1L" and "L:my_long_name:", which are valid formats, would not be supported, though I haven't seen them used.
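The simplified rule in the last bullet (one type code with an optional prefix, native byte order only) can be sketched in Python as follows. The names parse_simple_format and is_native_byte_order are hypothetical, and this is not how PyC_StructFmt_type_from_str is implemented; it only illustrates which formats would be accepted.

```python
import sys

PREFIXES = "@=<>!"
# Numeric and bool codes only; note "n" and "N" are valid in native mode only.
TYPE_CODES = "?bBhHiIlLqQnNefd"


def parse_simple_format(fmt):
    """Return (prefix, code) for 'optional prefix + one type code', else None."""
    prefix = "@"  # the struct module's default when no prefix is given
    if fmt and fmt[0] in PREFIXES:
        prefix, fmt = fmt[0], fmt[1:]
    if len(fmt) != 1 or fmt not in TYPE_CODES:
        return None  # repeat counts, names and multi-element formats
    return prefix, fmt


def is_native_byte_order(prefix):
    """True if data with this prefix is laid out in the host's byte order."""
    if prefix in "@=":
        return True
    if prefix == "<":
        return sys.byteorder == "little"
    return sys.byteorder == "big"  # ">" and "!" are both big-endian


for fmt in ("f", "@f", "<i", "fd", "1L", "L:my_long_name:"):
    print(repr(fmt), parse_simple_format(fmt))
```

Formats that return None here would fall back to being read as ordinary sequences, as they are today.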
Add a function or property that can be used to accurately produce a compatible buffer for a specific property.
- Signedness appears to be retrievable from the subtype of an IntProperty, but there doesn't appear to be a way to get the size in bytes of a property's type. The size in bytes can in some cases be guessed from the hard_min and hard_max of the property, but this is not reliable. For example, color properties usually have a hard_min of 0.0 and can have a hard_max of 1.0.
- As more properties are moving to generic attributes, this is becoming less of an issue because each attribute data_type has a specific C type. These C types do still need to be figured out by developers that are using Blender's Python API, which could be improved upon so that each developer doesn't need to create their own mapping from data_type to C type, but the consistency of the C types of attributes is very helpful.
Or, instead of the previous feature, cast incompatible buffers to the correct type instead of falling back to accessing them as a sequence.
- I expect there are already some library functions that can do some or all of this in a performant manner.
- The implementation should be similar in speed to or faster than numpy.ndarray.astype(new_type).
  - NumPy does have a C API, though I don't know if it would be possible to use. If it comes down to it, I suppose the NumPy Python API could be called from C code using Python's C API.
Add buffer support for functions that take sequence arguments, such as VertexGroup.add and Mesh.normals_split_custom_set.
- This can be done by modifying pyrna_py_to_array to take advantage of arguments that are buffers for a performance boost.
  - As a side-effect, it would also mean that assignment to a bpy_prop_array such as my_image.pixels = my_buffer would also take advantage of my_buffer being a buffer and run at the same speed as bpy_prop_array.foreach_set when the buffer is compatible.
  - pyrna_py_to_array_index is similar to pyrna_py_to_array, so it may also be useful to add buffer support to it for the case of setting elements of multi-dimensional arrays.
Add foreach_get/set support to bool bpy_prop_array.
- I can't think of any large bool arrays where performance would be a concern, but I couldn't see any other reason for why bool arrays were not supported when I was looking at the code I would need to work on for some of the other changes.
- Since it was simple to copy, paste and modify the existing code for float and int arrays and it's a change that isn't really connected to any of the other planned changes, I've already made a PR for this change, though I do need to update it: !106492

Further extensions

These are possible further extensions that I haven't looked into the details of and may be beyond my current capabilities or may not be possible.

Add as_buffer() function, similar to calling as_pointer() on the first element of a collection, but for the entire collection's contents along with typing.
- One downside to foreach_get is that it always copies data, and in some cases converts from one type to another, e.g. 8-bit integer attributes convert to int.
- This would not be able to support properties that are derived.
- Structs can be represented as buffer format strings, for example vec3f: "T{f:x:f:y:f:z:}" or "T{fff}" or "T{(3)f}". Or the much larger BezTriple: "T{(3,3)f:vec:f:tilt:f:weight:f:radius:c:ipo:B:h1:B:h2:B:f1:B:f2:B:f3:c:hide:c:easing:f:back:f:amplitude:f:period:=c:auto_handle_type:(3)c:_pad:}" or "T{(3,3)ffffcBBBBBccfffc(3)c}".
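For a sense of what such a zero-copy struct buffer looks like from Python, ctypes already exposes structures this way. The sketch below uses a hypothetical Vec3f structure as a stand-in for a Blender struct; it is not a proposal for how as_buffer() would be implemented.

```python
import ctypes
import struct


class Vec3f(ctypes.Structure):
    # Hypothetical stand-in for a struct of three floats.
    _fields_ = [("x", ctypes.c_float), ("y", ctypes.c_float), ("z", ctypes.c_float)]


points = (Vec3f * 2)(Vec3f(1.0, 2.0, 3.0), Vec3f(4.0, 5.0, 6.0))
view = memoryview(points)

print(view.format)    # a PEP 3118 struct format string beginning with "T{"
print(view.itemsize)  # size of one Vec3f in bytes
print(view.nbytes)    # size of the whole array in bytes

# The view shares memory with the array: a later write is visible through it.
before = view.tobytes()
points[0].x = 9.0
after = view.tobytes()
print(struct.unpack("6f", after))
```

Because the view aliases the underlying memory rather than copying it, the lifetime and validity of that memory become the caller's concern, which is likely the hard part of doing this for Blender data.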
Add to_buffer function similar to foreach_get, but creating and returning a compatible buffer object filled with the requested data.
- memoryview is Python's built-in type for representing a view of a buffer so could be the object type returned if it is preferable over a NumPy or ctypes array. A Python array.array is also a buffer but doesn't support the bool type.
- Alternatively this functionality could be added as an additional option when using foreach_get instead, e.g. foreach_get("co", None) that sees the None argument and then creates and returns a memoryview of a bytearray containing the data.
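A minimal sketch of the "memoryview of a bytearray" idea, assuming native single-character type codes. The new_buffer helper is hypothetical and only shows how such a return value could be created and typed, including for bool data.

```python
import struct
from array import array


def new_buffer(type_code, count):
    """Return a writable, zero-filled memoryview of count native items."""
    storage = bytearray(struct.calcsize(type_code) * count)
    return memoryview(storage).cast(type_code)


coords = new_buffer("f", 6)
coords[0] = 1.5
print(coords.format, coords.itemsize, coords.tolist())

# memoryview supports the bool format code "?", which array.array does not.
flags = new_buffer("?", 4)
flags[2] = True
print(flags.format, flags.tolist())

try:
    array("?")
except ValueError as error:
    print("array.array rejects '?':", error)
```

The returned object keeps the bytearray alive, so the caller receives a typed, writable buffer without needing NumPy or ctypes.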
Add support for functions that would normally return a new list/tuple, to instead return into a provided buffer (or sequence for increased compatibility).
- It would be nice to have some generic solution that works with all existing functions that return sequences, but that sounds like it could require some bigger changes for what is a relatively small feature. Simply adding new 'buffer supported' versions of existing functions as needed is certainly simpler, but could clutter the API if there are many functions that would benefit from buffer support in this way.
  - Since functions already have names assigned to their return values, my ideas for generic solutions revolve around using those names to specify which return values should be output into which provided buffers. e.g. normals_list = my_shape_key.normals_polygon_get() -> my_shape_key.normals_polygon_get(normals=my_buffer) or, in the case that the function usually takes arguments or has multiple return values, groups_list, num_groups = my_mesh.calc_smooth_groups(use_bitflags=True) -> num_groups = my_mesh.calc_smooth_groups(use_bitflags=True, poly_groups=my_buffer).

Thomas Barlow added the Type Design label 2023-11-02 23:38:01 +01:00

Iliya Katushenock added this to the Python API project 2023-11-03 08:47:01 +01:00

Thomas Barlow referenced this issue 2023-11-13 04:55:43 +01:00: WIP: PyAPI: Use py_capi_utils in bpy_prop_array and bpy_prop_collection foreach_get/set #114773
