Python API: Direct access to attribute arrays of meshes, curves, point clouds and GP drawings #122091

Sietse Brouwer · 2024-05-22T11:34:38+02:00

Sietse Brouwer commented

2024-05-22 11:34:38 +02:00

This PR implements a Python API for direct access to attribute arrays of meshes, curves, point clouds and Grease Pencil drawings.

Current access to attribute arrays has its limitations, mainly because they are defined as property collections, which always return RNA property objects, not direct values.

>>> radius = curve.attributes['radius']
>>> radius.data[0]
bpy.data.hair_curves['Curves']...FloatAttributeValue
>>> radius.data[0].value
1.0
>>> radius.data[0].value = 0.5  # Pretty verbose
>>> radius.data[0:5] = ???  # Slice assign: pretty hard to do

This PR implements a more direct approach to lift these limitations.

Direct access to attributes
.attributes[name].data_array

radius = curve.attributes['radius']
radius.data_array[0]  # Returns float 0.5
radius.data_array[0] = 2.0  # Assign float directly
radius.data_array[1:4] = [1.0, 2.0, 1.5]  # Slice assign (three values for three elements)

Attribute type aware
All attribute types are handled automatically, so retrieving a float3 attribute returns a Vector(x, y, z), a ColorGeometry4f attribute returns [r, g, b, a], etc.

from mathutils import Quaternion, Vector
positions = curve.attributes['position']
positions.data_array[0] = Vector((1.0, 0.0, 2.5))  # Assign mathutils.Vector or...
positions.data_array[0] = [1.0, 0.0, 2.5]  # Assign list
curve.attributes['rotation_quaternion'].data_array[0] = Quaternion((1.0, 0.0, 0.0, 0.0))
curve.attributes['.selection'].data_array[0] = True

Foreach_get and foreach_set
.data_array.foreach_get(array)
.data_array.foreach_set(array)

import numpy as np
num_points = len(radius.data_array)
radii = np.empty(num_points, dtype=np.float32)  # Uninitialized buffer, filled by foreach_get
radius.data_array.foreach_get(radii)
radii[:] = 2.0
radius.data_array.foreach_set(radii)

Fill attribute
.data_array.fill(value)
.data_array.fill(value, indices)

radius.data_array.fill(2.0)  # Set the radius of all the control points in the curves
radius.data_array.fill(2.0, [0, 1, 2, 3])  # Use indices: set the radius of the first four control points

# Select points with a radius between 1.0 and 2.0
num_points = len(radius.data_array)
radii = np.empty(num_points, dtype=np.float32)
radius.data_array.foreach_get(radii)
indices = np.nonzero((radii > 1.0) & (radii < 2.0))[0]
curve.attributes['.selection'].data_array.fill(True, indices)
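
The index selection in the last example can be tried outside Blender. The following is a minimal sketch in which plain NumPy arrays stand in for the buffers: `radii` plays the role of the array filled by `foreach_get`, and the fancy-index assignment models what `fill(True, indices)` is described as doing. The sample values are made up for illustration.

```python
import numpy as np

# Stand-in for the buffer that foreach_get() would fill (hypothetical values).
radii = np.array([0.5, 1.2, 1.0, 1.8, 2.0, 2.5], dtype=np.float32)

# Strict bounds, as in the PR example: 1.0 and 2.0 themselves are excluded.
mask = (radii > 1.0) & (radii < 2.0)
indices = np.nonzero(mask)[0]

# Stand-in for the '.selection' attribute; fill(True, indices) is modeled
# here as a fancy-index assignment.
selection = np.zeros(len(radii), dtype=bool)
selection[indices] = True

print(indices.tolist())  # [1, 3]
```

Note that `np.nonzero` returns a tuple with one array per dimension, which is why the example takes element `[0]`.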

Technical notes

Changing attribute values won't trigger an object redraw, so the redraw has to be triggered manually from the Python code.
To keep things clean, I placed the BPy_AttributeArray code in its own submodule in python/generic, similar to ID Properties Access.

Sietse Brouwer added 4 commits 2024-05-22 11:34:49 +02:00

Python access to attribute arrays through BPy_AttributeArray e6a1d94184

Cleanup: Use ColorGeometry4f type instead of float4 (and others) 34af3b796c

Accept 'long' and 'int' as buffer type for foreach_get/set 29499c25f3

Accept numpy array for indices in .fill(value, indices) 97614244a1

Sietse Brouwer added this to the Module: Python API project 2024-05-22 11:35:10 +02:00

Sietse Brouwer referenced this pull request

2024-05-22 12:03:47 +02:00

WIP: GPv3: Python API for frame, drawing and builtin geometry attributes #122094

Sietse Brouwer referenced this pull request

2024-05-22 12:10:01 +02:00

WIP: GPv3: Python API for frame, drawing and builtin geometry attributes #122094

Sietse Brouwer requested review from Falk David 2024-05-22 12:12:32 +02:00

Hans Goudey requested changes 2024-05-22 15:26:00 +02:00

Hans Goudey left a comment

I'm not convinced it's worth it for us to maintain such an API outside of RNA. Some of these improvements would be more generally beneficial applied to RNA itself rather than just one custom type. The need for RNA_struct_is_attribute_array shows how this doesn't fit that well into existing design IMO. At least this should be discussed with core module members before moving forward further.

There are a few more important aspects of RNA/Python access to attribute data. However we improve attribute access, these issues should be addressed too.

Implicit Sharing: Currently accessing attribute data in Python un-shares the array, whether the access is for writing or only reading. This is a significant performance problem because it negates the memory usage benefits of implicit sharing as soon as Python/RNA touches the data.
Caches: Currently we have various caches that depend on some attributes. They need to be tagged dirty when the data is changed.
Validation: Some builtin attributes have constraints on what values can be set. These should be respected too.
Separation from CustomData: We are looking to replace CustomData at some point with a more flexible/modern storage system. We should avoid building abstractions directly on top of CustomData in the meantime.

The last three issues can be resolved by building on top of the C++ attribute API in BKE_attribute.hh rather than the CustomData system.
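
To make the implicit sharing point concrete, here is a toy copy-on-write handle in Python. It is only a sketch of the general idea (read access keeps the buffer shared, write access copies first when there are other users); the class `SharedArray` and its methods are hypothetical and do not mirror Blender's `ImplicitSharingInfo` or any existing RNA code.

```python
import numpy as np


class SharedArray:
    """Toy copy-on-write handle; not Blender's ImplicitSharingInfo."""

    def __init__(self, data, users=None):
        self._data = data
        # One-element list used as a user count shared between handles.
        self._users = users if users is not None else [1]

    def share(self):
        """Create another handle onto the same buffer without copying."""
        self._users[0] += 1
        return SharedArray(self._data, self._users)

    def for_read(self):
        """Read-only view: never copies, so the buffer stays shared."""
        view = self._data.view()
        view.flags.writeable = False
        return view

    def for_write(self):
        """Writable array: copies first if other handles still use the buffer."""
        if self._users[0] > 1:
            self._users[0] -= 1
            self._data = self._data.copy()
            self._users = [1]
        return self._data

    def shares_buffer_with(self, other):
        return self._data is other._data


original = SharedArray(np.ones(4, dtype=np.float32))
evaluated = original.share()

total = float(evaluated.for_read().sum())  # Reading leaves the buffer shared.
still_shared = evaluated.shares_buffer_with(original)

evaluated.for_write()[0] = 2.0  # Writing un-shares, then modifies the copy.
```

The problem described above corresponds to an API that only offers the `for_write` path: every access, including a pure read, pays for the copy.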

source/blender/makesdna/DNA_customdata_types.h

						
				@ -43,2 +43,4 @@

				  /** Shape key-block unique id reference. */

				  int uid;

				  /** Only for use in RNA and #bpy_rna: number of items in the layer data. */

				  int length;

Hans Goudey commented

2024-05-22 15:15:12 +02:00

I think it's important to avoid adding this. It should be retrieved from the "context" as necessary. That sort of redundant storage adds plenty of problems with syncing and invalid states.

Sietse Brouwer commented

2024-05-23 00:16:49 +02:00

The 'context' is the problem here. The function that creates the Python object only receives a PointerRNA *ptr, which contains an ID owner_id and void *data. For meshes and curves, we can derive the length of the attribute array from the owner_id, but for Grease Pencil we can't, because the geometry is in a Drawing, which isn't an ID type.
I see the problems you mention, of course, but I don't see a better solution for now. Any ideas, perhaps?

Falk David commented

2024-05-29 12:37:31 +02:00

There was some discussion on improving the state of things regarding ownership of data pointed at by PointerRNA. See #122431

Brecht Van Lommel commented

2024-05-29 14:05:56 +02:00

I think storing the array length in the custom data layer is not that bad, if this was a C++ data type this would be a Vector and so it would have the size too. But if this is done, I think it requires some refactoring in customdata.cc to ensure any array pointer changes go along with size changes. Maybe affects some public API functions as well, where maybe it needs to then pass an Array instead of a pointer.

Hans Goudey commented

2024-05-29 14:35:11 +02:00

> if this was a C++ data type this would be a Vector

Not quite. ImplicitSharingInfo serves as the ownership here. data just serves as quick access, pointing to any contiguous array. At the very least it should be stored in CustomData, not CustomDataLayer.

Anyway, improvements to PointerRNA should make this change unnecessary.

Brecht Van Lommel commented

2024-05-29 15:03:35 +02:00

I think you'd still have some way of freeing the data in a destructor, which for some data types requires the length. To me it's strange for a data structure to not know the length of its own array members, and instead passing it into various CustomData_ API functions.

Hans Goudey commented

2024-05-29 15:12:52 +02:00

ImplicitSharingInfo handles the destruction itself. All CustomData should have to do is decrement the user count. But your overall point is fair.

source/blender/python/generic/attribute_array_py_types.hh Outdated

						
				@ -0,0 +425,4 @@

				  char buffer_format_a;

				  char buffer_format_b;

				  std::string description;

				  std::function<PyObject *(void *, int)> get_attribute;

Hans Goudey commented

2024-05-22 15:20:17 +02:00

Use simpler types here: function pointers and StringRef will do the trick and will be faster. That said, I'm not sure about passing everything through a function pointer. The indirection seems unnecessarily complex. Why not just use a switch statement?

SietseB marked this conversation as resolved

source/blender/python/generic/attribute_array_py_types.hh Outdated

						
				@ -0,0 +429,4 @@

				  std::function<bool(void *, int, PyObject *)> set_attribute;

				};

				std::unordered_map<int, CustomDataTypeMapping> data_types{

Hans Goudey commented

2024-05-22 15:20:43 +02:00

Use Map in Blender code rather than the std library types

SietseB marked this conversation as resolved

Sietse Brouwer added 1 commit 2024-05-23 00:03:26 +02:00

Use switch statement for attribute type functions 7ee24a12a2

Brecht Van Lommel commented

2024-05-29 14:02:51 +02:00

Repeating my comment from devtalk:

From a planning point of view, I think the first priority should be wrapping the attributes the same as it works for meshes and curves now. Improving the array access for attributes in general would be great but should not be a blocker for GPv3.

There is an issue to solve there regarding efficiently looking up the array length with multiple drawings. Also as a first step there I suggest to implement the inefficient solution (looping through all drawings and their custom data layers), and then work on making it faster.
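
The "inefficient solution" mentioned in the quote could look roughly like the linear search below. This is a sketch only: `Layer`, `Drawing` and `find_array_length` are hypothetical stand-ins written in Python for illustration, not Blender types, and the sketch assumes every layer lives on the point domain so that the owning drawing's point count is the array length.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """Hypothetical stand-in for a custom data layer."""
    name: str
    data: list


@dataclass
class Drawing:
    """Hypothetical stand-in for a Grease Pencil drawing."""
    num_points: int
    layers: list = field(default_factory=list)


def find_array_length(drawings, layer):
    """Linear search: return the element count of the drawing owning `layer`."""
    for drawing in drawings:
        for candidate in drawing.layers:
            # Identity, not equality: layer names repeat across drawings.
            if candidate is layer:
                return drawing.num_points
    return None


radius_a = Layer("radius", [1.0] * 3)
radius_b = Layer("radius", [1.0] * 5)
drawings = [Drawing(3, [radius_a]), Drawing(5, [radius_b])]
length = find_array_length(drawings, radius_b)  # 5
```

The cost grows with the number of drawings times the number of layers per drawing, which is the part that would need to be made faster later.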

Brecht Van Lommel commented

2024-05-29 14:03:56 +02:00

> I'm not convinced it's worth it for us to maintain such an API outside of RNA. Some of these improvements would be more generally beneficial applied to RNA itself rather than just one custom type. The need for RNA_struct_is_attribute_array shows how this doesn't fit that well into existing design IMO. At least this should be discussed with core module members before moving forward further.

I think this can be implemented as a generic RNA feature for arrays rather than something attribute specific. It could define a multi-dimensional array (like number of elements x 3), give it a vector or color subtype, and have the Python RNA wrapping recognize that.
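
As a rough illustration of the "number of elements x 3" idea, the sketch below views a flat buffer as a two-dimensional array whose width depends on a subtype tag. It uses NumPy only to show the shape handling; the `SUBTYPE_WIDTH` table and `wrap_flat_array` are hypothetical names and do not model RNA's actual subtype enum or the Python RNA wrapping.

```python
import numpy as np

# Hypothetical subtype tags; RNA's real subtype enum is not modeled here.
SUBTYPE_WIDTH = {"NONE": 1, "VECTOR": 3, "COLOR": 4}


def wrap_flat_array(flat, subtype):
    """View a flat buffer as (number of elements x width) without copying."""
    width = SUBTYPE_WIDTH[subtype]
    if flat.size % width != 0:
        raise ValueError("buffer size is not a multiple of the subtype width")
    return flat.reshape(-1, width)


flat_positions = np.arange(6, dtype=np.float32)  # Two float3 elements.
positions = wrap_flat_array(flat_positions, "VECTOR")
positions[1] = (1.0, 0.0, 2.5)  # Writes through to the flat buffer.
```

Because the reshaped array is a view, element-wise assignment on the wrapper changes the underlying storage directly, which is the behavior the attribute API in this PR aims for.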

Brecht Van Lommel referenced this pull request

2024-05-29 17:51:13 +02:00

RNA: Make `PointerRNA` aware of its own 'data path' #122431

Bastien Montagne referenced this pull request

2024-06-01 19:35:43 +02:00

RNA: Make `PointerRNA` aware of its own 'data path' #122431

This pull request has changes conflicting with the target branch.