PLY: Improve robustness and performance of the new PLY importer #105842

Merged
Aras Pranckevicius merged 10 commits from aras_p/blender:ply-importer into main 2023-03-17 12:20:56 +01:00

Preface

All this started when someone asked how does performance of the new PLY importer (landed in #104404) compares to regular C/C++ PLY parsing libraries, e.g. miniply. I started to look, found that while the new importer is about 7x-20x faster than Python one, it's between 4x-8x slower than miniply. But then, while testing on some ply files, I found that it was not handling some of them properly (crashes or garbage data). And so...

Fixes

  • General:
    • Was assuming that positions/normals/UVs are always float, and colors are always uchar. But e.g. positions are sometimes doubles, or colors are floats etc.
    • Was assuming that face element is always just a single vertex indices property. But some files have more arbitrary properties per-face.
  • ASCII importer:
    • Was assuming that file elements are always in this order: vertex, face, edge. That is not necessarily the case.
    • Was only handling space character as separator, but not other whitespace e.g. tab.
    • Was not checking for partially present properties (e.g. if just nx is present but not ny and nz, it was still assuming whole normal is there).
  • Binary importer:
    • Was assuming that faces are always 1-byte list size and 4-byte indices.
    • Was assuming that edge vertex indices are always 4-byte.

For the assumptions above, it often lead to either crashes, garbage data or not detecting presence of a vertex component. E.g. binary importer was just always reading 4 bytes for vertex indices, but if a file has 2 byte indices then that would read too much, and then later on would start reading garbage. ASCII importer was doing out-of-bounds reads into properties array if some assumptions were not correct, etc.

Improvements

  • Some ply files use different property type names, e.g. float32, uint16 or int8 instead of float, ushort, char; now that is supported too.
  • Handles tristrips element. Some files out there do not have face but rather has a triangle strip, internally using -1 indices for strip restarts.

Performance

While doing the above fixes/improvements, in order to fix some things it was more convenient to also make them more performant :) Now it avoids splitting each ASCII line into a vector of std::string objects, for example, or reading a binary file in tiny chunks.

Generally this is now 2x-4x faster than the previous state of C++ importer. Here are timings to import "Lucy" model from the Stanford 3D scans page (blender RelWithDebInfo build, Windows is VS2022 16.4 on Ryzen 5950X, Mac is clang 14 on Apple M1Max):

The code has changed quite a bit to achieve both the fixes and performance:

  • Instead of separate files for ASCII and Binary reading, they are now both in ply_import_data.cc/hh. Reason is that all of the "find correct property indices" logic is the same between both (and that bit was mostly missing previously).
  • Instead of using raw C++ fstream object and reading line by line (which in turn reads one byte at a time) or reading field by field, now there's a ply_import_buffer.cc/hh that reads in 64KB chunks and does any needed logic for the ASCII/header part to properly split into lines. This avoids all the mutex locking, locale lookups, C++ abstraction layers overhead that C++ iostreams have when doing a ton of tiny reads.

Tests

Extended test coverage with new files (comitted in svn revision 63274, attached as zip here too). 11 new "situations", in both ascii and binary forms. Additionally, now also has a Big Endian binary file to test; BE codepath was completely untested before.

Overall reworked the tests so that they don't do the whole "load dummy blend file, import ply file on top, go over meshes, check them", but rather do a simpler "run ply importer logic that fills in PlyData structure, check that". The followup conversion to blender meshes is fairly trivial and the tests were not really covering that in a useful way.

### Preface All this started when someone asked how does performance of the new PLY importer (landed in #104404) compares to regular C/C++ PLY parsing libraries, e.g. [miniply](https://github.com/vilya/miniply). I started to look, found that while the new importer is about 7x-20x faster than Python one, it's between 4x-8x slower than miniply. But then, while testing on some ply files, I found that it was not handling some of them properly (crashes or garbage data). And so... ### Fixes - General: - Was assuming that positions/normals/UVs are always `float`, and colors are always `uchar`. But e.g. positions are sometimes doubles, or colors are floats etc. - Was assuming that `face` element is always just a single vertex indices property. But some files have more arbitrary properties per-face. - ASCII importer: - Was assuming that file elements are always in this order: `vertex`, `face`, `edge`. That is not necessarily the case. - Was only handling space character as separator, but not other whitespace e.g. tab. - Was not checking for partially present properties (e.g. if just `nx` is present but not `ny` and `nz`, it was still assuming whole normal is there). - Binary importer: - Was assuming that faces are always 1-byte list size and 4-byte indices. - Was assuming that edge vertex indices are always 4-byte. For the assumptions above, it often lead to either crashes, garbage data or not detecting presence of a vertex component. E.g. binary importer was just always reading 4 bytes for vertex indices, but if a file has 2 byte indices then that would read too much, and then later on would start reading garbage. ASCII importer was doing out-of-bounds reads into properties array if some assumptions were not correct, etc. ### Improvements - Some ply files use different property type names, e.g. `float32`, `uint16` or `int8` instead of `float`, `ushort`, `char`; now that is supported too. - Handles `tristrips` element. Some files out there do not have `face` but rather has a triangle strip, internally using -1 indices for strip restarts. ### Performance While doing the above fixes/improvements, in order to fix some things it was more convenient to also make them more performant :) Now it avoids splitting each ASCII line into a vector of `std::string` objects, for example, or reading a binary file in tiny chunks. Generally this is now 2x-4x faster than the previous state of C++ importer. Here are timings to import "Lucy" model from the [Stanford 3D scans](http://graphics.stanford.edu/data/3Dscanrep/) page (blender RelWithDebInfo build, Windows is VS2022 16.4 on Ryzen 5950X, Mac is clang 14 on Apple M1Max): ![](https://projects.blender.org/attachments/04d41f59-d34f-4162-b9df-96aaf975f714) The code has changed quite a bit to achieve both the fixes and performance: - Instead of separate files for ASCII and Binary reading, they are now both in `ply_import_data.cc/hh`. Reason is that all of the "find correct property indices" logic is the same between both (and that bit was mostly missing previously). - Instead of using raw C++ `fstream` object and reading line by line (which in turn reads one byte at a time) or reading field by field, now there's a `ply_import_buffer.cc/hh` that reads in 64KB chunks and does any needed logic for the ASCII/header part to properly split into lines. This avoids all the mutex locking, locale lookups, C++ abstraction layers overhead that C++ iostreams have when doing a ton of tiny reads. ### Tests Extended test coverage with new files (comitted in svn revision 63274, attached as zip here too). 11 new "situations", in both ascii and binary forms. Additionally, now also has a Big Endian binary file to test; BE codepath was completely untested before. Overall reworked the tests so that they don't do the whole "load dummy blend file, import ply file on top, go over meshes, check them", but rather do a simpler "run ply importer logic that fills in PlyData structure, check that". The followup conversion to blender meshes is fairly trivial and the tests were not really covering that in a useful way.
Aras Pranckevicius force-pushed ply-importer from 5b487e5061 to fafca64d00 2023-03-17 10:40:38 +01:00 Compare
Author
Member

@blender-bot build

@blender-bot build
Aras Pranckevicius added 1 commit 2023-03-17 11:10:15 +01:00
buildbot/vexp-code-patch-coordinator Build done. Details
f15f92fa6f
PLY: fix linux build error/warnings
Aras Pranckevicius changed title from WIP: PLY: Improve robustness and performance of the new PLY importer to PLY: Improve robustness and performance of the new PLY importer 2023-03-17 11:10:44 +01:00
Author
Member

@blender-bot build

@blender-bot build
Aras Pranckevicius merged commit 6c6edaf513 into main 2023-03-17 12:20:56 +01:00
Aras Pranckevicius deleted branch ply-importer 2023-03-17 12:20:56 +01:00
Sign in to join this conversation.
No reviewers
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset Browser
Interest
Asset Browser Project Overview
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
EEVEE & Viewport
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
EEVEE & Viewport
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Priority
High
Priority
Low
Priority
Normal
Priority
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: blender/blender#105842
No description provided.