When built with -DSIMD, interpolation now uses SIMD for tiled images as well.
[-] Draw rotated pixels in src order -> cache write miss
[X] Use atan2 at beginning and end of line. Interpolation in-between values
[X] Test pixel perfect 90
[ ] Fix out-of-bounds pixel set
[ ] Optimization for square images?
[X] Fixed point computation
[-] -funroll-loops -> no gain
[-] restrict qualifier -> unavailable in C++
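The "endpoints, then interpolate" and "fixed point" items can be illustrated together. What follows is only a sketch under stated assumptions, not the project's code: it evaluates the source coordinates at both ends of a destination row using a sin/cos rotation matrix (swapped in for the atan2 evaluation mentioned above), then steps between them in 16.16 fixed point. The names `row_source_coords`, `SrcCoord`, and `to_fixed`, and the sign convention of the rotation, are all hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// 16.16 fixed point: the low 16 bits hold the fractional part.
constexpr int kFracBits = 16;
constexpr std::int32_t kHalf = 1 << (kFracBits - 1);

inline std::int32_t to_fixed(double v) {
    return static_cast<std::int32_t>(std::lround(v * (1 << kFracBits)));
}

struct SrcCoord { int x; int y; };  // hypothetical helper type

// For destination row `row`, return the nearest source pixel of each
// destination pixel when rotating by `angle` radians about (cx, cy).
// The rotation direction is an assumption of this sketch.
std::vector<SrcCoord> row_source_coords(int row, int width, double angle,
                                        double cx, double cy) {
    assert(width > 1);
    const double c = std::cos(angle), s = std::sin(angle);
    const double dy = row - cy;

    // Floating-point trigonometry is only used for the two ends of the row.
    const double dx0 = 0.0 - cx, dx1 = (width - 1) - cx;
    const std::int32_t fx0 = to_fixed(c * dx0 + s * dy + cx);
    const std::int32_t fy0 = to_fixed(-s * dx0 + c * dy + cy);
    const std::int32_t fx1 = to_fixed(c * dx1 + s * dy + cx);
    const std::int32_t fy1 = to_fixed(-s * dx1 + c * dy + cy);

    // Constant per-pixel increments. The truncating division can accumulate
    // an error of at most (width - 1) / 65536 of a pixel across the row.
    const std::int32_t step_x = (fx1 - fx0) / (width - 1);
    const std::int32_t step_y = (fy1 - fy0) / (width - 1);

    std::vector<SrcCoord> out(width);
    std::int32_t fx = fx0, fy = fy0;
    for (int x = 0; x < width; ++x) {
        // Round to nearest by adding one half before dropping the fraction.
        out[x] = {(fx + kHalf) >> kFracBits, (fy + kHalf) >> kFracBits};
        fx += step_x;
        fy += step_y;
    }
    return out;
}
```

The inner loop contains only integer additions and shifts. The returned coordinates are not clamped, so a caller would still have to reject out-of-bounds source pixels.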
Cache
[-] Rotate per channel -> no gain
[X] Cut image in tiles
[X] Overlap
[-] Rotate in one temp tile then copy/move it
[X] Align tiles in memory
[ ] Touch beginning of tile
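As a rough illustration of the "cut image in tiles" and "overlap" items, the sketch below enumerates tile rectangles that each extend a few pixels past their core area, clamped to the image. It assumes the overlap exists so that pixels near a tile edge can still read their neighbors; the notes do not state the reason. `TileRect` and `make_tiles` are hypothetical names, and memory alignment of the tiles is not covered here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Half-open rectangle [x0, x1) x [y0, y1) in image coordinates (hypothetical type).
struct TileRect { int x0, y0, x1, y1; };

// Split a width x height image into tiles of `tile` x `tile` pixels, each
// grown by `overlap` pixels on every side and clamped to the image bounds.
std::vector<TileRect> make_tiles(int width, int height, int tile, int overlap) {
    assert(width > 0 && height > 0 && tile > 0 && overlap >= 0);
    std::vector<TileRect> tiles;
    for (int y = 0; y < height; y += tile) {
        for (int x = 0; x < width; x += tile) {
            tiles.push_back({std::max(x - overlap, 0),
                             std::max(y - overlap, 0),
                             std::min(x + tile + overlap, width),
                             std::min(y + tile + overlap, height)});
        }
    }
    return tiles;
}
```

For example, a 100x50 image with 32-pixel tiles and a 2-pixel overlap yields 8 tiles, the first covering [0, 34) x [0, 34).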
Alignment
[X] RGBX format (create pixel structure) on 8 bytes (can do computation in-place)
[X] Load pixels in 64-bit variable
[X] Directly load in SIMD 128-bit variable
[ ] Align memory on 16 bytes (would require padding)
[X] RGBX tiles
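One possible reading of the 8-byte RGBX pixel is four 16-bit channels, so that an 8-bit value multiplied by a small weight still fits in its own lane. The sketch below follows that reading; the channel width, the struct layout, and the names `PixelRGBX`, `from_rgb8`, `load_pixel`, `store_pixel`, and `scale_pixel` are assumptions, not taken from the project.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumed layout: four 16-bit channels, X being unused padding.
struct PixelRGBX { std::uint16_t r, g, b, x; };
static_assert(sizeof(PixelRGBX) == 8, "pixel must occupy exactly 8 bytes");

inline PixelRGBX from_rgb8(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return {r, g, b, 0};
}

// Load the whole pixel into one 64-bit variable (memcpy avoids aliasing issues).
inline std::uint64_t load_pixel(const PixelRGBX& p) {
    std::uint64_t v;
    std::memcpy(&v, &p, sizeof v);
    return v;
}

inline PixelRGBX store_pixel(std::uint64_t v) {
    PixelRGBX p;
    std::memcpy(&p, &v, sizeof p);
    return p;
}

// Scale all four channels by w / 256 with a single 64-bit multiply.
// Precondition: every channel holds an 8-bit value and w <= 256, so each
// product stays below 65536 and never spills into the neighboring lane.
inline std::uint64_t scale_pixel(std::uint64_t v, unsigned w) {
    assert(w <= 256);
    return ((v * w) >> 8) & 0x00FF00FF00FF00FFULL;
}
```

Because every lane is treated identically, the result does not depend on byte order: `store_pixel(scale_pixel(load_pixel(from_rgb8(200, 100, 50)), 128))` gives channels 100, 50, 25.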
Layout
[ ] Pack 4 neighbors in 16B structure (aligned). Each point is followed by the point below
[ ] Spiral layout?
Quality
[X] Interpolate using SIMD, SSE
[ ] Image borders
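To show what SIMD interpolation can look like, here is a minimal bilinear sketch that blends all four channels of a pixel at once with SSE intrinsics. It assumes float channels for simplicity, which may differ from the project's fixed-point pixels; `PixelF` and `bilinear` are hypothetical names. A scalar fallback is used when SSE is not available at compile time.

```cpp
#include <cassert>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Four float channels (R, G, B, X); an assumption made for this sketch.
struct PixelF { float c[4]; };

// Blend the four neighbors of a sample point.
// p00/p10 are the top-left/top-right pixels, p01/p11 the bottom ones;
// fx and fy are the fractional offsets in [0, 1].
PixelF bilinear(const PixelF& p00, const PixelF& p10,
                const PixelF& p01, const PixelF& p11, float fx, float fy) {
    assert(fx >= 0.0f && fx <= 1.0f && fy >= 0.0f && fy <= 1.0f);
    PixelF out;
#if defined(__SSE__)
    const __m128 a = _mm_loadu_ps(p00.c), b = _mm_loadu_ps(p10.c);
    const __m128 c = _mm_loadu_ps(p01.c), d = _mm_loadu_ps(p11.c);
    const __m128 wx = _mm_set1_ps(fx), wy = _mm_set1_ps(fy);
    // Horizontal blends: top = a + (b - a) * fx, bottom = c + (d - c) * fx.
    const __m128 top = _mm_add_ps(a, _mm_mul_ps(_mm_sub_ps(b, a), wx));
    const __m128 bot = _mm_add_ps(c, _mm_mul_ps(_mm_sub_ps(d, c), wx));
    // Vertical blend of the two rows, all four channels in one register.
    _mm_storeu_ps(out.c, _mm_add_ps(top, _mm_mul_ps(_mm_sub_ps(bot, top), wy)));
#else
    for (int i = 0; i < 4; ++i) {
        const float top = p00.c[i] + (p10.c[i] - p00.c[i]) * fx;
        const float bot = p01.c[i] + (p11.c[i] - p01.c[i]) * fx;
        out.c[i] = top + (bot - top) * fy;
    }
#endif
    return out;
}
```

The unaligned loads (`_mm_loadu_ps`) keep the sketch independent of how pixels are aligned; with 16-byte aligned storage, `_mm_load_ps` could be used instead.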