Huang, J., Zhou, Y., Funkhouser, T., & Guibas, L. (2019). FrameNet: Learning Local Canonical Frames of 3D Surfaces from a Single RGB Image. International Conference on Computer Vision (ICCV).


In this work, we introduce the novel problem of identifying dense canonical 3D coordinate frames from a single RGB image. We observe that each pixel in an image corresponds to a surface in the underlying 3D geometry, where a canonical frame can be identified as represented by three orthogonal axes, one along its normal direction and two in its tangent plane. We propose an algorithm to predict these axes from RGB. Our first insight is that canonical frames computed automatically with recently introduced direction field synthesis methods can provide training data for the task. Our second insight is that networks designed for surface normal prediction provide better results when trained jointly to predict canonical frames, and even better when trained to also predict 2D projections of canonical frames. We conjecture this is because projections of canonical tangent directions often align with local gradients in images, and because those directions are tightly linked to 3D canonical frames through projective geometry and orthogonality constraints. In our experiments, we find that our method predicts 3D canonical frames that can be used in applications ranging from surface normal estimation and feature matching to augmented reality.
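
To make the frame definition above concrete, the following is a minimal geometric sketch, not the paper's learned method: it assembles one orthonormal frame (two tangent axes and a normal) from a surface normal and a rough tangent direction. The function name `local_frame` and the `tangent_hint` argument are hypothetical, introduced only for illustration; in the paper these directions are predicted by a network rather than supplied by hand.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / n for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def local_frame(normal, tangent_hint):
    """Return (t1, t2, n): two tangent axes and the unit normal.

    Hypothetical helper: tangent_hint is any direction that is not
    parallel to the normal; it is projected into the tangent plane.
    """
    n = normalize(normal)
    # Remove the component of the hint along the normal (Gram-Schmidt step).
    d = dot(tangent_hint, n)
    t1 = normalize(tuple(h - d * c for h, c in zip(tangent_hint, n)))
    # The second tangent axis completes a right-handed frame.
    t2 = cross(n, t1)
    return t1, t2, n


t1, t2, n = local_frame((0.0, 0.0, 2.0), (1.0, 0.0, 0.5))
print(t1, t2, n)
```

The orthogonality constraint mentioned in the abstract is what lets any two of the three axes determine the third, which is why the sketch only needs a normal and a single tangent direction as input.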


  title={FrameNet: Learning Local Canonical Frames of 3D Surfaces from a Single RGB Image},
  author={Huang, Jingwei and Zhou, Yichao and Funkhouser, Thomas and Guibas, Leonidas},
  booktitle={International Conference on Computer Vision (ICCV)},