
Keyword Search Result

[Keyword] 3D motion (4 hits)

Showing results 1-4 of 4
  • Gaze Point Detection by Computing the 3D Positions and 3D Motion of Face

    Kang Ryoung PARK  Jaihie KIM  

    This paper was deleted on March 10, 2006 because it was found to be a duplicate submission (see details in the pdf file).
     
    PAPER-Image Processing, Image Pattern Recognition

    Vol: E83-D No:4, Page(s): 884-894

    Gaze detection locates the position on a monitor screen at which a user is looking. In our work, we implement it with a computer vision system that places a single camera above the monitor; the user moves (rotates and/or translates) her face to gaze at different positions on the screen. In this setting, the user is asked not to move her pupils while gazing at a new position, though we are working to relax this restriction. To detect the gaze position, we automatically extract facial features (both eyes, nostrils, and lip corners) in 2D camera images. From the movement of the feature points detected in the starting images, we compute the initial 3D positions of those features with a recursive estimation algorithm. Then, when the user moves her head to gaze at a position on the monitor, the moved 3D positions of the features are computed from 3D motion estimated by an Iterative Extended Kalman Filter (IEKF) and an affine transform. Finally, the gaze position on the monitor is computed from the normal vector of the plane determined by those moved 3D feature positions. In particular, to obtain accurate 3D positions of the initial feature points, we unify three coordinate systems (face, monitor, and camera) based on a perspective transformation. In experiments, the 3D position estimation error of the initial feature points, i.e., the RMS error between the estimated initial 3D feature positions and the real positions measured by a 3D position tracker sensor, is about 1.28 cm (0.75 cm along the X axis, 0.85 cm along the Y axis, and 0.6 cm along the Z axis), and the 3D motion estimation errors of the feature points by the IEKF are about 2.8 degrees in rotation and 1.21 cm in translation. From these, we obtain the gaze position on a 17-inch monitor with an RMS error between the calculated and real gaze positions of about 2.06 inches.
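
    As a rough illustration of the final step described in the entry above (computing the gaze point from the facial-plane normal), the sketch below assumes a single unified coordinate system in which the monitor lies in the plane Z = 0; the feature coordinates, the monitor plane, and the function name are illustrative assumptions, not the authors' code.

```python
import numpy as np

def gaze_point_on_monitor(p_left_eye, p_right_eye, p_lip, monitor_z=0.0):
    """Intersect the facial-plane normal with the monitor plane Z = monitor_z.

    Three 3D facial feature points define a plane; its normal, anchored at the
    centroid of the features, is followed until it hits the monitor plane.
    All coordinates are assumed to be in one unified (camera/monitor) frame.
    """
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p_left_eye, p_right_eye, p_lip))
    center = (p1 + p2 + p3) / 3.0            # anchor point of the gaze ray
    normal = np.cross(p2 - p1, p3 - p1)      # facial-plane normal
    normal /= np.linalg.norm(normal)
    if normal[2] > 0:                        # orient the normal toward the monitor (decreasing Z)
        normal = -normal
    t = (monitor_z - center[2]) / normal[2]  # ray/plane intersection parameter
    gaze = center + t * normal
    return gaze[:2]                          # (X, Y) on the monitor plane

# Hypothetical feature positions in centimetres, face roughly 55 cm from the screen
print(gaze_point_on_monitor([-3.0, 2.0, 55.0], [3.0, 2.0, 55.0], [0.0, -4.0, 57.0]))
```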

  • Gaze Detection by Estimating the Depths and 3D Motion of Facial Features in Monocular Images

    Kang Ryoung PARK  Si Wook NAM  Min Suk LEE  Jaihie KIM  

    This paper was deleted on March 10, 2006 because it was found to be a duplicate submission (see details in the pdf file).
     
    PAPER-Human Communications and Ergonomics

    Vol: E82-A No:10, Page(s): 2274-2284

    This paper describes a new method for detecting the gaze position of a user on a monitor from monocular images. To detect the gaze position, we automatically extract facial features (both eyes, nostrils, and lip corners) in 2D camera images and estimate the 3D depth information and initial 3D positions of those features with a recursive estimation algorithm in the starting images. Then, when the user moves his/her head to gaze at a position on the monitor, the moved 3D positions of the features are estimated from 3D motion estimated by an Extended Kalman Filter (EKF) and an affine transform. Finally, the gaze position on the monitor is calculated from the normal vector of the plane determined by those moved 3D feature positions. In particular, to obtain accurate 3D depths and positions of the initial feature points, we unify three coordinate systems (face, monitor, and camera) based on a perspective transformation. In experiments, the 3D depth and position estimation error of the initial feature points, i.e., the RMS error between the estimated initial 3D feature positions and the real positions measured by a 3D position tracker sensor, is about 1.28 cm (0.75 cm along the X axis, 0.85 cm along the Y axis, and 0.6 cm along the Z axis), and the 3D motion estimation errors of the feature points by the EKF are about 3.6 degrees in rotation and 1.4 cm in translation. From these, we obtain the gaze position on a 17-inch monitor with an RMS error between the calculated and real gaze positions of about 2.1 inches.
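
    The entry above relies on an Extended Kalman Filter to track 3D head motion. The following is a minimal, generic EKF predict/update step in NumPy, not the paper's filter: the state, the process and measurement models, and the toy usage at the end are placeholder assumptions for illustration only.

```python
import numpy as np

def ekf_step(x, P, z, f, F, h, H, Q, R):
    """One generic Extended Kalman Filter predict/update step.

    x, P : state estimate and covariance
    z    : new measurement (e.g., observed 2D feature coordinates)
    f, F : process model and its Jacobian (functions of x)
    h, H : measurement model and its Jacobian (functions of x)
    Q, R : process and measurement noise covariances
    """
    # Predict
    x_pred = f(x)
    F_k = F(x)
    P_pred = F_k @ P @ F_k.T + Q
    # Update
    H_k = H(x_pred)
    S = H_k @ P_pred @ H_k.T + R
    K = P_pred @ H_k.T @ np.linalg.inv(S)          # Kalman gain
    x_new = x_pred + K @ (z - h(x_pred))
    P_new = (np.eye(len(x)) - K @ H_k) @ P_pred
    return x_new, P_new

# Toy usage with placeholder models: a 6D motion state whose first 3 components are observed directly
f = lambda x: x
F = lambda x: np.eye(6)
h = lambda x: x[:3]
H = lambda x: np.eye(3, 6)
x, P = np.zeros(6), np.eye(6)
x, P = ekf_step(x, P, np.array([0.1, -0.2, 0.05]), f, F, h, H, 1e-3 * np.eye(6), 1e-2 * np.eye(3))
print(x)
```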

  • Structure and Motion of 3D Moving Objects from Multi-Views

    Takeaki Y. MORI  Satoshi SUZUKI  Takayuki YASUNO  

     
    PAPER

    Vol: E78-D No:12, Page(s): 1598-1606

    This paper proposes a new method that robustly recovers the 3D structure and 3D motion of moving objects from a small number of views. It recovers 3D feature points by intersecting back-projection lines drawn from the cameras' optical centers through the projected feature points on the image planes of the different cameras. By analyzing the relation between the occurrence probability of false 3D feature points and the number of views, we show that only six views are needed to suppress false 3D feature points in most cases. This analysis gives a criterion for designing an optimal multi-camera system for recovering the 3D structure and 3D motion of moving objects. An experimental multi-camera system is constructed to confirm the validity of the method. The system captures images from six different views simultaneously and records a motion image sequence from each view over a period of a few seconds. It is tested successfully on recovering the 3D structure of a plaster head of Venus and on recovering the 3D structure and 3D motion of a moving hand.
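
    The core geometric operation in the entry above is intersecting back-projection lines from several cameras. Below is a sketch of that idea under the assumption that each camera's optical center and the ray direction through its projected feature point are already known; with noisy data the rays do not meet exactly, so the least-squares intersection is returned. The function name and example values are assumptions, not the authors' implementation.

```python
import numpy as np

def intersect_backprojection_lines(centers, directions):
    """Least-squares intersection of back-projection lines.

    Each line starts at a camera's optical center and passes through a
    projected feature point on that camera's image plane. The returned 3D
    point minimizes the sum of squared distances to all the lines.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for c, d in zip(centers, directions):
        d = np.asarray(d, dtype=float)
        d /= np.linalg.norm(d)
        M = np.eye(3) - np.outer(d, d)   # projector orthogonal to the ray direction
        A += M
        b += M @ np.asarray(c, dtype=float)
    return np.linalg.solve(A, b)

# Hypothetical example: three cameras whose rays all pass through the point (0, 0, 5)
centers = [np.array([-1.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
directions = [np.array([0.0, 0.0, 5.0]) - c for c in centers]
print(intersect_backprojection_lines(centers, directions))   # ~ [0, 0, 5]
```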

  • A Method to Interpret 3D Motions Using Neural Network

    Akira WATANABE  Nobuyuki YAZAWA  Arata MIYAUCHI  Minami MIYAUCHI  

     
    PAPER

    Vol: E77-A No:8, Page(s): 1363-1370

    In computer vision, interpreting the 3D motion of an object in the physical world is an important task. This study proposes a 3D motion interpretation method that uses a neural network system consisting of three kinds of neural networks. The system estimates the 3D motion of an object by interpreting three optical flow (OF: a motion vector field calculated from images) patterns obtained from different viewpoints of the same object. In the system, an OF normalization network normalizes diverse OF patterns into a normalized OF format. A 2D motion interpretation network then interprets the normalized OF pattern to obtain the object's motion projected onto the image plane. Finally, a 3D motion interpretation network interprets the three sets of projected motions together and derives the object's 3D motion from them. A complex-numbered version of the back-propagation algorithm (Complex-BP) is applied to the OF normalization network and the 2D motion interpretation network so that these networks can learn graphical patterns as complex numbers. A 3D vector version of the back-propagation algorithm (3DV-BP) is applied to the 3D motion interpretation network so that it can learn the spatial relationship between the object's 3D motion and the corresponding three OF patterns. Although the interpretation system is trained only on basic 3D motions consisting of a single motion component, it can interpret unknown, multiple 3D motions consisting of several motion components. The generalization capability of the proposed system was confirmed using diverse test patterns, and its robustness to noise was examined experimentally. The experimental results show that the method is suitable for application to real images.
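
    To make the complex-valued learning mentioned in the entry above more concrete, here is a minimal sketch of the forward pass of a single complex-valued fully connected layer of the kind a Complex-BP-style algorithm would train. The "split" sigmoid that acts on the real and imaginary parts separately is a common choice in complex-valued networks and is an assumption here, as are the toy weights and input; this is not the authors' exact formulation.

```python
import numpy as np

def split_sigmoid(z):
    """Apply a sigmoid to the real and imaginary parts of z separately."""
    sig = lambda t: 1.0 / (1.0 + np.exp(-t))
    return sig(z.real) + 1j * sig(z.imag)

def complex_layer_forward(x, W, b):
    """Forward pass of one complex-valued fully connected layer: y = act(W x + b)."""
    return split_sigmoid(W @ x + b)

# Toy example: a 2-input, 3-unit complex layer fed with a complex-encoded flow vector
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2)) + 1j * rng.normal(size=(3, 2))
b = rng.normal(size=3) + 1j * rng.normal(size=3)
x = np.array([0.3 + 0.1j, -0.2 + 0.4j])   # e.g., an optical-flow vector encoded as complex numbers
print(complex_layer_forward(x, W, b))
```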