Hidehiko OKADA Toshiyuki ASAHI
In this paper, we propose methods for testing the usability of graphical user interface (GUI) applications based on log files of user interactions. Log analysis by existing methods is inefficient because evaluators analyze a single log file, or the log files of a single user, and then compare the results manually. The proposed methods solve this problem: they enable evaluators to analyze the log files of multiple users together by detecting interaction patterns that commonly appear in the log files. To develop the methods, we first clarify the usability attributes that can be evaluated by log-based usability testing and the user interaction patterns that have to be detected for the evaluation. Based on an investigation of the information that can be obtained from the log files, we extract the attributes of clarity, safety, simplicity, and continuity. For the evaluation of clarity and safety, the interaction patterns that have to be detected include those arising from user errors. We then propose our methods for detecting interaction patterns in the log files of multiple users. Patterns that commonly appear in the log files are detected with a repeating pattern detection algorithm: by regarding the operation sequence recorded in each log file as a string and concatenating the strings, common patterns can be detected as repeating patterns in the concatenated string. We next describe the implementation of the methods in a computer tool for log-based usability testing. The tool, GUITESTER, records user-application interactions in log files, generates usability analysis data from the log files by applying the proposed methods, and visualizes the generated data. To show the effectiveness of GUITESTER in finding usability problems, we report an example of a usability test, in which evaluators found 14 problems in the tested GUI application. Finally, we discuss the log analysis efficiency of the proposed methods by comparing the analysis/sequence time (AT/ST) ratio of GUITESTER with those of other methods and tools. The ratio for GUITESTER is found to be smaller, which indicates that the methods make log analysis more efficient.
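The idea of treating each log as a string, concatenating the strings, and looking for repeated substrings can be illustrated with a small sketch. This is not the algorithm used in GUITESTER: it is a naive enumeration of operation n-grams, and the function name `common_patterns`, its parameters, and the operation names in the usage note are all hypothetical.

```python
from collections import defaultdict


def common_patterns(logs, min_len=2, min_users=2):
    """Return {pattern: set of user indices} for operation sequences
    that occur in the logs of at least `min_users` different users.

    logs: one list of operation names per user (hypothetical format).
    """
    # Concatenate all logs into one sequence, remembering which user
    # each position came from so that no pattern spans two logs.
    joined, owner = [], []
    for user, ops in enumerate(logs):
        joined.extend(ops)
        owner.extend([user] * len(ops))

    found = defaultdict(set)
    n = len(joined)
    for start in range(n):
        for end in range(start + min_len, n + 1):
            if owner[end - 1] != owner[start]:
                break  # the window would cross a log boundary
            found[tuple(joined[start:end])].add(owner[start])

    # Keep only the patterns shared by enough users.
    return {p: u for p, u in found.items() if len(u) >= min_users}
```

For example, with the three logs `["open", "copy", "paste", "save"]`, `["copy", "paste", "undo"]`, and `["open", "copy", "paste"]`, the sequence `("copy", "paste")` is reported for all three users. The enumeration is quadratic in the total log length, so a practical tool would presumably use a more efficient repeating pattern detection algorithm.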
An optimum design for an N-sheet (N arbitrary) capacitive Jaumann electromagnetic (EM) wave absorber using a genetic algorithm is presented. The genetic algorithm is a random optimization method modeled on genetic inheritance in humans. We show that, by using this algorithm and without imposing the double-notch design criteria, the bandwidth of the two-sheet capacitive Jaumann absorber can be expanded well beyond the 108% shown by Knott. We also show that our results approach Knott's results when we restrict the characteristic impedances and lengths of the lines to vary within a very narrow range. We also design one-sheet and three-sheet capacitive Jaumann absorbers. The only restriction used here is a meaningful range for each design variable. The advantage of this algorithm is that we can impose arbitrary restrictions on the ranges of the variables, so we can see how the performance depends on the size of the ranges and obtain different optimum results for different ranges. Finally, we obtain 20-dB attenuation bandwidths of more than 145% for the one-sheet, 173% for the two-sheet (compared with 108% obtained in [1]), and 193% for the three-sheet capacitive Jaumann EM absorbers, with acceptably narrow ranges for the variables. We design the one-sheet and two-sheet capacitive Jaumann absorbers at low frequency and the three-sheet absorber at high frequency. The 20-dB attenuation bandwidths obtained for the one-sheet and two-sheet capacitive Jaumann absorbers are from 10 to 77 MHz and from 4 to 61 MHz, respectively. For the three-sheet capacitive Jaumann absorber, the 20-dB attenuation bandwidth obtained is from 0.8 GHz to 280 GHz.
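How range restrictions on the design variables enter a genetic algorithm can be sketched as follows. This is a generic, minimal genetic algorithm and not the authors' implementation: the function `genetic_minimize`, its parameters, and the selection, crossover, and mutation rules are assumptions, and the cost function in the usage note is a stand-in, not an absorber reflection model.

```python
import random


def genetic_minimize(cost, bounds, pop_size=30, generations=60,
                     mutation=0.1, seed=0):
    """Minimize cost(x) with every variable confined to its (lo, hi) range."""
    rng = random.Random(seed)

    def clip(value, lo, hi):
        return max(lo, min(hi, value))

    # Random initial population inside the allowed ranges.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        parents = pop[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation scaled to each range, clipped to the range.
            child = [clip(g + rng.gauss(0, mutation * (hi - lo)), lo, hi)
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        pop = parents + children
    return min(pop, key=cost)
```

Calling `genetic_minimize(lambda v: (v[0] - 3) ** 2 + (v[1] + 1) ** 2, [(0, 2), (-5, 5)])` searches only inside the given ranges, so the first variable can never reach its unconstrained optimum of 3. Narrowing or widening `bounds` changes the optimum found, which is the kind of range-dependent behaviour the abstract describes.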
Jun-ichiro TORIWAKI Kensaku MORI
In this article we present a survey of medical image processing, with emphasis on applications of image generation and pattern recognition/understanding to computer aided diagnosis (CAD) and computer aided surgery (CAS). First, topics and fields of research in medical image processing are summarized. Second, the importance of 3D image processing and the use of the virtualized human body (VHB) is pointed out. Third, the visualization and observation methods of the VHB are introduced. In the fourth section, the virtualized endoscope system is presented from the viewpoint of observing the VHB with moving viewpoints. The fifth topic is the use of the VHB with deformation, such as the simulation of surgical operations, intra-operative aids, and image overlay. In the seventh section, several topics on image processing methodologies are introduced, including model generation, registration, segmentation, rendering, and the use of knowledge processing.
Miki YAMAMOTO Satoshi MACHIDA Hiromasa IKEDA
The DQRUMA (Distributed-Queueing Request Update Multiple Access) protocol has been proposed as an access protocol for wireless ATM local area networks (LANs). DQRUMA is well suited to transmitting fixed-length packets (e.g., ATM cells). However, it cannot be applied to a multimedia environment because it does not include any access control policy for multimedia traffic. In this paper, we propose a slot assignment scheme for the DQRUMA protocol in a wireless ATM LAN that supports integrated multimedia traffic with different service requirements. In this scheme, network resources can be allocated according to the service requirements of each medium because the base station assigns Transmit-Permission flexibly according to the features of each medium.
This paper addresses the important issue of estimating realistic grasping postures, and presents a methodology and algorithm to automate the generation of hand and body postures during the grasp of arbitrarily shaped objects. Predefined body postures stored in a database are generalized to adapt to a specific grasp using inverse kinematics. The reachable space is represented discretely by dividing it into small subvolumes, which makes it possible to construct the database. The paper also addresses some common problems of articulated figure animation. A new approach for body positioning with kinematic constraints on both hands is described. An efficient and accurate manipulation of joint constraints is presented. The results obtained are satisfactory, and some of them are shown in the paper. The proposed algorithms can find application in the motion of virtual actors, in all kinds of animation systems that include human motion, in robotics, and in other fields such as medicine, for instance to move the artificial limbs of handicapped people in a natural way.
This paper proposes and investigates a coding and decoding scheme that achieves adaptive unequal error protection (UEP) using several convolutional codes with different error-correcting capabilities. An appropriate encoder is selected to protect each frame of the information sequence according to the importance of the frame. Since supplemental information identifying the selected encoder is not sent, in order to reduce redundancy, the decoder does not know which encoder was used and has to estimate it. To estimate which encoder was used, a method using a biased metric in Viterbi decoding is proposed. In decoding, however, there is the problem of Decoder-Selection-Error (DSE), an error in which the decoder selected in the receiver does not correspond to the encoder used in the transmitter. An upper bound on the DSE rate in decoding is derived. The proposed decoding scheme using the biased metric in a trellis can improve the DSE rate and BER performance, because the transition probability of the encoders is taken into account in calculating the likelihood by biasing the branch or path metric. Computer simulation is employed to evaluate the BER performance and DSE rate of the proposed scheme. The performance is compared with that of a conventional equal error protection scheme and of a UEP scheme that sends the supplemental information on the encoder used. It is found that the proposed scheme achieves better performance than both in the case N=2.
Alberto TOMITA, Jr. Tsuyoshi EBINA Rokuya ISHII
In this paper we propose a method to aid a visually impaired person in the operation of a computer running a graphical user interface (GUI). It is based on image processing techniques, using images taken by a color camera placed over a Braille display. The shape of the user's hand is extracted from the image by analyzing the hue and saturation histograms. The orientation of the hand, given by an angle θ with the vertical axis, is calculated based on central moments. The image of the hand is then rotated to a normalized position. The number of pixels in each column of the normalized image is counted, and the result is put in a histogram. By analyzing the coefficient of asymmetry of this histogram, it can be determined whether the thumb is positioned along the pointing finger, or whether it is far from the other fingers. These two positions define two states that correspond to a mouse button up or down. In this way, by rotating the hand and moving the thumb, we can emulate the acts of moving a scroll bar and depressing a mouse button, respectively. These operations can be used to perform tasks in a GUI, such as cut-and-paste, for example. Experimental results show that this method is fast and efficient for the proposed application.
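The two measurements described above, the hand orientation from central moments and the asymmetry of the column histogram, can be sketched as follows. This is a minimal illustration under assumptions: the hand is given as a binary mask (a list of rows of 0/1 values), the function names are hypothetical, and the sign and range conventions of the angle may differ from those used by the authors.

```python
import math


def hand_orientation(mask):
    """Angle (radians) between the principal axis of the mask and the vertical axis."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    n = len(pts)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    # Second-order central moments of the hand region.
    mu20 = sum((x - cx) ** 2 for x, _ in pts)
    mu02 = sum((y - cy) ** 2 for _, y in pts)
    mu11 = sum((x - cx) * (y - cy) for x, y in pts)
    # Principal-axis angle measured from the vertical (y) axis.
    return 0.5 * math.atan2(2 * mu11, mu02 - mu20)


def column_skewness(mask):
    """Coefficient of asymmetry (skewness) of the per-column pixel counts."""
    counts = [sum(1 for row in mask if row[x]) for x in range(len(mask[0]))]
    total = sum(counts)
    mean = sum(x * c for x, c in enumerate(counts)) / total
    var = sum(c * (x - mean) ** 2 for x, c in enumerate(counts)) / total
    if var == 0:
        return 0.0  # all hand pixels lie in a single column
    third = sum(c * (x - mean) ** 3 for x, c in enumerate(counts)) / total
    return third / var ** 1.5
```

A mask whose column counts are symmetric gives a skewness of zero, while a region that extends to one side, as a thumb held away from the fingers would, gives a clearly non-zero value. Thresholding this value would then separate the two button states; the threshold itself is not given in the abstract.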
In order to investigate the nonlinearity and color responses of visual evoked potentials (VEPs), which have been useful in objectively detecting human color vision characteristics, a nonlinear system identification method was applied to VEPs elicited by isoluminant color stimuli, and the relationship between color stimuli and VEPs was examined. VEPs of normal subjects elicited by chromatically modulated stimuli were measured, and their binary kernels were estimated. Results showed that a system with chromatically modulated stimuli and VEP responses can be expressed by binary kernels up to the second order and that first- and second-order binary kernels depended on the color of the stimulus. The characteristics of second-order kernels reflected the difference between two chromatic channels. Opponent-color responses were included in first-order binary kernels, suggesting that they could be used as an index to test human color vision.
Hitoshi SAJI Hiromasa NAKATANI
In this paper, a new method for measuring three-dimensional (3D) moving facial shapes is introduced. This method uses two light sources and a slit pattern projector. First, the normal vectors at points on a face are computed by the photometric stereo method with two light sources and a conventional video camera. Next, multiple light stripes are projected onto the face with a slit pattern projector. The 3D coordinates of the points on the stripes are measured using the stereo vision algorithm. The normal vectors are then integrated within 2D finite intervals around the measured points on the stripes. The 3D curved segment within each finite interval is computed by the integration. Finally, all the curved segments are blended into the complete facial shape using a family of exponential functions. By switching the light rays at high speed, the time required for sampling data can be reduced, and the 3D shape of a moving human face at each instant can be measured.
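The step of integrating normal vectors to recover a curved segment can be illustrated in one dimension. This is a simplified sketch, not the authors' method: it integrates along a single row with the trapezoidal rule rather than within 2D intervals, it assumes normals given as (nx, ny, nz) tuples with nz non-zero, and the function name and parameters are hypothetical.

```python
def integrate_profile(normals, z0=0.0, dx=1.0):
    """Recover heights along one row from surface normals.

    normals: (nx, ny, nz) tuples at equally spaced points along x.
    z0: known height at the first point, e.g. a point measured on a light stripe.
    """
    # For a surface z(x, y) the normal is proportional to (-dz/dx, -dz/dy, 1),
    # so the slope along x is -nx / nz.
    slopes = [-nx / nz for nx, _, nz in normals]
    heights = [z0]
    for a, b in zip(slopes, slopes[1:]):
        heights.append(heights[-1] + 0.5 * (a + b) * dx)  # trapezoidal rule
    return heights
```

For the plane z = 2x, whose normal is proportional to (-2, 0, 1), three samples starting from height 0 give heights 0, 2, and 4. Anchoring `z0` at a measured stripe point is what ties the integrated segment to absolute 3D coordinates.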
To calculate Nν(x) of complex order ν numerically, we must calculate Df{JN+ε(x)}. This letter analyzes the error of Df{JN+ε(x)} when it is calculated by the recurrence method, and determines the optimum number of recurrences.
Kenichi MASE James P. CUNNINGHAM Judy CANTOR Hiromichi KAWANO Joseph P. ROTELLIA Tetsuo OKAZAKI Timothy J. LIPETZ Yuji HATAKEYAMA
This study clarifies the effects of network complexity and network map transformation on the ability of network managers to use graphic network displays. Maps of Japan and the United States with outlines of their respective prefectures or states were displayed on a CRT. Each map displayed a fictitious network of nodes and their interconnections. These networks were two-level hierarchical and non-meshed, meaning that each low-level node was connected to a single high-level node, but not all high-level nodes were linked together. The subjects' task was to identify a path between two low-level nodes. In each trial, two low-level nodes were highlighted, and the subject attempted to find the shortest path between these nodes. This was done by using a mouse to select intermediate nodes. Completing a path required a minimum of 4 node traversals. Three variables were manipulated. The first was the number of nodes, defined as the total number of low-level nodes in a network (70, 150, or 200). The second variable was the level of transformation. Very densely populated areas of the maps were systematically transformed to reduce congestion. There were three levels of transformation. The final variable was the country map used, that is, the map of Japan or the map of the United States. Several behavioral measures were used. The most informative appeared to be the time required to complete a path (the response time) and how often subjects returned to previous portions of a path (back-ups). For both of these measures, the data pattern was essentially the same. Increasing the number of nodes hurt performance. This was particularly pronounced when the map of Japan was tested. However, as the level of transformation increased, this effect was substantially reduced or completely eliminated. The results are discussed in terms of engineering rules and guidelines for designing graphical network representations.
Yasuhiko YASUDA Takayuki YASUNO Fumio KATAYAMA Takashi TOIDA Hideyuki SAKATA
Intending to contribute to the construction of better multimedia network systems, we propose a new concept of an image database system whose storage features exponential, or graceful, oblivion and abrupt recollection, like human memory. By virtue of this storage property, which is realized by employing hierarchical (pyramidal) image coding, database memory and transmission costs can be significantly reduced. In this paper we describe the details of the concept, the results of a theoretical analysis based on a simplified model, which reveals the effectiveness of the proposed system, the structure of an experimental prototype system, and the results of an experimental image retrieval service carried out by implementing it over high-speed ATM channels.
Kang-Hyun JO Kentaro HAYASHI Yoshinori KUNO Yoshiaki SHIRAI
This paper presents a vision-based human interface system that enables a user to move a target object in a 3D CG world by moving his hand. The system can interpret hand motions both in a frame fixed in the world and in a frame attached to the user. If the latter is chosen, the user can move the object forward by moving his hand forward even if he has changed his body position. In addition, the user does not have to worry about keeping his hand in the camera's field of view: the active camera system tracks the user to keep him in its field of view. Moreover, the system does not need any camera calibration. The key to realizing a system with such features is a set of vision algorithms based on the multiple view affine invariance theory. We demonstrate an experimental system as well as the vision algorithms. Human operation experiments show the usefulness of the system.
Tsutomu MIYASATO Haruo NOMA Fumio KISHINO
This paper describes the results of tests that measured the allowable delay between images and tactile information via a force feedback device. In order to investigate the allowable delay, two experiments were performed: 1) subjective evaluation in real space and 2) subjective evaluation in virtual space using a force feedback device.
Tatsuhiro YONEKURA Rikako NARISAWA Yoshiki WATANABE
This paper proposes a new three-dimensional pointing device that emphasizes user friendliness and freedom from cable clutter. The proposed method provides five degrees of freedom through the medium of the non-verbal human voice: the spatial direction of the sound source, the type of the voice phoneme, and the tone of the voice phoneme are used. The input voice is analyzed with respect to these factors and then produces the effects previously defined for the human interface. In this paper, the estimated spatial direction provides three degrees of freedom for the three-dimensional movement of a virtual object, and the type and the tone of the voice phoneme provide the remaining two degrees of freedom. Since producing non-verbal voice is an everyday task, and the intonation of the voice can be controlled easily and intentionally, the proposed scheme offers a new medium for three-dimensional spatial interaction. In this sense, this paper realizes a cost-effective and handy non-verbal interface scheme that requires no worn equipment, which might cause physical and psychological fatigue. Using a prototype, the authors evaluate the performance of the scheme from both static and dynamic points of view, show some advantages in look and feel, and discuss possible applications of the proposed scheme.
This paper discusses a coding-based selection approach to a communication aid for the severely motor disabled. Several approaches, including row-column scanning, are briefly described, and then a new selection scheme based on the theory of adaptive coding is proposed. The schemes are compared with each other with respect to the average number of switch activations needed to generate some text samples.
Alberto TOMITA, Jr. Rokuya ISHII
This paper proposes a human interface in which a novel input method substitutes for conventional input devices. Because it is based on image processing techniques, it overcomes the deficiencies of physical devices. The proposed interface is composed of three parts: extraction of a person's hand shape from a digitized image, detection of the fingertip, and interpretation by a software application. First, images of a pointing hand are digitized to obtain a sequence of monochrome frames. In each frame, the hand is isolated from the background by means of gray-level slicing, with threshold values calculated dynamically by a combination of movement detection and histogram analysis. The advantage of this approach is that the system adapts itself to any user and compensates for any changes in the illumination, whereas in conventional methods the threshold values are defined in advance or markers have to be attached to the hand to give reference points. Second, once the hand is isolated, fingertip coordinates are extracted by scanning the image. Third, the coordinates are input to an application interface. Overall, as the algorithms are simple and only monochrome images are used, the amount of processing is kept low, making this system suitable for real-time processing without expensive hardware.
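The slicing and scanning steps can be sketched as follows. This is a minimal illustration under assumptions: the thresholds `low` and `high` are passed in rather than computed dynamically from movement detection and histogram analysis, the finger is assumed to point toward the top of the image, and both function names are hypothetical.

```python
def slice_gray_levels(frame, low, high):
    """Gray-level slicing: mark pixels whose level lies in [low, high]."""
    return [[1 if low <= p <= high else 0 for p in row] for row in frame]


def find_fingertip(mask):
    """Scan rows from the top; return (x, y) of the first hand pixels found."""
    for y, row in enumerate(mask):
        xs = [x for x, v in enumerate(row) if v]
        if xs:
            # Take the middle of the hand pixels on the topmost occupied row.
            return (sum(xs) // len(xs), y)
    return None  # no hand pixels in this frame
```

Applied to each monochrome frame in turn, `find_fingertip(slice_gray_levels(frame, low, high))` yields the coordinates that would be passed to the application. Both steps are single passes over the image, which is consistent with the low processing cost claimed above.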
Masao KODAMA Masayuki YAMASATO Shinya YAMASHIRO
We frequently need to calculate the Neumann function Nν(x) of complex order ν numerically in order to solve boundary problems on electromagnetic fields. This paper presents a new method for the numerical calculation of Nν(x) of complex order ν. This method can calculate Nν(x) precisely even when the order ν is close to an integer n, and the resulting algorithm is very simple.
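Why orders close to an integer are delicate can be seen from the standard definition Nν(x) = [Jν(x) cos νπ − J−ν(x)] / sin νπ, in which both the numerator and the denominator tend to zero as ν approaches an integer. The sketch below evaluates this definition directly. It is not the method proposed in the paper: it is restricted to real, non-integer ν and moderate x > 0, it uses the power series for Jν, and the function names are hypothetical.

```python
import math


def bessel_j(nu, x, terms=40):
    """Power series for J_nu(x); assumes real non-integer nu and moderate x > 0."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k
                  / (math.factorial(k) * math.gamma(k + nu + 1))
                  * (x / 2) ** (2 * k + nu))
    return total


def neumann(nu, x):
    """N_nu(x) from the defining formula; inaccurate when nu is near an integer."""
    s = math.sin(nu * math.pi)
    return (bessel_j(nu, x) * math.cos(nu * math.pi) - bessel_j(-nu, x)) / s
```

For ν = 1/2 the result can be checked against the closed form N1/2(x) = −sqrt(2/(πx)) cos x. When ν lies very close to an integer, the subtraction in the numerator cancels most significant digits, which is the situation a dedicated method has to handle.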
Sumio OHNO Keikichi HIROSE Hiroya FUJISAKI
In conventional word-spotting methods for automatic recognition of continuous speech, individual frames or segments of the input speech are assigned labels and local likelihood scores solely on the basis of their own acoustic characteristics. On the other hand, experiments on human speech perception conducted by the present authors and others show that human perception of words in connected speech is based not only on the acoustic characteristics of individual segments, but also on the acoustic and linguistic contexts in which these segments occur. In other words, individual segments are not correctly perceived by humans unless they are accompanied by their context. These findings on the process of human speech perception have to be applied in automatic speech recognition in order to improve its performance. From this point of view, the present paper proposes a new scheme for detecting words in continuous speech based on template matching, where the likelihood of each segment of a word is determined not only by its own characteristics but also by the likelihood of its context within the framework of a word. This is accomplished by modifying the likelihood score of each segment by the likelihood score of its phonetic context, the latter representing the degree of similarity of the context to that of a candidate word in the lexicon. The higher the likelihood score of the context, the more the segmental likelihood score is enhanced. The advantage of the proposed scheme over conventional schemes is demonstrated by an experiment on constructing a word lattice using connected Japanese speech uttered by a male speaker. The result indicates that the scheme is especially effective in giving correct recognition in cases where there are two or more candidate words that are almost equal in raw segmental likelihood scores.
Ryo MIZUTANI Tsutomu MATSUMOTO
Password checking schemes are human identification methods commonly adopted in many information systems. One of their disadvantages is that an attacker who has correctly observed an input password can freely impersonate the corresponding user. To overcome this, interactive human identification schemes have been proposed: a human prover who has a secret key is asked a question by a machine verifier, which then checks whether the answer from the prover matches the question with respect to the key. This letter examines one such scheme that requires relatively little effort from human provers. Through computer experiments, this letter evaluates its resistance against one type of attack: after observing several pairs of questions and correct answers, how successfully can an attacker answer the next question?