
Thursday, October 11, 2012

A helmet mounted display system with active gaze control for visual telepresence


Julian P. Brooker, Paul M. Sharkey, John P. Wann, Annaliese M. Plooy
Mechatronics, 9, 1999
Summary
Teleoperation has interesting potential: it enables human operators to perform delicate physical manipulation without being physically present where the task is carried out. It therefore becomes fundamental to provide an unobstructed and natural viewpoint in order to obtain high performance and safety.
There is experimental evidence that proprioceptive information for hand localization may be biased by the stereo vergence typical of human vision, so two solutions are possible: using a stereographic display, or using a helmet mounted display containing a separate lightweight image display for each eye.
A helmet mounted display has the advantage of allowing the user to gaze around the scene without head movement being restricted; the problems, common to teleoperation in general, are related to low LCD screen resolution.
In designing a head mounted display, two variables have to be taken into account: 1) the inter-camera distance (ICD) and the inter-display distance (IDD), which have to be matched to the observer’s inter-pupil distance (IPD); 2) the field of view (FOV) of the camera configuration, which has to be the same as the FOV of the display configuration.
In telepresence environments a high-quality binocular image of elements in the near viewing field is more important than a wide-angled view of the distant viewing field; however, as the FOV is reduced, the percentage of binocular overlap on an object in the near viewing field also decreases.
To restore that overlap the cameras have to verge; a problem in verging the two mounted cameras is that the viewed object tends to appear distorted to the human observer. We can compute d, the minimum distance at which the target is fully viewed, as d = (p + w) / (2 tan(a/2)), where w is the target’s width, p is the operator’s IPD and a is the horizontal angle of view.
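As a back-of-the-envelope check, the formula is easy to evaluate; here is a minimal Python sketch (the IPD, target width and FOV values are invented for the example, not taken from the paper):

```python
import math

def min_viewing_distance(ipd, target_width, fov_deg):
    """d = (p + w) / (2 tan(a/2)): minimum distance at which a target
    of width w is fully visible to both verged cameras."""
    a = math.radians(fov_deg)
    return (ipd + target_width) / (2.0 * math.tan(a / 2.0))

# Example: 65 mm IPD, 10 cm wide target, 40 degree horizontal FOV
print(min_viewing_distance(0.065, 0.10, 40.0))  # ~0.23 m
```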
One proposed solution for camera vergence would be to track the operator’s eyes so that the camera geometry could be optimized to the situation, but this would require moving the screens and cameras synchronously, which introduces multiple problems.
An alternative is electronic image translation: this avoids mechanically moving parts, horizontal image translation can be achieved with minimal additional image-processing hardware, and such a system is likely to outperform a mechanical one. The main drawbacks are that images may appear deformed and that LCD displays have low resolution, which led the authors to finally opt for mechanical display positioning.
With a mechanical system it has to be taken into account that each display has to face the pupil directly and therefore rotate with it; infrared sensors are used for this purpose. Potential problems are related to the small movements the eye performs, such as flicks, drifts and tremors: a system capable of tracking all these movements proves too sensitive and therefore unable to maintain a proper, stable track.
The eye-tracking apparatus for each eye is located behind a half-silvered mirror; in the prototype built by the authors, the problem of the overall size of the system has not been addressed.
The control algorithm iteratively tunes the parameters of a PID controller in order to give damped responses on the camera axes.
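The paper’s summary does not reproduce the controller equations; the sketch below is a generic discrete PID loop with invented gains, only to make the tuning target (a damped axis response) concrete:

```python
class PID:
    """Minimal discrete PID controller for a single camera axis."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)

# Hypothetical gains; the authors tune theirs iteratively until the
# camera-axis response is well damped.
controller = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.01)
command = controller.update(setpoint=0.2, measured=0.15)  # rad
```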
Key Concepts
Virtual Reality
Key Results
The system has been tested and allows teleoperation to be performed with good quality and results; the IDD and IPD have to be calibrated to match the operator, as do the rotational centers.

Monday, October 8, 2012

Human-Assisted Virtual Environment Modeling


J.G. Wang, Y.F. Li
Autonomous Robots, No. 6, 1999

Summary
The paper proposes a man-machine interaction scheme based on a stereo-vision system, where the operator’s knowledge about the scene is used as guidance for modelling a 3D environment.
Virtual Environment (VE) modelling appears to be a key point in many robotic systems, especially in regard to tele-robotics. There has been much research on how to build a VE from vision sensors while exploring unknown environments, and on semi-automatic modelling with minimum human interaction. A good example is an integrated robotic manipulator system (Chen and Trivedi, 1993; Trivedi and Chen, 1993) that uses virtual reality visualization to create advanced, flexible and intelligent user interfaces. An interactive modelling system was also proposed to model remote physical environments through two CCD cameras, where edge information is used for stereo matching and triangulation to extract shape information, but that system was constrained to camera motion along the Z axis only.
In the proposed system the operator provides a minimal set of cues about the features and information the manipulator or mobile robot may encounter. The procedure first builds local models from different viewpoints and later composes these local models into a global model of the environment; once the environment has been constructed virtually, the operator can fully concentrate on teleoperation.
Considering the use of two cameras, left and right, two transformation matrices [HR] and [HL] can be obtained, mapping the 3D coordinates of feature points in the world frame W to their known image coordinates. Collecting the 3D vectors in W as [V3D] and the corresponding 2D image vectors as [V2D], the least-squares estimate is [H] = [V2D][V3D]^T([V3D][V3D]^T)^-1, computed separately for the left and right cameras. Conversely, once [HR] and [HL] are available, the position [X] = [x, y, z] of a feature in W can be calculated from its corresponding image coordinates (xa, ya) and (xb, yb) as [X] = ([A]^T[A])^-1[A]^T[B], where [A] and [B] are built from the image coordinates and the entries of the two matrices.
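In matrix form both steps are one-line least-squares solutions. Here is a minimal numpy sketch of the two formulas; the homogeneous point layout (4 x N world points, 3 x N image points) is an assumption, since the paper’s exact conventions are not reproduced in this summary:

```python
import numpy as np

def estimate_projection(V3D, V2D):
    """Calibration: [H] = [V2D][V3D]^T ([V3D][V3D]^T)^-1.
    V3D: 4 x N homogeneous world points, V2D: 3 x N homogeneous
    image points; done once per camera (left and right)."""
    return V2D @ V3D.T @ np.linalg.inv(V3D @ V3D.T)

def triangulate(A, B):
    """Reconstruction: [X] = ([A]^T[A])^-1 [A]^T [B], with A and B
    stacked from [HL], [HR] and the two image points."""
    return np.linalg.solve(A.T @ A, A.T @ B)
```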
A major difficulty in stereo vision, however, is the correspondence problem between feature points in the two images, which automatic matching solves with poor robustness. A human operator can instead identify objects in most scenes, prompting the vision system to locate some object attributes or special corresponding features, so that the image coordinates can be extracted and the 3D position in W calculated.
The binocular stereo vision system, once guided by the operator to some prompted corresponding features, can construct local models of objects directly. The system works by recognizing primitive solids, from which composite models can later be computed. The authors introduce the cuboid (for which four points are detected) and the sphere (whose determination in 3D space is obtained through knowledge of its radius and center), both recoverable through geometrical calculations and transformations. Vertices of objects are found through the intersection of corresponding lines; for more complicated objects the operator’s guidance can be used. In general a single viewpoint cannot successfully represent a 3D object, so more than one is required, and Multi-Viewpoint Modelling is used: between two viewpoints (for instance A and B) a transformation M holds, and in determining M the rotation and translation are solved separately, as sketched below. If C and C’ represent the coordinates of a point as seen from viewpoints A and B, then C’ = M’C and W = M’W’; after some computation M = [R T], with R the rotational component and T the translational component.
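The paper solves R and T separately but the summary does not spell out the method; an SVD-based Procrustes fit on matched points from the two viewpoints is one standard way to do it, sketched here:

```python
import numpy as np

def rigid_transform(C, C_prime):
    """Estimate R, T such that C' = R C + T, given matched 3 x N point
    sets seen from viewpoints A and B; rotation and translation are
    solved separately."""
    c0 = C.mean(axis=1, keepdims=True)
    c1 = C_prime.mean(axis=1, keepdims=True)
    H = (C - c0) @ (C_prime - c1).T          # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])  # avoid reflections
    R = Vt.T @ D @ U.T
    T = c1 - R @ c0
    return R, T
```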
Key Concepts
Machine vision, Human Robot Interaction
Key Results
Performance can be evaluated either through the difference between points and their images or through the difference between measured and real object sizes. The system also works for insertion tasks with an error of 0.6 mm; should a more precise system be needed, force sensing would be required. Operators can use this methodology to observe the real environment from any viewpoint in the virtual reality system.

Information Sharing via Projection Function for Coexistence of Robot and Human


Yujin Wakita, Shigeoki Hirai, Takashi Suehiro, Toshio Hori
Autonomous Robots, No. 10, 2001

Summary
The authors introduce the concept of safety based on intelligent augmentation of robotic systems. In previous studies they introduced tele-robotic systems (1992, 1995, 1996), where a robot is operated from another position with no physical contact and is monitored through a television, and intelligent monitoring (1992), a system conveying only the required information through selection of the data. An extension of the latter is the snapshot function (1995), where a laser pointer helps, in teaching mode, to estimate the deviation of the position while the operator moves the robot, teaching the estimated relative deviation. A further development is the projection function proposed here (2001), where a robot and a human operate jointly on a Digital Desk, a special environment provided with a projector mounted perpendicular to the working table and a speaker. The aim of this research is to achieve intelligent augmentation in order to prevent and avoid undesirable contact; information sharing is a fundamental aspect of cooperative tasks between a person and a robot (Wakita, 1998). The experiments test a human and a robot operating through mainly five states (initial, approach, grasp, release and final), sketched below; the main issues to be solved are that the person does not know the delivery coordinates, that the person must keep holding the object until it is released, and that the person might be frightened by the robot’s movement.
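The paper only names the five states; the transition conditions in this small Python sketch are my assumptions, included just to make the delivery sequence concrete:

```python
from enum import Enum, auto

class HandoverState(Enum):
    INITIAL = auto()
    APPROACH = auto()
    GRASP = auto()
    RELEASE = auto()
    FINAL = auto()

def next_state(state, at_delivery_point, operator_holding):
    """Hypothetical transition logic for the five-state delivery task."""
    if state is HandoverState.INITIAL:
        return HandoverState.APPROACH
    if state is HandoverState.APPROACH and at_delivery_point:
        return HandoverState.GRASP
    if state is HandoverState.GRASP and operator_holding:
        return HandoverState.RELEASE  # finger force sensors confirm the handover
    if state is HandoverState.RELEASE:
        return HandoverState.FINAL
    return state
```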
The projection function consists of projecting onto the table simulated images of the moving robot, so that the human operator knows the robot’s trajectory in real time and understands the delivery path. Force sensors in the robot’s fingers allow the robot to understand when the object has been grasped by the operator. A new teaching method is also introduced: the operator activates the teaching mode by touching the robot’s hand; then, instead of the manipulator being physically moved, the projected image of the robot follows the operator’s hand to the destination. The advantage is that only the model is required and no robot movement; the robot confirms through the speaker that the taught trajectory has been saved.
The force sensors are an efficient communication channel only during grasping; visual monitoring appears to be necessary for the entire delivery task.
It can be observed that humans in cooperation require visual feedback in order to understand that their motion and activity have been understood: each person expects to be observed during their actions. Visual information thus appears to be extremely important for perception, and it enhances the safety of the system.
The Digital Desk helps once again in monitoring and signalling between robot and human: while operating, a symbol (in the experiment a white rectangle) is projected onto the hand of the operator when the robot has detected an action; in this way the human is aware that the robot knows about his or her presence.
To perform the experiment, a CCD camera was used to detect the human’s hand and the robot’s position, together with a video projector (SANYO LP-SG60) mounted on the ceiling parallel to the camera.
The system, as programmed, projects a white rectangle onto the human’s hand once the CCD camera and the computer have performed the detection, while a stationary hand is recognized as the delivery position.
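The paper does not describe the vision pipeline in detail; the following OpenCV sketch shows one plausible shape for the detect-and-project loop. The background subtraction, the blob-area threshold, and the assumption that camera and projector pixels coincide are all mine, not the authors’:

```python
import cv2
import numpy as np

cap = cv2.VideoCapture(0)                     # CCD camera looking at the desk
subtractor = cv2.createBackgroundSubtractorMOG2()

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)            # foreground = moving hand
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    projector_image = np.zeros_like(frame)
    for c in contours:
        if cv2.contourArea(c) > 2000:         # assume a large blob is the hand
            x, y, w, h = cv2.boundingRect(c)
            # white rectangle projected onto the detected hand
            cv2.rectangle(projector_image, (x, y), (x + w, y + h),
                          (255, 255, 255), -1)
    cv2.imshow("projector", projector_image)  # would be sent to the projector
    if cv2.waitKey(1) == 27:                  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```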
Key Concepts
Human-Robot Interaction, Human-Robot Cooperation, Team Working
Key Results
The experiment is useful in highlighting the importance of communication between robots and humans working together, a communication which also needs visual feedback in order to ensure safety. A large part of communication is in fact performed not only directly, but also through indirect feedback showing that the message has been properly received. Future research may involve adding further information to the system.

Thursday, October 4, 2012

Binocular Rivalry and Head-Worn Displays


Robert Patterson, Byron Pierce, Robert Fox
Human Factors: The Journal of the Human Factors and Ergonomics Society
Summary
A number of virtual-system technologies are used for augmented reality through artificial devices; one of the most advanced concepts in this field is the Head-Worn Display (HWD). When information is presented to only one eye, the device is called a monocular HWD; if the same image is displayed to both eyes it is defined as a biocular HWD, which differs from the binocular HWD, where the two images are displayed with binocular parallax. Different kinds of devices are on the market, with different applications and several advantages, but the technology is not without problems: it is known to cause headaches, nausea, eyestrain and dizziness, and studies have been performed on the IHADSS (Integrated Helmet and Display Sighting System), used by a large number of helicopter pilots who reported such problems. The problems are mainly due to interocular differences, i.e. the two eyes receiving different stimulation because of unnatural viewing. When the two eyes receive different stimulation, binocular rivalry precludes binocular fusion, producing a state of competition between the two eyes.
Luning is the phenomenon in which the contour seen by one eye continuously covers the background area seen by the other; this, together with binocular rivalry, constitutes the most troublesome problem of partial-overlap displays, defining conditions of exclusive visibility and mixed visibility (respectively, the case in which one eye’s image is exclusively visible and the case in which portions of both eyes’ views are visible).
Binocular rivalry is mainly caused by interocular differences in orientation, hue, luminance, contrast polarity, form, size, motion velocity, light levels, etc. Rivalry is generally not provoked by stimuli lasting less than 200 ms, or by flickered repetitive stimuli. It generally occurs with both monocular and binocular HWDs with partial overlap, and it is still an open question whether HWDs should be calibrated to the user’s sighting dominance. It has been demonstrated (Melzer and Moffitt, 1997; Klymenko, Harding, Beasley, Martin and Rash, 1999) that luning appears less with a convergent design and that such overlap also results in better performance (convergent binocular overlap is when the left eye views the right monocular flanking region and the right eye views the left monocular flanking region, while divergent binocular overlap is when the left eye views the left monocular flanking region and the right eye views the right monocular region).
Rivalry could be reduced through various proposed solutions; a significant one, proposed by Kooi (1993), is the use of a fusible window frame in both binocularly and monocularly viewed scenes.
From the point of view of the stimulus, it has been demonstrated that a stimulus in one eye will dominate a rival stimulus in the other eye if the former possesses greater contour density, higher contrast, a wider range of spatial frequencies or faster motion. It remains to be investigated whether presenting information on a monocular HWD reduces the possibility of binocular rivalry, or whether it masks information from the real world (visible thanks to the transparency of the displayed information).
The use of repeated brief exposures and high stimulus contrast can minimize or eliminate suppression of the displayed information; for this reason, opaque monocular HWDs should not be used.
Two kinds of cognitive variables have also been investigated: stimuli which are familiar or meaningful, and stimuli which engage voluntary attention.
Blake (1998) employed a dichoptic reading paradigm in which one eye views meaningful text and the other meaningless text; the experiment showed that the meaningful text had no special status and that the user would lose attention on it.
It has also been demonstrated that practice over 10 days may help individuals control the rate of rivalry alternations and thus manage the phenomenon. Target detection and recognition have been shown to be delayed by binocular rivalry for some simple tasks, but the influence on performance still has to be verified.
Key Concepts
Binocular rivalry, Augmented Reality, Virtual Reality

Wednesday, October 3, 2012

Interaction With an Immersive Virtual Environment Corrects Users’ Distance Estimates


Adam R. Richardson, David Waller
Human Factors: The Journal of the Human Factors and Ergonomics Society
Summary
Virtual environments enable users to interact with spatial information in ways that other computer interfaces cannot. VE applications are interesting for many purposes: besides the more practical and direct applications, they can be successfully employed in research on human perception, cognition and social interaction.
There is a big issue to be addressed, though: operators underestimate distances in VEs, and the reasons are still not fully understood.
Originally, researchers suggested that the reason could be a limited field of view, errors in accommodation, the lack of accurate binocular stereo images, or limits in the resolution and quality of the displays. Only recently did Richardson and Waller (2005) suggest that the underestimation may lie not in the technology but in the user himself; they were not able to explain the reason, however, and their method appeared to work only in repetitive tasks with feedback assistance.
The authors propose two experiments. In the first, participants estimated distances to virtual targets before and after walking to various targets in the VE, with the aim of testing whether there were improvements.
In the second experiment the authors checked whether the underestimation effect and its correction transferred to other means of estimating distance. Distance was measured nonverbally by asking participants to walk blindfolded to a previously perceived target (a method often used in egocentric distance experiments, since it minimizes potential biases and higher-level cognitive strategies).
The apparatus provided a textured ground plane containing a target post placed at different distances; the environment was presented in a Virtual Research V8 HMD with a binocular stereo image of the scene (basically one display per eye).
The first experiment was designed with a pre-interaction phase (involving estimates of egocentric distance), an interaction phase (where participants walked towards the desired location) and a post-interaction phase (where the pre-interaction tests were carried out again using different distances).
The results show that in a VE distances are initially underestimated, but after practice in the environment users give more precise estimates (nearly fully correct).
The reason may lie in an explicit strategy (for example, the user may decide to always walk further than the target really looks).
In experiment 2 the subjects were asked to indicate distances in two ways: by walking blindfolded directly to the target, and by triangulation by walking (the observer first views the target and then, with no vision, traverses a path oblique to the target; when instructed, the observer turns to face the target and walks the necessary steps).
The design of this experiment is again similar to that of experiment 1.
The results show that participants’ distance estimates were formed with respect to a percept of the target location that remained independent of the response method; this is in contrast with Richardson and Waller (2005).
Key Concepts
Virtual Environment, Distance underestimation in virtual environments
Key Results
ANOVA results show that implicit feedback appears to be more precise than explicit feedback in such an environment; it is therefore believed that the problem of distance underestimation is not permanent: with training, simple calibration is enough to enable users to alter their perception in the VE.