Results 1 - 19 of 19
1.
J Vis ; 24(4): 23, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38662346

ABSTRACT

This paper reviews projection models and their perception in realistic pictures, and proposes hypotheses for three-dimensional (3D) shape and space perception in pictures. In these hypotheses, eye fixations and foveal vision play a central role. Many past theories and experimental studies focus solely on linear perspective. Yet, these theories fail to explain many important perceptual phenomena, including the effectiveness of nonlinear projections. Indeed, few classical paintings strictly obey linear perspective, nor do the best distortion-avoidance techniques for wide-angle computational photography. The hypotheses here employ a two-stage model for 3D human vision. When viewing a picture, the first stage perceives 3D shape for the current gaze. Each fixation has its own perspective projection, but, owing to the nature of foveal and peripheral vision, shape information is obtained primarily for a small region of the picture around the fixation. As a viewer moves their eyes, the second stage continually integrates some of the per-gaze information into an overall interpretation of a picture. The interpretation need not be geometrically stable or consistent over time. It is argued that this framework could explain many disparate pictorial phenomena, including different projection styles throughout art history and computational photography, while being consistent with the constraints of human 3D vision. The paper reviews open questions and suggests new studies to explore these hypotheses.


Subject(s)
Fixation, Ocular , Humans , Fixation, Ocular/physiology , Form Perception/physiology , Depth Perception/physiology , Space Perception/physiology , Eye Movements/physiology , Fovea Centralis/physiology
2.
IEEE Comput Graph Appl ; 44(1): 76-85, 2024.
Article in English | MEDLINE | ID: mdl-38271154

ABSTRACT

Computing occluding contours is often a crucial step in stroke-based artistic 3-D stylization for movies, video games, and visualizations. However, many existing applications use only simple curve stylization techniques, such as thin black lines or hand-animated strokes. This is because sophisticated procedural stylization requires accurate curve topology, which has long been an unsolved research problem. This article describes a recent theoretical breakthrough in the topology problem. Specifically, the new theory points out that existing contour algorithms often generate curves that cannot have any valid visibility, and new algorithms show how to correct the problem. This article surveys classes of algorithms that can compute contours accurately and identifies new research opportunities.

3.
IEEE Comput Graph Appl ; 43(6): 112-116, 2023.
Article in English | MEDLINE | ID: mdl-37930893

ABSTRACT

Computer graphics research frequently evaluates research outputs with user studies, often through online crowdworking platforms. When performed carefully and thoughtfully, studies on human behavior and preferences provide valuable insights, useful for both developing and evaluating new tools. Yet, I argue that many of the current studies are performative: they result from reviewers' expectation that "papers should have some evaluation," not from careful thought about the value and usefulness of the studies themselves. These casually done studies are often uninformative or misleading, while putting undue burden on authors and reviewers. The expectation of positive user evaluation results can also inhibit creative new work. I call for reviewers to be more thoughtful about asking for user studies, for authors to be more thoughtful when they perform studies, and for our field to conduct new research and create new guidelines on when and how user studies are genuinely useful.

4.
Science ; 380(6650): 1110-1111, 2023 Jun 16.
Article in English | MEDLINE | ID: mdl-37319193

ABSTRACT

Understanding shifts in creative work will help guide AI's impact on the media ecosystem.

5.
J Vis ; 22(11): 10, 2022 Oct 04.
Article in English | MEDLINE | ID: mdl-36251307

ABSTRACT

Photography is often understood as an objective recording of light measurements, in contrast with the subjective nature of painting. This article argues that photography entails making the same kinds of choices of color, tone, and perspective as in painting, and surveys examples from film photography and smartphone cameras. Hence, understanding picture perception requires treating photography as just one way to make pictures. More research is needed to understand the effects of these choices on pictorial perception, which in turn could lead to the design of new imaging techniques.


Subject(s)
Photography , Humans , Photography/methods
6.
Perception ; 50(3): 266-275, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33706622

ABSTRACT

It has often been conjectured that the effectiveness of line drawings can be explained by the similarity of edge images to line drawings. This article presents several problems with explaining line drawing perception in terms of edges, and how the recently proposed Realism Hypothesis resolves these problems. There is nonetheless existing evidence that edges are often the best features for predicting where people draw lines; this article describes how the Realism Hypothesis can explain this evidence.


Subject(s)
Art , Humans , Perception
7.
IEEE Trans Pattern Anal Mach Intell ; 43(7): 2388-2399, 2021 Jul.
Article in English | MEDLINE | ID: mdl-31902756

ABSTRACT

Layout is important for graphic design and scene generation. We propose a novel Generative Adversarial Network, called LayoutGAN, that synthesizes layouts by modeling geometric relations of different types of 2D elements. The generator of LayoutGAN takes as input a set of randomly-placed 2D graphic elements, represented by vectors, and uses self-attention modules to refine their labels and geometric parameters jointly to produce a realistic layout. Accurate alignment is critical for good layouts. We, thus, propose a novel differentiable wireframe rendering layer that maps the generated layout to a wireframe image, upon which a CNN-based discriminator is used to optimize the layouts in image space. We validate the effectiveness of LayoutGAN in various experiments including MNIST digit generation, document layout generation, clipart abstract scene generation, tangram graphic design, mobile app layout design, and webpage layout optimization from hand-drawn sketches.

8.
Perception ; 49(4): 439-451, 2020 Apr.
Article in English | MEDLINE | ID: mdl-32126897
9.
IEEE Trans Vis Comput Graph ; 25(5): 1817-1827, 2019 May.
Article in English | MEDLINE | ID: mdl-30843842

ABSTRACT

We present a method for adding parallax and real-time playback of 360° videos in Virtual Reality headsets. In current video players, the playback does not respond to translational head movement, which reduces the feeling of immersion, and causes motion sickness for some viewers. Given a 360° video and its corresponding depth (provided by current stereo 360° stitching algorithms), a naive image-based rendering approach would use the depth to generate a 3D mesh around the viewer, then translate it appropriately as the viewer moves their head. However, this approach breaks at depth discontinuities, showing visible distortions, whereas cutting the mesh at such discontinuities leads to ragged silhouettes and holes at disocclusions. We address these issues by improving the given initial depth map to yield cleaner, more natural silhouettes. We rely on a three-layer scene representation, made up of a foreground layer and two static background layers, to handle disocclusions by propagating information from multiple frames for the first background layer, and then inpainting for the second one. Our system works with input from many of today's most popular 360° stereo capture devices (e.g., Yi Halo or GoPro Odyssey), and works well even if the original video does not provide depth information. Our user studies confirm that our method provides a more compelling viewing experience than without parallax, increasing immersion while reducing discomfort and nausea.

10.
IEEE Trans Vis Comput Graph ; 25(7): 2419-2429, 2019 Jul.
Article in English | MEDLINE | ID: mdl-29993550

ABSTRACT

Graphic design tools provide powerful controls for expert-level design creation, but the options can often be overwhelming for novices. This paper proposes Context-Aware Asset Search tools that take the current state of the user's design into account, thereby providing search and selections that are compatible with the current design and better fit the user's needs. In particular, we focus on image search and color selection, two tasks that are central to design. We learn a model for compatibility of images and colors within a design, using crowdsourced data. We then use the learned model to rank image search results or color suggestions during design. We found counterintuitive behavior using conventional training with pairwise comparisons for image search, where models with and without compatibility performed similarly. We describe a data collection procedure that alleviates this problem. We show that our method outperforms baseline approaches in quantitative evaluation, and we also evaluate a prototype interactive design tool.

11.
IEEE Trans Pattern Anal Mach Intell ; 37(12): 2415-27, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26539847

ABSTRACT

We propose an efficient optimization algorithm to select a subset of training data as the inducing set for sparse Gaussian process regression. Previous methods either use different objective functions for inducing set and hyperparameter selection, or else optimize the inducing set by gradient-based continuous optimization. The former approaches are harder to interpret and suboptimal, whereas the latter cannot be applied to discrete input domains or to kernel functions that are not differentiable with respect to the input. The algorithm proposed in this work estimates an inducing set and the hyperparameters using a single objective. It can be used to optimize either the marginal likelihood or a variational free energy. Space and time complexity are linear in training set size, and the algorithm can be applied to large regression problems on discrete or continuous domains. Empirical evaluation shows state-of-the-art performance in discrete cases, and competitive prediction results as well as a favorable trade-off between training and test time in continuous cases.

12.
IEEE Trans Vis Comput Graph ; 20(8): 1200-13, 2014 Aug.
Article in English | MEDLINE | ID: mdl-26357371

ABSTRACT

This paper presents an approach for automatically creating graphic design layouts using a new energy-based model derived from design principles. The model includes several new algorithms for analyzing graphic designs, including the prediction of perceived importance, alignment detection, and hierarchical segmentation. Given the model, we use optimization to synthesize new layouts for a variety of single-page graphic designs. Model parameters are learned with Nonlinear Inverse Optimization (NIO) from a small number of example layouts. To demonstrate our approach, we show results for applications including generating design layouts in various styles, retargeting designs to new sizes, and improving existing designs. We also compare our automatic results with designs created using crowdsourcing and show that our approach performs slightly better than novice designers.

13.
IEEE Trans Vis Comput Graph ; 19(8): 1405-14, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23744269

ABSTRACT

This paper presents the first method for full-body trajectory optimization of physics-based human motion that does not rely on motion capture, specified key-poses, or periodic motion. Optimization is performed using a small set of simple goals, for example, one hand should be on the ground, or the center-of-mass should be above a particular height. These objectives are applied to short spacetime windows which can be composed to express goals over an entire animation. Specific contact locations needed to achieve objectives are not required by our method. We show that the method can synthesize many different kinds of movement, including walking, hand walking, breakdancing, flips, and crawling. Most of these movements have never been previously synthesized by physics-based methods.


Subject(s)
Dancing , Image Processing, Computer-Assisted/methods , Movement , Video Recording/methods , Humans
14.
IEEE Trans Vis Comput Graph ; 19(1): 56-66, 2013 Jan.
Article in English | MEDLINE | ID: mdl-22450822

ABSTRACT

Imagining what a proposed home remodel might look like without actually performing it is challenging. We present an image-based remodeling methodology that allows real-time photorealistic visualization during both the modeling and remodeling process of a home interior. Large-scale edits, like removing a wall or enlarging a window, are performed easily and in real time, with realistic results. Our interface supports the creation of concise, parameterized, and constrained geometry, as well as remodeling directly from within the photographs. Real-time texturing of modified geometry is made possible by precomputing view-dependent textures for all faces that are potentially visible to each original camera viewpoint, blending multiple viewpoints and hole-filling when necessary. The resulting textures are stored and accessed efficiently enabling intuitive real-time realistic visualization, modeling, and editing of the building interior.

15.
IEEE Trans Vis Comput Graph ; 18(3): 475-87, 2012 Mar.
Article in English | MEDLINE | ID: mdl-21383408

ABSTRACT

This paper presents an interactive system for creating painterly animation from video sequences. Previous approaches to painterly animation typically emphasize either purely automatic stroke synthesis or purely manual stroke key framing. Our system supports a spectrum of interaction between these two approaches which allows the user more direct control over stroke synthesis. We introduce an approach for controlling the results of painterly animation: keyframed Control Strokes can affect the placement, orientation, movement, and color of automatic strokes. Furthermore, we introduce a new automatic synthesis algorithm that traces strokes through a video sequence in a greedy manner, but, instead of a vector field, uses an objective function to guide placement. This allows the method to capture fine details, respect region boundaries, and achieve greater temporal coherence than previous methods. All editing is performed with a WYSIWYG interface where the user can directly refine the animation. We demonstrate a variety of examples using both automatic and user-guided results, with a variety of styles and source videos.

16.
IEEE Trans Pattern Anal Mach Intell ; 32(6): 1060-71, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20431131

ABSTRACT

This paper describes a photometric stereo method designed for surfaces with spatially-varying BRDFs, including surfaces with both varying diffuse and specular properties. Our optimization-based method builds on the observation that most objects are composed of a small number of fundamental materials by constraining each pixel to be representable by a combination of at most two such materials. This approach recovers not only the shape but also material BRDFs and weight maps, yielding accurate rerenderings under novel lighting conditions for a wide variety of objects. We demonstrate examples of interactive editing operations made possible by our approach.

17.
IEEE Trans Pattern Anal Mach Intell ; 30(5): 878-92, 2008 May.
Article in English | MEDLINE | ID: mdl-18369256

ABSTRACT

This paper describes methods for recovering time-varying shape and motion of non-rigid 3D objects from uncalibrated 2D point tracks. For example, given a video recording of a talking person, we would like to estimate the 3D shape of the face at each instant, and learn a model of facial deformation. Time-varying shape is modeled as a rigid transformation combined with a non-rigid deformation. Reconstruction is ill-posed if arbitrary deformations are allowed, and thus additional assumptions about deformations are required. We first suggest restricting shapes to lie within a low-dimensional subspace, and describe estimation algorithms. However, this restriction alone is insufficient to constrain reconstruction. To address these problems, we propose a reconstruction method using a Probabilistic Principal Components Analysis (PPCA) shape model, and an estimation algorithm that simultaneously estimates 3D shape and motion for each instant, learns the PPCA model parameters, and robustly fills in missing data points. We then extend the model to capture temporal dynamics in object shape, allowing the algorithm to robustly handle severe cases of missing data.


Subject(s)
Algorithms , Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Motion , Pattern Recognition, Automated/methods , Computer Simulation , Image Enhancement/methods , Models, Statistical , Reproducibility of Results , Sensitivity and Specificity
18.
IEEE Trans Pattern Anal Mach Intell ; 30(2): 283-98, 2008 Feb.
Article in English | MEDLINE | ID: mdl-18084059

ABSTRACT

We introduce Gaussian process dynamical models (GPDM) for nonlinear time series analysis, with applications to learning models of human pose and motion from high-dimensional motion capture data. A GPDM is a latent variable model. It comprises a low-dimensional latent space with associated dynamics, and a map from the latent space to an observation space. We marginalize out the model parameters in closed form, using Gaussian process priors for both the dynamics and the observation mappings. This results in a non-parametric model for dynamical systems that accounts for uncertainty in the model. We demonstrate the approach, and compare four learning algorithms on human motion capture data in which each pose is 50-dimensional. Despite the use of small data sets, the GPDM learns an effective representation of the nonlinear dynamics in these spaces.


Subject(s)
Models, Biological , Movement/physiology , Algorithms , Artificial Intelligence , Biomedical Engineering , Computer Simulation , Gait/physiology , Humans , Linear Models , Nonlinear Dynamics , Video Recording , Walking/physiology
19.
IEEE Trans Pattern Anal Mach Intell ; 27(8): 1254-64, 2005 Aug.
Article in English | MEDLINE | ID: mdl-16119264

ABSTRACT

This paper presents a technique for computing the geometry of objects with general reflectance properties from images. For surfaces with varying material properties, a full segmentation into different material types is also computed. It is assumed that the camera viewpoint is fixed, but the illumination varies over the input sequence. It is also assumed that one or more example objects with similar materials and known geometry are imaged under the same illumination conditions. Unlike most previous work in shape reconstruction, this technique can handle objects with arbitrary and spatially-varying BRDFs. Furthermore, the approach works for arbitrary distant and unknown lighting environments. Finally, almost no calibration is needed, making the approach exceptionally simple to apply.


Subject(s)
Algorithms , Artificial Intelligence , Colorimetry/methods , Imaging, Three-Dimensional/methods , Information Storage and Retrieval/methods , Pattern Recognition, Automated/methods , Photogrammetry/methods , Photometry/methods , Cluster Analysis , Computer Simulation , Demography , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Models, Statistical