1.
IEEE Trans Pattern Anal Mach Intell ; 45(11): 13250-13264, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37363835

ABSTRACT

Image view synthesis has seen great success in reconstructing photorealistic visuals, thanks to deep learning and various novel representations. The next key step in immersive virtual experiences is view synthesis of dynamic scenes. However, several challenges exist due to the lack of high-quality training datasets, and the additional time dimension for videos of dynamic scenes. To address this issue, we introduce a multi-view video dataset, captured with a custom 10-camera rig at 120 FPS. The dataset contains 96 high-quality scenes showing various visual effects and human interactions in outdoor scenes. We develop a new algorithm, Deep 3D Mask Volume, which enables temporally-stable view extrapolation from binocular videos of dynamic scenes, captured by static cameras. Our algorithm addresses the temporal inconsistency of disocclusions by identifying the error-prone areas with a 3D mask volume, and replaces them with static background observed throughout the video. Our method enables manipulation in 3D space as opposed to simple 2D masks. We demonstrate better temporal stability than frame-by-frame static view synthesis methods, or those that use 2D masks. The resulting view synthesis videos show minimal flickering artifacts and allow for larger translational movements.

2.
IEEE Trans Pattern Anal Mach Intell ; 44(9): 5747-5760, 2022 Sep.
Article in English | MEDLINE | ID: mdl-33956625

ABSTRACT

In this paper we present methods for estimating shape from polarisation and shading information, i.e. photo-polarimetric shape estimation, under varying, but unknown, illumination, i.e. in an uncalibrated scenario. We propose several alternative photo-polarimetric constraints that depend upon the partial derivatives of the surface and show how to express them in a unified system of partial differential equations of which previous work is a special case. By careful combination and manipulation of the constraints, we show how to eliminate non-linearities such that a discrete version of the problem can be solved using linear least squares. We derive a minimal, combinatorial approach for two source illumination estimation which we use with RANSAC for robust light direction and intensity estimation. We also introduce a new method for estimating a polarisation image from multichannel data and provide methods for estimating albedo and refractive index. We evaluate lighting, shape, albedo and refractive index estimation methods on both synthetic and real-world data showing improvements over existing state-of-the-art.

3.
IEEE Trans Pattern Anal Mach Intell ; 43(2): 701-711, 2021 Feb.
Article in English | MEDLINE | ID: mdl-31380744

ABSTRACT

We propose a novel algorithm for stabilizing selfie videos. Our goal is to automatically generate stabilized video that has optimally smooth motion in the sense of both foreground and background. The key insight is that non-rigid foreground motion in selfie videos can be analyzed using a 3D face model, and background motion can be analyzed using optical flow. We use the second derivative of the temporal trajectory of selected pixels as the measure of smoothness. Our algorithm stabilizes selfie videos by minimizing the smoothness measure of the background, regularized by the motion of the foreground. Experiments show that our method outperforms state-of-the-art general video stabilization techniques in selfie videos.
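
The smoothness measure mentioned in this abstract can be illustrated with a minimal sketch. It assumes a one-dimensional pixel trajectory sampled once per frame and approximates the second derivative with a central finite difference; the function name `smoothness` and the sample trajectories are hypothetical and are not taken from the paper, which may define the measure differently.

```python
def smoothness(trajectory):
    """Sum of squared second differences of a 1D trajectory.

    Each interior sample t contributes (x[t-1] - 2*x[t] + x[t+1])**2,
    a finite-difference stand-in for the squared second derivative.
    Lower values mean smoother motion.
    """
    return sum(
        (trajectory[t - 1] - 2 * trajectory[t] + trajectory[t + 1]) ** 2
        for t in range(1, len(trajectory) - 1)
    )


# Hypothetical x-coordinates of one tracked pixel over five frames.
steady = [0.0, 1.0, 2.0, 3.0, 4.0]  # constant velocity
shaky = [0.0, 1.5, 1.8, 3.4, 4.0]   # jittery motion

steady_cost = smoothness(steady)
shaky_cost = smoothness(shaky)
```

A constant-velocity trajectory has zero cost under this measure, so minimizing it favours steady camera motion rather than a frozen camera.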

4.
PLoS Comput Biol ; 16(4): e1007756, 2020 Apr.
Article in English | MEDLINE | ID: mdl-32251448

ABSTRACT

Recent advances in electron microscopy have enabled the imaging of single cells in 3D at nanometer length scale resolutions. An uncharted frontier for in silico biology is the ability to simulate cellular processes using these observed geometries. Enabling such simulations requires watertight meshing of electron micrograph images into 3D volume meshes, which can then form the basis of computer simulations of such processes using numerical techniques such as the finite element method. In this paper, we describe the use of our recently rewritten mesh processing software, GAMer 2, to bridge the gap between poorly conditioned meshes generated from segmented micrographs and boundary marked tetrahedral meshes which are compatible with simulation. We demonstrate the application of a workflow using GAMer 2 to a series of electron micrographs of neuronal dendrite morphology explored at three different length scales and show that the resulting meshes are suitable for finite element simulations. This work is an important step towards making physical simulations of biological processes in realistic geometries routine. Innovations in algorithms to reconstruct and simulate cellular length scale phenomena based on emerging structural data will enable realistic physical models and advance discovery at the interface of geometry and cellular processes. We posit that a new frontier at the intersection of computational technologies and single cell biology is now open.


Subjects
Image Processing, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Algorithms; Computer Simulation; Dendrites/physiology; Diffusion; Finite Element Analysis; Humans; Models, Biological; Models, Theoretical; Software; Surgical Mesh

5.
IEEE Trans Pattern Anal Mach Intell ; 41(12): 2875-2888, 2019 Dec.
Article in English | MEDLINE | ID: mdl-30188812

ABSTRACT

We present a method for estimating surface height directly from a single polarisation image simply by solving a large, sparse system of linear equations. To do so, we show how to express polarisation constraints as equations that are linear in the unknown height. The local ambiguity in the surface normal azimuth angle is resolved globally when the optimal surface height is reconstructed. Our method is applicable to dielectric objects exhibiting diffuse and specular reflectance, though lighting and albedo must be known. We relax this requirement by showing that either spatially varying albedo or illumination can be estimated from the polarisation image alone using nonlinear methods. In the case of illumination, the estimate can only be made up to a binary ambiguity which we show is a generalised Bas-relief transformation corresponding to the convex/concave ambiguity. We believe that our method is the first passive, monocular shape-from-x technique that enables well-posed height estimation with only a single, uncalibrated illumination condition. We present results on real world data, including in uncontrolled, outdoor illumination.

6.
IEEE Trans Pattern Anal Mach Intell ; 40(3): 740-754, 2018 Mar.
Article in English | MEDLINE | ID: mdl-28320650

ABSTRACT

Light-field cameras have recently emerged as a powerful tool for one-shot passive 3D shape capture. However, obtaining the shape of glossy objects like metals or plastics remains challenging, since standard Lambertian cues like photo-consistency cannot be easily applied. In this paper, we derive a spatially-varying (SV)BRDF-invariant theory for recovering 3D shape and reflectance from light-field cameras. Our key theoretical insight is a novel analysis of diffuse plus single-lobe SVBRDFs under a light-field setup. We show that, although direct shape recovery is not possible, an equation relating depths and normals can still be derived. Using this equation, we then propose using a polynomial (quadratic) shape prior to resolve the shape ambiguity. Once shape is estimated, we also recover the reflectance. We present extensive synthetic data on the entire MERL BRDF dataset, as well as a number of real examples to validate the theory, where we simultaneously recover shape and BRDFs from a single image taken with a Lytro Illum camera.

7.
IEEE Trans Pattern Anal Mach Intell ; 39(9): 1880-1891, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28114056

ABSTRACT

Photometric stereo is widely used for 3D reconstruction. However, its use in scattering media such as water, biological tissue and fog has been limited until now, because of forward scattered light from both the source and object, as well as light scattered back from the medium (backscatter). Here we make three contributions to address the key modes of light propagation, under the common single scattering assumption for dilute media. First, we show through extensive simulations that single-scattered light from a source can be approximated by a point light source with a single direction. This alleviates the need to handle light source blur explicitly. Next, we model the blur due to scattering of light from the object. We measure the object point-spread function and introduce a simple deconvolution method. Finally, we show how imaging fluorescence emission, where available, eliminates the backscatter component and increases the signal-to-noise ratio. Experimental results in a water tank, with different concentrations of scattering media added, show that deconvolution produces higher-quality 3D reconstructions than previous techniques, and that when combined with fluorescence, can produce results similar to those in clear water even for highly turbid media.

8.
IEEE Trans Pattern Anal Mach Intell ; 39(3): 546-560, 2017 Mar.
Article in English | MEDLINE | ID: mdl-27101598

ABSTRACT

Light-field cameras are quickly becoming commodity items, with consumer and industrial applications. They capture many nearby views simultaneously using a single image with a micro-lens array, thereby providing a wealth of cues for depth recovery: defocus, correspondence, and shading. In particular, apart from conventional image shading, one can refocus images after acquisition, and shift one's viewpoint within the sub-apertures of the main lens, effectively obtaining multiple views. We present a principled algorithm for dense depth estimation that combines defocus and correspondence metrics. We then extend our analysis to the additional cue of shading, using it to refine fine details in the shape. By exploiting an all-in-focus image, in which pixels are expected to exhibit angular coherence, we define an optimization framework that integrates photo consistency, depth consistency, and shading consistency. We show that combining all three sources of information: defocus, correspondence, and shading, outperforms state-of-the-art light-field depth estimation algorithms in multiple scenarios.

9.
IEEE Trans Pattern Anal Mach Intell ; 38(11): 2170-2181, 2016 Nov.
Article in English | MEDLINE | ID: mdl-26761194

ABSTRACT

Light-field cameras have become widely available in both consumer and industrial applications. However, most previous approaches do not model occlusions explicitly, and therefore fail to capture sharp object boundaries. A common assumption is that for a Lambertian scene, a pixel will exhibit photo-consistency, which means all viewpoints converge to a single point when focused to its depth. However, in the presence of occlusions this assumption fails to hold, making most current approaches unreliable precisely where accurate depth information is most important - at depth discontinuities. In this paper, an occlusion-aware depth estimation algorithm is developed; the method also enables identification of occlusion edges, which may be useful in other applications. It can be shown that although photo-consistency is not preserved for pixels at occlusions, it still holds in approximately half the viewpoints. Moreover, the line separating the two view regions (occluded object versus occluder) has the same orientation as that of the occlusion edge in the spatial domain. By ensuring photo-consistency in only the occluded view region, depth estimation can be improved. Occlusion predictions can also be computed and used for regularization. Experimental results show that our method outperforms current state-of-the-art light-field depth estimation algorithms, especially near occlusion boundaries.

10.
IEEE Trans Pattern Anal Mach Intell ; 38(6): 1155-69, 2016 Jun.
Article in English | MEDLINE | ID: mdl-26372203

ABSTRACT

Light-field cameras have now become available in both consumer and industrial applications, and recent papers have demonstrated practical algorithms for depth recovery from a passive single-shot capture. However, current light-field depth estimation methods are designed for Lambertian objects and fail or degrade for glossy or specular surfaces. The standard Lambertian photoconsistency measure considers the variance of different views, effectively enforcing point-consistency, i.e., that all views map to the same point in RGB space. This variance or point-consistency condition is a poor metric for glossy surfaces. In this paper, we present a novel theory of the relationship between light-field data and reflectance from the dichromatic model. We present a physically-based and practical method to estimate the light source color and separate specularity. We present a new photo consistency metric, line-consistency, which represents how viewpoint changes affect specular points. We then show how the new metric can be used in combination with the standard Lambertian variance or point-consistency measure to give us results that are robust against scenes with glossy surfaces. With our analysis, we can also robustly estimate multiple light source colors and remove the specular component from glossy objects. We show that our method outperforms current state-of-the-art specular removal and depth estimation algorithms in multiple real world scenarios using the consumer Lytro and Lytro Illum light field cameras.
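
The standard Lambertian variance (point-consistency) measure that this abstract describes can be sketched as follows. This is only an illustration of the baseline measure, not of the line-consistency metric the paper proposes. It assumes that, for each candidate depth, the RGB samples of one scene point have already been gathered from the different views; the names `point_consistency`, `best_depth` and `candidates` are hypothetical.

```python
def point_consistency(samples):
    """Total variance of RGB samples taken from different views.

    A value of zero means every view maps to the same point in RGB
    space, which is what a Lambertian surface at the correct depth
    would ideally produce.
    """
    n = len(samples)
    means = [sum(s[c] for s in samples) / n for c in range(3)]
    return sum((s[c] - means[c]) ** 2 for s in samples for c in range(3)) / n


def best_depth(samples_by_depth):
    """Pick the candidate depth whose samples have the lowest variance."""
    return min(
        samples_by_depth,
        key=lambda depth: point_consistency(samples_by_depth[depth]),
    )


# Hypothetical samples: at depth 2.0 all three views agree exactly.
candidates = {
    1.0: [(0.2, 0.4, 0.6), (0.9, 0.1, 0.3), (0.5, 0.5, 0.5)],
    2.0: [(0.5, 0.5, 0.5), (0.5, 0.5, 0.5), (0.5, 0.5, 0.5)],
}
chosen = best_depth(candidates)
```

For a glossy point the samples shift with the viewpoint even at the correct depth, so this variance stays high, which is the weakness the abstract attributes to the point-consistency condition.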

11.
IEEE Trans Vis Comput Graph ; 20(12): 2624-33, 2014 Dec.
Article in English | MEDLINE | ID: mdl-26356976

ABSTRACT

We present a method for automatically identifying and validating predictive relationships between the visual appearance of a city and its non-visual attributes (e.g. crime statistics, housing prices, population density etc.). Given a set of street-level images and (location, city-attribute-value) pairs of measurements, we first identify visual elements in the images that are discriminative of the attribute. We then train a predictor by learning a set of weights over these elements using non-linear Support Vector Regression. To perform these operations efficiently, we implement a scalable distributed processing framework that speeds up the main computational bottleneck (extracting visual elements) by an order of magnitude. This speedup allows us to investigate a variety of city attributes across 6 different American cities. We find that indeed there is a predictive relationship between visual elements and a number of city attributes including violent crime rates, theft rates, housing prices, population density, tree presence, graffiti presence, and the perception of danger. We also test human performance for predicting theft based on street-level images and show that our predictor outperforms this baseline with 33% higher accuracy on average. Finally, we present three prototype applications that use our system to (1) define the visual boundary of city neighborhoods, (2) generate walking directions that avoid or seek out exposure to city attributes, and (3) validate user-specified visual elements for prediction.

12.
IEEE Trans Pattern Anal Mach Intell ; 35(12): 2941-55, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24136432

ABSTRACT

This paper presents a comprehensive theory of photometric surface reconstruction from image derivatives in the presence of a general, unknown isotropic BRDF. We derive precise topological classes up to which the surface may be determined and specify exact priors for a full geometric reconstruction. These results are the culmination of a series of fundamental observations. First, we exploit the linearity of chain rule differentiation to discover photometric invariants that relate image derivatives to the surface geometry, regardless of the form of isotropic BRDF. For the problem of shape-from-shading, we show that a reconstruction may be performed up to isocontours of constant magnitude of the gradient. For the problem of photometric stereo, we show that just two measurements of spatial and temporal image derivatives, from unknown light directions on a circle, suffice to recover surface information from the photometric invariant. Surprisingly, the form of the invariant bears a striking resemblance to optical flow; however, it does not suffer from the aperture problem. This photometric flow is shown to determine the surface up to isocontours of constant magnitude of the surface gradient, as well as isocontours of constant depth. Further, we prove that specification of the surface normal at a single point completely determines the surface depth from these isocontours. In addition, we propose practical algorithms that require additional initial or boundary information, but recover depth from lower order derivatives. Our theoretical results are illustrated with several examples on synthetic and real data.

13.
IEEE Trans Pattern Anal Mach Intell ; 35(3): 555-67, 2013 Mar.
Article in English | MEDLINE | ID: mdl-22665721

ABSTRACT

We propose a new method named compressive structured light for recovering inhomogeneous participating media. Whereas conventional structured light methods emit coded light patterns onto the surface of an opaque object to establish correspondence for triangulation, compressive structured light projects patterns into a volume of participating medium to produce images which are integral measurements of the volume density along the line of sight. For a typical participating medium encountered in the real world, the integral nature of the acquired images enables the use of compressive sensing techniques that can recover the entire volume density from only a few measurements. This makes the acquisition process more efficient and enables reconstruction of dynamic volumetric phenomena. Moreover, our method requires the projection of multiplexed coded illumination, which has the added advantage of increasing the signal-to-noise ratio of the acquisition. Finally, we propose an iterative algorithm to correct for the attenuation of the participating medium during the reconstruction process. We show the effectiveness of our method with simulations as well as experiments on the volumetric recovery of multiple translucent layers, 3D point clouds etched in glass, and the dynamic process of milk drops dissolving in water.

14.
IEEE Trans Vis Comput Graph ; 18(10): 1591-1602, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22144527

ABSTRACT

We present an algorithm to render objects made of transparent materials with rough surfaces in real-time, under all-frequency distant illumination. Rough surfaces cause wide scattering as light enters and exits objects, which significantly complicates the rendering of such materials. We present two contributions to approximate the successive scattering events at interfaces, due to rough refraction: First, an approximation of the Bidirectional Transmittance Distribution Function (BTDF), using spherical Gaussians, suitable for real-time estimation of environment lighting using preconvolution; second, a combination of cone tracing and macrogeometry filtering to efficiently integrate the scattered rays at the exiting interface of the object. We demonstrate the quality of our approximation by comparison against stochastic ray tracing. Furthermore we propose two extensions to our method for supporting spatially varying roughness on object surfaces and local lighting for thin objects.

15.
IEEE Trans Pattern Anal Mach Intell ; 33(10): 2122-8, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21670483

ABSTRACT

Inverse light transport seeks to undo global illumination effects, such as interreflections, that pervade images of most scenes. This paper presents the theoretical and computational foundations for inverse light transport as a dual of forward rendering. Mathematically, this duality is established through the existence of underlying Neumann series expansions. Physically, it can be shown that each term of our inverse series cancels an interreflection bounce, just as the forward series adds them. While the convergence properties of the forward series are well known, we show that the oscillatory convergence of the inverse series leads to more interesting conditions on material reflectance. Conceptually, the inverse problem requires the inversion of a large light transport matrix, which is impractical for realistic resolutions using standard techniques. A natural consequence of our theoretical framework is a suite of fast computational algorithms for light transport inversion--analogous to finite element radiosity, Monte Carlo and wavelet-based methods in forward rendering--that rely at most on matrix-vector multiplications. We demonstrate two practical applications, namely, separation of individual bounces of the light transport and fast projector radiometric compensation, to display images free of global illumination artifacts in real-world environments.
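
As a generic illustration of inverting an operator with a Neumann series using only matrix-vector products, the sketch below solves (I + S) x = b through the alternating series x = b - S b + S^2 b - ... It is not the paper's light transport formulation: the matrix `S`, the vector `b` and the helper names are hypothetical, and the series is assumed to converge, which requires the spectral radius of S to be below 1.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def neumann_solve(s, b, terms=50):
    """Approximate x such that (I + s) x = b.

    Uses the truncated series x = sum_k (-s)^k b, so the matrix is
    never inverted explicitly. The alternating signs are what make
    the partial sums oscillate around the solution.
    """
    x = list(b)
    term = list(b)
    for _ in range(terms):
        term = [-t for t in matvec(s, term)]
        x = [xi + ti for xi, ti in zip(x, term)]
    return x


# Hypothetical 2x2 example with small off-diagonal coupling.
S = [[0.0, 0.2], [0.3, 0.0]]
b = [1.0, 2.0]
x = neumann_solve(S, b)

# Residual of (I + S) x = b; it should be close to zero.
residual = [xi + si - bi for xi, si, bi in zip(x, matvec(S, x), b)]
```

Each extra term costs one matrix-vector product, so the work grows linearly with the number of terms kept.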

16.
IEEE Trans Pattern Anal Mach Intell ; 30(2): 197-213, 2008 Feb.
Article in English | MEDLINE | ID: mdl-18084053

ABSTRACT

This paper develops a theory of frequency domain invariants in computer vision. We derive novel identities using spherical harmonics, which are the angular frequency domain analog to common spatial domain invariants such as reflectance ratios. These invariants are derived from the spherical harmonic convolution framework for reflection from a curved surface. Our identities apply in a number of canonical cases, including single and multiple images of objects under the same and different lighting conditions. One important case we consider is two different glossy objects in two different lighting environments. For this case, we derive a novel identity, independent of the specific lighting configurations or BRDFs, that allows us to directly estimate the fourth image if the other three are available. The identity can also be used as an invariant to detect tampering in the images. While this paper is primarily theoretical, it has the potential to lay the mathematical foundations for two important practical applications. First, we can develop more general algorithms for inverse rendering problems, which can directly relight and change material properties by transferring the BRDF or lighting from another object or illumination. Second, we can check the consistency of an image, to detect tampering or image splicing.

17.
IEEE Trans Vis Comput Graph ; 13(3): 595-609, 2007.
Article in English | MEDLINE | ID: mdl-17356224

ABSTRACT

The properties of virtually all real-world materials change with time, causing their bidirectional reflectance distribution functions (BRDFs) to be time varying. However, none of the existing BRDF models and databases take time variation into consideration; they represent the appearance of a material at a single time instance. In this paper, we address the acquisition, analysis, modeling, and rendering of a wide range of time-varying BRDFs (TVBRDFs). We have developed an acquisition system that is capable of sampling a material's BRDF at multiple time instances, with each time sample acquired within 36 sec. We have used this acquisition system to measure the BRDFs of a wide range of time-varying phenomena, which include the drying of various types of paints (watercolor, spray, and oil), the drying of wet rough surfaces (cement, plaster, and fabrics), the accumulation of dusts (household and joint compound) on surfaces, and the melting of materials (chocolate). Analytic BRDF functions are fit to these measurements and the model parameters' variations with time are analyzed. Each category exhibits interesting and sometimes nonintuitive parameter trends. These parameter trends are then used to develop analytic TVBRDF models. The analytic TVBRDF models enable us to apply effects such as paint drying and dust accumulation to arbitrary surfaces and novel materials.

18.
IEEE Trans Pattern Anal Mach Intell ; 28(8): 1287-302, 2006 Aug.
Article in English | MEDLINE | ID: mdl-16886864

ABSTRACT

Three-dimensional appearance models consisting of spatially varying reflectance functions defined on a known shape can be used in analysis-by-synthesis approaches to a number of visual tasks. The construction of these models requires the measurement of reflectance, and the problem of recovering spatially varying reflectance from images of known shape has drawn considerable interest. To date, existing methods rely on either: (1) low-dimensional (e.g., parametric) reflectance models, or (2) large data sets involving thousands of images (or more) per object. Appearance models based on the former have limited accuracy and generality since they require the selection of a specific reflectance model a priori, and while approaches based on the latter may be suitable for certain applications, they are generally too costly and cumbersome to be used for image analysis. We present an alternative approach that seeks to combine the benefits of existing methods by enabling the estimation of a nonparametric spatially varying reflectance function from a small number of images. We frame the problem as scattered-data interpolation in a mixed spatial and angular domain, and we present a theory demonstrating that the angular accuracy of a recovered reflectance function can be increased in exchange for a decrease in its spatial resolution. We also present a practical solution to this interpolation problem using a new representation of reflectance based on radial basis functions. This representation is evaluated experimentally by testing its ability to predict appearance under novel view and lighting conditions. Our results suggest that since reflectance typically varies slowly from point to point over much of an object's surface, we can often obtain a nonparametric reflectance function from a sparse set of images. In fact, in some cases, we can obtain reasonable results in the limiting case of only a single input image.


Subjects
Artificial Intelligence; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Lighting/methods; Pattern Recognition, Automated/methods; Photometry/methods; Algorithms; Information Storage and Retrieval/methods; Reproducibility of Results; Sensitivity and Specificity; Subtraction Technique

19.
IEEE Trans Pattern Anal Mach Intell ; 27(2): 288-95, 2005 Feb.
Article in English | MEDLINE | ID: mdl-15688567

ABSTRACT

Cast shadows can be significant in many computer vision applications, such as lighting-insensitive recognition and surface reconstruction. Nevertheless, most algorithms neglect them, primarily because they involve nonlocal interactions in nonconvex regions, making formal analysis difficult. However, many real instances map closely to canonical configurations like a wall, a V-groove type structure, or a pitted surface. In particular, we experiment with 3D textures like moss, gravel, and a kitchen sponge, whose surfaces include canonical configurations like V-grooves. This paper takes a first step toward a formal analysis of cast shadows, showing theoretically that many configurations can be mathematically analyzed using convolutions and Fourier basis functions. Our analysis exposes the mathematical convolution structure of cast shadows and shows strong connections to recent signal-processing frameworks for reflection and illumination.


Subjects
Algorithms; Artificial Intelligence; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Information Storage and Retrieval/methods; Pattern Recognition, Automated/methods; Computer Graphics; Fourier Analysis; Image Enhancement/methods; Light; Reproducibility of Results; Sensitivity and Specificity; Signal Processing, Computer-Assisted

20.
IEEE Trans Pattern Anal Mach Intell ; 27(2): 296-302, 2005 Feb.
Article in English | MEDLINE | ID: mdl-15688568

ABSTRACT

Depth from triangulation has traditionally been investigated in a number of independent threads of research, with methods such as stereo, laser scanning, and coded structured light considered separately. In this paper, we propose a common framework called spacetime stereo that unifies and generalizes many of these previous methods. To show the practical utility of the framework, we develop two new algorithms for depth estimation: depth from unstructured illumination change and depth estimation in dynamic scenes. Based on our analysis, we show that methods derived from the spacetime stereo framework can be used to recover depth in situations in which existing methods perform poorly.


Subjects
Algorithms; Artificial Intelligence; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Pattern Recognition, Automated/methods; Photogrammetry/methods; Subtraction Technique; Video Recording/methods; Computer Simulation; Image Enhancement/methods; Information Storage and Retrieval/methods; Reproducibility of Results; Sensitivity and Specificity