Results 1 - 6 of 6
1.
Article in English | MEDLINE | ID: mdl-38083134

ABSTRACT

As technology advances and sensing devices improve, it is becoming increasingly important to ensure accurate positioning of these devices, especially within the human body. This task remains particularly difficult during manual, minimally invasive procedures such as cystoscopies, where only a hand-guided, monocular endoscopic camera image is available. Tracking must therefore rely on optical localization methods; however, existing classical approaches do not function well in such a dynamic, non-rigid environment. This work builds on recent work using neural networks to learn supervised depth estimation from synthetically generated images and, in a second training step, uses adversarial training to transfer the network to real images. The improvements made to a synthetic cystoscopic environment are designed to reduce the domain gap between the synthetic images and the real ones. Training with the proposed enhanced environment shows distinct improvements over previously published work when applied to real test images.


Subject(s)
Minimally Invasive Surgical Procedures , Neural Networks, Computer , Humans , Cystoscopy , Photography
2.
Behav Res Methods ; 2023 Dec 19.
Article in English | MEDLINE | ID: mdl-38114881

ABSTRACT

Grounding language in vision is an active field of research seeking to construct cognitively plausible word and sentence representations by incorporating perceptual knowledge from vision into text-based representations. Despite many attempts at language grounding, achieving an optimal equilibrium between textual representations of the language and our embodied experiences remains an open problem. Some common concerns are the following. Is visual grounding advantageous for abstract words, or is its effectiveness restricted to concrete words? What is the optimal way of bridging the gap between text and vision? To what extent is perceptual knowledge from images advantageous for acquiring high-quality embeddings? Leveraging current advances in machine learning and natural language processing, the present study addresses these questions by proposing a simple yet highly effective computational grounding model for pre-trained word embeddings. Our model effectively balances the interplay between language and vision by aligning textual embeddings with visual information while simultaneously preserving the distributional statistics that characterize word usage in text corpora. By applying a learned alignment, we are able to indirectly ground unseen words, including abstract words. A series of evaluations on a range of behavioral datasets shows that visual grounding is beneficial not only for concrete words but also for abstract words, lending support to the indirect theory of abstract concepts. Moreover, our approach offers advantages for contextualized embeddings, such as those generated by BERT (Devlin et al., 2018), but only when trained on corpora of modest, cognitively plausible sizes. Code and grounded embeddings for English are available at https://github.com/Hazel1994/Visually_Grounded_Word_Embeddings_2 .
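The alignment described in this abstract can be sketched with a toy example: fit a linear map from text embeddings to visual features on words that have paired images, then apply it to any word, including unseen abstract ones, and concatenate the prediction onto the original vector so the distributional statistics are preserved. All names, dimensions, and the ridge-regression choice below are illustrative assumptions, not the authors' actual model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-ins: 100 "seen" words with 300-d text embeddings and paired
# 128-d visual features (e.g. pooled image-encoder activations).
T_seen = rng.standard_normal((100, 300))
V_seen = rng.standard_normal((100, 128))

# Fit a linear alignment W by ridge regression: V ≈ T @ W.
lam = 1.0
W = np.linalg.solve(T_seen.T @ T_seen + lam * np.eye(300), T_seen.T @ V_seen)

def ground(text_vec):
    """Grounded embedding: the original text vector concatenated with its
    predicted visual counterpart, so textual statistics are kept intact
    while visual information is injected."""
    return np.concatenate([text_vec, text_vec @ W])

# The learned map applies to any word with a text embedding, including
# "unseen" (e.g. abstract) words that have no paired image.
t_unseen = rng.standard_normal(300)
g = ground(t_unseen)
```

Because the original text vector survives unchanged in the first 300 dimensions, downstream tasks that rely on distributional similarity are unaffected by the grounding step.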

3.
Biomed Eng Lett ; 13(2): 141-151, 2023 May.
Article in English | MEDLINE | ID: mdl-37124116

ABSTRACT

Monocular depth estimation from camera images is very important for surrounding-scene evaluation in many technical fields, from automotive to medicine. However, traditional triangulation methods using stereo cameras or multiple views under a rigid-environment assumption are not applicable in endoscopic domains. In cystoscopy in particular, it is not possible to obtain ground-truth depth information with which to train machine learning algorithms to predict depth directly from a monocular image. This work first creates a synthetic cystoscopic environment for initial encoding of depth information from synthetically rendered images. The task of predicting pixel-wise depth values for real images is then cast as a domain adaptation between the synthetic and real image domains. This adaptation is performed through added gated residual blocks, which simplify the network's task and maintain training stability during adversarial training. Training is done on an internally collected cystoscopy dataset from human patients. The results demonstrate the ability to predict reasonable depth estimates from actual cystoscopic videos, and the added stability from the gated residual blocks is shown to prevent mode collapse during adversarial training.
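The gated residual blocks mentioned above can be sketched as follows. This is a minimal NumPy illustration assuming a block of the form y = x + g·F(x) with a scalar gate g initialized to zero, so the block starts as an identity mapping and is opened gradually during adversarial adaptation; the inner transform, dimensions, and gate parameterization are assumptions for illustration, not the paper's actual architecture.

```python
import numpy as np

def inner_transform(x, w, b):
    # Stand-in for the block's inner transform F (a real network would use
    # convolutions); here a simple affine map followed by ReLU.
    return np.maximum(0.0, x @ w + b)

class GatedResidualBlock:
    """y = x + gate * F(x); gate starts at 0 so the block is initially an identity."""
    def __init__(self, dim, rng):
        self.w = rng.standard_normal((dim, dim)) * 0.1
        self.b = np.zeros(dim)
        self.gate = 0.0  # learnable scalar, opened during adaptation

    def forward(self, x):
        return x + self.gate * inner_transform(x, self.w, self.b)

rng = np.random.default_rng(0)
block = GatedResidualBlock(8, rng)
x = rng.standard_normal((4, 8))

# With the gate closed the block passes inputs through unchanged, which is
# what keeps the pretrained synthetic-domain behavior intact at the start
# of adversarial training.
assert np.allclose(block.forward(x), x)

block.gate = 0.5  # after some adaptation steps the gate has opened
y = block.forward(x)
```

The design intuition is that the adaptation network only needs to learn a small correction on top of the synthetically pretrained features, rather than a full mapping from scratch.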

4.
Sci Rep ; 12(1): 13433, 2022 08 04.
Article in English | MEDLINE | ID: mdl-35927306

ABSTRACT

Substandard and falsified medicines present a serious threat to public health. Simple, low-cost screening tools are important for identifying such products in low- and middle-income countries. In the present study, smartphone-based imaging software was developed for the quantification of thin-layer chromatographic (TLC) analyses. A performance evaluation of this tool in the TLC analysis of 14 active pharmaceutical ingredients according to the procedures of the Global Pharma Health Fund (GPHF) Minilab was carried out, following international guidelines and assessing accuracy, repeatability, intermediate precision, specificity, linearity, range and robustness of the method. Relative standard deviations of 2.79% and 4.46% between individual measurements were observed in the assessments of repeatability and intermediate precision, respectively. Small deliberate variations of the conditions hardly affected the results. A locally producible wooden box was designed which ensures TLC photography under standardized conditions and shielding from ambient light. Photography and image analysis were carried out with a low-cost Android-based smartphone. The app allows users to share TLC photos and quantification results via messaging apps, e-mail, cable or Bluetooth connections, or to upload them to a cloud. The app is available free of charge as General Public License (GPL) open-source software, and interested individuals or organizations are welcome to use and/or further improve this software.
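Quantification from a TLC photo typically reduces to densitometry: averaging pixel intensities across a lane, subtracting a baseline, and integrating the spot's peak area against a reference of known concentration. The sketch below illustrates that idea on a synthetic grayscale lane; the function, array shapes, and baseline heuristic are illustrative assumptions and not the published app's code.

```python
import numpy as np

def spot_area(lane, baseline_window=5):
    """Integrate a TLC spot: per-row mean intensity minus a baseline
    estimated from the ends of the lane (assumed spot-free)."""
    profile = lane.mean(axis=1)  # intensity profile along the run direction
    baseline = np.concatenate([profile[:baseline_window],
                               profile[-baseline_window:]]).mean()
    signal = np.clip(profile - baseline, 0.0, None)
    return signal.sum()

# Synthetic grayscale lane (rows = run direction): flat background with
# one Gaussian-shaped spot centered around row 40.
rows = np.arange(100)
lane = np.full((100, 20), 10.0)
lane += 200.0 * np.exp(-0.5 * ((rows - 40) / 4.0) ** 2)[:, None]

# Quantification relative to a standard of known strength; here the
# "reference" is simulated as the same spot at 80% intensity.
area_sample = spot_area(lane)
area_reference = spot_area(lane * 0.8)
relative_content = area_sample / area_reference
```

Because peak area scales linearly with spot intensity in this model, the sample/reference area ratio directly estimates the relative drug content.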


Subject(s)
Counterfeit Drugs , Mobile Applications , Chromatography, Thin Layer/methods , Counterfeit Drugs/analysis , Humans , Quality Control , Smartphone
5.
IEEE Trans Vis Comput Graph ; 13(4): 663-74, 2007.
Article in English | MEDLINE | ID: mdl-17495327

ABSTRACT

By means of passive optical motion capture, real people can be authentically animated and photo-realistically textured. To import real-world characters into virtual environments, however, surface reflectance properties must also be known. We describe a video-based modeling approach that captures human shape and motion as well as reflectance characteristics from a handful of synchronized video recordings. The presented method is able to recover spatially varying surface reflectance properties of clothes from multiview video footage. The resulting model description enables us to realistically reproduce the appearance of animated virtual actors under different lighting conditions, as well as to interchange surface attributes among different people, e.g., for virtual dressing. Our contribution can be used to create 3D renditions of real-world people under arbitrary novel lighting conditions on standard graphics hardware.


Subject(s)
Computer Graphics , Image Interpretation, Computer-Assisted/methods , Joints/anatomy & histology , Joints/physiology , Lighting/methods , Models, Biological , Movement/physiology , Computer Simulation , Image Enhancement/methods , Imaging, Three-Dimensional/methods , User-Computer Interface
6.
IEEE Trans Vis Comput Graph ; 11(3): 296-305, 2005.
Article in English | MEDLINE | ID: mdl-15868829

ABSTRACT

In this paper, we present an image-based framework that acquires the reflectance properties of a human face. A range scan of the face is not required. Based on a morphable face model, the system estimates the 3D shape and establishes point-to-point correspondence across images taken from different viewpoints and across different individuals' faces. This provides a common parameterization of all reconstructed surfaces that can be used to compare and transfer BRDF data between different faces. Shape estimation from images compensates for deformations of the face during the measurement process, such as facial expressions. In the common parameterization, regions of homogeneous materials on the face surface can be defined a priori. We apply analytical BRDF models to express the reflectance properties of each region and estimate their parameters in a least-squares fit from the image data. For each surface point, the diffuse component of the BRDF is locally refined, which provides high detail. We present results for multiple analytical BRDF models, rendered at novel orientations and lighting conditions.
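The per-region least-squares fit can be illustrated with the simplest analytical model, a Lambertian diffuse term I = k_d·max(0, n·l), whose albedo k_d has a closed-form least-squares solution over the observed intensities. This is a toy sketch under that simplifying assumption, not the paper's full BRDF pipeline, and all variable names and data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)

# Observations for one homogeneous region: unit surface normals n_i,
# unit light directions l_i, and measured intensities I_i.
n = rng.standard_normal((50, 3))
n /= np.linalg.norm(n, axis=1, keepdims=True)
l = rng.standard_normal((50, 3))
l /= np.linalg.norm(l, axis=1, keepdims=True)

cos_term = np.clip((n * l).sum(axis=1), 0.0, None)  # max(0, n·l)
k_true = 0.7
I = k_true * cos_term + rng.normal(0.0, 0.01, 50)   # noisy Lambertian samples

# Closed-form least-squares estimate of the region's diffuse albedo:
# minimize sum_i (I_i - k * cos_i)^2  =>  k = (cos·I) / (cos·cos).
k_est = (cos_term @ I) / (cos_term @ cos_term)
```

Richer analytical models (e.g. with a specular lobe) add parameters and generally require a nonlinear least-squares solver, but the fitting principle per region is the same.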


Subject(s)
Algorithms , Artificial Intelligence , Face/anatomy & histology , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Models, Biological , Pattern Recognition, Automated/methods , Computer Graphics , Computer Simulation , Humans , Image Enhancement/methods , Information Storage and Retrieval/methods , Numerical Analysis, Computer-Assisted , Photometry/methods , Reproducibility of Results , Sensitivity and Specificity , Signal Processing, Computer-Assisted , Subtraction Technique , User-Computer Interface