Results 1 - 20 of 45
1.
Data Brief ; 57: 110949, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39391001

ABSTRACT

Keyboard acoustic recognition is a pivotal area within cybersecurity and human-computer interaction, where the identification and analysis of keyboard sounds are used to enhance security measures. The performance of acoustic-based security systems can be influenced by factors such as the platform used, typing style, and environmental noise. To address these variations and provide a comprehensive resource, we present the Multi-Keyboard Acoustic (MKA) Datasets. These extensive datasets, meticulously gathered by a team in the Computer Science Department at the University of Halabja, include recordings from six widely-used platforms: HP, Lenovo, MSI, Mac, Messenger, and Zoom. The MKA datasets have structured data for each platform, including raw recordings, segmented sound files, and matrices derived from these sounds. They can be used by researchers in keylogging detection, cybersecurity, and other fields related to acoustic emanation attacks on keyboards. Moreover, the datasets capture the intricacies of typing behaviour with both hands and all ten fingers by carefully segmenting and pre-processing the data using the Praat tool, thus ensuring high-quality and dependable data. This comprehensive approach allows researchers to explore various aspects of keyboard sound recognition, contributing to the development of robust recognition algorithms and enhanced security measures. The MKA Datasets stand as one of the largest and most detailed datasets in this domain, offering significant potential for advancing research and improving defences against acoustic-based threats.
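A minimal sketch of how such segmented recordings might be produced programmatically. The paper's segmentation uses the Praat tool; here a simple energy-based split with librosa stands in, and the file names are placeholders.

```python
# Not the authors' Praat workflow: a rough energy-based stand-in for
# splitting a raw keyboard recording into per-keystroke sound files,
# the kind of segmented data the MKA datasets provide.
import librosa
import soundfile as sf

y, sr = librosa.load("hp_keyboard_raw.wav", sr=None)   # hypothetical raw recording
intervals = librosa.effects.split(y, top_db=30)        # non-silent regions (in samples)

for i, (start, end) in enumerate(intervals):
    sf.write(f"keystroke_{i:03d}.wav", y[start:end], sr)  # one file per keystroke
```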

2.
Sci Rep ; 14(1): 20994, 2024 09 09.
Article in English | MEDLINE | ID: mdl-39251659

ABSTRACT

Sound recognition is effortless for humans but poses a significant challenge for artificial hearing systems. Deep neural networks (DNNs), especially convolutional neural networks (CNNs), have recently surpassed traditional machine learning in sound classification. However, current DNNs map sounds to labels using binary categorical variables, neglecting the semantic relations between labels. Cognitive neuroscience research suggests that human listeners exploit such semantic information in addition to acoustic cues. Hence, our hypothesis is that incorporating semantic information improves DNNs' sound recognition performance, emulating human behaviour. In our approach, sound recognition is framed as a regression problem, with CNNs trained to map spectrograms to continuous semantic representations from NLP models (Word2Vec, BERT, and the CLAP text encoder). Two DNN types were trained: semDNN with continuous embeddings and catDNN with categorical labels, both on a dataset extracted from a collection of 388,211 sounds enriched with semantic descriptions. Evaluations across four external datasets confirmed the superiority of the semantic labeling of semDNN over catDNN in preserving higher-level relations. Importantly, an analysis of human similarity ratings for natural sounds showed that semDNN approximated human listener behaviour better than catDNN, other DNNs, and NLP models. Our work contributes to understanding the role of semantics in sound recognition, bridging the gap between artificial systems and human auditory perception.
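To make the regression framing concrete, here is a hedged PyTorch sketch of a CNN that maps a log-mel spectrogram to a continuous semantic vector (e.g., a 300-dimensional Word2Vec embedding) and is trained with a distance loss rather than cross-entropy over categorical labels. The architecture, input shape, and embedding size are illustrative assumptions, not the authors' configuration.

```python
# Sketch only: a small CNN regressing spectrograms onto word embeddings.
import torch
import torch.nn as nn

class SemanticCNN(nn.Module):
    def __init__(self, embedding_dim: int = 300):        # 300 assumes Word2Vec targets
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.regressor = nn.Linear(64, embedding_dim)

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        # spec: (batch, 1, n_mels, time) log-mel spectrogram
        return self.regressor(self.features(spec).flatten(1))

model = SemanticCNN()
specs = torch.randn(4, 1, 64, 128)       # four fake spectrograms
targets = torch.randn(4, 300)            # fake semantic embeddings of the labels
loss = nn.functional.mse_loss(model(specs), targets)   # regression, not classification
```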


Subject(s)
Auditory Perception , Natural Language Processing , Neural Networks, Computer , Semantics , Humans , Auditory Perception/physiology , Deep Learning , Sound
3.
IEEE J Transl Eng Health Med ; 12: 550-557, 2024.
Article in English | MEDLINE | ID: mdl-39155923

ABSTRACT

The objective of this study was to develop a sound recognition-based cardiopulmonary resuscitation (CPR) training system that is accessible, cost-effective, and easy to maintain, and that provides accurate CPR feedback. Beep-CPR, a novel device with accordion squeakers that emit high-pitched sounds during compression, was developed. The sounds emitted by Beep-CPR were recorded using a smartphone, segmented into 2-second audio fragments, and then transformed into spectrograms. A total of 6,065 spectrograms were generated from approximately 40 minutes of audio data, which were then randomly split into training, validation, and test datasets. Each spectrogram was matched with the depth, rate, and release velocity of the compression measured over the same time interval by the ZOLL X Series monitor/defibrillator. Deep learning models utilizing spectrograms as input were trained using transfer learning based on EfficientNet to predict the depth (Depth model), rate (Rate model), and release velocity (Recoil model) of compressions. Results: The mean absolute error (MAE) of the Depth model was 0.30 cm (95% confidence interval [CI]: 0.27-0.33). The MAE of the Rate model was 3.6/min (95% CI: 3.2-3.9). For the Recoil model, the MAE was 2.3 cm/s (95% CI: 2.1-2.5). External validation of the models demonstrated acceptable performance across multiple conditions, including the use of a newly manufactured device, a fatigued device, and evaluation in an environment with altered spatial dimensions. We have developed a novel sound recognition-based CPR training system that accurately measures compression quality during training. Significance: Beep-CPR is a cost-effective and easy-to-maintain solution that can improve the efficacy of CPR training by facilitating decentralized at-home training with performance feedback.
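A hedged sketch of the transfer-learning setup described above: an EfficientNet backbone with its classification head replaced by a single regression output (e.g., compression depth in cm). The backbone variant, input size, and targets are illustrative assumptions rather than the authors' exact configuration.

```python
# Sketch: EfficientNet fine-tuned to regress one compression metric
# from a spectrogram rendered as an image.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.efficientnet_b0(weights=None)          # pretrained weights optional
in_features = backbone.classifier[1].in_features
backbone.classifier[1] = nn.Linear(in_features, 1)       # single regression output

spec_batch = torch.randn(2, 3, 224, 224)                 # spectrogram "images"
pred_depth_cm = backbone(spec_batch).squeeze(1)          # predicted depth per clip
loss = nn.functional.l1_loss(pred_depth_cm,
                             torch.tensor([5.0, 5.5]))   # MAE-style training loss
```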


Subject(s)
Cardiopulmonary Resuscitation , Cardiopulmonary Resuscitation/education , Cardiopulmonary Resuscitation/instrumentation , Humans , Sound , Sound Spectrography , Signal Processing, Computer-Assisted/instrumentation , Deep Learning , Smartphone , Equipment Design
4.
Polymers (Basel) ; 16(8)2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38674990

ABSTRACT

In the present study, poling-free PLLA/VB2 piezoelectric composites are fabricated to achieve synchronous sound recognition and energy harvesting. The added VB2 interacts with PLLA through intermolecular hydrogen bonding, inducing dipole orientation of the C=O groups in PLLA. Meanwhile, VB2 promotes crystallization of PLLA through heterogeneous nucleation. The combination of the two strategies significantly improves the piezoelectric performance of the PLLA/VB2 composites. The PLLA/VB2 composite can detect sound frequency with an accuracy of 0.1% over the range of 0-20 kHz, allowing it to recognize characteristic sounds from a specific source. PLLA/VB2 can also synchronously convert sound into electrical energy with a power density of 0.2 W/cm³ to power LEDs. Therefore, PLLA/VB2 shows great potential for synchronous information collection and energy harvesting.

5.
Sensors (Basel) ; 24(4)2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38400427

ABSTRACT

To address the low recognition accuracy of traditional pig sound recognition methods, this study used deep neural network (DNN) and hidden Markov model (HMM) theory as the basis for pig sound signal recognition. The sounds made by 10 Landrace pigs during eating, estrus, howling, humming and panting were collected and preprocessed by Kalman filtering and an improved endpoint detection algorithm based on the empirical mode decomposition-Teager energy operator (EMD-TEO) cepstral distance. The extracted 39-dimensional mel-frequency cepstral coefficients (MFCCs) were then used as a dataset for network learning and recognition to build a DNN- and HMM-based sound recognition model for pig states. The results show that, on the pig sound dataset, the recognition accuracy of DNN-HMM reaches 83%, which is 22% and 17% higher than that of the baseline HMM and GMM-HMM models, respectively, indicating a better recognition performance. On a sub-dataset of the publicly available AudioSet dataset, DNN-HMM achieves a recognition accuracy of 79%, which is 8% and 4% higher than the classical SVM and ResNet18 models, respectively, with better robustness.
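As a concrete illustration of the 39-dimensional MFCC front end (13 static coefficients plus deltas and delta-deltas), a short librosa sketch follows; the file name and parameters are placeholders, and the DNN-HMM back end is omitted.

```python
# Sketch: building a 39-dimensional MFCC feature matrix for one recording.
import numpy as np
import librosa

y, sr = librosa.load("pig_call.wav", sr=None)        # hypothetical recording
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # 13 static coefficients
delta = librosa.feature.delta(mfcc)                  # first-order deltas
delta2 = librosa.feature.delta(mfcc, order=2)        # second-order deltas
features = np.vstack([mfcc, delta, delta2])          # (39, frames) per utterance
```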


Subject(s)
Algorithms , Neural Networks, Computer , Female , Swine , Animals , Sound , Markov Chains
6.
Psychon Bull Rev ; 31(4): 1735-1744, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38267741

ABSTRACT

Over recent decades, studies investigating cross-modal correspondences have documented the existence of a wide range of consistent cross-modal associations between simple auditory and visual stimuli or dimensions (e.g., pitch-lightness). Far fewer studies have investigated the association between complex, realistic auditory stimuli and visually presented concepts (e.g., musical excerpts-animals). Surprisingly, however, there is little evidence concerning the extent to which these associations are shared across cultures. To address this gap in the literature, two experiments using a set of stimuli based on Prokofiev's symphonic fairy tale Peter and the Wolf are reported. In Experiment 1, 293 participants from several countries and with very different language backgrounds rated the association between the musical excerpts and the images and words representing the story's characters (namely, bird, duck, wolf, cat, and grandfather). The results revealed that participants tended to consistently associate the wolf and the bird with the corresponding musical excerpts, while the stimuli for the other characters were not consistently matched across participants. Remarkably, neither the participants' cultural background nor their musical expertise affected the ratings. In Experiment 2, 104 participants were invited to rate each stimulus on eight emotional features. The results revealed that the emotional profiles associated with the music and with the concepts of the wolf and the bird were perceived as more consistent across observers than the emotional profiles associated with the music and the concepts of the duck, the cat, and the grandfather. Taken together, these findings suggest that certain auditory-conceptual associations are perceived consistently across cultures and may be mediated by emotional associations.


Subject(s)
Auditory Perception , Cross-Cultural Comparison , Music , Humans , Adult , Male , Female , Auditory Perception/physiology , Young Adult , Association , Adolescent , Middle Aged , Concept Formation/physiology
7.
Atten Percept Psychophys ; 86(1): 263-272, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37985595

ABSTRACT

The Spatial-Numerical Association of Response Codes (SNARC) effect is evidence of an association between number magnitude and response position, with faster left-key responses to small numbers and faster right-key responses to large numbers. Similarly, recent studies revealed a SNARC-like effect for tempo, defined as the speed of an auditory sequence, with faster left-key responses to slow tempo and faster right-key responses to fast tempo. To address some methodological issues of previous studies, in the present study we designed an experiment to investigate the occurrence of a SNARC-like effect for tempo, employing a novel procedure in which only two auditory beats in sequence, separated by a very short interstimulus interval, were used. In the "temporal speed" condition, participants were required to judge the temporal speed (slow or fast) of the sequence. In the "interval duration" condition, participants were required to judge the duration of the interval between the two beats (short or long). The results revealed a consistent SNARC-like effect in both conditions, with faster left-hand responses to slow tempo and faster right-hand responses to fast tempo. Interestingly, the consistency of the results across the two conditions indicates that the direction of the SNARC-like effect was influenced by temporal speed even when participants were explicitly required to focus on interval duration. Overall, the current study extends previous findings by employing a new paradigm that addresses potential confounding factors and strengthens the evidence for a SNARC-like effect for tempo.


Subject(s)
Hand , Space Perception , Humans , Reaction Time/physiology , Space Perception/physiology , Hand/physiology
8.
Math Biosci Eng ; 20(11): 19438-19453, 2023 Oct 19.
Article in English | MEDLINE | ID: mdl-38052608

ABSTRACT

Bird sound recognition is crucial for bird protection. As bird populations have decreased at an alarming rate, monitoring and analyzing bird species helps us observe diversity and environmental adaptation. A machine learning model was used to classify bird sound signals. To improve the accuracy of bird sound recognition in low-cost hardware systems, a recognition method based on an adaptive frequency cepstrum coefficient and an improved support vector machine model using a hunter-prey optimizer was proposed. First, an adaptive factor was introduced into the extraction of the frequency cepstrum coefficients; this factor adjusts the continuity, smoothness and shape of the filters, and features covering the full frequency band are extracted by combining the two complementary groups of filters. These features were then used as the input to the support vector machine classification model, and a hunter-prey optimizer algorithm was used to improve the support vector machine model. The experimental results show that the recognition accuracy of the proposed method for five types of bird sounds is 93.45%, which is better than that of state-of-the-art support vector machine models, with the highest recognition accuracy obtained by tuning the adaptive factor. The proposed method improves the accuracy of bird sound recognition, which will be helpful for bird recognition in various applications.
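The classification stage can be pictured with the short scikit-learn sketch below: an RBF-kernel SVM over cepstral feature vectors whose C and gamma would, in the paper, be tuned by the hunter-prey optimizer. Here they are fixed placeholders and the data are random stand-ins.

```python
# Sketch: SVM classifier over cepstral features; the hyperparameters are
# the kind of values a hunter-prey optimizer would search.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X = np.random.randn(200, 40)          # cepstral feature vectors, one row per call
y = np.random.randint(0, 5, 200)      # five bird-species labels

svm = SVC(kernel="rbf", C=10.0, gamma=0.01)           # placeholder hyperparameters
print(cross_val_score(svm, X, y, cv=5).mean())        # cross-validated accuracy
```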


Subject(s)
Algorithms , Support Vector Machine , Machine Learning
9.
Nanotechnology ; 35(7)2023 Nov 29.
Article in English | MEDLINE | ID: mdl-37857282

ABSTRACT

This paper proposes a flexible micro-nano composite piezoelectric thin film. The flexible piezoelectric film is fabricated through an electrospinning process, using a combination of 12 wt% poly(vinylidene fluoride-co-trifluoroethylene) (P(VDF-TrFE)), 8 wt% potassium sodium niobate (KNN) nanoparticles, and 0.5 wt% graphene (GR). Under cyclic loading, the composite film demonstrates a remarkable increase in open-circuit voltage and short-circuit current, achieving values of 36.1 V and 163.7 µA, respectively. These values are 5.8 times and 3.6 times higher than those observed in the pure P(VDF-TrFE) film. The integration of this piezoelectric film into a wearable flexible heartbeat sensor, coupled with the RepMLP classification model, enables heartbeat acquisition and real-time automated diagnosis. After training and validation on a dataset containing 2000 heartbeat samples, the system achieved an accuracy of approximately 99% in two-class classification of heart sound signals (normal vs. abnormal). This research substantially enhances the output performance of the piezoelectric film, offering a novel and valuable solution for the application of flexible piezoelectric films in physiological signal detection.


Subject(s)
Graphite , Heart Diseases , Heart Sounds , Nanocomposites , Humans
10.
Sensors (Basel) ; 23(19)2023 Sep 27.
Article in English | MEDLINE | ID: mdl-37836929

ABSTRACT

Birds play a vital role in the study of ecosystems and biodiversity. Accurate bird identification helps monitor biodiversity, understand the functions of ecosystems, and develop effective conservation strategies. However, previous bird sound recognition methods often relied on single features and overlooked the spatial information associated with these features, leading to low accuracy. Recognizing this gap, the present study proposes a bird sound recognition method that employs multiple convolutional neural networks and a transformer encoder to provide a reliable solution for identifying and classifying birds based on their unique sounds. We manually extracted various acoustic features as model inputs, and feature fusion was applied to obtain the final set of feature vectors. Feature fusion combines the deep features extracted by the various networks, resulting in a more comprehensive feature set and thereby improving recognition accuracy. The multiple integrated acoustic features, such as mel-frequency cepstral coefficients (MFCCs), chroma features, and Tonnetz features, were encoded by a transformer encoder, which effectively captured the positional relationships between bird sound features and further enhanced recognition accuracy. The experimental results demonstrate the exceptional performance of our method, with an accuracy of 97.99%, a recall of 96.14%, an F1 score of 96.88% and a precision of 97.97% on the Birdsdata dataset. Furthermore, our method achieved an accuracy of 93.18%, a recall of 92.43%, an F1 score of 93.14% and a precision of 93.25% on the Cornell Bird Challenge 2020 (CBC) dataset.
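For illustration, the hand-crafted features named above (MFCC, chroma, Tonnetz) can be extracted and stacked into a fused frame-level matrix as in the librosa sketch below; the downstream CNNs and transformer encoder are omitted, and the file name is a placeholder.

```python
# Sketch: extract MFCC, chroma and Tonnetz features and fuse them frame-wise.
import numpy as np
import librosa

y, sr = librosa.load("bird_call.wav", sr=22050)                          # placeholder file
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)                       # (20, frames)
chroma = librosa.feature.chroma_stft(y=y, sr=sr)                         # (12, frames)
tonnetz = librosa.feature.tonnetz(y=librosa.effects.harmonic(y), sr=sr)  # (6, frames)

n = min(mfcc.shape[1], chroma.shape[1], tonnetz.shape[1])                # align frame counts
fused = np.vstack([mfcc[:, :n], chroma[:, :n], tonnetz[:, :n]])          # (38, frames)
```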


Subject(s)
Ecosystem , Recognition, Psychology , Animals , Sound , Acoustics , Birds
11.
Sensors (Basel) ; 23(17)2023 Sep 02.
Article in English | MEDLINE | ID: mdl-37688079

ABSTRACT

Normal-hearing people use sound as a cue to recognize events in their surrounding environment; this is not possible for deaf and hard of hearing (DHH) people, who may therefore be unable to freely monitor their surroundings. There is thus an opportunity to create a convenient device that detects sounds occurring in daily life and presents them visually rather than auditorily, and it is equally important to evaluate how such a supporting device would change the lives of DHH people. The current study proposes an augmented-reality-based system for presenting household sounds to DHH people as visual information. We examined the effect of displaying both icons indicating sounds classified by machine learning and a dynamic spectrogram showing the real-time time-frequency characteristics of the environmental sounds. First, the issues that DHH people perceive as problems in their daily lives were investigated through a survey, which suggested that DHH people need to visualize their surrounding sound environment. Then, after the accuracy of the machine-learning-based classifier installed in the proposed system was validated, subjective impressions of how the proposed system increased the comfort of daily life were obtained through a field experiment in a real residence. The results confirmed that the comfort of daily life in household spaces can be improved by combining the machine-learning classification results with the real-time display of spectrograms.


Subject(s)
Augmented Reality , Persons With Hearing Impairments , Humans , Sound , Hearing , Machine Learning
12.
Entropy (Basel) ; 25(8)2023 Jul 26.
Article in English | MEDLINE | ID: mdl-37628146

ABSTRACT

Developing a tailor-made centrality measure for a given task requires domain- and network-analysis expertise, as well as time and effort. Thus, automatically learning arbitrary centrality measures from provided ground-truth node scores is an important research direction. We propose a generic deep-learning architecture for centrality learning that relies on two insights: (1) arbitrary centrality measures can be computed using Routing Betweenness Centrality (RBC); (2) as suggested by spectral graph theory, the sound emitted by nodes within the resonating chamber formed by a graph represents both the structure of the graph and the location of the nodes. Based on these insights and our new differentiable implementation of RBC, we learn routing policies that approximate arbitrary centrality measures on various network topologies. Results show that the proposed architecture can learn multiple types of centrality indices more accurately than the state of the art.
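As a small illustration of the learning target, classical centrality scores such as betweenness can serve as the ground-truth node scores that the proposed differentiable-RBC architecture is trained to approximate; the networkx sketch below computes such scores on a toy graph.

```python
# Sketch: ground-truth centrality scores on a toy topology.
import networkx as nx

G = nx.karate_club_graph()                                  # small example graph
scores = nx.betweenness_centrality(G)                       # ground-truth node scores
print(sorted(scores.items(), key=lambda kv: -kv[1])[:5])    # most central nodes
```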

13.
Entropy (Basel) ; 25(8)2023 Aug 09.
Article in English | MEDLINE | ID: mdl-37628214

ABSTRACT

To address the outdated gas and coal dust explosion alarm technology and the reliance on a single monitoring method in coal mines, and to improve the accuracy of gas and coal dust explosion identification, a sound identification method for coal mine gas and coal dust explosions based on a multilayer perceptron (MLP) is proposed. The distributions of the mean short-time energy, zero crossing rate, spectral centroid, spectral spread, roll-off, 16-dimensional time-frequency features, MFCC, GFCC, and short-time Fourier coefficients of gas explosion sounds, coal dust sounds, and other underground sounds were analyzed. To select the most suitable feature vector for characterizing the sound signal, a feature extraction model based on the Relief algorithm was established, and the cross-entropy distributions of MLP models trained with different numbers of feature values were analyzed. To further optimize the feature selection, the recognition results of models trained with different numbers of sound features were compared, and the first 35 feature values were finally selected as the feature vector characterizing the sound signal. These feature vectors are input into the MLP to establish the sound recognition model for coal mine gas and coal dust explosions. An analysis of the time consumed by feature extraction, optimal feature selection, model training, and model recognition shows that the proposed algorithm has high computational efficiency and meets the requirements of a real-time coal mine safety monitoring and alarm system. The recognition experiments show that the algorithm distinguishes each kind of sound involved in the experiments accurately: the average recognition rate, recall rate, and accuracy rate of the model reach 95%, 95%, and 95.8%, respectively, clearly better than the comparison algorithms, and meet the requirements for sensing and alarming of coal mine gas and coal dust explosions.
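The shape of this pipeline (select a 35-dimensional feature subset, then train an MLP classifier) is sketched below with scikit-learn. The paper uses the Relief algorithm for selection; a mutual-information selector stands in here, and the data are random placeholders.

```python
# Sketch: feature selection (Relief in the paper; mutual information here)
# followed by an MLP classifier for explosion-sound recognition.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

X = np.random.randn(300, 60)        # placeholder acoustic feature vectors
y = np.random.randint(0, 3, 300)    # gas explosion / coal dust explosion / other

model = make_pipeline(
    SelectKBest(mutual_info_classif, k=35),                    # keep the top 35 features
    MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500),  # simple MLP back end
)
model.fit(X, y)
```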

14.
Sensors (Basel) ; 23(13)2023 Jul 07.
Article in English | MEDLINE | ID: mdl-37448075

ABSTRACT

Environmental Sound Recognition (ESR) plays a crucial role in smart cities by accurately categorizing audio using well-trained Machine Learning (ML) classifiers. This application is particularly valuable for cities that analyze environmental sounds to gain insights and data. However, deploying deep learning (DL) models on resource-constrained embedded devices, such as the Raspberry Pi (RPi) or Tensor Processing Units (TPUs), poses challenges. In this work, we evaluate an existing pre-trained model for deployment on Raspberry Pi (RPi) and TPU platforms in addition to a laptop. We explored the impact of the retraining parameters and compared the sound classification performance across three datasets: ESC-10, BDLib, and Urban Sound. Our results demonstrate the effectiveness of the pre-trained model for transfer learning in embedded systems. On the laptop, the accuracy rates reached 96.6% for ESC-10, 100% for BDLib, and 99% for Urban Sound. On the RPi, the accuracy rates were 96.4% for ESC-10, 100% for BDLib, and 95.3% for Urban Sound, while on the RPi with a Coral TPU, the rates were 95.7% for ESC-10, 100% for BDLib, and 95.4% for Urban Sound. Utilizing pre-trained models reduces computational requirements, enabling faster inference. Leveraging pre-trained models in embedded systems accelerates the development, deployment, and performance of various real-time applications.
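A hedged sketch of on-device inference of this kind is shown below, using the TensorFlow Lite runtime commonly used on Raspberry Pi and Coral TPU boards; the model file, input shape, and Edge TPU delegate line are placeholders, not artifacts from the paper.

```python
# Sketch: running a pre-converted TFLite sound classifier on an embedded device.
import numpy as np
import tflite_runtime.interpreter as tflite

interpreter = tflite.Interpreter(model_path="esc10_classifier.tflite")  # placeholder model
# For a Coral TPU, one would instead load the Edge TPU delegate, e.g.:
# tflite.Interpreter("model_edgetpu.tflite",
#                    experimental_delegates=[tflite.load_delegate("libedgetpu.so.1")])
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

features = np.random.rand(*inp["shape"]).astype(np.float32)   # stand-in input tensor
interpreter.set_tensor(inp["index"], features)
interpreter.invoke()
scores = interpreter.get_tensor(out["index"])                  # class scores
```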


Subject(s)
Machine Learning , Neural Networks, Computer , Cities , Sound
15.
J Neurosci ; 43(21): 3876-3894, 2023 05 24.
Article in English | MEDLINE | ID: mdl-37185101

ABSTRACT

Natural sounds contain rich patterns of amplitude modulation (AM), which is one of the essential sound dimensions for auditory perception. The sensitivity of human hearing to AM measured by psychophysics takes diverse forms depending on the experimental conditions. Here, we address with a single framework the questions of why such patterns of AM sensitivity have emerged in the human auditory system and how they are realized by our neural mechanisms. Assuming that optimization for natural sound recognition has taken place during human evolution and development, we examined its effect on the formation of AM sensitivity by optimizing a computational model, specifically, a multilayer neural network, for natural sound (namely, everyday sounds and speech sounds) recognition and simulating psychophysical experiments in which the AM sensitivity of the model was assessed. Relatively higher layers in the model optimized to sounds with natural AM statistics exhibited AM sensitivity similar to that of humans, although the model was not designed to reproduce human-like AM sensitivity. Moreover, simulated neurophysiological experiments on the model revealed a correspondence between the model layers and the auditory brain regions. The layers in which human-like psychophysical AM sensitivity emerged exhibited substantial neurophysiological similarity with the auditory midbrain and higher regions. These results suggest that human behavioral AM sensitivity has emerged as a result of optimization for natural sound recognition in the course of our evolution and/or development and that it is based on a stimulus representation encoded in the neural firing rates in the auditory midbrain and higher regions.

SIGNIFICANCE STATEMENT: This study provides a computational paradigm to bridge the gap between the behavioral properties of human sensory systems as measured in psychophysics and neural representations as measured in nonhuman neurophysiology. This was accomplished by combining the knowledge and techniques in psychophysics, neurophysiology, and machine learning. As a specific target modality, we focused on the auditory sensitivity to sound AM. We built an artificial neural network model that performs natural sound recognition and simulated psychophysical and neurophysiological experiments in the model. Quantitative comparison of a machine learning model with human and nonhuman data made it possible to integrate the knowledge of behavioral AM sensitivity and neural AM tunings from the perspective of optimization to natural sound recognition.


Subject(s)
Auditory Cortex , Sound , Humans , Auditory Perception/physiology , Brain/physiology , Hearing , Mesencephalon/physiology , Acoustic Stimulation , Auditory Cortex/physiology
16.
Entropy (Basel) ; 25(3)2023 Feb 24.
Article in English | MEDLINE | ID: mdl-36981301

ABSTRACT

To address the outdated means of monitoring coal mine gas and coal dust explosions, delayed reporting, and missed detections, and to improve the accuracy of gas and coal dust explosion identification in coal mines, a sound recognition method for coal mine gas and coal dust explosions based on GoogLeNet is proposed. After installing mining pickups in key monitoring areas of coal mines to collect the sounds of the working equipment and the environment, the collected sound was analyzed by continuous wavelet transform to obtain its scale coefficient map, which was then imported into GoogLeNet to obtain the recognition model for coal mine gas and coal dust explosions. For testing, the scale coefficient map of each test sound was obtained by continuous wavelet analysis and fed into the trained recognition model to obtain the sound signal class, and the results were verified by experiment. First, the scale coefficient maps extracted from the sound signals by continuous wavelet analysis showed that the subjective and objective similarity between the wavelet coefficient maps of gas explosion sounds and coal dust explosion sounds was high, whereas the difference between these and the remaining coal mine sounds was clear, helping to effectively distinguish gas and coal dust explosion sounds from other sounds. Second, experiments on the GoogLeNet parameters showed that the model performed best with a dropout parameter of 0.5 and an initial learning rate of 0.001; with these parameters, the training loss, testing loss, training recognition rate, and testing recognition rate of the model were all in line with expectations. Finally, the recognition experiments show that, for a 9:1 ratio of test data to training data, the proposed method achieves a recognition rate of 97.38%, a recall rate of 86.1%, and an accuracy rate of 100%, and that the overall recognition performance of the proposed GoogLeNet model is significantly better than that of VGG and AlexNet. The method can effectively address the under-sampling of coal mine gas and coal dust explosion sounds and meet the need for intelligent recognition of coal mine gas and dust explosions.
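The front end described above, turning a sound signal into a scale-coefficient (scalogram) image that an image classifier such as GoogLeNet can consume, can be sketched with PyWavelets as below; the synthetic signal, wavelet choice, and scale range are assumptions.

```python
# Sketch: continuous wavelet transform of a sound signal into a scalogram image.
import numpy as np
import pywt
import matplotlib.pyplot as plt

sr = 8000
t = np.linspace(0, 1, sr, endpoint=False)
signal = np.sin(2 * np.pi * 200 * t) + 0.5 * np.random.randn(sr)   # stand-in sound

scales = np.arange(1, 128)
coeffs, freqs = pywt.cwt(signal, scales, "morl", sampling_period=1 / sr)

plt.imshow(np.abs(coeffs), aspect="auto", cmap="viridis")   # scale coefficient map
plt.axis("off")
plt.savefig("scalogram.png", bbox_inches="tight")           # image fed to the CNN
```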

17.
Atten Percept Psychophys ; 85(4): 1238-1252, 2023 May.
Article in English | MEDLINE | ID: mdl-36008746

ABSTRACT

Inattentional unawareness potentially occurs in several different sensory domains but is mainly described in visual paradigms ("inattentional blindness"; e.g., Simons & Chabris, 1999, Perception, 28, 1059-1074). Dalton and Fraenkel (2012, Cognition, 124, 367-372) introduced "inattentional deafness" by showing that around 70% of participants missed a voice repeatedly saying "I'm a Gorilla" while they focused on a primary conversation. The present study expanded this finding within the acoustic domain in several ways: First, we extended the validity perspective by using 10 acoustic samples, specifically excerpts of popular musical pieces from different music genres. Second, we used animal sounds as the secondary acoustic signal. These sounds originate from a completely different acoustic domain and are therefore highly distinct from the primary sound. Participants' task was to count different musical features. Results (N = 37 participants) showed that the frequency of missed animal sounds was higher in participants with higher attentional focus and motivation. Additionally, attentional focus, perceptual load, and feature similarity/saliency were analyzed and did not influence whether animal sounds were detected or missed. We demonstrated that for 31.2% of the music plays, people did not recognize highly salient animal sounds (in terms of both the type of acoustic source and the frequency spectra) while executing the primary (counting) task. This significant effect supports the idea that inattentional deafness occurs even when the unattended acoustic stimuli are highly salient.


Subject(s)
Deafness , Music , Humans , Animals , Attention , Cognition , Acoustic Stimulation
18.
Article in English | MEDLINE | ID: mdl-36525201

ABSTRACT

Humans robustly associate spiky shapes with words like "Kiki" and round shapes with words like "Bouba." According to a popular explanation, this is because the mouth assumes an angular shape while speaking "Kiki" and a rounded shape for "Bouba." Alternatively, this effect could reflect more general associations between shape and sound that are not specific to mouth shape or the articulatory properties of speech. These possibilities can be distinguished using unpronounceable sounds: the mouth-shape hypothesis predicts no Bouba-Kiki effect for these sounds, whereas the generic shape-sound hypothesis predicts a systematic effect. Here, we show that the Bouba-Kiki effect is present for a variety of unpronounceable sounds, ranging from reversed words and real object sounds (n = 45 participants) to pure tones (n = 28). The effect was strongly correlated with the mean frequency of a sound across both spoken and reversed words. The effect was not systematically predicted by subjective ratings of pronounceability or by mouth aspect ratios measured from video. Thus, the Bouba-Kiki effect is explained by simple shape-sound associations rather than by speech properties.

19.
Sensors (Basel) ; 22(17)2022 Aug 30.
Article in English | MEDLINE | ID: mdl-36080994

ABSTRACT

Pork accounts for an important proportion of livestock products. Pig farming requires considerable manpower, material resources and time to monitor pig health and welfare. As the number of pigs being farmed increases, the continued use of traditional monitoring methods may cause stress and harm to pigs and farmers and affect pig health and welfare as well as the economic output of farming. In addition, the application of artificial intelligence has become a core part of smart pig farming. Precision pig farming systems use sensors such as cameras and radio frequency identification to monitor biometric information such as pig sounds and pig behavior in real time and convert it into key indicators of pig health and welfare. By analyzing these key indicators, problems in pig health and welfare can be detected early, and timely intervention and treatment can be provided, which helps to improve the production and economic efficiency of pig farming. This paper reviews more than 150 papers on precision pig farming and summarizes and evaluates the application of artificial intelligence technologies to pig detection, tracking, behavior recognition and sound recognition. Finally, we summarize and discuss the opportunities and challenges of precision pig farming.


Subject(s)
Animal Husbandry , Animal Welfare , Animal Husbandry/methods , Animals , Artificial Intelligence , Farms , Livestock , Swine
20.
Atten Percept Psychophys ; 84(5): 1757-1771, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35650471

ABSTRACT

Octave equivalence describes the perception that notes separated by a doubling in frequency sound similar. While the octave is used cross-culturally as a basis of pitch perception, experimental demonstration of the phenomenon has proved to be difficult. In past work, members of our group developed a three-range generalization paradigm that reliably demonstrated octave equivalence. In this study we replicate and expand on this previous work trying to answer three questions that help us understand the origins and potential cross-cultural significance of octave equivalence: (1) whether training with three ranges is strictly necessary or whether an easier-to-learn two-range task would be sufficient, (2) whether the task could demonstrate octave equivalence beyond neighbouring octaves, and (3) whether language skills and musical education impact the use of octave equivalence in this task. We conducted a large-sample study using variations of the original paradigm to answer these questions. Results found here suggest that the three-range discrimination task is indeed vital to demonstrating octave equivalence. In a two-range task, pitch height appears to be dominant over octave equivalence. Octave equivalence has an effect only when pitch height alone is not sufficient. Results also suggest that effects of octave equivalence are strongest between neighbouring octaves, and that tonal language and musical training have a positive effect on learning of discriminations but not on perception of octave equivalence during testing. We discuss these results considering their relevance to future research and to ongoing debates about the basis of octave equivalence perception.


Subject(s)
Music , Pitch Perception , Generalization, Psychological , Humans , Language , Learning , Pitch Discrimination