Results 1 - 20 of 6,115
1.
Cancer Med ; 13(13): e7436, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38949177

ABSTRACT

BACKGROUND: The current guidelines for managing screen-detected pulmonary nodules offer rule-based recommendations for immediate diagnostic work-up or follow-up at intervals of 3, 6, or 12 months. Customized visit plans are lacking. PURPOSE: To develop individualized screening schedules using reinforcement learning (RL) and evaluate the effectiveness of RL-based policy models. METHODS: Using a nested case-control design, we retrospectively identified 308 patients with cancer who had positive screening results in at least two screening rounds in the National Lung Screening Trial. We established a control group of cancer-free patients with nodules, matched (1:1) according to the year of cancer diagnosis. By generating 10,164 sequential decision episodes, we trained RL-based policy models incorporating nodule diameter alone, or combined with nodule appearance (attenuation and margin) and/or patient information (age, sex, smoking status, pack-years, and family history). We calculated rates of misdiagnosis, missed diagnosis, and delayed diagnosis, and compared the performance of RL-based policy models with rule-based follow-up protocols (National Comprehensive Cancer Network guideline; China Guideline for the Screening and Early Detection of Lung Cancer). RESULTS: We identified significant interactions between certain variables (e.g., nodule shape and patient smoking pack-years, beyond those considered in guideline protocols) and the selection of follow-up testing intervals, thereby impacting the quality of the decision sequence. In validation, one RL-based policy model achieved rates of 12.3% for misdiagnosis, 9.7% for missed diagnosis, and 11.7% for delayed diagnosis. Compared with the two rule-based protocols, the three best-performing RL-based policy models consistently demonstrated optimal performance for specific patient subgroups based on disease characteristics (benign or malignant), nodule phenotypes (size, shape, and attenuation), and individual attributes. CONCLUSIONS: This study highlights the potential of an RL-based approach that is both clinically interpretable and performance-robust for developing personalized lung cancer screening schedules. Our findings present opportunities for enhancing the current cancer screening system.
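To make the sequential decision structure concrete, here is a minimal tabular Q-learning sketch in Python; the states, actions, rewards, and hyperparameters are illustrative placeholders, not the authors' actual model.

```python
import random
from collections import defaultdict

# Hypothetical action set mirroring guideline options: immediate work-up
# or follow-up at 3, 6, or 12 months. States and rewards are placeholders.
ACTIONS = ["workup", "3mo", "6mo", "12mo"]
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

Q = defaultdict(float)  # maps (state, action) -> estimated value

def choose(state):
    """Epsilon-greedy action selection over follow-up intervals."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, done):
    """One tabular Q-learning backup along a screening decision episode."""
    target = reward if done else reward + GAMMA * max(
        Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])

# Example episode: states bundle discretized nodule diameter, appearance,
# and a patient covariate; rewards penalize missed or delayed diagnoses.
episode = [(("6-8mm", "solid", "smoker"), "6mo", -1.0,
            ("8-15mm", "solid", "smoker"), False),
           (("8-15mm", "solid", "smoker"), "workup", 10.0, None, True)]
for s, a, r, s2, done in episode:
    update(s, a, r, s2, done)
```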


Subjects
Early Detection of Cancer; Lung Neoplasms; Humans; Lung Neoplasms/diagnosis; Lung Neoplasms/diagnostic imaging; Male; Female; Early Detection of Cancer/methods; Middle Aged; Case-Control Studies; Aged; Retrospective Studies; Tomography, X-Ray Computed/methods; Reinforcement, Psychology; Precision Medicine/methods
2.
PNAS Nexus ; 3(7): pgae235, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38952456

ABSTRACT

We investigate the boundary between chemotaxis driven by spatial estimation of gradients and chemotaxis driven by temporal estimation. While it is well known that spatial chemotaxis becomes disadvantageous for small organisms at high noise levels, it is unclear whether there is a discontinuous switch of optimal strategies or a continuous transition exists. Here, we employ deep reinforcement learning to study the possible integration of spatial and temporal information in an a priori unconstrained manner. We parameterize such a combined chemotactic policy by a recurrent neural network and evaluate it using a minimal theoretical model of a chemotactic cell. By comparing with constrained variants of the policy, we show that it converges to purely temporal and spatial strategies at small and large cell sizes, respectively. We find that the transition between the regimes is continuous, with the combined strategy outperforming, in the transition region, both the constrained variants and models that explicitly integrate spatial and temporal information. Finally, by utilizing the attribution method of integrated gradients, we show that the policy relies on a nontrivial combination of spatially and temporally derived gradient information in a ratio that varies dynamically during the chemotactic trajectories.
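As a rough illustration of a recurrent chemotactic policy of the kind described, the following Python sketch shows a recurrent network whose inputs are noisy receptor readings (the spatial channel) and whose hidden state integrates history (the temporal channel); the architecture, sizes, and readout are assumptions, not the authors' model.

```python
import numpy as np

# Minimal recurrent policy sketch: eight noisy concentration readings from
# receptors around the cell feed a tanh RNN; the hidden state carries the
# measurement history; the output is a normalized 2D swimming direction.
rng = np.random.default_rng(0)
n_in, n_h = 8, 32
W_in = rng.normal(0, 0.1, (n_h, n_in))
W_h = rng.normal(0, 0.1, (n_h, n_h))
W_out = rng.normal(0, 0.1, (2, n_h))

def policy_step(h, receptor_readings):
    """One step of the recurrent chemotactic policy."""
    h = np.tanh(W_in @ receptor_readings + W_h @ h)
    direction = W_out @ h
    return h, direction / (np.linalg.norm(direction) + 1e-8)

h = np.zeros(n_h)
for _ in range(100):
    noisy = rng.normal(1.0, 0.5, n_in)   # high measurement noise
    h, move = policy_step(h, noisy)
```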

3.
Neural Netw ; 178: 106483, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38954893

ABSTRACT

In reinforcement learning, accurate estimation of the Q-value is crucial for acquiring an optimal policy. However, current successful Actor-Critic methods still suffer from underestimation bias, and a significant estimation bias exists regardless of the method used in the critic initialization phase. To address these challenges and reduce estimation errors, we propose CEILING, a simple and compatible framework that can be applied to any model-free Actor-Critic method. The core idea of CEILING is to evaluate the superiority of different estimation methods by incorporating the true Q-value, calculated via Monte Carlo, during the training process. CEILING consists of two implementations: the Direct Picking Operation and the Exponential Softmax Weighting Operation. The first selects the optimal method at each fixed step and applies it in subsequent interactions until the next selection. The second uses a nonlinear weighting function that dynamically assigns larger weights to more accurate methods. Theoretically, we demonstrate that our methods provide more accurate and stable Q-value estimation, and we analyze the upper bound of the estimation bias. Based on the two implementations, we propose specific algorithms and their variants, which achieve superior performance on several benchmark tasks.
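The Exponential Softmax Weighting Operation can be illustrated with a short Python sketch: candidate Q-estimates are weighted by how close they fall to a Monte Carlo estimate of the true return. The temperature and the absolute-error metric are assumptions; the paper's exact weighting function may differ.

```python
import numpy as np

def softmax_weights(q_estimates, q_monte_carlo, temperature=1.0):
    """Assign larger weights to estimators with smaller absolute error."""
    errors = np.abs(np.asarray(q_estimates) - q_monte_carlo)
    logits = -errors / temperature   # smaller error -> larger weight
    logits -= logits.max()           # numerical stability
    w = np.exp(logits)
    return w / w.sum()

def combined_q(q_estimates, q_monte_carlo):
    """Blend the candidate estimates by their softmax weights."""
    w = softmax_weights(q_estimates, q_monte_carlo)
    return float(np.dot(w, q_estimates))

# e.g. one over- and one under-estimating critic against an MC return of 5.0
print(combined_q([6.2, 4.9], 5.0))   # weighted toward the closer estimate
```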

4.
Elife ; 12, 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38959057

ABSTRACT

Songbirds' vocal mastery is impressive, but to what extent is it a result of practice? Can they, based on experienced mismatch with a known target, plan the necessary changes to recover the target in a practice-free manner without intermittently singing? In adult zebra finches, we drive the pitch of a song syllable away from its stable (baseline) variant acquired from a tutor, then we withdraw reinforcement and subsequently deprive them of singing experience by muting or deafening. In this deprived state, birds do not recover their baseline song. However, they revert their songs toward the target by about 1 standard deviation of their recent practice, provided the sensory feedback during the latter signaled a pitch mismatch with the target. Thus, targeted vocal plasticity does not require immediate sensory experience, showing that zebra finches are capable of goal-directed vocal planning.


Subjects
Finches; Goals; Vocalization, Animal; Animals; Vocalization, Animal/physiology; Finches/physiology; Male
5.
Article in English | MEDLINE | ID: mdl-38965085

ABSTRACT

RATIONALE: The potent synthetic opioid fentanyl, and its analogs, continue to drive opioid-related overdoses. Although the pharmacology of fentanyl is well characterized, there is little information about the reinforcing effects of clandestine fentanyl analogs (FAs). OBJECTIVES: Here, we compared the effects of fentanyl and the FAs acetylfentanyl, butyrylfentanyl, and cyclopropylfentanyl on drug self-administration in male and female rats. These FAs feature chemical modifications at the carbonyl moiety of the fentanyl scaffold. METHODS: Sprague-Dawley rats fitted with intravenous jugular catheters were placed in chambers containing two nose poke holes. Active nose poke responses resulted in drug delivery (0.2 mL) over 2 s on a fixed-ratio 1 schedule, followed by a 20 s timeout. Acquisition doses were 0.01 mg/kg/inj for fentanyl and cyclopropylfentanyl, and 0.03 mg/kg/inj for acetylfentanyl and butyrylfentanyl. After 10 days of acquisition, dose-effect testing was carried out, followed by 10 days of saline extinction. RESULTS: Self-administration of fentanyl and FAs was acquired by both male and female rats, with no sex differences in acquisition rate. Fentanyl and FAs showed partial inverted-U dose-effect functions; cyclopropylfentanyl and fentanyl had similar potency, while acetylfentanyl and butyrylfentanyl were less potent. Maximal response rates were similar across drugs, with fentanyl and cyclopropylfentanyl showing maximum responding at 0.001 mg/kg/inj, acetylfentanyl at 0.01 mg/kg/inj, and butyrylfentanyl at 0.003 mg/kg/inj. No sex differences were detected for drug potency, efficacy, or rates of extinction. CONCLUSIONS: Our work provides new evidence that FAs display significant abuse liability in male and female rats, which suggests the potential for compulsive use in humans.

6.
Tech Coloproctol ; 28(1): 79, 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38965146

ABSTRACT

BACKGROUND: Perineal hernia (PH) is a late complication of abdominoperineal resection (APR) that may compromise a patient's quality of life. The frequency of and risk factors for PH after robotic APR under recent rectal cancer treatment strategies remain unclear. METHODS: Patients who underwent robotic APR for rectal cancer between December 2011 and June 2022 were retrospectively examined. From July 2020, pelvic reinforcement procedures, such as robotic closure of the pelvic peritoneum and levator ani muscles, were performed as prophylactic procedures for PH whenever feasible. PH was diagnosed in patients with or without symptoms using computed tomography 1 year after surgery. We examined the frequency of PH, compared characteristics between patients with PH (PH+) and without PH (PH-), and identified risk factors for PH. RESULTS: We evaluated 142 patients, including 53 PH+ (37.3%) and 89 PH- (62.7%). PH+ patients had a significantly higher rate of preoperative chemoradiotherapy (26.4% versus 10.1%, p = 0.017) and a significantly lower rate of pelvic reinforcement procedures (1.9% versus 14.0%, p = 0.017). PH+ patients also had a lower rate of lateral lymph node dissection (47.2% versus 61.8%, p = 0.115) and a shorter operative time (340 min versus 394 min, p = 0.110). In multivariate analysis, the independent risk factors for PH were preoperative chemoradiotherapy, not undergoing lateral lymph node dissection, and not undergoing a pelvic reinforcement procedure. CONCLUSIONS: PH after robotic APR for rectal cancer is not a rare complication under recent treatment strategies, and prophylactic procedures for PH should be considered.


Subjects
Perineum; Postoperative Complications; Proctectomy; Rectal Neoplasms; Robotic Surgical Procedures; Humans; Retrospective Studies; Robotic Surgical Procedures/adverse effects; Robotic Surgical Procedures/methods; Male; Female; Risk Factors; Middle Aged; Perineum/surgery; Aged; Proctectomy/adverse effects; Proctectomy/methods; Rectal Neoplasms/surgery; Incidence; Postoperative Complications/etiology; Postoperative Complications/epidemiology; Hernia/etiology; Hernia/prevention & control; Hernia/epidemiology; Incisional Hernia/etiology; Incisional Hernia/prevention & control; Incisional Hernia/epidemiology
7.
Psych J ; 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38965885

ABSTRACT

Reward-processing dysfunction and deficient inhibitory control have been observed in Internet gaming disorder (IGD). However, it remains unclear whether prior reinforcement learning from reward/punishment feedback influences cognitive inhibitory control in IGD. This study used behavioral experiments to compare an IGD group with healthy controls without gaming experience on a probability selection task and a subsequent stop-signal task, in order to explore whether reward-learning ability is impaired in IGD and to examine the influence of previous reward learning on subsequent inhibitory control. The results showed that (1) during the reward-learning phase, the IGD group's accuracy was significantly lower than that of the control group; (2) compared with the control group, the IGD group's reaction times were longer in the transfer phase; and (3) in no-go trials of the inhibitory-control phase following reward learning, the IGD group's accuracy for reward-related stimuli was lower than for punishment-related or neutral stimuli, whereas the control group showed no significant difference among the three conditions. These findings indicate that reinforcement-learning ability is impaired in IGD, which further leads to abnormal responses to reinforcement-related stimuli.

8.
Artif Intell Med ; 154: 102920, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38972092

ABSTRACT

The development of closed-loop systems for glycemic control in type 1 diabetes relies heavily on simulated patients. Improving the performance and adaptability of these closed loops raises the risk of overfitting to the simulator. This may have dire consequences, especially in unusual cases that were not faithfully, if at all, captured by the simulator. To address this, we propose to use model-free offline RL agents, trained on real patient data, to perform glycemic control. To further improve performance, we propose an end-to-end personalization pipeline that leverages offline policy evaluation methods to remove the need for a simulator altogether, while still enabling estimation of clinically relevant metrics for diabetes.
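The abstract does not specify which offline policy evaluation estimator is used; one standard member of that family is per-trajectory importance sampling, sketched below under that assumption.

```python
import numpy as np

def importance_sampling_value(trajectories, pi_e, pi_b, gamma=0.99):
    """Estimate the value of an evaluation policy pi_e from logged data.

    trajectories: list of [(state, action, reward), ...] collected under
    the behaviour policy; pi_e and pi_b return action probabilities.
    """
    estimates = []
    for traj in trajectories:
        ratio, ret = 1.0, 0.0
        for t, (s, a, r) in enumerate(traj):
            # reweight by how much more likely pi_e was to take this action
            ratio *= pi_e(s, a) / max(pi_b(s, a), 1e-8)
            ret += (gamma ** t) * r
        estimates.append(ratio * ret)
    return float(np.mean(estimates))
```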

9.
MethodsX ; 12: 102790, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38966714

ABSTRACT

Stochastic Calculus-guided Reinforcement Learning (SCRL) is a new approach to decision-making under uncertainty, using principles from stochastic calculus to improve choices in complex situations. In tests, SCRL adapted well and outperformed traditional Stochastic Reinforcement Learning (SRL) methods: it had a lower dispersion value (63.49 versus 65.96 for SRL), meaning less variation in its results, and lower short- and long-term risk values (0.64 and 0.78, versus 18.64 and 10.41 for SRL), where lower risk values indicate less chance of adverse outcomes. Additional metrics, namely training rewards, learning progress, and rolling averages, were also assessed, and SCRL again outperformed SRL. These results make SCRL well suited to real-world situations where decisions must be made carefully.
• By leveraging mathematical principles derived from stochastic calculus, SCRL offers a robust framework for making informed choices and enhancing performance in complex scenarios.
• Compared with traditional SRL methods, SCRL demonstrates superior adaptability and efficacy, as evidenced by empirical tests.

10.
Sci Rep ; 14(1): 15294, 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38961120

ABSTRACT

Reliability mapping of 5G low-orbit constellation network slices is an important means of ensuring link and network communication, but state-space explosion is a typical problem in this setting, so a deep reinforcement learning method is introduced. Under an integrated 5G low-orbit constellation network architecture based on software-defined networking (SDN) and network function virtualization (NFV), the resource requirements and resource constraints of virtual network functions (VNFs) are comprehensively considered to build a reliability mapping model for 5G low-orbit constellation network slices, and the model parameters are trained with deep reinforcement learning, solving the state-space explosion problem in the reliability mapping process. In addition, importance-based node-backup and link-backup strategies are adopted to address VNF/link reliability requirements that are otherwise difficult to meet during reliability mapping. Experimental results show that this method improves the network throughput, packet-loss rate, and intra-slice traffic of the 5G low-orbit constellation, and can fully repair network faults within 0.3 s; for varying numbers of slice requests, its reliability remains above 98%; and for service function chains (SFCs) of different lengths, its average network delay is below 0.15 s.
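As an illustration of the importance-based backup strategy mentioned above, here is a toy Python sketch; the importance score (traffic weighted by unreliability) and the backup budget are assumptions for illustration only.

```python
# Rank VNF nodes by an importance score and give backups to the most
# critical nodes first; scoring formula and budget are illustrative.
def assign_backups(nodes, budget):
    """nodes: list of dicts with 'id', 'traffic', 'reliability' keys."""
    def importance(n):
        # nodes carrying more traffic and less reliable are more critical
        return n["traffic"] * (1.0 - n["reliability"])
    ranked = sorted(nodes, key=importance, reverse=True)
    return {n["id"] for n in ranked[:budget]}

nodes = [{"id": "vnf1", "traffic": 10.0, "reliability": 0.99},
         {"id": "vnf2", "traffic": 8.0, "reliability": 0.90},
         {"id": "vnf3", "traffic": 2.0, "reliability": 0.95}]
print(assign_backups(nodes, budget=1))   # {'vnf2'}
```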

11.
Stat Med ; 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38973591

ABSTRACT

We present a trial design for sequential multiple assignment randomized trials (SMARTs) that uses a tailoring function instead of a binary tailoring variable, allowing simultaneous development of the tailoring variable and estimation of dynamic treatment regimens (DTRs). We apply methods for developing DTRs from observational data: tree-based regression learning and Q-learning. We compare this design to a balanced randomized SMART with equal re-randomization probabilities and to a typical SMART design in which re-randomization depends on a binary tailoring variable and DTRs are analyzed with weighted and replicated regression. This project addresses a gap in clinical trial methodology by presenting SMARTs in which second-stage treatment is based on a continuous outcome, removing the need for a binary tailoring variable. We demonstrate that data from a SMART using a tailoring function can be used to efficiently estimate DTRs and is more flexible under varying scenarios than a SMART using a binary tailoring variable.
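A minimal sketch of Q-learning for a two-stage DTR, the analysis method named above: fit the stage-2 Q-function, plug the maximized stage-2 value back in as a pseudo-outcome, then fit stage 1. The linear models and feature construction below are illustrative assumptions, not the paper's specification.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def q_learning_two_stage(X1, A1, X2, A2, Y):
    """X1, X2: 1-D covariate arrays; A1, A2: treatments in {0, 1}; Y: outcome."""
    # Stage 2: regress Y on history and treatment, with an interaction term
    H2 = np.column_stack([X1, A1, X2, A2, X2 * A2])
    q2 = LinearRegression().fit(H2, Y)

    # Pseudo-outcome: predicted value under the better stage-2 treatment
    zeros, ones = np.zeros_like(A2), np.ones_like(A2)
    best = np.maximum(
        q2.predict(np.column_stack([X1, A1, X2, zeros, X2 * zeros])),
        q2.predict(np.column_stack([X1, A1, X2, ones, X2 * ones])))

    # Stage 1: regress the pseudo-outcome on baseline covariates
    H1 = np.column_stack([X1, A1, X1 * A1])
    q1 = LinearRegression().fit(H1, best)
    return q1, q2
```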

12.
Elife ; 12, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38953517

ABSTRACT

The hippocampal-dependent memory system and striatal-dependent memory system modulate reinforcement learning depending on feedback timing in adults, but their contributions during development remain unclear. In a 2-year longitudinal study, 6-to-7-year-old children performed a reinforcement learning task in which they received feedback immediately or with a short delay following their response. Children's learning was found to be sensitive to feedback timing modulations in their reaction time and inverse temperature parameter, which quantifies value-guided decision-making. They showed longitudinal improvements towards more optimal value-based learning, and their hippocampal volume showed protracted maturation. Better delayed model-derived learning covaried with larger hippocampal volume longitudinally, in line with the adult literature. In contrast, a larger striatal volume in children was associated with both better immediate and delayed model-derived learning longitudinally. These findings show, for the first time, an early hippocampal contribution to the dynamic development of reinforcement learning in middle childhood, with neurally less differentiated and more cooperative memory systems than in adults.
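For reference, the inverse temperature parameter mentioned above typically enters model-derived learning through a softmax choice rule over learned values; a standard formulation (the study's exact model specification is not reproduced here) is

```latex
P(a_t = a) = \frac{\exp\bigl(\beta\, Q_t(a)\bigr)}{\sum_{a'} \exp\bigl(\beta\, Q_t(a')\bigr)},
\qquad
Q_{t+1}(a_t) = Q_t(a_t) + \alpha\,\bigl(r_t - Q_t(a_t)\bigr)
```

where beta is the inverse temperature (larger values mean choices track learned values more deterministically) and alpha is the learning rate.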


Subjects
Corpus Striatum; Hippocampus; Learning; Reinforcement, Psychology; Humans; Child; Hippocampus/physiology; Longitudinal Studies; Female; Male; Corpus Striatum/physiology; Learning/physiology; Magnetic Resonance Imaging; Decision Making/physiology; Reaction Time/physiology
13.
Sci Rep ; 14(1): 15245, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38956183

ABSTRACT

In hybrid automatic insulin delivery (HAID) systems, meal disturbance is compensated by feedforward control, which requires the patient with type 1 diabetes to announce the meal in order to achieve the desired glycemic control performance. The insulin bolus in the HAID system is calculated from the amount of carbohydrates (CHO) in the meal and patient-specific parameters, i.e., the carbohydrate-to-insulin ratio (CR) and the insulin sensitivity-related correction factor (CF). Estimating the CHO content of a meal is prone to errors and is burdensome for patients. This study proposes a fully automatic insulin delivery (FAID) system that eliminates patient intervention by compensating for unannounced meals. A deep reinforcement learning (DRL) algorithm is exploited to calculate the insulin bolus for unannounced meals without using information on CHO content. The DRL bolus calculator is integrated with a closed-loop controller and a meal detector (both previously developed by our group) to implement the FAID system. An adult cohort of 68 virtual patients based on the modified UVa/Padova simulator was used for in-silico trials. The percentage of the overall duration spent in the target range of 70-180 mg/dL was 71.2% and 76.2%, below 70 mg/dL was 0.9% and 0.1%, and above 180 mg/dL was 26.7% and 21.1%, respectively, for the FAID system and the HAID system using a standard bolus calculator (SBC) including CHO misestimation. The proposed algorithm can be exploited to realize FAID systems in the future.
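The reported time-in-range percentages are simple fractions of glucose samples falling into each band; a small Python sketch with a synthetic trace (the thresholds follow the ranges quoted above):

```python
import numpy as np

def glycemic_metrics(glucose_mg_dl):
    """Percent of CGM samples in, below, and above the target band."""
    g = np.asarray(glucose_mg_dl, dtype=float)
    return {
        "TIR 70-180": float(np.mean((g >= 70) & (g <= 180)) * 100),
        "TBR <70": float(np.mean(g < 70) * 100),
        "TAR >180": float(np.mean(g > 180) * 100),
    }

# synthetic glucose trace in mg/dL, purely for illustration
print(glycemic_metrics([65, 90, 140, 210, 180, 75]))
```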


Subjects
Deep Learning; Diabetes Mellitus, Type 1; Insulin Infusion Systems; Insulin; Insulin/administration & dosage; Humans; Diabetes Mellitus, Type 1/drug therapy; Diabetes Mellitus, Type 1/blood; Algorithms; Blood Glucose/analysis; Adult; Hypoglycemic Agents/administration & dosage
14.
Sci Rep ; 14(1): 15103, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38956201

ABSTRACT

One of the long-term goals of reinforcement learning is to build intelligent agents capable of rapidly learning and flexibly transferring skills, similar to humans and animals. In this paper, we introduce an episodic control framework based on the temporal expansion of successor features to achieve these goals, which we refer to as Temporally Extended Successor Feature Neural Episodic Control (TESFNEC). This method has shown impressive results in significantly improving sample efficiency and elegantly reusing previously learned strategies. Crucially, the model enhances agent training by incorporating episodic memory, significantly reducing the number of iterations required to learn the optimal policy. Furthermore, we adopt the temporal expansion of successor features, a technique to capture the expected state-transition dynamics of actions. This form of temporal abstraction does not entail learning a top-down hierarchy of task structures but focuses on the bottom-up combination of actions and action repetitions. Thus, our approach directly considers the temporal scope of sequences of temporally extended actions without requiring predefined or domain-specific options. Experimental results in a two-dimensional object-collection environment demonstrate that the proposed method learns policies faster than baseline reinforcement learning approaches, leading to higher average returns.
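For reference, successor features, which TESFNEC temporally extends, decompose the action-value function into expected discounted features and a reward-weight vector; the standard formulation (the paper's temporal extension is not reproduced here) is

```latex
\psi^{\pi}(s,a) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\,\phi(s_t,a_t) \,\middle|\, s_0 = s,\ a_0 = a\right],
\qquad
Q^{\pi}(s,a) = \psi^{\pi}(s,a)^{\top} w
```

so that a learned psi can be reused across tasks whose rewards differ only in the weight vector w.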

15.
Mol Neurobiol ; 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38987488

ABSTRACT

Neuropeptide cocaine- and amphetamine-regulated transcript peptide (CARTp) is known to play an important role in reward processing. Rats conditioned to intra-cranial self-stimulation (ICSS) showed massive upregulation of CART protein and mRNA in the vicinity of the electrode implanted to deliver electric current directly to the lateral hypothalamus (LH)-medial forebrain bundle (MFB) area. However, the underlying mechanisms leading to the upregulation of CART in ICSS animals remain elusive. We tested the putative role of CREB-binding protein (CBP), an epigenetic enzyme with intrinsic histone acetyltransferase (HAT) activity, in regulating CART expression during ICSS. An electrode was implanted in the LH-MFB and rats were conditioned to self-stimulation in an operant chamber. CBP siRNA was delivered ipsilaterally into the LH-MFB to knock down CBP, and the effects on lever-press activity were monitored. While ICSS-conditioned rats showed a distinct increase in CART, CBP, and pCREB levels, enhanced CBP binding and histone acetylation (H3K9ac) were observed at the CART promoter in a chromatin immunoprecipitation assay. Direct infusion of CBP siRNA into the LH-MFB lowered lever-press activity, CBP levels, histone acetylation at the CART promoter, and CART mRNA and peptide expression. Co-infusion of CARTp into the LH-MFB rescued the waning effects of CBP siRNA on self-stimulation. We suggest that CBP-mediated histone acetylation may play a causal role in CART expression in the LH, which in turn may drive the positive reinforcement of lever-press activity.

16.
Subst Use Misuse ; : 1-8, 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38987988

ABSTRACT

BACKGROUND: Alcohol use is a gendered behavior, and motherhood is a life stage that may influence drinking motives. However, there are no drinking-motive scales uniquely tailored to maternal populations. This work developed a new maternal drinking motives scale (M-DMS) and determined associations between the M-DMS and alcohol-related behavior. METHODS: An online observational survey (n = 534) and an online test-retest survey (n = 164) were conducted with adult UK mothers. From the observational study, data on drinking motives were extracted to determine M-DMS items and factor loadings; these data were split into two sets for exploratory and confirmatory factor analyses. Alcohol Use Disorders Identification Test (AUDIT) and Timeline Followback data from both surveys were combined to determine the M-DMS's predictive validity. RESULTS: Following parallel analysis and exploratory factor analysis, a two-factor model (positive reinforcement motives, negative reinforcement motives) was deemed the best fit. Probability functional analysis identified items with problematic responses, which were removed before confirmatory factor analysis (on the second dataset) demonstrated a good fit for the two-factor model. All factor loadings were significant and positive (ßs > 0.56). Reliability of the two subscales was excellent: negative reinforcement (ωT = 0.95), positive reinforcement (ωT = 0.89). Test-retest reliability was good for both the negative (ICC = 0.84, 95% CI = 0.80-0.88) and positive (ICC = 0.77, 95% CI = 0.71-0.82) subscales. Both subscales predicted AUDIT scores and quantity of alcohol consumption (ps < 0.001). CONCLUSION: The first tailored Maternal Drinking Motives Scale (M-DMS) provides a more valid research tool for assessing psychological mechanisms of alcohol use in mothers.

17.
Article in English | MEDLINE | ID: mdl-38988197

ABSTRACT

Different dopamine receptor subtypes have opposing dynamics at post-synaptic sites, with the ratio of D1 to D2 receptors determining the relative sensitivity to gains and losses, respectively, during value-based learning. This effective sensitivity to different reward feedback interacts with phasic dopamine levels to determine the effectiveness of learning, particularly in dynamic feedback situations where the frequency and magnitude of rewards need to be integrated over time to make optimal decisions. We modeled this effect in simulations of the underlying basal ganglia pathways and then tested the predictions in individuals with a variant of the human dopamine receptor D2 (DRD2; -141C Ins/Del and Del/Del) gene that is associated with lower levels of D2 receptor expression (N = 119), comparing their performance on the Iowa Gambling Task (IGT) to that of non-carrier controls (N = 319). Ventral striatal (VS) reactivity to rewards was measured in the Cards task with fMRI. DRD2 variant carriers made less effective decisions than non-carriers, but this effect was not moderated by VS reward reactivity as hypothesized by our model. These results suggest that the interaction between dopamine receptor subtypes and reactivity to rewards during learning may be more complex than originally thought.

18.
J Vasc Surg Cases Innov Tech ; 10(4): 101543, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38994221

ABSTRACT

Anastomosis of the prosthetic graft to the double-barreled aorta with intimal flap fenestration is a useful technique in surgery for chronic aortic dissection. Conversely, anastomosis to the false lumen's outer wall is prone to complications such as pseudoaneurysms, but little is known about the technique of reinforcing the double-barreled aorta. In this report, we describe a surgical case of chronic aortic dissection in which an H-shaped prosthetic graft was sutured to both aortic lumens, including the intimal flap, to prevent complications at the anastomosis site.

19.
Conscious Cogn ; 123: 103724, 2024 Jul 11.
Article in English | MEDLINE | ID: mdl-38996747

ABSTRACT

The learning process encompasses exploration and exploitation phases. While reinforcement learning models have revealed functional and neuroscientific distinctions between these phases, little is known about how they affect visual attention during observation of the external environment. This study sought to elucidate the interplay between these learning phases and visual attention allocation using visual adjustment tasks combined with a two-armed bandit problem tailored to detect serial effects only when attention is dispersed across both arms. We found that human participants exhibited a distinct serial effect only during the exploration phase, suggesting enhanced attention to the visual stimulus associated with the non-target arm. Remarkably, although rewards did not motivate attention dispersion in our task, individuals engaged in active observation during the exploration phase, searching for targets to observe. This behavior highlights a unique information-seeking process in exploration that is distinct from exploitation.
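For readers unfamiliar with the paradigm, a minimal two-armed bandit with an epsilon-greedy rule makes the exploration/exploitation split concrete; the payoff probabilities and choice rule below are illustrative, not the task parameters used with participants.

```python
import random

# Two-armed bandit: with probability 0.1 the agent explores (random arm),
# otherwise it exploits the arm with the higher running value estimate.
p_reward = [0.3, 0.7]            # hypothetical arm payoff probabilities
values, counts = [0.0, 0.0], [0, 0]

for trial in range(200):
    exploring = random.random() < 0.1
    arm = random.randrange(2) if exploring else values.index(max(values))
    r = 1.0 if random.random() < p_reward[arm] else 0.0
    counts[arm] += 1
    values[arm] += (r - values[arm]) / counts[arm]   # running-mean update
```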

20.
PeerJ Comput Sci ; 10: e2141, 2024.
Article in English | MEDLINE | ID: mdl-38983203

ABSTRACT

Reinforcement learning based hyper-heuristics (RL-HH) are a popular trend in the field of optimization. RL-HH combines the global search ability of hyper-heuristics (HH) with the learning ability of reinforcement learning (RL). This synergy allows the agent to dynamically adjust its own strategy, leading to gradual optimization of the solution. Existing research has shown the effectiveness of RL-HH in solving complex real-world problems. However, a comprehensive introduction to and summary of the RL-HH field has been lacking. This article reviews existing RL-HHs and presents a general framework for them, categorizing the algorithms into two classes: value-based reinforcement learning hyper-heuristics and policy-based reinforcement learning hyper-heuristics. Typical algorithms in each category are summarized and described in detail. Finally, shortcomings of existing research on RL-HH and future research directions are discussed.
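To make the value-based category concrete, here is a minimal Python sketch of a Q-learning hyper-heuristic that chooses which low-level heuristic to apply next; the single aggregate state, the improvement-based reward, and the acceptance rule are simplifying assumptions.

```python
import random
from collections import defaultdict

def rl_hyper_heuristic(initial, heuristics, cost, steps=1000,
                       alpha=0.1, gamma=0.9, eps=0.2):
    """Q-learning over low-level heuristics (e.g. swap, insert, reverse).

    heuristics: list of callables solution -> new solution;
    cost: callable solution -> objective value (lower is better).
    """
    Q = defaultdict(float)
    state, sol = 0, initial          # one aggregate state, for brevity
    for _ in range(steps):
        if random.random() < eps:
            h = random.randrange(len(heuristics))
        else:
            h = max(range(len(heuristics)), key=lambda i: Q[(state, i)])
        new_sol = heuristics[h](sol)
        reward = cost(sol) - cost(new_sol)      # positive if improved
        Q[(state, h)] += alpha * (
            reward
            + gamma * max(Q[(state, i)] for i in range(len(heuristics)))
            - Q[(state, h)])
        if reward >= 0:
            sol = new_sol            # accept non-worsening moves
    return sol
```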
