Results 1 - 7 of 7
1.
PLoS One ; 18(6): e0286879, 2023.
Article in English | MEDLINE | ID: mdl-37294783

ABSTRACT

Web archives, such as the Internet Archive, preserve the web and allow access to prior states of web pages. We implicitly trust their versions of archived pages, but as their role moves from preserving curios of the past to facilitating present day adjudication, we are concerned with verifying the fixity of archived web pages, or mementos, to ensure they have always remained unaltered. A widely used technique in digital preservation to verify the fixity of an archived resource is to periodically compute a cryptographic hash value on a resource and then compare it with a previous hash value. If the hash values generated on the same resource are identical, then the fixity of the resource is verified. We tested this process by conducting a study on 16,627 mementos from 17 public web archives. We replayed and downloaded the mementos 39 times using a headless browser over a period of 442 days and generated a hash for each memento after each download, resulting in 39 hashes per memento. The hash is calculated by including not only the content of the base HTML of a memento but also all embedded resources, such as images and style sheets. We expected to always observe the same hash for a memento regardless of the number of downloads. However, our results indicate that 88.45% of mementos produce more than one unique hash value, and about 16% (or one in six) of those mementos always produce different hash values. We identify and quantify the types of changes that cause the same memento to produce different hashes. These results point to the need for defining an archive-aware hashing function, as conventional hashing functions are not suitable for replayed archived web pages.
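The fixity check described in this abstract (hash the base HTML plus all embedded resources on each replay, then compare the hashes) can be sketched as follows. This is a minimal illustration, not the authors' code: SHA-256, the function names, and the toy byte strings are all assumptions, since the abstract only says "cryptographic hash".

```python
import hashlib


def memento_hash(base_html: bytes, embedded: dict[str, bytes]) -> str:
    """Hash the base HTML together with every embedded resource.

    SHA-256 is an assumed choice; the abstract does not name the algorithm.
    """
    digest = hashlib.sha256()
    digest.update(base_html)
    # Sort by URI so the hash does not depend on download order.
    for uri in sorted(embedded):
        digest.update(uri.encode("utf-8"))
        digest.update(embedded[uri])
    return digest.hexdigest()


def fixity_verified(hashes: list[str]) -> bool:
    """Fixity holds only if every replay produced the same hash."""
    return len(set(hashes)) == 1


# Hypothetical replays of one memento. Replay 2 differs only in download
# order; replay 3 receives a stylesheet whose bytes have changed.
replay_1 = memento_hash(b"<html>page</html>",
                        {"style.css": b"body{}", "logo.png": b"\x89PNG"})
replay_2 = memento_hash(b"<html>page</html>",
                        {"logo.png": b"\x89PNG", "style.css": b"body{}"})
replay_3 = memento_hash(b"<html>page</html>",
                        {"style.css": b"body{margin:0}", "logo.png": b"\x89PNG"})

print(fixity_verified([replay_1, replay_2]))
print(fixity_verified([replay_1, replay_2, replay_3]))
```

A single changed byte in any embedded resource changes the hash, which is why a conventional hash over a replayed page is so sensitive to replay-time variation.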


Subjects
Archives , Asteraceae
2.
PLoS One ; 11(12): e0167475, 2016.
Article in English | MEDLINE | ID: mdl-27911955

ABSTRACT

Increasingly, scholarly articles contain URI references to "web at large" resources including project web sites, scholarly wikis, ontologies, online debates, presentations, blogs, and videos. Authors reference such resources to provide essential context for the research they report on. A reader who visits a web at large resource by following a URI reference in an article, some time after its publication, is led to believe that the resource's content is representative of what the author originally referenced. However, due to the dynamic nature of the web, that may very well not be the case. We reuse a dataset from a previous study in which several authors of this paper were involved, and investigate to what extent the textual content of web at large resources referenced in a vast collection of Science, Technology, and Medicine (STM) articles published between 1997 and 2012 has remained stable since the publication of the referencing article. We do so in a two-step approach that relies on various well-established similarity measures to compare textual content. In a first step, we use 19 web archives to find snapshots of referenced web at large resources that have textual content that is representative of the state of the resource around the time of publication of the referencing paper. We find that representative snapshots exist for about 30% of all URI references. In a second step, we compare the textual content of representative snapshots with that of their live web counterparts. We find that for over 75% of references the content has drifted away from what it was when referenced. These results raise significant concerns regarding the long-term integrity of the web-based scholarly record and call for the deployment of techniques to combat these problems.
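The second step above, comparing the text of an archived snapshot with its live counterpart, can be sketched with two common similarity measures. The abstract does not name the measures the authors used, so word-set Jaccard similarity and the `difflib` sequence ratio are stand-ins, and the 0.9 threshold and the sample strings are purely hypothetical.

```python
from difflib import SequenceMatcher


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the two texts' lowercase word sets."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty texts are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


def sequence_ratio(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a, b).ratio()


def has_drifted(snapshot: str, live: str, threshold: float = 0.9) -> bool:
    """Flag drift when any measure falls below the (assumed) threshold."""
    return min(jaccard(snapshot, live), sequence_ratio(snapshot, live)) < threshold


# Hypothetical snapshot text and two possible live-web states.
snapshot_text = "project home page with downloadable data sets"
unchanged_live = "project home page with downloadable data sets"
replaced_live = "this domain is for sale"

print(has_drifted(snapshot_text, unchanged_live))
print(has_drifted(snapshot_text, replaced_live))
```

Requiring every measure to stay above the threshold is a conservative design choice made for this sketch; a study could equally combine or weight the measures differently.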


Subjects
Internet , Publications
3.
PLoS One ; 9(12): e115253, 2014.
Article in English | MEDLINE | ID: mdl-25541969

ABSTRACT

The emergence of the web has fundamentally affected most aspects of information communication, including scholarly communication. The immediacy that characterizes publishing information to the web, as well as accessing it, allows for a dramatic increase in the speed of dissemination of scholarly knowledge. However, the transition from a paper-based to a web-based scholarly communication system also poses challenges. In this paper, we focus on reference rot, the combination of link rot and content drift to which references to web resources included in Science, Technology, and Medicine (STM) articles are subject. We investigate the extent to which reference rot impacts the ability to revisit the web context that surrounds STM articles some time after their publication. We do so on the basis of a vast collection of articles from three corpora that span publication years 1997 to 2012. For over one million references to web resources extracted from over 3.5 million articles, we determine whether the HTTP URI is still responsive on the live web and whether web archives contain an archived snapshot representative of the state the referenced resource had at the time it was referenced. We observe that the fraction of articles containing references to web resources is growing steadily over time. We find one out of five STM articles suffering from reference rot, meaning it is impossible to revisit the web context that surrounds them some time after their publication. When only considering STM articles that contain references to web resources, this fraction increases to seven out of ten. We suggest that, in order to safeguard the long-term integrity of the web-based scholarly record, robust solutions to combat the reference rot problem are required. In conclusion, we provide a brief insight into the directions that are explored in this regard in the context of the Hiberlink project.
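The two fractions reported above ("one out of five" of all articles versus "seven out of ten" of articles with web references) differ only in their denominator. The sketch below illustrates that arithmetic on toy data. It assumes an article counts as affected when at least one of its references is rotten; the abstract does not spell out the aggregation rule, and the data structure and article identifiers are hypothetical.

```python
def rot_fractions(articles: dict[str, list[bool]]) -> tuple[float, float]:
    """Return (share of all articles, share of articles with web references)
    that have at least one rotten reference.

    `articles` maps an article id to one boolean per web reference,
    True meaning that reference is rotten.
    """
    with_refs = [flags for flags in articles.values() if flags]
    affected = sum(1 for flags in with_refs if any(flags))
    share_of_all = affected / len(articles) if articles else 0.0
    share_of_with_refs = affected / len(with_refs) if with_refs else 0.0
    return share_of_all, share_of_with_refs


# Toy corpus: three articles without web references, two with.
toy_corpus = {
    "article-1": [],
    "article-2": [],
    "article-3": [],
    "article-4": [True, False],  # one rotten reference
    "article-5": [False],        # reference still intact
}

share_of_all, share_of_with_refs = rot_fractions(toy_corpus)
print(share_of_all, share_of_with_refs)
```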


Subjects
Access to Information , Internet , Periodicals as Topic/statistics & numerical data , Publishing/trends , Periodicals as Topic/standards , Time Factors
4.
PLoS One ; 4(6): e6022, 2009 Jun 29.
Article in English | MEDLINE | ID: mdl-19562078

ABSTRACT

BACKGROUND: The impact of scientific publications has traditionally been expressed in terms of citation counts. However, scientific activity has moved online over the past decade. To better capture scientific impact in the digital era, a variety of new impact measures has been proposed on the basis of social network analysis and usage log data. Here we investigate how these new measures relate to each other, and how accurately and completely they express scientific impact. METHODOLOGY: We performed a principal component analysis of the rankings produced by 39 existing and proposed measures of scholarly impact that were calculated on the basis of both citation and usage log data. CONCLUSIONS: Our results indicate that the notion of scientific impact is a multi-dimensional construct that cannot be adequately measured by any single indicator, although some measures are more suitable than others. The commonly used citation Impact Factor is not positioned at the core of this construct, but at its periphery, and should thus be used with caution.
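A principal component analysis of rankings, as described in the methodology, can be sketched in pure Python on a toy example. This is only an illustration under stated assumptions: three made-up measures over five hypothetical journals stand in for the 39 measures, ties in the rankings are ignored, and power iteration is used to extract just the first component.

```python
import math


def rank(values: list[float]) -> list[int]:
    """Convert raw scores to ranks (1 = highest score; ties not handled)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def first_component(columns: list[list[float]], iterations: int = 200):
    """Return (loadings, explained_share) of the first principal component.

    `columns` holds one list of ranks per measure. The covariance matrix is
    built explicitly and its leading eigenvector found by power iteration.
    """
    n, k = len(columns[0]), len(columns)
    centered = [[x - sum(col) / n for x in col] for col in columns]
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / (n - 1)
            for j in range(k)] for i in range(k)]
    vec = [1.0] * k
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(k))
                     for i in range(k))
    total_variance = sum(cov[i][i] for i in range(k))
    return vec, eigenvalue / total_variance


# Hypothetical scores for five journals under three made-up measures.
citation_measure = [90, 70, 50, 30, 10]
usage_measure = [80, 85, 40, 35, 5]
other_measure = [10, 50, 20, 60, 30]

loadings, explained_share = first_component(
    [rank(citation_measure), rank(usage_measure), rank(other_measure)])
print(loadings, explained_share)
```

Measures whose loadings share a sign and similar magnitude rank journals alike; a measure loading in the opposite direction, like the third one here, is capturing a different dimension.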


Subjects
Bibliometrics , Journal Impact Factor , Periodicals as Topic/standards , Publishing/standards , Access to Information , Databases, Factual , Information Dissemination , Periodicals as Topic/trends , Principal Component Analysis , Publishing/trends
5.
PLoS One ; 4(3): e4803, 2009.
Article in English | MEDLINE | ID: mdl-19277205

ABSTRACT

BACKGROUND: Intricate maps of science have been created from citation data to visualize the structure of scientific activity. However, most scientific publications are now accessed online. Scholarly web portals record detailed log data at a scale that exceeds the number of all existing citations combined. Such log data is recorded immediately upon publication and keeps track of the sequences of user requests (clickstreams) that are issued by a variety of users across many different domains. Given these advantages of log datasets over citation data, we investigate whether they can produce high-resolution, more current maps of science. METHODOLOGY: Over the course of 2007 and 2008, we collected nearly 1 billion user interactions recorded by the scholarly web portals of some of the most significant publishers, aggregators and institutional consortia. The resulting reference data set covers a significant part of world-wide use of scholarly web portals in 2006, and provides a balanced coverage of the humanities, social sciences, and natural sciences. A journal clickstream model, i.e. a first-order Markov chain, was extracted from the sequences of user interactions in the logs. The clickstream model was validated by comparing it to the Getty Research Institute's Architecture and Art Thesaurus. The resulting model was visualized as a journal network that outlines the relationships between various scientific domains and clarifies the connection of the social sciences and humanities to the natural sciences. CONCLUSIONS: Maps of science resulting from large-scale clickstream data provide a detailed, contemporary view of scientific activity and correct the underrepresentation of the social sciences and humanities that is commonly found in citation data.
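The journal clickstream model mentioned in the methodology is a first-order Markov chain: the probability of moving to a journal depends only on the journal currently being viewed. A minimal sketch of estimating such a chain from session sequences follows. The session data and journal labels are invented for illustration, and nothing here reflects the authors' actual log format or preprocessing.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions: list[list[str]]) -> dict[str, dict[str, float]]:
    """Estimate first-order transition probabilities from clickstreams.

    Each session is the ordered list of journals one user requested.
    """
    counts: dict[str, Counter] = defaultdict(Counter)
    for session in sessions:
        # Count every consecutive (current, next) pair in the session.
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return {
        source: {target: n / sum(targets.values()) for target, n in targets.items()}
        for source, targets in counts.items()
    }


# Hypothetical user sessions over three made-up journals.
toy_sessions = [
    ["journal_A", "journal_B", "journal_C"],
    ["journal_A", "journal_B"],
    ["journal_A", "journal_C"],
]

model = transition_probabilities(toy_sessions)
for source, targets in model.items():
    print(source, targets)
```

Treating the estimated probabilities as weighted edges between journals yields the kind of journal network the abstract describes.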


Subjects
Bibliometrics , Research/statistics & numerical data , Algorithms , Databases, Bibliographic/statistics & numerical data , Humanities/statistics & numerical data , Markov Chains , Models, Theoretical , Natural Science Disciplines/statistics & numerical data , Online Systems , Periodicals as Topic/statistics & numerical data , Social Sciences/statistics & numerical data
6.
Internet resource in English | LIS - Localizador de Informação em Saúde | ID: lis-20889

ABSTRACT

It explains what the Open Archives Initiative is, its mission and how the process developed by the OAI works.


Subjects
50111 , Scientific Communication and Diffusion
7.
J Am Acad Dermatol ; 57(1): 116-9, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17499388

ABSTRACT

BACKGROUND: Thomson Institute for Scientific Information's journal impact factor, the most common measure of journal status, is based on crude citation counts that do not account for the quality of the journals where the citations originate. This study examines how accounting for citation origin affects the impact factor ranking of dermatology journals. METHODS: The 2003 impact factors of dermatology journals were adjusted by a weighted PageRank algorithm that assigned greater weight to citations originating in more frequently cited journals. RESULTS: Adjusting for citation origin moved the rank of the Journal of the American Academy of Dermatology higher than that of the Archives of Dermatology (third to second) but did not affect the ranking of the highest impact dermatology journal, the Journal of Investigative Dermatology. The dermatology journals most positively affected by adjusting for citation origin were Contact Dermatitis (moving from 22nd to 7th in rankings) and Burns (21st to 10th). Dermatology journals most negatively affected were Seminars in Cutaneous Medicine and Surgery (5th to 14th), the Journal of Cutaneous Medicine and Surgery (19th to 27th), and the Journal of Investigative Dermatology Symposium Proceedings (26th to 34th). LIMITATIONS: Current measures of dermatology journal status do not incorporate survey data from dermatologists regarding which journals dermatologists esteem most. CONCLUSION: Adjusting for citation origin provides a more refined measure of journal status and changes relative dermatology journal rankings.
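The idea behind the adjustment, giving more weight to citations that originate in frequently cited journals, can be illustrated with a weighted PageRank over a journal citation graph. This sketch is not the study's procedure: the damping factor of 0.85, the handling of journals with no outgoing citations, and the toy citation counts are all assumptions, and the abstract does not say how the PageRank weights were combined with the 2003 impact factors.

```python
def weighted_pagerank(citations: dict[str, dict[str, int]],
                      damping: float = 0.85,
                      iterations: int = 100) -> dict[str, float]:
    """Weighted PageRank by power iteration.

    `citations[source][target]` is the number of citations from `source`
    to `target`; a journal's score is split across the journals it cites
    in proportion to those counts.
    """
    journals = sorted(set(citations) |
                      {t for targets in citations.values() for t in targets})
    n = len(journals)
    score = {j: 1.0 / n for j in journals}
    for _ in range(iterations):
        nxt = {j: (1.0 - damping) / n for j in journals}
        for source in journals:
            outgoing = citations.get(source, {})
            total = sum(outgoing.values())
            if total == 0:
                # Journal cites nothing: spread its score evenly (assumption).
                for j in journals:
                    nxt[j] += damping * score[source] / n
            else:
                for target, count in outgoing.items():
                    nxt[target] += damping * score[source] * count / total
        score = nxt
    return score


# Hypothetical citation counts among three made-up journals.
toy_citations = {
    "journal_A": {"journal_B": 10},
    "journal_B": {"journal_A": 8, "journal_C": 2},
    "journal_C": {"journal_B": 5},
}

scores = weighted_pagerank(toy_citations)
for journal in sorted(scores, key=scores.get, reverse=True):
    print(journal, round(scores[journal], 3))
```

In this toy graph journal_A outranks journal_C even though each is cited only by journal_B, because it receives the larger share of that journal's citations.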


Subjects
Bibliometrics , Dermatology , Periodicals as Topic , Algorithms , Internet , Publishing/statistics & numerical data