Data Types, Data Doubts & Data Trusts
In: João Marinotti, Data Types, Data Doubts & Data Trusts, New York University Law Review Online (Forthcoming)
SSRN
Replicating data under Eventual Consistency (EC) allows any replica to accept updates without remote synchronisation. This ensures performance and scalability in large-scale distributed systems (e.g., clouds). However, published EC approaches are ad hoc and error-prone. Under a formal Strong Eventual Consistency (SEC) model, we study sufficient conditions for convergence. A data type that satisfies these conditions is called a Conflict-free Replicated Data Type (CRDT). Replicas of any CRDT are guaranteed to converge in a self-stabilising manner, despite any number of failures. This paper formalises two popular approaches (state- and operation-based) and their relevant sufficient conditions. We study a number of useful CRDTs, such as sets with clean semantics, supporting both add and remove operations, and consider in depth the more complex Graph data type. CRDTs can be composed to develop large-scale distributed applications, and have interesting theoretical properties.
BASE
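The state-based approach named in the abstract above can be illustrated with a small example. The following is a minimal sketch in Python, not the paper's own notation: the class name `TwoPhaseSet` and its method names are hypothetical, and the two-phase set is only one simple instance of a set supporting both add and remove. It assumes the usual state-based conditions: updates only grow the state, and `merge` is commutative, associative, and idempotent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoPhaseSet:
    """Hypothetical state-based set: an element can be added, then removed, never re-added."""

    added: frozenset = frozenset()
    removed: frozenset = frozenset()

    def add(self, item):
        # Updates only grow the state, so they are monotonic.
        return TwoPhaseSet(self.added | {item}, self.removed)

    def remove(self, item):
        # A replica may only remove an element it has already seen added.
        if item not in self.added:
            return self
        return TwoPhaseSet(self.added, self.removed | {item})

    def value(self):
        # Visible elements: added and not yet removed.
        return self.added - self.removed

    def merge(self, other):
        # Component-wise union: commutative, associative, and idempotent.
        return TwoPhaseSet(self.added | other.added, self.removed | other.removed)


# Two replicas accept updates independently, without synchronising.
replica_a = TwoPhaseSet().add("x").add("y").remove("x")
replica_b = TwoPhaseSet().add("x").add("z")

# Exchanging states in either order yields the same result.
merged_ab = replica_a.merge(replica_b)
merged_ba = replica_b.merge(replica_a)
```

In this sketch both merge orders give the visible set {"y", "z"}, and merging a state with itself changes nothing, which is why replicas that have received the same updates agree without any coordination.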
The design of Conflict-free Replicated Data Types traditionally requires implementing new designs from scratch to meet a desired behavior. Although there are composition rules that can guide the process, little work has explained how existing data types relate to each other or factored out common patterns. To bring clarity to the field, we explain underlying patterns that are common to flags, sets, and registers. The identified patterns are succinct and composable, which gives them the power both to explain current designs and to open up the space for new ones. ; This work was partially supported by the European Union H2020 LightKone project under grant 732505 (https://www. ...
BASE
In: IZA Discussion Paper No. 15586
SSRN
In: The economic journal: the journal of the Royal Economic Society, Volume 134, Issue 659, pp. 985-1018
ISSN: 1468-0297
Abstract
This paper examines the relationship between p-hacking, publication bias and data-sharing policies. We collect 38,876 test statistics from 1,106 articles published in leading economic journals between 2002 and 2020. We find that, while data-sharing policies increase the provision of data, they do not decrease the extent of p-hacking and publication bias. Similarly, articles that use hard-to-access administrative data or third-party surveys do not differ in the extent of p-hacking and publication bias from those that use easier-to-access (e.g., author-collected) data. Voluntary provision of data by authors on their home pages offers no evidence of reduced p-hacking.
SSRN
In today's information-focused world, there is no lack of entities focused on information gathering. However, there is still a widespread epidemic of information starvation in the Department of Defense (DoD). This starvation is attributed to the lack of interoperability between information gatherers and information consumers. To alleviate this problem, the DoD has put forth a vision of a Joint Battlespace Infosphere (JBI). This research proposes a framework for sharing and finding resources in a JBI. The framework uses an extensible metadata specification, agent technology, and the Control of Agent Based Systems (CoABS). It provides several tools for publication and subscription of resources, including a visual query wizard and a visualization of the results. Together, the framework and its tools provide visual query capability for the heterogeneous resources within the JBI.
BASE
SSRN
Working paper
In: IASSIST quarterly: IQ, Volume 45, Issue 1
ISSN: 2331-4141
Linking social media data with survey data is a way to combine the unique strengths and address some of the respective limitations of these two data types. Because such linked data can be quite disclosive and potentially sensitive, it is important that researchers obtain informed consent from the individuals whose data are being linked. When formulating appropriate informed consent, there are several things that researchers need to take into account. Besides legal and ethical questions, key aspects to consider are the differences between platforms and data types. Depending on what type of social media data is collected, how the data are collected, and from which platform(s), different points need to be addressed in the informed consent. In this paper, we present three case studies in which survey data were linked with data from 1) Twitter, 2) Facebook, and 3) LinkedIn and discuss how the specific features of the platforms and data collection methods were covered in the informed consent. We compare the key attributes of these platforms that are relevant for the formulation of informed consent and also discuss scenarios of social media data collection and linking in which obtaining informed consent is not necessary. By presenting the specific case studies as well as general considerations, this paper is meant to provide guidance on informed consent for linked survey and social media data for both researchers and archivists working with this type of data.
In: Epitheōrēsē koinōnikōn ereunōn: The Greek review of social research, pp. 193-219
ISSN: 2241-8512
Over the past fifteen years, technology has contributed to the emergence of new types of data, particularly big data, influencing the methods of observation, study, and measurement of social phenomena from the perspective of the social sciences. The increasing digitization of social activities generates vast amounts of data that fuel contemplation about the way modern societies function. Additionally, factors such as the recent COVID-19 pandemic with mandatory social distancing have contributed to the creation of a favourable environment for the generation of new types of data, with an emphasis on big data. Within this ongoing transformation of the data landscape, we will attempt to pose questions related to the environment of Data Repositories/Research Infrastructures and the means/methods of addressing and managing these data. It appears that social research is shifting towards a more "data-driven approach", which requires new skills and capabilities at the intersection of the computational and social sciences. One of the major issues that arise is the potential for collaborations between data organizations and researchers/users of data to promote not only a culture of data sharing but also the reuse of such data. This work will be based on primary and secondary sources generated within the framework of research projects in collaboration with CESSDA ERIC (European Social Science Data Archives-European Research Infrastructures), as well as literature on the management of data from various sources, with an emphasis on their legal/ethical and technical aspects.
In: Public administration review: PAR, Volume 78, Issue 6, pp. 852-863
ISSN: 1540-6210
Abstract
This article addresses important questions about the complex construct underlying performance information use: public service performance. A between‐subjects experimental vignette methodology was implemented to answer questions about the effects of emphasizing different dimensions of performance and the sources and types of performance information among internal and external stakeholders in two service arenas (secondary education and solid waste management) in Hong Kong. The findings indicate common attitudes and agreement across stakeholder groups and services on the merits of archival and external data types. Other results vary by service and between stakeholder groups. The effects of information about effectiveness can depend on its combination with information about efficiency or equity. This complexity needs to be considered when designing information communication to different stakeholder groups.
Since the 1970s, various aspects of power have been at the focus of theoretical and empirical adult education research. Despite the current interest in political and discursive aspects of power, this article emphasizes the importance of interactional studies when observing and identifying power based on various types of data. As for German interaction studies, three phases can be distinguished, characterized by a) observations of failed participation based on records of classroom behaviour, b) the identification of mutual power negotiation in classroom and counselling situations based on transcriptions, and c) the identification of the power of physical settings in adult education classrooms and in counselling sessions based on visual data. It is presumed that observing/identifying power in adult education classrooms and counselling sessions generally depends not only on the notions of power underlying the studies but also on the data types produced and the methods applied for their interpretation. In addition, the question is raised whether the identification of power can be considered a power practice used by adult education researchers. (DIPF/Orig.)
BASE