Entity Resolution (ER) lies at the core of data integration and cleaning, and thus much of the research examines ways of improving its effectiveness and time efficiency. The initial ER methods primarily target Veracity in the context of structured (relational) data that are described by a schema of well-known quality and meaning. To achieve high effectiveness, they leverage schema, expert, and/or external knowledge. Some of these methods have been extended to address Volume, processing large datasets through multi-core or massive parallelization approaches, such as the MapReduce paradigm. However, these early schema-based approaches are inapplicable to Web Data, which abound in voluminous, noi...
Resource Description Framework (or RDF, in short) is set to deliver many of the original semi-structured data promises: flexible structure, optional schema, and rich, flexible Uniform Resource Identifiers as a basis for information sharing. Moreover, RDF is uniquely positioned to benefit from the efforts of scientific communities studying databases, knowledge representation, and Web technologies. As a consequence, the RDF data model is used in a variety of applications today for integrating knowledge and information: in open Web or government data via the Linked Open Data initiative, in scientific domains such as bioinformatics, and more recently in search engines and personal assistants o...
This book constitutes the proceedings of the 23rd European Conference on Advances in Databases and Information Systems, ADBIS 2019, held in Bled, Slovenia, in September 2019. The 27 full papers presented were carefully reviewed and selected from 103 submissions. The papers cover a wide range of topics from different areas of research in database and information systems technologies and their advanced applications from theoretical foundations to optimizing index structures. They focus on data mining and machine learning, data warehouses and big data technologies, semantic data processing, and data modeling. They are organized in the following topical sections: data mining; machine learning; document and text databases; big data; novel applications; ontologies and knowledge management; process mining and stream processing; data quality; optimization; theoretical foundation and new requirements; and data warehouses.
Proceedings of the 29th Annual International Conference on Very Large Data Bases held in Berlin, Germany on September 9-12, 2003. Organized by the VLDB Endowment, VLDB is the premier international conference on database technology.
This book constitutes the thoroughly refereed post-proceedings of the Fifth International School and Symposium on Advanced Distributed Systems, ISSADS 2005, held in Guadalajara, Mexico in January 2005. The 50 revised full papers presented were carefully reviewed and selected from over 100 submissions. The papers are organized in topical sections on database systems, distributed and parallel algorithms, real-time distributed systems, cooperative information systems, fault tolerance, information retrieval, modeling and simulation, wireless networks and mobile computing, artificial life, and multi-agent systems.
This book constitutes the thoroughly refereed postproceedings of the Second International Workshop on Semantic Web and Databases, SWDB 2004, held in Toronto, Canada in August 2004 as a satellite workshop of VLDB 2004. The 14 revised full papers presented together with 2 papers by the invited keynote speakers were carefully selected during two rounds of reviewing and improvement from 47 submissions. Among the topics addressed are data semantics, semantic Web services, service-oriented computing, workflow composition, XML semantics, relational tables, ontologies, semantic Web algebra, heterogeneous data sources, context mediation, OWL, ontology engineering, data integration, semantic Web queries, database queries, and peer-to-peer warehouses.
This book constitutes the refereed proceedings of the 12th International Conference on Applications of Natural Language to Information Systems, NLDB 2007, held in Paris, France in June 2007. It covers natural language for database query processing, email management, semantic annotation, text clustering, ontology engineering, natural language for information system design, information retrieval systems, and natural language processing techniques.
This two-volume set LNCS 5870/5871 constitutes the refereed proceedings of the four confederated international conferences on Cooperative Information Systems (CoopIS 2009), Distributed Objects and Applications (DOA 2009), Information Security (IS 2009), and Ontologies, Databases and Applications of Semantics (ODBASE 2009), held as OTM 2009 in Vilamoura, Portugal, in November 2009. The 83 revised full papers presented together with 4 keynote talks were carefully reviewed and selected from a total of 234 submissions. Corresponding to the four OTM 2009 main conferences CoopIS, DOA, IS, and ODBASE the papers are organized in topical sections on workflow; process models; ontology challenges; netw...
This book contains a number of chapters on transactional database concurrency control. The volume's entire sequence of chapters can be summarized as follows: traditional locking techniques can be improved in multiple dimensions, notably in lock scopes (sizes), lock modes (increment, decrement, and more), lock durations (late acquisition, early release), and lock acquisition sequence (to avoid deadlocks). Even if some of these improvements can be transferred to optimistic concurrency control, notably a fine granularity of concurrency control with serializable transaction isolation including phantom protection, pessimistic concurrency control is categorically superior to optimistic concurrency control, i.e., independent of application, workload, deployment, hardware, and software implementation.
How do you approach answering queries when your data is stored in multiple databases that were designed independently by different people? This is the first comprehensive book on data integration, written by three of the most respected experts in the field. It provides an extensive introduction to the theory and concepts underlying today's data integration techniques, with detailed instructions for their application, using concrete examples throughout to explain the concepts. Data integration is the problem of answering queries that span multiple data sources (e.g., databases, web pages). Data integration problems surface in multiple contexts, including enterprise information integration, query processing on the Web, coordination between government agencies, and collaboration between scientists. In some cases, data integration is the key bottleneck to making progress in a field. The authors provide a working knowledge of data integration concepts and techniques, giving you the tools you need to develop a complete and concise package of algorithms and applications.