This handbook offers a thorough treatment of the science of linguistic annotation. Leaders in the field guide the reader through the process of modeling, creating an annotation language, building a corpus and evaluating it for correctness. Essential reading for both computer scientists and linguistic researchers. Linguistic annotation is an increasingly important activity in the field of computational linguistics because of its critical role in the development of language models for natural language processing applications. Part one of this book covers all phases of the linguistic annotation process, from annotation scheme design and choice of representation format through both the manual and...
There is a perceived tension between empirical and theoretical approaches to the study of language. Many recent works in the discipline emphasise that linguistics is an 'empirical science'. This volume argues for a nuanced view, highlighting that theory and practice necessarily, and as a matter of fact, complement each other in linguistic research. Its contributions, ranging from experimental studies in psychology via linguistic fieldwork and cross-linguistic comparisons to the application of formal and logical approaches to language, exemplify the mutual relationship between empirical and theoretical work. The volume illustrates how selected topics are addressed by different contributions and methodological stances. Topics include the cognitive grounding of language, social cognition and the construction of meaning in interaction, and, closely related, pragmatics from a typological perspective and beyond. Anyone interested in these topics, and more generally in meta-theoretical considerations, will find great value in this volume.
An introduction to annotation as a genre--a synthesis of reading, thinking, writing, and communication--and its significance in scholarship and everyday life. Annotation--the addition of a note to a text--is an everyday and social activity that provides information, shares commentary, sparks conversation, expresses power, and aids learning. It helps mediate the relationship between reading and writing. This volume in the MIT Press Essential Knowledge series offers an introduction to annotation and its literary, scholarly, civic, and everyday significance across historical and contemporary contexts. It approaches annotation as a genre--a synthesis of reading, thinking, writing, and communication--and offers examples of annotation that range from medieval rubrication and early book culture to data labeling and online reviews.
This fourth edition provides an updated look at information organization, featuring coverage of the Semantic Web, linked data, and EAC-CPF; new metadata models such as IFLA-LRM and RiC; and new perspectives on RDA and its implementation. This latest edition of The Organization of Information is a key resource for anyone in the beginning stages of their LIS career as well as longstanding professionals and paraprofessionals seeking accurate, clear, and up-to-date guidance on information organization activities across the discipline. The book begins with a historical look at information organization methods, covering libraries, archives, museums, and online settings. It then addresses the types...
Understanding the role of humans in environmental change is one of the most pressing challenges of the 21st century. Environmental narratives – written texts with a focus on the environment – offer rich material capturing relationships between people and surroundings. We take advantage of two key opportunities for their computational analysis: massive growth in the availability of digitised contemporary and historical sources, and parallel advances in the computational analysis of natural language. We open by introducing interdisciplinary research questions related to the environment and amenable to analysis through written sources. The reader is then introduced to potential collections ...
If programming is magic then web scraping is surely a form of wizardry. By writing a simple automated program, you can query web servers, request data, and parse it to extract the information you need. The expanded edition of this practical book not only introduces you to web scraping, but also serves as a comprehensive guide to scraping almost every type of data from the modern web. Part I focuses on web scraping mechanics: using Python to request information from a web server, performing basic handling of the server's response, and interacting with sites in an automated fashion. Part II explores a variety of more specific tools and applications to fit any web scraping scenario you're likely to encounter. Topics covered: parse complicated HTML pages; develop crawlers with the Scrapy framework; learn methods to store data you scrape; read and extract data from documents; clean and normalize badly formatted data; read and write natural languages; crawl through forms and logins; scrape JavaScript and crawl through APIs; use and write image-to-text software; avoid scraping traps and bot blockers; use scrapers to test your website.
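The request-and-parse workflow described above can be illustrated with a minimal sketch. It is not taken from the book: it uses only Python's standard library, and the names `LinkCollector`, `extract_links`, `fetch`, the `example-scraper/0.1` user agent, and the sample HTML are all hypothetical, chosen for illustration. The parsing step runs on an inline string so the example works without network access.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Parse an HTML string and return the hrefs of its <a> tags, in order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


def fetch(url, timeout=10):
    """Request a page and return its body as text (hypothetical helper)."""
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Illustrative input; a real run would pass the result of fetch(url) instead.
SAMPLE = (
    '<html><body><a href="/a">A</a><p>text</p>'
    '<a href="https://example.org/b">B</a></body></html>'
)
print(extract_links(SAMPLE))
```

Keeping the fetch step separate from the parse step means the parsing logic can be checked against saved pages before any live request is made.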
Linguistically annotated corpora are becoming a central part of the corpus linguistics field. One of their main strengths is the level of searchability they offer, but with the annotation come problems of the initial complexity of queries and query tools. This book gives a full, pedagogic account of this burgeoning field. Beginning with an overview of corpus linguistics, its prerequisites and goals, the book then introduces linguistically annotated corpora. It explores the different levels of linguistic annotation, including morphological, part-of-speech, syntactic, semantic and discourse-level, as well as advantages and challenges for such annotations. It covers the main annotated corpora for English (the Penn Treebank, the International Corpus of English, and OntoNotes), as well as a wide range of corpora for other languages. In its third part, search strategies required for different types of data are explored. All chapters are accompanied by exercises and by sections on further reading.
Over recent years there has been major investment in research infrastructure to harness the potential of routinely collected health data. In 2013, The Farr Institute for Health Informatics Research was established in the UK, undertaking health informatics research to enhance patient and public health by the analysis of data from multiple sources and unleashing the value of vast sources of clinical, biological, population and environmental data for public benefit. The Medical Informatics Europe (MIE) conference is already established as a key event in the calendar of the European Federation of Medical Informatics (EFMI); The Farr Institute has been establishing a conference series. For 2017, ...
This open access book presents an interdisciplinary approach to reveal biases in English news articles reporting on a given political event. The approach, named person-oriented framing analysis, identifies the coverage’s different perspectives on the event by assessing how articles portray the persons involved in the event. In contrast to prior automated approaches, the identified frames are more meaningful and substantially present in person-oriented news coverage. The book is structured in seven chapters: Chapter 1 presents a few of the severe problems caused by slanted news coverage and identifies the research gap that motivated the research described in this thesis. Chapter 2 discusses m...