
Text as data : a new framework for machine learning and the social sciences / by Justin Grimmer, Margaret E. Roberts, and Brandon M. Stewart.

By: Grimmer, Justin.
Contributor(s): Roberts, Margaret E. | Stewart, Brandon M.
Material type: Book
Publisher: New Jersey, USA ; Oxford, UK : Princeton University Press, 2022
Description: 336 p. : ill.
ISBN: 9780691207551
Subject(s): Text Data Mining | Social Sciences - Data Processing | Machine Learning
Contents:
Preface
PART I - PRELIMINARIES
CHAPTER 1 - Introduction: 1.1 How This Book Informs the Social Sciences; 1.2 How This Book Informs the Digital Humanities; 1.3 How This Book Informs Data Science in Industry and Government; 1.4 A Guide to This Book; 1.5 Conclusion
CHAPTER 2 - Social Science Research and Text Analysis: 2.1 Discovery; 2.2 Measurement; 2.3 Inference; 2.4 Social Science as an Iterative and Cumulative Process; 2.5 An Agnostic Approach to Text Analysis; 2.6 Discovery, Measurement, and Causal Inference: How the Chinese Government Censors Social Media; 2.7 Six Principles of Text Analysis; 2.8 Conclusion: Text Data and Social Science
PART II - SELECTION AND REPRESENTATION
CHAPTER 3 - Principles of Selection and Representation: 3.1 Principle 1: Question-Specific Corpus Construction; 3.2 Principle 2: No Values-Free Corpus Construction; 3.3 Principle 3: No Right Way to Represent Text; 3.4 Principle 4: Validation; 3.5 State of the Union Addresses; 3.6 The Authorship of the Federalist Papers; 3.7 Conclusion
CHAPTER 4 - Selecting Documents: 4.1 Populations and Quantities of Interest; 4.2 Four Types of Bias; 4.3 Considerations of "Found Data"; 4.4 Conclusion
CHAPTER 5 - Bag of Words: 5.1 The Bag of Words Model; 5.2 Choose the Unit of Analysis; 5.3 Tokenize; 5.4 Reduce Complexity; 5.5 Construct Document-Feature Matrix; 5.6 Rethinking the Defaults; 5.7 Conclusion
CHAPTER 6 - The Multinomial Language Model: 6.1 Multinomial Distribution; 6.2 Basic Language Modeling; 6.3 Regularization and Smoothing; 6.4 The Dirichlet Distribution; 6.5 Conclusion
CHAPTER 7 - The Vector Space Model and Similarity Metrics: 7.1 Similarity Metrics; 7.2 Distance Metrics; 7.3 tf-idf Weighting; 7.4 Conclusion
CHAPTER 8 - Distributed Representations of Words: 8.1 Why Word Embeddings; 8.2 Estimating Word Embeddings; 8.3 Aggregating Word Embeddings to the Document Level; 8.4 Validation; 8.5 Contextualized Word Embeddings; 8.6 Conclusion
CHAPTER 9 - Representations from Language Sequences: 9.1 Text Reuse; 9.2 Parts of Speech Tagging; 9.3 Named-Entity Recognition; 9.4 Dependency Parsing; 9.5 Broader Information Extraction Tasks; 9.6 Conclusion
PART III - DISCOVERY
CHAPTER 10 - Principles of Discovery: 10.1 Principle 1: Context Relevance; 10.2 Principle 2: No Ground Truth; 10.3 Principle 3: Judge the Concept, Not the Method; 10.4 Principle 4: Separate Data Is Best; 10.5 Conceptualizing the US Congress; 10.6 Conclusion
CHAPTER 11 - Discriminating Words: 11.1 Mutual Information; 11.2 Fightin' Words; 11.3 Fictitious Prediction Problems; 11.4 Conclusion
CHAPTER 12 - Clustering: 12.1 An Initial Example Using k-Means Clustering; 12.2 Representations to Clustering; 12.3 Approaches to Clustering; 12.4 Making Choices; 12.5 The Human Side of Clustering; 12.6 Conclusion
CHAPTER 13 - Topic Models: 13.1 Latent Dirichlet Allocation; 13.2 Interpreting the Output of Topic Models; 13.3 Incorporating Structure into LDA; 13.4 Structural Topic Models; 13.5 Labeling Topic Models; 13.6 Conclusion
CHAPTER 14 - Low-Dimensional Document Embeddings: 14.1 Principal Component Analysis; 14.2 Classical Multidimensional Scaling; 14.3 Conclusion
PART IV - MEASUREMENT
CHAPTER 15 - Principles of Measurement: 15.1 From Concept to Measurement; 15.2 What Makes a Good Measurement; 15.3 Balancing Discovery and Measurement with Sample Splits
CHAPTER 16 - Word Counting: 16.1 Keyword Counting; 16.2 Dictionary Methods; 16.3 Limitations and Validations of Dictionary Methods; 16.4 Conclusion
CHAPTER 17 - An Overview of Supervised Classification: 17.1 Example: Discursive Governance; 17.2 Create a Training Set; 17.3 Classify Documents with Supervised Learning; 17.4 Check Performance; 17.5 Using the Measure; 17.6 Conclusion
CHAPTER 18 - Coding a Training Set: 18.1 Characteristics of a Good Training Set; 18.2 Hand Coding; 18.3 Crowdsourcing; 18.4 Supervision with Found Data; 18.5 Conclusion
CHAPTER 19 - Classifying Documents with Supervised Learning: 19.1 Naive Bayes; 19.2 Machine Learning; 19.3 Example: Estimating Jihad Scores; 19.4 Conclusion
CHAPTER 20 - Checking Performance: 20.1 Validation with Gold-Standard Data; 20.2 Validation without Gold-Standard Data; 20.3 Example: Validating Jihad Scores; 20.4 Conclusion
CHAPTER 21 - Repurposing Discovery Methods: 21.1 Unsupervised Methods Tend to Measure Subject Matter; 21.2 Example: Scaling via Differential Word Rates; 21.3 A Workflow for Repurposing Unsupervised Methods; 21.4 Concerns in Repurposing Unsupervised Methods for Measurement; 21.5 Conclusion
PART V - INFERENCE
CHAPTER 22 - Principles of Inference: 22.1 Prediction; 22.2 Causal Inference; 22.3 Comparing Prediction and Causal Inference; 22.4 Partial and General Equilibrium in Prediction and Causal Inference; 22.5 Conclusion
CHAPTER 23 - Prediction: 23.1 The Basic Task of Prediction; 23.2 Similarities and Differences between Prediction and Measurement; 23.3 Five Principles of Prediction; 23.4 Using Text as Data for Prediction: Examples; 23.5 Conclusion
CHAPTER 24 - Causal Inference: 24.1 Introduction to Causal Inference; 24.2 Similarities and Differences between Prediction and Measurement, and Causal Inference; 24.3 Key Principles of Causal Inference with Text; 24.4 The Mapping Function; 24.5 Workflows for Making Causal Inferences with Text; 24.6 Conclusion
CHAPTER 25 - Text as Outcome: 25.1 An Experiment on Immigration; 25.2 The Effect of Presidential Public Appeals; 25.3 Conclusion
CHAPTER 26 - Text as Treatment: 26.1 An Experiment Using Trump's Tweets; 26.2 A Candidate Biography Experiment; 26.3 Conclusion
CHAPTER 27 - Text as Confounder: 27.1 Regression Adjustments for Text Confounders; 27.2 Matching Adjustments for Text; 27.3 Conclusion
PART VI - CONCLUSION
28.1 How to Use Text as Data in the Social Sciences; 28.2 Applying Our Principles beyond Text Data; 28.3 Avoiding the Cycle of Creation and Destruction in Social Science Methodology
Acknowledgments
Bibliography
Index
Item type: Livro Geral (General Book)
Current location: Biblioteca Graciliano Ramos
Collection: Livro Geral
Call number: 006.312 G8648t
Copy number: Ex. 1
Status: Available
Barcode: 2023-0282



Escola Nacional de Administração Pública

Address:

  • Biblioteca Graciliano Ramos
  • Hours: Monday to Friday, 9 a.m. to 7 p.m.
  • +55 61 2020-3139 / biblioteca@enap.gov.br
  • SPO Área Especial 2-A
  • CEP 70610-900 - Brasília/DF

Powered by Koha