000 nam a22 7a 4500
999 _c524415
_d524528
003 BR-BrENAP
005 20230809180814.0
008 230314t20222022njua b 001 0 eng d
020 _a9780691207551
040 _aBR-BrENAP
_bPt_BR
041 _aeng
090 _a006.312
_bG8648t
100 1 _aGrimmer, Justin
_968335
245 1 0 _aText as data :
_ba new framework for machine learning and the social sciences /
_cby Justin Grimmer, Margaret E. Roberts, and Brandon M. Stewart. --
260 _aNew Jersey, USA ;
_aOxford, UK :
_bPrinceton University Press,
_c2022.
300 _a336 p. :
_bill.
505 _tPreface
_tPART I - PRELIMINARIES
_tCHAPTER 1 - Introduction
_t1.1 How This Book Informs the Social Sciences
_t1.2 How This Book Informs the Digital Humanities
_t1.3 How This Book Informs Data Science in Industry and Government
_t1.4 A Guide to This Book
_t1.5 Conclusion
_tCHAPTER 2 - Social Science Research and Text Analysis
_t2.1 Discovery
_t2.2 Measurement
_t2.3 Inference
_t2.4 Social Science as an Iterative and Cumulative Process
_t2.5 An Agnostic Approach to Text Analysis
_t2.6 Discovery, Measurement, and Causal Inference: How the Chinese Government Censors Social Media
_t2.7 Six Principles of Text Analysis
_t2.8 Conclusion: Text Data and Social Science
_tPART II - SELECTION AND REPRESENTATION
_tCHAPTER 3 - Principles of Selection and Representation
_t3.1 Principle 1: Question-Specific Corpus Construction
_t3.2 Principle 2: No Values-Free Corpus Construction
_t3.3 Principle 3: No Right Way to Represent Text
_t3.4 Principle 4: Validation
_t3.5 State of the Union Addresses
_t3.6 The Authorship of the Federalist Papers
_t3.7 Conclusion
_tCHAPTER 4 - Selecting Documents
_t4.1 Populations and Quantities of Interest
_t4.2 Four Types of Bias
_t4.3 Considerations of "Found Data"
_t4.4 Conclusion
_tCHAPTER 5 - Bag of Words
_t5.1 The Bag of Words Model
_t5.2 Choose the Unit of Analysis
_t5.3 Tokenize
_t5.4 Reduce Complexity
_t5.5 Construct Document-Feature Matrix
_t5.6 Rethinking the Defaults
_t5.7 Conclusion
_tCHAPTER 6 - The Multinomial Language Model
_t6.1 Multinomial Distribution
_t6.2 Basic Language Modeling
_t6.3 Regularization and Smoothing
_t6.4 The Dirichlet Distribution
_t6.5 Conclusion
_tCHAPTER 7 - The Vector Space Model and Similarity Metrics
_t7.1 Similarity Metrics
_t7.2 Distance Metrics
_t7.3 tf-idf Weighting
_t7.4 Conclusion
_tCHAPTER 8 - Distributed Representations of Words
_t8.1 Why Word Embeddings
_t8.2 Estimating Word Embeddings
_t8.3 Aggregating Word Embeddings to the Document Level
_t8.4 Validation
_t8.5 Contextualized Word Embeddings
_t8.6 Conclusion
_tCHAPTER 9 - Representations from Language Sequences
_t9.1 Text Reuse
_t9.2 Parts of Speech Tagging
_t9.3 Named-Entity Recognition
_t9.4 Dependency Parsing
_t9.5 Broader Information Extraction Tasks
_t9.6 Conclusion
_tPART III - DISCOVERY
_tCHAPTER 10 - Principles of Discovery
_t10.1 Principle 1: Context Relevance
_t10.2 Principle 2: No Ground Truth
_t10.3 Principle 3: Judge the Concept, Not the Method
_t10.4 Principle 4: Separate Data Is Best
_t10.5 Conceptualizing the US Congress
_t10.6 Conclusion
_tCHAPTER 11 - Discriminating Words
_t11.1 Mutual Information
_t11.2 Fightin' Words
_t11.3 Fictitious Prediction Problems
_t11.4 Conclusion
_tCHAPTER 12 - Clustering
_t12.1 An Initial Example Using k-Means Clustering
_t12.2 Representations to Clustering
_t12.3 Approaches to Clustering
_t12.4 Making Choices
_t12.5 The Human Side of Clustering
_t12.6 Conclusion
_tCHAPTER 13 - Topic Models
_t13.1 Latent Dirichlet Allocation
_t13.2 Interpreting the Output of Topic Models
_t13.3 Incorporating Structure into LDA
_t13.4 Structural Topic Models
_t13.5 Labeling Topic Models
_t13.6 Conclusion
_tCHAPTER 14 - Low-Dimensional Document Embeddings
_t14.1 Principal Component Analysis
_t14.2 Classical Multidimensional Scaling
_t14.3 Conclusion
_tPART IV - MEASUREMENT
_tCHAPTER 15 - Principles of Measurement
_t15.1 From Concept to Measurement
_t15.2 What Makes a Good Measurement
_t15.3 Balancing Discovery and Measurement with Sample Splits
_tCHAPTER 16 - Word Counting
_t16.1 Keyword Counting
_t16.2 Dictionary Methods
_t16.3 Limitations and Validations of Dictionary Methods
_t16.4 Conclusion
_tCHAPTER 17 - An Overview of Supervised Classification
_t17.1 Example: Discursive Governance
_t17.2 Create a Training Set
_t17.3 Classify Documents with Supervised Learning
_t17.4 Check Performance
_t17.5 Using the Measure
_t17.6 Conclusion
_tCHAPTER 18 - Coding a Training Set
_t18.1 Characteristics of a Good Training Set
_t18.2 Hand Coding
_t18.3 Crowdsourcing
_t18.4 Supervision with Found Data
_t18.5 Conclusion
_tCHAPTER 19 - Classifying Documents with Supervised Learning
_t19.1 Naive Bayes
_t19.2 Machine Learning
_t19.3 Example: Estimating Jihad Scores
_t19.4 Conclusion
_tCHAPTER 20 - Checking Performance
_t20.1 Validation with Gold-Standard Data
_t20.2 Validation without Gold-Standard Data
_t20.3 Example: Validating Jihad Scores
_t20.4 Conclusion
_tCHAPTER 21 - Repurposing Discovery Methods
_t21.1 Unsupervised Methods Tend to Measure Subject Matter
_t21.2 Example: Scaling via Differential Word Rates
_t21.3 A Workflow for Repurposing Unsupervised Methods
_t21.4 Concerns in Repurposing Unsupervised Methods for Measurement
_t21.5 Conclusion
_tPART V - INFERENCE
_tCHAPTER 22 - Principles of Inference
_t22.1 Prediction
_t22.2 Causal Inference
_t22.3 Comparing Prediction and Causal Inference
_t22.4 Partial and General Equilibrium in Prediction and Causal Inference
_t22.5 Conclusion
_tCHAPTER 23 - Prediction
_t23.1 The Basic Task of Prediction
_t23.2 Similarities and Differences between Prediction and Measurement
_t23.3 Five Principles of Prediction
_t23.4 Using Text as Data for Prediction: Examples
_t23.5 Conclusion
_tCHAPTER 24 - Causal Inference
_t24.1 Introduction to Causal Inference
_t24.2 Similarities and Differences between Prediction and Measurement, and Causal Inference
_t24.3 Key Principles of Causal Inference with Text
_t24.4 The Mapping Function
_t24.5 Workflows for Making Causal Inferences with Text
_t24.6 Conclusion
_tCHAPTER 25 - Text as Outcome
_t25.1 An Experiment on Immigration
_t25.2 The Effect of Presidential Public Appeals
_t25.3 Conclusion
_tCHAPTER 26 - Text as Treatment
_t26.1 An Experiment Using Trump's Tweets
_t26.2 A Candidate Biography Experiment
_t26.3 Conclusion
_tCHAPTER 27 - Text as Confounder
_t27.1 Regression Adjustments for Text Confounders
_t27.2 Matching Adjustments for Text
_t27.3 Conclusion
_tPART VI - CONCLUSION
_tCHAPTER 28 - Conclusion
_t28.1 How to Use Text as Data in the Social Sciences
_t28.2 Applying Our Principles beyond Text Data
_t28.3 Avoiding the Cycle of Creation and Destruction in Social Science Methodology
_tAcknowledgments
_tBibliography
_tIndex
650 0 _aText Data Mining
_968336
650 0 _aSocial Sciences - Data Processing
_968337
650 0 _aMachine Learning
_968338
700 1 _aRoberts, Margaret E.
_968339
700 1 _aStewart, Brandon M.
_968340
909 _a202308
_bRaynara
942 _cG