Tag | Ind1 | Ind2 | Content
---|---|---|---
000 | | | nam a22 7a 4500
999 | | | _c524415 _d524528
003 | | | BR-BrENAP
005 | | | 20230809180814.0
008 | | | 230314t20222022njua b 001 0 eng d
020 | | | _a9780691207551
040 | | | _aBR-BrENAP _bPt_BR
041 | | | _aeng
090 | | | _a006.312 _bG8648t
100 | 1 | | _aGrimmer, Justin _968335
245 | 1 | 0 | _aText as data : _ba new framework for machine learning and the social sciences / _cby Justin Grimmer, Margaret E. Roberts and Brandon M. Stewart. --
260 | | | _aNew Jersey, USA ; _aOxford, UK : _bPrinceton University Press, _c2022.
300 | | | _a336 p. : _bill.
505 | | | _tPreface _tPART I - PRELIMINARIES _tCHAPTER 1 - Introduction _t1.1 How This Book Informs the Social Sciences _t1.2 How This Book Informs the Digital Humanities _t1.3 How This Book Informs Data Science in Industry and Government _t1.4 A Guide to This Book _t1.5 Conclusion _tCHAPTER 2 - Social Science Research and Text Analysis _t2.1 Discovery _t2.2 Measurement _t2.3 Inference _t2.4 Social Science as an Iterative and Cumulative Process _t2.5 An Agnostic Approach to Text Analysis _t2.6 Discovery, Measurement, and Causal Inference: How the Chinese Government Censors Social Media _t2.7 Six Principles of Text Analysis _t2.8 Conclusion: Text Data and Social Science _tPART II - SELECTION AND REPRESENTATION _tCHAPTER 3 - Principles of Selection and Representation _t3.1 Principle 1: Question-Specific Corpus Construction _t3.2 Principle 2: No Values-Free Corpus Construction _t3.3 Principle 3: No Right Way to Represent Text _t3.4 Principle 4: Validation _t3.5 State of the Union Addresses _t3.6 The Authorship of the Federalist Papers _t3.7 Conclusion _tCHAPTER 4 - Selecting Documents _t4.1 Populations and Quantities of Interest _t4.2 Four Types of Bias _t4.3 Considerations of "Found Data" _t4.4 Conclusion _tCHAPTER 5 - Bag of Words _t5.1 The Bag of Words Model _t5.2 Choose the Unit of Analysis _t5.3 Tokenize _t5.4 Reduce Complexity _t5.5 Construct Document-Feature Matrix _t5.6 Rethinking the Defaults _t5.7 Conclusion _tCHAPTER 6 - The Multinomial Language Model _t6.1 Multinomial Distribution _t6.2 Basic Language Modeling _t6.3 Regularization and Smoothing _t6.4 The Dirichlet Distribution _t6.5 Conclusion _tCHAPTER 7 - The Vector Space Model and Similarity Metrics _t7.1 Similarity Metrics _t7.2 Distance Metrics _t7.3 tf-idf Weighting _t7.4 Conclusion _tCHAPTER 8 - Distributed Representations of Words _t8.1 Why Word Embeddings _t8.2 Estimating Word Embeddings _t8.3 Aggregating Word Embeddings to the Document Level _t8.4 Validation _t8.5 Contextualized Word Embeddings _t8.6 Conclusion _tCHAPTER 9 - Representations from Language Sequences _t9.1 Text Reuse _t9.2 Parts of Speech Tagging _t9.3 Named-Entity Recognition _t9.4 Dependency Parsing _t9.5 Broader Information Extraction Tasks _t9.6 Conclusion _tPART III - DISCOVERY _tCHAPTER 10 - Principles of Discovery _t10.1 Principle 1: Context Relevance _t10.2 Principle 2: No Ground Truth _t10.3 Principle 3: Judge the Concept, Not the Method _t10.4 Principle 4: Separate Data Is Best _t10.5 Conceptualizing the US Congress _t10.6 Conclusion _tCHAPTER 11 - Discriminating Words _t11.1 Mutual Information _t11.2 Fightin' Words _t11.3 Fictitious Prediction Problems _t11.4 Conclusion _tCHAPTER 12 - Clustering _t12.1 An Initial Example Using k-Means Clustering _t12.2 Representations to Clustering _t12.3 Approaches to Clustering _t12.4 Making Choices _t12.5 The Human Side of Clustering _t12.6 Conclusion _tCHAPTER 13 - Topic Models _t13.1 Latent Dirichlet Allocation _t13.2 Interpreting the Output of Topic Models _t13.3 Incorporating Structure into LDA _t13.4 Structural Topic Models _t13.5 Labeling Topic Models _t13.6 Conclusion _tCHAPTER 14 - Low-Dimensional Document Embeddings _t14.1 Principal Component Analysis _t14.2 Classical Multidimensional Scaling _t14.3 Conclusion _tPART IV - MEASUREMENT _tCHAPTER 15 - Principles of Measurement _t15.1 From Concept to Measurement _t15.2 What Makes a Good Measurement _t15.3 Balancing Discovery and Measurement with Sample Splits _tCHAPTER 16 - Word Counting _t16.1 Keyword Counting _t16.2 Dictionary Methods _t16.3 Limitations and Validations of Dictionary Methods _t16.4 Conclusion _tCHAPTER 17 - An Overview of Supervised Classification _t17.1 Example: Discursive Governance _t17.2 Create a Training Set _t17.3 Classify Documents with Supervised Learning _t17.4 Check Performance _t17.5 Using the Measure _t17.6 Conclusion _tCHAPTER 18 - Coding a Training Set _t18.1 Characteristics of a Good Training Set _t18.2 Hand Coding _t18.3 Crowdsourcing _t18.4 Supervision with Found Data _t18.5 Conclusion _tCHAPTER 19 - Classifying Documents with Supervised Learning _t19.1 Naive Bayes _t19.2 Machine Learning _t19.3 Example: Estimating Jihad Scores _t19.4 Conclusion _tCHAPTER 20 - Checking Performance _t20.1 Validation with Gold-Standard Data _t20.2 Validation without Gold-Standard Data _t20.3 Example: Validating Jihad Scores _t20.4 Conclusion _tCHAPTER 21 - Repurposing Discovery Methods _t21.1 Unsupervised Methods Tend to Measure Subject _t21.2 Example: Scaling via Differential Word Rates _t21.3 A Workflow for Repurposing Unsupervised Methods _t21.4 Concerns in Repurposing Unsupervised Methods for Measurement _t21.5 Conclusion _tPART V - INFERENCE _tCHAPTER 22 - Principles of Inference _t22.1 Prediction _t22.2 Causal Inference _t22.3 Comparing Prediction and Causal Inference _t22.4 Partial and General Equilibrium in Prediction and Causal Inference _t22.5 Conclusion _tCHAPTER 23 - Prediction _t23.1 The Basic Task of Prediction _t23.2 Similarities and Differences between Prediction and Measurement _t23.3 Five Principles of Prediction _t23.4 Using Text as Data for Prediction: Examples _t23.5 Conclusion _tCHAPTER 24 - Causal Inference _t24.1 Introduction to Causal Inference _t24.2 Similarities and Differences between Prediction and Measurement, and Causal Inference _t24.3 Key Principles of Causal Inference with Text _t24.4 The Mapping Function _t24.5 Workflows for Making Causal Inferences with Text _t24.6 Conclusion _tCHAPTER 25 - Text as Outcome _t25.1 An Experiment on Immigration _t25.2 The Effect of Presidential Public Appeals _t25.3 Conclusion _tCHAPTER 26 - Text as Treatment _t26.1 An Experiment Using Trump's Tweets _t26.2 A Candidate Biography Experiment _t26.3 Conclusion _tCHAPTER 27 - Text as Confounder _t27.1 Regression Adjustments for Text Confounders _t27.2 Matching Adjustments for Text _t27.3 Conclusion _tPART VI - CONCLUSION _t28.1 How to Use Text as Data in the Social Sciences _t28.2 Applying Our Principles beyond Text Data _t28.3 Avoiding the Cycle of Creation and Destruction in Social Science Methodology _tAcknowledgments _tBibliography _tIndex
650 | 0 | | _aText Data Mining _968336
650 | 0 | | _aSocial Sciences - Data Processing _968337
650 | 0 | | _aMachine Learning _968338
700 | 1 | | _aRoberts, Margaret E. _968339
700 | 1 | | _aStewart, Brandon M. _968340
909 | | | _a202308 _bRaynara
942 | | | _cG