Section outline

  • Course Overview

    In this course you'll learn how to approach the study of language and social interactions in digital environments so you can analyze textual data sets from social media sites, digital archives, and digital surveys and interviews.

    Learning Outcomes

    This course will help you to:

    1. Define text mining and text analysis  
    2. Select and sample appropriate data  
    3. Identify the various types of lexical resources, their uses and availability   
    4. Differentiate between narrative, thematic, and metaphor analysis approaches
    5. Differentiate between text classification, opinion mining, information extraction, and topic modeling 
    6. Explore available data sets and software tools for key text mining techniques 

    • Course Instructors

      • Gabe Ignatow

        Gabe Ignatow is a Professor in the Department of Sociology at the University of North Texas. His research interests are in sociological theory, digital research methods, cognitive social science and philosophy of social science. He currently serves on the editorial boards of Sociological Forum and the Journal for the Theory of Social Behavior. Along with the two recent books on text mining co-authored with Rada Mihalcea, Gabe has co-authored a forthcoming volume on digital social research methods and co-edited the Oxford Handbook of Cognitive Sociology. He is currently working on a book project on sociological theory in the digital age while serving as his department's graduate program director.

      • Rada Mihalcea

        Rada Mihalcea is a Professor in the Computer Science and Engineering department at the University of Michigan. Her research interests are in computational linguistics, with a focus on lexical semantics, multilingual natural language processing, and computational social sciences. She serves or has served on the editorial boards of the Journals of Computational Linguistics, Language Resources and Evaluations, Natural Language Engineering, Research in Language in Computation, IEEE Transactions on Affective Computing, and Transactions of the Association for Computational Linguistics.

      • Pre-Course Self Assessment

        Before you dive into this course, spend a few moments reflecting on your familiarity with the topic and your current level of skills confidence. 

        You will then revisit the same questions in the Post-Course Self Assessment and reflect on how the course has helped you build your confidence and grow your skills.

        • Module One: Foundations

          This module will help you to:

          1. Define text mining and text analysis  
          2. Compare approaches to text analysis   
          3. Consider ethical, philosophical, and logical issues related to text mining  
        • Module Two: Research Design and Basic Tools

          This module will help you to:

          1. Compare various research design methods     
          2. Select and sample appropriate data   
          3. Scrape (collect) data from the Internet   
          4. Format and code texts   
          5. Compare software packages used by researchers
        • Module Three: Text Mining Fundamentals

          This module will help you to:

          1. Identify the various types of lexical resources, their uses and availability   
          2. Recognize the key text processing steps and the basics of language models   
          3. Explore the task of supervised learning, its application, and available software   
        • Module Four: Methods from the Humanities and Social Sciences

          This module will help you to:

          1. Differentiate between narrative, thematic, and metaphor analysis approaches 
          2. Compare the techniques of each type of analysis  
          3. Explore the different software tools available for each type of analysis 
        • Module Five: Computer Science Methods

          This module will help you to:

          1. Differentiate between text classification, opinion mining, information extraction, and topic modeling 
          2. Identify how social scientists have used these techniques   
          3. Explore available data sets and software tools for these key text mining techniques  
        • Glossary of Key Terms

          In addition to the glossary entries woven throughout the course, you can find the full glossary collated in one place here.

        • Post-Course Self Assessment

          Now that you’ve completed the course, spend a few moments reflecting on your current familiarity with the topics and your level of skills confidence.

          Has the course helped you develop new skills and grow your confidence?

          You'll need to complete the Post-Course Self Assessment in order to download your certificate. If you didn't do the Pre-Course Self Assessment before starting the course, please go to the top of the page and reflect on your familiarity with the topic and your level of skills confidence before you started the course.

          • Completion: Certificate

            Completing the course and Post-Course Self Assessment will unlock a course certificate, which you can download here.

            • Acknowledgements

              With thanks to Kayla Abner for her insightful contributions to the 2025 update.

              • Give Feedback About This Course

                Did you enjoy the course? Please take two minutes to share your feedback. We use learner feedback in future course updates and developments to provide an excellent learning experience.

              • Accessibility, Diversity, Equity and Inclusion