- AmCAT (Amsterdam Content Analysis Toolkit) is an open-source platform for corpus management and quantitative text analysis, developed and maintained by our members.
- INCA (INfrastructure for Content Analysis) is a Python module for collecting, storing, processing and analyzing media content.
Python for CCS
Python is a programming language that offers many possibilities for the collection, processing and analysis of data. Some resources by our members:
- An online book on Doing Computational Social Science with Python (PDF)
- Jupyter notebooks on various techniques, such as topic modelling and time series analysis (GitHub)
- Automated Visual Content Analysis using Pre-Trained Models: sample code for automated visual content analysis with pre-trained computer vision models (GitHub)
- The Conversational Agent Research Toolkit (CART) enables researchers to create conversational agents for experimental studies using computational methods. CART is a unifying toolkit written in Python that integrates existing services and APIs for dialogue management, natural language understanding and generation, as well as frameworks for publishing the conversational agent either as a web interface or within messaging apps. (GitHub)
R for CCS
R is a flexible programming language and environment for statistical computing. It supports classical statistical modelling as well as more advanced models such as multilevel and time series models. Through additional packages, it can also be used for text analysis, web and API scraping, and network analysis.
We host a repository of R tutorials at ccs-amsterdam/r-course-material. We also warmly recommend the (free) book R for Data Science.
We have developed some packages for text analysis, most notably:
- kasperwelbers/corpustools for token-based (context-sensitive) text analysis
- vanatteveldt/rsyntax for applying syntactic rules, e.g. for clause and source extraction
- amcatr: client bindings for AmCAT