buzz is a Python library for parsing and analysing natural language.
It relies heavily on pandas and numpy, and occasionally on NLTK. Dependency parsing is done by spaCy, and dependency searching is handled by a purpose-built library called depgrep. Almost all major data structures are based on pandas DataFrames, so you can fall back on standard pandas functionality for anything that buzz does not already provide.
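As a rough illustration of what falling back on pandas can look like, here is a minimal sketch that uses only plain pandas. The token table and its column names (`word`, `lemma`, `pos`) are hypothetical stand-ins for a parsed dataset, not buzz's documented schema, and no buzz functions are called.

```python
import pandas as pd

# Hypothetical token table: one row per token. The column names are
# illustrative assumptions, not buzz's documented schema.
tokens = pd.DataFrame(
    {
        "word": ["The", "cats", "sat", "on", "the", "mats", "."],
        "lemma": ["the", "cat", "sit", "on", "the", "mat", "."],
        "pos": ["DET", "NOUN", "VERB", "ADP", "DET", "NOUN", "PUNCT"],
    }
)

# Ordinary pandas boolean indexing: keep only the nouns.
nouns = tokens[tokens["pos"] == "NOUN"]

# Ordinary pandas aggregation: frequency of each lemma.
lemma_counts = tokens["lemma"].value_counts()

print(nouns["lemma"].tolist())
print(lemma_counts["the"])
```

Because the operations above are standard DataFrame and Series methods, the same pattern should carry over to any DataFrame-based structure, whatever its actual columns are.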
Note that a shorter, general introduction to buzz is available via GitHub. This site provides more comprehensive documentation.
buzz: table of contents
- Modelling and parsing corpora
- Exploring parsed datasets
- Processing raw strings
- Generating tables
- Measuring prototypicality and similarity
- Working with pandas
- Interactive visualisation in the browser
- Case study: lexical density
A web-app for buzz (buzzword)
For a web-app based on buzz, called buzzword, head here. If you are not a confident programmer but want to use the core features of buzz, this is likely the project for you. The code is open-source, and I can help you get it running on your university server with the datasets you want to explore.
Free and open software
Pull requests are always welcome for both buzz and buzzword. I believe these projects can address many shortcomings in the tools currently available for research into natural language, and I welcome any collaboration you might want to offer.