Sounds of PTT
Sounds of PTT is a web application that can turn an article on PTT into audible frequencies.
Idea
PTT is the most popular BBS in Taiwan, where one can find public opinion on almost any topic. Because that opinion is structured as a series of short comments with “push/boo” tags, one can read through a discussion thread to gauge the public’s sentiment toward a topic.
However, reading is not the only way we sense the universe. Just like this inspiring project by NASA, what if we could actually “hear” the public’s sentiment? Does it feel different from reading? Can we compose meaningful sounds from public opinion? Will a positively rated article sound better than a negatively rated one? If the author and commenters of an article started a band, what would their music sound like? This web application is designed to answer these questions.
How It Works in General
A User first enters a PTT article URL in the Frontend. When the Backend API receives a valid PTT article URL, it dispatches a Crawler, which crawls the article and restructures its contents for the Sound Maker. The Sound Maker then processes the structured data in three steps: tokenization, quantization (based on the sentiment polarity of words), and finally sonification. When the procedure is done, a media file is generated that can be referenced via the Frontend.
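The three Sound Maker steps can be illustrated with a minimal sketch. It is not the project's actual code: it uses only the Python standard library, a whitespace split stands in for a real Chinese segmenter such as jseg or jieba, `TOY_LEXICON` is a hypothetical stand-in for a sentiment dictionary like ANTUSD, and the polarity-to-pitch mapping (one octave around 440 Hz) is an assumption chosen for illustration. All function names are hypothetical.

```python
import io
import math
import struct
import wave

# Hypothetical toy lexicon: word -> sentiment polarity in [-1, 1].
# A real run would look words up in a dictionary such as ANTUSD.
TOY_LEXICON = {"good": 0.8, "great": 1.0, "bad": -0.7, "awful": -1.0}

SAMPLE_RATE = 8000  # samples per second (assumed value)


def tokenize(text):
    """Split text into lowercase tokens (whitespace stand-in for a segmenter)."""
    return text.lower().split()


def quantize(tokens, lexicon=TOY_LEXICON):
    """Map each token to a polarity score; unknown words count as neutral."""
    return [lexicon.get(tok, 0.0) for tok in tokens]


def polarity_to_frequency(score, base_hz=440.0, span_semitones=12):
    """Map polarity in [-1, 1] to a pitch within one octave of base_hz."""
    return base_hz * 2 ** (score * span_semitones / 12)


def sonify(scores, note_seconds=0.1, amplitude=0.5):
    """Render one sine tone per score and return the audio as WAV bytes."""
    samples_per_note = int(SAMPLE_RATE * note_seconds)
    frames = bytearray()
    for score in scores:
        freq = polarity_to_frequency(score)
        for n in range(samples_per_note):
            value = amplitude * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
            frames += struct.pack("<h", int(value * 32767))  # 16-bit PCM
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))
    return buffer.getvalue()


def make_sound(text):
    """Run tokenization, quantization, and sonification in order."""
    return sonify(quantize(tokenize(text)))
```

Calling `make_sound("good bad meh")` returns the bytes of a short WAV file with three tones: one above the base pitch, one below it, and one at the base pitch for the unknown word. Writing those bytes to disk gives a file that any audio player can open.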
This project is archived
Technology Used
Frontend
- Vue.js
- Vue Router
- Bootstrap
Backend
Hosting/Environment
- Google Cloud Compute Engine
- Docker Compose
Web Server
- Nginx
API
- Django
- Django REST framework
- Celery
Crawler
- Scrapy/Scrapinghub
Sound Maker
Tokenization
- pandas
- jseg/jieba
Quantization
- numpy
- ANTUSD
Sonification
- thinkdsp (with some customized code)
Database
- PostgreSQL
Messaging/Cache Backend
- Redis
Credits
Inspiration and Acknowledgements
Resources
- Allen B. Downey: The author of Think DSP, an awesome book for learning digital signal processing.
- NLPSA: Where I acquired ANTUSD for this project.
- Jseg: A better choice for Chinese tokenization in this project.
References
- Dockerizing a Full-stack Application
- Summary of nginx location configuration and how to write rewrite rules
- Voltage-Controlled Oscillator (VCO)
- Setup caching in Django With Redis
- Wave Generation in Python