Sounds of PTT

nlp, dsp, data sonification, vue, docker, django, pandas, numpy, python, celery, gcp, nginx, scrapy, postgresql, redis, jieba, bootstrap, web crawling

Sounds of PTT is a web application that can turn an article on PTT into audible frequencies.

Idea

PTT is the most popular BBS in Taiwan, where one can find public opinion on almost any topic. Because that opinion is structured as a series of short comments with “push/boo” tags, one can read through a discussion thread to gauge the public’s sentiment toward a topic.

However, reading is not the only way we sense the universe. Just like this inspiring project by NASA, what if we could actually “hear” the public’s sentiment? Does it feel different from reading? Can we compose meaningful sounds from public opinion? Will a positively rated article sound better than a negatively rated one? If the author and commenters of an article started a band, what would their music sound like? This web application is designed to answer these questions.

How It Works in General

A user first enters a PTT article URL in the Frontend. When the Backend API receives a valid PTT article URL, it dispatches a Crawler, which crawls the article and restructures its contents for the Sound Maker. The Sound Maker then processes the structured data in three steps: tokenization, quantization (based on the sentiment polarity of words), and finally sonification. Once the procedure is done, a media file is generated and can be referenced from the Frontend.
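The first two Sound Maker steps can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the `POLARITY` lexicon, the whitespace tokenizer, and the polarity-to-pitch mapping (`BASE_FREQ_HZ`, `OCTAVE_RANGE`) are all assumptions made for the example. The real project tokenizes Chinese text with jseg/jieba and takes polarity scores from ANTUSD.

```python
# Hypothetical polarity lexicon: word -> score in [-1, 1].
# These entries are made up for illustration; they are not ANTUSD data.
POLARITY = {"good": 0.8, "great": 1.0, "bad": -0.7, "awful": -1.0}

BASE_FREQ_HZ = 440.0  # assumed pitch for a neutral word (A4)
OCTAVE_RANGE = 1.0    # assumed range: a score of +/-1 shifts by one octave


def tokenize(text):
    """Split text into lowercase tokens (whitespace stand-in for a real tokenizer)."""
    stripped = (word.strip(".,!?").lower() for word in text.split())
    return [word for word in stripped if word]


def quantize(tokens):
    """Map each token to a polarity score; unknown words count as neutral."""
    return [POLARITY.get(token, 0.0) for token in tokens]


def to_frequencies(scores):
    """Map each score to a pitch: positive words sound higher, negative lower."""
    return [BASE_FREQ_HZ * 2 ** (score * OCTAVE_RANGE) for score in scores]


comment = "Great article, but the ending was awful!"
tokens = tokenize(comment)
scores = quantize(tokens)
frequencies = to_frequencies(scores)
```

Under these assumptions, a strongly positive word lands an octave above the neutral pitch and a strongly negative one an octave below. The resulting list of frequencies is what a sonification step would then render as audio.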

This project is archived

Technology Used

Frontend

  • Vue.js
  • Vue Router
  • Bootstrap

Backend

Hosting/Environment

  • Google Cloud Compute Engine
  • Docker Compose

Web Server

  • Nginx

API

  • Django
  • Django REST framework
  • Celery

Crawler

  • Scrapy/Scrapinghub

Sound Maker

Tokenization

  • pandas
  • jseg/jieba

Quantization

  • numpy
  • ANTUSD

Sonification

  • thinkdsp (with some customized code)
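To give a feel for what the sonification step produces, here is a rough stand-in that uses only the Python standard library instead of thinkdsp. The helper names (`sine_tone`, `render_wav`), the sample rate, and the tone duration are assumptions for this sketch; the project's customized thinkdsp code may work quite differently.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second (assumed)


def sine_tone(freq_hz, duration_s=0.25, amplitude=0.5):
    """Return 16-bit PCM samples for a sine tone at freq_hz."""
    n_samples = int(SAMPLE_RATE * duration_s)
    return [
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n_samples)
    ]


def render_wav(frequencies, duration_s=0.25):
    """Concatenate one tone per frequency and return the result as WAV bytes."""
    samples = []
    for freq in frequencies:
        samples.extend(sine_tone(freq, duration_s))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono
        wav_file.setsampwidth(2)  # 16-bit samples
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()


# Three example pitches: high, neutral, low.
wav_bytes = render_wav([880.0, 440.0, 220.0])
```

The returned bytes could be written to disk or served as a media file. Rendering each word as a fixed-length tone is the simplest possible choice; a richer mapping might vary duration or timbre as well.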

Database

  • PostgreSQL

Messaging/Cache Backend

  • Redis

Credits

Inspiration and Acknowledgements

Resources

  • Allen B. Downey: The author of Think DSP, an awesome book for learning digital signal processing.
  • NLPSA: Where I acquired ANTUSD for this project.
  • Jseg: A better choice for Chinese tokenization in this project.

References

Fig. 1 - Interface
