Neural Style Music Visualizer


Controls

The page exposes the following controls:

  • Background: the ambient background image to use.
  • Song: the song to play.
  • Cut frequency between background styles (initially 550 Hz).
  • Blending of different styles (slide right for more blending).
  • Smoothing when transitioning styles (slide right for quicker transitions).
  • Lowpass particle threshold (slide right for fewer blue particles).
  • Highpass particle threshold (slide right for fewer red particles).

Overview

This project uses an implementation of the Neural Style paper, together with Python signal processing libraries and Pixi.js, to create a music visualizer. When a song's bass tones dominate, the background shows one style applied to an ambient background image; when the treble tones dominate, it shows a different style applied to the same image. "In-between" tones present a blend of the two styles. Additionally, particles appear for strong bass (blue) and treble (red) sounds. These particles are initialized with a random speed and direction, but then follow a vector field formed by the gradient of the background image.
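As a concrete illustration of these two update rules, here is a minimal sketch (written in Python for readability; the actual client logic lives in musicvis.js). All names, constants, and the exact blending and steering formulas are assumptions, not the project's real code.

    import math
    import random

    def style_alpha(dominant_hz, cut_hz=550.0, band_hz=400.0):
        """Map a frame's dominant frequency to a style-blend alpha:
        0.0 is the pure bass style, 1.0 the pure treble style, and
        frequencies inside the band around the cutoff blend the two.
        The band width is a made-up stand-in for the blending slider."""
        t = (dominant_hz - (cut_hz - band_hz / 2.0)) / band_hz
        return min(1.0, max(0.0, t))

    def spawn_particle(width, height):
        """Particles start with a random speed and direction."""
        angle = random.uniform(0.0, 2.0 * math.pi)
        speed = random.uniform(0.5, 2.0)
        return {"x": random.uniform(0, width),
                "y": random.uniform(0, height),
                "vx": speed * math.cos(angle),
                "vy": speed * math.sin(angle)}

    def step_particle(p, field, pull=0.15):
        """Steer the particle's velocity toward the gradient vector
        beneath it (field[y][x] = (gx, gy)), then integrate position.
        Boundary handling is omitted for brevity."""
        gx, gy = field[int(p["y"])][int(p["x"])]
        p["vx"] += pull * (gx - p["vx"])
        p["vy"] += pull * (gy - p["vy"])
        p["x"] += p["vx"]
        p["y"] += p["vy"]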

Notes

This is not a real-time system, so if you want to add more backgrounds or music, you will have to generate the images and data files ahead of time using the Python scripts described below.

  • To create a set of 5 background images using 2 different styles, first clone and install the neural-style repo, copy gen_bg.py to that directory, modify the paths in that script to point to the desired content/style images, and then run it.
  • To create a vector field file for the particle systems, run image_gradient.py after modifying that script (bottom) to point to the right image file (the first sketch after this list illustrates the core computation).
  • To create a frequency data file for a song, run filter_music.py after modifying that script (bottom) to point to the right music file; note that the music needs to be in WAV format. The frequency cutoffs that decide when bass or treble particle systems are created are also selected at this stage (by contrast, the background style cutoff is decided by the client). The second sketch after this list illustrates this analysis.
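For reference, the core of the vector field computation could look something like the sketch below. This is not the actual contents of image_gradient.py: the grayscale conversion, the NumPy gradient call, the JSON output format, and the file names are all assumptions.

    import json

    import numpy as np
    from PIL import Image

    def gradient_field(image_path, out_path):
        # Convert to grayscale so the gradient is a single 2-D field.
        gray = np.asarray(Image.open(image_path).convert("L"), dtype=float)
        # np.gradient returns derivatives along axis 0 (y) then axis 1 (x).
        gy, gx = np.gradient(gray)
        # Store one (gx, gy) vector per pixel.
        field = np.stack([gx, gy], axis=-1)
        with open(out_path, "w") as f:
            json.dump(field.tolist(), f)

    gradient_field("background.png", "background_field.json")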
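Similarly, the per-window frequency analysis behind the data files might be sketched as follows; the window size, the example cutoffs, and the returned fields are illustrative assumptions rather than filter_music.py's actual parameters.

    import numpy as np
    from scipy.io import wavfile

    def frequency_data(wav_path, window=2048, bass_hz=250.0, treble_hz=2000.0):
        rate, samples = wavfile.read(wav_path)   # WAV input required
        if samples.ndim > 1:
            samples = samples.mean(axis=1)       # mix stereo down to mono
        frames = []
        for start in range(0, len(samples) - window, window):
            spectrum = np.abs(np.fft.rfft(samples[start:start + window]))
            freqs = np.fft.rfftfreq(window, d=1.0 / rate)
            frames.append({
                # Strongest frequency in this window.
                "dominant_hz": float(freqs[spectrum.argmax()]),
                # Energy below/above the cutoffs that trigger the
                # blue (bass) and red (treble) particle systems.
                "bass": float(spectrum[freqs < bass_hz].sum()),
                "treble": float(spectrum[freqs > treble_hz].sum()),
            })
        return frames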

If you create new content, you will also need to modify the global variables at the top of musicvis.js so that the new content can be selected on this page. To run the system, serve this directory with a PHP-capable web server; XAMPP is a good option. If PHP is not available, a separate JS file (readclient.js) can be used to bypass the need for webservice.php. The sliders work while a song is playing, but to choose a new song or background, wait until the current song is over or refresh the page. If the renderer blacks out after multiple plays, or the audio and video become desynced, refresh the page (a number of small problems keep the system from being fully robust; this prototype is intended to show the general idea of the project).

Future Work

I do not plan to develop this project further, but if I did, I would consider the following:

  • Making the system more "real-time" by allowing the user to upload songs and backgrounds. The computational bottleneck is the neural style transfer, which can take a couple of hours even on a good server; the FFT analysis of a music file, by contrast, is relatively quick.
  • Interpolating frames between the keyframes generated by neural-style, likely with something like optical flow, to replace the current method of blending keyframes by manipulating alpha values.
  • Exploring more sophisticated signal processing techniques so that picking out treble and bass is more reliable.