Models that can scale in interesting ways
Multi-View models were my introduction to university research. I did most of this work as a sophomore and junior at UIUC, working with Paris Smaragdis and other students in his lab.
Multi-View models were motivated by the difficulties of deploying audio models on devices with different form factors, microphone counts, and compute budgets. We were looking to construct a model that could be trained on data from one kind of device but generalize to a whole host of other devices.
The first problem we tackled was dealing with different numbers of microphones. In our first two papers we constructed Multi-View models that were trained on a fixed number of microphones but could still generalize to different microphone counts at test time.
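One common way to make a model indifferent to how many channels it receives (a minimal sketch of the general idea, not necessarily the exact architecture from the papers; all module names and sizes below are placeholders) is to encode every channel with the same shared network and then aggregate the per-channel embeddings with an operation that works for any channel count, such as a mean:

```python
import torch
import torch.nn as nn


class ChannelAgnosticEncoder(nn.Module):
    """Sketch of a model that accepts any number of microphone channels.

    Each channel is encoded by the same network, then the per-channel
    embeddings are averaged, so the parameter count does not depend on
    how many microphones are present at train or test time.
    """

    def __init__(self, n_features: int = 257, hidden: int = 256, n_outputs: int = 257):
        super().__init__()
        self.per_channel = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.head = nn.Linear(hidden, n_outputs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time, features); channels may vary per batch.
        z = self.per_channel(x)   # (batch, channels, time, hidden)
        z = z.mean(dim=1)         # pool over channels -> (batch, time, hidden)
        return self.head(z)       # e.g. a mask or enhanced spectrogram


# The same weights can be run with 2, 4, or 8 microphones:
model = ChannelAgnosticEncoder()
for n_mics in (2, 4, 8):
    out = model(torch.randn(1, n_mics, 100, 257))
    print(n_mics, out.shape)
```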
Having discovered some initial solutions for microphone scalability, we moved on to compute scalability. The goal here was to make models that could adapt their test-time compute or communication requirements. We did this by training models that modulated the number of inputs consumed before producing an output. We were able to get this working for multi-channel speech enhancement, where our Multi-View model streamed audio from a time-varying number of microphones.
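To give a rough picture of the compute/communication trade-off (this is my own illustrative sketch with hypothetical names, not the published model), imagine a recurrent enhancer that, at each time frame, folds in however many microphone frames happen to be available before emitting an estimate. Frames where fewer microphones report cost less compute and require transmitting less audio:

```python
import torch
import torch.nn as nn


class StreamingVariableChannelEnhancer(nn.Module):
    """Sketch: per frame, consume a variable number of microphone inputs.

    A GRU cell is stepped once per available channel at each time frame,
    so frames with fewer reporting microphones cost less compute and less
    communication; the hidden state after the last channel produces the
    output for that frame.
    """

    def __init__(self, n_features: int = 257, hidden: int = 256):
        super().__init__()
        self.cell = nn.GRUCell(n_features, hidden)
        self.head = nn.Linear(hidden, n_features)

    def forward(self, frames_per_step):
        # frames_per_step: list over time of tensors shaped
        # (n_channels_t, n_features), where n_channels_t can vary per frame.
        h = torch.zeros(1, self.cell.hidden_size)
        outputs = []
        for channel_frames in frames_per_step:
            for frame in channel_frames:       # fold in each available mic
                h = self.cell(frame.unsqueeze(0), h)
            outputs.append(self.head(h))       # emit after the last channel
        return torch.cat(outputs, dim=0)       # (time, n_features)


# Example: the number of microphones reporting varies over time (4, 1, 2, ...).
model = StreamingVariableChannelEnhancer()
stream = [torch.randn(n, 257) for n in (4, 1, 2, 4, 3)]
print(model(stream).shape)  # torch.Size([5, 257])
```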