First I will answer the questions; then, with your permission, I will give my opinion on the idea.
Delays of 10 ms are achievable, for example, over a point-to-point connection or in a simple local area network. But in your case it is almost unrealistic. For example, right now my ping to the gateway is 75-80 ms. Besides the fact that packets will be lost, they will also arrive with a varying delay; that is, there will be jitter.
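To make the jitter point concrete, here is a minimal sketch of the running interarrival-jitter estimate in the style used by RTP (RFC 3550). The function name and the sample timestamps are made up for illustration; in practice the send times would come from timestamps carried in the packets.

```python
def interarrival_jitter(send_times_ms, recv_times_ms):
    """Smoothed jitter estimate: J += (|D| - J) / 16, where D is the change
    in transit time between two consecutive packets."""
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        recv_gap = recv_times_ms[i] - recv_times_ms[i - 1]
        send_gap = send_times_ms[i] - send_times_ms[i - 1]
        d = recv_gap - send_gap  # 0 when the network delay is constant
        jitter += (abs(d) - jitter) / 16.0
    return jitter


# Hypothetical data: packets sent every 10 ms.
sent = [0, 10, 20, 30]
steady = [75, 85, 95, 105]   # constant 75 ms delay
uneven = [75, 90, 95, 110]   # delay varies between 75 and 80 ms

print(interarrival_jitter(sent, steady))
print(interarrival_jitter(sent, uneven))
```

A constant delay, however large, gives zero jitter; only the variation counts. The receiver has to absorb that variation with a buffer, and the buffer adds to the total delay.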
The transport protocol should definitely be UDP. There is also SCTP, but provider support for it is limited, and there will likely be some node along the path that drops everything that is not UDP or TCP. On Windows it is generally a second-class citizen. At the application level, I recommend getting acquainted with RTP.
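As a rough illustration of RTP over UDP, the sketch below builds the 12-byte fixed RTP header and prepends it to an audio frame. The helper names, the address, and the payload type 96 (a value from the dynamic range) are assumptions for the example, not something prescribed here; a real application would more likely use an existing RTP library.

```python
import socket
import struct

RTP_VERSION = 2


def build_rtp_packet(payload, seq, timestamp, ssrc, payload_type=96, marker=0):
    """Prepend the 12-byte fixed RTP header (no CSRC list, no extension)."""
    byte0 = RTP_VERSION << 6                      # V=2, P=0, X=0, CC=0
    byte1 = ((marker & 1) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def parse_rtp_header(packet):
    """Decode the fixed header fields from a received packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {"version": byte0 >> 6, "payload_type": byte1 & 0x7F,
            "seq": seq, "timestamp": timestamp, "ssrc": ssrc}


def send_frame(sock, addr, payload, seq, timestamp, ssrc):
    """Send one audio frame as a single UDP datagram."""
    sock.sendto(build_rtp_packet(payload, seq, timestamp, ssrc), addr)


if __name__ == "__main__":
    # Hypothetical peer address; nothing needs to be listening for UDP to send.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    send_frame(sock, ("127.0.0.1", 5004), b"\x00" * 960,
               seq=1, timestamp=240, ssrc=0x1234)
    sock.close()
```

The sequence number lets the receiver detect loss and reordering, and the timestamp lets it measure jitter and schedule playback, which are exactly the things plain UDP does not provide.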
P2P as in BitTorrent will not help you if the city network is strictly centralized, which is often the case in smaller cities: the provider has a single gateway, and the channels from all ordinary users converge on it. In that case there is only one channel between participants, and BitTorrent will most likely lose out because it has its own overhead too. BitTorrent needs redundant channels; only then is parallel transmission possible, which is its whole point.
You can try to organize P2P the way Skype does. For example, if among three participants it turns out that it is more efficient for 2 to be an intermediary between 1 and 3, then you can designate 2 as a server for 1 and 3. Then 2 will have to send to 1 its own stream and that of 3, and to 3 its own stream and that of 1. With more participants, you will need to select intermediaries for each channel. That is, this is a maximum flow problem.
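The three-participant case can be sketched with a much simpler heuristic than a full maximum-flow formulation: for each pair, compare the direct latency with the latency through every possible single relay and take the smallest. Note that this looks only at latency and ignores each node's uplink capacity, which is what the flow formulation would account for. The function name and the latency figures are invented for illustration.

```python
def best_route(latency_ms, a, b):
    """Return (total_latency, relay) for the pair a-b, where relay is None
    when the direct path is at least as fast as any one-hop relay."""
    best = (latency_ms[a][b], None)
    for k in range(len(latency_ms)):
        if k in (a, b):
            continue
        via = latency_ms[a][k] + latency_ms[k][b]
        if via < best[0]:
            best = (via, k)
    return best


# Hypothetical symmetric latencies in ms; indices 0, 1, 2 stand for
# participants 1, 2, 3 from the text.
latency = [
    [0, 20, 90],
    [20, 0, 25],
    [90, 25, 0],
]

print(best_route(latency, 0, 2))  # relaying through index 1 beats the direct path
print(best_route(latency, 0, 1))  # the direct path wins here
```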
You should definitely try compression, but with an audio codec rather than zip. Compression and decompression can probably be done in hardware.
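To see why compression matters, here is a small arithmetic sketch of what uncompressed PCM audio costs. The parameters (48 kHz, 16-bit, stereo, 5 ms frames) are assumed for illustration only and are not taken from the question.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Raw PCM bitrate in kbit/s, before any codec or packet headers."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def frame_payload_bytes(sample_rate_hz, bits_per_sample, channels, frame_ms):
    """Bytes of raw PCM in one frame of the given duration."""
    samples_per_channel = sample_rate_hz * frame_ms // 1000
    return samples_per_channel * channels * bits_per_sample // 8


print(pcm_bitrate_kbps(48000, 16, 2))        # 1536.0 kbit/s per stream
print(frame_payload_bytes(48000, 16, 2, 5))  # 960 bytes per 5 ms frame
```

The frame duration is itself part of the delay: a sender cannot transmit a 5 ms frame until it has recorded 5 ms of audio, so a codec's frame size eats directly into a 10 ms budget.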
Now my opinion about the venture. Why did you decide that a delay of 10 ms would be good enough? Music is a subtle thing. A large orchestra is usually led by a conductor precisely because the musicians simply cannot hear each other. I recommend you look into radio.