java – How do I mix multiple live voice audio streams from DatagramPackets?

Posted by: admin, May 14, 2020

Questions:

I am working on a project where I would like to add push-to-talk functionality.
The clients are Android devices and the server is written in Java. Each client sends bytes from AudioRecord to the server, which broadcasts them back to the connected clients.

My problem is mixing data from different clients that was sent at the same time.

This is what I have tried on my server:

static boolean status = true;
static int port = 1938;
static byte[] mixed_audio;
static byte[][] all_bytes;
static int client_count = 0;
static DatagramSocket socket;
static ArrayList<InetAddress> addresses;
public static void main(String args[]) throws Exception {

    DatagramSocket serverSocket = new DatagramSocket(port);    
    System.out.println("Listening. . .");    
    addresses = new ArrayList<>();

    for(int x = 0; x < args.length; x++){
        if(args[x].equals("-p")){
            port = Integer.parseInt(args[x+1]);
        }
    }

    byte[] receiveData = new byte[1400];

    DatagramPacket receivePacket = new DatagramPacket(receiveData,
            receiveData.length);

    socket = new DatagramSocket();

    while (status == true) {
        all_bytes = new byte[1400][1400];
        mixed_audio = new byte[1400];
        serverSocket.receive(receivePacket);        
        int a = addresses.indexOf(receivePacket.getAddress());
        if(a < 0 ){
            addresses.add(receivePacket.getAddress());            
        }
        client_count++;        
        all_bytes[client_count] = receivePacket.getData();
        new Thread(new ReceiveData(receivePacket.getData(), receivePacket.getAddress())).start();

    }
}

public static class ReceiveData implements Runnable{

    byte[] data;
    InetAddress address;

    public ReceiveData(byte[]  b, InetAddress address){
        this.data = b;
        this.address = address;
    }

    @Override
    public void run() {

        try {
            for(int i = 0; i < 1400; i++){
                for(int j = 0; j < 1400; j++){
                    mixed_audio[j] += all_bytes[i][j];
                }
            } 

            if(client_count > 1){
                int c=0;
                for(int x = 0; x < 1400; x++){
                    mixed_audio[x]  = (byte) (mixed_audio[x] / client_count + 1);
                }
            }else{
                mixed_audio = data;
            }
            client_count--;

            for(InetAddress add: addresses){

                if(add != address){
                    DatagramPacket packet;
                    packet = new DatagramPacket(mixed_audio, mixed_audio.length, add, port);
                    socket.send(packet);

                }

            }


        } catch (IOException ex) {
            //Logger.getLogger(TeraMix.class.getName()).log(Level.SEVERE, null, ex);
        }

    }

}

When only one client is talking the audio output is clear, but when multiple clients talk simultaneously it becomes very unclear.

I also tested my mixing algorithm on audio files on my PC and it worked well. What I need is to mix data packets that are sent at the same time by different clients.

Do I need to handle clients on different threads? Am I doing it wrong?
Is there a better way to do this?
Please guide me on this. Thanks!

Answers:

I don’t know if you’ve been able to debug how the packets are merging there, but from a read through I would say the problem is that each received packet spawns its own thread, which then sends out whatever the merged packet happens to be at that moment.

For example, with three clients A, B and C, each sending packets 1 and 2, the merges would be:

  • A1 in -> A1 out
  • B1 in -> B1, or maybe A1+B1 out
  • C1 in -> C1, or B1+C1 or even A1+B1+C1 out
  • A2 in -> A2 or C1+A2 or …
  • B2 in -> B2 or A2+B2 or …
  • C2 in -> C2 or B2+C2 or …

So in this simplified case it looks like the server would send out six packets instead of the ideal two: A1+B1+C1 and A2+B2+C2.
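To make the ideal case concrete, here is a minimal sketch of combining one packet from each client into a single outgoing packet. It is not the code from the question: `PcmMixer` and `mix` are hypothetical names, and it assumes the payloads are 16-bit signed little-endian PCM, which the question does not state. If the stream really is 16-bit, adding the raw bytes one by one (as the question's loop does) would mangle the samples, so the sketch decodes each sample before summing.

```java
import java.util.List;

// Hypothetical helper, not part of the original server code.
class PcmMixer {
    // Assumption: every payload holds 16-bit signed little-endian PCM samples.
    static byte[] mix(List<byte[]> payloads) {
        int length = 0;
        for (byte[] p : payloads) {
            length = Math.max(length, p.length);
        }
        length -= length % 2; // keep whole 2-byte samples only
        byte[] out = new byte[length];
        for (int i = 0; i < length; i += 2) {
            int sum = 0;
            for (byte[] p : payloads) {
                if (i + 1 < p.length) { // shorter payloads contribute silence
                    // Rebuild the signed sample: low byte first, then high byte.
                    sum += (short) ((p[i] & 0xFF) | (p[i + 1] << 8));
                }
            }
            // Clamp to the 16-bit range instead of letting the sum wrap around.
            sum = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
            out[i] = (byte) sum;            // low byte
            out[i + 1] = (byte) (sum >> 8); // high byte
        }
        return out;
    }
}
```

Summing with clamping keeps a lone speaker at full volume. Dividing by the number of contributors, as the question does, is the other common choice and trades clipping for a quieter mix.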

Clearly this will take a bit of care to get the merge smooth, especially as I’m sure the packets won’t be arriving perfectly in sync – this is UDP after all.

Assuming it’s OK to work on a ‘merge what packets you have’ basis, it might work to trigger the send only when you have a packet from every current client, or when a second packet arrives from one of the clients, or maybe after a timeout derived from the sample rate.
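The three triggers above could be sketched roughly like this. `FrameCollector`, `offer` and `flush` are hypothetical names introduced for illustration, the client id is assumed to be something like the sender's address as a string, and the timeout is assumed to be driven by an external timer that calls `flush()`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: gathers at most one packet per client and
// reports when a set of packets is ready to be mixed and sent.
class FrameCollector {
    private final int expectedClients;
    private Map<String, byte[]> pending = new LinkedHashMap<>();

    FrameCollector(int expectedClients) {
        this.expectedClients = expectedClients;
    }

    // Returns the collected packets when a send is due, or null while still collecting.
    synchronized List<byte[]> offer(String clientId, byte[] payload) {
        List<byte[]> ready = null;
        if (pending.containsKey(clientId)) {
            // Trigger 2: a second packet from the same client means the
            // current round is as complete as it is going to get.
            ready = drain();
        }
        // Copy, because the receive buffer is reused for the next packet.
        pending.put(clientId, payload.clone());
        if (ready == null && pending.size() >= expectedClients) {
            // Trigger 1: every expected client has contributed.
            ready = drain();
        }
        return ready;
    }

    // Trigger 3: call from a timer to send whatever has arrived so far.
    synchronized List<byte[]> flush() {
        return pending.isEmpty() ? null : drain();
    }

    private List<byte[]> drain() {
        List<byte[]> frame = new ArrayList<>(pending.values());
        pending = new LinkedHashMap<>();
        return frame;
    }
}
```

With this shape the receive loop calls `offer` for every datagram and only mixes and sends when it gets a non-null list back, so one outgoing packet is produced per round rather than one per incoming packet.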

I guess this would risk contention over all_bytes between the receiving and sending threads though. It might work better to pass the current all_bytes through to the ReceiveData runnable once enough packets are in, but then start a new one to read more packets into. Or at least cycle the arrays if memory/GC overhead could be an issue.
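One way to avoid that contention, sketched under assumptions: the receive thread fills a batch that only it touches, hands the finished batch to a queue, and immediately starts a fresh one, while a single sender thread takes batches off the queue. `BatchHandoff` and its methods are hypothetical names, and this version allocates a new list per batch rather than cycling arrays.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper: passes completed batches from the receive thread
// to the sender thread without the two ever sharing a mutable array.
class BatchHandoff {
    private final BlockingQueue<List<byte[]>> ready = new LinkedBlockingQueue<>();
    private List<byte[]> filling = new ArrayList<>(); // touched by the receive thread only

    // Receive thread: store a copy of the bytes actually received.
    void add(byte[] buffer, int length) {
        filling.add(Arrays.copyOf(buffer, length));
    }

    // Receive thread: publish the current batch and start a new one.
    void handOff() {
        if (filling.isEmpty()) {
            return;
        }
        ready.add(filling);
        filling = new ArrayList<>();
    }

    // Sender thread: block until a batch is available.
    List<byte[]> takeBatch() throws InterruptedException {
        return ready.take();
    }

    // Non-blocking variant: returns null if no batch is waiting.
    List<byte[]> pollBatch() {
        return ready.poll();
    }
}
```

In the receive loop, `add(receivePacket.getData(), receivePacket.getLength())` would copy only the bytes of the current datagram, which also avoids handing the same reused receive buffer to several consumers.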