Thursday, April 15, 2021

Multiprocessing Python video server

I am building a screen recording tool. It is a Chrome extension that records a user's screen and sends chunks of video to a remote video processing server every couple of seconds until the user finishes the recording.

The video processing server is written in Python. I have a Flask server process which accepts HTTP requests from the client extension. When a user starts a new recording session, the extension sends a POST /sessions request, the server creates a unique identifier for the session, and the Flask process spawns a child recording process to handle all the processing for chunks from that session. This child recording process sets up some ffmpeg subprocesses for creating a thumbnail, a preview GIF, and DASH manifest/segment files for playback. After this, the Chrome extension sends POST /sessions/:sessionId/chunks/:chunkNum requests with the chunk bytes in the request body.

The Flask server uses multiprocessing.Process to create the recording process and multiprocessing.Pipe to set up a bidirectional communication channel with it. It stores a reference to the parent side of the connection in an in-memory map <sessionId, parent conn>.
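
To make the setup concrete, here is a minimal sketch of the pattern described above. The names sessions, recording_worker, and start_session are placeholders for illustration, not my actual code, and the worker only acknowledges messages instead of doing any video processing.

```python
import multiprocessing as mp
import uuid

# Hypothetical in-memory registry: session id -> (parent connection, process).
# It lives only in the memory of the process that created it.
sessions = {}

def recording_worker(conn):
    """Placeholder child loop: acknowledge each message until told to stop."""
    while True:
        msg = conn.recv()
        if msg == "STOP":
            conn.send("stopped")
            break
        conn.send(("ack", msg))
    conn.close()

def start_session():
    """Spawn one recording process and keep the parent end of its pipe."""
    session_id = uuid.uuid4().hex
    parent_conn, child_conn = mp.Pipe()  # duplex (bidirectional) by default
    proc = mp.Process(target=recording_worker, args=(child_conn,), daemon=True)
    proc.start()
    child_conn.close()  # the parent only needs its own end of the pipe
    sessions[session_id] = (parent_conn, proc)
    return session_id
```

A request handler would call start_session() for POST /sessions and later look up sessions[session_id] to reach the child. Because the map is a plain dict, it is visible only to the one server process that filled it in.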

When POST /sessions/:sessionId/chunks/:chunkNum comes in, it looks up the parent connection in the map and sends the chunk data to the recording process which pipes it into the running ffmpeg subprocesses.

When POST /sessions/:sessionId/complete comes in, it sends a COMPLETION message to the child process, which then closes the input pipes, waits for the ffmpeg subprocesses to complete, and uploads the output files to cloud storage.
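
The chunk and completion handling in the child can be sketched roughly as follows. This is a simplified illustration under a few assumptions: the message format (kind, payload) and the names recording_loop and STAND_IN_CMD are made up for the example, a small Python one-liner stands in for the ffmpeg command line, and the upload step is left out.

```python
import subprocess
import sys
from multiprocessing import Pipe

# Stand-in for an ffmpeg command line (assumption for this sketch):
# it reads all of stdin and prints how many bytes it received.
STAND_IN_CMD = [sys.executable, "-c",
                "import sys; print(len(sys.stdin.buffer.read()))"]

def recording_loop(conn, cmd=STAND_IN_CMD):
    """Feed chunks to one subprocess until a COMPLETION message arrives."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    while True:
        kind, payload = conn.recv()
        if kind == "CHUNK":
            proc.stdin.write(payload)  # pipe the chunk bytes into the subprocess
        elif kind == "COMPLETION":
            break
    proc.stdin.close()           # EOF tells the subprocess no more input is coming
    output = proc.stdout.read()  # drain stdout before waiting, to avoid blocking
    proc.wait()
    # The upload to cloud storage would happen here (not shown).
    conn.send(("DONE", proc.returncode, output))
```

On the server side, the chunk endpoint would send ("CHUNK", data) over the parent connection and the complete endpoint would send ("COMPLETION", None), then read the ("DONE", ...) reply.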

The thing I think I'm doing wrong is storing a global mapping of <session id, parent conn> in the Flask server process. According to this answer, this is probably not a good approach for a production system.

I'd like some suggestions on how to store a mapping of <session id, recording process> in my Flask server. Should the recording processes be spawned as HTTP servers of their own, running on different ports, so that I could store a mapping of <session id, port number of recording server process> in a database? All ideas and general design critiques are welcome. Ultimately, I need to be able to support dozens of concurrent recording sessions.
