Azure Batch: How to get small amounts of data (100 bytes) per node in Python jobs


I have many (50,000) quick tasks to perform, each on a file of around 400 MB. Each task produces a single row of CSV text (about 100 bytes).

What is the best way to handle getting this data into a single CSV?

On a local HPC cluster I would use a parallel pool with a queue feeding a single writer process that appends to one CSV file.

The Azure Batch documentation recommends writing a file to blob storage. I could then write a script to combine those files. I imagine writing straight to SQL could also be an option, but I don't have a database set up.
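The combine step is cheap at this scale (50,000 × 100 bytes is about 5 MB). A minimal sketch, using local files as stand-ins for downloaded blobs; the fragment names and contents are hypothetical:

```python
import csv
import tempfile
from pathlib import Path


def combine_fragments(fragment_dir, out_path):
    """Concatenate per-task CSV fragments into one file, sorted by name."""
    count = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for frag in sorted(Path(fragment_dir).glob("*.csv")):
            with open(frag, newline="") as f:
                for row in csv.reader(f):
                    writer.writerow(row)
                    count += 1
    return count


# Demo with local files standing in for blob downloads (hypothetical data).
work = Path(tempfile.mkdtemp())
for i in range(3):
    (work / f"task-{i:05d}.csv").write_text(f"{i},ok\n")
total = combine_fragments(work, work / "combined.csv")
```

Against real blob storage you would first download the fragments, e.g. with the `azure-storage-blob` package (`ContainerClient.list_blobs()` and `download_blob()`), then run the same merge.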

What are people doing in practice?


