I have a function repair_file_in_place(path)
that makes HTTP connections to check links for 404 errors.
Here’s the Python code to run repair_file_in_place()
concurrently on all the markdown files in a directory:
    import asyncio
    import os

    import aiohttp

    async def repair_files_in_directory(path: str):
        """
        Repair all markdown files recursively in a directory.
        """
        # One shared HTTP session for every file; the TaskGroup (Python 3.11+)
        # waits for all tasks to finish before the block exits.
        async with aiohttp.ClientSession() as session:
            async with asyncio.TaskGroup() as group:
                for root, _, files in os.walk(path):
                    for file in files:
                        if file.endswith(".md"):
                            group.create_task(repair_file_in_place(os.path.join(root, file), session))
- How would the code be structured in Elixir / OTP?
- Is there some kind of semaphore-like ability to limit the concurrency? My goal is to avoid exceeding the ulimit on open files.
Probably something like:
    path
    |> Path.join("**/*.md")
    |> Path.wildcard()
    |> Task.async_stream(&repair_file_in_place/1)
    |> Stream.run()
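Concurrency can be limited with the :max_concurrency option passed to Task.async_stream; it defaults to System.schedulers_online(). Each task also has a default :timeout of 5000 ms, which slow HTTP checks may exceed. Docs: Task — Elixir v1.16.0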
(Note: untested and possibly not optimal, but this is the first that came to mind!)
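To make the concurrency limit concrete, here is a minimal sketch of the same pipeline with the options spelled out. The module name LinkRepair, the stub body of repair_file_in_place/1, the default limit of 10, and the 30-second timeout are all assumptions for illustration; the real link checker is not shown in this thread.

```elixir
defmodule LinkRepair do
  # Hypothetical stand-in: the real function would read the file and
  # make HTTP requests to check its links.
  def repair_file_in_place(path) do
    {:checked, path}
  end

  # Runs repair_file_in_place/1 on every .md file under `dir`,
  # with at most `max_concurrency` tasks in flight at once.
  def repair_files_in_directory(dir, max_concurrency \\ 10) do
    dir
    |> Path.join("**/*.md")
    |> Path.wildcard()
    |> Task.async_stream(&repair_file_in_place/1,
      # Upper bound on simultaneously running tasks (assumed value).
      max_concurrency: max_concurrency,
      # Per-task timeout in milliseconds (assumed value; the default is 5000).
      timeout: 30_000,
      # Emit {:exit, :timeout} for a slow task instead of exiting the caller.
      on_timeout: :kill_task
    )
    |> Enum.to_list()
  end
end
```

Each element of the returned list is either {:ok, result} or {:exit, :timeout}, in the same order as the file list. If the results are not needed, Stream.run() can replace Enum.to_list() as in the snippet above. Picking a max_concurrency comfortably below the open-file ulimit should keep the number of files and sockets open at once bounded, assuming each task only holds a handful of descriptors.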