I have a function repair_file_in_place(path) that makes HTTP connections to check links for 404 errors.
Here’s the Python code to run repair_file_in_place() concurrently on all the markdown files in a directory:
import asyncio
import os

import aiohttp

async def repair_files_in_directory(path: str):
    """
    Repair all markdown files recursively in a directory.
    """
    # One shared HTTP session for every link check.
    async with aiohttp.ClientSession() as session:
        # The TaskGroup waits for all tasks before the block exits.
        async with asyncio.TaskGroup() as group:
            for root, _, files in os.walk(path):
                for file in files:
                    if file.endswith(".md"):
                        group.create_task(repair_file_in_place(os.path.join(root, file), session))
- How would the code be structured in Elixir / OTP?
- Is there some kind of semaphore-like ability to limit the concurrency? My use case is not blowing the ulimit on open files.
Probably something like:
path
|> Path.join("**/*.md")
|> Path.wildcard()
|> Task.async_stream(&repair_file_in_place/1)
|> Stream.run()
Concurrency can be limited with the :max_concurrency option passed to Task.async_stream (it defaults to the number of online schedulers). Docs: Task, Elixir v1.16.0
(Note: untested and possibly not optimal, but this is the first that came to mind!)
You are almost there, you forgot the if part:
paths
|> Enum.filter(&String.ends_with?(&1, ".md"))
...
This is handled by the wildcard, which expands the glob "your/path/**/*.md" into a list of all the .md files that live anywhere in your/path.
It’s amazing that this actually works, and is so simple. Especially compared with the code I showed.
People dig Elixir for a reason, my dude.
That kind of expressive and readable code hooked me to Elixir almost 8 years ago.