What to do when FireDucks is slow
Thank you for your interest in FireDucks. This article describes possible causes and remedies for slow programs using FireDucks.
When a pandas program with FireDucks applied is slow, the cause is usually one of the following.
- Using 'apply' or explicit loops.
- Using pandas APIs that are not implemented in FireDucks.
In the first case, rewriting the pandas program may make it faster. For example,
sum_val = 0
for i in range(len(df)):
    if df["A"][i] > 2:
        sum_val += df["B"][i]
A loop-based program like the one above can be made faster by rewriting it as a single expression over whole columns, like the one below.
sum_val = df[df["A"] > 2]["B"].sum()
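The article mentions 'apply' as well as loops, so the same idea can be sketched for a row-wise 'apply'. The example below is a minimal illustration written against plain pandas so that it runs anywhere; the DataFrame contents and the names slow, fast, and sum_val are made up for this example and are not taken from FireDucks.

```python
import pandas as pd

# Small illustrative DataFrame (values are made up for this example).
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [10, 20, 30, 40]})

# Row-wise apply: the lambda is called once per row in Python.
slow = df.apply(lambda row: row["A"] * row["B"], axis=1)

# Column-wise equivalent: one operation over whole columns.
fast = df["A"] * df["B"]

# The filtered sum from the article, written with .loc to select
# the rows and the column in a single step.
sum_val = df.loc[df["A"] > 2, "B"].sum()

print(slow.tolist() == fast.tolist())
print(sum_val)
```

Both forms produce the same values; the column-wise form simply avoids calling Python code for every row, which is the pattern the article recommends.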
If you have difficulty checking this yourself, or if the source code is too complicated to find a suitable rewrite, please feel free to ask the FireDucks community for help via the Slack workspace listed below.
In the second case, we cannot speed up the program immediately. However, if you report the pandas APIs that are not implemented in FireDucks, we will implement them so that the program becomes faster in the future.
To check whether FireDucks implements the functionality your pandas program needs, please set the following environment variable.
FIREDUCKS_FLAGS="-Wfallback"
After setting the environment variable, please run the program. If the output shows the word “Fallback” for a pandas function, it would be helpful if you could report it.
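One way to automate this check is sketched below. It runs a script in a child process with FIREDUCKS_FLAGS set as described above and collects the output lines that mention "Fallback". The helper name find_fallback_lines and the script name my_program.py are hypothetical, and the sketch assumes that the script already uses FireDucks and that the message may appear on either stdout or stderr.

```python
import os
import subprocess
import sys


def find_fallback_lines(script_path):
    """Run a script with FIREDUCKS_FLAGS="-Wfallback" and return the
    output lines that mention "Fallback" (hypothetical helper)."""
    # Copy the current environment and add the flag from the article.
    env = dict(os.environ, FIREDUCKS_FLAGS="-Wfallback")
    result = subprocess.run(
        [sys.executable, script_path],
        env=env,
        capture_output=True,
        text=True,
    )
    # Assumption: the message may go to stdout or stderr, so scan both.
    lines = (result.stdout + "\n" + result.stderr).splitlines()
    return [line for line in lines if "fallback" in line.lower()]


# Example usage with a hypothetical script name:
# for line in find_fallback_lines("my_program.py"):
#     print(line)
```

The lines it returns can be pasted into a report as they are.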
If you would like to report a problem, please contact us by any of the following methods.
- 🦆GitHub : https://github.com/fireducks-dev/fireducks/issues/new
- 📧Mail : contact@fireducks.jp.nec.com
- 🤝Slack : https://join.slack.com/t/fireducks/shared_invite/zt-2j4lucmtj-IGR7AWlXO62Lu605pnBJ2w
Please feel free to contact us even if you have difficulty investigating a fallback yourself.
This concludes this article. Thank you for reading.