- Pitfalls of Time Measurement for FireDucks with %%time in Notebooks
Thursday, December 26, 2024 in Posts
This is Osamu Daido from the FireDucks development team. In today's developers' blog, I would like to present a subtle pitfall in time measurement. Quick Overview: When measuring the execution time of FireDucks using the %%time magic command in …
- How to take traces in FireDucks
Friday, December 20, 2024 in Posts
FireDucks has a trace function that records how long each operation, such as read_csv, groupby, or sort, takes. This article introduces how to use the trace function. How to output and display trace files: To use the trace function, you do not need to …
- Ensuring compatibility with pandas in the GPU version of FireDucks
Thursday, December 19, 2024 in Posts
We are currently developing a GPU version of FireDucks. FireDucks is built with an architecture that translates programs into an intermediate representation at runtime, optimizes them in this intermediate representation, and then compiles and …
- Exploring performance benefits of FireDucks over cuDF
Wednesday, December 18, 2024 in Posts
Research suggests that data scientists spend about 45% of their time on data preparation tasks, including loading (19%) and cleaning (26%) the data. Pandas is one of the most popular Python libraries for tabular data processing because of its diverse …
- Cache or Eliminate? How FireDucks increase opportunity of optimization
Tuesday, December 17, 2024 in Posts
As described here, FireDucks uses a lazy execution model with define-by-run IR generation. Since FireDucks uses the MLIR compiler framework to optimize and execute IR, the first step of the execution is creating an MLIR function which holds operations to be …
- How to run polars-tpch benchmark with FireDucks
Friday, December 06, 2024 in Posts
We recently updated the results of the polars-tpch benchmark on a 4th-generation Xeon processor. The latest results can be found here and also below in this article, which explains how to reproduce them. For reproducibility, we have used AWS EC2 for …
- Unveiling the Optimization Benefit of FireDucks Lazy Execution: Part #2
Thursday, December 05, 2024 in Posts
In the previous article, we talked about how FireDucks can take care of pushdown-projection-related optimization for read_parquet(), read_csv(), etc. In today’s article, we will focus on the efficient caching mechanism of its JIT compiler. …
- Unveiling the Optimization Benefit of FireDucks Lazy Execution: Part #1
Thursday, December 05, 2024 in Posts
The availability of runtime memory is often a challenge when processing larger-than-memory datasets with pandas. To solve the problem, one can either shift to a system with larger memory capacity or consider switching to alternative …
- What to do when FireDucks is slow
Monday, November 11, 2024 in Posts
Thank you for your interest in FireDucks. This article describes possible causes and remedies for slow programs using FireDucks. When a pandas program with FireDucks applied is slow, the cause may be one of the following. Using ‘apply’ or …
- Workshop at Bangalore, India
Thursday, September 19, 2024 in Posts
We held a workshop on FireDucks with faculty members from universities around Bangalore. Thank you for joining and for the discussion.