Code to acquire and pre-process the card data for use in future analysis.
Grab MTGJSON Card Data
Here I will download and clean the data for MTG cards.
First we will download the data from MTGJSON. The AllPrintings card data comes in various formats, such as JSON, SQL, CSV, and Parquet.
I will use the Parquet format, since of these it is the best suited to data analysis: it compresses well, loads quickly, and can be queried directly on disk. This keeps both disk space and memory usage low.
Data URLs:
- Linux: https://mtgjson.com/api/v5/AllPrintingsParquetFiles.tar.gz
- Windows: https://mtgjson.com/api/v5/AllPrintingsParquetFiles.zip
The following code downloads and decompresses the data.
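A minimal sketch of such a download-and-extract step, using only Python's standard library. The function names (`download`, `extract_tar_gz`, `dir_size`), the constant names, and the chunk size are illustrative assumptions rather than the code that produced the output shown here; only the URL and the target directory come from the text.

```python
import shutil
import tarfile
import urllib.request
from pathlib import Path

# URL and directory as given in the text; the constant names are assumptions.
PARQUET_URL = "https://mtgjson.com/api/v5/AllPrintingsParquetFiles.tar.gz"
RAW_DIR = Path("data/raw/mtgjson")


def download(url: str, dest: Path) -> Path:
    """Stream the resource at `url` to `dest` in chunks, not all at once."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out, length=1024 * 1024)
    return dest


def extract_tar_gz(archive: Path, out_dir: Path) -> Path:
    """Unpack a gzip-compressed tarball into `out_dir`."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(out_dir)
    return out_dir


def dir_size(path: Path) -> int:
    """Total size in bytes of all regular files below `path`."""
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file())
```

Calling `download(PARQUET_URL, RAW_DIR / "AllPrintingsParquetFiles.tar.gz")` and then `extract_tar_gz` on the result would fetch and unpack the archive; `dir_size` could be used to report the extracted size in bytes.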
Downloading AllPrintingsParquetFiles Data
Starting datetime: 2024-08-24 20:29:54.591346
1.1G data/raw/mtgjson
Downloaded AllPrintingsParquetFiles Data
Final size: 0
Final path: data/raw/mtgjson/AllPrintingsParquetFiles
Finished datetime: 2024-08-24 20:30:35.446599
Grab MTGJSON All Price Data
Next we will download the price data from MTGJSON. The AllPrices data only comes in JSON format, so we will have to convert it to Parquet for ease of use in future analysis.
Note that this data only covers the previous 90 days.
Since the file is very large, I will use polars instead of pandas.
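As a rough sketch of the conversion, the nested price JSON can be flattened into one row per price point before it is handed to polars. The nesting assumed here (a top-level `data` key, then card UUID, game format, vendor, transaction type, finish, and finally a date-to-price mapping) is my reading of the AllPrices layout and should be checked against the actual file. The function name `flatten_prices` and the column names are hypothetical.

```python
from typing import Any, Dict, Iterator


def flatten_prices(all_prices: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per (card, format, vendor, type, finish, date)."""
    for uuid, formats in all_prices.get("data", {}).items():
        for game_format, vendors in formats.items():
            for vendor, listing in vendors.items():
                # Assumed: each vendor entry carries a scalar "currency" field.
                currency = listing.get("currency")
                for tx_type, finishes in listing.items():
                    if not isinstance(finishes, dict):
                        continue  # skip scalar fields such as "currency"
                    for finish, by_date in finishes.items():
                        for date, price in by_date.items():
                            yield {
                                "uuid": uuid,
                                "format": game_format,
                                "vendor": vendor,
                                "type": tx_type,
                                "finish": finish,
                                "date": date,
                                "price": price,
                                "currency": currency,
                            }
```

The rows could then be turned into a frame with `pl.DataFrame(list(flatten_prices(raw)))` and saved with `write_parquet`. For a file this large, building the frame in batches would likely keep peak memory lower.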
Data URLs:
- Linux: https://mtgjson.com/api/v5/AllPrices.json.gz
- Windows: https://mtgjson.com/api/v5/AllPrices.json.zip
The following code downloads and decompresses the data.
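Unlike the AllPrintings tarball, `AllPrices.json.gz` is a single gzip-compressed file, so it can be decompressed as a stream. A minimal sketch with the standard library follows; the function name `gunzip` and the chunk size are assumptions, not the code behind the output shown here.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(archive: Path, dest: Path) -> Path:
    """Decompress a .gz file to `dest` in chunks to limit memory use."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(archive, "rb") as src, open(dest, "wb") as out:
        shutil.copyfileobj(src, out, length=1024 * 1024)
    return dest
```

Streaming matters here because the decompressed JSON is much larger than the archive, and reading it whole just to write it back out would waste memory.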
Downloading AllPrices.json Data
Starting datetime: 2024-08-24 20:30:35.455416
1.1G data/raw/mtgjson
Downloaded AllPrices.json Data
Final size: 0
Final path: data/raw/mtgjson/AllPrices
Finished datetime: 2024-08-24 20:31:12.933750