Summary

The code below acquires and pre-processes the card data for use in future analysis.

1 Grab MTGJSON Card Data

Here I will download and clean the data for MTG cards.

First we will download the data from MTGJSON. The AllPrintings card data comes in several formats, including JSON, SQL, CSV, and Parquet.

I will use the Parquet format, since it is well suited to data analysis. It is a compressed columnar format that loads quickly and can be queried directly on disk without reading the whole file into memory. This keeps both disk space and memory usage low.

Data URLs:
- Linux: https://mtgjson.com/api/v5/AllPrintingsParquetFiles.tar.gz
- Windows: https://mtgjson.com/api/v5/AllPrintingsParquetFiles.zip

The following code downloads and decompresses the data.
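A minimal sketch of such a download step, using only the Python standard library and the Linux (`.tar.gz`) URL. The helper name `download_and_extract` is hypothetical, and the way the archive unpacks into the destination directory is an assumption.

```python
import shutil
import tarfile
import urllib.request
from pathlib import Path

CARD_DATA_URL = "https://mtgjson.com/api/v5/AllPrintingsParquetFiles.tar.gz"


def download_and_extract(url, dest_dir):
    """Download a .tar.gz archive and unpack it into dest_dir (hypothetical helper)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Name the temporary archive after the last segment of the URL.
    archive_path = dest / url.rsplit("/", 1)[-1]

    # Stream the response to disk instead of holding it all in memory.
    with urllib.request.urlopen(url) as response, open(archive_path, "wb") as out:
        shutil.copyfileobj(response, out)

    with tarfile.open(archive_path, "r:gz") as archive:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe members such as absolute paths.
            archive.extractall(dest, filter="data")
        else:
            archive.extractall(dest)

    archive_path.unlink()  # remove the archive once it is unpacked
    return dest
```

Calling `download_and_extract(CARD_DATA_URL, "data/raw/mtgjson")` would place the extracted files under the directory shown in the log output.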

Downloading AllPrintingsParquetFiles Data
Starting datetime: 2024-08-24 20:29:54.591346
1.1G    data/raw/mtgjson
Downloaded AllPrintingsParquetFiles Data
Final size: 0
Final path: data/raw/mtgjson/AllPrintingsParquetFiles
Finished datetime: 2024-08-24 20:30:35.446599

2 Grab MTGJSON All Price Data

Next we will download the price data from MTGJSON. The AllPrices data only comes in JSON format, so we will have to convert it to Parquet for ease of use in future analysis.

Note that this data only covers the previous 90 days.

Since the file is very large, I will use polars instead of pandas.
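Before the prices can be stored as a table, the nested JSON has to be flattened into rows. The sketch below shows one way to do that in plain Python. It assumes the `data` object is nested as card UUID, then game format, then provider, then `buylist`/`retail` (plus a `currency` field), then finish, then date. That layout and the helper name `flatten_prices` are assumptions and should be checked against the actual file.

```python
def flatten_prices(data):
    """Yield one flat row per (card, provider, finish, date) price point.

    `data` is the assumed contents of the "data" key in AllPrices.json.
    """
    for uuid, game_formats in data.items():
        for game_format, providers in game_formats.items():
            for provider, listings in providers.items():
                currency = listings.get("currency")
                # Each provider may offer buylist prices, retail prices, or both.
                for list_type in ("buylist", "retail"):
                    for finish, by_date in listings.get(list_type, {}).items():
                        for date, price in by_date.items():
                            yield {
                                "uuid": uuid,
                                "game_format": game_format,
                                "provider": provider,
                                "list_type": list_type,
                                "finish": finish,
                                "date": date,
                                "price": price,
                                "currency": currency,
                            }
```

The resulting rows can be collected in batches, turned into a polars `DataFrame`, and written out with `write_parquet`. Because it is a generator, the rows themselves never need to be held in memory all at once.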

Data URLs:
- Linux: https://mtgjson.com/api/v5/AllPrices.json.gz
- Windows: https://mtgjson.com/api/v5/AllPrices.json.zip

The following code downloads and decompresses the data.
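A minimal sketch of this step for the `.json.gz` URL, which is a single gzip-compressed file rather than an archive. It decompresses the stream while downloading, so the compressed copy never has to be saved. The helper name `download_and_gunzip` and the destination file name are hypothetical.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path

PRICE_DATA_URL = "https://mtgjson.com/api/v5/AllPrices.json.gz"


def download_and_gunzip(url, dest_path):
    """Download a gzip-compressed file and write it decompressed to dest_path."""
    dest = Path(dest_path)
    dest.parent.mkdir(parents=True, exist_ok=True)

    with urllib.request.urlopen(url) as response:
        # Wrap the response so bytes are decompressed as they are read.
        with gzip.GzipFile(fileobj=response) as unzipped, open(dest, "wb") as out:
            shutil.copyfileobj(unzipped, out)
    return dest
```

For example, `download_and_gunzip(PRICE_DATA_URL, "data/raw/mtgjson/AllPrices/AllPrices.json")`, where the final file name is an assumption.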

Downloading AllPrices.json Data
Starting datetime: 2024-08-24 20:30:35.455416
1.1G    data/raw/mtgjson
Downloaded AllPrices.json Data
Final size: 0
Final path: data/raw/mtgjson/AllPrices
Finished datetime: 2024-08-24 20:31:12.933750