# Summary

An introduction to the MTGJSON card data, including a review of the data files and their columns.

0.1 Introduction

I will use the data from MTGJSON. The AllPrintings card data comes in various formats, such as json, sql, csv, and parquet.

I will use the parquet format, since it is well suited to data analysis: it compresses well, loads quickly, and can be queried directly on disk. This reduces both disk space and memory usage.

I will also use the AllPrices data for economic analysis. This is only available in the json format.

0.2 Downloads

See notebook 10-get-data.ipynb to fetch the data.

1 Review AllPrintings Tables

We have 16 parquet files associated with the card data; let's take a quick tour.

['cardForeignData.parquet',
 'cardIdentifiers.parquet',
 'cardLegalities.parquet',
 'cardPrices.parquet',
 'cardPurchaseUrls.parquet',
 'cardRulings.parquet',
 'cards.parquet',
 'meta.parquet',
 'setBoosterContentWeights.parquet',
 'setBoosterContents.parquet',
 'setBoosterSheetCards.parquet',
 'setBoosterSheets.parquet',
 'setTranslations.parquet',
 'sets.parquet',
 'tokenIdentifiers.parquet',
 'tokens.parquet']
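
A listing like the one above can be produced with the standard library alone. This is a sketch, not necessarily the code the notebook ran; `list_parquet_files` is a hypothetical helper and the directory passed to it is whatever folder holds the downloaded files.

```python
from pathlib import Path


def list_parquet_files(data_dir):
    """Return the sorted file names of all parquet files in data_dir."""
    # glob("*.parquet") only matches files directly inside data_dir;
    # sorted() orders by code point, so uppercase sorts before lowercase.
    return sorted(p.name for p in Path(data_dir).glob("*.parquet"))
```

Calling `len(list_parquet_files(data_dir))` is a quick way to confirm the file count before exploring further.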

1.1 Card Files:

  • cards.parquet: The primary file that contains card data, such as card name, mana cost, type, and text.
  • tokens.parquet: The same data for tokens.
  • cardForeignData.parquet: Foreign language translations of cards.
  • cardLegalities.parquet: Legality of cards for various play formats.
  • cardPrices.parquet: Latest prices for cards on various platforms, including retail and buylist prices.
  • cardPurchaseUrls.parquet: URLs to various retail platforms.
  • cardRulings.parquet: The rulings for cards.

1.2 Set Files:

  • sets.parquet: Data on various released sets, such as set code (10E, OTJ…), set size, and release date.
  • setTranslations.parquet: Translations for set names in various languages.

1.3 Identifier Files:

  • cardIdentifiers.parquet: Identifiers for various MTG data platforms (TCGplayer, Scryfall, Cardmarket…).
  • tokenIdentifiers.parquet: Same for tokens.

1.4 Set Booster Files:

  • setBoosterContents.parquet: For booster packs, the possible mixes of sheets in a pack (for example, 1 theList + 13 others versus 0 theList + 14 others).
  • setBoosterContentWeights.parquet: The weight of each booster mix (for example, 1 in 10 boosters has theList).
  • setBoosterSheets.parquet: Card sheet information.
  • setBoosterSheetCards.parquet: Card composition of each sheet, including counts.

1.5 Meta File:

  • meta.parquet: Version and date for current MTGJSON build.

2 Unique Identifiers

Most of the files have a uuid column. This is the universally unique identifier (UUID v5) for each card printing. It is the primary key of the cards.parquet file and will be used to join data across tables.
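
A minimal sketch of such a join, assuming pandas and two DataFrames that both carry a uuid column. `join_on_uuid` is a hypothetical helper name; the `validate` argument asks pandas to raise an error if uuid is duplicated in the cards table, which checks the primary-key assumption.

```python
import pandas as pd


def join_on_uuid(cards, other):
    """Left-join a per-card table onto the cards table using uuid.

    Hypothetical helper for illustration. validate="one_to_many" makes
    pandas raise a MergeError if uuid is not unique in `cards`.
    """
    return cards.merge(other, on="uuid", how="left", validate="one_to_many")
```

A left join keeps every card, so cards with no matching row in the other table appear with missing values rather than being dropped.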

2.1 MTGJSON

  • uuid:
    • Reprinted card editions: Unique id
    • Double-faced cards (DFC): Each face has a unique uuid.
    • Foreign languages: Same id

2.2 WOTC Gatherer

  • multiverseId: The WOTC card identifier used in their Gatherer card database.
    • Reprinted card editions: Unique id
    • Double-faced cards: Same id
    • Foreign languages: Different id

2.3 Scryfall

  • scryfallId: The Scryfall uuid. It follows different rules than the MTGJSON uuid; for example, the faces of a DFC share one id.
    • Reprinted card editions: Unique id
    • Double-faced cards: Same id. See scryfallCardBackId.
    • Foreign languages: Different id
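
The difference between the two id schemes on double-faced cards can be sketched by grouping MTGJSON uuids by scryfallId. The rows below are hypothetical placeholders, not real ids, and assume each row pairs a uuid with its scryfallId; `faces_by_scryfall_id` is an illustrative helper name.

```python
from collections import defaultdict


def faces_by_scryfall_id(rows):
    """Group MTGJSON uuids by the scryfallId they share."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["scryfallId"]].append(row["uuid"])
    return dict(groups)


# Hypothetical rows: two faces of one DFC plus one single-faced card.
rows = [
    {"uuid": "uuid-front", "scryfallId": "sf-1"},
    {"uuid": "uuid-back", "scryfallId": "sf-1"},
    {"uuid": "uuid-single", "scryfallId": "sf-2"},
]

# A scryfallId that maps to more than one uuid indicates a multi-faced card.
multi_face = {
    sid: uuids
    for sid, uuids in faces_by_scryfall_id(rows).items()
    if len(uuids) > 1
}
```

Here `multi_face` keeps only the shared id, which is why scryfallId alone cannot serve as a key for individual faces.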