Python 平铺数据并映射

  • Post category:Python

Python 中的平铺数据映射功能是指将数据从嵌套格式转换为单层格式,并将其中的每个键映射到原始嵌套层次结构中的路径。这样可以使数据更易于处理和分析。

平铺数据映射的使用方法如下:

  • 导入需要使用的库
import pandas as pd
from pandas.io.json import json_normalize
  • 将数据加载到 Pandas DataFrame 中
data = pd.read_json('data.json')
  • 使用 json_normalize 函数将嵌套数据进行平铺
flatdata = json_normalize(data)
  • 映射数据的键到原始嵌套层次结构中的路径
flatdata.columns = flatdata.columns.map(lambda x: '.'.join(x.split('.')[1:]))
  • 完整示例代码:
import pandas as pd

data = pd.read_json('data.json')
flatdata = pd.json_normalize(data)
flatdata.columns = flatdata.columns.map(lambda x: '.'.join(x.split('.')[1:]))
  • 示例说明1:

假设有以下示例嵌套 JSON 数据:

{
    "name": "John",
    "age": 30,
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "state": "CA",
        "zip": 12345
    },
    "phone": {
        "home": "555-555-1234",
        "work": "555-555-5678"
    }
}

使用 Python 进行平铺映射后,得到的结果如下:

name | age | address.street | address.city | address.state | address.zip | phone.home | phone.work
---- | --- | -------------- | ------------ | ------------- | ----------- | ---------- | -----------
John | 30  | 123 Main St    | Anytown      | CA            | 12345       | 555-555-1234 | 555-555-5678

其中,每个键都映射到原始嵌套层次结构中的路径。

  • 示例说明2:

假设有以下示例嵌套 JSON 数组数据:

[
    {
        "name": "John",
        "age": 30,
        "address": {
            "street": "123 Main St",
            "city": "Anytown",
            "state": "CA",
            "zip": 12345
        },
        "phone": {
            "home": "555-555-1234",
            "work": "555-555-5678"
        }
    },
    {
        "name": "Mary",
        "age": 25,
        "address": {
            "street": "456 Elm St",
            "city": "Anytown",
            "state": "CA",
            "zip": 12345
        },
        "phone": {
            "home": "555-555-9876",
            "work": "555-555-4321"
        }
    }
]

使用 Python 进行平铺映射后,得到的结果如下:

name | age | address.street | address.city | address.state | address.zip | phone.home | phone.work
---- | --- | -------------- | ------------ | ------------- | ----------- | ---------- | -----------
John | 30  | 123 Main St    | Anytown      | CA            | 12345       | 555-555-1234 | 555-555-5678
Mary | 25  | 456 Elm St     | Anytown      | CA            | 12345       | 555-555-9876 | 555-555-4321

其中每个键都映射到原始嵌套层次结构中的路径,并且将每个对象作为单独的行映射到 DataFrame 中。