How to convert JSON data into a Python object

The Question :

313 people think this question is useful

I want to use Python to convert JSON data into a Python object.

I receive JSON data objects from the Facebook API, which I want to store in my database.

My current View in Django (Python) (request.POST contains the JSON):

response = request.POST
user = FbApiUser(user_id = response['id'])
user.name = response['name']
user.username = response['username']
user.save()

  • This works fine, but how do I handle complex JSON data objects?

  • Wouldn’t it be much better if I could somehow convert this JSON object into a Python object for easy use?

The Question Comments :
  • Typically JSON gets converted to vanilla lists or dicts. Is that what you want? Or are you hoping to convert JSON straight to a custom type?
  • I want to convert it into an object whose fields I can access with “.”, as in the example above: response.name, response.education.id, etc.
  • Using dicts is a weak-sauce way to do object-oriented programming. Dictionaries are a very poor way to communicate expectations to readers of your code. Using a dictionary, how can you clearly and reusably specify that some dictionary keys-value pairs are required, while others aren’t? What about confirming that a given value is in the acceptable range or set? What about functions that are specific to the type of object you are working with (aka methods)? Dictionaries are handy and versatile, but too many devs act like they forgot Python is an object oriented language for a reason.
  • There is a Python library for this: github.com/jsonpickle/jsonpickle (posting as a comment because the answer that mentions it is far down the thread and easy to miss).

The Answer 1

414 people think this answer is useful

UPDATE

With Python3, you can do it in one line, using SimpleNamespace and object_hook:

import json
from types import SimpleNamespace

data = '{"name": "John Smith", "hometown": {"name": "New York", "id": 123}}'

# Parse JSON into an object with attributes corresponding to dict keys.
x = json.loads(data, object_hook=lambda d: SimpleNamespace(**d))
print(x.name, x.hometown.name, x.hometown.id)
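
Going the other way is also short. The following is a minimal sketch, assuming the object tree contains only SimpleNamespace instances and JSON-native values; it passes vars as the default= callback so that json.dumps can turn each namespace back into a dict:

```python
import json
from types import SimpleNamespace

data = '{"name": "John Smith", "hometown": {"name": "New York", "id": 123}}'
x = json.loads(data, object_hook=lambda d: SimpleNamespace(**d))

# Attributes of a SimpleNamespace can be reassigned.
x.hometown.name = "Boston"

# default= is called for every object json cannot serialize on its own;
# vars() returns the attribute dict of a SimpleNamespace.
round_trip = json.dumps(x, default=vars)
print(round_trip)
```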

OLD ANSWER (Python2)

In Python2, you can do it in one line, using namedtuple and object_hook (but it’s very slow with many nested objects):

import json
from collections import namedtuple

data = '{"name": "John Smith", "hometown": {"name": "New York", "id": 123}}'

# Parse JSON into an object with attributes corresponding to dict keys.
x = json.loads(data, object_hook=lambda d: namedtuple('X', d.keys())(*d.values()))
print x.name, x.hometown.name, x.hometown.id

or, to reuse this easily:

def _json_object_hook(d): return namedtuple('X', d.keys())(*d.values())
def json2obj(data): return json.loads(data, object_hook=_json_object_hook)

x = json2obj(data)

If you want it to handle keys that aren’t good attribute names, check out namedtuple’s rename parameter.
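
As a rough illustration of the rename parameter, the sketch below feeds the hook some keys that are not usable as attribute names (the sample data is made up for this example). With rename=True, namedtuple swaps each invalid name for a positional one such as _0 or _1:

```python
import json
from collections import namedtuple

def _json_object_hook(d):
    # rename=True replaces invalid field names (keywords, names starting
    # with a digit, duplicates) with positional names: _0, _1, ...
    return namedtuple('X', d.keys(), rename=True)(*d.values())

# Hypothetical sample data with two problematic keys.
data = '{"1_first_item": {"A": "1"}, "class": "demo", "name": "ok"}'
x = json.loads(data, object_hook=_json_object_hook)

print(x._fields)
print(x._0.A, x._1, x.name)
```

The original key names are lost after renaming, so this suits data where the position of a field is enough to identify it.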

The Answer 2

137 people think this answer is useful

Check out the section titled Specializing JSON object decoding in the json module documentation. You can use that to decode a JSON object into a specific Python type.

Here’s an example:

class User(object):
    def __init__(self, name, username):
        self.name = name
        self.username = username

import json
def object_decoder(obj):
    if '__type__' in obj and obj['__type__'] == 'User':
        return User(obj['name'], obj['username'])
    return obj

user = json.loads('{"__type__": "User", "name": "John Smith", "username": "jsmith"}',
                  object_hook=object_decoder)

print(type(user))  # -> <class '__main__.User'>

Update

If you want to access data in a dictionary via the json module do this:

user = json.loads('{"__type__": "User", "name": "John Smith", "username": "jsmith"}')
print(user['name'])
print(user['username'])

Just like a regular dictionary.
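
The `__type__` check can be generalized to several classes with a lookup table. This is only a sketch: TYPE_REGISTRY and the Hometown class are hypothetical names introduced for illustration, and it assumes every tagged object's remaining keys match the constructor parameters of its class.

```python
import json

class Hometown(object):
    def __init__(self, name, id):
        self.name = name
        self.id = id

class User(object):
    def __init__(self, name, username, hometown=None):
        self.name = name
        self.username = username
        self.hometown = hometown

# Hypothetical registry: maps the "__type__" tag to the class to build.
TYPE_REGISTRY = {'User': User, 'Hometown': Hometown}

def object_decoder(obj):
    cls = TYPE_REGISTRY.get(obj.get('__type__'))
    if cls is None:
        return obj  # untagged or unknown objects stay plain dicts
    kwargs = {k: v for k, v in obj.items() if k != '__type__'}
    return cls(**kwargs)

text = ('{"__type__": "User", "name": "John Smith", "username": "jsmith", '
        '"hometown": {"__type__": "Hometown", "name": "New York", "id": 123}}')

# object_hook runs on the innermost objects first, so the Hometown
# instance already exists when the User is constructed.
user = json.loads(text, object_hook=object_decoder)
print(user.hometown.name)
```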

The Answer 3

104 people think this answer is useful

This is not code golf, but here is my shortest trick, using types.SimpleNamespace as the container for JSON objects.

Compared to the leading namedtuple solution, it is:

  • probably faster/smaller as it does not create a class for each object
  • shorter
  • no rename option, and probably the same limitation on keys that are not valid identifiers (uses setattr under the covers)

Example:

from __future__ import print_function
import json

try:
    from types import SimpleNamespace as Namespace
except ImportError:
    # Python 2.x fallback
    from argparse import Namespace

data = '{"name": "John Smith", "hometown": {"name": "New York", "id": 123}}'

x = json.loads(data, object_hook=lambda d: Namespace(**d))

print (x.name, x.hometown.name, x.hometown.id)
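
On the point about keys that are not valid identifiers: a short Python 3 sketch (the sample key is made up) suggests that SimpleNamespace stores such keys without complaint, but they are only reachable through getattr or vars, not through dot syntax:

```python
import json
from types import SimpleNamespace

# Hypothetical data: the first key starts with a digit.
data = '{"1_first_item": {"A": "1"}, "name": "ok"}'
x = json.loads(data, object_hook=lambda d: SimpleNamespace(**d))

# Writing x.1_first_item would be a SyntaxError, but the attribute exists.
first = getattr(x, "1_first_item")
print(first.A, x.name)
print(sorted(vars(x)))
```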

The Answer 4

99 people think this answer is useful

You could try this:

class User(object):
    def __init__(self, name, username, *args, **kwargs):
        self.name = name
        self.username = username

import json
j = json.loads(your_json)
u = User(**j)

Just create a new Object, and pass the parameters as a map.
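
A small sketch of how this behaves when the JSON does not line up exactly with the constructor (the JSON strings here are invented for the example): extra keys are absorbed by **kwargs and dropped, while a missing required key raises TypeError.

```python
import json

class User(object):
    def __init__(self, name, username, *args, **kwargs):
        self.name = name
        self.username = username

# The extra "id" key is swallowed by **kwargs and silently dropped.
u = User(**json.loads('{"name": "John Smith", "username": "jsmith", "id": 42}'))

# A missing required key fails at construction time.
try:
    User(**json.loads('{"name": "John Smith"}'))
    missing_key_error = None
except TypeError as exc:
    missing_key_error = exc

print(u.name, u.username, hasattr(u, "id"))
```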

The Answer 5

41 people think this answer is useful

Here’s a quick and dirty json pickle alternative

import json

class User:
    def __init__(self, name, username):
        self.name = name
        self.username = username

    def to_json(self):
        return json.dumps(self.__dict__)

    @classmethod
    def from_json(cls, json_str):
        json_dict = json.loads(json_str)
        return cls(**json_dict)

# example usage
User("Tom Brown", "tbrown").to_json()
User.from_json(User("Tom Brown", "tbrown").to_json()).to_json()

The Answer 6

17 people think this answer is useful

For complex objects, you can use JSON Pickle

Python library for serializing any arbitrary object graph into JSON. It can take almost any Python object and turn the object into JSON. Additionally, it can reconstitute the object back into Python.

The Answer 7

13 people think this answer is useful

If you’re using Python 3.5+, you can use jsons to serialize and deserialize to plain old Python objects:

import jsons

# request.POST is an immutable QueryDict, so work on a plain dict copy
response = request.POST.dict()

# Your class attributes need to match your dict keys, so in your case rename 'id' to 'user_id':
response['user_id'] = response.pop('id')

# Then you can load that dict into your class:
user = jsons.load(response, FbApiUser)

user.save()

You could also make FbApiUser inherit from jsons.JsonSerializable for more elegance:

user = FbApiUser.from_json(response)

These examples will work if your class consists of Python default types, like strings, integers, lists, datetimes, etc. The jsons lib will require type hints for custom types though.

The Answer 8

9 people think this answer is useful

If you are using Python 3.6+, you can use marshmallow-dataclass. Unlike the solutions listed above, it is both simple and type safe:

from marshmallow_dataclass import dataclass

@dataclass
class User:
    name: str

user = User.Schema().load({"name": "Ramirez"})

The Answer 9

6 people think this answer is useful

Improving on lovasoa’s very good answer.

If you are using Python 3.6+, you can use:
pip install marshmallow-enum and
pip install marshmallow-dataclass

It’s simple and type safe.

You can convert your class to a JSON string and back:

From object to JSON string:

    from marshmallow_dataclass import dataclass
    user = User("Danilo","50","RedBull",15,OrderStatus.CREATED)
    user_json = User.Schema().dumps(user)
    user_json_str = user_json.data

From JSON string to object:

    json_str = '{"name":"Danilo", "orderId":"50", "productName":"RedBull", "quantity":15, "status":"Created"}'
    user, err = User.Schema().loads(json_str)
    print(user,flush=True)

Class definitions:

from enum import Enum

class OrderStatus(Enum):
    CREATED = 'Created'
    PENDING = 'Pending'
    CONFIRMED = 'Confirmed'
    FAILED = 'Failed'

@dataclass
class User:
    def __init__(self, name, orderId, productName, quantity, status):
        self.name = name
        self.orderId = orderId
        self.productName = productName
        self.quantity = quantity
        self.status = status

    name: str
    orderId: str
    productName: str
    quantity: int
    status: OrderStatus

The Answer 10

5 people think this answer is useful

I have written a small (de)serialization framework called any2any that helps doing complex transformations between two Python types.

In your case, I guess you want to transform a dictionary (obtained with json.loads) into a complex object with attributes such as response.name and response.education, and with a nested structure such as response.education.id. That’s exactly what this framework is made for. The documentation is not great yet, but by using any2any.simple.MappingToObject, you should be able to do that very easily. Please ask if you need help.

The Answer 11

4 people think this answer is useful

Since no one provided an answer quite like mine, I am going to post it here.

It is a robust class that can easily convert back and forth between a JSON str and a dict. I have copied it from my answer to another question:

import json

class PyJSON(object):
    def __init__(self, d):
        if type(d) is str:
            d = json.loads(d)

        self.from_dict(d)

    def from_dict(self, d):
        self.__dict__ = {}
        for key, value in d.items():
            if type(value) is dict:
                value = PyJSON(value)
            self.__dict__[key] = value

    def to_dict(self):
        d = {}
        for key, value in self.__dict__.items():
            if type(value) is PyJSON:
                value = value.to_dict()
            d[key] = value
        return d

    def __repr__(self):
        return str(self.to_dict())

    def __setitem__(self, key, value):
        self.__dict__[key] = value

    def __getitem__(self, key):
        return self.__dict__[key]

json_str = """... json string ..."""

py_json = PyJSON(json_str)
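
The class above converts nested dicts but leaves dicts inside lists untouched. The following sketch shows one way that case might be covered; AttrDict, _wrap and _unwrap are hypothetical names for this example, and the sample JSON is invented:

```python
import json

class AttrDict(object):
    """Hypothetical variant of the idea above that also walks lists."""

    def __init__(self, d):
        if isinstance(d, str):
            d = json.loads(d)
        for key, value in d.items():
            self.__dict__[key] = self._wrap(value)

    @classmethod
    def _wrap(cls, value):
        # Recurse into dicts and lists; leave scalars as they are.
        if isinstance(value, dict):
            return cls(value)
        if isinstance(value, list):
            return [cls._wrap(item) for item in value]
        return value

    @classmethod
    def _unwrap(cls, value):
        if isinstance(value, cls):
            return value.to_dict()
        if isinstance(value, list):
            return [cls._unwrap(item) for item in value]
        return value

    def to_dict(self):
        return {key: self._unwrap(value) for key, value in self.__dict__.items()}

json_str = '{"name": "John", "education": [{"id": 1, "school": "MIT"}, {"id": 2, "school": "CMU"}]}'
obj = AttrDict(json_str)
print(obj.education[1].school)
```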

The Answer 12

4 people think this answer is useful

dacite may also be a solution for you. It supports the following features:

  • nested structures
  • (basic) types checking
  • optional fields (i.e. typing.Optional)
  • unions
  • forward references
  • collections
  • custom type hooks

https://pypi.org/project/dacite/

from dataclasses import dataclass
from dacite import from_dict


@dataclass
class User:
    name: str
    age: int
    is_active: bool


data = {
    'name': 'John',
    'age': 30,
    'is_active': True,
}

user = from_dict(data_class=User, data=data)

assert user == User(name='John', age=30, is_active=True)

The Answer 13

3 people think this answer is useful

While searching for a solution, I’ve stumbled upon this blog post: https://blog.mosthege.net/2016/11/12/json-deserialization-of-nested-objects/

It uses the same technique as the previous answers, but with decorators. Another thing I found useful is that it returns a typed object at the end of deserialisation.

import json

class JsonConvert(object):
    class_mappings = {}

    @classmethod
    def class_mapper(cls, d):
        # Pick the first registered class whose attribute names cover the dict's keys
        for keys, klass in cls.class_mappings.items():
            if keys.issuperset(d.keys()):   # are all required arguments present?
                return klass(**d)
        else:
            # Raise exception instead of silently returning None
            raise ValueError('Unable to find a matching class for object: {!s}'.format(d))

    @classmethod
    def complex_handler(cls, Obj):
        if hasattr(Obj, '__dict__'):
            return Obj.__dict__
        else:
            raise TypeError('Object of type %s with value of %s is not JSON serializable' % (type(Obj), repr(Obj)))

    @classmethod
    def register(cls, claz):
        # Key each class by the attribute names of a default-constructed instance
        cls.class_mappings[frozenset(tuple([attr for attr,val in claz().__dict__.items()]))] = claz
        return claz

    @classmethod
    def to_json(cls, obj):
        return json.dumps(obj.__dict__, default=cls.complex_handler, indent=4)

    @classmethod
    def from_json(cls, json_str):
        return json.loads(json_str, object_hook=cls.class_mapper)

Usage:

@JsonConvert.register
class Employee(object):
    def __init__(self, Name:str=None, Age:int=None):
        self.Name = Name
        self.Age = Age
        return

@JsonConvert.register
class Company(object):
    def __init__(self, Name:str="", Employees:[Employee]=None):
        self.Name = Name
        self.Employees = [] if Employees is None else Employees
        return

company = Company("Contonso")
company.Employees.append(Employee("Werner", 38))
company.Employees.append(Employee("Mary"))

as_json = JsonConvert.to_json(company)
from_json = JsonConvert.from_json(as_json)
as_json_from_json = JsonConvert.to_json(from_json)

assert(as_json_from_json == as_json)

print(as_json_from_json)

The Answer 14

2 people think this answer is useful

Modifying @DS response a bit, to load from a file:

import json
from collections import namedtuple

def _json_object_hook(d): return namedtuple('X', d.keys())(*d.values())
def load_data(file_name):
  with open(file_name, 'r') as file_data:
    return file_data.read().replace('\n', '')
def json2obj(file_name): return json.loads(load_data(file_name), object_hook=_json_object_hook)

One caveat: this cannot load keys that begin with a digit, like this:

{
  "1_first_item": {
    "A": "1",
    "B": "2"
  }
}

That is because “1_first_item” is not a valid Python field name.

The Answer 15

2 people think this answer is useful

Expanding on DS’s answer a bit, if you need the object to be mutable (which namedtuple is not), you can use the recordclass library instead of namedtuple:

import json
from recordclass import recordclass

data = '{"name": "John Smith", "hometown": {"name": "New York", "id": 123}}'

# Parse into a mutable object
x = json.loads(data, object_hook=lambda d: recordclass('X', d.keys())(*d.values()))

The modified object can then be converted back to json very easily using simplejson:

import simplejson

x.name = "John Doe"
new_json = simplejson.dumps(x)

The Answer 16

1 people think this answer is useful

If you’re using Python 3.6 or newer, you could have a look at squema – a lightweight module for statically typed data structures. It makes your code easy to read while at the same time providing simple data validation, conversion and serialization without extra work. You can think of it as a more sophisticated and opinionated alternative to namedtuples and dataclasses. Here’s how you could use it:

import json
from uuid import UUID
from squema import Squema


class FbApiUser(Squema):
    id: UUID
    age: int
    name: str

    def save(self):
        pass


user = FbApiUser(**json.loads(response))
user.save()

The Answer 17

1 people think this answer is useful

You can use

x = Map(json.loads(response))
x.__class__ = MyClass

where

class Map(dict):
    def __init__(self, *args, **kwargs):
        super(Map, self).__init__(*args, **kwargs)
        for arg in args:
            if isinstance(arg, dict):
                # .items() works on both Python 2 and Python 3
                for k, v in arg.items():
                    self[k] = v
                    if isinstance(v, dict):
                        self[k] = Map(v)

        if kwargs:
            for k, v in kwargs.items():
                self[k] = v
                if isinstance(v, dict):
                    self[k] = Map(v)

    def __getattr__(self, attr):
        return self.get(attr)

    def __setattr__(self, key, value):
        self.__setitem__(key, value)

    def __setitem__(self, key, value):
        super(Map, self).__setitem__(key, value)
        self.__dict__.update({key: value})

    def __delattr__(self, item):
        self.__delitem__(item)

    def __delitem__(self, key):
        super(Map, self).__delitem__(key)
        del self.__dict__[key]

This gives a generic, future-proof solution.

The Answer 18

1 people think this answer is useful

I was searching for a solution that worked with recordclass.RecordClass, supports nested objects and works for both json serialization and json deserialization.

Expanding on DS’s answer and on the solution from BeneStr, I came up with the following, which seems to work:

Code:

import json
import recordclass

class NestedRec(recordclass.RecordClass):
    a : int = 0
    b : int = 0

class ExampleRec(recordclass.RecordClass):
    x : int       = None
    y : int       = None
    nested : NestedRec = NestedRec()

class JsonSerializer:
    @staticmethod
    def dumps(obj, ensure_ascii=True, indent=None, sort_keys=False):
        return json.dumps(obj, default=JsonSerializer.__obj_to_dict, ensure_ascii=ensure_ascii, indent=indent, sort_keys=sort_keys)

    @staticmethod
    def loads(s, klass):
        return JsonSerializer.__dict_to_obj(klass, json.loads(s))

    @staticmethod
    def __obj_to_dict(obj):
        if hasattr(obj, "_asdict"):
            return obj._asdict()
        else:
            return json.JSONEncoder().default(obj)

    @staticmethod
    def __dict_to_obj(klass, s_dict):
        kwargs = {
            key : JsonSerializer.__dict_to_obj(cls, s_dict[key]) if hasattr(cls,'_asdict') else s_dict[key] \
                for key,cls in klass.__annotations__.items() \
                    if s_dict is not None and key in s_dict
        }
        return klass(**kwargs)

Usage:

example_0 = ExampleRec(x = 10, y = 20, nested = NestedRec( a = 30, b = 40 ) )

#Serialize to JSON

json_str = JsonSerializer.dumps(example_0)
print(json_str)
#{
#  "x": 10,
#  "y": 20,
#  "nested": {
#    "a": 30,
#    "b": 40
#  }
#}

# Deserialize from JSON
example_1 = JsonSerializer.loads(json_str, ExampleRec)
example_1.x += 1
example_1.y += 1
example_1.nested.a += 1
example_1.nested.b += 1

json_str = JsonSerializer.dumps(example_1)
print(json_str)
#{
#  "x": 11,
#  "y": 21,
#  "nested": {
#    "a": 31,
#    "b": 41
#  }
#}

The Answer 19

1 people think this answer is useful

The answers given here do not return the correct object type, so I created the methods below. Those answers also fail if the class has more fields than the given JSON provides:

import json
from typing import Any

def dict_to_class(class_name: Any, dictionary: dict) -> Any:
    instance = class_name()
    for key in dictionary.keys():
        setattr(instance, key, dictionary[key])
    return instance


def json_to_class(class_name: Any, json_string: str) -> Any:
    dict_object = json.loads(json_string)
    return dict_to_class(class_name, dict_object)
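
A short usage sketch of these two helpers follows. The Profile class and the JSON string are hypothetical, and the class is assumed to have a no-argument constructor. Fields missing from the JSON keep their defaults, and nested objects remain plain dicts:

```python
import json
from typing import Any

def dict_to_class(class_name: Any, dictionary: dict) -> Any:
    instance = class_name()
    for key in dictionary.keys():
        setattr(instance, key, dictionary[key])
    return instance

def json_to_class(class_name: Any, json_string: str) -> Any:
    dict_object = json.loads(json_string)
    return dict_to_class(class_name, dict_object)

class Profile:
    """Hypothetical target class with a no-argument constructor."""
    def __init__(self):
        self.name = ""
        self.nickname = "n/a"  # not present in the JSON below

profile = json_to_class(Profile, '{"name": "John Smith", "hometown": {"id": 123}}')
print(type(profile).__name__, profile.name, profile.nickname, profile.hometown)
```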

The Answer 20

1 people think this answer is useful

There are multiple viable answers already, but there are some minor libraries made by individuals that can do the trick for most users.

An example would be json2object. Given a defined class, it deserialises json data to your custom model, including custom attributes and child objects.

Its use is very simple. An example from the library wiki:

from json2object import jsontoobject as jo

class Student:
    def __init__(self):
        self.firstName = None
        self.lastName = None
        self.courses = [Course('')]

class Course:
    def __init__(self, name):
        self.name = name

data = '''{
"firstName": "James",
"lastName": "Bond",
"courses": [{
    "name": "Fighting"},
    {
    "name": "Shooting"}
    ]
}
'''

model = Student()
result = jo.deserialize(data, model)
print(result.courses[0].name)

The Answer 21

0 people think this answer is useful

Python3.x

This is the best approach I could come up with.
Note that this code handles set() too.
The approach is generic: a class only needs to extend the base class (as in the second example).
Note that I only read from and write to files here, but it’s easy to modify the behavior to your taste.

However this is a CoDec.

With a little more work you can construct your class in other ways. I assume a default constructor to create the instance, then I update the instance dict.

import json
import collections.abc


class JsonClassSerializable(json.JSONEncoder):

    REGISTERED_CLASS = {}

    def register(ctype):
        JsonClassSerializable.REGISTERED_CLASS[ctype.__name__] = ctype

    def default(self, obj):
        # collections.Set was removed in Python 3.10; the ABC lives in collections.abc
        if isinstance(obj, collections.abc.Set):
            return dict(_set_object=list(obj))
        if isinstance(obj, JsonClassSerializable):
            jclass = {}
            jclass["name"] = type(obj).__name__
            jclass["dict"] = obj.__dict__
            return dict(_class_object=jclass)
        else:
            return json.JSONEncoder.default(self, obj)

    def json_to_class(self, dct):
        if '_set_object' in dct:
            return set(dct['_set_object'])
        elif '_class_object' in dct:
            cclass = dct['_class_object']
            cclass_name = cclass["name"]
            if cclass_name not in self.REGISTERED_CLASS:
                raise RuntimeError(
                    "Class {} not registered in JSON Parser"
                    .format(cclass["name"])
                )
            instance = self.REGISTERED_CLASS[cclass_name]()
            instance.__dict__ = cclass["dict"]
            return instance
        return dct

    def encode_(self, file):
        with open(file, 'w') as outfile:
            json.dump(
                self.__dict__, outfile,
                cls=JsonClassSerializable,
                indent=4,
                sort_keys=True
            )

    def decode_(self, file):
        try:
            with open(file, 'r') as infile:
                self.__dict__ = json.load(
                    infile,
                    object_hook=self.json_to_class
                )
        except FileNotFoundError:
            print("Persistence load failed "
                  "'{}' do not exists".format(file)
                  )


class C(JsonClassSerializable):

    def __init__(self):
        self.mill = "s"


JsonClassSerializable.register(C)


class B(JsonClassSerializable):

    def __init__(self):
        self.a = 1230
        self.c = C()


JsonClassSerializable.register(B)


class A(JsonClassSerializable):

    def __init__(self):
        self.a = 1
        self.b = {1, 2}
        self.c = B()

JsonClassSerializable.register(A)

A().encode_("test")
b = A()
b.decode_("test")
print(b.a)
print(b.b)
print(b.c.a)


Edit

After some more research, I found a way to generalize this with a metaclass, so that the explicit register call on the superclass is no longer needed:

import json
import collections.abc

REGISTERED_CLASS = {}

class MetaSerializable(type):

    def __call__(cls, *args, **kwargs):
        if cls.__name__ not in REGISTERED_CLASS:
            REGISTERED_CLASS[cls.__name__] = cls
        return super(MetaSerializable, cls).__call__(*args, **kwargs)


class JsonClassSerializable(json.JSONEncoder, metaclass=MetaSerializable):

    def default(self, obj):
        # collections.Set was removed in Python 3.10; the ABC lives in collections.abc
        if isinstance(obj, collections.abc.Set):
            return dict(_set_object=list(obj))
        if isinstance(obj, JsonClassSerializable):
            jclass = {}
            jclass["name"] = type(obj).__name__
            jclass["dict"] = obj.__dict__
            return dict(_class_object=jclass)
        else:
            return json.JSONEncoder.default(self, obj)

    def json_to_class(self, dct):
        if '_set_object' in dct:
            return set(dct['_set_object'])
        elif '_class_object' in dct:
            cclass = dct['_class_object']
            cclass_name = cclass["name"]
            if cclass_name not in REGISTERED_CLASS:
                raise RuntimeError(
                    "Class {} not registered in JSON Parser"
                    .format(cclass["name"])
                )
            instance = REGISTERED_CLASS[cclass_name]()
            instance.__dict__ = cclass["dict"]
            return instance
        return dct

    def encode_(self, file):
        with open(file, 'w') as outfile:
            json.dump(
                self.__dict__, outfile,
                cls=JsonClassSerializable,
                indent=4,
                sort_keys=True
            )

    def decode_(self, file):
        try:
            with open(file, 'r') as infile:
                self.__dict__ = json.load(
                    infile,
                    object_hook=self.json_to_class
                )
        except FileNotFoundError:
            print("Persistence load failed "
                  "'{}' do not exists".format(file)
                  )


class C(JsonClassSerializable):

    def __init__(self):
        self.mill = "s"


class B(JsonClassSerializable):

    def __init__(self):
        self.a = 1230
        self.c = C()


class A(JsonClassSerializable):

    def __init__(self):
        self.a = 1
        self.b = {1, 2}
        self.c = B()


A().encode_("test")
b = A()
b.decode_("test")
print(b.a)
# 1
print(b.b)
# {1, 2}
print(b.c.a)
# 1230
print(b.c.c.mill)
# s
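
As an alternative to the metaclass, the registration step can probably be done with `__init_subclass__` (Python 3.6+), which registers each subclass when it is defined rather than when it is first instantiated. This is a different technique from the one above, shown only as a sketch: Serializable, _encode and _decode are hypothetical names, and like the original it assumes every class has a no-argument constructor.

```python
import json

REGISTERED_CLASS = {}

class Serializable:
    def __init_subclass__(cls, **kwargs):
        # Called once for every subclass, at class definition time.
        super().__init_subclass__(**kwargs)
        REGISTERED_CLASS[cls.__name__] = cls

    def to_json(self):
        return json.dumps(self, default=_encode)

def _encode(obj):
    # Wrap registered objects in the same envelope the answer uses.
    if isinstance(obj, Serializable):
        return {"_class_object": {"name": type(obj).__name__, "dict": obj.__dict__}}
    raise TypeError("Object of type {} is not JSON serializable".format(type(obj).__name__))

def _decode(dct):
    if "_class_object" in dct:
        info = dct["_class_object"]
        instance = REGISTERED_CLASS[info["name"]]()  # assumes a no-arg constructor
        instance.__dict__ = info["dict"]
        return instance
    return dct

class C(Serializable):
    def __init__(self):
        self.mill = "s"

class B(Serializable):
    def __init__(self):
        self.a = 1230
        self.c = C()

text = B().to_json()
restored = json.loads(text, object_hook=_decode)
print(restored.a, restored.c.mill)
```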

The Answer 22

0 people think this answer is useful

The lightest solution I think is

import json
from typing import NamedTuple

_j = '{"name":"Иван","age":37,"mother":{"name":"Ольга","age":58},"children":["Маша","Игорь","Таня"],"married": true,' \
     '"dog":null} '


class PersonNameAge(NamedTuple):
    name: str
    age: int


class UserInfo(NamedTuple):
    name: str
    age: int
    mother: PersonNameAge
    children: list
    married: bool
    dog: str


j = json.loads(_j)
u = UserInfo(**j)

print(u.name, u.age, u.mother, u.children, u.married, u.dog)

>>> Иван 37 {'name': 'Ольга', 'age': 58} ['Маша', 'Игорь', 'Таня'] True None
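
Note that in the output the mother field is still a plain dict, because NamedTuple annotations are not enforced at runtime. A minimal sketch of converting the nested object explicitly, using a shortened version of the classes and invented sample data:

```python
import json
from typing import NamedTuple

class PersonNameAge(NamedTuple):
    name: str
    age: int

class UserInfo(NamedTuple):
    name: str
    age: int
    mother: PersonNameAge

j = json.loads('{"name": "Ivan", "age": 37, "mother": {"name": "Olga", "age": 58}}')

# Build the nested tuple by hand and override the raw dict before unpacking.
u = UserInfo(**{**j, "mother": PersonNameAge(**j["mother"])})

print(u.mother.name, u.mother.age)
```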

The Answer 23

-4 people think this answer is useful

Use the json module (new in Python 2.6) or the simplejson module which is almost always installed.
