
Python dataclasses
dataclasses is a module in the Python standard library. It provides a decorator and functions for automatically adding generated special methods, such as __init__() and __repr__(), to user-defined classes.
dataclasses is available in Python 3.7 and above.
Importing the dataclass decorator:
from dataclasses import dataclass
dataclasses is the module, and dataclass is a class decorator defined in it.
dataclass parameters (as of Python 3.7; later versions add further keyword parameters):
@dataclasses.dataclass(*, init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False)
With the default parameters, dataclass generates these three methods:
- __init__() for initializing objects
- __repr__() for the object representation
- __eq__() for equality comparison between objects of the class
from dataclasses import dataclass
@dataclass
class Student:
    firstname: str
    lastname: str
    rollno: int
    grade: str
# Instantiating objects: the generated __init__() method does this
s1 = Student("karthi", "Palani", 12, "third")
s2 = Student("Sarvesh", "Palani", 15, "first")
# The generated __repr__() method produces this representation
print(s1)
# Output: Student(firstname='karthi', lastname='Palani', rollno=12, grade='third')
print(s2)
# Output: Student(firstname='Sarvesh', lastname='Palani', rollno=15, grade='first')
# The generated __eq__() method handles equality
print(s1 == s2)  # Output: False
print(s1 != s2)  # Output: True
With a dataclass, there is no need to write the __init__(), __repr__() and __eq__() special methods. The dataclass decorator automatically generates these methods and adds them to the user-defined class.
Fields are declared as class attributes with type annotations, like
firstname:str
lastname:str
rollno:int
grade:str
In a normal class, we have to write the __init__(), __repr__() and __eq__() methods ourselves.
class Student:
    def __init__(self, firstname, lastname, rollno, grade):
        self.firstname = firstname
        self.lastname = lastname
        self.rollno = rollno
        self.grade = grade
    def __repr__(self):
        return f"{self.firstname}-{self.lastname}-{self.rollno}-{self.grade}"
    def __eq__(self, other):
        return (self.firstname, self.lastname, self.rollno, self.grade) == (
            other.firstname, other.lastname, other.rollno, other.grade)
# Instantiating Student objects
s1 = Student("karthi", "Palani", 12, "third")
s2 = Student("Sarvesh", "Palani", 15, "first")
# The object representation is defined in the __repr__() method
print(s1)  # Output: karthi-Palani-12-third
print(s2)  # Output: Sarvesh-Palani-15-first
# Equality is performed by the __eq__() method
print(s1 == s2)  # Output: False
Parameterized dataclasses:
@dataclasses.dataclass(*, init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False)
- order: If true (the default is False), __lt__(), __le__(), __gt__() and __ge__() methods are generated.
- unsafe_hash: If true (the default is False), a __hash__() method is generated from the fields, even though the objects may be mutable.
- frozen: If true (the default is False), assigning to a field raises an exception, so the attributes of the objects can’t be modified after creation.
The examples below use parameterized dataclasses.
- order is set to True: comparison operations (<, >, <=, >=) can be performed on objects. Objects are compared as if they were tuples of their fields, in the order the fields are defined.
from dataclasses import dataclass
@dataclass(order=True)
class Student:
    firstname: str
    lastname: str
    grade: str
    rollno: int
s1 = Student("karthi", "Palani", "third", 12)
s2 = Student("Sarvesh", "Palani", "third", 12)
# Performing comparison operations ("karthi" > "Sarvesh" because lowercase letters sort after uppercase)
print(s1 > s2)   # Output: True
print(s1 >= s2)  # Output: True
print(s1 <= s2)  # Output: False
print(s1 < s2)   # Output: False
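Because order=True generates __lt__(), a list of such objects can also be sorted directly. The following is a small sketch; the third student and the students list are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(order=True)
class Student:
    firstname: str
    lastname: str
    grade: str
    rollno: int


# Illustrative data; the third entry is not from the example above.
students = [
    Student("karthi", "Palani", "third", 12),
    Student("Sarvesh", "Palani", "first", 15),
    Student("Indhu", "Palani", "second", 7),
]

# sorted() uses the generated __lt__(); objects compare like tuples of
# their fields, so firstname is compared first.
for s in sorted(students):
    print(s.firstname)
# Output:
# Indhu
# Sarvesh
# karthi

# Sorting by a single field does not need order=True, only a key function.
by_rollno = sorted(students, key=lambda s: s.rollno)
print([s.rollno for s in by_rollno])  # Output: [7, 12, 15]
```

If the natural ordering should depend on rollno rather than firstname, one option is to declare rollno as the first field.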
- frozen is set to True: attribute values can’t be changed. Assigning to a field raises dataclasses.FrozenInstanceError.
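A minimal sketch of a frozen dataclass, reusing the Student fields from the earlier examples. Trying to assign to a field raises FrozenInstanceError, and the object stays unchanged.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Student:
    firstname: str
    lastname: str
    rollno: int
    grade: str


s1 = Student("karthi", "Palani", 12, "third")

try:
    s1.rollno = 20  # assignment to a field of a frozen instance
except FrozenInstanceError as exc:
    print("cannot modify:", exc)

# The instance is unchanged.
print(s1.rollno)  # Output: 12
```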
- unsafe_hash is set to True
Usually, hash() is used on immutable objects. A dataclass created with the default parameters (eq=True, frozen=False) is mutable, so its __hash__ is set to None and calling hash() on an object raises TypeError. If we still need a hash value for such objects, we can set unsafe_hash=True, which generates a __hash__() method from the fields. It is called unsafe because the hash value changes when a field is modified, which breaks sets and dictionaries that already contain the object.
from dataclasses import dataclass
@dataclass(unsafe_hash=True)
class Student:
    firstname: str
    lastname: str
    rollno: int
    grade: str
# Instantiating objects: the generated __init__() method does this
s1 = Student("karthi", "Palani", 12, "third")
s2 = Student("Sarvesh", "Palani", 15, "first")
# Hashing a mutable object works because unsafe_hash=True
print(hash(s1))  # Output: -1138310786 (the value varies between runs)
# Modifying an attribute value
s1.firstname = "Indhu"
# Since an attribute value changed, the hash value changes too
print(hash(s1))  # Output: -124687988 (the value varies between runs)
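For contrast, here is a short sketch of what happens without unsafe_hash=True. The Student class is shortened to two fields to keep the example small.

```python
from dataclasses import dataclass


@dataclass  # defaults: eq=True, frozen=False, unsafe_hash=False
class Student:
    firstname: str
    rollno: int


s1 = Student("karthi", 12)

# With eq=True and frozen=False, dataclass sets __hash__ to None.
print(Student.__hash__)  # Output: None

try:
    hash(s1)
except TypeError as exc:
    print("not hashable:", exc)
```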
field():
The dataclasses module has a field() function. It allows us to give additional per-field information.
Importing the field function:
from dataclasses import dataclass, field
dataclasses.field(*, default=MISSING, default_factory=MISSING, repr=True, hash=None, init=True, compare=True, metadata=None)
As shown above, the MISSING value is a sentinel object used to detect whether the default and default_factory parameters are provided. This sentinel is used because None is a valid value for default. No code should directly use the MISSING value.
The parameters to field() are:
- default
- default_factory
- init
- repr
- hash
- compare
- metadata
1.default
default is used to specify a default value for this field. Fields with a default value must come after fields without one (as with function parameters that have default values).
rollno: int = field(default=1)
or
rollno: int = 1
Here, the default value of rollno is 1.
from dataclasses import dataclass, field
@dataclass
class Student:
    firstname: str
    lastname: str
    grade: str
    rollno: int = field(default=1)
# Instantiating objects
# rollno is not given, so it takes the default value 1
s1 = Student("karthi", "Palani", "third")
print(s1)
# Output: Student(firstname='karthi', lastname='Palani', grade='third', rollno=1)
# rollno is given
s2 = Student("Sarvesh", "Palani", "first", 15)
print(s2)
# Output: Student(firstname='Sarvesh', lastname='Palani', grade='first', rollno=15)
2.default_factory
default_factory accepts a function that takes no arguments. It is called whenever a default value is needed for this field, and its return value becomes the default value of that attribute.
rollno: int = field(default_factory=get_rollno)
from dataclasses import dataclass, field
def get_rollno():
    return 12
@dataclass(order=True)
class Student:
    firstname: str
    lastname: str
    grade: str
    rollno: int = field(default_factory=get_rollno)
# If rollno is not given, it takes the default value 12
s1 = Student("karthi", "Palani", "third")
s2 = Student("Sarvesh", "Palani", "third", 15)
print(s1)
# Output: Student(firstname='karthi', lastname='Palani', grade='third', rollno=12)
print(s2)
# Output: Student(firstname='Sarvesh', lastname='Palani', grade='third', rollno=15)
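A common use of default_factory is a mutable default such as a list. The sketch below assumes a hypothetical marks field, which is not part of the Student class above. Every object gets its own new list, while a plain [] default is rejected when the class is created.

```python
from dataclasses import dataclass, field


@dataclass
class Student:
    firstname: str
    # marks is a hypothetical field; list() is called once per new object.
    marks: list = field(default_factory=list)


s1 = Student("karthi")
s2 = Student("Sarvesh")
s1.marks.append(90)
print(s1.marks)  # Output: [90]
print(s2.marks)  # Output: []

# A plain mutable default is not allowed and raises ValueError
# at class creation time.
try:
    @dataclass
    class BadStudent:
        firstname: str
        marks: list = []
except ValueError as exc:
    print("rejected:", exc)
```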
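3.init
By default, init is set to True. If init is set to False, this field is not a parameter of the generated __init__() method, so it is not passed while instantiating an object. Such a field should get its value from a default or from __post_init__().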
from dataclasses import dataclass, field
@dataclass
class Student:
    firstname: str
    lastname: str
    grade: str
    rollno: int = field(init=False, default=5)
# Instantiating objects
s1 = Student("karthi", "Palani", "third")
print(s1)
# Output: Student(firstname='karthi', lastname='Palani', grade='third', rollno=5)
4.repr
By default, repr is set to True. That means the object representation includes this field. If we want to exclude this field from the object representation, we can set repr=False.
from dataclasses import dataclass, field
@dataclass
class Student:
    firstname: str
    lastname: str = field(repr=False)
    grade: str = field(repr=False)
    rollno: int
# Instantiating objects
s1 = Student("karthi", "Palani", "third", 12)
# Only firstname and rollno are displayed in the object representation
print(s1)
# Output: Student(firstname='karthi', rollno=12)
s2 = Student("Sarvesh", "Palani", "first", 15)
print(s2)
# Output: Student(firstname='Sarvesh', rollno=15)
5.hash
It can be a bool or None. If hash is set to True, this field is included in the generated __hash__() method. If it is set to None (the default), the value of the compare parameter is used. This is normally the expected behavior, because a field that takes part in comparisons should also take part in the hash.
rollno: int = field(hash=True)
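As a rough sketch of the effect, the class below excludes grade from the hash with hash=False while keeping it in comparisons. frozen=True is used here only so that a __hash__() method is generated; the shortened Student class is an assumption for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)  # eq=True plus frozen=True generates __hash__()
class Student:
    firstname: str
    rollno: int
    # grade is compared by __eq__() but left out of __hash__()
    grade: str = field(default="third", hash=False)


s1 = Student("karthi", 12, "third")
s2 = Student("karthi", 12, "first")

# Only firstname and rollno are hashed, so the hash values match.
print(hash(s1) == hash(s2))  # Output: True
# grade still takes part in equality, so the objects differ.
print(s1 == s2)  # Output: False
```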
6.compare
By default, compare is set to True. That means this field is included in the generated equality and comparison methods (__eq__(), __lt__() and so on). If set to False, this field is excluded from equality and comparison operations.
from dataclasses import dataclass, field
@dataclass(order=True)
class Student:
    firstname: str = field(compare=False)
    lastname: str
    rollno: int
    grade: str
# Instantiating objects
s1 = Student("karthi", "Palani", 12, "third")
s2 = Student("Sarvesh", "Palani", 12, "third")
# Returns True because firstname is not part of the comparison (compare is set to False)
print(s1 == s2)  # Output: True
print(s1 >= s2)  # Output: True
7.metadata
It is a mapping (key-value pairs) or None. The dataclasses module itself does not use metadata. It is provided as an extension mechanism, so third-party code that works with the dataclass can read extra information about this field.
from dataclasses import dataclass, field
@dataclass
class Student:
    firstname: str
    lastname: str
    rollno: int = field(metadata={'student': 'register number'})
    grade: str = field(default="third")
s1 = Student("karthi", "Palani", 12)
s2 = Student("Sarvesh", "Palani", 12)
# Metadata information is retrieved
print(s1.__dataclass_fields__['rollno'])
# Output (the object addresses vary between runs):
# Field(name='rollno',type=<class 'int'>,default=<dataclasses._MISSING_TYPE object at 0x0357D3E8>,default_factory=<dataclasses._MISSING_TYPE object at 0x0357D3E8>,
# init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'student': 'register number'}),_field_type=_FIELD)
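The same information can also be read through the public dataclasses.fields() function instead of the __dataclass_fields__ attribute. A minimal sketch, using the same Student class:

```python
from dataclasses import dataclass, field, fields


@dataclass
class Student:
    firstname: str
    lastname: str
    rollno: int = field(metadata={'student': 'register number'})
    grade: str = field(default="third")


# fields() returns a tuple of Field objects, one per field.
for f in fields(Student):
    # f.metadata is a read-only mapping; dict() makes it print plainly.
    print(f.name, dict(f.metadata))
# Output:
# firstname {}
# lastname {}
# rollno {'student': 'register number'}
# grade {}

# Look up the metadata of one field by name.
rollno_field = next(f for f in fields(Student) if f.name == "rollno")
print(rollno_field.metadata['student'])  # Output: register number
```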
post_init processing:
While instantiating an object, the generated __init__() method is called. If __post_init__() is defined in the class, __init__() calls it automatically at the end.
It is used for initializing values that depend on one or more fields: if we want to initialize an attribute based on other attributes of the class, we can put that logic in the __post_init__() method.
For example, if we want to add a fullname attribute to the Student dataclass, it can be created by concatenating firstname and lastname in __post_init__().
from dataclasses import dataclass, field
@dataclass(order=True)
class Student:
    firstname: str
    lastname: str
    rollno: int
    grade: str
    def __post_init__(self):
        self.fullname = self.firstname + " " + self.lastname
# Instantiating objects
s1 = Student("karthi", "Palani", 12, "third")
s2 = Student("Sarvesh", "Palani", 12, "third")
print(s1.fullname)  # Output: karthi Palani
print(s2.fullname)  # Output: Sarvesh Palani
asdict(), astuple():
asdict()
Converts a dataclass instance into a dictionary. Each dataclass is converted to a dict of its fields (name-value pairs).
astuple()
Converts a dataclass instance into a tuple. Each dataclass is converted to a tuple of its field values.
We have to import asdict and astuple from the dataclasses module:
from dataclasses import dataclass, field, asdict, astuple
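Both functions work recursively, so a field that holds another dataclass instance is converted as well. The sketch below assumes a hypothetical Address dataclass that is not part of the examples in this post.

```python
from dataclasses import dataclass, asdict, astuple


@dataclass
class Address:  # hypothetical nested dataclass, for illustration only
    city: str
    pincode: int


@dataclass
class Student:
    firstname: str
    rollno: int
    address: Address


s1 = Student("karthi", 12, Address("Chennai", 600001))

# The nested Address becomes a nested dict.
print(asdict(s1))
# Output: {'firstname': 'karthi', 'rollno': 12, 'address': {'city': 'Chennai', 'pincode': 600001}}

# The nested Address becomes a nested tuple.
print(astuple(s1))
# Output: ('karthi', 12, ('Chennai', 600001))
```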
is_dataclass()
dataclasses.is_dataclass(class_or_instance)
Returns True if its parameter is a dataclass or an instance of a dataclass. Otherwise returns False.
from dataclasses import dataclass, field, asdict, astuple, is_dataclass
@dataclass
class Student:
    firstname: str
    lastname: str
    rollno: int
    grade: str
    fullname: str = field(init=False)
    def __post_init__(self):
        self.fullname = self.firstname + " " + self.lastname
# Instantiating object
s1 = Student("karthi", "Palani", 12, "third")
print(s1)
# Output: Student(firstname='karthi', lastname='Palani', rollno=12, grade='third', fullname='karthi Palani')
# s1 (a dataclass instance) is converted into a dict
print(asdict(s1))
# Output: {'firstname': 'karthi', 'lastname': 'Palani', 'rollno': 12, 'grade': 'third', 'fullname': 'karthi Palani'}
# s1 is converted into a tuple
print(astuple(s1))
# Output: ('karthi', 'Palani', 12, 'third', 'karthi Palani')
# Checks whether s1 is a dataclass or a dataclass instance
print(is_dataclass(s1))  # Output: True
# Checks whether Student is a dataclass or a dataclass instance
print(is_dataclass(Student))  # Output: True
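Since is_dataclass() returns True for both the class and its instances, telling them apart needs one more check. A small sketch; is_dataclass_instance is a helper name chosen here, not a function of the dataclasses module.

```python
from dataclasses import dataclass, is_dataclass


@dataclass
class Student:
    firstname: str
    rollno: int


def is_dataclass_instance(obj):
    # True only for instances of a dataclass, not for the class itself.
    return is_dataclass(obj) and not isinstance(obj, type)


s1 = Student("karthi", 12)
print(is_dataclass_instance(s1))       # Output: True
print(is_dataclass_instance(Student))  # Output: False
print(is_dataclass("karthi"))          # Output: False
```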
Conclusion:
- Fields with default values must come after all fields without default values. Otherwise, a TypeError is raised when the class is defined.
- If frozen is set to True (with eq=True, the default), a __hash__() method is generated, so we can call hash() on the objects, because frozen objects are treated as immutable.
- The three dataclass declarations below are equivalent.
@dataclass
class Student:
    ...
@dataclass()
class Student:
    ...
@dataclass(init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False)
class Student:
    ...
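The first two points of the conclusion can be checked with a short sketch. BadStudent and the shortened Student class are illustrative names only.

```python
from dataclasses import dataclass

# 1. A field without a default after a field with a default is rejected
#    with TypeError when the class is created.
try:
    @dataclass
    class BadStudent:
        firstname: str
        rollno: int = 1
        grade: str  # no default, but follows a field with a default
except TypeError as exc:
    print("rejected:", exc)


# 2. frozen=True (with the default eq=True) makes the objects hashable.
@dataclass(frozen=True)
class Student:
    firstname: str
    rollno: int


s1 = Student("karthi", 12)
s2 = Student("karthi", 12)
print(hash(s1) == hash(s2))  # Output: True
# Equal objects with equal hashes collapse to one set member.
print(len({s1, s2}))  # Output: 1
```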