Introduction

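By default, Python objects compare equal only when they are the very same object. The is operator checks whether two references point to the same object in memory, while == for some types (such as plain object) falls back to an identity comparison, which is in effect a comparison of id() values.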

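A hash is an integer fingerprint of a piece of data, not a guaranteed-unique identifier: equal data always has the same hash, and different data only rarely shares one. Hashes make fast lookups possible, as in dictionaries and sets. The __hash__ and __eq__ methods let you customize how your own objects are hashed and compared.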

2 hash, id, and equality

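The built-in functions hash and id form the backbone of how Python decides whether objects are equal. By default, Python objects compare unequal unless they are the very same object.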

A = object()
B = object()
A == B
    False

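Bare object() instances are also known as sentinels, because each one is unique and can be used to check exactly for a value that nothing else can reproduce.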

    __my_sentinel = object()

    def what_was_passed(value=__my_sentinel):
        if value is __my_sentinel:
            print('Nothing was passed.')
        else:
            print(f'You passed a {value!r}.')

    >>> what_was_passed("abc")
    You passed a 'abc'.
    >>> what_was_passed(object())
    You passed a <object object at 0x0000027AF6AD88C0>.
    >>> what_was_passed()
    Nothing was passed.
    >>> what_was_passed(None)
    You passed a None.

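To understand comparison in Python, we first have to understand the is keyword. Python's is operator checks whether two values refer to the exact same object in memory.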

Think of Python objects as boxes floating in space. Variables, list indexes, and so on are named arrows pointing at those boxes.
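The boxes-and-arrows picture can be made concrete with a short sketch. The variable names here are illustrative only; the point is that assignment adds a new arrow to an existing box, while list(...) builds a new box with equal contents.

```python
# Two names (arrows) pointing at one list object (box).
first = [1, 2, 3]
second = first            # no copy: a second arrow to the same box
second.append(4)          # mutating through one name...
same_box = first is second
print(first, same_box)    # ...is visible through the other name

# A copy is a different box that merely holds equal contents.
copied = list(first)
print(copied == first, copied is first)
```

Because is compares the arrows' targets and == compares contents, the copy is equal to the original without being identical to it.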

For object itself, == is defined as an identity comparison, roughly as in the following sketch. Other types are free to override ==.

    class object:
        def __eq__(self, other):
            return self is other

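The actual implementation of object is written in C. Lists, for example, override == so that two distinct list objects with equal items compare equal: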

  x = [1, 2, 3]
  y = [1, 2, 3]
  x is y
      False
  x == y
      True

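We have not looked at all or zip yet, but conceptually list equality uses them to make sure that the items at every index of the two lists are equal.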
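The list implementation this sentence refers to is not shown here. As a rough sketch only (the real list type is written in C, and lists_equal is a hypothetical helper name), element-wise equality built from all and zip might look like this:

```python
def lists_equal(left, right):
    """Hypothetical helper: element-wise equality, roughly what list == does."""
    if len(left) != len(right):
        return False
    # zip pairs up items index by index; all() is True only if every pair matches.
    return all(a == b for a, b in zip(left, right))

print(lists_equal([1, 2, 3], [1, 2, 3]))
print(lists_equal([1, 2, 3], [1, 2, 4]))
print(lists_equal([1, 2], [1, 2, 3]))
```

The length check matters because zip stops at the shorter input, so without it a list would compare equal to any longer list that starts with the same items.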

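Likewise, sets are unordered, so the position of an item does not matter, only whether it is present. To check presence quickly, Python uses the concept of hashes.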

The "hash" of a piece of data is a precomputed value that looks quite random but can be used to identify that data.

3 Hashes and their properties

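Hashes have two specific properties:

The same data always has the same hash.

Even a very slight change to the data usually produces a completely different hash.

This means that if two values have the same hash, they very likely (though not certainly, since collisions exist) have the same value as well.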
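A small sketch of both directions of that claim. The collision shown relies on a CPython implementation detail (the value -1 is reserved internally as an error code), so treat it as an assumption about the interpreter in use rather than a language guarantee.

```python
# Equal data always hashes equally within one interpreter run.
same_str = hash("spam") == hash("spam")
same_tuple = hash((1, 2)) == hash((1, 2))
print(same_str, same_tuple)

# The reverse does not hold: distinct values can collide.
# CPython detail (assumed here): hash(-1) is -2, the same as hash(-2).
collide = hash(-1) == hash(-2)
print(collide, -1 == -2)
```

This is why hash-based containers compare hashes first and then still confirm with == before treating two keys as the same.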

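Comparing hashes is a very fast way to check for presence. This is what dictionaries and sets use to find values inside them almost instantly. Values that compare equal must also hash equally: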

    >>> hash(42) == hash(42.0) == hash(42+0j)
    True
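A practical consequence, sketched with illustrative names: because 42, 42.0, and 42+0j compare equal and share a hash, a dictionary or set treats them as one and the same key.

```python
prices = {42: "int key"}
prices[42.0] = "float key"      # 42.0 == 42 and the hashes match, so this overwrites
print(len(prices), prices[42])

seen = {42}
print(42.0 in seen, (42 + 0j) in seen)
```

The dictionary still holds a single entry after the second assignment; only the stored value changes.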

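Immutable container objects, such as strings (a string is a sequence of strings), tuples, and frozensets, generate their hash by combining the hashes of their items.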

This lets you create a custom hash for your own classes by composing the built-in hash function:

class Car:
    def __init__(self, color, wheels=4):
        self.color = color
        self.wheels = wheels

    def __hash__(self):
        return hash((self.color, self.wheels))
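Car as written defines __hash__ but not __eq__, so two cars with the same color and wheels hash alike yet still compare unequal. A minimal sketch of a variant that keeps the two methods consistent follows; ComparableCar and _key are hypothetical names introduced for illustration.

```python
class ComparableCar:
    """Hypothetical variant of Car that defines __eq__ alongside __hash__."""

    def __init__(self, color, wheels=4):
        self.color = color
        self.wheels = wheels

    def _key(self):
        # One tuple drives both equality and hashing, so they cannot disagree.
        return (self.color, self.wheels)

    def __eq__(self, other):
        if not isinstance(other, ComparableCar):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())

garage = {ComparableCar("red"), ComparableCar("red"), ComparableCar("blue", 3)}
print(len(garage))
```

The set collapses the two red cars into one entry. Attributes that feed the hash should not be changed while the object sits in a set or serves as a dictionary key.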

4 dir and vars: everything is a dictionary

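Almost everything is stored in dictionaries, and the vars function exposes the variables stored on objects and classes. Here is vars applied to a function: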

    >>> vars(what_was_passed)
        {}

And vars applied to a class and to one of its instances:

    >>> c = C(x=3)

    >>> vars(C)
    mappingproxy({'__module__': '__main__',
    'some_cons': 42,
    '__init__': <function C.__init__ at 0x0000027AF72F4940>,
    'mth': <function C.mth at 0x0000027AF72F49D0>,
    '__dict__': <attribute '__dict__' of 'C' objects>,
    '__weakref__': <attribute '__weakref__' of 'C' objects>,
    '__doc__': None})
    >>> vars(c)
    {'x': 3}

    >>> c.__class__  # comes from inheritance
    <class '__main__.C'>

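Methods (mth and __init__) are actually stored as functions in the class's dictionary. The code of a function does not change from object to object; only the variables passed to it change.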
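The definition of C is not shown in this section. A minimal sketch that would be consistent with the output above is given here; the body of mth is a guess, since the original is not available.

```python
import types

class C:
    some_cons = 42          # class attribute, stored in vars(C)

    def __init__(self, x):
        self.x = x          # instance attribute, stored in vars(c)

    def mth(self):          # body is an assumption; the original is not shown
        return self.x

c = C(x=3)
print(vars(c))
print(sorted(k for k in vars(C) if not k.startswith("__")))
print(isinstance(vars(C)["mth"], types.FunctionType))
```

The instance dictionary holds only x, while some_cons and mth live in the class dictionary, with mth stored there as an ordinary function object.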

>>> class C:
...     def function(self, x):
...         print(f'self={self}, x={x}')

>>> c = C()
>>> C.function(c, 5)
self=<__main__.C object at 0x7f90762461f0>, x=5
>>> c.function(5)
self=<__main__.C object at 0x7f90762461f0>, x=5
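The equivalence of the two calls can be inspected directly. In this sketch (with the method returning its arguments instead of printing, purely for illustration), attribute access on the instance wraps the stored function in a bound method that remembers the instance.

```python
class C:
    def function(self, x):
        return (self, x)

c = C()
plain = vars(C)["function"]   # the function object stored in the class dict
bound = c.function            # attribute access wraps it in a bound method

print(bound.__func__ is plain, bound.__self__ is c)
print(C.function(c, 5) == c.function(5))
```

So c.function(5) is shorthand for calling the one shared function with c supplied as the first argument.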

5 Viewing all attributes of object:

    >>> "__class__" in vars(object)
    True
    >>> vars(object).keys()
    dict_keys(['__repr__', '__hash__', '__str__', '__getattribute__', '__setattr__',
    '__delattr__', '__lt__', '__le__', '__eq__', '__ne__', '__gt__', '__ge__',
    '__init__', '__new__', '__reduce_ex__', '__reduce__', '__subclasshook__',
    '__init_subclass__', '__format__', '__sizeof__', '__dir__', '__class__', '__doc__'])

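This brings us to how the "method resolution order" works. MRO for short, it is the list of classes from which an object inherits its attributes and methods.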

    >>> class A:
    ...     def __init__(self):
    ...         self.x = 'x'
    ...         self.y = 'y'
    ...
    >>> class B(A):
    ...     def __init__(self):
    ...         self.z = 'z'
    ...
    >>> a = A()
    >>> b = B()
    >>> B.mro()
    [<class '__main__.B'>, <class '__main__.A'>, <class 'object'>]
    >>> dir(b)
    ['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__',
    '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__',
    '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__',
    '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
    '__subclasshook__', '__weakref__', 'z']
    >>> set(dir(b)) - set(dir(a))  # all values in dir(b) that are not in dir(a)
    {'z'}
    >>> vars(b).keys()
    dict_keys(['z'])
    >>> set(dir(a)) - set(dir(object))
    {'x', 'y'}
    >>> vars(a).keys()
    dict_keys(['x', 'y'])

So each level of inheritance adds its newer names to the dir list, and dir on a subclass shows everything found along its method resolution order.
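As a rough illustration of that idea, the sketch below gathers names by walking the MRO and merging each class dictionary with the instance's own. names_along_mro is a hypothetical helper, and it only approximates what dir does internally.

```python
class A:
    x = 'x'

class B(A):
    y = 'y'

def names_along_mro(obj):
    """Hypothetical helper: gather attribute names roughly the way dir() does."""
    names = set(vars(obj))                 # the instance's own attributes
    for cls in type(obj).mro():            # B, then A, then object
        names.update(vars(cls))            # each class's own dictionary
    return names

b = B()
b.z = 'z'
found = names_along_mro(b)
print({'x', 'y', 'z'} <= found, {'x', 'y', 'z'} <= set(dir(b)))
```

Names defined on the instance, on B, and on A all show up, which matches the behavior described for dir.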

This is how Python suggests attribute completions in the REPL:

class A:
    x = 'x'
class B(A):
    y = 'y'
b = B()
b.   # press Tab twice here
    b.x  b.y   # completions offered automatically

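__slots__ allows instances of a class to have only the attributes listed in it. A loose analogy with Go interfaces may help: in Go, a type that fails to implement an interface is rejected at compile time; in Python, assigning an attribute that goes beyond what __slots__ defines raises an error at runtime.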

Here is a strange and interesting behavior that Python has:

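Plain object instances behave as if they had slots: they have no instance dictionary and reject new attributes. Instances of user-defined classes accept arbitrary new attributes.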

>>> x = object()   
>>> x.f = 5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'object' object has no attribute 'f'
>>> class A:   
...     x = 1
...
>>> a  = A()
>>> a.f = 5

This is where __slots__ comes in. First, let me replicate in my own class the behavior shown by list and object:

>>> class C:
...     __slots__ = ()
...
>>> c = C()
>>> c.foo = 5
AttributeError: 'C' object has no attribute 'foo'

6 How Python stores data

Dictionaries use more memory. Slot-based storage is essentially like a Python tuple: a structure of fixed size that uses less memory.

These two ways of storing data are exposed through two object attributes, __dict__ and __slots__.

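Normally, all instance attributes such as self.foo are stored in the __dict__ dictionary,
unless the class defines __slots__, in which case the object can only have the fixed set of attributes specified there.
A class that defines __slots__ does not get a __dict__ on its instances.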
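A minimal sketch contrasting the two storage modes. PointDict and PointSlots are hypothetical classes introduced only for this comparison, and the byte counts printed at the end vary by Python version and platform.

```python
import sys

class PointDict:
    def __init__(self, x, y):
        self.x, self.y = x, y

class PointSlots:
    __slots__ = ("x", "y")      # fixed set of attributes, no per-instance dict

    def __init__(self, x, y):
        self.x, self.y = x, y

p, q = PointDict(1, 2), PointSlots(1, 2)
print(hasattr(p, "__dict__"), hasattr(q, "__dict__"))

try:
    q.z = 3                     # not listed in __slots__
except AttributeError as exc:
    print("rejected:", exc)

# Rough size comparison; exact numbers vary by Python version and platform.
print(sys.getsizeof(p) + sys.getsizeof(vars(p)), sys.getsizeof(q))
```

Note that the no-__dict__ behavior applies when every class in the inheritance chain uses __slots__; a base class without it brings the instance dictionary back.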

For example, here are the attributes of the built-in list type:

    >>> vars(list)
    mappingproxy({'__repr__': <slot wrapper '__repr__' of 'list' objects>, '__hash__': None, '__getattribute__': <slot wrapper '__getattribute__' of 'list' objects>, '__lt__': <slot wrapper '__lt__' of 'list' objects>, ...)

7 A clearer view of the list structure:

__repr__:<slot wrapper '__repr__' of 'list' objects>
__hash__:None
__getattribute__:<slot wrapper '__getattribute__' of 'list' objects>
__lt__:<slot wrapper '__lt__' of 'list' objects>
__le__:<slot wrapper '__le__' of 'list' objects>
__eq__:<slot wrapper '__eq__' of 'list' objects>
__ne__:<slot wrapper '__ne__' of 'list' objects>
__gt__:<slot wrapper '__gt__' of 'list' objects>
__ge__:<slot wrapper '__ge__' of 'list' objects>
__iter__:<slot wrapper '__iter__' of 'list' objects>
__init__:<slot wrapper '__init__' of 'list' objects>
__len__:<slot wrapper '__len__' of 'list' objects>
__getitem__:<method '__getitem__' of 'list' objects>
__setitem__:<slot wrapper '__setitem__' of 'list' objects>
__delitem__:<slot wrapper '__delitem__' of 'list' objects>
__add__:<slot wrapper '__add__' of 'list' objects>
__mul__:<slot wrapper '__mul__' of 'list' objects>
__rmul__:<slot wrapper '__rmul__' of 'list' objects>
__contains__:<slot wrapper '__contains__' of 'list' objects>
__iadd__:<slot wrapper '__iadd__' of 'list' objects>
__imul__:<slot wrapper '__imul__' of 'list' objects>
__new__:<built-in method __new__ of type object at 0x00007FFE87DA1AF0>
__reversed__:<method '__reversed__' of 'list' objects>
__sizeof__:<method '__sizeof__' of 'list' objects>
clear:<method 'clear' of 'list' objects>
copy:<method 'copy' of 'list' objects>
append:<method 'append' of 'list' objects>
insert:<method 'insert' of 'list' objects>
extend:<method 'extend' of 'list' objects>
pop:<method 'pop' of 'list' objects>
remove:<method 'remove' of 'list' objects>
index:<method 'index' of 'list' objects>
count:<method 'count' of 'list' objects>
reverse:<method 'reverse' of 'list' objects>
sort:<method 'sort' of 'list' objects>
__class_getitem__:<method '__class_getitem__' of 'list' objects>
__doc__:Built-in mutable sequence.

8 The dict structure:

__repr__:<slot wrapper '__repr__' of 'dict' objects>
__hash__:None
__getattribute__:<slot wrapper '__getattribute__' of 'dict' objects>
__lt__:<slot wrapper '__lt__' of 'dict' objects>
__le__:<slot wrapper '__le__' of 'dict' objects>
__eq__:<slot wrapper '__eq__' of 'dict' objects>
__ne__:<slot wrapper '__ne__' of 'dict' objects>
__gt__:<slot wrapper '__gt__' of 'dict' objects>
__ge__:<slot wrapper '__ge__' of 'dict' objects>
__iter__:<slot wrapper '__iter__' of 'dict' objects>
__init__:<slot wrapper '__init__' of 'dict' objects>
__or__:<slot wrapper '__or__' of 'dict' objects>
__ror__:<slot wrapper '__ror__' of 'dict' objects>
__ior__:<slot wrapper '__ior__' of 'dict' objects>
__len__:<slot wrapper '__len__' of 'dict' objects>
__getitem__:<method '__getitem__' of 'dict' objects>
__setitem__:<slot wrapper '__setitem__' of 'dict' objects>
__delitem__:<slot wrapper '__delitem__' of 'dict' objects>
__contains__:<method '__contains__' of 'dict' objects>
__new__:<built-in method __new__ of type object at 0x00007FFE87D988B0>
__sizeof__:<method '__sizeof__' of 'dict' objects>
get:<method 'get' of 'dict' objects>
setdefault:<method 'setdefault' of 'dict' objects>
pop:<method 'pop' of 'dict' objects>
popitem:<method 'popitem' of 'dict' objects>
keys:<method 'keys' of 'dict' objects>
items:<method 'items' of 'dict' objects>
values:<method 'values' of 'dict' objects>
update:<method 'update' of 'dict' objects>
fromkeys:<method 'fromkeys' of 'dict' objects>
clear:<method 'clear' of 'dict' objects>
copy:<method 'copy' of 'dict' objects>
__reversed__:<method '__reversed__' of 'dict' objects>
__class_getitem__:<method '__class_getitem__' of 'dict' objects>
__doc__:dict() -> new empty dictionary
dict(mapping) -> new dictionary initialized from a mapping object's
    (key, value) pairs
dict(iterable) -> new dictionary initialized as if via:
    d = {}
    for k, v in iterable:
        d[k] = v
dict(**kwargs) -> new dictionary initialized with the name=value pairs
    in the keyword argument list.  For example:  dict(one=1, two=2)
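Listings in this name:value form can be produced with a short loop over vars. The sketch below uses a hypothetical helper named show_structure, and also demonstrates what the __hash__:None entry means in practice.

```python
def show_structure(cls):
    """Hypothetical helper: print each entry of a class dictionary on its own line."""
    for name, value in vars(cls).items():
        print(f"{name}:{value}")

show_structure(list)

# __hash__ is None for list and dict, which marks their instances as unhashable.
try:
    hash([1, 2, 3])
except TypeError as exc:
    print("unhashable:", exc)
```

The exact entries and addresses printed depend on the Python version, so the output will not match the listings above line for line.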

9 Summary

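In Python, is (identity, as reported by id()) and hash() underpin how object equality and membership are determined. dir() and vars() reveal an object's attributes and internal representation, and __slots__ reduces memory use. Lists and dictionaries have different memory and performance characteristics; dictionaries use a hash table for fast access.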

Keep in mind how Python objects operate.
