Objects in Python

Open junnplus opened this issue 8 years ago • 4 comments

In Python, everything is an "object".

0x00 PyObject

PyObject is defined in Include/object.h. Let's take a look at it.

typedef struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;

PyObject is a struct with two members:

  • ob_refcnt, the reference count
  • ob_type, a pointer to the type object

Their types are Py_ssize_t and struct _typeobject * respectively.

Py_ssize_t

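As the name suggests, ob_refcnt is the object's reference count, used by the reference-counting part of Python's memory management and garbage collection. The two macros Py_INCREF(op) and Py_DECREF(op) are the main way to increment and decrement an object's reference count. For an object A, whenever a new PyObject * comes to refer to A, A's reference count goes up; when a PyObject * that referred to A is removed, the count goes down. When A's reference count drops to 0, A's deallocator is called to release its memory.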

#define Py_INCREF(op) (                         \
    _Py_INC_REFTOTAL  _Py_REF_DEBUG_COMMA       \
    ((PyObject *)(op))->ob_refcnt++)

#define Py_DECREF(op)                                   \
    do {                                                \
        PyObject *_py_decref_tmp = (PyObject *)(op);    \
        if (_Py_DEC_REFTOTAL  _Py_REF_DEBUG_COMMA       \
        --(_py_decref_tmp)->ob_refcnt != 0)             \
            _Py_CHECK_REFCNT(_py_decref_tmp)            \
        else                                            \
            _Py_Dealloc(_py_decref_tmp);                \
    } while (0)
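The effect of these two macros can be observed from the Python side. The following is a minimal sketch, assuming a standard CPython interpreter: sys.getrefcount reports an object's reference count, but the number it returns also includes temporary references held by the call itself, so only the differences between readings are meaningful. The names obj, holder, before, after_add and after_del are made up for this illustration.

```python
import sys

# A fresh object, so that no other code holds references to it.
obj = object()

before = sys.getrefcount(obj)

# Storing obj in a list creates one more PyObject * that points at it.
holder = [obj]
after_add = sys.getrefcount(obj)

# Dropping that reference undoes the increment.
holder.clear()
after_del = sys.getrefcount(obj)

print(after_add - before, after_del - before)
```

The sketch uses object() rather than a small integer or a short string, because such values may be shared or cached by the interpreter and their counts are then harder to interpret.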

ob_refcnt has type Py_ssize_t, which is defined in Include/pyport.h as follows:

/* Py_ssize_t is a signed integral type such that sizeof(Py_ssize_t) ==
 * sizeof(size_t).  C99 doesn't define such a thing directly (size_t is an
 * unsigned integral type).  See PEP 353 for details.
 */
#ifdef HAVE_SSIZE_T
typedef ssize_t         Py_ssize_t;
#elif SIZEOF_VOID_P == SIZEOF_SIZE_T
typedef Py_intptr_t     Py_ssize_t;
#else
#   error "Python needs a typedef for Py_ssize_t in pyport.h."
#endif

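Py_ssize_t is a signed integer type that occupies the same number of bytes as size_t. (C99 does not define ssize_t; it comes from POSIX, so it is available on many platforms but not everywhere.)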
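As a rough check of this property from Python, assuming CPython with the standard ctypes module: ctypes.c_ssize_t and ctypes.c_size_t mirror the C types ssize_t and size_t, and sys.maxsize is documented as the largest value a Py_ssize_t variable can hold. The variable names below are hypothetical.

```python
import ctypes
import sys

# ctypes.c_ssize_t mirrors the C ssize_t type; c_size_t mirrors size_t.
signed_size = ctypes.sizeof(ctypes.c_ssize_t)
unsigned_size = ctypes.sizeof(ctypes.c_size_t)

# Largest value of a signed two's-complement integer of that width.
max_from_width = 2 ** (8 * signed_size - 1) - 1

print(signed_size, unsigned_size, sys.maxsize == max_from_width)
```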

Before version 2.5, ob_refcnt was of type int. PEP 353 describes this change:

In Python 2.4, indices of sequences are restricted to the C type int. On 64-bit machines, sequences therefore cannot use the full address space, and are restricted to 2**31 elements. This PEP proposes to change this, introducing a platform-specific index type Py_ssize_t. An implementation of the proposed change is in http://svn.python.org/projects/python/branches/ssize_t .

PyTypeObject

ob_type is a pointer to a _typeobject struct and identifies the object's type object. The struct is defined in Include/object.h:

#ifdef Py_LIMITED_API
typedef struct _typeobject PyTypeObject; /* opaque */
#else
typedef struct _typeobject {
    PyObject_VAR_HEAD
    const char *tp_name; /* For printing, in format "<module>.<name>" */
    Py_ssize_t tp_basicsize, tp_itemsize; /* For allocation */

    /* Methods to implement standard operations */

    destructor tp_dealloc;
    printfunc tp_print;
    getattrfunc tp_getattr;
    setattrfunc tp_setattr;
    PyAsyncMethods *tp_as_async; /* formerly known as tp_compare (Python 2)
                                    or tp_reserved (Python 3) */
    reprfunc tp_repr;

    /* Method suites for standard classes */

    PyNumberMethods *tp_as_number;
    PySequenceMethods *tp_as_sequence;
    PyMappingMethods *tp_as_mapping;

    /* More standard operations (here for binary compatibility) */

    hashfunc tp_hash;
    ternaryfunc tp_call;
    reprfunc tp_str;
    getattrofunc tp_getattro;
    setattrofunc tp_setattro;

    /* Functions to access object as input/output buffer */
    PyBufferProcs *tp_as_buffer;

    /* Flags to define presence of optional/expanded features */
    unsigned long tp_flags;

    const char *tp_doc; /* Documentation string */

    /* Assigned meaning in release 2.0 */
    /* call function for all accessible objects */
    traverseproc tp_traverse;

    /* delete references to contained objects */
    inquiry tp_clear;

    /* Assigned meaning in release 2.1 */
    /* rich comparisons */
    richcmpfunc tp_richcompare;

    /* weak reference enabler */
    Py_ssize_t tp_weaklistoffset;

    /* Iterators */
    getiterfunc tp_iter;
    iternextfunc tp_iternext;

    /* Attribute descriptor and subclassing stuff */
    struct PyMethodDef *tp_methods;
    struct PyMemberDef *tp_members;
    struct PyGetSetDef *tp_getset;
    struct _typeobject *tp_base;
    PyObject *tp_dict;
    descrgetfunc tp_descr_get;
    descrsetfunc tp_descr_set;
    Py_ssize_t tp_dictoffset;
    initproc tp_init;
    allocfunc tp_alloc;
    newfunc tp_new;
    freefunc tp_free; /* Low-level free-memory routine */
    inquiry tp_is_gc; /* For PyObject_IS_GC */
    PyObject *tp_bases;
    PyObject *tp_mro; /* method resolution order */
    PyObject *tp_cache;
    PyObject *tp_subclasses;
    PyObject *tp_weaklist;
    destructor tp_del;

    /* Type attribute cache version tag. Added in version 2.6 */
    unsigned int tp_version_tag;

    destructor tp_finalize;

#ifdef COUNT_ALLOCS
    /* these must be last and never explicitly initialized */
    Py_ssize_t tp_allocs;
    Py_ssize_t tp_frees;
    Py_ssize_t tp_maxalloc;
    struct _typeobject *tp_prev;
    struct _typeobject *tp_next;
#endif
} PyTypeObject;
#endif

The _typeobject struct begins with the macro PyObject_VAR_HEAD, which means a type object is itself a variable-size object:

#define PyObject_VAR_HEAD      PyVarObject ob_base;

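Apart from the PyObject_VAR_HEAD macro, the members of _typeobject fall into four groups:

  • tp_name, the type name, used mainly inside Python and when debugging to identify an object's type;
  • tp_basicsize and tp_itemsize, which describe how much memory to allocate when creating an object of this type;
  • the operations associated with the type (a large number of function pointers, such as tp_print);
  • type information about the type object itself.

For what each individual field does, see the "Type Objects" documentation.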
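Some of these fields are visible from Python code. The sketch below assumes CPython, where the attributes __name__, __basicsize__ and __itemsize__ of a type reflect tp_name, tp_basicsize and tp_itemsize. The tuple type is used only as an example, and all variable names are hypothetical.

```python
import struct
import sys

sample = (1, 2, 3)
tp = type(sample)              # the type object that ob_type points to

type_name = tp.__name__        # derived from tp_name
basic = tp.__basicsize__       # tp_basicsize: size of the fixed part, in bytes
item = tp.__itemsize__         # tp_itemsize: bytes per element

# A tuple stores one PyObject * per element, so item should be pointer-sized.
pointer_size = struct.calcsize("P")

# Compared with the empty tuple, each element adds tp_itemsize bytes.
growth = sys.getsizeof(sample) - sys.getsizeof(())

# A type object is itself an object, and its own type is `type`.
meta = type(tp)

print(type_name, basic, item, growth, meta)
```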

0x01 PyVarObject

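Objects that contain no variable-length data, such as integer objects, are usually called "fixed-size objects" (PyObject), while objects that do contain variable-length data, such as strings, are called "variable-size objects" (PyVarObject). The difference is that different instances of a fixed-size object all occupy the same amount of memory, whereas different instances of a variable-size object may occupy different amounts. Quoted from 《Python源码剖析》 by 陈儒 (Chen Ru).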

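Note: in Python 2 the integer object (PyIntObject) is a fixed-size object, but the long integer object (PyLongObject) is a variable-size object. Python 3 has only PyLongObject for integers, and it is also a variable-size object.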

typedef struct {
    PyObject ob_base;
    Py_ssize_t ob_size; /* Number of items in variable part */
} PyVarObject;

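Compared with PyObject, the variable-size object PyVarObject has one extra field, ob_size, which records how many elements the object holds. A Python list is a PyVarObject: if a list has 5 elements, its ob_size is 5.

In effect, PyVarObject is an extension of PyObject, so for any PyVarObject the bytes at the start of its memory have exactly the same meaning as in a PyObject.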
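The list example can be checked by reading ob_size straight out of memory. This is only a sketch and relies on CPython implementation details: id() returns the object's address, object.__basicsize__ equals sizeof(PyObject), and ob_size is assumed to sit immediately after the PyObject header, as in the PyVarObject declaration above. It is not something to rely on in real code. The names items, header_size and ob_size are chosen for the illustration.

```python
import ctypes

items = [10, 20, 30, 40, 50]

# Size of a bare PyObject header; ob_size is assumed to follow it directly.
header_size = object.__basicsize__

# In CPython, id() is the address of the object in memory.
ob_size = ctypes.c_ssize_t.from_address(id(items) + header_size).value

print(ob_size, len(items))
```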

The relationship between different Python objects and PyObject / PyVarObject (figure taken from 《Python源码剖析》):

image

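In other words, inside Python every object has the same object header. This makes references to objects very uniform: a single PyObject * pointer is enough to refer to any object, no matter what kind of object it actually is. Quoted from 《Python源码剖析》 by 陈儒 (Chen Ru).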


As for the _PyObject_HEAD_EXTRA macro in PyObject, it can simply be ignored unless the build option Py_TRACE_REFS is defined.

#ifdef Py_TRACE_REFS
/* Define pointers to support a doubly-linked list of all live heap objects. */
#define _PyObject_HEAD_EXTRA            \
    struct _object *_ob_next;           \
    struct _object *_ob_prev;

#define _PyObject_EXTRA_INIT 0, 0,

#else
#define _PyObject_HEAD_EXTRA
#define _PyObject_EXTRA_INIT
#endif

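In the version shown here, Py_TRACE_REFS is defined only when Python is compiled in debug mode. It makes it easy to track objects through the doubly-linked list that _PyObject_HEAD_EXTRA adds. (Since Python 3.8 a debug build no longer implies Py_TRACE_REFS; it is enabled separately with the --with-trace-refs configure option.)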
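Whether a given interpreter was built this way can be probed at runtime. A small sketch, assuming CPython: the function sys.getobjects is documented as existing only in builds compiled with Py_TRACE_REFS, so its presence serves as the indicator here. The variable name trace_refs_build is hypothetical.

```python
import sys

# sys.getobjects exists only when CPython was compiled with Py_TRACE_REFS.
trace_refs_build = hasattr(sys, "getobjects")

# object.__basicsize__ is sizeof(PyObject) for the running interpreter.
print(trace_refs_build, object.__basicsize__)
```

On an ordinary release build this is expected to report False.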

junnplus avatar Apr 26 '17 14:04 junnplus

Which version of Python is this based on?

huosan0123 avatar Mar 20 '19 12:03 huosan0123

@huosan0123 3.7, I think.

junnplus avatar Mar 26 '19 03:03 junnplus

image

Did the code for _PyObject_HEAD_EXTRA and _typeobject get mixed up here?

liyaodev avatar Apr 12 '21 09:04 liyaodev

@yaolipro Yes, the code was mixed up. It has been fixed now, thanks for pointing it out.

junnplus avatar Apr 16 '21 02:04 junnplus