This patch is a code optimisation, trading text bytes for speed. On
pyboard it's an increase of 0.06% in code size for a gain (in pystone
performance) of roughly 6.5%.
The patch optimises load/store/delete of attributes in user defined classes
by not looking up special accessors (@property, __get__, __delete__,
__set__, __setattr__ and __getattr_) if they are guaranteed not to exist in
the class.
Currently, if you do my_obj.foo() then the runtime has to do a few checks
to see if foo is a property or has __get__, and if so delegate the call.
And for stores things like my_obj.foo = 1 has to first check if foo is a
property or has __set__ defined on it.
Doing all those checks each and every time the attribute is accessed has a
performance penalty. This patch eliminates all those checks for cases when
it's guaranteed that the checks will always fail, ie no attributes are
properties nor have any special accessor methods defined on them.
To make this guarantee it checks all attributes of a user-defined class
when it is first created. If any of the attributes of the user class are
properties or have special accessors, or any of the base classes of the
user class have them, then it sets a flag in the class to indicate that
special accessors must be checked for. Then in the load/store/delete code
it checks this flag to see if it can take the shortcut and optimise the
lookup.
It's an optimisation that's pretty widely applicable because it improves
lookup performance for all methods of user defined classes, and stores of
attributes, at least for those that don't have special accessors. And, it
allows to enable descriptors with minimal additional runtime overhead if
they are not used for a particular user class.
There is one restriction on dynamic class creation that has been introduced
by this patch: a user-defined class cannot go from zero special accessors
to one special accessor (or more) after that class has been subclassed. If
the script attempts this an AttributeError is raised (see addition to
tests/misc/non_compliant.py for an example of this case).
The cost in code space bytes for the optimisation in this patch is:
unix x64: +528
unix nanbox: +508
stm32: +192
cc3200: +200
esp8266: +332
esp32: +244
Performance tests that were done:
- on unix x86-64, pystone improved by about 5%
- on pyboard, pystone improved by about 6.5%, from 1683 up to 1794
- on pyboard, bm_chaos (from CPython benchmark suite) improved by about 5%
- on esp32, pystone improved by about 30% (but there are caching effects)
- on esp32, bm_chaos improved by about 11%