Shocking! X Orders of Magnitude Faster! Just How Fast Is Cython?

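For now, pandas is still my favorite tool for data processing; it really is a joy to use. That ease of use comes from complex underlying data structures, and the complexity costs performance. Usability matters so much, though, that a whole series of tools has appeared to speed up or parallelize this kind of work, such as Numba (a JIT compiler) and Datatable, along with distributed parallel frameworks like dask. Numba, for that matter, already supports GPU acceleration, which is pretty nice. None of that is the focus of this post, however; parallelizing pandas was covered in an earlier article.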

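This time let's look at another way to speed up Python: pushing the work down to C. Here we use Cython, which scikit-learn relies on for much of its computation. You can of course also extend Python directly in C/C++, which is the route PyTorch, TensorFlow, and NumPy take, but that requires being comfortable with C/C++ development, and it is probably less productive than Cython. Probably.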

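Why is Python slow? Because it is dynamically typed: at run time the interpreter spends a lot of time working out each object's type and then looking up what that type supports. Strictly statically typed languages such as C have no such overhead, because types are fixed at compile time, so C is far more efficient, in some special cases by several orders of magnitude. In the words of the Cython documentation: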

This can make Python a very relaxed and comfortable language for rapid development, but with a price – the ‘red tape’ of managing data types is dumped onto the interpreter. At run time, the interpreter does a lot of work searching namespaces, fetching attributes and parsing argument and keyword tuples. This run-time ‘late binding’ is a major cause of Python’s relative slowness compared to ‘early binding’ languages such as C++.
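
The run-time work described in the quote can be seen in a small sketch. The names add and Meters are hypothetical and exist only for illustration: the same untyped function ends up doing three different things, because the interpreter decides what "+" means from the types it finds on each call.

```python
class Meters:
    """Hypothetical user-defined type, only here to show run-time dispatch."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return Meters(self.value + other.value)


def add(a, b):
    # No types are declared, so the interpreter has to inspect the run-time
    # types of a and b on every call to decide what "+" means.
    return a + b


print(add(1, 2))                        # integer addition
print(add("Cy", "thon"))                # string concatenation
print(add(Meters(1), Meters(2)).value)  # Meters.__add__, found at run time
```

A C compiler resolves the equivalent choice once, at compile time, which is the gap Cython's type declarations aim to close.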

The "This" at the start of the quote refers to Python's dynamic typing. Let's go straight to an example, taken from this article: https://pythonprogramming.net/introduction-and-basics-cython-tutorial


```python
# example_original.py
def test(x):
    y = 0
    for i in range(x):
        y += i
    return y
```

```python
# example_cython.pyx
cpdef int test(int x):
    cdef int y = 0
    cdef int i
    for i in range(x):
        y += i
    return y
```
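
One consequence of declaring y as a C int is worth keeping in mind: a Python integer grows as needed, while a C int has a fixed width and will typically wrap around once the sum gets too large. The sketch below only emulates that behaviour in pure Python with ctypes; it is not what Cython generates, and the helper names python_sum and c_int_sum are made up for this illustration.

```python
import ctypes


def python_sum(x):
    """Same loop as example_original.test: Python ints never overflow."""
    y = 0
    for i in range(x):
        y += i
    return y


def c_int_sum(x):
    """Emulate the typed loop: y is squeezed back into a C int after each add."""
    y = 0
    for i in range(x):
        # ctypes performs no overflow checking, so the value wraps silently.
        y = ctypes.c_int(y + i).value
    return y


print(python_sum(100), c_int_sum(100))  # small input: both give 4950
print(python_sum(100_000))              # 4999950000
print(c_int_sum(100_000))               # 704982704 when C int is 32 bits
```

For the small argument used in the benchmark in this post (100) the two versions agree, so the comparison stays fair.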

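Cython source files use the ".pyx" extension. Compared with the plain Python function, the Cython version adds type declarations: cpdef makes the function callable from both Python and C, and cdef int gives the local variables C types. Next, write a setup.py file to build the .pyx file: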


from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules = cythonize('example_cython.pyx'))

With all three files in the same directory, run the following in a shell:

 python setup.py build_ext --inplace

If all goes well, you will get... a warning:

FutureWarning: Cython directive ‘language_level’ not set, using 2 for now (Py2). This will change in a later release!

To get rid of it, add the following declaration at the top of the .pyx file:

# cython: language_level=3
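
The directive can also be set from the build script instead of the source file, through the compiler_directives argument of cythonize. This is a build-configuration sketch that assumes the same file names as above; either approach should be enough on its own.

```python
# setup.py (alternative): set language_level for every module built here
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize(
        'example_cython.pyx',
        compiler_directives={'language_level': "3"},
    )
)
```

Setting it in setup.py keeps the choice in one place when a project has several .pyx files.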

If nothing else goes wrong, you are all set. Next, write a test file:

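import time

import example_cython
import example_original

if __name__ == '__main__':
    times = 10000      # number of timed calls per implementation
    add_times = 100    # argument passed to test(): sums 0..99

    # original: the pure-Python version
    original_total_elapse = 0.0
    for i in range(times):
        # perf_counter() is a high-resolution clock, better suited to
        # very short intervals than time.time()
        start_time = time.perf_counter()
        example_original.test(add_times)
        original_total_elapse += time.perf_counter() - start_time

    # cython: the compiled version
    cython_total_elapse = 0.0
    for i in range(times):
        start_time = time.perf_counter()
        example_cython.test(add_times)
        cython_total_elapse += time.perf_counter() - start_time

    print("Cython is {}x faster.".format(original_total_elapse / cython_total_elapse))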
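
Timing every call separately also measures the timer's own overhead, which matters when a single call takes well under a microsecond. As a rough alternative, here is a sketch built on the standard timeit module. The speedup helper is hypothetical, and builtin_sum only stands in for the compiled function so that the snippet runs without building anything.

```python
import timeit


def loop_sum(x):
    """Same loop as example_original.test."""
    y = 0
    for i in range(x):
        y += i
    return y


def builtin_sum(x):
    """Stand-in for the compiled function so the sketch runs without a build."""
    return sum(range(x))


def speedup(baseline, candidate, arg, number=10_000, repeat=5):
    """Return how many times faster candidate(arg) is than baseline(arg)."""
    # Each entry returned by timeit.repeat is the total time for `number`
    # calls; taking the minimum reduces noise from other processes.
    base = min(timeit.repeat(lambda: baseline(arg), number=number, repeat=repeat))
    cand = min(timeit.repeat(lambda: candidate(arg), number=number, repeat=repeat))
    return base / cand


if __name__ == '__main__':
    print("Speedup: {:.1f}x".format(speedup(loop_sum, builtin_sum, 100)))
```

With the extension built, the same helper could be called as speedup(example_original.test, example_cython.test, 100). The lambda wrapper adds a small constant cost to both measurements, so the ratio it reports is, if anything, slightly conservative.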


By tensorzen
