Life is Short, I Use Python!

By 苏剑林 | December 06, 2015

Python Data Analysis and Mining in Action

During the summer, at the invitation of Teddy Company, I worked on Python Data Analysis and Mining in Action, the Python companion volume to their book MATLAB Data Analysis and Mining in Action (there is also a companion volume for the R language). My main tasks were writing the introduction to Python and porting the book's MATLAB code to Python. I gladly accepted for three reasons: first, to earn some pocket money through part-time work; second, to train myself systematically in Python programming; and third, to experience a real "showdown" between MATLAB, R, and Python. The book has now been officially released and can be found on Amazon, Dangdang, JD.com, and Taobao. I am also honored to be listed as one of the authors, which makes this my first published book.

This post may look like an advertisement, and indeed it is. But I am not here just to promote the book; I want to "promote" Python itself. As for the book, I am confident that its Python scripts hold their own against MATLAB and R in both code conciseness and execution efficiency. Moreover, although the cases in the book generally involve only a few hundred data points, I wrote the code with datasets of tens or even hundreds of thousands of records in mind. As for practical use, in my current part-time role at a company I use Python to analyze tens of millions of articles. In my experience, Python performs impressively across the board. No wonder some exclaim: life is short, I use Python!

That's right. Life is short, I use Python! Even if you do not pursue a career as a programmer or data analyst in the future, learning Python can bring great convenience to your existing work, whether you are a student, teacher, researcher, or office professional.

Returning to the book Python Data Analysis and Mining in Action, let me add a few more promotional lines so this post doesn't seem too brief. Most of the cases in the book come from the three data mining competitions hosted by Teddy Company. The competition problems have very clear practical backgrounds, so the cases in the book are highly practical and cover a wide range of topics. In terms of tasks, they cover data processing, classification, clustering, association analysis, natural language processing, and other fundamentals. In terms of models, they cover common ones such as Logistic Regression, SVM, Decision Trees, and Neural Networks. This may not be "the" perfect book, but it is certainly a very dedicated one. (At least, I was quite diligent with my code translation, and I completely rewrote Chapter 2, the introduction to Python, which I will share with everyone later. ^_^)
