As an economist by training, I took many mathematics and statistics modules during my university days. My only programming experience was a single module in Java, which in my opinion is one of the most difficult languages to pick up (hats off to Java programmers), especially for someone with limited computer science knowledge at that time (or even now).
Being a data professional these days, it is essential to know programming, which is the reason I picked up two of the most popular programming languages for data science – R and Python.
I thought of sharing my learning journey in programming to help people, especially the non-computer science folks, understand more about programming and, hopefully, take the first steps in coding.
As mentioned, my first programming language was Java and I was really bad at it because I could not grasp the mechanism behind the ‘public’ and ‘private’ access modifiers even after many explanations from my friends. What was most frustrating was that, back in those days, we wrote code in Notepad, so there was no colour coding to help me catch syntax errors early on.
Moreover, the error messages were so cryptic that I had no idea how to go about correcting the errors. There was no Google or Stack Overflow around to help poor souls like me. I had to constantly reference very thick physical textbooks and narrow my solution space so that I could pinpoint the actual error. It was very painful and became even more so after I received my grade for the Java programming module. It was a terrible experience and I have the grades to show for it.
I graduated during a very bad economic downturn. The reality then was brutal for fresh graduates, as there were many experienced professionals around and companies were more willing to give opportunities to them than to fresh grads like me. It was only after a very long time that I was finally hired by a research centre. The research centre used a software package called SAS, a statistical analysis tool that started in the US during the late 1970s. To use the software, I had to revisit programming. Fortunately, when I looked at the syntax and how it worked behind the scenes, I was able to grasp it quickly. With my background in data concepts, I knew what I wanted to do with SAS. Google was available by then to help with code and syntax details, although the search results were not as good as they are today. Long story short, I gained new confidence in programming, and proficiency in SAS made me more effective in my analysis and modelling work.
Knowledge of SAS programming helped me get into the banks because proficiency in it was very rare at that time. While I was in the banks, I came across R. Just in case you are wondering, the banks were not using R. Rather, I got to know R through my professional network.
R is open-source software used mostly for statistics and data science. A friend of mine told me I should learn R programming to broaden my skill set and have more to offer to prospective employers. With renewed confidence in programming, I decided to pick up R. There were a lot of online tutorials on R then and I proceeded to research and learn it. Again, my strong grounding in data analytics concepts helped me pick up R quickly. I noticed that the syntax for R programming actually looked very much like Excel functions, with the exception of the Tidyverse packages. With this realisation, I was able to learn R even faster. After successfully learning R, I went on to pick up Python, a popular general-purpose language, paying special attention to the data science portion. That meant focusing on packages like pandas, NumPy and scikit-learn.
Nowadays, there is an abundance of online material available for anyone to start learning a programming language. The first challenge is no longer picking up the language basics but having an actual environment installed on your computer for practice.
I encourage all my training participants and you, the reader, to pick up programming, or at least to gain some programming experience. With Industry 4.0, most jobs will be associated with data, and in order to have better and smarter automation, we cannot run away from interacting with computers. Programming gives us that ability to interact. You can learn programming from online sources or perhaps even use social media to gather a group of like-minded folks to learn together. With Google and Stack Overflow, even if the error messages are pretty cryptic (by the way, they still are!), debugging an error is much easier compared to the days when I was learning Java.
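To make the point about error messages a little more concrete, here is a minimal Python sketch. It is my own illustration: the helper name `describe_error` is hypothetical and not part of any library. It triggers a typical beginner mistake and pulls out the two pieces worth pasting into a search engine, the error type and the message.

```python
def describe_error(func):
    """Run func and return (error type name, message), or None if it succeeds."""
    try:
        func()
    except Exception as exc:
        # The type name and the message are usually the most searchable parts.
        return type(exc).__name__, str(exc)
    return None


# A typical beginner mistake: converting non-numeric text to an integer.
result = describe_error(lambda: int("abc"))
print(result)
```

Searching for the error type together with the message text tends to surface someone who has hit the same problem before.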
What about coding bootcamps? My advice is to attend a coding bootcamp if you want to maintain a structure in your learning, but do check out the trainer’s background beforehand. Make sure that the trainer has enough experience to share industry best practices and the common pitfalls in that particular programming language.
Here are a few recommended resources to kick things off.
For R, you can install R from CRAN and also install a popular IDE (Integrated Development Environment) for R, called RStudio. A beginner tutorial which I would recommend is the one written by Hadley Wickham, “R for Data Science”.
For Python, you can download Anaconda (for a local version) or register an account with Microsoft Azure Notebooks and join the AI for Industry programme run by AI Singapore (it uses courses from DataCamp, led by mentors from AI Singapore). Otherwise, if you want to learn at your own pace, I recommend the online tutorial by Jake VanderPlas, “Python Data Science Handbook”.
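Once a Python environment is installed, a quick way to confirm that it is ready is to check whether the data science packages can be imported. The snippet below is a minimal sketch using only the standard library; the package list and the helper name `check_packages` are my own assumptions for illustration (note that scikit-learn is imported under the name `sklearn`).

```python
import importlib.util

# Import names of the packages mentioned in this post (an illustrative list).
PACKAGES = ["pandas", "numpy", "sklearn"]


def check_packages(names):
    """Return a dict mapping each package name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_packages(PACKAGES)
for name, installed in status.items():
    print(f"{name}: {'installed' if installed else 'missing'}")
```

If any package shows up as missing, it will need to be installed into the environment before you can follow along with a tutorial.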
All in all, I strongly encourage everyone to at least try their hand at programming, especially folks who want to become data professionals, because there are very, very few ways to avoid programming in such a job role. Have fun learning programming!