This week we welcome William Horton (@hortonhearsafoo) as our PyDev of the Week! William is a Senior Software Engineer at Compass and has spoken at several local Python conferences. He is a contributor to PyTorch and fastai.
Let’s spend some time getting to know William better!
Can you tell us a little about yourself (hobbies, education, etc.)?
A little about myself: people might be surprised about my educational background–I didn’t study computer science. I have a bachelor’s in the social sciences. So by the time I finished undergrad, the most programming I had done was probably doing regressions in Stata to finish my thesis. I decided against grad school, and instead signed up for a coding bootcamp (App Academy) in NYC. The day I’m writing this, September 28, is actually five years to the day since I started at App Academy.
Since then I’ve worked at a few different startups in NYC, across various industries: first investment banking, then online pharmacy, and now real estate. I’m currently a senior engineer on the AI Services team at Compass, working on machine learning solutions for our real estate agents and consumers.
I like to spend my free time on a few different hobbies. I’m a competitive powerlifter, so I like to get into the gym a few times a week (although with the pandemic in NYC I didn’t lift for six months or so). I’ve actually found powerlifting to be a pretty common hobby among software engineers. Every time someone new joined my gym, it seemed like they came from a different startup. I love to play basketball. And I’m passionate about music: I’ve been a singer almost my whole life, and most recently was performing with an a cappella group in NYC. And in the last year I’ve picked up the guitar, after not touching it since I was a teenager, and that has been very fulfilling.
Why did you start using Python?
But I think the real turning point for me was when I discovered the fast.ai course in the fall of 2017. I had taken a few machine learning courses online, including the Andrew Ng Coursera course, and it was a topic that I found interesting. But the fast.ai course just really sucked me in–the way that Jeremy Howard presented the material just gripped me in a certain way, and made me want to find out more. I loved his pitch: if you know some Python, and you have high school level math, you can get hands-on with machine learning, and start to grow your skills.
So by the time I was looking for a job in 2018, I knew I wanted to do something closer to data and machine learning. I joined Compass for a backend data role, on a growing team that was handling all of the real estate listing data we had coming in from different sources. That gave me the chance to learn some important tools: I set up the first Airflow instance at Compass, and worked on our PySpark code. And then when the machine learning team started up, I was able to contribute to the first project, and eventually join the team full-time.
What other programming languages do you know and which is your favorite?
What projects are you working on now?
My main project right now is working on Likely to Sell Recommendations at Compass. We use historical data to learn a model of which properties are likely to sell, and then connect that with the addresses that agents have put into their contacts lists in the Compass CRM. It’s a Python-powered project all the way through: the model is scikit-learn, we use PySpark for processing the data, and the API is a Python gRPC service. We have a blog post on the Compass Medium page that has more information if people are interested in learning more.
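For readers curious what "learning a model of which properties are likely to sell" can look like in miniature, here is a rough scikit-learn sketch. It is not the Compass model: the interview only says the model is built with scikit-learn, so the choice of logistic regression, the two feature names, and the synthetic data below are all assumptions made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for historical listing data.
# Both features are hypothetical; the real feature set is not described.
rng = np.random.default_rng(0)
n = 1000
years_since_last_sale = rng.uniform(0, 20, n)
price_change_pct = rng.normal(0, 10, n)
X = np.column_stack([years_since_last_sale, price_change_pct])

# Made-up labelling rule plus noise: longer-held homes "sell" more often.
signal = 0.3 * (years_since_last_sale - 10) + rng.normal(0, 1, n)
y = (signal > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Logistic regression is an assumed model type, chosen for simplicity.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Score each held-out property with a probability of selling,
# then pick the five highest-scoring ones as "recommendations".
sell_probability = model.predict_proba(X_test)[:, 1]
top_indices = np.argsort(sell_probability)[::-1][:5]
```

In a setup like this, the ranked scores are what would be joined against an agent's contact addresses; that joining step is left out here because the interview does not describe it.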
The other project I’m really excited about is our Machine Learning Pipelines project, which we’re building on top of the open-source platform Kubeflow. It’s a way to define and run machine learning workflows on top of Kubernetes, which allows you to get some big benefits in terms of leveraging distributed computing, parallelization, and resource management. We’re already using it for the Likely to Sell project I mentioned above, and it’s allowed us to iterate and experiment more quickly. I had the chance to present a poster about Kubeflow Pipelines at SciPy 2020, and I also have a (virtual) talk on the topic at SciPy Japan (Oct. 30-Nov. 2).
Which Python libraries are your favorite (core or 3rd party)?
It’s hard to pick; there are a lot of great libraries out there! But to name a few: for the work I do professionally, I think that Jupyter Notebook, pandas, and scikit-learn are just essential. Really great libraries that have been around a while, and have stood the test of time. And I also have to shout out PyTorch and fastai for fostering my interest in deep learning and machine learning in general, which is what started me down the road to my current role.
How did you get into giving talks at Python conferences?
I would say a few things contributed to it: my own curiosity, the support of the community, and also, admittedly, just luck. It all started because I signed up for a meetup that was a PyGotham talk brainstorm session hosted at Dropbox NYC. They took us through some exercises, and we all shared some ideas, and I took my best one and submitted it to the PyGotham 2018 CFP. But I got rejected.
However, the PyOhio CFP was around the same time, and I saw a tweet that was encouraging people to submit to that one too, so I sent the same proposal to PyOhio. And I got in! I was pretty excited to make the trip, but also very nervous to do my talk. PyOhio did offer speaker coaching to first-time speakers, so I’m thankful to them for that. I ended up having a great time giving the talk, and enjoyed the chance to meet some people and see some of the other talks. So I decided I wanted to do it again.
And then came…more rejections. I think I sent that same talk to two more conferences at the end of 2018 and got rejected. I decided to come up with some fresh material for 2019, and I’d say PyTexas 2019 is when I really hit my stride. I gave a talk that I was really proud of, “CUDA in your Python”, but I also started meeting more people in the community, and that really contributed a lot to my conference experience.
Do you have any tips for people who would like to give technical talks?
The first thing I’d say is: put yourself out there. I’m a perfectionist by nature, so it’s really hard for me to actually hit the submit button on a CFP (even now, when I’ve had talks accepted). But at the end of the day, some reviewers are going to like your proposal, and some aren’t, so if you want to give the talk, you just have to play the numbers game, submit to a few places, and hope for the best.
The other thing I’d stress is that you don’t have to be the world’s expert on something to give a talk about it. It can be intimidating starting out when you see speakers who are the authors of libraries, or who have ten years more experience than you, or who work at a big-name company. But I would tell people starting out: all you have to do is create a 25-minute experience where people enjoy the presentation and learn something from it that they didn’t know before. A lot of people coming to conferences, especially the regional Python conferences, are early on in their learning process, so there’s a lot of value in just presenting your own take on some intro-level material.
Thanks for doing the interview, William!