Memory Profiling in Python

Data scientists often need to sharpen their tools. If you use Python for analyzing data or running predictive models, here's a tool to help you avoid those dreaded out-of-memory issues that tend to come up with large datasets: memory_profiler. This profiler is designed to measure the memory usage of Python programs, line by line. It's cross-platform and should work on any modern Python version (2.7 and up). To use it, you'll need to install it first (pip is the preferred way).
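
Here's a minimal sketch of how that looks in practice, assuming memory_profiler has been installed with pip install memory_profiler (the function below is just an illustrative example, not from the original):

```python
from memory_profiler import profile

@profile  # prints a line-by-line memory report when the function runs
def build_big_list():
    # Allocate a large list so the memory increment is visible in the report
    data = [i * i for i in range(1000000)]
    total = sum(data)
    del data  # release the list before returning
    return total

if __name__ == "__main__":
    build_big_list()
```

Run the script as usual (python script.py); because @profile is imported explicitly, the report prints when build_big_list returns. Alternatively, running python -m memory_profiler script.py injects the @profile decorator for you, so the explicit import becomes unnecessary.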

Using Jupyter on Remote Servers

As a data scientist, it really helps to have a powerful computer nearby when you need it. But even with an i7 laptop and 16 GB of RAM, you'll sometimes find yourself needing more power. Whether your task is compute-bound or memory-bound, you'll eventually look to the cloud for more resources. Today I'll outline how to be more effective when you have to compute remotely. For the basics, I like to refer folks to this great article on setting up SSH configs.
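
As a sketch of the kind of setup that article covers (the host alias, address, and username below are placeholders I've invented, not values from the article), an ~/.ssh/config entry plus a local port forward makes a remote Jupyter session feel local:

```
# ~/.ssh/config -- "bigbox" is a hypothetical alias for your cloud machine
Host bigbox
    HostName 203.0.113.10              # placeholder address
    User ubuntu                        # placeholder username
    LocalForward 8888 localhost:8888   # tunnel Jupyter's default port

# On the remote machine, start Jupyter without opening a browser:
#   jupyter notebook --no-browser --port=8888
# Then connect from your laptop with `ssh bigbox` and open
# http://localhost:8888 locally.
```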

Getting Up and Running With Python Virtual Environments

Python is a great tool to have available for all sorts of tasks, including data analysis and machine learning. It's a great language to start with if you're a beginner, and there are loads of tutorials out there. So, if you're a neophyte Pythonista, head over to one of those tutorials and come back here later. Beyond the language itself, plenty of great developers have been working on tools that just get the job done, including pandas for wrangling your data (and turning it into something that looks like a spreadsheet) and scikit-learn for running anything from basic statistics to more complex learning algorithms on your data.
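
Before installing those libraries, it's worth isolating them in a virtual environment. Here's a minimal sketch using the standard-library venv module (the .venv directory name is just a common convention, not a requirement):

```
# Create a virtual environment in a .venv directory
python3 -m venv .venv

# Activate it (macOS/Linux; on Windows run .venv\Scripts\activate instead)
source .venv/bin/activate

# Install the libraries mentioned above into the isolated environment
pip install pandas scikit-learn
```

Once activated, pip installs into .venv rather than your system Python, so each project can pin its own dependency versions without conflicts.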

You Probably Need a Database

When I see organizations using and talking about their data, they love to present the tools they use to handle and wrangle it. You've probably heard terms like Hadoop, Spark, Shark, PostgreSQL, MySQL, MongoDB, and (rarely) Excel. (If you haven't, there's a good list to look up on Wikipedia.) Taming data certainly takes good tools, but my argument is that the right tools depend on the scale of your data.
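
To make the scale argument concrete, here's a hedged sketch using Python's built-in sqlite3 module: when your data fits comfortably on one machine, a single-file database often does the job before you need anything on that list (the table and file names are illustrative):

```python
import sqlite3

# A single-file database: no server, no cluster, just the standard library.
conn = sqlite3.connect("measurements.db")  # illustrative filename
cur = conn.cursor()

cur.execute("CREATE TABLE IF NOT EXISTS readings (sensor TEXT, value REAL)")
cur.executemany(
    "INSERT INTO readings (sensor, value) VALUES (?, ?)",
    [("a", 1.5), ("b", 2.7), ("a", 3.1)],  # toy rows for illustration
)
conn.commit()

# Plain SQL queries -- the same skill transfers to PostgreSQL or MySQL.
for sensor, avg_value in cur.execute(
    "SELECT sensor, AVG(value) FROM readings GROUP BY sensor"
):
    print(sensor, avg_value)

conn.close()
```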