I look at data every day. If I could go back and give a past version of myself one piece of advice, it would be this: make it a rule to fit your data into a box.
There are plenty of mathematical techniques out there for analyzing data, but to apply them effectively to your particular data, that data needs to fit the following format:
- Data consists of rows and columns.
- Your data should be viewable using any common spreadsheet application.
- Each row represents an instance of data (in other words, each row represents one object under study, be it a person, a spammy email, or a photograph with a face to be recognized).
- Each column represents a feature, or something that we can use to describe the instance (this could be a person’s height, the number of occurrences of the word “FREE” in a spammy email, or the length of a detected edge in a picture of a face).

When you encounter some new data, it’s best to strive to fit it into that framework.
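
To make the row-and-column idea concrete, here is a minimal sketch in Python using the spammy email example. The three emails, the `to_row` helper, and the column names `free_count` and `word_count` are all made up for illustration. Only the shape matters: one row per email, one column per feature.

```python
import csv
import io

# Hypothetical raw data: three emails, each one an instance under study.
emails = [
    {"id": "email_1", "text": "FREE offer! Claim your FREE prize now"},
    {"id": "email_2", "text": "Meeting moved to 3pm tomorrow"},
    {"id": "email_3", "text": "Get a FREE quote today"},
]


def to_row(email):
    """Turn one email (an instance) into one row of features (columns)."""
    words = email["text"].split()
    return {
        "id": email["id"],
        # Feature: occurrences of the word "FREE", ignoring trailing punctuation.
        "free_count": sum(1 for w in words if w.strip("!.,?") == "FREE"),
        # Feature: total number of whitespace-separated words.
        "word_count": len(words),
    }


rows = [to_row(e) for e in emails]
columns = ["id", "free_count", "word_count"]

# Write the table as CSV text, which common spreadsheet applications can open.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=columns)
writer.writeheader()
writer.writerows(rows)
table_csv = buffer.getvalue()
print(table_csv)
```

Once the data is in this shape, adding a new feature means adding a column, and adding a new email means adding a row. Nothing else about the table has to change.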