Bite-sized bits of data science for the non-data scientist
Disclaimer: All terms marked with * will be defined at a later point, as will many others.
data scientist: a role that includes basic engineering, analytics, and statistics; often builds machine learning models
- depending on the company, might be a product analyst, research scientist, statistician, AI specialist, or other
- a job title made up by a guy at Facebook and a guy at LinkedIn trying to get better candidates for advanced analytics positions
in a sentence: We need to hire a data scientist!
data science: advanced analytics, plus coding and machine learning
- producing insights for decision makers or putting models into production*
- 80% cleaning data, 20% data sciencing
in a sentence: We need to hire a data science team!
artificial intelligence: the ability of a machine to produce inferences from input without human direction
- used to describe everything from basic data science to self-driving cars to Ava in Ex Machina
- no one agrees on a definition
- “AI is whatever hasn’t been done yet.” ~ Larry Tesler, as quoted by Douglas Hofstadter
in a sentence: We just got our AI startup funded–join our team and become our first data scientist!
machine learning: algorithms and statistical models that enable computers to uncover patterns in data
- claims a large part of old school statistics as its own, plus some fun new algorithms
- it’s probably logistic regression. or a random forest….or linear regression.
- AI if you’re feeling fancy
in a sentence: We need you to machine learn [the core of our startup].
model: a hypothesized relationship about the data, usually associated with an algorithm
- may be used to refer to the algorithm itself, the relationship, a fit* model, the statistical model, maybe the mathematical model, perhaps Emily Ratajkowski, certainly not a small locomotive
- it’s probably logistic regression. or random forest….or linear regression.
in a sentence: I’m training a model. (and not how to turn left)
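To make the gap between a model and a fit model concrete, here is a minimal sketch in Python. It assumes the hypothesized relationship is a straight line (y = slope * x + intercept), and the numbers are made up for illustration. `fit_line` is a hypothetical helper written for this example, not a library function.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single input; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data that lies exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

# "The model" is the idea that y is a line in x;
# "the fit model" is the pair of numbers learned from the data.
slope, intercept = fit_line(xs, ys)

# Use the fit model on an input it has not seen
prediction = slope * 5.0 + intercept
```

Before `fit_line` runs, the model is only a guess about the shape of the relationship. Afterwards it is something you can actually make predictions with.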
feature: an input to the model; x value; a predictor; a column of input data
- when a data scientist is ‘engineering,’ this is usually what they’re making
- if your model says that wine points predict price, then the sommeliers’ ratings are your feature
in a sentence: Feature engineering will make or break this model.
label: the output a predictive model is trained to produce; y value; the target; the column of output data
- the thing the startup got funded to predict
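As a rough sketch of what features and labels look like in practice, the Python snippet below splits a tiny wine table into feature columns and a label column. The rows, the column names, and the helper `split_features_label` are all invented for illustration.

```python
# Made-up rows: each dict is one wine
rows = [
    {"points": 85, "country": "Chile", "price": 12.0},
    {"points": 90, "country": "France", "price": 30.0},
    {"points": 95, "country": "Italy", "price": 80.0},
]

def split_features_label(rows, feature_names, label_name):
    """Return (X, y): one list of feature values per row, and one label per row."""
    X = [[row[name] for name in feature_names] for row in rows]
    y = [row[label_name] for row in rows]
    return X, y

# The sommeliers' points are the feature; the price is the label
X, y = split_features_label(rows, ["points"], "price")

# A very small piece of feature engineering: a new column derived from an old one
# (the cutoff of 80 is an arbitrary choice for this example)
X_engineered = [[points, points - 80] for [points] in X]
```

Here `X` holds the inputs the model sees and `y` holds the values it is trained to predict. `X_engineered` shows the kind of derived column a data scientist means by "engineering."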